Eleanor D'Arcy — Statistics PhD Student

Life as a STOR-i MRes Student: Lent Term
Fri, 01 May 2020

To continue with my ‘Life as a STOR-i MRes Student’ blogging thread, I have decided to reflect on my previous term (Lent term). This provides me with an opportunity to review the work and activities that I have been involved with between Christmas and Easter 2020. To find out more about life as an MRes student before Christmas, please read my previous post from this thread.

Upon returning from the Christmas break, I was reunited with my colleagues and friends at the STOR-i annual conference (see post). This was a great opportunity to network with key researchers from Statistics, Operational Research and Industry.

STOR-i Annual Conference 2020

Lent term focussed on independent work with a view to giving us an insight into life as a PhD student. This involved completing two research projects, each working with an academic in the field. I worked on:

  • Slot Scheduling in Air Transportation with Professor Konstantinos Zografos (see my blog post here)
  • Missing Data with Dr Robin Mitra
These projects provided a great opportunity to work with experts in their fields as well as to carry out independent research.
Ed, Libby and I at the Tesco problem solving day

Another enjoyable part of Lent term was the problem-solving days. A company visits STOR-i with an industrial problem that lends itself to statistics and/or OR, and we work in teams with the aim of providing guidance towards a solution. The three problem-solving days were:

  • Tesco
    Machine Learning Predictions and Optimisation: Fuel Pricing
  • BBC
    Temporal clustering
  • Electricity North West
    Assessing the Plausibility of Data
Masterclass with Prof. Brendan Murphy
Masterclass with Prof. Laura Albert

During Lent term we also had masterclasses, which involved external academics presenting work in their research areas. This was another opportunity to network as well as to learn about important areas in statistics and OR. The three masterclasses were:

  • Model-based Clustering and Classification with Professor Brendan Murphy, University College Dublin
    (See the blog post I wrote on this here)
  • Public Sector OR with Professor Laura Albert, University of Wisconsin Madison
    (See my relevant blog post here)
  • Bayesian Optimisation with Professor Peter Frazier, Cornell University
    (This was done virtually due to the coronavirus outbreak, many thanks to Peter for making this work)
Masterclass with Prof. Peter Frazier

Unfortunately, the final masterclass (Ranking and Selection for Simulation Optimisation with Professor Barry Nelson from Northwestern University) was cancelled due to the outbreak of the COVID-19 pandemic. This worldwide emergency meant that Lent term came to an abrupt and premature end as we, like many others, were asked to work from home, and many activities and deadlines were postponed. However, working from home has given me an opportunity to develop a different style of learning. STOR-i have supported us by moving much of our contact time and planned events online.

My working from home set-up
A weekly, virtual, MRes catch up

Lent term also involved many social aspects. The Lancaster University Netball Team won their first (and only) match of the season. We had a term jam-packed with birthdays among the MRes, so there was plenty of celebration! I also went to the Lancaster University Undergraduate Conference, where many of my friends in undergraduate degrees presented their work.

LU UG conference
Lancaster University Grad Netball Team

Whilst Lent term ended in the strangest of circumstances, I appreciate the new working from home skills I have acquired and I feel very grateful for the online platform in which we have been able to continue as close to normal as possible. I thoroughly enjoyed Lent Term and I feel more equipped than ever to continue on my academic journey into a PhD.

STOR-i Masterclass: Professor Laura Albert
Mon, 20 Apr 2020

Public Sector OR

At the end of February, Professor Laura Albert visited us at STOR-i to give a two-day masterclass on Public Sector Operational Research. Laura is an Industrial and Systems Engineering Professor at the University of Wisconsin-Madison. At the time of the masterclass, she was on sabbatical at RWTH Aachen University in Germany. Her research focusses on applied optimisation in the public sector in the US; applications include homeland security, disasters, emergency response, public services and healthcare. Some current projects are:

  • Emergency medical service deployment and dispatch,
  • Cyber-security and trustworthy computing,
  • Next-generation policing models to divert opioid users from the criminal justice system.

Laura also authors blogs on operational research.

History of Public Sector OR

The masterclass began with an introduction to Public Sector OR, detailing some of its historical applications. Following a period of civil unrest during the 1960s in the US, cities faced many challenges: crime, fire alarms, solid waste and drug use. Dr Al Blumstein (CMU) chaired the Commission's Science and Technology Task Force to address fundamental societal problems. With no extra money in the budget for public sector organisations, growing problem sizes left only one solution: Operational Research. This is when the golden age of public safety research began.

Following this, some early contributions to public sector OR were made. Much of this research was put into practice and influenced policy; these papers appeared in the best operations research journals and received major awards.

What is Public Sector OR?

Public sector operational research addresses problems whose outputs are subject to public scrutiny.

Public sector OR is concerned with complex systems that encompass people, processes, vehicles and critical infrastructure. It can include problems in the following areas:

  • Public health and safety
    Police, fire, emergency services and public health
  • Community development
    Planning, transportation
  • Human services
    Public assistance, welfare, drugs and alcohol treatment, homeless services
  • Nonprofit management
    Management of community-oriented service providers

Developing models to deal with these issues often involves multiple stakeholders or decision-makers and requires many objectives, often with conflicting aims. These models should aim to balance equity with efficiency, whilst remaining below some predetermined budget. Here are some examples of such models:

  • Food bank distribution networks,
  • Airport location or expansion using multi-criteria decision analysis,
  • Military procurement decisions,
  • Delivering relief aid,
  • Post-disaster reconstruction,
  • School bus schedules,
  • Public library location and management,
  • Undesirable facility location and management,
  • Public transport routes.

In the following sections, I will outline examples of public sector OR models that Laura presented during the masterclass.

Small Scale: Facility Location Models

Suppose we want to site ambulances at p stations in a region to “cover” the most calls within 9 minutes. There are two decisions to make: where to locate the stations, and which calls are assigned to which station. This is modelled as an optimisation problem to achieve some balance between cost and service: we maximise or minimise an objective subject to capacity constraints. Specifically, we consider a discrete problem where the locations are at predefined points, using an integer program. In this problem, there are multiple distance criteria:

  • The total distance between calls and their assigned stations (this is usually demand weighted),
  • The maximum distance between a call and its assigned station,
  • The coverage – this is the number of calls covered if the distance is within some specified radius.

The model must also restrict the number of stations being built by considering the fixed cost associated with opening an ambulance station (including construction, leasing and labour costs). Remember: we want no more than p stations. Laura presented 5 models:

  • (Uncapacitated) fixed-charge location problem:
    minimise fixed cost + demand-weighted distance
  • P-median problem:
    minimise demand-weighted distance
    such that at most p stations are located
  • P-centre problem:
    minimise maximum distance
    such that at most p stations are located
  • Set covering location problem:
    minimise number of stations
    such that all calls are covered
  • Maximum covering location problem:
    maximise covered demand
    such that at most p stations are located
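To make the last of these concrete, here is a toy brute-force version of the maximum covering location problem in Python. The sites, calls and demand weights are invented for illustration; real instances would use an integer programming solver rather than enumeration.

```python
from itertools import combinations

def max_covering(demands, sites, cover, p):
    """Brute-force maximum covering location problem.

    demands: {call_id: demand weight}
    sites:   list of candidate station sites
    cover:   {site: set of call_ids reachable within the response radius}
    p:       maximum number of stations to open
    Returns (best_sites, covered_demand).
    """
    best, best_val = None, -1
    for chosen in combinations(sites, p):
        covered = set().union(*(cover[s] for s in chosen))
        val = sum(demands[c] for c in covered)
        if val > best_val:
            best, best_val = chosen, val
    return best, best_val

# Toy instance: 3 candidate sites, 4 calls with demand weights.
demands = {"c1": 5, "c2": 3, "c3": 2, "c4": 4}
cover = {"s1": {"c1", "c2"}, "s2": {"c2", "c3"}, "s3": {"c4"}}
sites, value = max_covering(demands, ["s1", "s2", "s3"], cover, p=2)
print(sites, value)  # → ('s1', 's3') 12
```

Enumerating all site subsets is exponential in p, which is exactly why the real models are posed as integer programs.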

These models must also ensure that all calls are satisfied and that no call is assigned to a closed station. To cover the most calls within 9 minutes, the maximum covering problem proves most appropriate. However, there are additional features that could be included to improve the model:

  • Different call volumes at different locations,
  • Non-deterministic travel times,
  • Balancing workload so that each ambulance responds to a similar number of calls,
  • Ambulance unavailability, meaning stations must provide backup coverage.

Even when these additional features are accounted for in the model, there still remain two sources of uncertainty: ambulance unavailability and probabilistic travel times. Models that incorporate both sources of uncertainty generate a configuration that covers up to 26% more demand at no extra cost.

Such facility location problems are not restricted to just ambulance station location but many other areas within the public sector:

  • Fire stations,
  • Airline hubs,
  • Blood banks,
  • Hazardous waste disposal sites,
  • Schools,
  • Bus stops.

Large Scale: Emergency Response for Homeland Security and Disaster Management

Laura also discussed applications within OR but on a much greater scale in terms of disaster management. Disasters can include those that are natural (e.g. earthquakes, droughts, tsunamis, etc.), terrorist induced (e.g. cyber attacks or nuclear blasts), technological and accidental (e.g. nuclear power plants or power outages). Disasters tend to follow a common lifecycle:

Disaster Lifecycle

Each stage in the cycle (except vulnerability) lends itself to OR; we detail each stage and some applications:

  • Vulnerability is the potential for physical harm and social disruption.
    – Vulnerability does not typically lend itself to OR applications
  • Mitigation includes actions taken prior to the disaster to prevent or reduce the impact.
    – Checkpoint screening for security
    – Network design
    – Pre-locating medical facilities and response stations
  • Preparedness also includes actions taken prior to a disaster but this time, to aid in response and recovery.
    – Pre-positioning crews and supplies in advance of a disaster
    – Evacuation planning
    – Emergency crew scheduling
  • Emergency response includes actions during and after a disaster to protect and maintain systems, rescue and respond to casualties and survivors, and restore essential public services.
    – Urban search and rescue
    – Routing and distribution of supplies and commodities
    – Hospital evacuation
  • Recovery includes efforts to reestablish pre-disaster systems and services.
    – Debris clean up and removal
    – Roads, bridge and facility repair and restoration
    – Replanting and restoration of forests and wetlands affected by a natural disaster

The model criteria of disaster models differ slightly from that of a standard model. Rather than quality, cost, profit, and distance, we are now concerned with loss of life, morbidity, coverage, and delivery of critical commodities.

I would like to thank Prof. Laura Albert for delivering this masterclass. I really enjoyed learning about different OR models applied to the public sector.

STOR-i Internship 2018: Estimating Diffusivity in the Ocean
Mon, 06 Apr 2020

During Summer 2018, I was a research intern at STOR-i. This involved working on a project supervised by a first-year PhD student and focussed on their research area. The internship was extremely rewarding and helped me to gain an invaluable insight into life as a PhD student. I have provided an overview of my project in the research section of my website, but I have decided to detail it further here as I think this is a very interesting application of statistical methods.

My project was titled “Estimating Diffusivity in the Ocean” and was supervised by Sarah Oscroft. Diffusivity is the rate at which particles spread out over time in a fluid. This has many important applications, for example:

– Planning aid in a search and rescue mission,
– Predicting how oil will spread after an oil spill to reduce the impact on animals and ecosystems,
– Discovering how plastic waste in the ocean will spread.

Diffusivity is a very important measure in oceanography that cannot be exactly evaluated, so instead it is estimated. It is fundamental that such estimates are accurate and reliable. Current estimators use ideas from physics and fluid dynamics, however, they prove inconsistent across data sets and require improvement.

In order to analyse and evaluate current estimators, we studied real ocean data. The data was collected by the Global Drifter Program who maintain over 1000 drifters globally. A drifter is a measuring instrument in the ocean that floats on the surface and tracks currents by satellite. Over 40 years, the Global Drifter Program has collected over 100 million observations. We focussed on information regarding location and velocity. For example, the graph below shows the velocity of a single drifter in both the longitudinal and latitudinal directions. This particular drifter is located in the North Atlantic Ocean and travelled from the east of Canada towards the west of Portugal over approximately 14 months.

Latitudinal and longitudinal velocity for a drifter in the North Atlantic Ocean

The above graph is called a time series: simply a sequence of observations over time. Time series analysis is an area of statistics that formed the foundation of my project. We modelled the ocean's velocity as a particular time series model, an AR(1) process, meaning the current velocity depends on its value in the previous time period plus some error term. The statistical properties of such a process helped form the estimator. It is worth noting that an AR(1) process is not an exact model for the ocean due to many external factors, but it is widely used in oceanography.
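To get a feel for what an AR(1) process looks like, here is a minimal simulation sketch in Python; the parameter values are illustrative, not taken from the project.

```python
import numpy as np

def simulate_ar1(phi, sigma, n, seed=0):
    """Simulate an AR(1) process: x_t = phi * x_{t-1} + e_t,
    with e_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, sigma)
    return x

# A persistent series (phi close to 1) mimics slowly varying ocean velocity.
v = simulate_ar1(phi=0.9, sigma=1.0, n=1000)
```

For a long simulated series, the lag-1 sample autocorrelation should sit close to phi, which is one quick sanity check that the simulation behaves as intended.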

Spectral analysis is another area fundamental to calculating diffusivity. Specifically, we used the spectral density function to re-express our time series in terms of the contribution of each frequency. This is done using sine and cosine waves of different frequencies and identifying how much each contributes to our time series. Since frequency can increase to infinity, we estimate the spectral density using a finite function of frequency: the periodogram.
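As a rough illustration of the idea (not the project's actual code), a periodogram can be computed from the discrete Fourier transform of the series:

```python
import numpy as np

def periodogram(x, dt=1.0):
    """Periodogram: a finite-sample estimate of the spectral density,
    from the squared magnitude of the discrete Fourier transform."""
    n = len(x)
    X = np.fft.rfft(x - np.mean(x))       # remove the mean, then transform
    freqs = np.fft.rfftfreq(n, d=dt)      # frequencies for each coefficient
    S = (dt / n) * np.abs(X) ** 2         # squared magnitude, normalised
    return freqs, S

# A pure sine wave at frequency 0.1 should give a sharp peak there.
t = np.arange(200)
freqs, S = periodogram(np.sin(2 * np.pi * 0.1 * t))
print(freqs[np.argmax(S)])  # → 0.1
```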

Using the physical definition of diffusivity, algebraic manipulation and the statistical properties of the AR(1) process, diffusivity can be reformulated in terms of something estimable: the spectral density function. This gives us a periodogram-based estimator of diffusivity.

Firstly, we simulated data from an AR(1) process since the diffusivity of this process exists and can be exactly derived – this is the only case where we can compare our estimate to the actual diffusivity. The graph below shows the exact diffusivity (y-axis) of an AR(1) process (red) against our estimate applied to the simulated data (blue). The estimate follows a similar pattern to the actual diffusivity but takes longer to reach a steady state. This is to be expected as we know the estimate improves with the number of samples. However, the estimate reaches a value close to the actual diffusivity at the last time point and this is the value we are interested in. These results demonstrate that the estimator works well.

Estimated and actual diffusivity of an AR(1) process

We then moved on to real data from the Global Drifter Program; specifically, we studied 11 drifters in the North Atlantic Ocean (see below), observed over 400 hours at hourly intervals. Firstly, we found the estimated diffusivity of each drifter over the time period, but these individual values don't have much meaning. By averaging the estimates across drifters, we obtained an estimate for this part of the ocean, which is more purposeful. Additionally, we found diffusivity increases rapidly and then steadies out in a similar fashion to the AR(1) process – this supports the idea that the AR(1) process is a suitable model for the ocean.

Drifters located in the North Atlantic Ocean
Estimated diffusivity of each drifter
Average diffusivity of all drifters

In summary, we know that diffusivity requires estimating because it cannot be exactly evaluated. To formulate our estimator, we model the ocean's velocity as an AR(1) process, where the current velocity depends on its value in the previous time period, and we found that the estimate for the AR(1) process is comparable with the real data. To find out more, you can view my academic poster for this topic.

Slot Scheduling in Air Transportation
Mon, 23 Mar 2020

Following Alexandre Jacquillat's talk at the STOR-i Annual Conference 2020 on Analytics for Operations, Scheduling and Pricing in Air Transportation, I was inspired to investigate this topic further. I was particularly interested in the concept of slot scheduling: making better use of scarce airport capacity to improve the efficiency of the air transportation system. After reading some of the relevant literature and attending a talk by Konstantinos G. Zografos, I decided to write my first STOR-i research report on models proposed to deal with slot allocation inefficiencies. The one-page summary of my report is below:

Summary of Research Report

The slot scheduling problem has recently received a great deal of consideration in the literature due to its size and complexity. As demand for air transportation rises but opportunities for the expansion of infrastructure remain limited, demand management measures are fundamental to help balance supply and demand. Supply side solutions, through airport capacity expansion or enhancement, are capital intensive and require a long term horizon for implementation. Such operations are also often subject to physical or political constraints. Instead, demand management is recognised as the principal instrument to deal with delays in air transport, since such solutions are immediate and easily implementable. Slot scheduling is a method of managing demand by best allocating scarce airport resources.

Prior to the summer or winter scheduling season, airlines request slots at an airport; a slot allows them to use all of the infrastructure necessary for landing and take-off. For airports designated as 'coordinated', due to supply-demand imbalances, a coordinator is responsible for allocating slots. Currently, slot schedules exhibit large deviations from requested slot times. Airport capacity is usually expressed in terms of the number of available slots, and demand for these slots often exceeds capacity, yet this capacity is rarely used optimally. Slot scheduling models aim to best use capacity so that all airlines are allocated slots as close to their requests as possible; consequently slots are used more efficiently and delays are minimised. There is large room for improvement in the current slot allocation process.

The first mathematical model to be compliant with scheduling regulations was proposed in 2012. This model aims to minimise the distance between requested and allocated slot times subject to an artificial measure of capacity and turnaround time constraints, at a single airport. This ensures capacity is not exceeded, so delays are minimised, and allows the aircraft sufficient time on the ground to prepare for the next flight. Using this simple formulation, the resulting schedule demonstrates large improvements on current procedures. Following from this, other models have been developed to also incorporate fairness and accessibility restrictions. These encourage flights to remotely located airports and aim to ensure no airline suffers greater displacement from their requested slots. This means all airlines are treated equally and all airports, regardless of size, are accessible. Other models aim to minimise similar objectives, but consider a network of airports. This means that dependencies between airports are accounted for in order to avoid the multiplier effect of delays once one flight is interrupted. Considering a network of airports creates a larger and more complex problem, but this helps to formulate a more realistic representation of the situation at hand.
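A toy, single-airport version of this displacement-minimisation idea can be sketched in a few lines of Python. The requested times, slots and capacity are made up, and exhaustive search stands in for the integer program used in the actual models:

```python
from itertools import product

def schedule_slots(requests, slots, capacity):
    """Exhaustively find the slot allocation minimising total displacement
    |allocated - requested|, subject to a per-slot movement capacity."""
    best, best_cost = None, float("inf")
    for alloc in product(slots, repeat=len(requests)):
        if any(alloc.count(s) > capacity for s in slots):
            continue  # capacity exceeded at some slot
        cost = sum(abs(a - r) for a, r in zip(alloc, requests))
        if cost < best_cost:
            best, best_cost = alloc, cost
    return best, best_cost

# Three airlines request times 9, 9 and 10; slots at 8-11, one movement per slot.
alloc, cost = schedule_slots([9, 9, 10], slots=[8, 9, 10, 11], capacity=1)
print(alloc, cost)  # → (8, 9, 10) 1
```

Because two airlines request the same time but each slot holds one movement, at least one request must be displaced, so the minimum total displacement here is 1.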

This report reviews the current slot allocation procedure, detailing each stage necessary to formulate a slot schedule. Additionally, we discuss different allocation models in the surrounding literature, at the single and network level, and use computational results to compare them to the existing methods, as well as one another. Finally, we aim to identify any gaps in the research that present interesting ideas for future investigation.

Further Reading

The models I focussed on for this report are taken from the following papers:

  • Zografos, K. G., Salouras, Y., and Madas, M. A. (2012). Dealing with the efficient allocation of scarce resources at congested airports. Transportation Research Part C: Emerging Technologies, 21(1):244- 256.
  • Zografos, K. and Jiang, Y. (2016). Modelling and solving the airport slot scheduling problem with efficiency, fairness, and accessibility considerations.
  • Castelli, L., Pellegrini, P., Pesenti, R., et al. (2011). Airport slot allocation in Europe: economic efficiency and fairness. International Journal of Revenue Management, 6(1-2):28-44.
  • Corolli, L., Lulli, G., and Ntaimo, L. (2014). The time slot allocation problem under uncertain capacity. Transportation Research Part C: Emerging Technologies, 46:16-29.
STOR-i Masterclass: Professor Brendan Murphy
Mon, 09 Mar 2020

Model-Based Clustering and Classification

A few weeks ago, Professor Brendan Murphy visited Lancaster University to present a two-day masterclass to all STOR-i students on Model-Based Clustering and Classification. Brendan is Full Professor and Head of School in the School of Mathematics and Statistics at University College Dublin. His research interests include clustering, classification and latent variable modelling; in particular, Brendan is interested in applications from social sciences, food science, medicine and biology. Currently, he is the editor for Social Sciences and Government for the Annals of Applied Statistics, and he has recently co-authored a research monograph on Model-Based Clustering and Classification.

Intro

Brendan kick-started the masterclass with an introduction to clustering. Cluster analysis aims to find meaningful groups in data, so that members of a cluster have something in common that they do not share with members of other groups. Clustering dates back at least to the beginnings of language, when objects were grouped according to common characteristics. For example, Aristotle classified animals into groups based on observations in 'History of Animals', from the 4th century BC.

Hierarchical Clustering

In the 1950s, various hierarchical clustering methods were introduced. These build a tree of clusters: you start with n observations in n clusters (every observation is its own cluster), find the two 'closest' clusters and merge them so that there are n-1 clusters, and continue until every observation is in one cluster. To do this, you need a measure of distance between observations (dissimilarity) and a measure of distance between clusters (linkage). The choice of these measures can heavily influence the results. Hierarchical clustering doesn't always perform well, even though it is commonly used.
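A minimal, illustrative Python sketch of this idea for one-dimensional points, using absolute difference as the dissimilarity and single linkage between clusters (this is not an efficient implementation, just the merge loop made explicit):

```python
def single_linkage(points, k):
    """Agglomerative clustering with single linkage: start with each point
    as its own cluster, then repeatedly merge the two closest clusters
    until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the nearest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

print(single_linkage([1.0, 1.2, 5.0, 5.3, 9.9], k=3))
# → [[1.0, 1.2], [5.0, 5.3], [9.9]]
```

Swapping the linkage (e.g. maximum instead of minimum member distance) changes the results, which is the sensitivity to measure choice mentioned above.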

K-means Clustering

Another method of clustering was developed in the late 1950s: k-means clustering. Here, we describe each cluster by the average of the observations within it. This is an iterative algorithm, repeated until convergence, with two steps:

  • Allocation: assign each observation to the closest cluster
  • Update: recompute the cluster summaries (i.e. the means)

Brendan demonstrated k-means clustering in action by clustering the colours of pixels in an image of Alexandra Square, Lancaster University. We start with a single cluster (k=1) and the results look pretty grey; as the number of clusters increases, the photograph becomes more identifiable. Even with 2 clusters, buildings, shadows and people are all visible, since light and dark areas have been separated. By the time we hit 10 clusters the image is starting to look similar to the original, and for 100 clusters the image is indistinguishable from the original.
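The two-step algorithm can be sketched in a few lines of Python. The "pixel" data here is invented; a real image would supply many thousands of RGB triples.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: alternate allocation (nearest centre) and update
    (cluster mean) steps until convergence or iters runs out."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Allocation step: assign each point to its nearest centre.
        dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # Update step: recompute each centre as the mean of its members.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return centres, labels

# Toy "pixel" data: RGB-like triples from two colour groups (dark and light).
X = np.array([[0.1, 0.1, 0.1], [0.0, 0.2, 0.1],
              [0.9, 0.8, 0.9], [1.0, 0.9, 0.8]])
centres, labels = kmeans(X, k=2)
```

With k=2 the dark and light pixels end up in separate clusters, mirroring the light/dark split visible in the Alexandra Square example.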

k=1
k=2
k=3
k=10
k=20
k=100

Model-Based Clustering

The first successful model-based clustering method was also developed in the 1950s, by Paul Lazarsfeld, for multivariate discrete data. The model he proposed is now known as the Latent Class Model; he used the term 'latent' for the unknown cluster allocations.

The dominant model for model-based clustering of continuous data was developed in 1963 by John Wolfe, this is known as the Gaussian Mixture Model. 

Model-based clustering assumes that observations arise from a finite mixture model, so each observation has a probability of having come from each group g; these probabilities are called the mixing proportions. The data within each group is modelled, and we can combine this model with the mixing proportions to define an overall model for the data. Many modes of estimating these models are available; Brendan focussed on the EM algorithm.

A Gaussian mixture model represents each group of observations as a multivariate Gaussian distribution. Therefore the clusters correspond to Gaussian densities and have elliptical shapes. We use the EM algorithm to fit these Gaussian mixture models; the example below fitted its clusters in just 7 iterations of the algorithm.
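A minimal sketch of EM for a one-dimensional, two-component Gaussian mixture (illustrative only; in practice one would use a package such as mclust in R):

```python
import numpy as np

def em_gmm_1d(x, iters=100):
    """EM for a 1-D two-component Gaussian mixture. The E-step computes each
    point's responsibility for each component; the M-step re-estimates the
    mixing proportions, means and variances from those responsibilities."""
    k = 2
    pi = np.full(k, 1 / k)                 # mixing proportions
    mu = np.array([x.min(), x.max()])      # crude but deterministic init
    var = np.full(k, x.var())
    for _ in range(iters):
        # E-step: responsibilities r[i, j] = P(component j | x_i)
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted estimates of pi, mu, var
        n_j = r.sum(axis=0)
        pi = n_j / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_j
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_j
    return pi, mu, var

# Two well-separated groups; EM should recover means near 0 and 5.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)])
pi, mu, var = em_gmm_1d(x)
```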

Further Reading

Brendan recommended some further reading:

  • Geoffrey McLachlan and Kaye Basford
    Mixture Models: Inference and Applications to Clustering
  • Collins, Linda M and Stephanie Lanza
    Latent Class and Latent Transition Analysis
  • Paul McNicholas
    Mixture Model-Based Classification
  • Charles Bouveyron, Gilles Celeux, Brendan Murphy and Adrian Raftery
    Model-Based Clustering and Classification for Data Science
Brendan is an author of the 'mclust' package in R, used for model-based clustering, classification and density estimation based on finite Gaussian mixture modelling fitted via the EM algorithm. This package had 1.5 million downloads in 2019!

This masterclass was my first and I really enjoyed learning about clustering and classification with Professor Brendan Murphy. I found the history, methods and applications really interesting and I am looking forward to reading further into the topic.

STOR-i Annual Conference 2020: Christine Currie
Mon, 24 Feb 2020

Making Random Things Better: Optimisation of Stochastic Systems

At the STOR-i Annual Conference earlier this year, Professor Christine Currie presented two interesting applications of optimisation where the elements of the system are subject to uncertainty. Optimisation involves finding a value that maximises or minimises a function, often a complicated and random one. In many real-life situations, it is optimistic to assume that all data elements in the system are known quantities and that none are subject to uncertainty. Christine outlined examples in the passenger transportation industry and in healthcare where the inputs and outputs of the system are uncertain.

Christine at the STOR-i Conference

Christine is Associate Professor of Operational Research in Mathematical Sciences and Director of CORMSIS (the Centre for Operational Research, Management Science and Information Systems) at the University of Southampton. Additionally, she is Editor-in-Chief of the Journal of Simulation. Christine's research is concerned with simulation optimisation, mathematical modelling of epidemics, optimal pricing and applications of simulation in healthcare.

Airline Revenue Management Example

Firstly, Christine delivered an optimisation example for network revenue management of tickets for an airline. This is modelled as an optimisation problem because the aim is to maximise revenue subject to capacity, demand and sales. It is difficult to set prices for these tickets since it is uncertain how many will be sold, we say there is stochastic demand. If it was certain that there would be a high demand for a certain flight, then it is likely that the airline will set higher prices to maximise revenue. To complicate this further, airlines often sell different classes and packages of tickets, so it must be decided at what prices different seats will be sold. Since the seats have no value once the plane departs, we have a perishable inventory and it is fundamental to sell all tickets in order to maximise revenue.

Whilst the demand is uncertain, bounds can be assumed and a probability distribution assigned to it. Therefore, Christine suggests maximising the minimum revenue or minimising the maximum regret instead of simply maximising revenue. This approach uses the bounds and distribution of demand, so demand does not need to be known exactly.
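As a toy illustration of the maximin idea (all prices, demand scenarios and the capacity are invented), we can pick the price that maximises the worst-case revenue over a set of demand scenarios:

```python
def maximin_price(prices, demand_scenarios, capacity):
    """Pick the price maximising the worst-case revenue over a set of
    demand scenarios. demand_scenarios maps each price to a list of
    possible demands; sales are capped at capacity."""
    best_price, best_worst = None, float("-inf")
    for p in prices:
        worst = min(p * min(d, capacity) for d in demand_scenarios[p])
        if worst > best_worst:
            best_price, best_worst = p, worst
    return best_price, best_worst

# Hypothetical scenario set: higher prices depress demand.
scenarios = {100: [150, 180, 200], 150: [90, 120, 140], 200: [40, 60, 80]}
price, revenue = maximin_price([100, 150, 200], scenarios, capacity=120)
print(price, revenue)  # → 150 13500
```

Note that the highest price is not chosen: under its worst demand scenario it sells too few seats, so a middle price protects revenue against the downside.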

Healthcare Example

Christine provided a different example, from healthcare, to illustrate the idea of subset selection. Here, patients are waiting for transport to an acute hospital ward before rehabilitation. The ward is made up of some number of bays, each containing some number of beds, and each bay must be single-sex. These requirements form the constraints of the optimisation problem, whose primary objective is to minimise patient waiting time. Secondary aims are to maximise bed utilisation and to minimise patient transfers around the ward.

Clearly, this quickly becomes a complicated optimisation problem: a decision with multiple objectives over a large number of options (combinations of patients and beds). Christine’s work suggests providing the system expert with a subset of good solutions rather than a single optimal one; this avoids the situation where the expert is presented with a solution they cannot implement. Therefore, rather than a single solution, a subset of solutions is chosen such that, with some specified probability, it contains a solution within some proportion of the optimum.
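The flavour of this idea can be sketched with a toy ranking-and-selection example (this is my own illustration, not Christine's exact method; the candidate "plans" and their waiting times are invented): simulate each candidate allocation's noisy waiting time, then shortlist every candidate whose estimated performance is within a tolerance of the best, leaving the expert to choose among the shortlist.

```python
# Toy subset-selection sketch with hypothetical ward plans.
import random
import statistics

random.seed(1)

# Made-up true mean waiting times (hours) for four candidate allocations.
true_wait = {"plan_A": 4.0, "plan_B": 4.2, "plan_C": 6.5, "plan_D": 4.1}

def simulate_wait(plan, n=500):
    """Stand-in for a ward simulation: noisy samples around the true mean."""
    return [true_wait[plan] + random.gauss(0, 1.0) for _ in range(n)]

estimates = {p: statistics.mean(simulate_wait(p)) for p in true_wait}
best = min(estimates.values())
tolerance = 0.5  # plans within this of the best are deemed acceptable

shortlist = sorted(p for p, m in estimates.items() if m - best <= tolerance)
```

With these numbers the three near-optimal plans survive and the clearly worse one is screened out, so the expert can pick whichever shortlisted plan is actually implementable on the ward.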

I thoroughly enjoyed listening to applications of optimisation in real-life situations; it was interesting to see how Christine approached the challenges of uncertainty and complexity. To read more about Christine’s research, you can visit her webpage. Additionally, please read my blog post about the STOR-i conference to find out more about all of the talks.

STOR-i Annual Conference 2020: Alexandre Jacquillat
Mon, 10 Feb 2020

Analytics for Operations, Scheduling and Pricing in Air Transportation.

Alexandre Jacquillat opened the STOR-i Annual Conference 2020 in early January. I thoroughly enjoyed his talk on Analytics for Operations, Scheduling and Pricing in Air Transportation so I decided to write an overview of the work he delivered. Alexandre demonstrated applications of both Statistics and Operational Research methods to the transportation industry. I had not witnessed such an application prior to this and I really appreciated this practical use of two topics I am currently studying. To find out more about the STOR-i conference, please read my post that details the event and all of the talks.

Alexandre at the STOR-i Conference

Alexandre is an Assistant Professor of Operations Research and Statistics at the MIT Sloan School of Management. His research applies to areas in transportation with the aim to promote efficient scheduling, operations and pricing practices. Alexandre is the recipient of several research awards, including the George B. Dantzig Dissertation Award and the Best Paper Prize in Transportation Science and Logistics from INFORMS.

This talk focussed on work that lies at the interface between analytics and transport, both dynamic and growing industries. The air transportation industry currently faces many challenges because airports operate at or above capacity in order to avoid wasting any resource or time; this approach leads to flight delays and incurs costs for the airlines. As the volume of flights increases, there are likely to be more delays, set against more sales and profit; conversely, fewer flights mean fewer sales but likely fewer delays. Using results from various analytical projects, Alexandre explained how his work aims to support operations, scheduling and pricing practices in air transportation, and in turn improve the efficiency, reliability and profitability of the industry as a whole.

Operations

Alexandre started by discussing how his work has supported operations within the air transportation industry; this involves making the best use of available capacity. I mentioned above that airports operate at maximum capacity in order to avoid wasting resources, but this often leads to delays and becomes costly. Alexandre proposes modelling this as an optimisation problem that aims to minimise flight and passenger delays subject to flight, passenger and capacity constraints. Previously, passenger delays were not accounted for in this problem. When considering a network of flights, there is not a one-to-one correspondence between passenger and flight delays because passengers often travel through connecting flights: a minor flight delay may cause a passenger to miss their connection, resulting in a major overall delay by the time they reach their final destination. Adding this layer of passenger delays to the model proves to make flight operations more consistent.

Scheduling

Secondly, Alexandre presented ideas for optimally scheduling flights. Airlines request a time slot for each of their flights, but they are often allocated a different slot because demand for specific slots tends to exceed capacity. In order to minimise the overall displacement from the airlines’ requests, Alexandre presented a large-scale optimisation approach to slot allocation. By breaking the large-scale problem into smaller chunks, this approach delivered high-quality solutions quickly compared to the current process, producing optimal or near-optimal schedules at some of the largest schedule-coordinated airports. Additionally, Alexandre proposed an integrated model of scheduling and operations that optimises schedules across a network of airports while capturing the interdependencies between flight schedules and air traffic flow management.
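The core objective, minimising displacement from requests, can be shown in miniature (my own toy example, not Alexandre's model; the flights, requested times and slots are invented, and real instances are far too large for brute force): three flights request slots, each slot holds one flight, and we search all assignments for the one with the smallest total displacement.

```python
# Toy slot-allocation sketch with hypothetical flights and slots.
from itertools import permutations

requested = {"F1": 9, "F2": 9, "F3": 10}   # requested slot times (hours)
slots = [9, 10, 11]                         # available slots, capacity one each
flights = list(requested)

def displacement(perm):
    """Total displacement of an assignment from the airlines' requests."""
    return sum(abs(requested[f] - s) for f, s in zip(flights, perm))

# Brute-force search; the large-scale problem needs decomposition instead.
best = min(permutations(slots), key=displacement)
assignment = dict(zip(flights, best))
```

Because both F1 and F2 request the 9 o'clock slot, some displacement is unavoidable; the search finds an assignment with total displacement of two hours, which is the minimum possible here.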

Pricing

Lastly, Alexandre focussed on tackling pricing practices in the industry. Often, flights between two destinations are priced the same even though one flight is direct and another includes a lengthy stopover, so prices are not truly competitive. Alexandre outlined an experiment conducted with a global airline to assess a new, competitive pricing strategy, which demonstrated that making some minor changes to the baseline pricing practices results in a significant increase in revenue.

Global Flight Routes in 2009

Alexandre concluded with some ideas for future research. To read more about Alexandre’s work, feel free to visit his webpage. I really enjoyed this talk and it encouraged me to read more into the area; consequently, I have decided to write my first STOR-i research project on ‘Mathematical Models and Algorithms for Allocating Scarce Airport Resources’. The main focus of this project is to review relevant literature, so I intend to further investigate aspects of Alexandre’s research.

Future Reading

I have included a couple of papers by Alexandre, relating to the slot scheduling problem in air transportation, that I have found interesting:

  • Jacquillat, A. and Odoni, A.R., 2015. An integrated scheduling and operations approach to airport congestion mitigation. Operations Research, 63(6), pp.1390-1410.
  • Jacquillat, A. and Vaze, V., 2018. Interairline equity in airport scheduling interventions. Transportation Science, 52(4), pp.941-964.

STOR-i Annual Conference 2020: An Overview
Tue, 28 Jan 2020

At the beginning of January, STOR-i CDT hosted its ninth annual conference, where key researchers from Statistics, Operational Research and Industry presented a range of interesting talks. Additionally, STOR-i PhD students showcased their work during an evening poster session. This event provided opportunities to network with individuals from a variety of institutions and industries.

Alexandre Jacquillat opened the conference by presenting results from various analytical projects concerned with supporting operations, scheduling and pricing in air transportation. Alexandre is an Assistant Professor of Operations Research and Statistics at the MIT Sloan School of Management. His research applies to areas in transportation, with the aim of promoting efficient scheduling, operations and pricing practices.

Henry Moss, a third-year STOR-i PhD student, then proposed MUMBO (Multi-task Max-value Bayesian Optimisation), which performs efficient optimisation (maximising or minimising a function subject to constraints) by evaluating low-cost functions related to the target function. Henry’s research lies at the intersection of Statistics and Operational Research, with a focus on Bayesian Optimisation.

The morning session concluded with a talk from Valeria, an Associate Professor at the Department of Biostatistics (part of the Oslo Centre for Biostatistics and Epidemiology) at the University of Oslo. Valeria presented some recent advances in the Mallows Rank Model, a statistical model that handles uncertainty in rankings and many different kinds of ranking data. Her research spans several areas of Mathematics and Statistics:

  • Functional data analysis with applications in physiology and biostatistics,
  • Machine learning for describing people mobility,
  • Statistical genomics of cancer and high-dimensional data models

After the lunch break, Miguel discussed some lesser-known instances of his research in the theory, algorithms and applications of mathematical optimisation. Miguel holds the Chair of Operational Research at the School of Mathematics, University of Edinburgh. His research focuses on the application of optimisation to problems in power systems management and smart grid design.

Georgia, another third-year STOR-i PhD student, proposed her procedures for generating valid linear inequalities that are added to optimisation problems in order to reduce the required solution time and improve solvers’ performance. Georgia also works with Morgan Stanley on a variety of large-scale optimisation problems.

Next, Richard, the Howard Levene Professor of Statistics at Columbia University, delivered some of his research on time series of counts. He focussed on relaxing the assumption about the probability mass functions relating the observations to a state variable: typically these are chosen to be Poisson or Negative Binomial distributions, but Richard detailed how to generalise this choice and the consequences of the change.

Tom Flowerdew gave the penultimate talk of the day, about applying statistical learning models to fraud detection processes. Tom is a STOR-i alumnus who now works for Featurespace, a world leader in Adaptive Behavioural Analytics. They work with banks and other financial institutions, scoring transactions based on their risk of fraud. Tom explained some of the challenges involved and how statistical models are employed to tackle them successfully. To find out more about Featurespace, visit their website.

To close the first day of the conference, Christine Currie presented two applications of optimisation where some elements of the system are subject to uncertainty, one from the passenger transport industry and one from healthcare. Christine presented methods that account for the randomness of both the inputs (e.g. demand for plane tickets, or the number of patients requiring a bed on a hospital ward) and the outputs of a system (e.g. plane tickets sold, or the number of patients on a ward) when optimising another element (e.g. maximising revenue or minimising waiting time). Christine is Associate Professor of Operational Research in Mathematical Sciences and Director of CORMSIS (Centre for Operational Research, Management Science and Information Systems) at the University of Southampton.

Veronica, a Reader in Statistics at Brunel University London, started the second day by explaining how possible dependencies in firms’ risk of default should be accounted for in commonly used credit risk models. Veronica explored how transaction data can be used in such models and the advantages this may bring in terms of predictive power. She then proposed a model to capture these dependencies, as well as an algorithm to manage the high-dimensional data and some of the computational challenges.

Dolores then proposed data science models that strike a balance between accuracy and interpretability, so that they provide explanations of the task to the user who interacts with them. Dolores is a Professor in Operations Research at Copenhagen Business School; her areas of expertise include Supply Chain Management, Data Mining and Revenue Management.

Ciara, another STOR-i alumna and now a postdoctoral researcher at Universitat Pompeu Fabra in Barcelona, presented next. Her research interests include online learning, multi-armed bandits and reinforcement learning. In this talk, Ciara focussed on multi-armed bandits, where at each time step a player selects an action and receives some reward for selecting it; the aim is to maximise the total reward. It is commonly assumed that each action’s reward distribution is constant over time, but Ciara proposed algorithms that perform well when the reward is not constant and depends on the history of the player’s actions.
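To illustrate the standard (stationary) bandit setting that Ciara's work relaxes, here is a minimal epsilon-greedy sketch; the arm reward probabilities and the 10% exploration rate are my own made-up choices, not details from the talk.

```python
# Epsilon-greedy on a stationary three-armed Bernoulli bandit (toy example).
import random

random.seed(0)
arm_means = [0.2, 0.5, 0.8]   # hypothetical success probability of each arm
counts = [0, 0, 0]            # number of pulls per arm
values = [0.0, 0.0, 0.0]      # running mean reward estimate per arm

for _ in range(5000):
    if random.random() < 0.1:                      # explore a random arm
        arm = random.randrange(3)
    else:                                          # exploit the current best
        arm = max(range(3), key=lambda a: values[a])
    reward = 1.0 if random.random() < arm_means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
```

After enough steps the best arm (mean 0.8) attracts most of the pulls; in the history-dependent setting Ciara described, this simple strategy breaks down because past pulls change the rewards themselves.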

Brendan concluded the conference by proposing a new method for model-based clustering of continuous data. He is Full Professor and Head of School in the School of Mathematics and Statistics at University College Dublin, with research interests in clustering, classification and latent variable modelling, with applications in the social sciences, food science, medicine and biology.

The STOR-i conference then closed with a lunch buffet, which provided another opportunity to network with the external attendees. I really enjoyed listening to all of the talks and learning about many interesting applications of topics covered during the MRes so far. It was a great occasion to meet professors from other universities, industry professionals and STOR-i alumni, as well as to hear from current students in both the presentations and the poster session. I intend to write further posts in the coming weeks discussing some of the talks in more detail.

Life as a STOR-i MRes Student: Michaelmas Term
Tue, 14 Jan 2020

Introducing my ‘Life as a STOR-i MRes Student’ blogging thread, starting with an overview of Michaelmas term. I decided to write this thread on my MRes year because it provides an opportunity to reflect on my time as a student at STOR-i, as well as to inform others about the course. I started the STOR-i MRes programme in October 2019; I have already learnt many new skills and topics and thoroughly enjoyed the term.

In the first week of term, we started with an introductory day to meet some of the staff and students, as well as to learn what STOR-i is all about and what the programme entails. This was followed by a two-day excursion to the Lake District, which was a great opportunity to build relationships with my peers, the staff and other students. The Away Day included many team-building activities, from designing a team logo and printing it onto a t-shirt to canoeing across Lake Windermere in an orienteering challenge. By the time we came back to Lancaster to start the term properly, we had really got to know each other and were able to support one another through the rest of the term.

STOR-i Away Day 2019

The first half of Michaelmas term consisted of four modules to introduce us to the basic knowledge required for the rest of the year. This included Inference and Modelling, Stochastic Simulation, Deterministic Optimisation and Probability and Stochastic Processes. Within each topic, we had lectures, weekly exercises and workshops to assist with our understanding. Alongside these modules, we attended lectures regarding training for research and industry.

In the second half of term, we completed contemporary topic sprints on a weekly basis. This involved attending a lecture on a Monday morning to gain an overview of the topic and our task; we would then research the topic in teams for four days and present it back to an expert in the field later in the week. This was a great way to build on our teamwork, research and presentation skills. The topics were:

  • Statistical Learning for Decision, led by Professor David Leslie,
  • Modelling Paradigms for Complex and Novel Data Forms, led by Professor Idris Eckley,
  • Computational Statistics, led by Professor Paul Fearnhead,
  • Optimising under Uncertainty, led by Professor Adam Letchford.

Following the sprints, we chose one topic to explore further and write a report on our findings. I chose Optimising under Uncertainty and wrote my report on the Stochastic Knapsack Problem with Random Weights. This is an optimisation problem where we have a set of items, each with a known reward but an unknown, random weight. We want to choose a set of items to put in the knapsack (which has some maximum capacity) so that the total reward is maximised. Whilst writing this report, I read a lot of the relevant literature and broadened my knowledge of the topic as a whole.
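One simple way to make the random weights tractable, sketched below with invented numbers (this is an illustration of the problem setting, not the method from my report), is a chance constraint: among all subsets whose probability of fitting in the knapsack is at least 90% (estimated by Monte Carlo), pick the one with the highest total reward.

```python
# Toy chance-constrained stochastic knapsack; all numbers are hypothetical.
import random
from itertools import combinations

random.seed(0)
CAPACITY = 11.0
items = {            # name: (reward, mean weight, weight std dev)
    "a": (6, 4.0, 1.0),
    "b": (5, 4.0, 1.0),
    "c": (4, 3.0, 0.5),
}

def fit_probability(subset, n=2000):
    """Monte Carlo estimate that the subset's random total weight fits."""
    fits = 0
    for _ in range(n):
        weight = sum(max(0.0, random.gauss(mu, sd))
                     for _, mu, sd in (items[i] for i in subset))
        if weight <= CAPACITY:
            fits += 1
    return fits / n

best_reward, best_subset = 0, ()
for r in range(1, len(items) + 1):
    for subset in combinations(items, r):
        reward = sum(items[i][0] for i in subset)
        if reward > best_reward and fit_probability(subset) >= 0.9:
            best_reward, best_subset = reward, subset
```

Here taking all three items has the highest reward but only fits about half the time, so the chance constraint rules it out in favour of the two heaviest-reward items that still fit reliably.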

Outside of my studies, I attended many of the events that STOR-i hosted, including a cheese and wine evening as well as the STOR-i Christmas Meal. This allowed me to meet more students and relax outside of studying. Additionally, I started playing netball for the Lancaster University Graduate College, which has been a great way to meet postgraduate students from other disciplines. I have thoroughly enjoyed the MRes programme so far and look forward to seeing what Lent term brings.

STOR-i Christmas Meal
Netball Christmas Meal

Whilst Michaelmas term was busy and challenging, I have thoroughly enjoyed every moment in learning new topics and making new friends.
