# Model

GLEAM produces realistic simulations of the global spread of infectious diseases. It integrates three layers: real-world data on the global population; real-world data on the mobility of this population; and an individual-based stochastic mathematical model of the infection dynamics.

The real-world population and mobility data is used to determine when and where people will interact and potentially transmit the infection. This data divides the world into a grid of small square cells. Satellite and census sources are used to calculate the population density in each of these cells, which are then clustered into subpopulations centered on their local transportation hub.

Using robust statistical methods, GLEAM executes the simulation in a sequence of time steps (representing full days). Within each population cluster, the spread of the infection among individuals is governed by the characteristics of the disease and the containment and mitigation responses specified in the epidemic model. Infection is transmitted between population clusters when people commute to work or school, or travel longer distances on national and international flights.
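As a rough illustration of how travel can carry infection between population clusters during one daily time step, the sketch below moves infectious individuals along routes. It is a simplified toy, not GLEAM's implementation: the function `move_infectious`, the data layout, and all numbers are assumptions made for this example, and each infectious individual is assumed to travel on a route with probability equal to that route's daily passengers divided by the origin population.

```python
import random
from collections import Counter


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(rng.random() < p for _ in range(n))


def move_infectious(infectious, population, daily_flows, rng):
    """Return infectious counts per subpopulation after one day of travel.

    infectious:  dict subpopulation -> number of infectious individuals
    population:  dict subpopulation -> total population
    daily_flows: dict (origin, destination) -> passengers per day (assumed input)
    """
    remaining = dict(infectious)   # infectious individuals still at their origin
    arrivals = Counter()           # infectious individuals arriving today
    for (origin, dest), passengers in daily_flows.items():
        # Assumption: every individual is equally likely to travel on this route.
        p_travel = min(1.0, passengers / population[origin])
        movers = binomial(rng, remaining[origin], p_travel)
        remaining[origin] -= movers
        arrivals[dest] += movers
    # Arrivals are added only at the end, so nobody travels twice in one day.
    return {name: remaining[name] + arrivals[name] for name in remaining}


# Illustrative numbers only.
rng = random.Random(42)
infectious = {"A": 50, "B": 0, "C": 5}
population = {"A": 10_000, "B": 20_000, "C": 5_000}
daily_flows = {("A", "B"): 500, ("A", "C"): 200, ("C", "A"): 100}
after = move_infectious(infectious, population, daily_flows, rng)
```

Travel only relocates individuals, so the total number of infectious individuals is unchanged by this step; new infections would come from the within-cluster dynamics.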

On high-performance computers, GLEAM executes thousands of simulations, making it possible to generate, for each population, the statistical ensemble of possible epidemic evolutions, along with analytics for quantities such as newly generated cases, seeding events, the time of arrival of the infection, and others.
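To make the idea of a statistical ensemble concrete, the following sketch summarizes many stochastic runs day by day with a median and a 95% reference range. The simulator `toy_run` is a hypothetical stand-in for a real simulation, and the helper names and the choice of the 2.5th and 97.5th percentiles are assumptions for illustration.

```python
import random
import statistics


def toy_run(seed, days=30, p_growth=0.2):
    """Hypothetical stand-in for one stochastic run: cumulative cases per day."""
    rng = random.Random(seed)
    cases = [1]
    for _ in range(days - 1):
        # Each existing case independently adds one case with probability p_growth.
        extra = sum(rng.random() < p_growth for _ in range(cases[-1]))
        cases.append(cases[-1] + extra)
    return cases


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, with q in [0, 1]."""
    idx = round(q * (len(sorted_values) - 1))
    return sorted_values[idx]


def ensemble_summary(runs):
    """Per-day median and 95% reference range across all runs."""
    summary = []
    for day_values in zip(*runs):      # one tuple of values per day
        ordered = sorted(day_values)
        summary.append({
            "median": statistics.median(ordered),
            "low": percentile(ordered, 0.025),
            "high": percentile(ordered, 0.975),
        })
    return summary


runs = [toy_run(seed) for seed in range(200)]
summary = ensemble_summary(runs)
```

The same pattern applies to other quantities, for example the first day on which a subpopulation records a case in each run.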

Once complete, the final simulation output can be displayed and analyzed in a variety of ways using the GLEAMviz desktop application.

## The population layer: census data and population densities

GLEAM acquires population data from the websites of the Gridded Population of the World and Global Rural-Urban Mapping projects, which are run by the Socioeconomic Data and Applications Center (SEDAC) at Columbia University.

In these datasets, the world is divided into a grid of cells, and each cell is assigned an estimated population value. GLEAM uses cells that are approximately 25 x 25 km, dividing the globe into over 250,000 populated cells.

We know the coordinates of each cell and also those of all the commercial airports in the World Airport Network. By considering the distance between the cells and airports we assign each cell to a ‘local’ airport; this process generates over 3,300 subpopulations, each centered on a local transportation hub.
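The assignment of cells to airports can be sketched as a nearest-neighbor search. This is a minimal illustration that assumes each cell goes to the airport at the smallest great-circle distance; any additional rules the actual procedure applies are not shown. The function names, airport codes, coordinates, and populations below are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def assign_cells_to_airports(cells, airports):
    """Map each cell id to the id of its nearest airport.

    cells and airports are dicts of id -> (latitude, longitude).
    """
    assignment = {}
    for cell_id, (clat, clon) in cells.items():
        assignment[cell_id] = min(
            airports,
            key=lambda code: haversine_km(clat, clon, *airports[code]),
        )
    return assignment


def subpopulation_totals(assignment, cell_population):
    """Sum the cell populations assigned to each airport."""
    totals = {}
    for cell_id, code in assignment.items():
        totals[code] = totals.get(code, 0) + cell_population[cell_id]
    return totals


# Illustrative data only.
airports = {"FCO": (41.80, 12.25), "MXP": (45.63, 8.72)}
cells = {"c1": (41.9, 12.5), "c2": (45.5, 9.2), "c3": (43.8, 11.2)}
cell_population = {"c1": 1000, "c2": 2000, "c3": 500}

assignment = assign_cells_to_airports(cells, airports)
totals = subpopulation_totals(assignment, cell_population)
```

A brute-force search like this costs one distance evaluation per cell-airport pair; with hundreds of thousands of cells and thousands of airports, a spatial index would be the natural next step.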

## The mobility layer: commuting and flight patterns

### Airport network

GLEAM uses a set of twelve different flight networks (one for each month) derived from the worldwide booking datasets from the Official Airline Guide (OAG) database. This database contains more than 3,800 commercial airports in about 230 countries, and includes over 4,000,000 connections representing the estimated bookings between any two of these airports for each month.

The airport network data reveals significant variations in both the number of destinations per airport and in the number of passengers per connection. There are some airports with lots of connections and large volumes (where we typically have to catch our connecting flights), and many airports with few connections and low volumes. This characteristic is sometimes called the “long tail”, and has a significant impact on how infections spread around the globe.
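One simple way to see this imbalance in a connection list is to count destinations and passengers per airport and measure how much of the traffic the busiest airports handle. The sketch below uses a tiny invented network; the function names and the input layout (one tuple per undirected connection) are assumptions, not the format of the OAG data.

```python
from collections import Counter


def airport_degree_and_traffic(connections):
    """Count connections and passengers per airport.

    connections: iterable of (airport_a, airport_b, passengers), one entry
    per undirected connection (assumed layout).
    """
    degree = Counter()
    traffic = Counter()
    for a, b, passengers in connections:
        degree[a] += 1
        degree[b] += 1
        traffic[a] += passengers
        traffic[b] += passengers
    return degree, traffic


def top_share(traffic, k):
    """Fraction of all airport traffic handled by the k busiest airports."""
    total = sum(traffic.values())
    top = sum(volume for _, volume in traffic.most_common(k))
    return top / total


# Invented toy network: one hub and several small airports.
connections = [
    ("HUB", "A", 900),
    ("HUB", "B", 800),
    ("HUB", "C", 700),
    ("A", "B", 50),
    ("C", "D", 20),
]
degree, traffic = airport_degree_and_traffic(connections)
hub_share = top_share(traffic, 1)
```

In a heavy-tailed network, a small value of `k` already accounts for a large share of the traffic.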

### Commuting network

Our commuting database has been compiled using information obtained from the national statistics offices of more than 40 countries on five continents, covering more than 78,000 administrative regions. These data sources, which use different semantics and organizational structures with varying degrees of detail, were standardized before being integrated.

Our fully integrated dataset contains over five million commuting connections between GLEAM’s geographic subpopulations, capturing the irregular network structure that affects the local diffusion of infections between neighboring subpopulations.

## Disease dynamics layer: modelling the course of infection

The GLEAM engine simulates the infection dynamics according to the characteristics of the disease coupled with any prevention and intervention measures. Examples of disease characteristics are: incubation times, the proportion of asymptomatic yet infectious individuals, mortality rates and immunity.

The infection characteristics are defined in a so-called ‘compartmental model’. At any given point in time, each individual belongs to a ‘compartment’ that corresponds to a particular disease-related state (being susceptible, symptomatic or vaccinated, for example). These compartments are connected by paths that define how individuals may pass from one state to another (from susceptible to latent when being infected, for example), while associated parameters determine the likelihood that such transitions take place.
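A compartmental model of this kind can be represented as a small graph of compartments and transitions. The sketch below is one possible representation, not the format GLEAM or GLEAMviz uses: the class names, the `driven_by` field, the four compartments, and the rate values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    rate: float              # per-day rate (illustrative value)
    driven_by: tuple = ()    # compartments whose members cause the transition;
                             # empty means the transition is spontaneous


@dataclass
class CompartmentalModel:
    compartments: list
    transitions: list = field(default_factory=list)

    def add_transition(self, transition):
        """Add a transition after checking that all named compartments exist."""
        names = (transition.source, transition.target, *transition.driven_by)
        for name in names:
            if name not in self.compartments:
                raise ValueError(f"unknown compartment: {name}")
        self.transitions.append(transition)

    def outgoing(self, compartment):
        """Transitions that leave the given compartment."""
        return [t for t in self.transitions if t.source == compartment]


# A hypothetical four-compartment model.
model = CompartmentalModel(["Susceptible", "Latent", "Infectious", "Recovered"])
# Infection: requires contact with infectious individuals.
model.add_transition(Transition("Susceptible", "Latent", 0.5, ("Infectious",)))
# Spontaneous progressions.
model.add_transition(Transition("Latent", "Infectious", 0.4))
model.add_transition(Transition("Infectious", "Recovered", 0.3))
```

Separating infection-driven transitions from spontaneous ones keeps the structure of the disease apart from the numerical parameters, so the same graph can be reused with different rates.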

GLEAM uses stochastic algorithms, defined mathematically as individual-based chain binomial and multinomial processes, to calculate the proportion of the population within each compartment for each subpopulation, and how these proportions change over time as individuals transition from one compartment to the next.
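To show what a chain binomial process looks like in practice, here is a minimal sketch of one daily step for a single subpopulation with four hypothetical compartments (S, L, I, R). It is an illustration under stated assumptions rather than GLEAM's algorithm: the parameter names `beta`, `epsilon`, and `mu`, their values, and the conversion of rates into daily probabilities via `1 - exp(-rate)` are choices made for this example. Only binomial draws are needed here because each compartment has a single exit; a compartment with several exits would use a multinomial draw instead.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(rng.random() < p for _ in range(n))


def chain_binomial_step(state, beta, epsilon, mu, rng):
    """Advance an S-L-I-R state (dict of counts) by one day."""
    s, l, i, r = state["S"], state["L"], state["I"], state["R"]
    n = s + l + i + r
    # Daily transition probabilities derived from per-day rates (assumed form).
    p_infect = 1.0 - math.exp(-beta * i / n) if n > 0 else 0.0
    p_progress = 1.0 - math.exp(-epsilon)
    p_recover = 1.0 - math.exp(-mu)
    # Each individual in a compartment transitions independently.
    new_latent = binomial(rng, s, p_infect)
    new_infectious = binomial(rng, l, p_progress)
    new_recovered = binomial(rng, i, p_recover)
    return {
        "S": s - new_latent,
        "L": l + new_latent - new_infectious,
        "I": i + new_infectious - new_recovered,
        "R": r + new_recovered,
    }


# Illustrative parameters and initial condition.
rng = random.Random(1)
state = {"S": 990, "L": 0, "I": 10, "R": 0}
trajectory = [state]
for _ in range(60):
    state = chain_binomial_step(state, beta=0.5, epsilon=0.4, mu=0.3, rng=rng)
    trajectory.append(state)
```

Because every draw is taken from the current occupants of a compartment, counts stay non-negative and the total population is conserved at every step.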