Note: To view the non-technical executive summary, click here. Техникийн бус гүйцэтгэх товч дүгнэлтийг харахын тулд энд дарна уу.

#collapse-hide
# Imports and settings are here
import pandas as pd
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter, AutoMinorLocator)

%matplotlib inline
pd.set_option('display.float_format', lambda x: '%.5f' % x)

plt.style.use('fivethirtyeight')

Framing the Risk Assessment

Mongolia is a unique case with the ongoing COVID-19 pandemic. As of the writing of this article, there is no community transmission of the virus. All cases currently in the country are in quarantine. Primarily because of the success Mongolia has had in preventing community transmission, many outside the country are pointing to the Mongolian example as something to learn from.

Here is a timeline of events since January. The first part of the table is summarized from an article from The Lancet 1. Events from April 15th onward were compiled from local news reports.

Dates Actions taken by the Mongolian Government
January 6-12, 2020 First public precautions introduced by the Ministry of Health
January 20-26, 2020 Educational institutions at all levels are temporarily closed until March 30, 2020 (later extended to September 1); travel restrictions applied to China
January 27-February 2, 2020 Incoming travelers from countries with COVID-19 cases must go through a period of quarantine. Tsagaan Sar restrictions put in place.
February 10-16, 2020 Quarantine camps are prepared.
February 17-23, 2020 Disinfection protocols for port trucks and trains are created. Domestic travel ban applied for Tsagaan Sar (Feb 23-27). This ban is later extended until March 3.
March 9, 2020 First confirmed case detected on foreign national who recently entered country. All businesses except grocery stores closed until March 16. Another domestic travel ban put in place. All international travel banned on March 10.
April 15, 2020 Resorts are allowed to reopen and are expected to clean and sanitize regularly.
April 30, 2020 Bars and nightclubs allowed to reopen with requirement of 1-1.5 meter distance between patrons. Several establishments violate rules and city mayor shuts down bars and nightclubs again on May 4.
May 1, 2020 Cultural and sports centers reopened.
May 4, 2020 Training centers allowed to reopen with social distancing and capacity restrictions.
June 22, 2020 Libraries and museums reopen.
July 1, 2020 State Emergency Commission allows nearly all businesses (except schools and universities) to reopen with certain restrictions.
July 29, 2020 State Emergency Commission allows schools and universities to reopen on September 1. Cabinet amends the decision to have 3 days in class and 2 online. Universities will be allowed to open October 5.

Much of the success of the government has been attributed to the quick shutdown of international travel and broad restrictions domestically. Beginning with the international travel ban on March 10, the government has been attempting to repatriate thousands of Mongolian citizens who were abroad when the ban was in place. As a result the large majority of those in quarantine are Mongolians. Recently it was disclosed that a limited number of foreign workers are being allowed in to work on economically vital projects in the country including the Oyu Tolgoi underground expansion and infrastructure projects.

The recent Naadam holiday saw huge amounts of domestic travel with little worry of coronavirus. While the national Naadam celebration was televised to prevent large crowds, local Naadam celebrations in the countryside saw medium sized gatherings of 100+. Yet the idea of risk remains in peoples minds. The recent decision by the government to have class only 3 days a week is an indicator of this.

The problem with the current concept of "risk" in the public is that it is not quantified and not based on facts. My aim for this analysis is to quantify the current risk of COVID-19 in Mongolia by using currently known facts. To do this I will answer a question that frames the issue of COVID-19 risk in a simpler way:

What is the likelihood of a single undiscovered COVID-19 case in the community?

This is a question that, with some assumptions, we can get a concrete answer for. It isn't the only way to frame this problem, and certainly not the most comprehensive. However, it does give a single answer that can be used to launch into a larger discussion of risk, restrictions on businesses and education, and even international travel restrictions.

To begin we first must declare the list of key terms and assumptions I will use going forward. Then we will assess the risk of a single undiscovered COVID-19 case with two methods:- Determining the likelihood that a person infected with COVID-19 would remain undetected, and then go into the general population while still infectious.- Analyze several "What-if?" scenarios to determine if they could have already happened without the government or public being aware.

Assumptions

To create our models we must first create some assumptions. Each are taken from research and reflect realistic values for the current situation.

  • Asymptomatic rate is 80%. This is based on WHO guidelines that 80% of infections are mild or asymptomatic 2. This is most likely higher than reality, but nevertheless is the best number we have to work with. Other studies have shown anywhere from a 30%-50% asymptomatic rate 3.
  • False negative rate is 1%. This means that for every 10 people tested that have COVID-19, 1 person will receive a negative test. Mongolia uses a one-step RT-PCR test looking N and ORF1B genes 1. A recent study has shown that this type of test is nearly 100% sensitive (100 - sensitivity % = false negative %) 4. To be conservative I have made this 1%.
  • False positives are not considered. As there have been no positive tests indicating community transmission in Mongolia, there is no need to assess false positives.
  • 8,000 people have been through quarantine in Mongolia. According to a recent Lancet article 1, more than 8,000 people have been through quarantine since quarantines started in March 2020.
  • Multiple tests are performed on quarantined travelers. The same Lancet article stated that those in quarantine are tested periodically throughout their time in quarantine. We will assume at least two tests during the three week period.
  • Those with symptoms are tested. If a person presents with influenza like symptoms they are referred for testing 1. In addition, there is random sampling of the general population.

Now that we have established our key terms and assumptions, we can move on to modeling.

Can an Infection in Quarantine be Missed?

We know that as much as 80% of COVID-19 cases are asymptomatic or mild (perhaps mild enough to not seek medical help). However, we also know that those in quarantine are tested periodically. In our assumptions we stated that each individual would be tested at least 2 times. Let's see if it is possible that someone infected would be able to leave quarantine without having a positive test.

To determine this we will use the false negative rate of the test. A false negative would occur when someone actually has an active COVID-19 infection, but the test administered returns a negative result. The type of test being used in Mongolia (the RT-PCR test) is one of the most sensitive tests possible, and is widely used and understood to give accurate results. A recent survey of this type of test from various vendors found that the sensitivity of the test is nearly 100% 4. Here is a table from Parikh et al. 5 showing the relationship between false negatives and sensitivity.

As an example, if 100 people known to have a disease are tested and 75 people test positive and 25 people test negative, then the sensitivity of the test is 75%.

$$ (75+25) = 0.75 \ \ or \ \ 75\% $$

To determine the false negative rate we can subtract the true positives from the total, then divide by the total.

$$ (100 - 75) / 100 = 0.025 \ \ or \ \ 25\% $$

Even though the tests used for COVID-19 in Mongolia are basically 100% sensitive, we can assume a 99% sensitivity for this example to account for possible errors in testing. This would mean there may be a false negative rate of 1%. Now we can determine the probability that an infection is missed during two consecutive tests by multiplying the probabilities together.

0.01 * 0.01
0.0001

That is 0.0001 or 0.01%. By having two tests the probability of a missed infection is reduced by 99%. If there are three tests we can multiple the individual probabilities together.

0.01 * 0.01 * 0.01
1.0000000000000002e-06

That is 1.0000000000000002x10-6, or 0.0001%. By adding one more test we reduce the the probability of a missed infection by 99.99%.

Now we can simulate the probability that a person was able to leave quarantine without being detected. This can be done by simulating Bernoulli trials. Bernoulli trials are a random experiment where there are two possible outcomes, 0 and 1 (failure or success). In our case a "success" is actually a bad thing, someone infected not being detected.

Here is a function written in Python to perform Bernoulli trials.

def perform_bernoulli_trials(n, p):
    """Perform n Bernoulli trials with success probability p and return number of successes."""
    # Initialize number of successes: n_success
    n_success = 0

    # Perform trials
    for i in range(n):
        # Choose random number between zero and one: random_number
        random_number = np.random.random()

        # If less than p, it's a success  so add one to n_success
        if random_number < p:
            n_success += 1

    return n_success

Let's run this function with 1,000 known positive cases and a probability "success" of 0.01%. Essentially we will be flipping a coin 1,000 times with a probability of "success" being 0.01%. Because there is randomness in the function, you will most likely get a different answer each time the function is run.

# Simulate 1,000 cases with a probability of missing a case being 0.01%
perform_bernoulli_trials(1000, 0.0001)
0

Our result is 0. You are almost guaranteed to get a 0 every time you run this. It is important to remember that this model is stochastic. This means that the model has randomness built in, and that each time you run it your answer is not guaranteed. Instead you can look at the distribution of many runs of the model to understand all possible outcomes and then measure the frequency of those outcomes to determine probability. If this is a confusing mess of words for you, then keep going as I will demonstrate this process and it will hopefully become more clear.

Previously we simulated 1,000 positive cases, but it turns out we don't have 1,000 positive cases. As of July 28, 2020, there were 289 confirmed cases of COVID-19 in Mongolia 6. So, to be conservative, let's simulate 300 cases of COVID-19 going through quarantine and being tested two times. We will run the simulation 10,000 times. This will give us a distribution of possible outcomes that we can use to calculate a probability that 1 or more people would be able to leave quarantine and receive a false negative test twice.

# Set a random number seed so the code is reproducible.
np.random.seed(42)

# Create an empty list of missed infections that will be filled with our simulation next.
n_missed_infections = np.empty(10000)

# Simulate 300 positive cases going through quarantine 10,000 times with a probability of missing a case being 0.01%.
for i in range(10000):
    n_missed_infections[i] = perform_bernoulli_trials(300, 0.0001)

#collapse-hide
# Plot Code
fig, ax = plt.subplots(figsize=(16,9))
ax.hist(n_missed_infections, bins=[0,1,2,3], rwidth=0.5, align='left', density=True)
ax.set_yticks([0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8, 0.9])
ax.set_xlabel('Number of missed infections per 300 positive cases, simulated 10,000 times', fontsize=20)
ax.set_ylabel('Probability', fontsize=20)
ax.set_title('Distribution of Possible Outcomes', fontsize=26);

We can directly calculate the probability of a missed infection with the assumptions above by counting the number of simulations resulting in greater than or equal to 1 divided by the total number of simulations.

np.sum(n_missed_infections >= 1) / len(n_missed_infections)
0.0288

This means that given 300 known positive cases there is a 2.8% chance that two consecutive COVID-19 tests would return a false negative and one case would not be detected. That means we have a 97.2% likelihood that all COVID-19 cases were caught in quarantine.

So what does this number really mean? More skeptical readers may think, so you mean there's a chance?

Yes, there is a non-zero chance that a person may have been missed. Converted to odds, 2.8% is roughly 1 in 34 chance. Focusing on that 2.8% will probably create more irrational fear than create motivation for preparation. Let's put things in perspective. Consider a serious flood that has an average probability of occurring once over the next 34 years. How important is it to act now to ensure this possible flood does not destroy homes, roads, or other critical infrastructure? Planning and preparation are probably the most reasonable responses, not fear.

What-if?

2.8% is not our final answer. We have to take it one step further by looking at what would happen in an infectious person was in public. Since early March all international travel has been banned, but there has been movement across the border. There is some risk in this movement, like with nearly everything in life, so let's quantify it so we can better understand whether we need to worry or not. We will look at three different scenarios and assess the probability of undiscovered infections in each.

  1. What if an infected person escaped from quarantine?
  2. COVID-19 has been inside Mongolia since before the travel ban.
  3. An infected bird flies in from China and infects a rural community.
    Note: Scenario three is decidedly not realistic. Some websites reported that this is possible, which was then fact-checked and found to not have merit. Let’s consider it anyways as it is a useful thing to look at potential spread in rural communities.

Scenario 1 - What if an infected person escaped from quarantine?

Now that we know the risk of missing an infection, let's take things a step further and simulate what might happen if a single infectious case actually did go into the community. To do this we will use a model type called SIR (susceptible, infected, recovered) that is common in projecting infectious disease spread. It models the spread of a virus in a closed population assuming that everyone in the population is susceptible to the disease. While it isn't the most complex or even accurate model, it is useful in quickly creating projections of infectious disease spread in a population.

Important: The goal of this is not to project what might happen in the future, but to assume that it already happened and that the cases were not discovered. Our goal will be to use the infection numbers from the SIR model to evaluate the probability that the spread COVID-19 could remain undetected in the population.
As Ulaanbaatar is the main population center of Mongolia, we will model the spread of COVID-19 there. We will use code adapted from Learning Scientific Programming with Python book and Jonathan Soma's Python Disease Modeling Repository 7 8.

First we will define the function that will calculate the differential equation of the SIR model.

# The SIR model differential equations.
def deriv(state, t, N, beta, gamma):
    S, I, R = state
    # Change in S population over time
    dSdt = -beta * S * I / N
    # Change in I population over time
    dIdt = beta * S * I / N - gamma * I
    # Change in R population over time
    dRdt = gamma * I
    return dSdt, dIdt, dRdt

Let's define some key variables we will use to build our SIR model:

  • R0. Also called the basic reproduction number. It is the number of people a single person is expected to infect. This number does by itself consider time, so should not be considered a rate.
    • This number is probably the most important number when modeling COVID-19. Social distancing measures are designed to reduce this number in order to flatten the curve in an epidemic.
    • The WHO estimates that the R0 for COVID-19 is between 2 and 4 9.
  • Effective contact rate. This is the chance that an infectious person will transmit a disease to a susceptible person they come in contact with.
  • Recovery rate. The average number of days an infected person could pass on the disease. Expressed as 1/γ = n days.

    Note: By dividing the effective contact rate by the recovery rate, we can get the R0 value.
    We will use the following parameters in order to arrive at a R0 value of 2. This is on the lower end of the scale as assessed by the World Health Organization 9.
  • Effective contact rate = 0.5 - This means that an infected person has a 50% chance of transmitting the disease to a susceptible person.

  • Recovery rate = 4 days - This does not mean that a person is cured in 4 days. It simply means that after 4 days a person, on average, would not be infecting because they would be aware of the infection or for other reasons.
  • R0 = 2 - By taking 0.5 / (1/4) we arrive at an R0 of 2.
  • Population = 1.5 million - This is roughly the population of Ulaanbaatar. We will model the population of Ulaanbaatar because it is where half the population of the city lies and is the major urban area of the country.
  • Time period = 120 days - A time period of 120 days was chosen as it is sufficient time to see the rise and fall of infection numbers.
effective_contact_rate = 0.5
recovery_rate = 1/4

# Everyone not infected or recovered is susceptible
total_pop = 1500000
recovered = 0
infected = 1
susceptible = total_pop - infected - recovered

# A list of days, 0-120
days = list(range(0, 120))

# We can calculate the R0 by dividing the effective contact rate by the recovery rate.
print("R0 is", effective_contact_rate / recovery_rate)
R0 is 2.0

Now we can integrate the differential equations to calculate the number of infections from day 0 to 120 based on our variables defined above.

# Use Scipy's differential equation method
ret = odeint(deriv,
             [susceptible, infected, recovered],
             days,
             args=(total_pop, effective_contact_rate, recovery_rate))
S, I, R = ret.T

# Build a dataframe of S, I, and R
df = pd.DataFrame({
    'suseptible': S,
    'infected': I,
    'recovered': R,
    'day': days
})

# Look at days 0-10
df.head(11)
suseptible infected recovered day
0 1499999.00000 1.00000 0.00000 0
1 1499998.43195 1.28402 0.28403 1
2 1499997.70256 1.64872 0.64872 2
3 1499996.76601 2.11700 1.11700 3
4 1499995.56345 2.71827 1.71828 4
5 1499994.01934 3.49033 2.49034 5
6 1499992.03667 4.48166 3.48168 6
7 1499989.49088 5.75454 4.75458 7
8 1499986.22204 7.38895 6.38901 8
9 1499982.02479 9.48755 8.48766 9
10 1499976.63548 12.18217 11.18235 10

We can see that after 10 days 1 infection will turn into more than 12 infections. Viruses spread in an exponential manner, so the the number of infections will rise rapidly. We can visualize the full time period from day 0 to 120 with the plot below.

#collapse-hide
fig, ax = plt.subplots(figsize=(16,9))
ax.plot(df['day'], df['infected'])
ax.set_xlabel('Days', fontsize=20)
ax.set_ylabel('Number Infected', fontsize=20)
ax.set_title('SIR Model of COVID-19 Spread', fontsize=26);

In this model, the peak of the number of infections is at day 56 with 229,887 infections. This shows the potential danger of even one person infected with COVID-19 being in the community.

While this is only a simulation of what could happen, it is already very hard to believe that 229,887 infections in the public of Mongolia could be somehow missed by random testing and hospitals. Even with an asymptomatic rate of 80%, that still leaves 20% which would have severe enough symptoms to warrant a hospital visit (around 45,977 people).

Let's look at a short time period though. If an infected person has only been in the community for 20 days, how many infections might there be?

#collapse-hide
df[df['day'] == 20]
suseptible infected recovered day
20 1499704.25917 148.35584 147.38499 20

At day 20 there would already be 148 infections. How many of those would be asymptomatic and how many would show symptoms?

print("Asymptomatic cases: ", 148 * .80)
print("Sympomatic cases: ", 148 * .20)
Asymptomatic cases:  118.4
Sympomatic cases:  29.6

So we would have 29 symptomatic cases. Given the existing procedure of testing symptomatic cases presented at hospitals, only one positive test of someone outside quarantine would be needed to at least hint at community transmission of COVID-19 in Mongolia. So far there has not been a single positive test of anyone outside a controlled environment, so this situation would seem to be highly unlikely.

In order to ensure our logic is right, we can determine the probability that all 29 symptomatic cases would return a false negative. To calculate the probability of N false negatives, we can take the probability of a single false negative and raise it to the Nth power.

# A double asterisk is used to create exponents in Python
0.01 ** 29
1.0000000000000006e-58

In this case the probability of 29 false negatives is extremely small. So small it is essentially impossible. To give some perspective, I wanted to compare this probability to something very unlikely, being struck by lightning.

I found that the odds of being struck by lightning in a given year are 1 in 1.222 million. Converting to probability we get 6.69x10-13. This is still a long way away from 1.00x10-58. Let's calculate a roughly equivalent probability of lightning strikes.

# Odds of being struck by lightning are 1 in 1.222 million. Source: https://www.weather.gov/safety/lightning-odds
lightning_strike_odds = 1 / 1222000

# We can convert the odds to probability by dividing by one plus the odds.
lightning_strike_prob = lightning_strike_odds / (1 + lightning_strike_odds)

print("Probability of being struck by lightning in a given year: ", lightning_strike_prob)
print("Probability of being struck by lightning TWICE in a given year: ", lightning_strike_prob**2)
print("Probability of being struck by lightning TEN TIMES in a given year: ", lightning_strike_prob**10)
Probability of being struck by lightning in a given year:  8.183299359002163e-07
Probability of being struck by lightning TWICE in a given year:  6.696638839904521e-13
Probability of being struck by lightning TEN TIMES in a given year:  1.3467419459456997e-61

The probability of being struck by lightning twice in a given year is still less at 5.28x10-19. This is still a significantly higher probability than 1.00x10-58. It turns out that it would take 10 lightning strikes in a given year to have a roughly equal probability to having 29 false negatives.

So, if a single infected individual was somehow in the population, and they were able to avoid being detected for 20 days, there is basically a 0% probability that they and those they infected would not be discovered. Converted to odds, 0.01^29 is 1 in ten octodecillion. That's 57 zeros behind the 1.

Scenario 2 - The virus has been in the community since before quarantine started.

The very first infection recorded in Mongolia was from a French citizen who came to Mongolia for work. They ignored requirements to self-quarantine and came into contact with more than 100 individuals. Luckily no one else was infected, and the person was isolated. This brings up a fear that perhaps COVID-19 has been in Mongolia since January.

In Scenario 1, we already determined that the probability of a missed large scale infection is essentially 0% 20 days after the introduction of a single infectious individual. We can restate this differently by saying that any supposed community infections more than 20 days in the past is essentially irrelevant. The likelihood that they occurred or were any danger to the larger population is 0%.

If this doesn't convince you, let's look at a method developed to discover undetected COVID-19 cases in populations where there is insufficient testing. Called excess mortality, it has become an excellent method to determine more realistic numbers of COVID-19 deaths in countries or areas where there is less testing or delayed official infection numbers 10.

We can create our own calculated of excess mortality for Mongolia by calculating the average number of deaths (of all causes) per month in Mongolia and then comparing the numbers to January to June of 2020. If we see higher numbers than average, this is excess mortality. If roughly average or below average, we can determine that there was in fact no excess mortality. Data was collected from the National Statistics Office at 1212.mn 11.

# Import data from 1212.mn
deaths = pd.read_csv('notebook_data/covid-19-risk/DT_NSO_2100_027V2_-_2020-08-03_-_www.1212.mn.csv', thousands=',')

# Reformat dataframe to be in long form and drop extra column. 
deaths = deaths.melt(id_vars=['Aimag'], var_name='date', value_name='deaths').drop(columns=['Aimag'])

# View a sample of 5 rows of the dataframe.
deaths.sample(5)
date deaths
7 2016-08 1355
33 2018-10 1461
12 2017-01 1357
31 2018-08 1391
20 2017-09 1360
# Get the last two characters of the date column and save to a new column called month.
deaths['month'] = deaths['date'].str[-2:]

deaths.sample(5)
date deaths month
49 2020-02 1298 02
6 2016-07 1341 07
23 2017-12 1411 12
7 2016-08 1355 08
20 2017-09 1360 09
# Calculate the average number of deaths for each month, excluding January to June 2020.
deaths.iloc[0:48].groupby(by='month').mean()
deaths
month
01 1433.25000
02 1303.00000
03 1436.00000
04 1355.75000
05 1435.75000
06 1306.00000
07 1390.75000
08 1379.50000
09 1371.50000
10 1409.00000
11 1316.75000
12 1495.00000

Now that we have our average number of deaths per month, we can compare these to the death numbers for January to June 2020.

#collapse-hide
fig, ax = plt.subplots(figsize=(16,9))
ax.plot(np.arange(1,13), deaths.iloc[0:48].groupby(by='month').mean()['deaths'], label='Historic Avg. 2016-2019')
ax.plot(np.arange(1,7), deaths.iloc[48:]['deaths'], label='Reported Deaths')
ax.set_xlabel('Month', fontsize=20)
ax.set_ylabel('Deaths', fontsize=20)
ax.legend()
ax.set_title('Average Deaths Per Month vs Jan-Jun 2020', fontsize=28);

It is clear that for January to June 2020 actual deaths were equal to or lower than historic averages. This means that we see no excess mortality. It is safe to say that there was no undiscovered outbreak of COVID-19 in Mongolia from January to June 2020.

Scenario 3 - An infected migrating bird flies in from China and infects nomads.

For our last scenario, let's assume that a migratory bird infected with COVID-19 infects a herder while he (let's also assume it is a he) is bring his cows home for the night.

Important: There is no evidence that bird to human transmission is possible.

We will create two models:

  1. The herder decides to go to the Aimag center while he is infectious. Model the spread of the disease and the probability of being undiscovered.
  2. The herder doesn't go anywhere, and only sees 2 people outside his immediate family each week. Model the spread of the disease.

Infection in aimag center

We will use the same methodology as Scenario 1. First we will model the spread of the disease with a SIR model and then check the probability of missed infections. Because this is still a city, we will use the same effective contact rate and recovery rate.

The primary difference here is the population. We will use a population of 12,000, as it is similar to the population of several aimag centers in Mongolia.

effective_contact_rate = 0.5
recovery_rate = 1/4

# Everyone not infected or recovered is susceptible
total_pop = 12000
recovered = 0
infected = 1
susceptible = total_pop - infected - recovered

# A list of days, 0-120
days = list(range(0, 120))

# We can calculate the R0 by dividing the effective contact rate by the recovery rate.
print("R0 is", effective_contact_rate / recovery_rate)
R0 is 2.0
# Use Scipy's differential equation method
ret = odeint(deriv,
             [susceptible, infected, recovered],
             days,
             args=(total_pop, effective_contact_rate, recovery_rate))
S, I, R = ret.T

# Build a dataframe of S, I, and R
df_aimag = pd.DataFrame({
    'suseptible': S,
    'infected': I,
    'recovered': R,
    'day': days
})

# Look at days 0-10
df_aimag.head(11)
suseptible infected recovered day
0 11999.00000 1.00000 0.00000 0
1 11998.43202 1.28396 0.28402 1
2 11997.70282 1.64850 0.64868 2
3 11996.76665 2.11648 1.11687 3
4 11995.56487 2.71718 1.71796 4
5 11994.02220 3.48817 2.48962 5
6 11992.04218 4.47761 3.48021 6
7 11989.50111 5.74717 4.75172 7
8 11986.24054 7.37580 6.38366 8
9 11982.05761 9.46450 8.47789 9
10 11976.69282 12.14228 11.16490 10

#collapse-hide
fig, ax = plt.subplots(figsize=(16,9))
ax.plot(df_aimag['day'], df_aimag['infected'])
ax.set_xlabel('Days', fontsize=20)
ax.set_ylabel('Number Infected', fontsize=20)
ax.set_title('SIR Model of COVID-19 Spread - Aimag Center of Pop 12,000', fontsize=26);

#collapse-hide
df_aimag[df_aimag['day'] == 20]
suseptible infected recovered day
20 11714.47849 141.53496 143.98655 20

We see that the peak of infections in a smaller population occurs earlier (around 35 days). After 20 days there are 141 infections. Let's see how many are symptomatic and asymptomatic.

print("Asymptomatic cases: ", 141 * .80)
print("Sympomatic cases: ", 141 * .20)
Asymptomatic cases:  112.80000000000001
Sympomatic cases:  28.200000000000003

With 28 symptomatic cases, let's calculate the probability that they all would have a false negative test.

0.01 ** 28
1.0000000000000006e-56

Again we see an incredibly small probability. It is clear that any outbreak in a populated area like an aimag center would be identified relatively quickly.

Infection in a rural area

Now let's look at the SIR model in a rural area with a low population density. We use the following variables:

  • Effective contact rate - The issue with this number is that nomadic families have very occasional contact with those outside their immediate family. We will use a rate of .3 as a stand in for less contact. This means that the infected person has a 30% chance of infecting another person each day.
  • Recovery rate - This will stay at 4 days.
  • Population - We will assume a local population of 100 people.

It is important to note that we have little data on the spread of COVID-19 in rural areas. Also, the SIR model does not factor in distance between the population. Nevertheless, let's go ahead with the model and see the results.

effective_contact_rate = 0.3
recovery_rate = 1/4

# Everyone not infected or recovered is susceptible
total_pop = 100
recovered = 0
infected = 1
susceptible = total_pop - infected - recovered

# A list of days, 0-120
days = list(range(0, 120))

# We can calculate the R0 by dividing the effective contact rate by the recovery rate.
print("R0 is", effective_contact_rate / recovery_rate)
R0 is 1.2

Our R0 is 1.2. Any R0 less than 1 means that the disease would not spread in the population and would die out. With an R0 of 1.2, we should expect a very slow spread.

# Use Scipy's differential equation method
ret = odeint(deriv,
             [susceptible, infected, recovered],
             days,
             args=(total_pop, effective_contact_rate, recovery_rate))
S, I, R = ret.T

# Build a dataframe of S, I, and R
df_rural = pd.DataFrame({
    'suseptible': S,
    'infected': I,
    'recovered': R,
    'day': days
})

# Look at days 0-10
df_rural.head(11)
suseptible infected recovered day
0 99.00000 1.00000 0.00000 0
1 98.69642 1.04765 0.25593 1
2 98.37953 1.09655 0.52393 2
3 98.04909 1.14661 0.80430 3
4 97.70492 1.19775 1.09732 4
5 97.34689 1.24985 1.40326 5
6 96.97488 1.30280 1.72232 6
7 96.58885 1.35644 2.05471 7
8 96.18879 1.41062 2.40059 8
9 95.77476 1.46518 2.76006 9
10 95.34687 1.51993 3.13319 10

#collapse-hide
fig, ax = plt.subplots(figsize=(16,9))
ax.plot(df_rural['day'], df_rural['infected'])
ax.set_xlabel('Days', fontsize=20)
ax.set_ylabel('Number Infected', fontsize=20)
ax.set_title('SIR Model of COVID-19 Spread - Rural Area with Population of 100', fontsize=26);

We see that the virus would spread to 2 people and then effectively die out. While this model does not consider distance between the population, it is clear that with only occasional contact with the surrounding population the disease would not spread. In this case there is no cause for concern from an isolated rural infection.

Determining the Risk

My goal for this analysis is not to say that the risk of coronavirus in Mongolia is over. On the contrary, there are still infections being announced each week of those in quarantine arriving from abroad. In addition, some countries still have major challenges in curbing rising infections.

Instead, my goal was to assess the risk of community transmission over the past several months given what we know about quarantine and testing procedures in Mongolia. While this analysis is far from a comprehensive assessment of risk for the entire country, it does answer some of the most common questions I've heard from Mongolians worried about coronavirus. A few examples:

  • All it takes is one infected person to cause a disaster.
    • We have found that if one infected person was actually in public, we would almost surely have found out by now.
  • How do we know someone can't just sneak out of quarantine?
    • If someone did sneak out of quarantine, we would very shortly see several people with symptoms at hospitals. These cases would very quickly discovered and contact tracing would rapidly occur. Just remember the figures, these cases would be quickly discovered.
  • Can you really trust that the government is doing its job?
    • With regards to the quarantine, I believe the evidence I have given supports that the government is in fact doing its job. For other areas, I would defer to journalists. It is in their area of responsibility to question the government and inform the Mongolian people in this respect.
  • Better safe than sorry.
    • There is risk in nearly anything we do. I've shown that you are more likely to be struck by lightning several times in one year than for a cluster of COVID-19 infections to be missed. If this incredibly small risk is causing you to change your behavior, then you also may want to rethink walking outside in the rain.

My hope is that this work and the work of others can start a discussion of how best to move forward given the times we are in. Staying closed off indefinitely is most surely a plan that will have very serious consequences. Yet exactly how restrictions should be relaxed and things reopened is a larger discussion for the Mongolian people and the government.

About the Author

Robert Ritz is a data scientist and educator based in Ulaanbaatar, Mongolia. He has a passion for data science for social good. You can contact him at robertritz@outlook.com.