P.V. Brennan¹, L.P. Brennan²

Abstract

The reproduction number, R0, is commonly used, and sometimes misused, in conjunction with the classic Kermack and McKendrick theory based on the assumption of homogeneity, in order to estimate herd immunity threshold (HIT). This provides a crude first estimate of HIT, with more elaborate modelling required to arrive at a more realistic value. Early estimates of HIT for Covid-19 were based on this simplistic homogeneous approach, yielding high HIT values that have since been revised downwards with more sophisticated network modelling taking account of R0 heterogeneity and with reference to the low HIT found from serological sampling in Stockholm County. The aim of this paper is to describe a simple model in which host susceptibility is directly linked to the heterogeneous R0 distribution, to shed further light on the mechanisms involved and to arrive at a bimodal R0 distribution consistent with the Covid-19 HIT observed in Stockholm County.

1. Introduction

The herd immunity threshold (HIT) is a widely-used concept to estimate the infection rate within a population at which infection ceases to grow exponentially [1, 2]. A basic expression is frequently used for this estimate based on the assumption of homogeneity: each member of the population having equal reproduction number, R0. However, this assumption is simplistic and unrealistic in a great many cases and it is often observed that the actual HIT is substantially lower than given by this elementary approach [3].

This work describes a simple model to take account of heterogeneity of R0 in the estimate of HIT, by assuming that host susceptibility is directly proportional to reproduction number. A variety of R0 distributions are explored to examine the effect, followed by working backwards from real data to arrive at a bimodal distribution that yields a HIT close to that observed with Covid-19 in Stockholm County.

2. Method

The HIT, the proportion of immunity within a given population beyond which the effective reproduction number is unity, is easily deduced and given by:

$$p_c = \frac{R_0 - 1}{R_0} \tag{1}$$

This is based on a very simple, naïve model of a homogeneous population in which a given infected individual is equally likely to infect R0 other individuals, all of whom are susceptible hosts at the outset. It is assumed that the entire population has the same R0 value, i.e. R0 is constant with p, the cumulative infection variable, and the same susceptibility to infection.
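The step from the homogeneity assumption to Equation (1) can be written out explicitly. The following is a sketch only; the symbol R_eff for the effective reproduction number once a proportion p of the population is immune is introduced here for illustration and is not used elsewhere in the paper. Under homogeneity, each contact is with a susceptible host with probability (1 − p), so:

```latex
R_{\mathrm{eff}}(p) = R_0\,(1 - p), \qquad
R_{\mathrm{eff}}(p_c) = 1
\;\Longrightarrow\;
p_c = 1 - \frac{1}{R_0} = \frac{R_0 - 1}{R_0}.
% Worked example: R_0 = 2.4 gives p_c = 1.4 / 2.4 = 7/12 \approx 58.3\%.
```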

1 Dept of Electronic & Electrical Engineering, University College London. 2 Dept of Mechanical Engineering, University College London.

medRxiv preprint doi: https://doi.org/10.1101/2020.05.19.20104596; this version posted May 22, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

In reality, R0 must vary, since some people are more likely than others to transmit infection due to occupation, environment, lifestyle and other factors. For instance, an infected nurse may be many times as likely to infect others as a single person working from home. Hence there is actually a statistical distribution of R0 across the population [4, 5]. If R0 is variable (heterogeneous) but host susceptibility is assumed to remain constant then it is valid to use the mean value of R0 in the population to calculate the herd immunity threshold,

$$p_c = \frac{\bar{R}_0 - 1}{\bar{R}_0} \quad \text{where} \quad \bar{R}_0 = \int_0^1 R_0(p)\,dp \tag{2}$$

and it is common practice to do this by default [6]. In practice, however, host susceptibility is also variable and in general those with a higher R0 value are likely to exhibit greater susceptibility to infection for the very same reasons that they are more likely to transmit infection to others, for instance by working in professions such as medical practice, social care or hospitality where they are frequently in contact with other infected people.

Figure 1. Illustration of change in R-distribution after δp of the population becomes infected with susceptibility proportional to R0.

A reasonable premise and first approximation that is used in this work is to assume that host susceptibility is in direct proportion to the associated R0 value, i.e. the probability of becoming infected is proportional to the probability of infecting others. Based on this principle, the R0 distribution after a small proportion δp of the population have become infected may be obtained by adjustment of the p-values. This is illustrated in Figure 1 with reference to a population with two R0 values, R01 and R02 (1 and 3 for illustration) in initial proportion p01 : p02. Infection occurs in proportion to the respective R0 values, giving a change in distribution as follows:

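$$p_{11} = p_{01}\left(1 - \frac{R_{01}\,\delta p}{p_{01}R_{01} + p_{02}R_{02}}\right) \quad \text{and} \quad p_{12} = p_{02}\left(1 - \frac{R_{02}\,\delta p}{p_{01}R_{01} + p_{02}R_{02}}\right) \tag{3}$$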
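A single application of Equation (3) can be sketched in a few lines of Python. The function name `infect_step`, the equal initial proportions and the 1% step size are assumptions chosen here for illustration; only the R0 values of 1 and 3 come from the text.

```python
def infect_step(p, r0, dp):
    """Apply Equation (3) once.

    p  : current susceptible share of each cohort
    r0 : R0 value of each cohort
    dp : proportion of the whole population newly infected in this step
    """
    weighted = sum(pn * rn for pn, rn in zip(p, r0))  # p01*R01 + p02*R02
    # Each cohort loses susceptibles in proportion to p_n * R0_n.
    return [pn * (1.0 - rn * dp / weighted) for pn, rn in zip(p, r0)]


# Illustrative values only: two equal cohorts with R0 = 1 and R0 = 3.
p0 = [0.5, 0.5]
r0 = [1.0, 3.0]
p1 = infect_step(p0, r0, dp=0.01)

loss_low = p0[0] - p1[0]    # infections drawn from the R0 = 1 cohort
loss_high = p0[1] - p1[1]   # infections drawn from the R0 = 3 cohort
print(p1, loss_low, loss_high)
```

With these inputs the cohort with R0 = 3 supplies three times as many of the new infections as the cohort with R0 = 1, and the two losses together add up to δp.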

It is clear from Figure 1 that cohorts with higher R0 values have a proportionately higher share of infection, biasing the distribution towards lower R0 values as the infection rate increases. It follows from Equation (3) that the R-value after δp infections is given, in the general case, by the relation,

$$R_{m+1} = \sum_{n=1}^{N} p_{mn}\left(1 - \frac{R_{0n}\,\delta p}{\sum_{k=1}^{N} p_{mk}R_{0k}}\right)R_{0n} = R_m - \frac{\delta p \sum_{n=1}^{N} p_{mn}R_{0n}^{2}}{R_m} \tag{4}$$


This process is repeated in a recursive fashion, each time updating the probability distribution, pmn, according to Equation (3). The herd immunity threshold is reached when the reproduction number falls to 1. A non-recursive approximation to the last result is obtained by regarding δp as the total infected proportion of the population, Δp, giving for a continuous distribution,

$$R = \bar{R}_0 - \frac{\Delta p \int_0^1 R_0^{2}(p)\,dp}{\bar{R}_0} \tag{5}$$

and, by equating this to 1, a herd immunity threshold of:

$$p_{c\,(\text{susceptibility-adjusted})} = \frac{(\bar{R}_0 - 1)\,\bar{R}_0}{\int_0^1 R_0^{2}(p)\,dp} \tag{6}$$

For the homogeneous case of a constant reproduction number across the population, the last result converges to that of Equation (1), as would be expected.

3. Results

The recursive method described in Section 2 (Equations (3) and (4)) is used to explore the HITs obtained with a range of candidate R0 distributions in relation to the HIT obtained with a homogeneous population. A mean R0 of 2.4 is taken for all cases. Continuous distributions are used, with probability density functions presented at the outset and calculated at the threshold of herd immunity, the area beneath the curve being unity at the outset and equal to the HIT at herd immunity.

Figure 3 shows the application of the approach to a homogeneous population. The mean R0 value is of course 2.4, and the herd immunity thresholds are 7/12 = 58.3% in both the homogeneous and heterogeneous, susceptibility-adjusted models since all individuals have equal R0 values and host susceptibilities. The R value declines linearly with infection rate from 2.4, reaching 1 at 58.3% infection rate, as expected from the simple homogeneous model.

Figure 3. Homogeneous and susceptibility-adjusted herd immunity thresholds, constant R0 value.
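The recursive procedure of Equations (3) and (4) can be sketched as a short loop, shown here as a minimal illustration rather than the authors' implementation. The function names, the discrete-cohort representation and the step size `dp` are assumptions made for this sketch. The homogeneous case serves as a sanity check, since both the recursion and the non-recursive approximation of Equation (6) should then return 7/12.

```python
def recursive_hit(p, r0, dp=1e-4):
    """Iterate Equations (3) and (4) until the reproduction number falls to 1.

    p  : initial population share of each cohort (should sum to 1)
    r0 : R0 value of each cohort
    dp : proportion of the population infected per step (assumed small)
    Returns the cumulative infected proportion at which R first reaches 1.
    """
    p = list(p)
    infected = 0.0
    while True:
        r = sum(pn * rn for pn, rn in zip(p, r0))  # current R-value, R_m
        if r <= 1.0:
            return infected
        # Equation (3): cohorts lose susceptibles in proportion to p_n * R0_n.
        p = [pn * (1.0 - rn * dp / r) for pn, rn in zip(p, r0)]
        infected += dp


def closed_form_hit(p, r0):
    """Non-recursive approximation: (mean R0 - 1) * mean R0 / mean of R0 squared."""
    mean_r0 = sum(pn * rn for pn, rn in zip(p, r0))
    mean_sq = sum(pn * rn * rn for pn, rn in zip(p, r0))
    return (mean_r0 - 1.0) * mean_r0 / mean_sq


# Homogeneous population: a single cohort with R0 = 2.4.
hit_recursive = recursive_hit([1.0], [2.4])
hit_closed = closed_form_hit([1.0], [2.4])
print(hit_recursive, hit_closed)
```

Both values agree with 7/12 to within the chosen step size; a smaller `dp` tightens the agreement at the cost of more iterations.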

Figure 4 shows a population with uniformly-distributed R0 over the range 0 to 4.8. At herd immunity, the distribution is biased heavily towards the lower R0 values with a steadily declining trend with R0. There is a significant reduction in HIT from 58.3% to 46.1%, with a corresponding equivalent R0 value (the value which would produce the same HIT from the homogeneous model) of 1.85.


Figure 4. Homogeneous and susceptibility-adjusted herd immunity thresholds, uniformly-distributed R0.
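The uniform case can be cross-checked numerically. The sketch below relies on a limiting form that is derived here for illustration and is not stated in the paper: as δp tends to zero, Equation (3) implies that each cohort's susceptible share decays as p_n·exp(−R0n·τ), where τ is an auxiliary variable introduced for this sketch. The threshold is then the infected proportion at the τ for which the R-value equals 1, found by bisection. The discretisation into 1000 cohorts is also an assumption.

```python
import math


def hit_small_step_limit(p, r0):
    """HIT in the assumed dp -> 0 limit of Equation (3).

    p  : initial population share of each cohort (sums to 1)
    r0 : R0 value of each cohort
    """
    def r_value(tau):
        # R-value when cohort n retains a share p_n * exp(-R0_n * tau).
        return sum(pn * rn * math.exp(-rn * tau) for pn, rn in zip(p, r0))

    if r_value(0.0) <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while r_value(hi) > 1.0:   # expand until the root R(tau) = 1 is bracketed
        hi *= 2.0
    for _ in range(60):        # bisection on tau
        mid = 0.5 * (lo + hi)
        if r_value(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    tau = 0.5 * (lo + hi)
    susceptible = sum(pn * math.exp(-rn * tau) for pn, rn in zip(p, r0))
    return 1.0 - susceptible


# Uniform R0 on [0, 4.8], discretised into equal cohorts at bin midpoints.
n = 1000
r0 = [4.8 * (i + 0.5) / n for i in range(n)]
p = [1.0 / n] * n

hit = hit_small_step_limit(p, r0)
equivalent_r0 = 1.0 / (1.0 - hit)  # homogeneous R0 that gives the same HIT
print(hit, equivalent_r0)
```

Under these assumptions the result lies close to the 46.1% threshold and the equivalent R0 of 1.85 quoted above; small differences are to be expected from the step size and discretisation used in the original calculation, which are not given in the text.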

Figure 5 shows the case of a Rayleigh-distributed population with a mean value of 2.4. Infection occurs more rapidly in the higher R0 values, skewing the distribution towards lower values. This results in a modest reduction in HIT from 58.3% to 49.2% with the corresponding equivalent R0 value that would produce the same HIT from the homogeneous model of 1.97. Note that in this case, as in all cases, the total area under the red and green curves is 1 and the area under the green curve is the HIT. It is clear that the HIT value can only be reduced by this mechanism, regardless of the R0 distribution, relative to the homogeneous model.


Figure 5. Homogeneous and susceptibility-adjusted herd immunity thresholds, Rayleigh-distributed R0.
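For comparison, the non-recursive approximation obtained by setting Equation (5) equal to 1 can be evaluated by hand for the uniform and Rayleigh cases. The second moments used below are standard results for those distributions; the symbols a (upper limit of the uniform range) and σ (Rayleigh scale parameter) are introduced here for illustration and do not appear in the paper.

```latex
p_c \approx \frac{(\bar{R}_0 - 1)\,\bar{R}_0}{\int_0^1 R_0^{2}(p)\,dp},
\qquad \bar{R}_0 = 2.4, \quad (\bar{R}_0 - 1)\,\bar{R}_0 = 3.36

% Uniform on [0, a] with a = 4.8:
\int_0^1 R_0^{2}(p)\,dp = \frac{a^{2}}{3} = 7.68
\;\Longrightarrow\; p_c \approx \frac{3.36}{7.68} \approx 43.8\%

% Rayleigh with mean \sigma\sqrt{\pi/2} = 2.4:
\int_0^1 R_0^{2}(p)\,dp = 2\sigma^{2} = \frac{4 \times 2.4^{2}}{\pi} \approx 7.33
\;\Longrightarrow\; p_c \approx \frac{3.36}{7.33} \approx 45.8\%
```

Both values fall a few percentage points below the recursive results of 46.1% and 49.2%. This is consistent with the approximation holding the rate of decline of R fixed at its initial value, whereas in the recursion the decline slows as the high-R0 cohorts are depleted.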

The final case is an attempt to arrive at a distribution close to the observed 17% HIT from random serological testing of Covid-19 in Stockholm County [7]. This requires a quite severe bimodal or multi-modal distribution. An example of such a bimodal distribution fitting the Stockholm data is shown in Figure 6, comprising a Rayleigh region with a mean close to 1 and a Gaussian region with a mean of 12, in the ratio 85:15, not far from the 80:20 Pareto rule often cited in the context of super-spreaders [8]. This, in practice, may represent the situation where the bulk of the population has a relatively low R0, whilst a minority cohort in professions or situations with much greater exposure to infection, such as medical or public-facing occupations, have a much higher R0, so-called ‘super-spreaders’ [9].

From Figure 6, the R-value trend with the postulated bimodal distribution is seen to decline very rapidly with infection rate and reaches herd immunity at just 17.6% prevalence. At the onset of herd immunity, most of the high-R0 cohort are infected whilst only a small minority of the low-R0 cohort are infected. The R0 value found in Stockholm County and used here was 2.4, which would correspond to a HIT of 58.3% using the elementary Kermack and McKendrick approach based on a homogeneous population, and was indeed the estimate made by Ferguson [10], some 3.4 times the observed value.


Figure 6. Homogeneous and susceptibility-adjusted herd immunity thresholds, bimodal distributed R0.
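The qualitative behaviour described above can be illustrated with a much cruder stand-in for the bimodal distribution. The sketch below is not the distribution of Figure 6: as an assumption made purely for illustration, each mode is collapsed to a single value, with 15% of the population at R0 = 12 and the remaining 85% at the value (about 0.71) that keeps the overall mean at 2.4. The function name and step size are likewise hypothetical, and only rough agreement with the reported 17.6% should be expected.

```python
def two_cohort_hit(share_high, r0_low, r0_high, dp=1e-5):
    """Recursive Equations (3)-(4) for a two-cohort stand-in distribution.

    Returns (HIT, fraction of the low-R0 cohort infected at HIT,
             fraction of the high-R0 cohort infected at HIT).
    """
    p_low0, p_high0 = 1.0 - share_high, share_high
    p_low, p_high = p_low0, p_high0
    infected = 0.0
    while True:
        r = p_low * r0_low + p_high * r0_high  # current R-value
        if r <= 1.0:
            break
        # Equation (3) applied to both cohorts using the same R-value.
        p_low *= 1.0 - r0_low * dp / r
        p_high *= 1.0 - r0_high * dp / r
        infected += dp
    return infected, 1.0 - p_low / p_low0, 1.0 - p_high / p_high0


# Assumed stand-in parameters (see the note above): 85:15 split, mean R0 = 2.4.
share_high = 0.15
r0_high = 12.0
r0_low = (2.4 - share_high * r0_high) / (1.0 - share_high)

hit, low_infected, high_infected = two_cohort_hit(share_high, r0_low, r0_high)
print(hit, low_infected, high_infected)
```

Even this two-value simplification reaches herd immunity at a prevalence below 20%, with the large majority of the high-R0 cohort infected and well under a fifth of the low-R0 cohort infected.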

The recursive model described in Section 2 is extended to calculate the total infection rate versus time/generation for the bimodal case representing Stockholm County, with the result as shown in Figure 7. The final infection rate is 32.7% in the absence of intervention or 28.6%, 24.4% and 17.9% with intervention equivalent to arresting the cases at infection rates of 5%, 10% and 15%, respectively. This suggests that intervention to temporarily slow or halt the infection rate just below the HIT is effective at limiting the final infection rate to just over the HIT. This would require careful timing and monitoring of infection rate across the population, which may not be possible in practice.


Figure 7. Infection rate versus time/generation estimate based on Stockholm County data and the susceptibility-adjusted model.

4. Discussion and Conclusion

An enhanced yet simple model for herd immunity threshold has been described based on a non-uniform reproduction number distribution and the assumption that host susceptibility is directly proportional to reproduction number. Results for a number of R0 distributions show that the herd immunity threshold is reduced relative to the homogeneous model and may be substantially reduced, for instance by a factor of 3.4 from the Stockholm County data used here, if there is a large variation in R0 distribution across the population. The herd immunity threshold is strongly influenced not just by the mean R0 value but by its distribution, and simply using the mean R0 value for a given population as done by Ferguson [10] for Covid-19 is likely to lead to an unrealistic overestimate.

An attempt has been made to postulate a distribution that yields the HIT of 17% observed from serological sampling of Covid-19 in Stockholm County. This requires a quite severe bimodal distribution with a cohort of super-spreaders having much higher R0 and associated susceptibility than the bulk of the population. Whether this situation is realistic cannot easily be determined, but it serves to illustrate the principle and offer insight into the likely R0 distribution responsible for the observed result. In addition to demographic R0 heterogeneity, spatial and other heterogeneities are likely to contribute to modulation of the HIT, as described by network models [5], so it is likely that the simple mechanism modelled here is a partial explanation for the observed low HIT seen for Covid-19 in Stockholm County.

The significance for infections such as Covid-19 is that in order to accurately estimate HIT it is crucial to take account not just of the basic R0 value averaged over a given population but also of its distribution, which results in a reduced and possibly substantially-reduced HIT estimate.


References

1. G. Gonçalves, ‘Herd immunity: recent uses in vaccine assessment.’ Expert Rev. Vaccines 7, 1493-1506, 2008.

2. P. Fine, K. Eames, D. L. Heymann, ‘Herd immunity: a rough guide,’ Clin. Infect. Dis. 52, 911-916, 2011.

3. T. Britton, F. Ball, P. Trapman, ‘The disease-induced herd immunity level for COVID-19 is substantially lower than the classical herd immunity level’. arXiv:2005.03085, 2020.

4. M. Gabriela M. Gomes, Rodrigo M. Corder, Jessica G. King, Kate E. Langwig, Caetano Souto-Maior, Jorge Carneiro, Guilherme Gonçalves, Carlos Penha-Gonçalves, Marcelo U. Ferreira, Ricardo Aguas, ‘Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold’, medRxiv preprint doi: https://doi.org/10.1101/2020.04.27.20081893, 12 May 2020.

5. Laurent Hébert-Dufresne, Benjamin M. Althouse, Samuel V. Scarpino, Antoine Allard, ‘Beyond R0: Heterogeneity in secondary infections and probabilistic epidemic forecasting’, medRxiv preprint doi: https://doi.org/10.1101/2020.02.10.20021725, 16 April 2020.

6. K. O. Kwok, F. Lai, W. I. Wei, S. Y. S. Wong, J. Tang, ‘Herd immunity – estimating the level required to halt the COVID-19 epidemics in affected countries’, J. Infect. 10.1016/j.jinf.2020.03.027, 2020.

7. https://www.folkhalsomyndigheten.se/contentassets/2da059f90b90458d8454a04955d1697f/skattning-peakdag-antal-infekterade-covid-19-utbrottet-stockholms-lan-februari-april-2020.pdf, April 2020.

8. R. A. Stein, ‘Super-spreaders in infectious diseases’, International Journal of Infectious Diseases, Volume 15, Issue 8, Pages e510-e513, https://doi.org/10.1016/j.ijid.2010.06.020, August 2011.

9. J.O. Lloyd-Smith, S.J. Schreiber, P.E. Kopp, W.M. Getz, ‘Superspreading and the effect of individual variation on disease emergence’, Nature 438, 355, 2005.

10. Neil M Ferguson et al., Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Imperial College COVID-19 Response Team Report 9, https://spiral.imperial.ac.uk:8443/handle/10044/1/77482, 16 March 2020.
