# Incidence (epidemiology)

In epidemiology, incidence is a measure of the probability of occurrence of a given medical condition in a population within a specified period of time. Although sometimes loosely expressed simply as the number of new cases during some time period, it is better expressed as a proportion or a rate[1] with a denominator.

## Incidence proportion

Incidence proportion (IP), also known as cumulative incidence, is defined as the probability that a particular event, such as occurrence of a particular disease, has occurred before a given time.[2]

It is calculated by dividing the number of new cases during a given period by the number of subjects at risk in the population at the beginning of the study. Where the period of time considered is an entire lifetime, the incidence proportion is called lifetime risk.[3]

For example, if a population initially contains 1,000 disease-free persons and 28 develop a condition over two years of observation, the incidence proportion is 28 cases per 1,000 persons, i.e. 2.8%, over that two-year period.
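The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a standard library routine; the function name `incidence_proportion` and its parameter names are chosen here for the example, and the figures are those of the example above (28 new cases among 1,000 persons initially at risk).

```python
def incidence_proportion(new_cases: int, at_risk_at_start: int) -> float:
    """New cases during the period divided by the population initially at risk."""
    if at_risk_at_start <= 0:
        raise ValueError("at_risk_at_start must be positive")
    if not 0 <= new_cases <= at_risk_at_start:
        raise ValueError("new_cases must be between 0 and at_risk_at_start")
    return new_cases / at_risk_at_start


ip = incidence_proportion(new_cases=28, at_risk_at_start=1000)
print(f"{ip:.1%}")  # 2.8%
```

The result is a proportion, so it only has meaning together with the length of the observation period it refers to.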

IP is related to incidence rate (IR) and duration of exposure (D) as follows:[4]

$IP(t) = 1 - e^{-IR(t) \cdot D}$
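As a numerical illustration of this relation, the sketch below evaluates it for a rate that is assumed constant over the whole duration. The function name `ip_from_rate` and the input values (0.014 cases per person-year over 2 years) are illustrative assumptions, not taken from the cited source.

```python
import math


def ip_from_rate(incidence_rate: float, duration: float) -> float:
    """Incidence proportion implied by a constant incidence rate.

    Computes 1 - exp(-IR * D). math.expm1 is used because it stays
    accurate when IR * D is small.
    """
    if incidence_rate < 0 or duration < 0:
        raise ValueError("incidence_rate and duration must be non-negative")
    return -math.expm1(-incidence_rate * duration)


# Illustrative values: 0.014 cases per person-year, observed for 2 years.
ip = ip_from_rate(incidence_rate=0.014, duration=2.0)
print(f"{ip:.4f}")  # 0.0276
```

For small values of IR · D the result is close to the simple product IR · D (0.028 here), which is why rate and proportion are often numerically similar for rare conditions.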

## Incidence rate

The incidence rate is a measure of the frequency with which a disease or other incident occurs over a specified time period.[5][6] It is also known as the incidence density rate or person-time incidence rate,[7] when the denominator is the combined person-time of the population at risk (the sum of the time duration of exposure across all persons exposed).[8]

In the same example as above, the incidence rate is 14 cases per 1000 person-years, because the incidence proportion (28 per 1,000) is divided by the number of years (two). Using person-time rather than just time handles situations where the amount of observation time differs between people, or when the population at risk varies with time.[9]
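When observation time differs between people, the denominator is obtained by summing each person's time at risk. The sketch below shows one way to do this; the `FollowUp` record, the function name and the four-person cohort are hypothetical and serve only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    years_at_risk: float  # time observed while still disease-free
    became_case: bool     # True if the person developed the condition


def person_time_incidence_rate(records: list[FollowUp]) -> float:
    """New cases divided by the total person-years at risk."""
    person_years = sum(r.years_at_risk for r in records)
    if person_years <= 0:
        raise ValueError("total person-time must be positive")
    cases = sum(r.became_case for r in records)
    return cases / person_years


# Hypothetical cohort with unequal follow-up: 2 cases over 5.0 person-years.
cohort = [
    FollowUp(2.0, False),
    FollowUp(0.5, True),
    FollowUp(1.5, False),
    FollowUp(1.0, True),
]
rate = person_time_incidence_rate(cohort)
print(f"{rate * 1000:.0f} per 1000 person-years")  # 400 per 1000 person-years
```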

Use of this measure implies the assumption that the incidence rate is constant over different periods of time, such that for an incidence rate of 14 per 1000 person-years, 14 cases would be expected for 1000 persons observed for 1 year or 50 persons observed for 20 years.[10] When this assumption is substantially violated, such as in describing survival after diagnosis of metastatic cancer, it may be more useful to present incidence data as a plot of cumulative incidence over time that takes loss to follow-up into account, using a Kaplan-Meier plot.
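The constant-rate assumption can be made concrete with a short sketch: under it, the expected number of cases depends only on the total person-time, not on how that time is split between people and years. The function name `expected_cases` is an assumption made for this illustration.

```python
def expected_cases(rate_per_person_year: float, persons: int, years_each: float) -> float:
    """Expected new cases under a constant incidence rate."""
    return rate_per_person_year * persons * years_each


RATE = 14 / 1000  # 14 cases per 1000 person-years

many_people_short = expected_cases(RATE, persons=1000, years_each=1)
few_people_long = expected_cases(RATE, persons=50, years_each=20)
print(f"{many_people_short:.1f} {few_people_long:.1f}")  # 14.0 14.0
```

Both scenarios amount to 1000 person-years, so the expected count is the same; this equivalence is exactly what fails when the rate changes over time.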

## Incidence vs. prevalence

Incidence should not be confused with prevalence, which is the proportion of cases in the population at a given time rather than rate of occurrence of new cases. Thus, incidence conveys information about the risk of contracting the disease, whereas prevalence indicates how widespread the disease is. Prevalence is the proportion of the total number of cases to the total population and is more a measure of the burden of the disease on society with no regard to time at risk or when subjects may have been exposed to a possible risk factor. Prevalence can also be measured with respect to a specific subgroup of a population (see: denominator data). Incidence is usually more useful than prevalence in understanding the disease etiology: for example, if the incidence rate of a disease in a population increases, then there is a risk factor that promotes the incidence.

For example, consider a disease that takes a long time to cure and was widespread in 2002 but dissipated in 2003. This disease will have both high incidence and high prevalence in 2002, but in 2003 it will have a low incidence yet will continue to have a high prevalence (because it takes a long time to cure, the fraction of individuals affected remains high). In contrast, a disease that has a short duration may have a low prevalence and a high incidence. When the incidence is approximately constant for the duration of the disease, prevalence is approximately the product of disease incidence and average disease duration: prevalence ≈ incidence × duration. The importance of this equation is in the relation between prevalence and incidence; for example, when the incidence increases and the average duration stays the same, the prevalence must also increase. Note that this relation does not hold for age-specific prevalence and incidence, where the relation becomes more complicated.[11]
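The product relation can be illustrated numerically. The sketch below assumes an approximately constant incidence rate, as stated above, and uses made-up rates and durations for a long-lasting and a short-lived disease; the function name `approximate_prevalence` is likewise hypothetical.

```python
def approximate_prevalence(incidence_rate: float, mean_duration: float) -> float:
    """Prevalence approximated as incidence rate times average disease duration.

    Only reasonable when incidence is roughly constant over the
    duration of the disease.
    """
    if incidence_rate < 0 or mean_duration < 0:
        raise ValueError("inputs must be non-negative")
    return incidence_rate * mean_duration


# Hypothetical long-lasting disease: 0.01 cases per person-year, lasting 5 years.
chronic = approximate_prevalence(0.01, 5.0)
# Hypothetical short-lived disease: 0.20 cases per person-year, lasting 0.02 years.
acute = approximate_prevalence(0.20, 0.02)
print(f"chronic: {chronic:.1%}, acute: {acute:.1%}")  # chronic: 5.0%, acute: 0.4%
```

The short-lived disease has a twenty-fold higher incidence but a much lower prevalence, matching the contrast described in the text.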

### Example

Consider the following example. Say you are looking at a sample population of 225 people, and want to determine the incidence rate of developing HIV over a 10-year period:

• At the beginning of the study (t=0) you find 25 cases of existing HIV. These people are not counted as they cannot develop HIV a second time.
• A follow-up at 5 years (t=5 years) finds 20 new cases of HIV.
• A second follow-up at the end of the study (t=10 years) finds 30 new cases.

If you were to measure prevalence you would simply take the total number of cases (25 + 20 + 30 = 75) and divide by your sample population (225). So prevalence would be 75/225 = 0.33 or 33% (by the end of the study). This tells you how widespread HIV is in your sample population, but little about the actual risk of developing HIV for any person over a coming year.

To measure incidence you must take into account how many years each person contributed to the study, and when they developed HIV. When it is not known exactly when a person develops the disease in question, epidemiologists frequently use the actuarial method, and assume it was developed at a half-way point between follow-ups. In this calculation:

• At 5 yrs you found 20 new cases, so you assume they developed HIV at 2.5 years, thus contributing (20 * 2.5) = 50 person-years of disease-free life.
• At 10 years you found 30 new cases. These people did not have HIV at 5 years, but did at 10, so you assume they were infected at 7.5 years, thus contributing (30 * 7.5) = 225 person-years of disease-free life. That is a total of (225 + 50) = 275 person-years so far.
• You also want to account for the 150 people who never had or developed HIV over the 10-year period, (150 * 10) contributing 1500 person-years of disease-free life.

That is a total of (1500 + 275) = 1775 person-years of disease-free life. Now take the 50 new cases of HIV, and divide by 1775 to get 0.028, or 28 cases of HIV per 1000 person-years. In other words, if you were to follow 1000 people for one year, you would expect to see about 28 new cases of HIV. This is a much more accurate measure of risk than prevalence.
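The arithmetic of this example can be reproduced with a short script. It is a sketch of the midpoint (actuarial) assumption as described above, not a general survival-analysis routine; the function name `actuarial_person_years` and the tuple layout `(interval_start, interval_end, new_cases)` are choices made for this illustration.

```python
def actuarial_person_years(intervals, never_cases: int, study_years: float) -> float:
    """Disease-free person-years, assuming onset at the midpoint of each interval.

    intervals: iterable of (interval_start, interval_end, new_cases) tuples.
    never_cases: people still disease-free at the end of the study.
    """
    total = 0.0
    for start, end, new_cases in intervals:
        midpoint = (start + end) / 2  # assumed time of onset
        total += new_cases * midpoint
    total += never_cases * study_years
    return total


existing_cases = 25                     # prevalent at t=0, excluded from the at-risk group
intervals = [(0, 5, 20), (5, 10, 30)]   # follow-ups at 5 and 10 years
sample_size = 225

new_cases = sum(n for _, _, n in intervals)                # 50
never_cases = sample_size - existing_cases - new_cases     # 150
person_years = actuarial_person_years(intervals, never_cases, 10)

rate = new_cases / person_years
prevalence = (existing_cases + new_cases) / sample_size

print(person_years)                                   # 1775.0
print(f"{rate * 1000:.1f} per 1000 person-years")     # 28.2 per 1000 person-years
print(f"{prevalence:.1%}")                            # 33.3%
```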

## See also

• Attack rate
• Attributable risk
• Denominator data
• Rate ratio

## References

1. "INCIDENCE - Epidemiology". Encyclopaedia Britannica. Retrieved 3 April 2020.
2. Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9.
3. Rychetnik L, Hawe P, Waters E, Barratt A, Frommer M (July 2004). "A glossary for evidence based public health". J Epidemiol Community Health. 58 (7): 538–45. doi:10.1136/jech.2003.011585. PMC 1732833. PMID 15194712.
4. Bouyer, Jean; Hémon, Denis; Cordier, Sylvaine; Derriennic, Francis; Stücker, Isabelle; Stengel, Bénédicte; Clavel, Jacqueline (2009). Épidemiologie principes et méthodes quantitatives. Paris: Lavoisier.
5. Hargrave, Marshall. "What Does the Incidence Rate Measure?". Investopedia. Retrieved 2019-09-30.
6. Monson, Richard R. (1990-04-25). Occupational Epidemiology, Second Edition. CRC Press. p. 27. ISBN 978-0-8493-4927-0.
7. Last, John M., ed. (2001). A Dictionary of Epidemiology (4 ed.). New York, NY: Oxford University Press. ISBN 978-0-19-514169-6.
8. "Principles of Epidemiology - Lesson 3 - Section 2". Centers for Disease Control and Prevention. 2012-05-18. Retrieved 2021-01-13.
9. Coggon D, Rose G, Barker DJ (1997). "Quantifying diseases in populations". Epidemiology for the Uninitiated (4th ed.). BMJ. ISBN 978-0-7279-1102-5.
10. Dunn, Olive Jean; Clark, Virginia A. (2009). Basic statistics: a primer for the biomedical sciences (4th ed.). Hoboken, N.J.: John Wiley & Sons. pp. 3–5. ISBN 9780470496855. Retrieved 9 May 2016.
11. Brinks R (2011). "A new method for deriving incidence rates from prevalence data and its application to dementia in Germany". arXiv:1112.2720.