



Incidence and prevalence predictions for cervical cancer in the Czech Republic in 2015

L. Dušek, R. Vyzula, J. Abrahámová, J. Fínek, L. Petruželka, J. Vorlíček, O. Májek, J. Koptíková, T. Pavlík, J. Mužík

1. Methodology of predictions

The aim of cancer incidence and prevalence predictions is to provide background information for a rational discussion of the costs of cancer therapy and of the number of patients likely to be treated. The specific outputs of the project include:

  • an audit of the population-based data in the Czech National Cancer Registry and the definition of a reference dataset,
  • an estimate of cancer incidence rates in 2015 and of the number of newly diagnosed patients who would receive anticancer treatment as primary therapy,
  • estimates of cancer prevalence by clinical stage in 2015, and estimated numbers of patients whose cancers would relapse or progress and who would therefore be treated in 2015,
  • methodological standards and reference data that would allow the analyses to be extended (other diagnoses, other drugs, other modalities of cancer therapy, regional analyses, etc.).

1.1. Source data

Survival analyses are based on valid population-based data obtained from the legally designated data administrators. The data is analysed in de-identified form, i.e. without any direct or indirect identification of individuals. The following data sources were used:

  • Czech National Cancer Registry (CNCR; data administrator and provider: Institute of Health Information and Statistics of the Czech Republic, IHIS) – an epidemiological database on malignant tumours containing more than 1.8 million records since 1977. The reference dataset defined for the period 1995–2010 (which is more relevant to recent developments) comprises records on more than 1,000,000 patients. Population-based data is available up to the year 2010; the epidemiological situation in the following years is the subject of predictions, as described below.
  • Demographic data on the Czech population and the Death Records Database (data administrator and provider: Czech Statistical Office, CZSO) constitute indispensable background information for the predictive assessment of epidemiological data. This data is used to adjust age-standardised predictions of incidence rates.
  • Estimates by an expert panel at the Czech Society for Oncology: the probability of disseminated relapses at various stages of disease and at various times after the completion of primary treatment, and the probability of undergoing various modalities of anticancer treatment in different types of cancer. Such detailed information cannot be reliably obtained from population-based data.

An audit of CNCR data for the period 1995–2010 confirmed the high quality of this database and found its data to be sufficiently representative. Even a minimal set of parameters from this population-based registry provides the data needed for prediction analyses:

Records on diagnosis, including its date and manner:

  • assessment of the cancer burden from newly diagnosed patients in specific regions or hospital catchment areas, including the relevant trends and prognoses
  • performance of diagnostics in the evaluated area

Records on date of death:

  • assessment of the overall results of health care (overall survival)
  • estimates of prevalence rates, including the relevant trends and prognoses

Diagnostic records (clinical stage, TNM classification of the tumour):

  • efficiency of diagnostic methods, their ability to detect less advanced stages
  • relevant survival estimates, related to patient's condition at the time of diagnosis
  • estimates of health care load and the related costs

1.2. Definition of a reference dataset for the purpose of assessing cancer care results and cost analysis

The reliability of analyses depends on an exact definition and specification of the dataset used. To define a dataset for assessing health care results and costs, data from population-based registries must be drawn with certain limitations.

  • First of all, the data must be recent enough to reflect the current situation of the Czech health care system; historical trends might be very misleading.
  • Most importantly, the data used for analyses must describe patients who were actually treated in a health care facility (malignant tumours diagnosed at autopsy are epidemiologically significant, but they do not influence the cost assessment in any way).

The Czech National Cancer Registry has been employed for our analyses. The analysed data has been limited to the period 1995–2010, as this recent CNCR data contains valid records corresponding to recent versions of the TNM classification. Data from this period represents a sample large enough to support trustworthy analyses (Figure 1). Records of patients whose diagnosis remained incomplete because of refusal of treatment, complications or early death must be removed, as this data would distort the analyses of anticancer treatment costs.

The audit of available population-based data therefore yields a reference dataset of high-quality, trustworthy records describing treatment and health care results in patients whose diagnosis was properly completed and whose treatment was duly commenced. Figure 1 shows that even after the subsequent separation of treated and untreated patients, the sample remains large enough for reliable analyses.

Figure 1. Definition of reference dataset for the purpose of assessing cancer care results (Czech National Cancer Registry, 1995–2010).

1.3. Brief methodological description of performed calculations

With the above-mentioned reference dataset (see Figure 1) available, all components required for clinically relevant prediction analyses can be estimated. The primary objective of these calculations is a reliable estimate of the number of patients living in a given period and needing anticancer treatment. Knowledge of the clinical stage of the disease is critical: the distribution of clinical stages among living patients, combined with knowledge of possible treatment scenarios, makes it possible to estimate expected costs. The following estimates are made prospectively, as data from population-based registries always becomes available with a certain delay:

  1. Estimate of incidence rates. The estimate is made both for the entire dataset and for separate clinical stages. The methodology is based on long-term epidemiological trends, adjusted for demographic changes in the population. A standard calculation only takes into account the incidence of primary tumours as the first malignant disease in a given patient.
  2. Estimate of prevalence rates of treated patients. The prospective estimate of the prevalence rate combines the estimated number of newly diagnosed patients in the years to come with the probability of x-year survival in patients diagnosed in the past. This multi-component estimate therefore combines regression estimates of the incidence rates with analyses of x-year survival. It takes into account that only a certain proportion of patients diagnosed (and treated) in the past would survive to the assessed year (total prevalence rate), and that tumours would recur or progress in only a certain proportion of those patients, so that only those patients might need anticancer treatment in a given year (see also Figure 2).
  3. Estimates of x-year survival rates. Because survival rates of cancer patients changed significantly over the period 1977–2010, x-year survival rates were estimated using a so-called “moving window”. In this approach, the x-year survival rates are estimated successively by cohort analysis, with each cohort defined by a 5-year interval (e.g. cohorts of patients diagnosed in 2001–2005, 2000–2004, etc.). Each cohort provides information on one-year to x-year survival rates, where x–1 is the number of years from the first year of the given interval to the last year available in the CNCR. The interval width was set to five years, as this is a standard width in population-based survival analyses (Berrino et al., 2007).
  4. Estimate of the frequency (probability) of cancer relapses or progressions in a given year. This parameter is essential for estimating the number of patients treated for a relapse or progression of the primary tumour. For this purpose, data on mortality from malignant tumours was extracted from the CNCR and the Death Records Database, as this data provides the exact date and cause of death. Records of deaths caused by a malignant tumour make it possible to derive the frequency of relapses, and thus also the probability of their occurrence, in the 1st, 2nd, ..., xth year after the primary diagnosis. These estimates are highly relevant to the assessment of health care costs, as terminal relapses (or progressions in advanced clinical stages) lead to disseminated cancers, which are very expensive to treat. Population-based predictions are then passed to an expert panel for further processing.
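
The “moving window” in step 3 can be illustrated with a minimal Python sketch. It only enumerates the overlapping 5-year cohorts and the longest x-year survival each one can inform; it does not estimate survival itself. The function and field names are hypothetical, and the choice to slide the window one year at a time across the whole 1995–2010 reference period is an assumption based on the examples given in the text.

```python
from typing import List, NamedTuple


class Cohort(NamedTuple):
    first_year: int  # first diagnosis year in the window
    last_year: int   # last diagnosis year in the window
    max_x: int       # longest x-year survival the cohort can inform


def moving_window_cohorts(first_data_year: int,
                          last_data_year: int,
                          width: int = 5) -> List[Cohort]:
    """Enumerate overlapping diagnosis cohorts, shifting the window by one year."""
    cohorts = []
    last_start = last_data_year - width + 1  # the window must fit within the data
    for start in range(first_data_year, last_start + 1):
        end = start + width - 1
        # The text defines x - 1 as the number of years from the first year
        # of the window to the last available year, so x is that span plus 1.
        max_x = (last_data_year - start) + 1
        cohorts.append(Cohort(start, end, max_x))
    return cohorts


# Assumed period: the 1995-2010 reference dataset described in the text.
cohorts = moving_window_cohorts(1995, 2010)
for c in cohorts:
    print(f"{c.first_year}-{c.last_year}: up to {c.max_x}-year survival")
```

Under these assumptions, the 2001–2005 cohort mentioned in the text can inform survival estimates up to x = 10, because 2010 is the last year with population-based data.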

Figure 2. The multi-component population-based estimate of the number of patients who might need anticancer treatment in a given year.
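
The multi-component estimate can be sketched in a few lines of Python. This is an illustration of the principle only, not the authors' model: all function names are hypothetical, the input numbers are invented for the example, and `relapse[k]` is assumed to be the probability that a patient alive k years after diagnosis is treated for a relapse or progression in that year.

```python
from typing import Dict


def expected_survivors(incidence: Dict[int, float],
                       survival: Dict[int, float],
                       target_year: int) -> Dict[int, float]:
    """Patients diagnosed in each year who are expected to be alive in target_year."""
    alive = {}
    for year, cases in incidence.items():
        elapsed = target_year - year
        if elapsed < 0:
            continue  # diagnosed after the target year
        # Survival probability `elapsed` years after diagnosis; cohorts beyond
        # the modelled horizon are ignored in this sketch.
        alive[year] = cases * survival.get(elapsed, 0.0)
    return alive


def potentially_treated(incidence: Dict[int, float],
                        survival: Dict[int, float],
                        relapse: Dict[int, float],
                        target_year: int) -> float:
    """New cases in target_year plus expected relapses among earlier survivors."""
    alive = expected_survivors(incidence, survival, target_year)
    new_cases = incidence.get(target_year, 0.0)
    relapses = sum(n * relapse.get(target_year - year, 0.0)
                   for year, n in alive.items() if year < target_year)
    return new_cases + relapses


# Invented illustration values (NOT taken from the registry).
incidence = {2013: 100, 2014: 100, 2015: 100}  # cases per diagnosis year
survival = {0: 1.0, 1: 0.9, 2: 0.8}            # P(alive k years after diagnosis)
relapse = {1: 0.10, 2: 0.05}                   # P(treated relapse in year k | alive)

prevalence_2015 = sum(expected_survivors(incidence, survival, 2015).values())
treated_2015 = potentially_treated(incidence, survival, relapse, 2015)
print(prevalence_2015, treated_2015)
```

With these invented inputs, the total prevalence is the sum of survivors from every diagnosis year, while the number of potentially treated patients is much smaller: only the new cases and the relapsing fraction of earlier survivors contribute.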

1.4. Limits of prediction analyses and possible risks of bias

All predictions given below are derived from population-based data and central registries and are therefore subject to some inaccuracy; for this reason, every point estimate is supplemented with a 90% confidence interval. Each point estimate must be interpreted together with these limits, which specify its statistical reliability and help prevent misinterpretation.


2. Epidemiological estimates: all cancers including other primary tumours in the same patient

2.1. Predictive estimates of the overall incidence in 2015

Incidence predictions take into account all cancers reported to the Czech National Cancer Registry. 90% confidence intervals (in brackets) are provided for all estimates.

Cervical cancer

                     Incidence      Incidence prediction [1] for year 2015
                     in 2010        (90% confidence interval)
Stage I              483            511 (454; 567)
Stage II             118            116 (84; 148)
Stage III            214            213 (177; 249)
Stage IV             116            156 (125; 187)
Stage unknown [2]    63             29 (17; 41)
TOTAL                994            1,025 (857; 1,192)

[1] The values in the table are predictions of the overall incidence, including other primary tumours diagnosed in previously treated cancer patients. The predictions are supplemented with a 90% confidence interval.
[2] Objective reasons for not reporting the disease stage include diagnosis based on autopsy or DCO (death certificate only), early death, treatment not started because of contraindications, and the patient's refusal of treatment. If no reason for the missing stage is given, the record is considered incomplete. Records lacking information on clinical stage are not included in the expected number of patients who might need anticancer treatment.
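
As a quick consistency check on the incidence table, the short Python sketch below sums the stage rows column by column and compares the result with the reported TOTAL row. The values are copied from the table; the variable names are hypothetical. The source does not say how the interval for the total was derived, so the check only shows what the reported numbers are numerically consistent with.

```python
# (incidence in 2010, 2015 point estimate, lower limit, upper limit),
# copied from the incidence table in section 2.1.
rows = {
    "Stage I":       (483, 511, 454, 567),
    "Stage II":      (118, 116, 84, 148),
    "Stage III":     (214, 213, 177, 249),
    "Stage IV":      (116, 156, 125, 187),
    "Stage unknown": (63, 29, 17, 41),
}
reported_total = (994, 1025, 857, 1192)

# Sum each column across the five stage rows.
column_sums = tuple(sum(column) for column in zip(*rows.values()))

print("column sums:   ", column_sums)
print("reported total:", reported_total)
print("consistent:    ", column_sums == reported_total)
```

In this table, both the point estimates and the interval limits of the TOTAL row equal the plain sums of the stage rows.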


2.2. Predictive estimates of the overall prevalence in 2015 – calculation with adjustments to survival models

Estimates of the overall prevalence combine the estimated number of cancers newly diagnosed in 2015 with the estimated number of living cancer patients who were diagnosed and treated in previous years (calculated from survival probability models). The resulting estimates were adjusted to reflect cancer progression to disseminated stages: patients who were previously diagnosed in stage I, II or III, and whose cancers would probably relapse or progress to a disseminated stage, are included in the predicted prevalence of clinical stage IV. The model calculates the probability of relapse to stage IV only, because reliable population-based data is not available for detailed monitoring of relapses to other stages. This limitation does not significantly affect population-based pharmacoeconomic indicators, however, because this particular model focuses on drugs indicated for disseminated disease. 90% confidence intervals (in brackets) are provided for all estimates.

Estimates given below are therefore adjusted according to survival probability models and according to models for relapses/progressions of primary tumours.

Cervical cancer

                     Prevalence prediction [1] for year 2015
                     (90% confidence interval)
Stage I              12,675 (12,490; 12,860)
Stage II             2,344 (2,264; 2,424)
Stage III            2,007 (1,933; 2,081)
Stage IV             614 (573; 655)
Stage unknown [2]    1,176 (1,120; 1,232)
TOTAL                18,816 (18,590; 19,042)

[1] The values in the table are predictions of the overall prevalence, including other primary tumours diagnosed in previously treated cancer patients. The predictions are supplemented with a 90% confidence interval.
[2] Objective reasons for not reporting the disease stage include diagnosis based on autopsy or DCO (death certificate only), early death, treatment not started because of contraindications, and the patient's refusal of treatment. If no reason for the missing stage is given, the record is considered incomplete. Records lacking information on clinical stage are not included in the expected number of patients who might need anticancer treatment.
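
The source does not state how the prevalence intervals were computed. As a hedged illustration, the Python sketch below checks whether the reported limits are numerically consistent with a simple normal approximation for a Poisson-distributed count, n ± z·√n with z ≈ 1.645 for a two-sided 90% interval. The approximation and the function name are assumptions introduced here for illustration; the values are copied from the prevalence table.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided 90% interval -> 95th percentile of the standard normal (about 1.645).
Z90 = NormalDist().inv_cdf(0.95)


def poisson_interval(count: int, z: float = Z90):
    """Normal approximation to a Poisson count: count +/- z * sqrt(count)."""
    half_width = round(z * sqrt(count))
    return count - half_width, count + half_width


# (point estimate, lower limit, upper limit), copied from the prevalence table.
reported = {
    "Stage I":       (12675, 12490, 12860),
    "Stage II":      (2344, 2264, 2424),
    "Stage III":     (2007, 1933, 2081),
    "Stage IV":      (614, 573, 655),
    "Stage unknown": (1176, 1120, 1232),
    "TOTAL":         (18816, 18590, 19042),
}

matches = {stage: poisson_interval(n) == (low, high)
           for stage, (n, low, high) in reported.items()}
for stage, ok in matches.items():
    print(f"{stage:14s} consistent with n ± z·sqrt(n): {ok}")
```

A match here shows only that the published limits agree with this approximation after rounding; it does not establish that the authors used it.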


2.3. Summary estimate of cervical cancer patients potentially treated in 2015

The table below summarises the predicted numbers of potentially treated patients in 2015, derived from incidence trends, prevalence trends, and survival probability models. The estimates are based solely on valid population-based records with histological verification of the tumour and with the clinical stage established at the time of primary diagnosis. The table covers all persons potentially treated with anticancer therapy, by clinical stage (based on treatment information in CNCR data from the period 2007–2011). 90% confidence intervals (in brackets) are provided for all estimates.

Cervical cancer

Predictions of newly diagnosed and treated patients in 2015
  Stage I                                                                      497 (442; 551)
  Stage II                                                                     110 (80; 141)
  Stage III                                                                    197 (164; 231)
  TOTAL (stages I–III)                                                         804 (686; 923)

Predicted numbers of patients treated in 2015 in clinical stage IV
  Newly diagnosed and treated patients in clinical stage IV                    114 (91; 136)
  Treated relapses and progressions in patients diagnosed in previous years    176 (154; 198)
  TOTAL (stage IV)                                                             290 (245; 334)

TOTAL (all stages)                                                             1,094 (931; 1,257)
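
The way the summary table is assembled can be made explicit with a small Python sketch. It recomputes the subtotals and the grand total from the point estimates in the table; the variable names are hypothetical, and the stage IV share is a derived figure, not one reported by the source.

```python
# Point estimates copied from the summary table in section 2.3.
newly_diagnosed_i_iii = {"Stage I": 497, "Stage II": 110, "Stage III": 197}
treated_in_stage_iv = {
    "newly diagnosed in stage IV": 114,
    "relapses and progressions from previous years": 176,
}

total_i_iii = sum(newly_diagnosed_i_iii.values())  # newly diagnosed, stages I-III
total_iv = sum(treated_in_stage_iv.values())       # all patients treated in stage IV
total_treated = total_i_iii + total_iv             # all potentially treated patients

# Derived for illustration: proportion of treated patients who are in stage IV.
share_iv = total_iv / total_treated

print(total_i_iii, total_iv, total_treated, f"{share_iv:.1%}")
```

The recomputed subtotals and grand total agree with the TOTAL rows of the table, which confirms that the final figure is the sum of newly diagnosed patients in stages I–III and all patients treated in stage IV.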


3. Literature

  1. Agresti A. Categorical data analysis, 2nd ed. John Wiley & Sons, Inc., New York 2002, ISBN 978-0-4713-6093-3.
  2. Brenner H, Arndt V. Long-term survival rates of patients with prostate cancer in the prostate-specific antigen screening era: population-based estimates for the year 2000 by period analysis. Journal of Clinical Oncology 2005; 23(3): 441–447.
  3. Berrino F, De Angelis R, Sant M, Rosso S, Bielska-Lasota M, Coebergh JW, Santaquilani M; EUROCARE Working group. Survival for eight major cancers and all cancers combined for European adults diagnosed in 1995-99: results of the EUROCARE-4 study. The Lancet Oncology 2007; 8(9): 773–783.
  4. Cantor AB. Projecting the standard error of the Kaplan-Meier estimator. Statistics in Medicine 2001; 20(14): 2091–2097.
  5. Capocaccia R, De Angelis R. Estimating the completeness of prevalence based on cancer registry data. Statistics in Medicine 1997; 16(4): 425–440.
  6. Capocaccia R, Colonna M, Corazziari I, De Angelis R, Francisci S, Micheli A, Mugno E, EUROPREVAL Working Group. Measuring cancer prevalence in Europe: the EUROPREVAL project. Annals of Oncology 2002; 13(6): 831–839.
  7. Dickman P, Hakulinen T. Population-based cancer survival analysis, draft, 2003.
  8. dos Santos Silva I. Cancer Epidemiology: Principles and Methods. International Agency for Research on Cancer, Lyon (France) 1999. ISBN 92-832-0405-0.
  9. Dušek L, Žaloudík J (Eds.). Hodnocení zdravotnických technologií v onkologii. Klinická onkologie 2004; 17 (Suppl. 1), 104 pp. ISSN 0862-495X.
  10. Dušek L, Žaloudík J, Indrák K (Eds.). Informační zázemí pro využití onkologických populačních dat v ČR. Klinická onkologie 2007; 20 (Suppl. 1), 196 pp. ISSN 0862-495X.
  11. Dyba T, Hakulinen T. Comparison of different approaches to incidence prediction based on simple interpolation techniques. Statistics in Medicine 2000; 19(13): 1741–1752.
  12. Gail MH, Kessler L, Midthune D, Scoppa S. Two approaches for estimating disease prevalence from population-based registries of incidence and total mortality. Biometrics 1999; 55(4): 1137–1144.
  13. Hakulinen T, Dyba T. Precision of incidence predictions based on Poisson distributed observations. Statistics in Medicine 1994; 13(15): 1513–1523.
  14. Lutz JM, Francisci S, Mugno E, Usel M, Pompe-Kirn V, Coebergh JW, Bielska-Lasota M; EUROPREVAL Working Group. Cancer prevalence in Central Europe: the EUROPREVAL Study. Annals of Oncology 2003; 14(2): 313–322.
  15. Mariotto AB, Yabroff KR, Feuer EJ, De Angelis R, Brown M. Projecting the number of patients with colorectal carcinoma by phases of care in the US: 2000-2020. Cancer Causes Control 2006; 17(10): 1215–1226.
  16. Mariotto A, Warren JL, Knopf KB, Feuer EJ. The prevalence of patients with colorectal carcinoma under care in the U.S. Cancer 2003; 98(6): 1253–1261.
  17. Møller B, Weedon-Fekjaer H, Haldorsen T. Empirical evaluation of prediction intervals for cancer incidence. BMC Medical Research Methodology 2005; 5: 21.
  18. Verdecchia A, De Angelis G, Capocaccia R. Estimation and projections of cancer prevalence from cancer registry data. Statistics in Medicine 2002; 21(22): 3511–3526.


Last updated on 20 February 2015