Validation methods
| Background info: Validation Methods |
|---|
| To validate a particular type of score, longitudinal prospective studies are often conducted. |
| The observed values, or natural incidences, of cardiovascular disease in a sample population are compared with the predicted values. |
| Statistical methods of comparing the predicted and observed values are discussed below. |
Calibration
- The calibration statistic measures the extent to which risk, as predicted by the scale, matches observed risk. It is calculated as the ratio of predicted to observed risk and can be visualized as a scatter plot that includes the 45° line of perfect agreement.
- A value close to 1 indicates good agreement (perfect calibration gives a ratio of 1.00)
- E.g. an overestimation of 80% is indicated by a calibration index of 1.80
- Note: a predicted-to-observed ratio of 1.00 does not necessarily signify perfect estimation, because it may be the average of over- and under-estimations in different (age/national) groups. To address this problem, some papers provide a stratified list of calibration results for different risk groups (Collins GS, Altman DG (2009)).
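The note above can be illustrated with a small sketch. It assumes the calibration ratio is taken as the sum of predicted risks (expected events) divided by the number of observed events; the function names, group labels and data are made up for illustration and do not come from the cited papers.

```python
from collections import defaultdict


def calibration_ratio(predicted, observed):
    """Ratio of predicted (expected) events to observed events."""
    expected_events = sum(predicted)   # sum of individual predicted risks
    observed_events = sum(observed)    # outcomes coded 1 = event, 0 = no event
    if observed_events == 0:
        raise ValueError("ratio is undefined when no events were observed")
    return expected_events / observed_events


def stratified_calibration(predicted, observed, groups):
    """Calibration ratio computed separately within each group."""
    risks = defaultdict(list)
    events = defaultdict(list)
    for p, y, g in zip(predicted, observed, groups):
        risks[g].append(p)
        events[g].append(y)
    return {g: calibration_ratio(risks[g], events[g]) for g in risks}


# Made-up data: risk is underestimated in one group and overestimated in the other.
predicted = [0.2, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3]
observed = [0, 0, 0, 1, 0, 0, 0, 1]
groups = ["younger"] * 4 + ["older"] * 4

overall = calibration_ratio(predicted, observed)
per_group = stratified_calibration(predicted, observed, groups)
print(round(overall, 2))
print({g: round(r, 2) for g, r in per_group.items()})
```

In this toy data the overall ratio is about 1.00, while the per-group ratios are about 0.80 and 1.20, which is the masking effect that stratified reporting is meant to expose.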
Brier score
- Measure of accuracy
- Ranges from 0 to 1
- Average squared difference between the predicted probability of the event of interest for an individual and the observed outcome for that individual (1 if the event occurs, 0 if it does not)
- Lower score represents higher accuracy (see Collins GS, Altman DG (2009))
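The definition above translates directly into a few lines of code. This is a minimal sketch with a hypothetical function name and made-up data, not the implementation used in the cited paper.

```python
def brier_score(predicted, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(outcomes) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Made-up predicted risks and observed outcomes (1 = event, 0 = no event).
predicted = [0.9, 0.1, 0.8, 0.3]
outcomes = [1, 0, 1, 0]

score = brier_score(predicted, outcomes)
perfect = brier_score([1.0, 0.0, 1.0, 0.0], outcomes)      # always right and certain
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], outcomes)  # always predicts 50%
print(round(score, 4), perfect, uninformative)
```

Here the made-up predictions score about 0.0375, perfect predictions score 0, and always predicting 0.5 scores 0.25, consistent with lower values representing higher accuracy.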
Discrimination
- The discrimination statistic estimates the ability of the scale to distinguish between those who will and those who will not go on to have a cardiovascular event during a pre-determined follow-up period.
- Typically, this is estimated by the area under the Receiver Operating Characteristic (ROC) curve, which ranges from 0.5 (the scale is no better than guessing) to 1 (the scale discriminates perfectly) (Collins GS, Altman DG (2009))
- A value of 1.0 indicates perfect classification of cases and non-cases, whereas 0.5 means that the equation ranks a case above a non-case only 50% of the time (i.e. no better than chance)
- The ROC curve is a plot of sensitivity against 1 − specificity
- Sensitivity: true positive rate
- Specificity: true negative rate
- Comparing scores by their AUC only works if the ROC curves do not cross
- pAUC (partial AUC) is used when the curves cross; a cut-off point must be determined
- D statistic
- Higher values indicate higher discrimination
- More information in Royston P, Sauerbrei W (2004)
- R² statistic
- More information in Royston P (2006)
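The area under the ROC curve described in this section can be sketched as the proportion of case/non-case pairs in which the case receives the higher predicted risk, with ties counted as half. This pairwise formulation is one common way to compute the AUC and is not necessarily the method used in the cited papers; the function names, the threshold and the data are made up for illustration.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """True positive rate and true negative rate at a given risk threshold."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, outcomes):
    """Proportion of case/non-case pairs ranked correctly (ties count as half)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Made-up predicted risks and observed outcomes (1 = event, 0 = no event).
scores = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]

sens, spec = sensitivity_specificity(scores, outcomes, threshold=0.5)
area = auc(scores, outcomes)
print(sens, spec, area)
```

With these four subjects, three of the four case/non-case pairs are ordered correctly, so the AUC is 0.75; at a threshold of 0.5 the sensitivity is 0.5 and the specificity is 1.0, which gives one point on the ROC curve.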