Fragility Index


  • The Fragility Index is the minimum number of patients whose status would have to change from a nonevent to an event to turn a statistically significant result into a non-significant one
  • The smaller the Fragility Index, the more fragile the trial’s outcome
  • The Fragility Index is a useful metric for demonstrating how easily statistical significance based on a threshold P-value may be overturned
  • Much of the published medical literature, especially in critical care, is built upon ‘statistically fragile’ trials


Threshold p-values are widely used in the medical literature to determine statistical significance despite important limitations

  • results with similar P-values do not indicate a similar likelihood of being real if there are large differences in the size of the trials or number of events in the trials being compared
  • when one P-value is above and one below the threshold value (eg, P = 0.051 and P = 0.049), the latter, but not the former, is typically interpreted as indicating a real treatment effect despite there being minimal absolute difference between the two P-values

95% Confidence Intervals have similar problems to threshold p-values

  • they are often viewed dichotomously as indicating significance if they do not cross 1 (the null value for ratio measures such as relative risk or odds ratio)
  • smaller, more fragile trials can have tighter 95% CIs that are more distant from 1 than larger, less fragile trials


Fragility Index can be calculated as follows (from Ridgeon et al, 2016):

  • trial results are arranged in a two-by-two contingency table
  • an event is iteratively added to the group with the smaller number of events (while removing a nonevent from the same group to keep the group size constant) until the P-value produced by Fisher's exact test equals or exceeds 0.05
  • The number of events added to reach this threshold is the Fragility Index
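
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `fisher_exact_p` and `fragility_index` are hypothetical, the two-sided Fisher exact test is implemented directly from the hypergeometric distribution (a library routine such as `scipy.stats.fisher_exact` could be used instead), and the toy trial at the end is invented for demonstration.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Events added to the lower-event group until Fisher p >= alpha.

    Returns 0 if the result is not significant to begin with.
    """
    # Work on the group with fewer events, as in the described method.
    if events_a > events_b:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a

    def p_value(e):
        return fisher_exact_p(e, n_a - e, events_b, n_b - events_b)

    added = 0
    # Each step converts one nonevent into an event; group size is unchanged.
    while p_value(events_a + added) < alpha and events_a + added < n_a:
        added += 1
    return added


# Hypothetical toy trial: 0/5 events versus 5/5 events.
print(fragility_index(0, 5, 5, 5))  # prints 2
```

In the toy trial the P-value rises from about 0.008 to about 0.048 after one added event and to about 0.17 after two, so two changed outcomes are enough to lose significance at the 0.05 threshold.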


Ridgeon et al, 2016

  • The authors attempted to calculate the fragility index for all multicenter randomized controlled trials (MCRCTs) in critical care medicine reporting mortality; they found 56 MCRCTs that met their criteria
  • Findings
    • The median fragility index was 2 (interquartile range, 1-3.5)
    • greater than 40% of trials had a fragility index of less than or equal to 1
    • 12.5% of trials reported loss to follow-up greater than their fragility index
    • Trial sample size was positively correlated with fragility index (larger trials were less fragile), and reported p value was negatively correlated with it (trials with higher p values were more fragile)
    • An overview of the 56 eligible MCRCTs is available in one of the online supplements
  • The authors conclude that
    • findings in critical care trials often depend on a small number of events
    • critical care clinicians should be wary of basing decisions on trials with a low fragility index
    • fragility index should be reported for future trials in critical care to aid interpretation and decision making by clinicians


Walsh et al, 2014

  • The authors calculated the Fragility Index for 399 eligible RCTs in high-impact medical journals that reported a statistically significant result for at least one dichotomous or time-to-event outcome in the abstract
  • The journals included were: NEJM, The Lancet, JAMA, BMJ and Annals of Internal Medicine
  • Findings
    • the RCTs had:
      • median sample size of 682 patients (range: 15–112,604)
      • median of 112 events (range: 8–5,142)
    • 53% reported a P-value <0.01
    • median Fragility Index was 8 (range: 0–109)
    • 25% had a Fragility Index of 3 or less
    • In 53% of trials, the Fragility Index was less than the number of patients lost to follow-up
  • Commentary:
    • note that the trials included in this study were not necessarily multi-center studies and were not restricted to having mortality as a statistically significant outcome
  • Conclusion:
    • The statistical significance of RCTs in major medical journals often hinges on the outcomes of a small number of events, suggesting that the results are ‘fragile’
    • This is supported by high rates of medical reversal when trials are repeated or subsequent larger, multi-center trials are performed


This section is based on a discussion with Paul Young:

Interpretation of the Fragility Index, and the importance of loss to follow-up, should be taken in context


  • The NICE-SUGAR trial had a Fragility Index of 11 and 82 patients were lost to follow-up
    • the conclusion was measured in that the authors only stated that intensive insulin therapy is not better than conventional insulin therapy and may be harmful
    • the number of events that would need to change to make this interpretation incorrect is very large, i.e. significance would have to swing in the opposite direction
  • The CRASH-2 trial had a Fragility Index of 48 and 84 patients were lost to follow-up
    • the loss to follow-up is one of a number of issues that weakens the strong drive to translate the findings of this study into clinical practice
    • other issues are that only 3% of CRASH-2 patients came from countries with modern trauma centres, and the thromboembolic risk in trauma patients in these centres is likely to be high

Overall, loss to follow-up appears to be less of an issue in critical care trials than in non-critical care trials published in high-impact journals.

References and links


Journal articles

  • Feinstein AR. The unit fragility index: an additional appraisal of “statistical significance” for a contrast of two proportions. Journal of Clinical Epidemiology. 1990;43(2):201-9.
  • Ridgeon EE, Young PJ, Bellomo R, Mucchetti M, Lembo R, Landoni G. The Fragility Index in Multicenter Randomized Controlled Critical Care Trials. Critical Care Medicine. 2016.
  • Walsh M, Srinathan SK, McAuley DF, et al. The statistical significance of randomized controlled trial results is frequently fragile: a case for a Fragility Index. Journal of Clinical Epidemiology. 2014;67(6):622-8.



Chris is an Intensivist and ECMO specialist at the Alfred ICU in Melbourne. He is also a Clinical Adjunct Associate Professor at Monash University. He is a co-founder of the Australia and New Zealand Clinician Educator Network (ANZCEN) and is the Lead for the ANZCEN Clinician Educator Incubator programme. He is on the Board of Directors for the Intensive Care Foundation and is a First Part Examiner for the College of Intensive Care Medicine. He is an internationally recognised Clinician Educator with a passion for helping clinicians learn and for improving the clinical performance of individuals and collectives.

After finishing his medical degree at the University of Auckland, he continued post-graduate training in New Zealand as well as Australia’s Northern Territory, Perth and Melbourne. He has completed fellowship training in both intensive care medicine and emergency medicine, as well as post-graduate training in biochemistry, clinical toxicology, clinical epidemiology, and health professional education.

He is actively involved in using translational simulation to improve patient care and the design of processes and systems at Alfred Health. He coordinates the Alfred ICU’s education and simulation programmes and runs the unit’s education website, INTENSIVE. He created the ‘Critically Ill Airway’ course and teaches on numerous courses around the world. He is one of the founders of the FOAM movement (Free Open-Access Medical education) and is co-creator of litfl.com, the RAGE podcast, the Resuscitology course, and the SMACC conference.

His one great achievement is being the father of three amazing children.

On Twitter, he is @precordialthump.
