Diagnostic errors abound, and sadly, the more we study the subject, the more horrified we should be by clinicians' cognitive biases and exceptionally poor application of Bayes' theorem, probability, and statistics. Look at the example below, where clinicians judged a 43-year-old woman with atypical chest pain and no risk factors to have a 70% chance of disease after a positive stress test, when the evidence-based answer was a 2% to 11% chance of disease. An invited commentary on the original study is copied below.
JAMA Intern Med. Published online April 5, 2021. doi:10.1001/jamainternmed.2021.0240
An enviably close and influential collaboration during the 1970s between the psychologists Amos Tversky and Daniel Kahneman reshaped our beliefs about intuitive probabilistic reasoning. One of their many contributions was a demonstration of the base-rate fallacy, the tendency for people to neglect prior probabilities, or “base rates,” when calculating the chances of an event given more specific data.1 For example, the chance that a patient being tested has a disease reflects not only the test result and the test’s sensitivity and specificity, but also the relevant base rate, which is the prevalence of disease in a specific population.
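To make the base rate's dominance concrete, here is a minimal sketch of Bayes' theorem for a diagnostic test. The prevalence, sensitivity, and specificity values are illustrative assumptions, not figures from the study.

```python
# Minimal sketch of Bayes' theorem for a diagnostic test.
# The prevalence, sensitivity, and specificity below are illustrative
# assumptions, not values from the study.

def posttest_probability(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# A test with 90% sensitivity and 90% specificity sounds impressive,
# but with a 1% base rate most positive results are false positives:
print(posttest_probability(0.01, 0.90, 0.90))  # ~0.083, i.e. about 8%
```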
For decades, the base-rate fallacy has been documented by psychologists in studies with respondents across professions and expertise levels.2 It is thus unsurprising that a long line of inquiry has investigated base-rate neglect and statistical reasoning more generally in practitioners, often with disappointing results.3-5 Many such studies have tested statistical concepts directly or assessed practitioners’ abilities to manipulate numbers provided to them, leaving some to argue that practitioners may be intuitive statisticians or pattern recognizers with little need to deal with statistical formalism when it matters.
In this issue of JAMA Internal Medicine, Morgan and colleagues6 make an important contribution to this literature by exploring where probabilistic hiccups arise along the diagnostic pathway. The survey study builds on prior work by investigating not only practitioners' ability to manipulate supplied numbers but also their ability to estimate pretest and posttest probabilities themselves in common clinical settings.
The study6 explored 4 common clinical scenarios—pneumonia, cardiac ischemia, breast cancer screening, and urinary tract infection. Respondents estimated pretest and posttest probabilities after being given brief case descriptions and also answered a general statistical reasoning question (see the eAppendix to the Original Investigation6). The authors received fully completed surveys from 553 of 723 practitioners invited (76%), consisting primarily of attending and resident physicians (89%) as well as nurse practitioners and physician assistants.
In one scenario, clinicians were asked about Mrs Jones, a 43-year-old premenopausal woman with atypical chest pain, a normal electrocardiogram, no risk factors, normal vital signs, and normal findings on examination. The median pretest probability estimate that Mrs Jones has cardiac ischemia was 10%; the authors’ literature-based estimate ranged from 1% to 4.4%. Following positive results of an exercise stress test, respondents judged the median (interquartile range [IQR]) estimated probability that Mrs Jones has cardiac ischemia to be 70% (50%-90%), whereas the authors’ evidence-based answer was 2% to 11%. In another scenario, respondents were asked to consider an annual visit with a 45-year-old woman without specific risk factors or symptoms for breast cancer. The median (IQR) pretest probability estimate of breast cancer was 5% (1%-10%), while the authors’ literature-based estimate was 0.2% to 0.3%. After a positive finding on mammography, the median (IQR) posttest probability estimate was 50% (30%-80%) among respondents, whereas the authors computed a literature-based range of 3% to 9%. Probability estimates for 2 other scenarios involving pneumonia and urinary tract infection similarly differed starkly from the literature-based estimates. Positive and negative likelihood ratios were imputed for each clinician and scenario.
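These posttest ranges follow from the odds form of Bayes' theorem: posttest odds equal pretest odds multiplied by the likelihood ratio. The sketch below works through the cardiac ischemia numbers; the positive likelihood ratio used is an assumption back-calculated for illustration, since the commentary does not report the value the authors applied.

```python
# Odds form of Bayes' theorem: posttest odds = pretest odds * LR.
# LR_POSITIVE is an assumed value chosen so that the authors' pretest
# range (1%-4.4%) maps roughly onto their posttest range (2%-11%);
# the commentary does not report the exact likelihood ratio used.

def to_odds(p):
    return p / (1 - p)

def to_prob(odds):
    return odds / (1 + odds)

def posttest(pretest, lr):
    return to_prob(to_odds(pretest) * lr)

LR_POSITIVE = 2.5  # assumption for a positive exercise stress test

for pre in (0.01, 0.044):
    print(f"pretest {pre:.1%} -> posttest {posttest(pre, LR_POSITIVE):.1%}")
# pretest 1.0% -> posttest 2.5%; pretest 4.4% -> posttest 10.3%
```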
In the study by Morgan et al,6 practitioners consistently overestimated both pretest and posttest probabilities, often dramatically. Responses were highly variable across clinicians and scenarios. Interestingly, the imputed likelihood ratios suggest that some of the probabilistic updates might have been correct, even if posttest probabilities were overestimated. This suggests that the inflation of some posttest probabilities might be accounted for, in part, by the inflation of pretest probabilities, although there was considerable variability in imputed likelihood ratios across respondents. Strengths of the study include the use of common clinical scenarios and an expert panel of practitioners who culled the literature and agreed on evidence-based answers.
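The imputation itself is straightforward: a respondent's implied likelihood ratio is the ratio of their posttest odds to their pretest odds. The sketch below shows that standard textbook calculation applied to the median cardiac ischemia responses; the study's exact imputation method is not detailed in the excerpt.

```python
# Sketch: impute a clinician's implied likelihood ratio from their own
# pretest and posttest probability estimates (odds form of Bayes' rule).
# This is the standard textbook calculation; the study's exact method
# is not detailed in the commentary.

def implied_lr(pretest, posttest):
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = posttest / (1 - posttest)
    return posttest_odds / pretest_odds

# Median respondent in the cardiac ischemia scenario: pretest 10%,
# posttest 70% after a positive stress test.
print(f"{implied_lr(0.10, 0.70):.1f}")  # 21.0 -- an implausibly strong update
```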
As Kahneman and Tversky1 showed us and the new findings from Morgan et al6 illustrate convincingly, the prior probability rules, or at least it should. The prior probability in this case is the probability of disease in the population. But who is the population? Morgan et al6 found that clinicians differed widely in their estimates of pretest probability, but the true pretest probability is shaped by many time-varying factors, including clinical comorbidities, history and physical examination findings, clinical setting, where patients live and work, and demographic characteristics.7
Might part of the variability in pretest and posttest probability estimates not be error, but practitioners recalling their experience treating different patient populations or using different tests than those in the authors’ literature-based estimates? What tacit assumptions are being folded into the respondents’ answers that go beyond a case prompt of a few lines? Are nontraditional risk factors or comorbidities the same, or are different demographic groups being considered? To what extent are the responses manifesting a psychological desire not to miss anything that could cause harm? The authors’ hierarchical literature review6 includes studies from several decades; some estimates may differ today and across settings. As the authors also note, many respondents were residents or academic physicians who may practice in settings with higher disease prevalence.
To fully realize shared decision-making, pretest and posttest probability estimates for precise clinical populations are needed. Indeed, a decades-old framework for decision analysis in medicine8 has laid a rigorous foundation for decision-making under uncertainty, but even the perfect decision analyst is at a loss without the right estimates at the right time for the right population. Fortunately, the contemporary practitioner’s armamentarium includes digital tools that may eventually enable the provision, or perhaps even the calculation, of risk estimates dynamically tailored to many aspects of the patient’s population.
The findings from Morgan et al6 also point to new targets for medical education and research avenues for how probabilistic information might be better integrated into care. For example, while better education around Bayes’ rule and likelihood ratios is often emphasized to improve practitioners’ numeracy, the authors’ findings suggest that knowledge of pretest probability values and how to estimate them may be a more amenable target. The authors also reference other approaches to communicating probabilistic information. Here too psychologists might teach us about when to most effectively integrate risk information alongside our fast and slow modes of thought9 and also how information might be displayed to promote correct interpretation, such as in natural frequencies or graphical displays (a rough sketch follows below). While the challenges around statistical reasoning have long been appreciated, studies like those by Morgan et al6 will be important to pinpoint how probabilistic information can best drive precise shared decision-making across populations.
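As an example of the natural-frequency format, the mammography scenario above can be restated in whole numbers. The prevalence comes from the commentary's figures; the test sensitivity and false-positive rate are assumptions for illustration only.

```python
# Sketch: the mammography scenario restated as natural frequencies.
# Prevalence (~0.3%) is from the commentary's literature-based estimate;
# sensitivity and the false-positive rate are illustrative assumptions.

population = 10_000
prevalence = 0.003
sensitivity = 0.85          # assumed
false_positive_rate = 0.07  # assumed (i.e., specificity of 93%)

with_cancer = population * prevalence                                # 30 women
true_positives = with_cancer * sensitivity                           # ~26
false_positives = (population - with_cancer) * false_positive_rate   # ~698

share = true_positives / (true_positives + false_positives)
print(f"Of {population:,} women, about {true_positives:.0f} of the "
      f"{true_positives + false_positives:.0f} who test positive actually "
      f"have cancer ({share:.1%}).")
```

Framed this way, a posttest probability in the low single digits, consistent with the authors' 3% to 9% range, is far easier to grasp than the same fact expressed as conditional probabilities.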
The original study is available at https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2778364
We are long overdue for a critical assessment of how clinicians are trained in diagnostic accuracy and how that accuracy is evaluated on an ongoing basis, both of which we have known for decades to be desperately lacking. There is no transparent method for choosing a health care provider based on their diagnostic skills. Time for change!
For lectures, webinars, or Zoom meetings on this and other topics, contact nargymd@gmail.com
Mobile: 929-900-6004