II. Definitions
- Bayes Theorem (calculation)
- P (Disease | Positive Test) = P(Positive test | Disease) * P(Disease) / P(Positive Test)
- Where
- P (A | B) = Probability of A given B
- P(Positive test | Disease) = Test Sensitivity
III. Evaluation: Example - Probability of Disease Based on a Test
- Positive Test
- Disease Y Present in 75
- Disease Y NOT Present in 25
- Negative Test
- Disease Y Present in 10
- Disease Y NOT Present in 190
- Probabilities
- P(Positive test | Disease) = Test Sensitivity = 75 / (75 + 10) = 0.88
- P(Disease) = Pretest Probability in cohort tested = (75+10)/(75+10+25+190) = 0.28
- P(Positive Test) = True positives and False Positives = (75 + 25)/(75+10+25+190) = 0.33
- Conclusion
- P(Disease | Pos Test) = P(Pos Test | Disease) * P(Disease) / P(Pos Test) = 0.88 * 0.28 / 0.33 = 0.75
- In this case a patient from the given cohort has a 75% probability of Disease Y given a Positive Test
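The calculation above can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original source: the function name and its parameter names (tp, fp, fn, tn for the four cells of the 2x2 table) are illustrative assumptions.

```python
def posttest_probability(tp, fp, fn, tn):
    """Return P(Disease | Positive Test) from 2x2 table counts via Bayes' theorem.

    tp: positive test, disease present    fp: positive test, disease absent
    fn: negative test, disease present    tn: negative test, disease absent
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)    # P(Positive Test | Disease)
    pretest = (tp + fn) / total     # P(Disease) in the tested cohort
    p_positive = (tp + fp) / total  # P(Positive Test)
    return sensitivity * pretest / p_positive


# Counts from the Disease Y example above
result = posttest_probability(tp=75, fp=25, fn=10, tn=190)
print(round(result, 2))  # 0.75
```

Working from unrounded counts gives the same 0.75 as the rounded figures, because the result reduces to true positives divided by all positives (75 / 100).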
IV. Evaluation: Example - Probability of a disease based on a group of findings
- The probability of a disease given one or more findings can be calculated from:
- Prevalence of a Disease (and of its differential diagnosis) AND
- Probability of findings when the disease is present (and when other conditions on the differential diagnosis are present)
- Assumptions
- Conditional independence of findings
- For a given disease, the presence of one finding does not change the probability of another finding
- Example: For Acute Coronary Syndrome, Chest Pain and Shortness of Breath are treated as independent of one another
- Mutual exclusivity of conditions
- For a given presentation with specific findings, only one disease is present to explain those findings
- Example: The patient with Chest Pain, Tachypnea and Shortness of Breath
- Does NOT have both a Myocardial Infarction AND a Pulmonary Embolism
- Calculation
- P(D|F) = Probability of Disease (D) given Findings (F) = P(D) * P(F | D) / Sum over each disease Di in DDx of ( P(Di) * P(F | Di) )
- Where
- P(D) = Probability (prevalence) of Disease (D)
- P(F | D) = Probability of Findings (F) given Disease (D)
- DDx = Group of mutually exclusive diseases, including the Disease (D) of interest (Differential Diagnosis)
- P(Di) and P(F | Di) = Probability of each disease (Di) in the DDx, and probability of Findings (F) given that disease
- Under conditional independence of findings, P(F | D) = P(F1 | D) * P(F2 | D) * ... for the individual findings F1, F2, ...
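The calculation above can be sketched in Python under both stated assumptions (mutually exclusive conditions, conditionally independent findings). This is a hedged illustration only: the function name is an assumption, and every prior and likelihood below is a made-up placeholder chosen to show the arithmetic, not clinical data.

```python
from math import prod


def posterior_over_differential(priors, likelihoods, findings):
    """Return P(disease | findings) for each disease in the differential.

    priors:      {disease: P(disease)}, assumed mutually exclusive
    likelihoods: {disease: {finding: P(finding | disease)}}
    findings:    list of observed findings, assumed conditionally independent
    """
    joint = {}
    for disease, prior in priors.items():
        # Conditional independence: P(F | D) is the product over findings
        p_findings = prod(likelihoods[disease][f] for f in findings)
        joint[disease] = prior * p_findings
    # Denominator: sum of P(Di) * P(F | Di) across the differential
    total = sum(joint.values())
    return {disease: value / total for disease, value in joint.items()}


# HYPOTHETICAL numbers for illustration only (not clinical data)
priors = {"ACS": 0.2, "PE": 0.1, "Other": 0.7}
likelihoods = {
    "ACS": {"chest_pain": 0.8, "dyspnea": 0.5},
    "PE": {"chest_pain": 0.6, "dyspnea": 0.8},
    "Other": {"chest_pain": 0.1, "dyspnea": 0.1},
}
posterior = posterior_over_differential(priors, likelihoods, ["chest_pain", "dyspnea"])
for disease, p in posterior.items():
    print(f"{disease}: {p:.3f}")
```

Because the denominator sums over every condition in the differential, the resulting probabilities always add up to 1.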
V. Evaluation: Example of Family Tree and Hemophilia
- Setup
- A healthy woman has a brother with Hemophilia (xY)
- Hemophilia is X-linked; because she is unaffected, she is either xX (Hemophilia carrier) or XX (normal)
- She has two healthy male children without Hemophilia (each XY)
- What is the probability that she is XX (no Hemophilia gene)?
- Assumptions
- P(xX) = P(XX) = 0.5 = prior probability that the mother is a Hemophilia carrier (xX), or that she is normal (XX)
- P(cXY and cXY|mXX) = probability that both children are XY (normal) given mother is XX = 1
- P(cXY and cXY|mxX) = probability that both children are XY (normal) given mother is xX (Hemophilia carrier) = 0.5 * 0.5 = 0.25
- Bayes Formula
- P(A|B) = P(B|A) * P(A) / (P(B|A)*P(A) + P(B|not A)*P(not A) )
- P(mXX| cXY and cXY) = Probability mother has 2 normal X copies given 2 non-Hemophiliac sons
- P(mXX| cXY and cXY) = P(cXY and cXY|mXX) * P(XX) / ( P(cXY and cXY|mXX) * P(XX) + P(cXY and cXY|mxX) * P(xX) )
- P(mXX| cXY and cXY) = (1 * 0.5) / (1 * 0.5 + 0.25 * 0.5) = 0.5 / 0.625 = 0.8 or 4/5
- References
- (2015) Columbia Statistical Thinking for Data Science and Analytics, EDX, accessed online 2/4/2017
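The Hemophilia calculation in this section can be checked with exact fractions. This is a minimal sketch, not part of the original source: the function name and variable names are illustrative assumptions, and the inputs are the probabilities listed under Assumptions above.

```python
from fractions import Fraction


def posterior_binary(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) = P(B|A)*P(A) / (P(B|A)*P(A) + P(B|not A)*P(not A))."""
    numerator = p_b_given_a * prior_a
    denominator = numerator + p_b_given_not_a * (1 - prior_a)
    return numerator / denominator


prior_xx = Fraction(1, 2)                   # P(XX): mother is not a carrier
p_sons_given_xx = Fraction(1)               # both sons healthy if mother is XX
p_sons_given_carrier = Fraction(1, 2) ** 2  # each son healthy with probability 1/2

p_xx = posterior_binary(prior_xx, p_sons_given_xx, p_sons_given_carrier)
print(p_xx)         # 4/5
print(float(p_xx))  # 0.8
```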
VI. Resources
- Bayes Theorem (Wikipedia)
- Bayes Theorem (Khan Academy)
VII. References
- Desai (2014) Clinical Decision Making, AMIA’s CIBRC Online Course
- Hersh (2014) Knowledge Acquisition and Use for Clinical Decision Support, AMIA’s CIBRC Online Course