Rama Reksotinoyo

Posted on May 28, 2023

Walking through Bayes' theorem to update our beliefs systematically.

#datascience #math #bayes #programming

I once heard a quote attributed to the ancient Greeks, and the quote below relates to Bayes' theorem.

"Always leave room for uncertainty so that we remain open to new things and survive."

Bayes' theorem is a statistical method for calculating the probability of an event based on prior assumptions that may be related to that event. Inspired by this, I decided to delve deeper (with my limited brain capacity, of course) into Bayes' theorem and tried to implement it in code.

I modeled the case of someone, let's call them X, who is diagnosed with HIV by a test. I wanted to determine the probability that X truly has HIV given that diagnosis.

Here is the formula for Bayes' theorem:

P(A|B) = P(B|A) × P(A) / P(B)

With zero medical knowledge, I assumed that out of a population of 5000 people who took an HIV test, only 3% actually had HIV, since I rarely come across people with HIV in my current environment.

After reading some papers and hearing opinions from HIV specialists and HIV researchers, I found that if X has HIV, the probability that the test diagnoses them as positive is 95%. This is referred to as sensitivity. If X does not have HIV, the probability that the test correctly reports them as not having HIV is 98%. This is referred to as specificity.
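The two error rates that appear later in the code (0.05 and 0.02) are the complements of these figures. A minimal sketch, taking sensitivity as P(diagnosed|HIV) = 0.95 and specificity as P(not diagnosed|not HIV) = 0.98; the helper `Complement` and the variable names are illustrative and not part of the original program:

```go
package main

import "fmt"

// Complement returns 1 - p, the probability that an event
// with probability p does not occur.
// Hypothetical helper, introduced only for this illustration.
func Complement(p float64) float64 {
    return 1 - p
}

func main() {
    sensitivity := 0.95 // P(diagnosed | HIV), from the text
    specificity := 0.98 // P(not diagnosed | no HIV), from the text

    falseNegativeRate := Complement(sensitivity) // P(not diagnosed | HIV)
    falsePositiveRate := Complement(specificity) // P(diagnosed | no HIV)

    fmt.Printf("P(not diagnosed | HIV) = %.2f\n", falseNegativeRate)
    fmt.Printf("P(diagnosed | no HIV)  = %.2f\n", falsePositiveRate)
}
```

The false-positive rate, P(diagnosed | no HIV), is the one needed to compute the evidence.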

Therefore:
P(HIV) = prior = 0.03
P(diagnosed|HIV) = likelihood = 0.95
P(diagnosed) = evidence
P(HIV|diagnosed) = posterior
P(not diagnosed|not HIV) = 0.98

Thus, the confusion matrix table is as follows, where the horizontal axis represents the actual values and the vertical axis represents the predicted values.

Confusion Matrix:
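The expected cell counts can be derived from the figures already stated (5000 people, 3% prevalence, 95% sensitivity, 98% specificity). The sketch below is one way to compute them, with columns as actual values and rows as predicted values. The type `ConfusionMatrix` and the function `ExpectedCounts` are hypothetical names introduced here, and the counts are expectations, so they need not be whole numbers:

```go
package main

import "fmt"

// ConfusionMatrix holds the expected number of people in each cell.
// Hypothetical type, introduced only for this illustration.
type ConfusionMatrix struct {
    TruePositive  float64 // diagnosed and has HIV
    FalseNegative float64 // not diagnosed but has HIV
    FalsePositive float64 // diagnosed but does not have HIV
    TrueNegative  float64 // not diagnosed and does not have HIV
}

// ExpectedCounts derives the expected cell counts from the population
// size, the prevalence (prior), the sensitivity, and the specificity.
func ExpectedCounts(population, prevalence, sensitivity, specificity float64) ConfusionMatrix {
    withHIV := population * prevalence
    withoutHIV := population - withHIV
    return ConfusionMatrix{
        TruePositive:  withHIV * sensitivity,
        FalseNegative: withHIV * (1 - sensitivity),
        FalsePositive: withoutHIV * (1 - specificity),
        TrueNegative:  withoutHIV * specificity,
    }
}

func main() {
    m := ExpectedCounts(5000, 0.03, 0.95, 0.98)

    // Columns are actual values, rows are predicted values.
    fmt.Printf("%-16s %10s %10s\n", "", "HIV", "No HIV")
    fmt.Printf("%-16s %10.1f %10.1f\n", "Diagnosed", m.TruePositive, m.FalsePositive)
    fmt.Printf("%-16s %10.1f %10.1f\n", "Not diagnosed", m.FalseNegative, m.TrueNegative)
}
```

Under these assumptions, the share of diagnosed people who actually have HIV is TruePositive / (TruePositive + FalsePositive), which is the same quantity Bayes' theorem gives as the posterior.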

Since the evidence is unknown in this case, I calculate it with the law of total probability, P(B) = Σ[P(B|Aᵢ)P(Aᵢ)]. Here is the implementation in the Go programming language:

const (
    PDiagnosedGivenHIV      = 0.95 // P(D|H), sensitivity
    PNotDiagnosedGivenNoHIV = 0.98 // P(¬D|¬H), specificity
    PDiagnosedGivenNoHIV    = 0.02 // P(D|¬H) = 1 - specificity
    PNotDiagnosedGivenHIV   = 0.05 // P(¬D|H) = 1 - sensitivity
)

// TotalProbability returns the evidence P(D) for a given prior P(H).
func TotalProbability(prior *float64) float64 {
    // Law of total probability:
    // P(D) = P(D|H)P(H) + P(D|¬H)P(¬H)
    PNoHIV := 1 - *prior
    return (PDiagnosedGivenHIV * *prior) + (PDiagnosedGivenNoHIV * PNoHIV)
}

With the evidence computed as 0.0479, I only need to substitute it, together with the known values, into Bayes' theorem:

P(HIV|diagnosed) = P(diagnosed|HIV) × P(HIV) / P(diagnosed) = (0.95 × 0.03) / 0.0479 ≈ 0.595

Here is the complete code along with the test cases:

package main

import "fmt"

type Probability struct {
    PriorProb       float64
    ConditionalProb float64
    EvidenceProb    float64
}

func (p *Probability) CalculatePosteriorProb() float64 {
    // Compute the posterior probability using Bayes' formula.
    posteriorProb := (p.ConditionalProb * p.PriorProb) / p.EvidenceProb
    return posteriorProb
}

const (
    PDiagnosedGivenHIV      = 0.95 // P(D|H), sensitivity
    PNotDiagnosedGivenNoHIV = 0.98 // P(¬D|¬H), specificity
    PDiagnosedGivenNoHIV    = 0.02 // P(D|¬H) = 1 - specificity
    PNotDiagnosedGivenHIV   = 0.05 // P(¬D|H) = 1 - sensitivity
)

// TotalProbability returns the evidence P(D) for a given prior P(H).
func TotalProbability(prior *float64) float64 {
    // Law of total probability:
    // P(D) = P(D|H)P(H) + P(D|¬H)P(¬H)
    PNoHIV := 1 - *prior
    return (PDiagnosedGivenHIV * *prior) + (PDiagnosedGivenNoHIV * PNoHIV)
}

func main() {
    prior := 0.03
    evidence := TotalProbability(&prior)

    prob := Probability{
        PriorProb:       prior,              // prior, P(HIV)
        ConditionalProb: PDiagnosedGivenHIV, // likelihood, P(diagnosed|HIV)
        EvidenceProb:    evidence,           // evidence, P(diagnosed)
    }

    // Compute the posterior probability using Bayes' rule.
    posteriorProb := prob.CalculatePosteriorProb()

    fmt.Printf("Posterior Probability: %.2f\n", posteriorProb)
}

package main

import (
    "math"
    "testing"
)

const epsilon = 0.00000001 // Error tolerance

func TestCalculatePosteriorProb(t *testing.T) {
    prob := Probability{
        PriorProb:       0.1,  // prior probability
        ConditionalProb: 0.75, // conditional probability (likelihood)
        EvidenceProb:    0.1,  // evidence probability
    }

    expectedPosteriorProb := 0.75

    actualPosteriorProb := prob.CalculatePosteriorProb()

    if math.Abs(actualPosteriorProb-expectedPosteriorProb) > epsilon {
        t.Errorf("Wrong answer! It should be %.8f but got %.8f", expectedPosteriorProb, actualPosteriorProb)
    }
}

func TestTotalProbability(t *testing.T) {
    e := 0.0001 // Error tolerance
    prior := 0.2

    // 0.95*0.2 + 0.02*0.8 = 0.2060
    expectedTotalProb := 0.2060
    actualTotalProb := TotalProbability(&prior)

    if math.Abs(actualTotalProb-expectedTotalProb) > e {
        t.Errorf("Wrong answer! It should be %.4f but got %.4f", expectedTotalProb, actualTotalProb)
    }
}

In the next article, I will talk about how to update our level of confidence, or prior, and show that the probability of someone actually having HIV after a positive test rises with the size of our prior, or initial assumption. Testing and implementing the narrative "to update our belief systematically" will be covered there. See you soon.
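As a small preview of that dependence, here is a minimal sketch that evaluates the posterior for a few priors with the same test characteristics used in this article. The function `Posterior` and the list of priors are hypothetical choices made for illustration; only the 0.03 prior comes from the example above:

```go
package main

import "fmt"

const (
    sensitivity       = 0.95 // P(diagnosed | HIV), from this article
    falsePositiveRate = 0.02 // P(diagnosed | no HIV) = 1 - specificity
)

// Posterior returns P(HIV | diagnosed) for a given prior P(HIV),
// using the law of total probability for the evidence.
// Hypothetical helper, introduced only for this illustration.
func Posterior(prior float64) float64 {
    evidence := sensitivity*prior + falsePositiveRate*(1-prior)
    return sensitivity * prior / evidence
}

func main() {
    // Priors other than 0.03 are assumptions chosen for illustration.
    for _, prior := range []float64{0.001, 0.03, 0.2, 0.5} {
        fmt.Printf("prior = %.3f -> posterior = %.3f\n", prior, Posterior(prior))
    }
}
```

With these numbers, a larger prior yields a larger posterior for the same positive test result.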
