ValueMarkers

Posted on Mar 25 • Originally published at valuemarkers.com

Detecting Earnings Manipulation with the Beneish M-Score: Python Implementation

#python #finance #datascience #machinelearning

In 1998, students at Cornell University flagged Enron as a likely earnings manipulator using a statistical model. Wall Street analysts were still recommending "buy." The model? The Beneish M-Score. Here's how to implement it in Python.

What is the Beneish M-Score?

Published in 1999 by Professor Messod D. Beneish of Indiana University, the M-Score is a statistical model that combines 8 financial ratios into a single score indicating whether a company is likely to have manipulated its reported earnings.

The key threshold: -1.78

M-Score > -1.78: Likely manipulator
M-Score < -1.78: Unlikely manipulator
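Beneish estimated the model as a probit, so one way to read the score is as an index into the standard normal distribution. The sketch below makes that reading explicit using only the standard library. The helper names `manipulation_probability` and `classify` are hypothetical, and treating the result as a calibrated probability for any single company is an assumption, not something the threshold rule requires.

```python
from statistics import NormalDist

THRESHOLD = -1.78


def manipulation_probability(m_score: float) -> float:
    """Probit reading of the score: standard normal CDF evaluated at M."""
    return NormalDist().cdf(m_score)


def classify(m_score: float) -> str:
    """Apply the same strict 'greater than' rule as the threshold above."""
    return "likely manipulator" if m_score > THRESHOLD else "unlikely manipulator"


for m in (-2.50, -1.78, -1.00):
    print(f"M = {m:+.2f}  p = {manipulation_probability(m):.4f}  {classify(m)}")
```

Under this reading, the -1.78 cutoff sits at a probability of a little under 4%, which is a reminder that the threshold marks elevated risk rather than near-certainty.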

The 8 Variables

Each variable captures a different dimension of potential manipulation:

Variable	Name	What It Detects
DSRI	Days Sales in Receivables Index	Revenue inflation through receivables
GMI	Gross Margin Index	Deteriorating margins (incentive to manipulate)
AQI	Asset Quality Index	Capitalization of expenses
SGI	Sales Growth Index	High growth (more pressure to manipulate)
DEPI	Depreciation Index	Slower depreciation to boost earnings
SGAI	SG&A Expense Index	Disproportionate overhead changes
TATA	Total Accruals to Total Assets	Earnings quality (cash vs. accruals)
LVGI	Leverage Index	Increasing debt pressure

Python Implementation

def beneish_m_score(
    dsri: float, gmi: float, aqi: float, sgi: float,
    depi: float, sgai: float, tata: float, lvgi: float
) -> dict:
    """
    Calculate the Beneish M-Score for earnings manipulation detection.

    Reference: Beneish, M.D. (1999). "The Detection of Earnings 
    Manipulation." Financial Analysts Journal, 55(5), 24-36.
    """
    THRESHOLD = -1.78

    m = (-4.84
         + 0.920 * dsri
         + 0.528 * gmi
         + 0.404 * aqi
         + 0.892 * sgi
         + 0.115 * depi
         - 0.172 * sgai
         + 4.679 * tata
         - 0.327 * lvgi)

    return {
        'm_score': round(m, 4),
        'likely_manipulator': m > THRESHOLD,
        'variables': {
            'DSRI': dsri, 'GMI': gmi, 'AQI': aqi, 'SGI': sgi,
            'DEPI': depi, 'SGAI': sgai, 'TATA': tata, 'LVGI': lvgi
        }
    }


def calculate_variables(current: dict, previous: dict) -> dict:
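    """
    Calculate all 8 M-Score variables from two years of financial
    statements and return the scored result from beneish_m_score().

    Args:
        current: Dict with current year financials
        previous: Dict with previous year financials

    Required keys: revenue, receivables, cogs, current_assets,
        ppe, total_assets, depreciation, sga,
        current_liabilities, lt_debt, net_income, ocf
    Optional key: securities (treated as 0 when absent)

    Denominators such as revenue and total_assets are assumed to be
    non-zero; only some zero cases fall back to a neutral 1.0.
    """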
    c, p = current, previous

    # DSRI: Days Sales in Receivables Index
    dsri = ((c['receivables'] / c['revenue']) / 
            (p['receivables'] / p['revenue']))

    # GMI: Gross Margin Index
    gm_prev = (p['revenue'] - p['cogs']) / p['revenue']
    gm_curr = (c['revenue'] - c['cogs']) / c['revenue']
    gmi = gm_prev / gm_curr if gm_curr != 0 else 1.0

    # AQI: Asset Quality Index
    hard_c = c['current_assets'] + c['ppe'] + c.get('securities', 0)
    hard_p = p['current_assets'] + p['ppe'] + p.get('securities', 0)
    aq_c = 1 - (hard_c / c['total_assets'])
    aq_p = 1 - (hard_p / p['total_assets'])
    aqi = aq_c / aq_p if aq_p != 0 else 1.0

    # SGI: Sales Growth Index
    sgi = c['revenue'] / p['revenue']

    # DEPI: Depreciation Index
    dep_rate_c = c['depreciation'] / (c['ppe'] + c['depreciation'])
    dep_rate_p = p['depreciation'] / (p['ppe'] + p['depreciation'])
    depi = dep_rate_p / dep_rate_c if dep_rate_c != 0 else 1.0

    # SGAI: SGA Expense Index
    sgai = ((c['sga'] / c['revenue']) / 
            (p['sga'] / p['revenue']))

    # TATA: Total Accruals to Total Assets
    tata = (c['net_income'] - c['ocf']) / c['total_assets']

    # LVGI: Leverage Index
    lev_c = (c['current_liabilities'] + c['lt_debt']) / c['total_assets']
    lev_p = (p['current_liabilities'] + p['lt_debt']) / p['total_assets']
    lvgi = lev_c / lev_p if lev_p != 0 else 1.0

    return beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, tata, lvgi)
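Because `calculate_variables` divides by several inputs without checking them, it can help to validate both years before calling it. The following is a minimal sketch of such a check, not part of the original implementation: `validate_year` and `validate_financials` are hypothetical helper names, and the sample figures are invented purely for illustration.

```python
REQUIRED_KEYS = (
    "revenue", "receivables", "cogs", "current_assets", "ppe",
    "total_assets", "depreciation", "sga", "current_liabilities",
    "lt_debt", "net_income", "ocf",
)


def validate_year(year: dict, label: str, nonzero: tuple) -> list:
    """Return a list of problems found in one year's figures (empty if none)."""
    problems = [f"{label}: missing '{key}'" for key in REQUIRED_KEYS if key not in year]
    problems += [
        f"{label}: '{key}' must be non-zero"
        for key in nonzero
        if year.get(key) == 0
    ]
    return problems


def validate_financials(current: dict, previous: dict) -> list:
    # Revenue and total_assets are denominators in both years; the previous
    # year is also divided by its receivables and sga (DSRI and SGAI).
    # ppe + depreciation is a denominator too, but is not checked in this sketch.
    return (
        validate_year(current, "current", ("revenue", "total_assets"))
        + validate_year(previous, "previous",
                        ("revenue", "total_assets", "receivables", "sga"))
    )


# Invented figures, for illustration only.
previous = {
    "revenue": 1000.0, "receivables": 120.0, "cogs": 600.0,
    "current_assets": 400.0, "ppe": 350.0, "total_assets": 1000.0,
    "depreciation": 40.0, "sga": 150.0, "current_liabilities": 200.0,
    "lt_debt": 250.0, "net_income": 80.0, "ocf": 95.0,
}
current = {**previous, "revenue": 1300.0, "receivables": 210.0, "ocf": 20.0}

# A deliberately broken prior year: 'ocf' removed and revenue set to zero.
broken_previous = {k: v for k, v in previous.items() if k != "ocf"}
broken_previous["revenue"] = 0.0

print(validate_financials(current, previous))
print(validate_financials(current, broken_previous))
```

If the returned list is empty, the two dicts should be safe to pass to `calculate_variables(current, previous)`; otherwise the messages say which inputs to fix first.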

The TATA Variable: The Most Powerful Predictor

Notice that the coefficient on TATA is 4.679, by far the largest in the formula. Total accruals to total assets measures how much of a company's reported earnings is backed by operating cash flow rather than by accounting accruals.

# High accruals = earnings don't match cash flow = red flag
tata = (net_income - operating_cash_flow) / total_assets

# A TATA of 0.05 adds 0.23 to the M-Score
# A TATA of 0.15 adds 0.70 to the M-Score
# This alone can push a company past the -1.78 threshold

Companies with high TATA values are generating earnings primarily through accounting entries rather than cash. Research consistently shows these earnings are less sustainable and more likely to be manipulated.
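To see how far TATA alone can move the result, here is a small sketch that computes the score for a "no change" year (every index at 1.0, TATA at 0.0) and then solves for the TATA value that reaches the threshold. The coefficients are the ones from the formula above; the function names `baseline_score` and `breakeven_tata` are hypothetical, and the neutral baseline is an illustrative assumption rather than a typical company.

```python
INTERCEPT = -4.84
THRESHOLD = -1.78
COEFFS = {
    "DSRI": 0.920, "GMI": 0.528, "AQI": 0.404, "SGI": 0.892,
    "DEPI": 0.115, "SGAI": -0.172, "TATA": 4.679, "LVGI": -0.327,
}


def baseline_score() -> float:
    """M-Score when every index equals 1.0 and TATA equals 0.0."""
    return INTERCEPT + sum(c for name, c in COEFFS.items() if name != "TATA")


def breakeven_tata(score_without_tata: float) -> float:
    """TATA value at which the total score equals the threshold."""
    return (THRESHOLD - score_without_tata) / COEFFS["TATA"]


base = baseline_score()
print(f"baseline M-Score: {base:.3f}")
print(f"break-even TATA:  {breakeven_tata(base):.4f}")
```

With all indices at 1.0 the baseline is -2.48, so accruals of roughly 15% of total assets are enough to reach -1.78 on their own, which lines up with the 0.15 case in the comments above.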

Real-World Example: The Enron Detection

When the Cornell students ran Beneish's model in 1998, it flagged Enron with an M-Score well above -1.78. The key red flags:

High DSRI: Receivables growing faster than revenue
High AQI: Increasing "soft" assets (off-balance-sheet entities)
Extreme TATA: Massive gap between reported earnings and cash flow

The bankruptcy didn't happen until 2001, giving the model a 3-year lead time.

Limitations

Not designed for financial institutions: Banks make money differently. The sales/receivables relationship doesn't apply the same way.
False positives in high-growth companies: Rapid growth naturally elevates SGI and can push other variables higher. Not every high-growth company is manipulating.
Backward-looking: The model uses last year's data. By the time you calculate it, the manipulation may have already unwound.
Probabilistic, not deterministic: An M-Score above -1.78 doesn't prove manipulation. It says the financial patterns are consistent with companies that have historically manipulated.
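One way to act on the high-growth caveat is to break a score into per-variable contributions and see what is actually driving it. The sketch below does that relative to a no-change year (indices at 1.0, TATA at 0.0). The function names and the `fast_grower` figures are hypothetical and invented for illustration; the dict keys match the `variables` entry returned by `beneish_m_score`.

```python
INTERCEPT = -4.84
THRESHOLD = -1.78
COEFFS = {
    "DSRI": 0.920, "GMI": 0.528, "AQI": 0.404, "SGI": 0.892,
    "DEPI": 0.115, "SGAI": -0.172, "TATA": 4.679, "LVGI": -0.327,
}


def m_score(variables: dict) -> float:
    """Same linear formula as beneish_m_score, written as a sum."""
    return INTERCEPT + sum(COEFFS[name] * variables[name] for name in COEFFS)


def excess_contributions(variables: dict) -> dict:
    """Each variable's contribution relative to a no-change year."""
    return {
        name: coeff * (variables[name] - (0.0 if name == "TATA" else 1.0))
        for name, coeff in COEFFS.items()
    }


# Invented example: 80% sales growth, everything else essentially flat.
fast_grower = {
    "DSRI": 1.0, "GMI": 1.0, "AQI": 1.0, "SGI": 1.8,
    "DEPI": 1.0, "SGAI": 1.0, "TATA": 0.02, "LVGI": 1.0,
}

score = m_score(fast_grower)
drivers = sorted(excess_contributions(fast_grower).items(),
                 key=lambda item: item[1], reverse=True)

print(f"M-Score: {score:.4f}  flagged: {score > THRESHOLD}")
for name, value in drivers[:3]:
    print(f"  {name}: {value:+.4f}")
```

In this invented case the score crosses -1.78 almost entirely because of SGI, which is the pattern the high-growth limitation warns about and a cue to look at the filings before drawing conclusions.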

Building a Quality Screen

The M-Score is most powerful when combined with complementary metrics:

def quality_triple_check(piotroski_score, altman_z, beneish_m):
    """
    Combine three academic models for comprehensive 
    financial health assessment.
    """
    checks = {
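        # Piotroski F-Score (0-9): 6 or more counts as strong here
        'financial_strength': piotroski_score >= 6,
        # Altman Z-Score: above 1.81 means outside the distress zone
        # (1.81 to 2.99 is still the grey zone, not the safe zone)
        'bankruptcy_safe': altman_z > 1.81,
        # Beneish M-Score: below the -1.78 manipulation threshold
        'earnings_clean': beneish_m < -1.78,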
    }

    passed = sum(checks.values())

    return {
        'checks': checks,
        'passed': f"{passed}/3",
        'verdict': 'PASS' if passed == 3 else 
                   'CAUTION' if passed == 2 else 'FAIL'
    }
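As a usage sketch, the screen can be run over a small watchlist. The block restates `quality_triple_check` compactly so that it runs on its own; the tickers and scores are hypothetical, invented for illustration only.

```python
def quality_triple_check(piotroski_score, altman_z, beneish_m):
    # Compact restatement of the function above so this snippet is self-contained.
    checks = {
        "financial_strength": piotroski_score >= 6,
        "bankruptcy_safe": altman_z > 1.81,
        "earnings_clean": beneish_m < -1.78,
    }
    passed = sum(checks.values())
    verdict = "PASS" if passed == 3 else "CAUTION" if passed == 2 else "FAIL"
    return {"checks": checks, "passed": f"{passed}/3", "verdict": verdict}


# Hypothetical companies and scores, not real data.
companies = {
    "ALPHA": {"piotroski_score": 8, "altman_z": 3.4, "beneish_m": -2.6},
    "BETA": {"piotroski_score": 7, "altman_z": 2.1, "beneish_m": -1.2},
    "GAMMA": {"piotroski_score": 3, "altman_z": 1.2, "beneish_m": -1.5},
}

results = {name: quality_triple_check(**scores) for name, scores in companies.items()}

for name, result in results.items():
    failed = [check for check, ok in result["checks"].items() if not ok]
    print(f"{name}: {result['verdict']} ({result['passed']}) failed={failed}")
```

Listing the failed checks next to the verdict keeps a CAUTION result actionable: it shows whether the concern is earnings quality, balance-sheet stress, or weak fundamentals.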

References

Beneish, M.D. (1999). "The Detection of Earnings Manipulation." Financial Analysts Journal, 55(5), 24-36.
Beneish, M.D., Lee, C.M.C., & Nichols, D.C. (2013). "Earnings Manipulation and Expected Returns." Financial Analysts Journal, 69(2), 57-82.

I'm Javier Sanz, a software engineer and value investor building tools for fundamental analysis at ValueMarkers.
