The "Why": A Framework for AI Ethics
Originally published on BlockSimplified
This article is part of my AI Fluency Curriculum, documenting my learnings around AI Fluency & Applied AI.
This is the first post in Module 8: Ethics, Safety, and Governance. We're starting with the foundational question: why do ethics matter for AI, and how do we actually practice them?
Ethics in AI isn't a box to check at the end of a project. It's a way of thinking that shapes every decision, from data collection to deployment. Get it wrong, and your AI fails the people who use it. Real people. At scale.
Current as of June 2026
This guide reflects the landscape as of June 2026:
- EU AI Act: the general-purpose AI obligations went live in August 2025, but in May 2026 EU negotiators provisionally agreed a "Digital Omnibus" that pushes the high-risk-system obligations back from August 2026 to December 2027.
- Litigation: the 2025-26 wave of AI hiring-bias cases (Mobley v. Workday's certified collective action, still escalating in 2026, and Harper v. Sirius XM).
- Tooling: Fairlearn 0.14 (June 2026).
- Governance baselines: the growing role of ISO/IEC 42001 and the NIST Generative AI Profile.
I want to start with a confession. When I first heard "AI ethics" years ago, I mentally filed it under "compliance stuff that slows down real work." I was wrong, and the cases in this post are what changed my mind.
Look closely at how the famous failures actually happened and the same shape shows up every time. A well-intentioned team. No malice. No obvious bug. The system worked exactly as designed, but the design hadn't accounted for the ethical implications of the data patterns it learned. Amazon spent years building a recruiting tool it eventually threw away. A healthcare algorithm shaped care decisions for an estimated 100 million people before anyone caught what it was doing. None of these were cheap to fix, and none were caught early.
Ethics isn't the enemy of shipping. It's the prerequisite for shipping something that doesn't blow up in your face.
The four pillars of AI ethics, in brief
- AI ethics breaks down into four pillars (FATP): fairness, accountability, transparency, and privacy.
- AI amplifies bias in its training data: Amazon's recruiting tool, COMPAS risk scores, and a healthcare algorithm affecting 100 million patients all show how.
- Fairness definitions are mathematically incompatible when base rates differ, so you must pick one and document the trade-off.
- Treat ethics as a design constraint with concrete tools like Fairlearn, not a post-launch box to check.
What You'll Learn
By the end of this post, you'll be able to:
- Understand and apply the four pillars of AI ethics: fairness, accountability, transparency, and privacy
- Recognize real-world examples of AI systems causing ethical harm
- Use technical interventions like Fairlearn for measuring and mitigating bias
- Navigate the trade-offs between competing ethical principles
We'll cover three levels: Beginner (real-world case studies of AI bias), Intermediate (technical fairness interventions), and Advanced (societal trade-offs and formal mitigation strategies).
Beginner: Real-World Case Studies of AI Harm
Let me start with stories. Not because I want to scare you, but because ethical failures aren't abstract. They happen to real people, and understanding what went wrong is the first step to building differently.
Case Study 1: Amazon's Hiring Algorithm
In 2018, Reuters reported that Amazon had scrapped an internal AI recruiting tool after discovering it was biased against women. The system was trained on resumes submitted over 10 years, a period when the tech industry was predominantly male. The AI learned to prefer male candidates: it penalized resumes that included the word "women's" (as in "women's chess club") and downgraded graduates of all-women's colleges.
The AI wasn't malicious. It was doing exactly what it was trained to do: find patterns in historical data and replicate them. The problem was that historical data encoded historical discrimination. The algorithm automated and scaled what had been individual human bias.
The lesson: AI systems don't transcend their training data. They amplify it.
Case Study 2: COMPAS Criminal Risk Assessment
ProPublica's 2016 investigation of COMPAS, a criminal risk assessment algorithm used across the US, found that the system was significantly more likely to incorrectly flag Black defendants as high-risk compared to white defendants. A white defendant and a Black defendant with similar criminal histories would receive different risk scores.
The company that made COMPAS, Northpointe (now Equivant), disputed the methodology. They argued that the algorithm met a different fairness criterion. Both sides were technically correct; they were just using different definitions of fairness.
The COMPAS case surfaces something the Amazon case didn't: algorithmic fairness isn't a single thing. There are multiple, mathematically incompatible definitions of what "fair" means. You literally cannot satisfy all of them simultaneously when different groups have different base rates.
The lesson: "Make it fair" isn't a specification. You have to choose which type of fairness matters most for your context, and be honest about the trade-offs.
Case Study 3: Healthcare Algorithm Racial Bias
A 2019 study published in Science found that a widely used healthcare algorithm systematically underestimated the health needs of Black patients. The algorithm used healthcare costs as a proxy for health needs, but because Black patients historically had less access to healthcare (and thus lower costs), the algorithm concluded they were healthier than they actually were.
The result? Sicker Black patients were deprioritized for care programs compared to healthier white patients. The algorithm affected an estimated 100 million patients annually.
100 million: patients affected annually by a healthcare algorithm that underestimated the health needs of Black patients. It used healthcare costs as a proxy for health, and Black patients historically had lower costs due to less access to care. (Obermeyer et al., Science (2019))
Proxy variables are dangerous
The healthcare algorithm never used race directly. It used healthcare costs, which correlated with race due to systemic inequities. This is called "bias by proxy," and it's one of the sneakiest ways discrimination enters AI systems. You can build a discriminatory system without ever touching protected attributes.
The lesson: Fairness through unawareness (not using sensitive attributes) doesn't work. Other variables carry the same signal.
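Here's a minimal sketch of bias by proxy on made-up data. Everything in it is hypothetical: the two groups, the region codes `R1` and `R2`, and the `approve` rule. The point is only that a rule which never reads the group label can still produce very different outcomes per group when its input correlates with group membership.

```python
from collections import defaultdict

# Hypothetical toy data: (group, region). The group label is kept only so we
# can audit outcomes; the decision rule below never reads it.
applicants = (
    [("A", "R1")] * 8 + [("A", "R2")] * 2 +
    [("B", "R1")] * 2 + [("B", "R2")] * 8
)

def approve(region: str) -> bool:
    """A 'group-blind' rule that only looks at the region code."""
    return region == "R2"

def selection_rates(rows):
    """Share of approved applicants per group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, region in rows:
        total[group] += 1
        approved[group] += approve(region)  # True counts as 1
    return {g: approved[g] / total[g] for g in total}

rates = selection_rates(applicants)
print(rates)  # {'A': 0.2, 'B': 0.8}
```

Dropping the sensitive column changed nothing here, because region carries the same signal. That is why you still need the group labels at audit time, even if the model never sees them.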
Case Study 4: The 2025 AI Hiring-Bias Lawsuits
The earlier cases were investigations and academic studies. By 2025, these failures became courtroom liability. In May 2025, a federal court in California granted preliminary certification of a nationwide age-discrimination collective action in Mobley v. Workday, where the plaintiff alleges that an AI applicant-screening platform rejected him from over 100 jobs (his broader suit also claims race and disability discrimination). The case has only escalated since: in March 2026 the court rejected Workday's argument that age-discrimination law doesn't cover job applicants, keeping the collective action alive. The legal theory is disparate impact: even with no intent to discriminate, a screening tool that disproportionately filters out a protected group can be unlawful.
Then in August 2025, Harper v. Sirius XM echoed the healthcare case's proxy problem in a hiring context: the complaint alleges the AI screener used educational background and zip codes as stand-ins that correlated with race. Same "bias by proxy" mechanism, brand-new legal exposure.
The lesson: Ethical failures aren't just reputational anymore. If your AI screens, ranks, or scores people, "we didn't mean to discriminate" is not a defense: disparate impact looks at outcomes, not intent.
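To make "outcomes, not intent" concrete, here's a rough sketch of an outcome check you can run on any screening funnel. The counts and group names are invented, and the 0.8 cutoff is an assumption on my part: it mirrors the "four-fifths" rule of thumb from US employment guidelines, which is a screening heuristic rather than a legal test.

```python
def impact_ratio(selected: dict, applied: dict) -> dict:
    """Each group's selection rate divided by the highest group's rate."""
    rates = {g: selected[g] / applied[g] for g in applied}
    best = max(rates.values())
    return {g: rates[g] / best for g in rates}

# Hypothetical screening counts, not taken from any real case.
applied = {"group_a": 400, "group_b": 300}
selected = {"group_a": 120, "group_b": 45}

ratios = impact_ratio(selected, applied)
for group, ratio in ratios.items():
    # 0.8 is an assumed review trigger, not a legal threshold.
    flag = "review" if ratio < 0.8 else "ok"
    print(f"{group}: impact ratio {ratio:.2f} ({flag})")
```

Notice that nothing in the function asks why the rates differ. That's the whole idea behind an outcome-based check.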
Case Study 5: The 2023-2024 Wave (When Shipped Products Failed)
The cases above are famous partly because they're old enough to have a verdict. But this isn't ancient history. The same failure modes keep shipping in mainstream AI products.
Google's Gemini (February 2024). Google paused Gemini's ability to generate images of people after it produced historically inaccurate results: racially diverse "founding fathers," a female pope, and Black and Asian Nazi soldiers. This one is the opposite of the Amazon case. Google had tuned the model to force diversity, countering the well-documented tendency of image models to default to white faces, and the correction overshot into rewriting history. CEO Sundar Pichai called the responses "completely unacceptable." The lesson is uncomfortable: mitigating bias is a judgment call, not a switch you flip. Over-correct, and you trade one failure for another.
SafeRent (settled November 2024). A tenant-screening AI gave each applicant a score that landlords used to accept or reject them. A class action alleged the score disproportionately downgraded Black and Hispanic applicants and housing-voucher holders, because it leaned on credit history and ignored the vouchers that make rent affordable. SafeRent settled for $2.3 million and agreed to stop showing the score for voucher applicants. This is the healthcare case's proxy problem, in production, with a price tag.
Stable Diffusion (analyzed 2023). When Bloomberg generated more than 5,000 images across occupations, the model amplified real-world bias rather than just mirroring it: higher-paying jobs skewed lighter-skinned and male, lower-paying jobs skewed darker-skinned, and "a person" defaulted to a light-skinned man. A University of Washington study presented at EMNLP 2023 found the same pattern. Generative AI, the tech most of us now use daily, makes the "AI scales the bias in its data" problem worse, not better.
The throughline: 2016 to 2024, different companies, different domains, identical root cause. Anyone who tells you AI bias is a solved problem is selling something.
Why These Cases Matter
Across a decade of mainstream AI systems, built by smart people with good intentions, the same root causes show up every time:
- Training data encoded historical inequities (hiring algorithm)
- Fairness definitions conflict and choices weren't made explicit (COMPAS)
- Proxy variables carried discriminatory signal (healthcare algorithm, SafeRent, and the 2025 hiring lawsuits)
- Outcomes, not intentions, create liability (Mobley, Harper, and the rising wave of disparate-impact litigation)
- Over-correcting backfires too (Gemini's forced diversity rewrote history)
If your response to these cases is "my team would never do that," you're not paying attention. The path from "reasonable business metric" to "systematically disadvantaging vulnerable populations" is shorter than most engineers realize.
The Four Pillars: A Framework You Can Actually Use
I organize AI ethics around four pillars. I call it FATP: Fairness, Accountability, Transparency, and Privacy. Use them as design constraints, not a post-launch checklist.
Pillar 1: Fairness
Fairness is about ensuring AI systems don't create or reinforce unfair bias against individuals or groups.
Key questions to ask:
- Who could be harmed by this system's errors?
- Are outcomes equitable across different demographic groups?
- What fairness metric are we optimizing for, and why that one?
- Have we tested for bias using real, representative data?
The impossible trade-off: Different fairness definitions (demographic parity, equalized odds, predictive parity) cannot all be satisfied simultaneously when base rates differ between groups. This is mathematically proven. You have to choose.
For a hiring algorithm, you might prioritize equal selection rates across groups (demographic parity). For a medical diagnosis system, you might prioritize equal false negative rates across groups (equal opportunity, a relaxed form of equalized odds) because missing a disease is the critical error. The right choice depends on what harms you're most trying to prevent.
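A small hand-rolled sketch shows why the choice matters. The labels and predictions below are invented, and `group_metrics` is my own helper, not a library function. Both groups get the same selection rate, yet only one group has positives that the model misses.

```python
def rate(flags):
    """Fraction of True values in a list (0.0 for an empty list)."""
    return sum(flags) / len(flags) if flags else 0.0

def group_metrics(y_true, y_pred, groups):
    """Per-group selection rate and false negative rate."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        selected = [y_pred[i] == 1 for i in idx]
        # Among actual positives, which ones did the model miss?
        missed = [y_pred[i] == 0 for i in idx if y_true[i] == 1]
        out[g] = {"selection_rate": rate(selected), "fnr": rate(missed)}
    return out

# Hypothetical labels and predictions for two groups of eight people each.
groups = ["A"] * 8 + ["B"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]

m = group_metrics(y_true, y_pred, groups)
for g, values in m.items():
    print(g, values)
# A {'selection_rate': 0.5, 'fnr': 0.25}
# B {'selection_rate': 0.5, 'fnr': 0.0}
```

By demographic parity this model looks fine. By false negative rate it fails group A. Which reading is "right" depends on the harm you care about.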
Pillar 2: Accountability
Accountability establishes who is responsible when AI systems cause harm.
Key questions to ask:
- Who owns this AI system's outcomes?
- What happens when it makes a mistake?
- Can affected individuals appeal or challenge AI decisions?
- Is there human oversight for high-stakes decisions?
The accountability gap: Traditional accountability frameworks assumed human decision-makers. AI breaks that. When an autonomous system denies your loan, who do you hold accountable? The developer who wrote the algorithm? The company that deployed it? The data provider whose dataset contained bias? The accountability gap is what you're left with when nobody clearly owns the outcome.
Practical accountability requires:
- Clear role assignments (who can stop a deployment, who reviews outcomes)
- Audit trails (records of what decisions were made and why)
- Redress mechanisms (how harmed individuals can seek remedy)
- Meaningful human oversight, where reviewers can actually change or stop a decision
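An audit trail can start as something very small. The sketch below is one possible shape, not a standard: every field name, the `DecisionRecord` class, and the append-only JSON Lines file are assumptions I'm making for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple

@dataclass(frozen=True)
class DecisionRecord:
    """One audit-trail entry. All field names here are hypothetical."""
    subject_id: str                 # pseudonymous ID, not raw personal data
    model_version: str
    decision: str                   # e.g. "approve", "deny", "refer_to_human"
    reason_codes: Tuple[str, ...]   # short codes explaining the decision
    reviewer: Optional[str] = None  # filled in when a human confirms or overrides
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def to_log_line(record: DecisionRecord) -> str:
    """Serialize a record as one JSON line."""
    return json.dumps(asdict(record), sort_keys=True)

def append_record(path: str, record: DecisionRecord) -> None:
    """Append to the log file; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(to_log_line(record) + "\n")

record = DecisionRecord(
    subject_id="applicant-001",
    model_version="screener-v3",
    decision="refer_to_human",
    reason_codes=("short_credit_history",),
)
line = to_log_line(record)
print(line)
```

The useful property is that the record answers the accountability questions directly: which model version decided, on what stated grounds, and whether a named human reviewed it.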
Pillar 3: Transparency
Transparency means making AI systems understandable to stakeholders.
Key questions to ask:
- Can affected individuals understand why a decision was made about them?
- Is the AI system's involvement disclosed?
- Are the limitations documented and communicated?
- Can the system be audited by external parties?
Levels of transparency:
- Technical explainability: What features drove this prediction? (SHAP values, attention weights)
- User interpretability: Why did I get this result, in plain language?
- Organizational disclosure: Are people told when AI is making decisions about them?
Different stakeholders need different types of transparency. An ML engineer debugging a model needs technical explainability. An end user denied a loan needs human-understandable reasoning. A regulator needs audit access.
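One way to bridge technical explainability and user interpretability is to map feature contributions to plain-language reasons. This is a sketch under assumptions: the contribution values are invented (in practice they might come from something like SHAP), and the feature names and sentence templates are hypothetical.

```python
def plain_language_reasons(contributions, templates, top_n=2):
    """Return sentences for the features that pushed the score down the most."""
    harmful = sorted(
        (value, feature) for feature, value in contributions.items() if value < 0
    )  # most negative contribution first
    return [templates[feature] for _, feature in harmful[:top_n]]

# Hypothetical per-feature contributions to one applicant's score.
contributions = {
    "credit_history_months": -0.30,
    "debt_to_income": -0.12,
    "income": 0.25,
    "recent_inquiries": -0.05,
}

# Hypothetical wording, written for the person affected rather than the engineer.
templates = {
    "credit_history_months": "Your credit history is shorter than we usually require.",
    "debt_to_income": "Your monthly debt is high relative to your income.",
    "income": "Your income is below our usual range.",
    "recent_inquiries": "You have several recent credit applications.",
}

reasons = plain_language_reasons(contributions, templates)
for reason in reasons:
    print("-", reason)
```

The same numbers serve the engineer as a debugging aid and the end user as an explanation, as long as someone owns the translation layer.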
Pillar 4: Privacy
Privacy protects individuals from unauthorized collection, use, and exposure of their data.
Key questions to ask:
- What data does this system collect, and is collection minimized?
- How long is data retained?
- Can individuals access, correct, or delete their data?
- Are there protections against re-identification from "anonymized" data?
AI-specific privacy concerns:
- Training data privacy: Models can memorize and regurgitate training data, including personal information
- Inference attacks: Attackers can probe model outputs to infer whether a person's data was in the training set, or in some cases to reconstruct training examples
- Aggregation risks: Combining multiple non-sensitive attributes can reveal sensitive information
Privacy by design means building protections into the system from the start, not bolting them on later.
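The aggregation risk is easy to demonstrate. The sketch below uses an invented six-row dataset and a helper I'm calling `smallest_group`, which reports the size of the rarest combination of values (the "k" in k-anonymity). Each attribute looks safe on its own; combined, they single people out.

```python
from collections import Counter

def smallest_group(rows, quasi_identifiers):
    """Size of the rarest combination of the given attributes."""
    combos = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(combos.values())

# Hypothetical "anonymized" records: no names, just coarse attributes.
rows = [
    {"age_band": "30-39", "region": "north", "job": "nurse"},
    {"age_band": "30-39", "region": "north", "job": "teacher"},
    {"age_band": "30-39", "region": "south", "job": "nurse"},
    {"age_band": "40-49", "region": "north", "job": "nurse"},
    {"age_band": "40-49", "region": "south", "job": "teacher"},
    {"age_band": "40-49", "region": "south", "job": "teacher"},
]

for columns in (["age_band"], ["region"], ["job"], ["age_band", "region", "job"]):
    print(columns, "-> smallest group:", smallest_group(rows, columns))
```

Every single column has at least three people per value, but the three columns together leave four of the six people in a group of one. That is the check to run before calling a dataset anonymized.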
Intermediate: Technical Fairness Interventions
Theory is nice, but let's get practical. How do you actually measure and mitigate bias in a real AI system?
Measuring Bias with Fairlearn
Fairlearn is an open-source toolkit (now a community-governed project, originally from Microsoft) that helps you assess and improve fairness. The current release is 0.14.0 (June 2026), which brought scikit-learn 1.6 compatibility and made the CorrelationRemover (handy for stripping proxy signal from features) fully scikit-learn-compatible. Here's a practical walkthrough.
Step 1: Define your fairness metric
Before you measure anything, decide what "fair" means for your use case. Fairlearn supports multiple metrics:
- Demographic parity: Selection rates are equal across groups
- Equalized odds: True positive and false positive rates are equal across groups
- Predictive parity: Precision is equal across groups
```python
# Example: Measuring demographic parity difference
from fairlearn.metrics import demographic_parity_difference

# y_true: actual outcomes, y_pred: predicted outcomes
# sensitive_features: group membership (e.g., gender, race)
dp_diff = demographic_parity_difference(
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive_features,
)
print(f"Demographic Parity Difference: {dp_diff:.3f}")
# Closer to 0 = more fair (by this definition)
```
Step 2: Visualize disparities
Fairlearn's MetricFrame breaks any metric down by group, and the resulting table is easy to plot if you want a visual.
```python
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, precision_score, recall_score

metrics = {
    'accuracy': accuracy_score,
    'precision': precision_score,
    'recall': recall_score,
}
metric_frame = MetricFrame(
    metrics=metrics,
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive_features,
)
# See performance broken down by group
print(metric_frame.by_group)
```
This prints accuracy, precision, and recall side by side for each group, so a gap that the aggregate score hides becomes obvious at a glance.
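If you want to see the idea without any library, here's a by-hand version on invented data. It is not how Fairlearn implements `MetricFrame`; it just shows how an aggregate score can hide a group that the model fails completely.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical data: group A has 8 people, group B has 2.
groups = ["A"] * 8 + ["B"] * 2
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 0, 0]

overall = accuracy(y_true, y_pred)

by_group = {}
for g in sorted(set(groups)):
    idx = [i for i, gi in enumerate(groups) if gi == g]
    by_group[g] = accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx])

print(f"overall: {overall:.2f}")  # overall: 0.80
print(by_group)                   # {'A': 1.0, 'B': 0.0}
```

An 80% headline number, and every single person in group B was misclassified. Small groups barely move the aggregate, which is exactly why you break it down.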
Step 3: Apply mitigation techniques
Fairlearn offers algorithms to reduce unfairness:
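- Preprocessing: Transform features before training, for example with CorrelationRemover, to strip signal that correlates with sensitive attributes
- Reduction approaches: Constrain the model during training to satisfy fairness constraints
- Post-processing: Adjust predictions after the model is trained, for example with ThresholdOptimizer, which finds group-specific decision thresholds to equalize a fairness metric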
```python
from fairlearn.postprocessing import ThresholdOptimizer

# Optimize thresholds to achieve equalized odds
postprocess_est = ThresholdOptimizer(
    estimator=your_model,
    constraints="equalized_odds",
    prefit=True,
)
postprocess_est.fit(X_train, y_train, sensitive_features=sensitive_train)
y_pred_fair = postprocess_est.predict(X_test, sensitive_features=sensitive_test)
```
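To build intuition for what group-specific thresholds do, here's a deliberately simplified sketch on invented scores. It is not Fairlearn's algorithm, and it targets equal selection rates rather than the equalized odds constraint used in the `ThresholdOptimizer` snippet above. The helper `threshold_for_rate` is hypothetical and assumes no tied scores.

```python
def threshold_for_rate(scores, target_rate):
    """Threshold that selects the top `target_rate` share of scores (no ties assumed)."""
    ranked = sorted(scores, reverse=True)
    k = int(target_rate * len(ranked))  # how many people to select
    if k == 0:
        return float("inf")             # select nobody
    return ranked[k - 1]                # score of the k-th highest person

def selection_rate(scores, threshold):
    """Share of scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

# Hypothetical model scores; group B's scores run systematically lower.
scores = {
    "A": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
    "B": [0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05],
}

# One shared threshold produces unequal selection rates.
single_rates = {g: selection_rate(v, 0.55) for g, v in scores.items()}

# Per-group thresholds chosen to select the top half of each group.
thresholds = {g: threshold_for_rate(v, 0.5) for g, v in scores.items()}
group_rates = {g: selection_rate(v, thresholds[g]) for g, v in scores.items()}

print(single_rates)  # {'A': 0.5, 'B': 0.25}
print(thresholds)    # {'A': 0.6, 'B': 0.4}
print(group_rates)   # {'A': 0.5, 'B': 0.5}
```

The gap closes, but only because group B is now held to a different cutoff. That is the values decision hiding inside the technique.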
Mitigation has costs
Fairness mitigation almost always reduces some other metric, often accuracy. This is not a bug; it's a fundamental trade-off. You're explicitly trading overall predictive performance for more equitable outcomes across groups. This is a values decision, not a technical one.
The Fairness-Accuracy Trade-off in Practice
Here's what the trade-off looks like in practice. Say you have a loan approval model:
| Scenario | Overall Accuracy | Approval Gap (Group A vs B) |
|---|---|---|
| Baseline | 85% | 25 percentage points |
| After mitigation | 82% | 8 percentage points |
You've reduced the approval gap from 25 points to 8 points, but accuracy dropped 3 points. Is that worth it? The answer depends on:
- How much harm does the disparity cause?
- What are the consequences of reduced accuracy?
- What do stakeholders value?
There's no formula to answer these questions. They're ethical choices that require human judgment.
Advanced: Societal Trade-offs and Formal Mitigation
Here's where it gets genuinely uncomfortable: the FATP pillars sometimes conflict with each other, and the fairness definitions from the previous section are mathematically incompatible. No clever engineering eliminates the trade-off. You have to choose.
The Impossibility Theorem
In 2016, Kleinberg, Mullainathan, and Raghavan proved that three common fairness definitions (calibration, balance for the positive class, and balance for the negative class) cannot all be satisfied simultaneously unless the base rates are equal across groups or the classifier is perfect.
This is known as the impossibility theorem. In any real-world scenario with different base rates, you will violate at least one reasonable-sounding fairness criterion. That's not a model flaw. It's a mathematical limit.
Example: In criminal recidivism prediction, if one group actually has a higher base rate of re-offending (due to systemic factors like poverty, lack of opportunity, etc.), then:
- Equalizing prediction rates (demographic parity) will over-predict risk for the lower-base-rate group
- Equalizing false positive rates will under-serve the higher-base-rate group
- Equalizing calibration will result in different prediction rates
You literally cannot have all three. Which do you choose?
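You can watch the conflict appear with a few lines of arithmetic. This sketch uses a closely related result rather than the theorem quoted above: an identity from Chouldechova's work on recidivism scores that ties the false positive rate to the base rate, precision (PPV), and false negative rate. The numbers are invented.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate forced by a base rate, precision (PPV), and false negative rate.

    Follows from the definitions: FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR).
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Hold precision and false negative rate equal for both groups...
ppv, fnr = 0.6, 0.3

# ...and change only the (hypothetical) base rates.
fpr_a = implied_fpr(base_rate=0.3, ppv=ppv, fnr=fnr)
fpr_b = implied_fpr(base_rate=0.5, ppv=ppv, fnr=fnr)

print(f"Group A FPR: {fpr_a:.3f}")  # Group A FPR: 0.200
print(f"Group B FPR: {fpr_b:.3f}")  # Group B FPR: 0.467
```

With equal precision and equal false negative rates, the false positive rates are forced apart by the base rates alone. No amount of tuning changes that; you can only decide which of the three to let go.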
A Framework for Trade-off Decisions
When facing ethical trade-offs, I use this framework:
1. Identify the stakeholders
- Who benefits from the AI system?
- Who bears the risks?
- Who has no voice in the decision?
2. Map the harms
- What happens if the system is unfair by metric A?
- What happens if it's unfair by metric B?
- Which harms are reversible? Which are permanent?
3. Consider power asymmetries
- Does the system affect vulnerable populations disproportionately?
- Do affected individuals have recourse?
- Who profits from the system vs. who bears the risk?
4. Make and document the choice
- Which fairness criterion are you prioritizing and why?
- What are you explicitly trading off?
- How will you monitor for unintended consequences?
The 2026 Regulatory Reality
Ethics used to be mostly voluntary. As of mid-2026, the FATP pillars increasingly map to legal obligations, so documenting your choices is also how you stay compliant.
- EU AI Act: The bans on prohibited practices (since February 2025) and the obligations for general-purpose AI model providers (since 2 August 2025) are live and unchanged. But the headline 2 August 2026 deadline for high-risk systems has moved: under the "Digital Omnibus" that the Council and Parliament provisionally agreed on 7 May 2026 (formal adoption expected before August), Annex III high-risk obligations are deferred to 2 December 2027 and AI embedded in regulated products to 2 August 2028. The deferral is a pragmatic admission that the supporting standards and infrastructure weren't ready, not a softening of the substance. Penalties for prohibited practices still reach up to EUR 35 million or 7% of global turnover, and the voluntary General-Purpose AI Code of Practice (published July 2025, signed by Anthropic, Google, Microsoft, OpenAI, and others) still operationalizes the transparency and safety expectations.
- US state laws are in flux. Colorado's pioneering AI Act (SB 24-205) targeted "algorithmic discrimination" in high-risk systems, but it never took effect: its start date was pushed to 30 June 2026, then the whole framework was repealed and replaced by SB 26-189 (signed 14 May 2026), a narrower transparency-and-consumer-rights law that takes effect 1 January 2027. The lesson for builders: regulation is moving fast and unevenly, so design to the principles, not to a single statute.
- Voluntary standards as a baseline. ISO/IEC 42001 (the first certifiable AI management-system standard) and the NIST Generative AI Profile (AI 600-1) have become the de facto frameworks organizations adopt to demonstrate due care, increasingly as a procurement requirement for vendors.
Compliance follows ethics, not the other way around
If you've already worked through the FATP checklist (fairness testing, accountability ownership, transparency documentation, and privacy controls), you're most of the way to satisfying the EU AI Act, ISO/IEC 42001, and the NIST profile. Teams that bolt compliance on at the end pay for it the expensive way: retraining, scrapped work, legal exposure, and lost trust, the kind of bill the case studies above all ran up.
Writing a Formal Mitigation Report
For high-stakes AI systems, document your ethical analysis formally. Here's a template:
```markdown
# Ethical Mitigation Report: [System Name]

## 1. System Description
- Purpose and intended use
- Affected populations
- Decision types (recommendations, predictions, automatic actions)

## 2. Fairness Analysis
### Metrics Assessed
| Metric | Definition | Result |
|--------|------------|--------|
| Demographic Parity | Equal selection rates | 0.15 gap |
| Equalized Odds | Equal TPR/FPR | 0.08 gap |

### Groups Analyzed
- [Group A vs Group B]
- [Other relevant comparisons]

## 3. Trade-off Analysis
- [Fairness metric A] vs [Fairness metric B]: We prioritized [A] because [reasoning]
- Accuracy impact: [X]% reduction in overall accuracy

## 4. Mitigation Applied
- Technique: [ThresholdOptimizer / Reductions / etc.]
- Parameters: [settings]
- Resulting metrics: [post-mitigation numbers]

## 5. Residual Risks
- [Remaining disparities]
- [Known limitations]

## 6. Monitoring Plan
- Metrics tracked post-deployment
- Alerting thresholds
- Review frequency

## 7. Accountability
- System owner: [name]
- Ethics review: [approver]
- Redress contact: [process for affected individuals]
```
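The monitoring plan in section 6 only helps if something actually checks the numbers. Here's a minimal sketch of such a check. The weekly rates, the 0.10 alert threshold, and the `check_gap` helper are all assumptions for illustration; your own threshold should come out of the trade-off analysis in section 3.

```python
def check_gap(rates_by_group: dict, alert_threshold: float) -> dict:
    """Compare the largest between-group gap against an alert threshold."""
    gap = max(rates_by_group.values()) - min(rates_by_group.values())
    return {"gap": gap, "alert": gap > alert_threshold}

# Hypothetical selection rates observed in production this week.
weekly_rates = {"A": 0.42, "B": 0.31}

result = check_gap(weekly_rates, alert_threshold=0.10)
print(f"gap={result['gap']:.2f} alert={result['alert']}")  # gap=0.11 alert=True
```

Wire the alert to the system owner named in section 7, so a drifting gap lands on a person rather than in a dashboard nobody opens.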
Make it a living document
This isn't a one-time report. Update it as the system evolves, as you gather production data, and as your understanding deepens. Ethical analysis is iterative, not waterfall.
The FATP Checklist
Before deploying any AI system, work through this checklist:
Fairness
- [ ] Identified protected groups relevant to the use case
- [ ] Measured outcomes across demographic groups
- [ ] Chosen and documented primary fairness metric
- [ ] Applied mitigation if disparities exceeded threshold
- [ ] Tested for bias using data representative of production
Accountability
- [ ] Assigned clear ownership for system outcomes
- [ ] Defined human oversight requirements for high-stakes decisions
- [ ] Established redress mechanism for affected individuals
- [ ] Created audit trail for decisions
- [ ] Documented escalation path for ethical concerns
Transparency
- [ ] Documented model limitations and known failure modes
- [ ] Provided explanation capability appropriate to stakeholders
- [ ] Disclosed AI involvement to affected parties
- [ ] Made system auditable by authorized parties
- [ ] Published model card or equivalent documentation
Privacy
- [ ] Minimized data collection to what's necessary
- [ ] Implemented appropriate data retention policies
- [ ] Provided individual access and deletion rights
- [ ] Assessed re-identification risks
- [ ] Protected against training data extraction
The Honest Summary
AI ethics comes down to building systems you can defend when things go wrong. Systems that don't compound existing inequities at scale. Being a good person helps, but good intentions don't survive contact with bad design choices.
What works:
- Treating ethics as a design constraint, not a post-launch audit
- Using concrete tools like Fairlearn to measure and mitigate bias
- Documenting trade-offs explicitly rather than hiding them
- Building accountability structures before you need them
What's hard:
- Navigating mathematically incompatible fairness definitions
- Convincing stakeholders that ethical constraints are worth the accuracy trade-off
- Detecting bias from proxy variables you didn't know were proxies
- Maintaining ethical vigilance as systems evolve
The teams that get AI ethics right share one trait: they ask "who could this hurt?" before they ask "how fast can we ship?" That question is the design constraint. Catch the answer early and you're fixing a decision. Catch it in production and you're fixing a lawsuit.
Next up in Module 8: AI Safety and Security, where we tackle the technical safeguards and vulnerabilities that make ethical AI possible or impossible.
Quick Reference
| Pillar | Key Question | Primary Tool |
|---|---|---|
| Fairness | Are outcomes equitable across groups? | Fairlearn, AIF360 |
| Accountability | Who's responsible when things go wrong? | RACI matrix, audit trails |
| Transparency | Can stakeholders understand decisions? | SHAP, model cards |
| Privacy | Is data collection minimized and protected? | Privacy by design |
FAQs
Q: Our company doesn't have an AI ethics team. How do we get started?
You don't need a dedicated team to start practicing AI ethics. Begin with the FATP checklist for your next AI project. Assign an "ethics owner" (could be the tech lead or PM) who ensures the checklist gets attention. Run fairness metrics on your existing systems to establish baselines. The goal isn't perfection from day one; it's building the muscle of asking ethical questions consistently. As you mature, you might invest in dedicated roles, but many organizations practice effective AI ethics with distributed responsibility.
Q: Doesn't focusing on fairness reduce model accuracy? How do I justify that to stakeholders?
Yes, fairness interventions often reduce overall accuracy. Frame it this way: What's the cost of the current unfairness? If your hiring algorithm systematically excludes qualified candidates from certain groups, you're leaving talent on the table. If your loan algorithm denies creditworthy applicants unfairly, you're losing good business. Calculate the cost of false negatives across groups, not just aggregate accuracy. Often, "reducing accuracy" means "reducing accuracy for the majority group while improving it for minority groups." The aggregate number goes down, but the system becomes more useful for more people.
Q: How do I know which fairness metric to prioritize?
Start with the harms you're trying to prevent. If the main concern is equal access (loan approvals, hiring), demographic parity matters more. If the concern is equal treatment of actual positives (medical diagnosis), equalized odds matters more. If you need the predictions to mean the same thing across groups (risk scores), calibration matters. There's no universal answer; it depends on the domain, the stakes, and the values of your organization. The key is to make the choice explicitly and document your reasoning.
Read the Full Curriculum
This piece is one post in my AI Fluency Curriculum, where I document what I'm learning about building and shipping AI responsibly. The full version on BlockSimplified includes an interactive quiz, linked Learning Blocks for the key terms, and a curated resource list. If ethics-as-design-constraint resonated, read the full article and the rest of the series.
