The Moment the Ethics Question Becomes Real
It happens somewhere in the middle of a sprint. You're building a feature — a loan approval model, a hiring screening tool, a customer churn predictor — and a question surfaces that isn't in the ticket.
"What happens if this is wrong about someone?"
Most engineering processes don't have a great answer. There's no ticket for that. Ethics reviews, if they exist, happen at the policy level, disconnected from the code being written. Developers ship the feature, the system goes live, and the ethical consequences of its decisions become someone else's problem — until they become everyone's problem.
This post is about closing that gap. Not at the policy level — at the code level. What does ethical AI development actually look like in practice, as a developer shipping business software in 2026?
Why This Is a Developer Problem, Not Just a Policy Problem
The standard framing is: companies set AI ethics policies, legal reviews them, and developers implement whatever they're told. Ethics is above the engineer's pay grade.
This framing fails for two reasons.
First, the decisions that determine ethical outcomes are made in code. Which features you include in a model. How you handle missing data. Whether you build in override mechanisms. Whether you log decisions in ways that allow auditing. These are engineering decisions, made by developers, that have ethical consequences. The policy document doesn't implement them — you do.
Second, developers are often the first people to see the ethical problems. You're the one who notices that the training data has almost no examples from a particular demographic. You're the one who sees that the model's confidence scores don't actually track its accuracy. You're the one who realises there's no mechanism for users to dispute an automated decision. If you don't raise it, it might not get raised.
This isn't about developers becoming ethicists or making unilateral decisions about company policy. It's about understanding that the choices you make in implementation have ethical stakes, and building that awareness into how you work.
The Core Problems to Design Against
Bias and Fairness
Algorithmic bias is the most discussed AI ethics issue and the one with the most established tooling for detection and mitigation. The key concepts:
Disparate impact: The system produces different outcomes across demographic groups, even though the demographic variable isn't explicitly in the model. A credit model that uses zip code as a feature may have disparate racial impact without any explicit race data, because zip code correlates with race in many markets.
Measurement bias: The training data measures outcomes that are themselves biased. If a hiring model is trained on historical promotion decisions, and those decisions were influenced by gender bias, the model learns to replicate that bias.
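Feedback loops: Models deployed in production generate data that can be used to retrain them. If the model makes biased decisions, those decisions shape the data collected afterwards (a lender only observes repayment for the applicants it approved), and retraining on that data reinforces the bias in the next model.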
Detecting fairness issues requires measuring outcomes across demographic groups — which requires having demographic data, which raises privacy questions. The tension between fairness auditing and privacy is real and doesn't have a clean resolution.
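The zip code example above suggests one check worth running before training: how strongly does each candidate feature predict the protected attribute? The sketch below uses Cramér's V, a standard association measure for two categorical columns. The column names and toy data are hypothetical, and any cut-off you apply to the score is a judgment call rather than an established rule.

```python
import numpy as np
import pandas as pd
from scipy import stats


def cramers_v(feature: pd.Series, protected: pd.Series) -> float:
    """Association between two categorical columns: 0 = none, 1 = perfect."""
    table = pd.crosstab(feature, protected)
    min_dim = min(table.shape) - 1
    if min_dim == 0:
        return 0.0  # one of the columns has a single value, nothing to measure
    # correction=False gives the plain Pearson chi-square that the V formula assumes
    chi2 = stats.chi2_contingency(table, correction=False)[0]
    n = table.to_numpy().sum()
    return float(np.sqrt(chi2 / (n * min_dim)))


# Hypothetical toy data: zip code is a strong proxy for group, channel is not.
df = pd.DataFrame({
    "zip_code": ["10001"] * 50 + ["10002"] * 50,
    "group": ["a"] * 45 + ["b"] * 5 + ["a"] * 5 + ["b"] * 45,
    "channel": ["web", "branch"] * 50,
})

proxy_scores = {
    col: cramers_v(df[col], df["group"]) for col in ["zip_code", "channel"]
}
print(proxy_scores)
```

A high score doesn't mean the feature must be dropped, only that the model can reconstruct the protected attribute from it, so the disparate impact check matters even when that attribute is excluded from training.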
Practical starting points:
import pandas as pd
from scipy import stats


def calculate_disparate_impact(
    predictions: pd.Series,
    protected_attribute: pd.Series,
    positive_outcome_threshold: float = 0.5
) -> dict:
    """
    Calculate disparate impact ratio between demographic groups.

    Disparate impact ratio < 0.8 is the "4/5ths rule" commonly used
    in employment discrimination analysis (US context).
    A ratio of 1.0 means equal positive outcome rates across groups.

    Both Series must share the same index so rows line up.
    """
    binary_predictions = (predictions >= positive_outcome_threshold).astype(int)

    # Positive-outcome rate per group; rows with a missing attribute are skipped
    groups = protected_attribute.dropna().unique()
    rates = {}
    for group in groups:
        mask = protected_attribute == group
        rates[group] = float(binary_predictions[mask].mean())

    if len(rates) < 2:
        return {"error": "Need at least two groups for comparison"}

    # Compare each group to the group with the highest positive rate
    max_rate = max(rates.values())
    if max_rate == 0:
        # Nobody received a positive outcome, so the ratios are undefined
        return {
            "positive_rates_by_group": rates,
            "disparate_impact_ratios": {},
            "flag_for_review": False,
        }

    ratios = {group: rate / max_rate for group, rate in rates.items()}
    results = {
        "positive_rates_by_group": rates,
        "disparate_impact_ratios": ratios,
        "flag_for_review": any(ratio < 0.8 for ratio in ratios.values()),
    }

    # Statistical significance test (chi-square), two-group comparisons only
    if len(groups) == 2:
        contingency = pd.crosstab(protected_attribute, binary_predictions)
        chi2, p_value, _, _ = stats.chi2_contingency(contingency)
        results["chi2_p_value"] = float(p_value)
        results["statistically_significant"] = bool(p_value < 0.05)

    return results


def audit_model_fairness(
    model,
    X_test: pd.DataFrame,
    y_test: pd.Series,
    protected_cols: list[str]
) -> dict:
    """
    Run a basic fairness audit across specified protected attributes.
    Call this before deploying any model that affects individuals.
    """
    # Keep X_test's index on every Series so the group masks below line up,
    # even when the test split has a shuffled, non-default index.
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_test)[:, 1]
    else:
        scores = model.predict(X_test)
    predictions = pd.Series(scores, index=X_test.index)
    correct = pd.Series(
        model.predict(X_test) == y_test.to_numpy(),
        index=X_test.index
    )

    audit_results = {
        "overall_accuracy": float(correct.mean()),
        "fairness_by_attribute": {}
    }

    for col in protected_cols:
        if col not in X_test.columns:
            continue  # attribute not present in this dataset

        di_result = calculate_disparate_impact(predictions, X_test[col])

        # Also check accuracy per group
        group_accuracy = {}
        for group in X_test[col].dropna().unique():
            mask = X_test[col] == group
            group_accuracy[group] = float(correct[mask].mean())

        audit_results["fairness_by_attribute"][col] = {
            **di_result,
            "accuracy_by_group": group_accuracy
        }

    return audit_results
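
To make the arithmetic behind the 4/5ths rule concrete, here is a small worked example on made-up screening outcomes. The group labels and counts are hypothetical and chosen so the numbers are easy to follow by hand.

```python
import pandas as pd

# Hypothetical screening outcomes: 1 = advanced to interview, 0 = rejected.
# Group "a": 50 of 100 selected. Group "b": 35 of 100 selected.
outcomes = pd.DataFrame({
    "group": ["a"] * 100 + ["b"] * 100,
    "selected": [1] * 50 + [0] * 50 + [1] * 35 + [0] * 65,
})

# Selection rate per group, then each rate relative to the highest one
selection_rates = outcomes.groupby("group")["selected"].mean()
impact_ratios = selection_rates / selection_rates.max()

# Group "b": 0.35 / 0.50 = 0.70, which is below the 0.8 threshold
flagged_groups = impact_ratios[impact_ratios < 0.8].index.tolist()
print(flagged_groups)
```

A flag here is a prompt to investigate, not a verdict. With small groups the ratio swings widely on a handful of decisions, which is why the function above pairs it with a significance test.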
Transparency and Explainability
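When an AI system makes a decision that affects someone, such as a loan denial, a job application rejection, or a fraud flag, that person generally has a right to understand why. In many jurisdictions this is now backed by law: GDPR Article 22 in Europe restricts solely automated decisions that have legal or similarly significant effects, the EU AI Act adds transparency obligations for high-risk systems, and various US state laws impose their own requirements.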
The engineering implication: you need to build explainability into the system from the start, not as an afterthought.
For traditional ML models, SHAP values provide per-prediction feature importance explanations. For LLM-based systems, the explanation requirement is more complex — you need to be able to articulate what inputs led to what output, and you need to log enough context to reconstruct that explanation later.
import numpy as np
import pandas as pd
import shap
from dataclasses import dataclass


@dataclass
class DecisionExplanation:
    prediction: float
    confidence: str          # "high", "medium", "low"
    top_factors: list[dict]  # [{"factor": "income", "impact": "positive", "weight": 0.3, "human_label": "..."}]
    audit_log_id: str        # Reference to stored log for compliance


def explain_prediction(model, X_instance: pd.DataFrame, feature_names: list[str], audit_logger) -> DecisionExplanation:
    """
    Generate a human-readable explanation for a single model prediction.
    Logs the explanation for compliance purposes.

    X_instance is a single-row DataFrame whose columns match feature_names.
    """
    # Get prediction
    pred_proba = float(model.predict_proba(X_instance)[0][1])

    # Generate SHAP explanation (TreeExplainer only supports tree-based models)
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_instance)
    if isinstance(shap_values, list):
        shap_values = shap_values[1]  # Positive class for binary classification
    elif np.ndim(shap_values) == 3:
        # Some shap versions return (samples, features, classes) instead of a list
        shap_values = shap_values[:, :, 1]

    # Build human-readable factors, largest absolute contribution first
    feature_impacts = list(zip(feature_names, shap_values[0]))
    feature_impacts.sort(key=lambda x: abs(x[1]), reverse=True)
    top_factors = [
        {
            "factor": name,
            "impact": "positive" if value > 0 else "negative",
            "weight": abs(float(value)),
            "human_label": _get_human_label(name, X_instance[name].values[0])
        }
        for name, value in feature_impacts[:5]
        if abs(value) > 0.01  # Only include factors with meaningful impact
    ]

    # Confidence based on prediction distance from 0.5
    # (only meaningful if the model's probabilities are reasonably calibrated)
    distance = abs(pred_proba - 0.5)
    confidence = "high" if distance > 0.3 else "medium" if distance > 0.15 else "low"

    # Log for compliance
    log_id = audit_logger.log_decision(
        prediction=pred_proba,
        input_features=X_instance.to_dict(),
        shap_values=dict(zip(feature_names, shap_values[0].tolist())),
        top_factors=top_factors
    )

    return DecisionExplanation(
        prediction=pred_proba,
        confidence=confidence,
        top_factors=top_factors,
        audit_log_id=log_id
    )


def _get_human_label(feature_name: str, value) -> str:
    """Convert technical feature names to user-friendly descriptions."""
    # Formatters are callables so only the matching one runs. Eager f-strings
    # would format `value` for every entry and fail on non-numeric features.
    labels = {
        "debt_to_income": lambda v: f"Debt-to-income ratio: {v:.1%}",
        "credit_utilisation": lambda v: f"Credit utilisation: {v:.1%}",
        "months_employed": lambda v: f"Time in current employment: {int(v)} months",
    }
    formatter = labels.get(feature_name)
    return formatter(value) if formatter else f"{feature_name}: {value}"
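
For the LLM case there is no SHAP equivalent to lean on, so "enough context to reconstruct the explanation" mostly comes down to what you record at call time. The sketch below shows one possible shape for such a record. Every field and function name here is an assumption made for illustration, not part of any provider's API, and where the serialized line is stored is left open.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class LLMDecisionRecord:
    """One automated decision made with an LLM (field names are illustrative)."""
    model_id: str                    # model name and version used for the call
    prompt_template_version: str     # which template produced the prompt
    rendered_prompt: str             # the exact text sent to the model
    context_document_ids: list[str]  # IDs of any retrieved documents included
    raw_output: str                  # the model's unmodified response
    parsed_decision: str             # what the application did with it
    record_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def content_hash(self) -> str:
        """SHA-256 over the decision-relevant fields, to detect later edits."""
        payload = asdict(self)
        del payload["record_id"], payload["created_at"]
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def serialize_record(record: LLMDecisionRecord) -> str:
    """Render the record as one JSON line, ready for an append-only log."""
    return json.dumps({**asdict(record), "content_hash": record.content_hash()})


# Hypothetical example values
record = LLMDecisionRecord(
    model_id="example-model-v1",
    prompt_template_version="refund-triage-3",
    rendered_prompt="Classify this refund request: ...",
    context_document_ids=["policy-12", "order-98431"],
    raw_output="ESCALATE: amount exceeds auto-approval limit",
    parsed_decision="escalate_to_human",
)
log_line = serialize_record(record)
```

Note that the rendered prompt may itself contain personal data, so the log needs the same access controls and retention rules as the inputs it records.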
The Right to Contest
Explainability without recourse is incomplete. If an AI system makes a decision that affects someone, they should have a meaningful path to contest it.
In practice, this means:
- A human review process exists for contested decisions
- That process is staffed and has defined SLAs
- The human reviewer has access to the explanation and the input data
- Reversals are possible and tracked
- Reversal patterns feed back into model improvement
This is primarily a process design question, not a technical one — but the technical system needs to support it. Decision logs need to be retrievable. Override mechanisms need to exist. Correction workflows need to be built.
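As a rough illustration of what "support it" can mean in code, the sketch below models a contested decision that points back at an audit log entry, tracks an SLA, records who resolved it, and exposes a reversal rate for feeding back into model review. The class, its fields, and the five-day SLA are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class ContestStatus(Enum):
    OPEN = "open"
    UPHELD = "upheld"      # reviewer agreed with the automated decision
    REVERSED = "reversed"  # reviewer overturned it


@dataclass
class Contest:
    audit_log_id: str                  # links back to the logged decision
    reason: str                        # what the affected person disputes
    opened_at: datetime
    sla: timedelta = timedelta(days=5)  # hypothetical review deadline
    status: ContestStatus = ContestStatus.OPEN
    reviewer_id: Optional[str] = None
    reviewer_note: Optional[str] = None
    resolved_at: Optional[datetime] = None

    def is_overdue(self, now: datetime) -> bool:
        return self.status is ContestStatus.OPEN and now > self.opened_at + self.sla

    def resolve(self, reviewer_id: str, overturn: bool, note: str, now: datetime) -> None:
        if self.status is not ContestStatus.OPEN:
            raise ValueError("contest already resolved")
        self.status = ContestStatus.REVERSED if overturn else ContestStatus.UPHELD
        self.reviewer_id = reviewer_id
        self.reviewer_note = note
        self.resolved_at = now


def reversal_rate(contests: list[Contest]) -> float:
    """Share of resolved contests where the human overturned the system."""
    resolved = [c for c in contests if c.status is not ContestStatus.OPEN]
    if not resolved:
        return 0.0
    reversed_count = sum(c.status is ContestStatus.REVERSED for c in resolved)
    return reversed_count / len(resolved)


now = datetime(2026, 1, 10, tzinfo=timezone.utc)
late = Contest("log-001", "income was misreported", opened_at=now - timedelta(days=7))
recent = Contest("log-002", "wrong account matched", opened_at=now - timedelta(days=1))
overdue_before_review = [c.audit_log_id for c in (late, recent) if c.is_overdue(now)]

late.resolve("reviewer-9", overturn=True, note="payslip confirms income", now=now)
recent.resolve("reviewer-4", overturn=False, note="match verified", now=now)
rate = reversal_rate([late, recent])
```

A reversal rate that climbs for one group or one feature range is exactly the kind of signal the last bullet above has in mind.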
Practical Checklist: Before You Ship an AI System That Affects People
This isn't a compliance checklist — it's a developer's conscience checklist. The questions to ask before signing off on a system that makes automated decisions about individuals:
Data and training
- [ ] Do you understand where the training data came from and what biases it might contain?
- [ ] Have you tested for disparate impact across relevant demographic groups?
- [ ] Is there a process to detect and correct model drift after deployment?
Transparency
- [ ] Can you explain any individual decision in plain language?
- [ ] Are affected individuals informed that an automated system is being used?
- [ ] Is there a mechanism to provide explanations on request?
Oversight and recourse
- [ ] Is there a human review option for contested decisions?
- [ ] Are all automated decisions logged in a retrievable audit trail?
- [ ] Is there a defined process for correction when the system is wrong?
Scope and power
- [ ] Is the system being used only for the purpose it was built for?
- [ ] Are there mechanisms to prevent scope creep (using a hiring model for performance management, for example)?
- [ ] Who has override authority, and are those overrides tracked?
LLM-specific considerations
- [ ] Are there guardrails preventing the LLM from generating harmful outputs?
- [ ] Is sensitive personal data being sent to external AI APIs? (This is a data governance question as much as a technical one.)
- [ ] Are there rate limits and monitoring to detect misuse?
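
Of the items above, the drift question under "Data and training" is one of the few that reduces to a number you can monitor. One common option is the population stability index (PSI), which compares the distribution of a score or feature at training time with what production is seeing now. The sketch below is a minimal version; the 0.1 and 0.25 thresholds in the comments are widely quoted rules of thumb, not a standard, and the data is synthetic.

```python
import numpy as np


def population_stability_index(baseline, current, bins: int = 10) -> float:
    """PSI between a baseline sample and a current sample of one variable."""
    baseline = np.asarray(baseline, dtype=float)
    current = np.asarray(current, dtype=float)

    # Bin edges from baseline quantiles, so each bin holds roughly equal
    # baseline mass. Duplicate edges (heavy ties) are collapsed.
    edges = np.unique(np.quantile(baseline, np.linspace(0, 1, bins + 1)))
    interior = edges[1:-1]
    n_bins = len(interior) + 1

    # Using only interior edges means values outside the baseline range
    # still land in the first or last bin instead of being dropped.
    base_idx = np.searchsorted(interior, baseline, side="right")
    curr_idx = np.searchsorted(interior, current, side="right")
    base_frac = np.bincount(base_idx, minlength=n_bins) / len(baseline)
    curr_frac = np.bincount(curr_idx, minlength=n_bins) / len(current)

    # Avoid log(0) and division by zero for empty bins
    eps = 1e-6
    base_frac = np.clip(base_frac, eps, None)
    curr_frac = np.clip(curr_frac, eps, None)
    return float(np.sum((curr_frac - base_frac) * np.log(curr_frac / base_frac)))


# Synthetic scores: one sample from the same distribution, one shifted
rng = np.random.default_rng(0)
baseline_scores = rng.normal(0.0, 1.0, 5000)
same_scores = rng.normal(0.0, 1.0, 5000)
shifted_scores = rng.normal(1.0, 1.0, 5000)

psi_same = population_stability_index(baseline_scores, same_scores)
psi_shifted = population_stability_index(baseline_scores, shifted_scores)
# Rule of thumb: below 0.1 is usually read as stable, above 0.25 as a large shift
print(round(psi_same, 4), round(psi_shifted, 4))
```

PSI only tells you the inputs or scores have moved. Whether the model has become less fair as a result still requires rerunning the group-level audit on recent data.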
The Harder Questions
Some of the ethical questions in AI development don't have clean technical answers.
Should you build a surveillance system for an employer even if it's technically legal? Should you implement a credit scoring model for a market with weak consumer protections? Should you build an AI content moderation system that will make mistakes affecting people's ability to participate in public discourse?
These questions are beyond the scope of what I can answer here. What I can say: the fact that you're asked to build something doesn't automatically make it the right thing to build. Developers have professional judgment and professional responsibility. Companies that build AI systems without genuine ethical consideration face growing regulatory risk, reputational risk, and, most importantly, the risk of causing real harm.
The practical implication for most developers: raise the questions. You might not get to make the final call, but asking "what happens to the person this system gets wrong?" is part of the job. Document your concerns. Advocate for fairness audits, explainability tooling, and human oversight. And be honest with yourself about the systems you're willing to build.
That's not naive idealism — it's professional responsibility.
What ethical questions have come up in AI systems you've built? I'm particularly interested in cases where the right answer wasn't obvious or where you had to push back internally.