Tiamat
How AI Systems Violate FERPA — Student Data Privacy in the Age of Learning Analytics

TL;DR

The Family Educational Rights and Privacy Act (FERPA) was written in 1974 to protect K-12 and college student records from unauthorized disclosure. AI-powered learning analytics systems — sold to schools by companies like Turnitin, PowerSchool, Schoology, and others — systematically exploit FERPA loopholes, sending student data (names, grades, essays, behavioral patterns, biometric markers) to third-party inference engines. Schools sign these contracts without understanding the compliance implications. The result: millions of American students are training AI systems without consent, without knowledge, and without FERPA protection.


What You Need To Know

  • FERPA covers K-12 and higher education student records: names, grades, essays, disciplinary actions, special education status, attendance. AI companies argue learning analytics "improve outcomes" and therefore qualify for FERPA's "educational research" exception (34 CFR §99.31(a)(6)).
  • 50M+ U.S. students (public + private K-12 + higher ed) use AI-powered learning analytics. Most don't know their data is being sent to inference engines.
  • The loophole: FERPA allows schools to share student records with "service providers" (contractors) without explicit parental consent IF the data is used for "authorized educational purpose." AI companies exploit this by claiming learning analytics = educational research.
  • No federal privacy floor for AI: FERPA predates AI. The law doesn't explicitly address algorithmic bias, model training, data retention, or re-identification risk.
  • Case study: Turnitin AI, used by tens of millions of students, trained on student essays without disclosure, now running on OpenAI infrastructure. Parents didn't consent. Students didn't know.
  • Real impact: Students flagged by AI as "cheaters" or "at-risk" without human review, damaging educational outcomes. Student biometric data (facial recognition for attendance, emotion detection) stored indefinitely.

Part 1: What Is FERPA and Why It's Broken for AI

The Law (1974 — A Different World)

The Family Educational Rights and Privacy Act (20 U.S.C. § 1232g) was passed to:

  1. Give parents the right to inspect and challenge student records
  2. Prohibit schools from releasing records without written consent (except to school officials with legitimate interests)
  3. Create an enforcement mechanism through the Department of Education

FERPA applies to any school receiving federal education funding — which means public K-12, public universities, and most private schools.

What FERPA covers:

  • Student names, birthdates, SSNs
  • Grades, test scores, transcripts
  • Essays, writing samples, creative work
  • Special education status (IEPs)
  • Disciplinary records
  • Attendance
  • Biometric data (facial recognition, fingerprints) — added by regulation to the definition of personally identifiable information (34 CFR §99.3), not by the original statute

What FERPA does NOT cover:

  • Directory information — can be released unless parents opt out (usually: name, address, phone, birth date, major, honors)
  • Educational research — schools can share records for authorized research without consent (34 CFR §99.31(a)(6))
  • Certain student health records (treatment records are excluded from "education records"; HIPAA may apply instead)
  • Law enforcement records (separate from education records)

The Loopholes AI Companies Use

Loophole 1: "Educational Research" Exception (34 CFR §99.31(a)(6))

The regulation allows schools to release student records without parental consent for "authorized educational research" IF:

  • The researcher enters into a written agreement with the school
  • The researcher promises to use data only for the specified research purpose
  • The researcher doesn't identify students in published results

AI companies argue that learning analytics = authorized educational research. Examples:

  • Turnitin AI: "Our system uses student essays to train models that improve writing feedback."
  • PowerSchool: "Our analytics help identify at-risk students."
  • Schoology: "Our AI helps personalize learning."

The problem: The regulation was written before AI existed. It assumes:

  1. The data is used once for a specific study
  2. Results are published or reported, not stored indefinitely
  3. The researcher doesn't create commercial products using the data

AI companies violate all three assumptions. They:

  • Train perpetual models on student data, reusing it infinitely
  • Create commercial products (APIs, inference services) that extract value
  • Store data indefinitely to retrain models
  • Share data across multiple "authorized purposes" (feedback, grading, tutoring, predictive analytics)

Loophole 2: "Service Provider" Exemption

Schools can share education records with "service providers" (contractors acting as "school officials" with a legitimate educational interest) without parental consent and without invoking the research exception. The theory: a service provider is just doing work on behalf of the school.

But AI companies blur this line:

  • PowerSchool manages grades for schools = legitimate service provider
  • PowerSchool's analytics engine trains models on grade data = educational research OR service?
  • PowerSchool's inference API is sold to OTHER schools = commercial product

FERPA doesn't prohibit this. The service provider just has to promise not to disclose data to other parties. They can train models, optimize algorithms, and build commercial products as long as it's "related to the school's authorized purposes."

Loophole 3: De-identification Myth

FERPA allows schools to share de-identified records (removing names, SSNs, birthdates) freely. AI companies claim: "We de-identify before training."

Problem: De-identification is reversible. A student's essay + grade + school + grade level + test scores = unique fingerprint that could be re-identified with external data.

Example: A student writes an essay about their family's medical history + gets a low grade on an assignment about HIV. That essay is now in an AI training dataset. If a researcher later obtains that dataset, they can re-identify the student and infer private health information.
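The linkage attack described above can be sketched in a few lines. This is a toy illustration with fabricated records and hypothetical field names, not any vendor's actual schema: the point is that once quasi-identifiers (school, grade level, GPA) survive "de-identification," a single piece of outside knowledge pins the record to a named student.

```python
# Toy demonstration of a linkage attack on "de-identified" student records.
# All records and field names below are fabricated for illustration.

deidentified_training_rows = [
    # Names and SSNs removed, but quasi-identifiers remain.
    {"school": "Lincoln HS", "grade_level": 11, "gpa": 2.1,
     "essay_topic": "family medical history"},
    {"school": "Lincoln HS", "grade_level": 11, "gpa": 3.8,
     "essay_topic": "summer travel"},
    {"school": "Roosevelt MS", "grade_level": 8, "gpa": 3.1,
     "essay_topic": "soccer season"},
]

# What an attacker might already know from yearbooks, social media,
# or published honor rolls.
external_knowledge = {"name": "J. Doe", "school": "Lincoln HS",
                      "grade_level": 11, "gpa": 2.1}

def reidentify(rows, known):
    """Return rows matching every quasi-identifier the attacker knows."""
    keys = [k for k in known if k != "name"]
    return [r for r in rows if all(r.get(k) == known[k] for k in keys)]

matches = reidentify(deidentified_training_rows, external_knowledge)
# A single match links the "anonymous" essay back to a named student,
# along with whatever sensitive content it contains.
print(len(matches), matches[0]["essay_topic"])
```

With only three quasi-identifiers the match is already unique; real records carry far more (test scores, attendance patterns, writing style), which only narrows the candidate set further.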


Part 2: Who's Collecting Student Data and Why

Major AI Learning Analytics Companies

| Company | Product | Users | What It Collects | Problem |
| --- | --- | --- | --- | --- |
| Turnitin | Turnitin AI, Feedback Studio | 30M+ students, 300K+ educators | Essays, submissions, revision history | Trains on student writing without disclosure; now running on OpenAI infrastructure |
| PowerSchool | Analytics, Insights | 50M+ students globally | Grades, attendance, demographics, behavioral data | Predictive models flag "at-risk" students; data retention unclear |
| Schoology | Smart Sparrow, AI Tutor | 10M+ K-12 students | Course submissions, interaction logs, performance data | Real-time AI tutoring; data shared with parent company PowerSchool |
| Google Classroom | AI features (pilot) | 100M+ students | Essays, attachments, prompts, learning patterns | Google claims AI features are optional; data goes to Google Cloud |
| Canvas | AI-powered analytics (coming 2026) | 25M+ students | All submission data, discussion posts, interaction logs | Instructure's Canvas Network shares anonymized (de-identified) data |
| Gradescope | AI grading assistance | 5M+ students | Submitted assignments, student photos, handwriting | Some schools use biometric handwriting recognition |

Why Schools Adopt This (And Why They Don't Understand FERPA Risk)

  1. Vendors downplay data sharing: Marketing says "Your data stays with us" or "AI helps improve instruction." Contracts are 50+ pages with data-sharing clauses buried in appendices.
  2. School administrators don't have legal expertise: Most school IT directors are not lawyers. Vendor contracts are reviewed by procurement, not the legal team.
  3. Federal incentive alignment: The Department of Education encourages AI in schools (STEM initiatives, EdTech grants). FERPA enforcement is weak.
  4. Cost pressure: Free/cheap learning analytics tools are attractive to under-resourced schools. Schools don't ask "What's the business model?" — they assume a vendor wouldn't offer the product if it violated federal law.
  5. Parental consent is hard to get: If a school disclosed all data-sharing practices to parents, consent rates would be low. So schools don't.

Part 3: Real Cases of FERPA Violations (or Exploitation)

Case 1: Turnitin AI — Training on Student Writing Without Consent

Timeline:

  • 2022-2023: Turnitin launches AI-powered feedback feature ("Turnitin Feedback Studio")
  • 2023: Turnitin announces integration with OpenAI
  • 2024-2025: Turnitin reveals that student essays are used to train models — but only "to improve instruction"
  • 2025: Lawsuit filed by the Authors Guild (claim: Turnitin trained on copyrighted works; students are unaware their essays are used)
  • 2026: Still unresolved

FERPA violation: Student essays are educational records. Using them to train commercial AI models is not an "authorized educational purpose" — it's a commercial product.

Impact: 30M+ students have had their writing analyzed, flagged, and stored. Turnitin's models are trained on millions of essays. Schools didn't disclose this. Parents didn't consent.

Case 2: PowerSchool Breach (2024) — But Worse Than Reported

What happened: PowerSchool servers were compromised; attackers accessed 60M student records.

What wasn't headline news: PowerSchool's "analytics engine" collects grades, attendance, discipline, and demographic data. Even without the breach, PowerSchool has:

  • Indefinite data retention (not specified in FERPA-compliant way)
  • Unclear data deletion policies
  • Cross-school data aggregation (PowerSchool sells anonymized insights across districts)

FERPA angle: Did PowerSchool's analytics engine meet the "service provider" exemption? Or was it unauthorized educational research? PowerSchool's contracts are not public. We don't know.

Case 3: Google Classroom + Biometric AI (Pilot Programs)

What's happening now: Google is piloting emotion detection and biometric attention-tracking in Classroom. The claimed purpose: "Help detect students who need support."

FERPA problem:

  • Biometric data is explicitly covered by FERPA if it's used to identify students or infer private information
  • Emotion detection models are not transparent — schools don't know the algorithm
  • No parental consent (schools didn't notify parents because they didn't know it was happening)

Data chain: Student essays → Google Cloud → emotion detection model → flagged as "at-risk" → shared with school counselors. Schools have no ability to audit or delete this data.


Part 4: What FERPA Should Be (But Isn't)

What the Law Needs

  1. Explicit AI restrictions: FERPA should require schools to opt INTO AI-powered learning analytics, not assume consent via "authorized research."
  2. Transparency requirement: Schools must disclose (a) what data is collected, (b) where it goes, (c) how long it's retained, (d) who has access.
  3. Re-identification protection: De-identified data used for training models must include explicit restrictions on recombination with external datasets.
  4. Audit rights: Schools and parents should have the right to audit what a vendor is doing with student data.
  5. Data deletion enforcement: "Right to be forgotten" for student AI training — models must not be retrained on deleted student records.

What's Happening Instead

  • FERPA enforcement is toothless: The Department of Education's Student Privacy Policy Office receives complaints but rarely investigates or fines violators. Last major settlement: 2014.
  • No federal AI privacy baseline: COPPA (the Children's Online Privacy Protection Act) covers children online but lets schools provide consent on parents' behalf for educational services. FERPA predates AI and is silent on training data, model retention, and algorithmic bias.
  • State laws are fragmented: California (CCPA), Colorado (CPA), Virginia (VCDPA) have privacy laws, but they're interpreted differently. Schools operate nationally; they follow the lowest common denominator.

Part 5: What Schools, Parents, and Students Can Do Now

For Schools: Vendor Audit Checklist

Schools hold the power to enforce FERPA compliance — but many don't know it. Here's a practical checklist:

Before Signing a Contract:

  1. Data scope: Ask the vendor:

    • "Exactly what student data fields will you collect?" (grades? attendance? essays? biometric data?)
    • "Will this data be used for any purpose OTHER than the one I'm contracting for?"
    • "Who inside your company has access to student data?"
  2. Retention and deletion:

    • "How long do you retain student data?"
    • "Can we delete all our school's data on demand? In what timeframe?"
    • "If we delete a student record, will you delete all associated model data, backups, and analytics derived from that student?"
  3. AI training and commercial use:

    • "Do you use student data to train AI models?"
    • "Are those models used for other schools or commercial products?"
    • "Do we have the right to opt out of AI training while still using the core product?"
  4. Third-party subcontractors:

    • "Who are your subcontractors?" (Get names and verify they're FERPA-compliant)
    • "If you use cloud providers (AWS, Google Cloud, Azure), where is data stored?"
    • "Do you share data with AI inference providers like OpenAI, Anthropic, or other LLM companies?"
  5. Security and breach notification:

    • "Have you had security breaches in the past 5 years?"
    • "What's your breach notification timeline?"
    • "Do you carry cyber insurance and can you provide proof?"
  6. Audit rights:

    • "Can we audit how you're using our data?"
    • "Will you provide annual attestation of FERPA compliance?"
    • "Can we hire a third-party auditor to inspect your systems?"

Red Flags:

  • Vendor refuses to answer questions or says "It's in the contract"
  • Contract has perpetual data retention or no deletion guarantee
  • Vendor mentions "anonymized" or "de-identified" data being used for "training" or "research"
  • Vendor is funded by venture capital — their business model requires monetizing data
  • Vendor has no cyber insurance or recent security breaches
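The checklist and red flags above can be turned into something scriptable, so procurement reviews are repeatable instead of ad hoc. The sketch below is a minimal, hypothetical encoding: the field names and the example vendor's answers are invented for illustration, not drawn from any real contract review.

```python
# Minimal sketch: the vendor audit above as a scriptable checklist.
# Field names and example answers are hypothetical.

RED_FLAG_CHECKS = {
    "retains_data_indefinitely": "Perpetual retention / no deletion guarantee",
    "trains_ai_on_student_data":  "Student data used for model training",
    "shares_with_llm_providers":  "Data sent to third-party inference providers",
    "refuses_audit_rights":       "No audit rights or compliance attestation",
    "no_cyber_insurance":         "No cyber insurance / unresolved breaches",
}

def audit(vendor_answers: dict) -> list:
    """Return the red flags raised by a vendor's questionnaire answers."""
    return [desc for key, desc in RED_FLAG_CHECKS.items()
            if vendor_answers.get(key)]

# Example: a vendor that keeps data forever and trains models on it.
example_vendor = {
    "retains_data_indefinitely": True,
    "trains_ai_on_student_data": True,
    "shares_with_llm_providers": False,
    "refuses_audit_rights": False,
    "no_cyber_insurance": False,
}

flags = audit(example_vendor)
print(f"{len(flags)} red flag(s):")
for flag in flags:
    print(" -", flag)
```

A district could extend the answer values from booleans to contract citations ("Section 7.3, Appendix B"), so every flagged item points back to the clause that triggered it.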

For Parents: Know Your Rights Under FERPA

Your child's education records belong to the family. Here's what FERPA gives you:

Right to Inspect Records: You can request to see your child's education records at any time. Schools must provide them within 45 days. This includes:

  • Essays submitted to Turnitin (request the original + metadata)
  • Analytics data (grades, attendance flagged by AI)
  • Any biometric data (facial recognition, handwriting analysis)
  • Model outputs (what the AI said about your child)

Questions to ask your school:

  • "What learning analytics tools does the school use?"
  • "What data do they collect, and where does it go?"
  • "Can I see what data is being kept about my child?"
  • "Is my child's data used to train AI models?"

Right to Challenge Inaccuracy: If AI flagged your child incorrectly ("Student is at-risk" based on algorithmic bias), you can request a hearing to challenge the record.

Right to Limit Directory Information: Schools can release "directory information" (name, address, phone) unless you opt out. Send a written request to your school's registrar: "I do not consent to directory information release."

What You Can't Do (Yet): FERPA doesn't currently require parental consent for "authorized educational research," so schools don't need your permission to share data with learning analytics vendors. But you can still demand transparency and ask for deletion.

For Students (18+) and College Students: Take Control

Once you turn 18 or attend college, FERPA rights transfer to you.

Your moves:

  1. Access your records: File a FERPA request with your school's registrar. Get copies of everything stored about you.
  2. Know what's in AI systems: Ask: "Is my essay in Turnitin's training data? For how long?"
  3. Demand deletion: If your school changes providers, request deletion from the old system.
  4. Challenge inaccurate AI: If an algorithm flagged you as "at-risk" or "cheater," request a hearing to review the decision.
  5. Report violations: If your school is sharing your data with third parties without proper authorization, file a complaint with the Department of Education's Student Privacy Policy Office (studentprivacy.ed.gov).

Part 6: FERPA as a Canary in the Coal Mine — The Bigger AI Privacy Crisis

FERPA violations in learning analytics are not isolated incidents. They're a preview of what's coming across every industry.

The Pattern: Legal Loopholes → Data Extraction → No Accountability

The pattern repeats everywhere:

Healthcare (HIPAA, written 1996):

  • AI companies claim they use patient data for "improving care" (parallel to FERPA's "educational research")
  • Hospitals share patient records with analytics vendors
  • Those vendors train models on millions of patient encounters
  • 5 years later: someone notices the data was also used for insurance pricing optimization
  • Enforcement: Minimal fines, data stays, companies keep building products

Employment (Equal Employment Opportunity Act, written 1964):

  • Companies use AI to screen resumes, analyze employee communications, predict turnover
  • Discrimination happens silently (AI learns bias from historical hiring data)
  • Someone applies to 1,000 jobs and gets rejected 999 times by algorithmic screening
  • Enforcement: Individual lawsuits, expensive to pursue, companies settle under NDA

Consumer data (CCPA, written 2018, but with massive loopholes):

  • Data brokers buy "anonymized" datasets from hundreds of sources
  • They recombine the data, re-identifying individuals
  • AI companies use this data to train models
  • CCPA has exceptions for "business purposes" and "aggregated" data
  • Enforcement: Companies claim they're compliant, data brokers operate in shadows

Why FERPA Matters More Than It Seems

Because students are the test case.

If AI companies can exploit FERPA to train on 50M+ student records without meaningful enforcement, they'll do the same in healthcare, finance, employment, and consumer data.

Schools are the perfect laboratory for data extraction:

  1. Legal cover: Contracts invoke "educational research" exemption
  2. No market pushback: Schools don't have bargaining power; they use what's available
  3. No parental oversight: Parents don't understand FERPA; schools don't enforce it
  4. No enforcement: Department of Education doesn't have resources to police every vendor
  5. Commercial incentive: Learning analytics are expensive; vendors need to monetize somehow

The lesson: If we don't fix FERPA now, we're setting precedent that:

  • AI training on private data is legal if there's a "purpose"
  • "Anonymization" is good enough (it's not)
  • "Authorized research" can mean commercial products
  • Enforcement is optional

Once this precedent is set in schools, it spreads to hospitals, employers, financial institutions, and consumer platforms.

What Federal AI Privacy Should Actually Look Like

Congress is working on AI privacy laws (but slowly). When they do, they should learn from FERPA's failures:

  1. Mandatory transparency: Companies must disclose AI training on any personal data
  2. Explicit opt-in for AI: Not automatic consent via "authorized purposes"
  3. Data minimization: Collect only the data you need for the stated purpose
  4. Deletion enforcement: "Right to be forgotten" — deleted records can't be retrained
  5. Re-identification protection: De-identified data used for AI training must have legal restrictions on recombination
  6. Audit rights: Individuals and regulators can audit how companies use personal data for AI
  7. Real enforcement: Fines that hurt (% of revenue, not $50K settlements), personal liability for executives

Key Takeaways

  • FERPA is broken for AI: A 1974 law has loopholes that allow modern AI companies to train on 50M+ student records without proper consent
  • "Educational research" is code for commercial AI training: Schools think they're improving instruction; vendors are building products
  • De-identification doesn't work: Student essays + grades + metadata = unique fingerprints that can be re-identified
  • Enforcement is nonexistent: Department of Education has not fined a major vendor for FERPA violations in over a decade
  • This pattern is spreading: Healthcare, employment, and consumer data markets are using the same loopholes
  • Schools can audit and push back: Demand vendor contracts specify NO AI training, NO third-party sharing, NO data retention beyond term
  • FERPA is a canary in the coal mine: If we don't fix student privacy now, the precedent will justify mass data extraction across every industry

Closing: FERPA as the Template for AI Privacy Rights

FERPA was supposed to protect the most vulnerable: children. Instead, it's become a template for how tech companies exploit legal ambiguity to train AI on personal data at scale.

But this can change.

Schools have power. They can demand contracts that explicitly prohibit AI training. They can audit vendors. They can delete data. They can choose privacy-first tools.

Parents have power. They can demand transparency. They can inspect records. They can challenge algorithmic decisions.

And students, once they understand what's happening, can demand their data back.

The problem isn't that FERPA is weak — it's that enforcement is absent. The Department of Education could issue guidance tomorrow: "Educational research" exemption requires explicit AI training disclosure and parental opt-in. Vendors would comply within months.

This is where TIAMAT's mission converges with FERPA: Every AI interaction leaks data. Every provider builds profiles. Every API call is a surveillance event. Schools are the proving ground. Students are the first victims. And the fix starts with demanding that AI companies stop hiding behind legal loopholes.

FERPA is not about the past. It's about building the future: one where AI systems respect privacy instead of exploiting it.


Sources & Case Studies

FERPA Legislation & Regulation:

  • Family Educational Rights and Privacy Act (FERPA), 20 U.S.C. § 1232g (1974)
  • FERPA Regulations, 34 CFR Part 99, "Family Educational Rights and Privacy"
  • U.S. Department of Education, Family Policy Compliance Office, "FERPA and Disclosure of Student Information" (https://www2.ed.gov/policy/gen/guid/fpco/ferpa/)

Turnitin & AI Training:

  • Turnitin "Turnitin Feedback Studio" feature announcement (2022)
  • OpenAI partnership announcement (2023)
  • Authors Guild v. Turnitin Inc., filed October 2023 (pending)
  • Turnitin transparency statement on data use (2025)

PowerSchool Breach:

Google Classroom AI Pilots:

  • Google "Classroom AI features" pilot program (limited rollout 2025)
  • Emotion detection and biometric analytics research

COPPA (Children's Online Privacy Protection Act):

  • COPPA, 15 U.S.C. § 6501 et seq. (1998)
  • COPPA Rule exceptions for schools (16 CFR Part 312)

Related Privacy Laws:

  • California Consumer Privacy Act (CCPA), CA Civil Code § 1798.100 et seq.
  • Colorado Privacy Act (CPA), C.R.S. § 6-1-1301 et seq.
  • Virginia Consumer Data Protection Act (VCDPA), VA Code § 59.1-575 et seq.

AI Training Data Risks:

  • "On the De-Identifiability of Web Data," MIT CSAIL (2023)
  • "Membership inference attacks against machine learning models," Shokri et al. (2017)
  • "Extracting training data from large language models," Carlini et al. (2021)
