This was a 2024 project I completed during my internship with HNG TECH, focused on data cleansing and anomaly detection in electoral datasets. I recently revisited it to document the methodology and insights more clearly, as part of building my professional portfolio and showcasing real-world analytical impact.
Section 1: Why This Matters
Electoral credibility is foundational to democracy. In this project, I used Python to analyze polling unit-level data from Zamfara State, identifying statistical anomalies that could signal irregularities, data entry errors, or procedural lapses. This is a lightweight, reproducible approach to election auditing, with no GIS tools required.
Section 2: Methodology Overview
📥 Data Source
- CSV file with polling unit-level results
- Fields: Latitude, Longitude, Registered Voters, Accredited Voters, Votes per Party, PU Metadata
- Excel was used to manually clean and organize the geospatial coordinates (longitude and latitude)
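The coordinate cleanup in this project was done by hand in Excel. For anyone who prefers to keep that step in the notebook, a rough pandas equivalent could look like the sketch below. The column names (`PU_Code`, `LGA`, `Registered_Voters`, and so on), the helper `load_polling_units`, and the inline sample rows are all assumptions made for illustration; they are not the real dataset or its schema.

```python
import io
import pandas as pd

# Made-up sample standing in for the real CSV; column names are assumptions.
SAMPLE_CSV = """PU_Code,LGA,Latitude,Longitude,Registered_Voters,Accredited_Voters,APC,PDP
PU-001,LGA-A,12.10,6.20,500,300,150,120
PU-002,LGA-A,12.11,,450,280,140,130
PU-003,LGA-B,not_available,6.35,600,310,200,100
PU-004,LGA-B,12.30,6.40,520,290,160,110
"""

NUMERIC_COLS = ["Latitude", "Longitude", "Registered_Voters",
                "Accredited_Voters", "APC", "PDP"]


def load_polling_units(source):
    """Read polling unit results and keep only rows with usable coordinates."""
    df = pd.read_csv(source)
    # Turn unparseable entries (e.g. "not_available") into NaN instead of raising.
    df[NUMERIC_COLS] = df[NUMERIC_COLS].apply(pd.to_numeric, errors="coerce")
    # Rows without both coordinates cannot be placed on a map, so drop them.
    return df.dropna(subset=["Latitude", "Longitude"]).reset_index(drop=True)


clean = load_polling_units(io.StringIO(SAMPLE_CSV))
print(clean[["PU_Code", "Latitude", "Longitude"]])
```

With a real file, the same function can be called with a path instead of the `io.StringIO` wrapper. Dropping rows is only one option; depending on the audit, it may be better to keep them and mark the coordinates as missing.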
Outlier Detection
- Calculated outlier scores using z-scores and domain heuristics
- Ranked polling units by anomaly severity per party
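The two steps above can be sketched as follows. This is a minimal illustration of per-party z-scores and ranking, not the exact scoring used in the notebook: the data is made up, the column names are assumptions, and the helpers `zscore_outliers` and `rank_by_party` are hypothetical names introduced here. The domain heuristics are not included.

```python
import pandas as pd

# Made-up vote counts for illustration only.
votes = pd.DataFrame({
    "PU_Code": ["PU-001", "PU-002", "PU-003", "PU-004", "PU-005"],
    "APC": [100, 110, 90, 105, 400],
    "PDP": [80, 85, 300, 90, 75],
})
PARTIES = ["APC", "PDP"]


def zscore_outliers(df, parties):
    """Add a <party>_z column: (votes - mean) / population standard deviation."""
    scored = df.copy()
    for party in parties:
        mean = scored[party].mean()
        std = scored[party].std(ddof=0)
        # If every unit has the same count, no unit is an outlier.
        scored[f"{party}_z"] = 0.0 if std == 0 else (scored[party] - mean) / std
    return scored


def rank_by_party(scored, party, top_n=3):
    """Return the top_n polling units by absolute z-score for one party."""
    col = f"{party}_z"
    order = scored[col].abs().sort_values(ascending=False).index
    return scored.loc[order, ["PU_Code", party, col]].head(top_n)


scored = zscore_outliers(votes, PARTIES)
for party in PARTIES:
    print(rank_by_party(scored, party))
```

Ranking on the absolute z-score treats unusually low counts as seriously as unusually high ones. A z-score is sensitive to the very outliers it is trying to find, so on small or heavily skewed samples a median-based score may be a safer choice.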
Section 3: Key Findings
- PU 108 & PU 104 (Birnin Magaji): High outlier scores for APC and PDP
- PU 321 & PU 124 (Tsafe): Negative transcription counts flagged
- Geospatial Clustering: Anomalies concentrated in specific LGAs
- Cross-Party Deviations: Irregularities not isolated to one party
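Findings like negative counts and LGA-level concentration can be surfaced with a few rule-based checks. The sketch below is one possible way to do it, under stated assumptions: the data is made up, the column names are guesses at the schema, `flag_heuristics` is a hypothetical helper, and the "accredited voters exceed registered voters" rule is an example heuristic added here rather than one confirmed by the original analysis.

```python
import pandas as pd

# Made-up rows for illustration only.
results = pd.DataFrame({
    "PU_Code": ["PU-001", "PU-002", "PU-003", "PU-004"],
    "LGA": ["LGA-A", "LGA-A", "LGA-B", "LGA-B"],
    "Registered_Voters": [500, 450, 600, 520],
    "Accredited_Voters": [300, 470, 310, 290],
    "APC": [150, 140, -20, 160],
    "PDP": [120, 130, 100, 110],
})
PARTIES = ["APC", "PDP"]


def flag_heuristics(df, parties):
    """Add boolean flag columns for simple rule-based checks."""
    flagged = df.copy()
    # A vote count can never be negative, so any negative value is an error.
    flagged["negative_count"] = (flagged[parties] < 0).any(axis=1)
    # Assumed rule: accredited voters should not exceed registered voters.
    flagged["over_accredited"] = (
        flagged["Accredited_Voters"] > flagged["Registered_Voters"]
    )
    flagged["any_flag"] = flagged["negative_count"] | flagged["over_accredited"]
    return flagged


flagged = flag_heuristics(results, PARTIES)
# Count flagged polling units per LGA to see where anomalies concentrate.
per_lga = flagged.groupby("LGA")["any_flag"].sum().sort_values(ascending=False)
print(flagged[["PU_Code", "negative_count", "over_accredited"]])
print(per_lga)
```

Keeping each rule in its own column makes it easy to see why a unit was flagged, which matters when the output is handed to someone who has to audit it.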
Section 4: Technical Highlights
- No GIS tools—just Python (Pandas, Matplotlib, NumPy)
- Google Colab for cloud execution
- Scalable framework for other states or elections
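To give a sense of what "no GIS tools" can look like in practice, here is a small sketch that plots polling units by longitude and latitude with Matplotlib and colours them by outlier score. The coordinates and scores are made up, and `plot_outliers` is a hypothetical helper, not a function from the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in Colab
import matplotlib.pyplot as plt
import numpy as np

# Made-up coordinates and scores for illustration only.
lat = np.array([12.10, 12.15, 12.30, 12.32])
lon = np.array([6.20, 6.25, 6.40, 6.42])
score = np.array([0.2, 0.5, 2.8, 3.1])


def plot_outliers(lon, lat, score):
    """Scatter polling units on a plain lon/lat grid, coloured by score."""
    fig, ax = plt.subplots(figsize=(6, 5))
    points = ax.scatter(lon, lat, c=score, cmap="viridis", s=60)
    fig.colorbar(points, ax=ax, label="Outlier score (illustrative)")
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    ax.set_title("Polling units coloured by outlier score")
    return fig, ax


fig, ax = plot_outliers(lon, lat, score)
```

Calling `fig.savefig("outlier_map.png")` afterwards writes the figure to disk. A plain scatter plot ignores map projection, which is a reasonable simplification for a single state but would distort distances over larger areas.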
Section 5: Recommendations
- Audit flagged polling units
- Fix metadata gaps and transcription errors
- Integrate anomaly detection into official result verification
Closing Thoughts
This project taught me how to apply statistical thinking to real-world problems, especially in politically sensitive domains. It’s a reminder that data science isn’t
For the full analysis, including code, tables, and transcription breakdowns, see the complete notebook on my LinkedIn page (https://www.linkedin.com/posts/rahmah-abubakar-243058288_analysis-process-activity-7253521166152163328-zh2n?utm_source=share&utm_medium=member_android&rcm=ACoAAEXGTFMBU8WaWDL8Z4Wl7HzBBVCiQx_eYO4).