Iman

Show Dev: TSAuditor, a data quality auditing library for time-series tabular data

 Check out the library here!

I made TSAuditor, a pure-Python, open-source data-quality auditing library for time-series tabular data, with a focus on the financial and sensor domains. tsauditor scans a DataFrame and returns a report of structural problems, anomalies, and data leakage between the features and the prediction target.

This library aims to simplify the data cleaning step in an ETL pipeline. It searches the input dataset for:

  • Missing timestamps
  • Hidden duplicates
  • Data gaps
  • Outliers
  • Variables that may cause leakage
  • Structural breaks
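
To make two of these checks concrete, here is a minimal sketch in plain pandas. It is not tsauditor's API: the function names, the modified z-score rule for outliers, the correlation rule for leakage, and both thresholds are assumptions chosen only for illustration.

```python
import pandas as pd


def flag_outliers(series: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Hypothetical helper: mark points whose modified z-score exceeds threshold."""
    median = series.median()
    # Median absolute deviation, a spread estimate that outliers barely affect.
    mad = (series - median).abs().median()
    if mad == 0:
        # No spread to measure against, so nothing is flagged.
        return pd.Series(False, index=series.index)
    modified_z = 0.6745 * (series - median) / mad
    return modified_z.abs() > threshold


def leakage_candidates(df: pd.DataFrame, target: str, threshold: float = 0.95) -> list:
    """Hypothetical helper: numeric features almost perfectly correlated with the target."""
    features = df.drop(columns=[target]).select_dtypes("number")
    corr = features.corrwith(df[target]).abs()
    return corr[corr > threshold].index.tolist()


# Made-up example data: one price spike and one feature derived from the target.
df = pd.DataFrame(
    {
        "price": [10, 11, 10, 12, 11, 10, 500, 11, 12, 10],
        "target": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    },
    index=pd.date_range("2024-01-01", periods=10, freq="D"),
)
df["leaky"] = df["target"] * 2 + 1

mask = flag_outliers(df["price"])
candidates = leakage_candidates(df, target="target")
print(df.index[mask].tolist())
print(candidates)
```

A high correlation only makes a feature a candidate for leakage. Whether it really leaks depends on how and when the feature was computed, so a result like this is a prompt for manual review.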

It also validates timestamp continuity, frequency stability, and chronological order.
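
The timestamp checks can be sketched the same way. This is again plain pandas and not tsauditor's own implementation: `check_timestamps`, its `freq` parameter, and the keys of the returned dictionary are hypothetical names used for illustration.

```python
import pandas as pd


def check_timestamps(index: pd.DatetimeIndex, freq: str) -> dict:
    """Hypothetical helper: report order, duplicates, gaps, and step sizes."""
    unique_sorted = index.unique().sort_values()
    # Every timestamp a perfectly regular series would contain.
    expected = pd.date_range(unique_sorted.min(), unique_sorted.max(), freq=freq)
    # Gaps between consecutive distinct timestamps; more than one size means
    # the sampling frequency is not stable.
    steps = unique_sorted.to_series().diff().dropna()
    return {
        "is_chronological": bool(index.is_monotonic_increasing),
        "duplicate_timestamps": index[index.duplicated()].unique().tolist(),
        "missing_timestamps": expected.difference(unique_sorted).tolist(),
        "distinct_steps": [pd.Timedelta(s) for s in sorted(steps.unique())],
    }


# Made-up index: one duplicate, one missing day, and two rows out of order.
idx = pd.DatetimeIndex(
    ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-05", "2024-01-04"]
)
report = check_timestamps(idx, freq="D")
for name, value in report.items():
    print(name, value)
```

Passing the expected frequency explicitly keeps the sketch simple. Inferring it from the data is possible but becomes ambiguous once the series has gaps.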

I would love feedback, and I am also open to contributions. Would tsauditor be useful to you?
