About
While I was working on endoflife.date integrations, the need for an offline copy started to arise:
Offline copy of data (#2530)
I really like the idea, but to avoid repeated API calls for every product I would like data on, I would like to maintain a local copy of the data and then only download updates each time I start my application (or after a particular time period, e.g. only request updates once every 24 hours).
Ideally, I would be able to get the data in JSON format, which I can then manage locally.
The alternative would be to call the API separately to get the data for each product. But this would also require that I know all of the products in the first place, which, given the dynamic nature of the data, isn't very attractive.
After various attempts, I finally found a Kaggle-based solution.
I wanted the data to:
- Be easy to share
- Rely on the official API
- Be up to date (without any effort)
- Be easy to integrate with third-party products
- 🧑‍🔬 Be deployed on a data-centric/data-science platform
- Show its source code (open source)
- Be easily extensible
Therefore I created a Notebook that does the following things once a week:
- Queries the API
- Loads & stores the data in a DuckDB database
- Exports the resulting database as sql and csv files
- Exports the database as Apache Parquet files
🧰 Tools
All you need is Python and DuckDB's JSON functions.
🎯 Result
For now, the only input is the API itself, while fresh output files are produced every week.

🗣️ Conclusion
Finally, I delivered the following solution to the community:
Weekly Scheduled offline exports on Kaggle ♾️ (#2633)
About
Getting an easy-to-use offline copy of endoflife.date would be very convenient for producing data analyses.
It uses the endoflife.date API to get an automated offline copy of the data.
The Notebook
Below are the very portable outputs:
💰 Benefits
Weekly:
- csv exports
- DuckDB exports
- Apache Parquet exports


