Anuj Singh

News Archiver

Overview of My Submission

News Archiver archives headlines and content from selected news websites (for now News18 and IndiaToday, but this can be changed) every 3 hours, which lets users look back at the news historically and see how it was reported on different sites.
It uses the node-schedule package to run a background job every 3 hours and puppeteer to scrape the content from each website. The scraped data is saved in an Appwrite database.
A client-side application (website) then renders this data.

Submission Category:

Web2 Wizards

Link to Code

Frontend/Client-Side Application

The data is not being collected regularly yet

For that I would have to host the app on a Node server or on an Appwrite droplet on DigitalOcean so it can collect the data from the sites regularly. My DigitalOcean credit is low, so I can't host it. Sorry 🙇‍♀️🙇






The live website has no data because I didn't host my application on an Appwrite droplet to collect the data and display it. Sorry about that 🙇‍♀️.

Web app

Backend/Server-Side Application

Code

Additional Resources / Info

Do check out the video at 1.5x speed for a quick walkthrough of my application and how it works.

PS: I didn't focus on security, which is why my ID still shows in the main app. If I have to, I can just move it into the .env file.
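
Moving the ID into a .env file could look roughly like the sketch below. The variable names (APPWRITE_ENDPOINT, APPWRITE_PROJECT_ID, APPWRITE_API_KEY) and the loadConfig helper are assumptions made for illustration, not names taken from the project. A package such as dotenv can load the .env file into process.env before this runs.

```javascript
// Sketch only: these variable names are assumptions, not the project's real ones.
const REQUIRED_KEYS = [
  "APPWRITE_ENDPOINT",
  "APPWRITE_PROJECT_ID",
  "APPWRITE_API_KEY",
];

// Read the Appwrite settings from the environment instead of hard-coding them.
// Throws early with a readable message if anything is missing.
function loadConfig(env = process.env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    endpoint: env.APPWRITE_ENDPOINT,
    projectId: env.APPWRITE_PROJECT_ID,
    apiKey: env.APPWRITE_API_KEY,
  };
}
```

Failing at startup keeps a half-configured job from running and writing nothing. Note that anything shipped to the browser stays visible to users, so this mainly helps on the server side.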

Backend

  • Run node index.js


  • The cron schedule can be set to any interval (for now, let's say every minute: * * * * *)
  • After that, it scrapes the data, mainly the image and headline
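
As a rough sketch of these two steps, the snippet below schedules a job with node-schedule and scrapes headline/image pairs with puppeteer. The site URLs, the CSS selectors, and every function name here are assumptions for illustration; the real selectors depend on each site's markup. Both libraries are loaded lazily, so the cleanItems helper can be tried without installing them.

```javascript
// Sketch only: URLs, selectors and function names are assumptions.

// Five-field cron expressions of the kind node-schedule accepts.
const EVERY_MINUTE = "* * * * *"; // handy while testing
const EVERY_THREE_HOURS = "0 */3 * * *"; // minute 0 of every third hour

// Hypothetical per-site settings.
const SITES = [
  { name: "News18", url: "https://www.news18.com/", selector: "figure" },
  { name: "IndiaToday", url: "https://www.indiatoday.in/", selector: "article" },
];

// Trim headlines, collapse whitespace, and drop empty or repeated entries.
function cleanItems(rawItems) {
  const seen = new Set();
  const items = [];
  for (const { headline, img } of rawItems) {
    const text = (headline || "").replace(/\s+/g, " ").trim();
    if (!text || seen.has(text)) continue;
    seen.add(text);
    items.push({ headline: text, img: img || null });
  }
  return items;
}

// Open the page in headless Chrome and collect text plus first image per match.
async function scrapeSite(site) {
  const puppeteer = require("puppeteer"); // loaded only when scraping
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(site.url, { waitUntil: "domcontentloaded" });
    const raw = await page.$$eval(site.selector, (nodes) =>
      nodes.map((node) => {
        const img = node.querySelector("img");
        return { headline: node.textContent, img: img ? img.src : null };
      })
    );
    return cleanItems(raw);
  } finally {
    await browser.close();
  }
}

// Run the scrape on a schedule and hand each site's items to a callback.
function startSchedule(cron = EVERY_THREE_HOURS, onItems = console.log) {
  const schedule = require("node-schedule"); // loaded only when scheduling
  return schedule.scheduleJob(cron, async () => {
    for (const site of SITES) {
      onItems(site.name, await scrapeSite(site));
    }
  });
}
```

Calling startSchedule(EVERY_MINUTE) would match the every-minute setting used in the demo, while the default matches the 3-hour interval from the overview.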


  • Next, it checks the collection list
    • Basically for one thing:
    • Does a collection exist for the current date? If not, it creates a collection with the given attributes
    • If the collection exists, it creates a document in that collection with the scraped data
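
The check described above can be sketched as a small "get or create" routine. To stay independent of any particular Appwrite SDK version, the sketch talks to a hypothetical db adapter (listCollectionIds, createCollection, createDocument); these are not Appwrite's method names, and in a real app each would wrap the matching SDK call. The "YYYY-MM-DD" ID format, the attribute names, and writing the documents right after a new collection is created are also assumptions.

```javascript
// Sketch only: `db` is a hypothetical adapter, not the Appwrite SDK's API.

// Build a collection ID such as "2022-05-12" from a Date (UTC, assumed format).
function collectionIdFor(date) {
  return date.toISOString().slice(0, 10);
}

// Make sure today's collection exists, then store one document per item.
async function saveScrape(db, items, now = new Date()) {
  const id = collectionIdFor(now);
  const existing = await db.listCollectionIds();
  if (!existing.includes(id)) {
    // No collection for this date yet: create it with its attributes.
    await db.createCollection(id, ["headline", "img", "source"]);
  }
  // The collection exists at this point, so add the scraped data to it.
  for (const item of items) {
    await db.createDocument(id, item);
  }
  return id;
}

// In-memory stand-in for the adapter, useful for trying the flow locally.
function createMemoryDb() {
  const collections = new Map();
  return {
    collections,
    async listCollectionIds() {
      return [...collections.keys()];
    },
    async createCollection(id, attributes) {
      collections.set(id, { attributes, documents: [] });
    },
    async createDocument(id, data) {
      collections.get(id).documents.push(data);
    },
  };
}
```

With this shape, two runs on the same day end up in one collection, and the first run after midnight (UTC in this sketch) starts a new one.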

  • That's it for the backend

Frontend

  • Renders the data stored in the collection


And how does it render the data for the selected date?
Well, it's easy. To start off, I created each collection with the date as its ID, so the selected date can be used to look up the matching collection.
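
A minimal sketch of that lookup, assuming the collection ID is the same "YYYY-MM-DD" string a date picker produces (the post does not state the exact format). The listDocuments argument is a hypothetical function supplied by the caller, which in a real app would wrap the Appwrite call that lists a collection's documents.

```javascript
// Sketch only: the ID format and `listDocuments` are assumptions.

// Validate the date picker value and use it directly as the collection ID.
function collectionIdFromInput(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    throw new Error(`Expected a date like 2022-05-12, got "${value}"`);
  }
  return value;
}

// Fetch the documents for the selected date and keep only what gets rendered.
async function headlinesForDate(value, listDocuments) {
  const documents = await listDocuments(collectionIdFromInput(value));
  return documents.map(({ headline, img }) => ({ headline, img }));
}
```

Because the date is the ID, no search or filter query is needed: picking a date is a direct lookup of one collection.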


For more info you can connect with me on Twitter

OoO

In case you are wondering: the backend can be hosted on DigitalOcean with an Appwrite droplet so that it keeps running all the time.
