Davide Santangelo

Dato.RSS - RSS with Ruby

Intro

My latest Rails side project is a search engine.

dato.rss is a simple, clean and fast RSS search engine with a RESTful API.

The project is divided into two parts:

Search Engine: Quickly search through the millions of available RSS feeds.

RESTful API: Turns feed data into an awesome API. The API simplifies how you handle RSS, Atom, or JSON feeds. You can add and keep track of your favourite feed data with a simple, fast and clean REST API. All entries are enriched by Machine Learning and Semantic engines.

Search

Search is implemented with the full-text search feature of Postgres.

I used the pg_search Gem, which can be used in two ways:

Multi Search: Search across multiple models and return a single array of results. Imagine having three models: Product, Brand, and Review. Using Multi Search we could search across all of them at the same time, seeing a single set of search results. This would be perfect for adding federated search functionality to your app.

Search Scope: Search within a single model, but with greater flexibility.

For the implementation I found this article very useful: https://pganalyze.com/blog/full-text-search-ruby-rails-postgres

    execute <<-SQL
      ALTER TABLE entries
      ADD COLUMN searchable tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('simple', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('simple', coalesce(body,'')), 'B') ||
        setweight(to_tsvector('simple', coalesce(url,'')), 'C')
      ) STORED;
    SQL
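
To make the role of the `searchable` column more concrete, here is a minimal sketch of a hand-written query against it. It is independent of pg_search and is not taken from the project: the helper name `entries_search_sql`, the selected columns and the `limit` parameter are assumptions for illustration. The query uses the same `simple` configuration as the column and relies on `ts_rank`, which takes the A/B/C weights into account.

```ruby
# Hypothetical helper (not from the project): builds a ranked full-text query
# against the `searchable` tsvector column. The search terms are passed
# separately as the bind parameter $1.
def entries_search_sql(limit: 20)
  unless limit.is_a?(Integer) && limit.positive?
    raise ArgumentError, "limit must be a positive Integer"
  end

  <<~SQL
    SELECT id, title, url,
           ts_rank(searchable, websearch_to_tsquery('simple', $1)) AS rank
    FROM entries
    WHERE searchable @@ websearch_to_tsquery('simple', $1)
    ORDER BY rank DESC
    LIMIT #{limit}
  SQL
end

puts entries_search_sql(limit: 10)
```

With the pg gem, a query like this could be run as `conn.exec_params(entries_search_sql, ["ruby rss"])`, assuming `conn` is an open `PG::Connection`.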

Search View

(Screenshot: search view)

Feed Rank

Feed ranking is provided by OpenRank, a free root-domain authority metric based on the Common Search PageRank dataset. The value is normalized with:

((Math.log10(domain_rank) / Math.log10(100)) * 100).round
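
As a worked example of the formula: `Math.log10(100)` is 2, so the result is simply 50 times the base-10 logarithm of the rank, rounded. The sketch below wraps it in a method; the name `normalized_rank` and the guard for missing or non-positive values are my assumptions here, not part of the project.

```ruby
# Hypothetical wrapper around the normalization formula from the post.
def normalized_rank(domain_rank)
  # Assumption: missing or non-positive ranks map to 0, because log10 is
  # undefined for them (Math.log10(0) is -Infinity, negatives raise).
  return 0 if domain_rank.nil? || domain_rank <= 0

  ((Math.log10(domain_rank) / Math.log10(100)) * 100).round
end

[1, 10, 50, 100].each do |rank|
  puts "#{rank} -> #{normalized_rank(rank)}"
end
# prints:
# 1 -> 0
# 10 -> 50
# 50 -> 85
# 100 -> 100
```

Note that the scale is logarithmic: a raw rank of 10 already lands at 50, and any raw value above 100 would produce a result above 100.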

Machine Learning

Machine learning is provided by the Dandelion API, a semantic text analytics service that turns text into actionable data: it extracts meaning from unstructured text and puts it in context with a simple API.

RESTful API

All API documentation is in the Wiki section. Feel free to make it better, of course.

https://github.com/davidesantangelo/dato.rss/wiki

Some features, such as adding a new feed, require a token with write permission. Currently only I can enable it, so contact me if you need one.

GitHub

davidesantangelo / dato.rss

The best RSS Search experience you can find

DATO.RSS

A seamless RSS Search Engine experience with a hint of Machine Learning.

SEED

An SQL dump of the database, with over 3 million entries collected over more than a year, can be downloaded at https://davidesantangelo.gumroad.com/l/nkyymb

BETA

Dato.RSS is in beta, and will likely see many changes in the near future.

If you have comments or suggestions, please send them to us using the Issues tab.

Thanks for trying the beta!

Example

curl 'https://<yourhost>/api/searches?q=news' | json_pp
{
  "data": [
    {
      "id": "86b0f829-e300-4eef-82e1-82f34d03aff6
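
The same request can be sketched in Ruby with the standard library only. This is an illustrative client, not part of the project: the method names and the `example.com` host are placeholders, and the parsing relies only on what the truncated output above shows, namely a top-level `data` array whose items have an `id`.

```ruby
require "json"
require "net/http"
require "uri"

# Builds the search URL used in the curl example. `host` stands in for your
# own deployment, like <yourhost> in the original command.
def search_uri(host, query)
  URI::HTTPS.build(host: host, path: "/api/searches",
                   query: URI.encode_www_form(q: query))
end

# Extracts the entry ids from a response body. Only "data" and "id" are
# visible in the truncated output, so no other field is assumed.
def entry_ids(json_body)
  JSON.parse(json_body).fetch("data", []).map { |entry| entry["id"] }
end

# Performs the request. It needs a reachable host, so it is not called here.
def search(host, query)
  response = Net::HTTP.get_response(search_uri(host, query))
  raise "search failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  entry_ids(response.body)
end

puts search_uri("example.com", "news")
```

Keeping URL building and parsing separate from the network call makes both easy to check without a running server.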