William Narmontas
Web standards for content syndication

https://www.scalawilliam.com/web-standards-for-content-syndication/

I've been making websites since 2004, and one thing that has struck me is how brittle it is to synchronise content over HTTP, and how hard it is to do so instantaneously. For example, when I write a blog post I'd like it to be synchronised to my own website, perhaps posted to Twitter, or to trigger a trackback/pingback - many different targets with many different intentions.

Perhaps it's not a blog post but an automatically generated item. It may be urgent or non-urgent, it may be frequent or infrequent.

In this article, I'm only interested in one-way data sharing, so WebSockets, which are bidirectional, are out of the picture.

Pull

Pulling by polling means initiating a request to a server every X minutes and then checking if the content has changed.

This introduces a maximum delay of X minutes and requires setting up scheduled tasks. You also need to make sure you do not exceed the usage limits of the target server.
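As a rough illustration of polling, here is a minimal sketch that fetches a resource on a fixed interval and reports it only when the body has changed. The `poll_for_changes` name, the injected `fetch` callable and the SHA-256 comparison are all assumptions made for this example; a real poller would more likely rely on HTTP caching headers such as `ETag`.

```python
import hashlib
import time
from typing import Callable, Iterator, Optional


def poll_for_changes(
    fetch: Callable[[], bytes],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield the fetched body whenever it differs from the previous poll.

    `fetch` is a hypothetical callable that returns the current content,
    for example a wrapper around an HTTP GET. The first poll always
    yields, because there is nothing to compare against yet.
    """
    last_digest: Optional[str] = None
    for attempt in range(max_polls):
        body = fetch()
        digest = hashlib.sha256(body).hexdigest()
        if digest != last_digest:
            last_digest = digest
            yield body
        # Wait between polls, but not after the final one.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
```

The interval is the "X minutes" from above: a change made just after a poll is not seen until the next one, which is where the maximum delay comes from.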

Atom and RSS are good for providing a structured pull approach.
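To show what "structured" buys you, here is a small sketch that reads the entries of an Atom feed with Python's standard library. The `atom_entries` helper and the sample feed are made up for illustration; only the Atom namespace and element names come from the format itself.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# The Atom 1.0 XML namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# A made-up feed, used only for this example.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry>
    <id>https://example.org/posts/1</id>
    <updated>2017-05-01T10:00:00Z</updated>
    <title>First post</title>
  </entry>
  <entry>
    <id>https://example.org/posts/2</id>
    <updated>2017-05-02T10:00:00Z</updated>
    <title>Second post</title>
  </entry>
</feed>
"""


def atom_entries(feed_xml: str) -> List[Tuple[str, str, str]]:
    """Return (id, updated, title) for each entry in an Atom feed."""
    root = ET.fromstring(feed_xml.strip())
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entries.append((
            entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
            entry.findtext("atom:title", default="", namespaces=ATOM_NS),
        ))
    return entries
```

Because every entry carries a stable `id` and an `updated` timestamp, a poller can tell new or changed items apart from ones it has already seen.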

Push

Pushing means that as soon as your content is published, YOU call into different targets to notify them of what has happened. It's reactive.

There are three flavours of push. To receive pushes you need to host a dynamic server, which is a disadvantage, especially if you use static hosting.

Predefined set of targets

If you wanted to tweet automatically at the point of publishing, you'd call into the Twitter silo API, typically with an HTTP POST. The drawback is that you have to write a custom integration for every single target.

Self-managing subscribers

One interesting use case I have is in ActionFPS, where I wish to automatically enter tournament data into Challonge when a "clan war" is played. One way to do it is to call into Challonge directly (the current approach, but brittle); the other is to create a separate service (akin to microservices) that reacts to these events.

But I want other people to be able to react as well. So I'm going to try out WebSub, previously known as PubSubHubbub, which is a W3C Candidate Recommendation as of 11 April 2017.

This is what we should be doing in microservices, rather than taking tightly coupled approaches.
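For a feel of how self-managing subscription works in WebSub, here is a minimal sketch of the subscriber's side: the form body it sends to the hub, and how it answers the hub's verification request by echoing the challenge. The function names and the returned `(status, body)` tuples are assumptions for this example; the `hub.*` parameter names are the ones WebSub defines. Optional parts such as `hub.secret` and lease handling are left out.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode


def subscription_request_body(topic_url: str, callback_url: str) -> str:
    """Form-encoded body a subscriber POSTs to the hub to subscribe."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    })


def verify_intent(query_string: str, expected_topic: str) -> Tuple[int, str]:
    """Answer the hub's verification GET on the callback URL.

    Returns a hypothetical (status, body) pair: 200 with the echoed
    challenge if we did ask for this topic, 404 otherwise.
    """
    params = parse_qs(query_string)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if mode == "subscribe" and topic == expected_topic and challenge:
        return 200, challenge
    return 404, ""
```

The publisher never needs to know who the subscribers are: anyone can register a callback with the hub, which is what makes this flavour loosely coupled.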

Content-based targets

... such as mentions. There's a new standard out there called Webmention, which is similar to the good old XML-RPC, Pingback and Trackback.

Webmention enables conversations. Here's the specification: https://www.w3.org/TR/webmention/ - which is a W3C Recommendation as of 12 January 2017.

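Effectively, if you want to receive webmentions, you place a <link rel="webmention" href="..."/> pointing at your endpoint in your HTML. When someone mentions your page, their software discovers that endpoint and sends it an HTTP POST with the URL of their page (source) and the URL of yours (target).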
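The sender's side of Webmention can be sketched as follows: find the endpoint advertised in the target page's HTML, then build the form body to POST to it. The helper names and example URLs are assumptions for illustration, and this sketch only looks at the HTML; the specification also allows the endpoint to be advertised in an HTTP `Link` header, which is not handled here.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlencode, urljoin


class _EndpointFinder(HTMLParser):
    """Remember the href of the first <link> or <a> with rel="webmention"."""

    def __init__(self) -> None:
        super().__init__()
        self.endpoint: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").split()
        if "webmention" in rels and attributes.get("href") is not None:
            self.endpoint = attributes["href"]


def discover_endpoint(page_url: str, html: str) -> Optional[str]:
    """Return the absolute Webmention endpoint advertised in `html`, if any."""
    finder = _EndpointFinder()
    finder.feed(html)
    if finder.endpoint is None:
        return None
    # The href may be relative, so resolve it against the page URL.
    return urljoin(page_url, finder.endpoint)


def webmention_body(source: str, target: str) -> str:
    """Form-encoded body to POST to the endpoint."""
    return urlencode({"source": source, "target": target})
```

The receiver is then expected to fetch the source page and check that it really links to the target before accepting the mention.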

Because of this approach, you can use free services like Bridgy (open source) to automatically post your content to Twitter and other places, and also syndicate responses back! You don't need centralisation or the "let's write our own Twitter adapter" approach above. Nice!

Pull-push

This is my favourite approach: EventSource (aka Server-Sent Events). The W3C Working Draft is here: https://www.w3.org/TR/2011/WD-eventsource-20111020/ - it's also reactive.

EventSource is available in most browsers except IE - but you can use a polyfill for IE.

EventSource has a good Node.js library and also comes with reliability built in: events can carry an ID, and if you miss some, the client reconnects with a Last-Event-ID header so the server can resume from the last event you saw. I made a small contribution to that library several weeks ago so that I could synchronise logs from ActionFPS to my local machine in real time. There's also a Scala client for SSE.
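The wire format behind EventSource is simple enough to sketch. Below is a minimal parser for the `text/event-stream` format that tracks the last event ID seen; the `ServerSentEvent` class and `parse_event_stream` function are names invented for this example, and the `retry` field and network handling are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class ServerSentEvent:
    event: str
    data: str
    last_event_id: Optional[str]


def parse_event_stream(lines: Iterable[str]) -> Iterator[ServerSentEvent]:
    """Parse text/event-stream lines into events."""
    event_type = ""
    data_lines: List[str] = []
    last_event_id: Optional[str] = None
    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if it has any data.
            if data_lines:
                yield ServerSentEvent(
                    event_type or "message", "\n".join(data_lines), last_event_id
                )
            event_type, data_lines = "", []
            continue
        if line.startswith(":"):
            continue  # Comment line, often used as a keep-alive.
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # A single leading space is not part of the value.
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
        elif field == "id":
            last_event_id = value  # Persists across events until replaced.
```

A client that remembers `last_event_id` can send it back as the `Last-Event-ID` request header when it reconnects, which is how the recovery described above works.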

The advantage is that your receiver is a client and not a server; the disadvantage is that it has to keep a connection open continuously.

Push API

The Push API is made for web browsers and uses service workers. It is a W3C Working Draft as of 08 May 2017.

Further reading

Of interest might also be:

Summary

Basically, choose the approach based on the frequency and urgency of delivery of your content, and on whether you have a server to receive it:

  • Infrequent & not urgent, no server to receive - pull
  • Infrequent & urgent, with server to receive - push
  • Urgent, browser-only - Push API
  • Frequent, no server to receive - pull-push

Top comments (3)

Håvard Pedersen

jsonfeed.org is an up-and-coming new pull format that seems better than anything XML-based. :)

William Narmontas

Better how?

Håvard Pedersen

Mostly ease of parsing. JSON is easy to parse from vanilla JavaScript without frameworks. The same applies to PHP. Parsing complex XML is a major hassle.