A bot that tweets new DEV articles about Vue

Daniel Elkington ・6 min read

So I love VueJS. And I love dev.to articles about Vue. And I don't want to miss any. I could just keep refreshing the #vue tag on dev.to, but instead I decided it would be more convenient to see all new Vue articles in my Twitter feed. So I wrote my own Twitter bot - @TheVueDev.

Basic architecture and flow

The basic flow I decided on was the following:

  • An Azure Function written in TypeScript runs every minute,
  • The function scrapes dev.to/t/vue/latest to check for new articles,
  • If new articles are found, they get automatically tweeted!

An Azure Function is Microsoft Azure's "serverless" offering that allows you to run a small piece of code in response to some trigger. There's a generous free tier (1 million free executions per month!) that means this project is very cheap, though not entirely free as the Function requires an Azure Storage account to store some things related to the function. Still, the overall cost for this project is less than US$1/month.

Setting up Twitter

Setting up the Twitter account is relatively straightforward (you just need to be prepared to answer a lot of questions that Twitter understandably asks to try to prevent abuse). After setting up a Twitter account for the bot at twitter.com, I went to developer.twitter.com and followed the prompts to apply for a developer account. Finally, I created a new "app", filled in some more details, requested some access tokens, and was presented with a page that looks like this:
The Twitter developer page with some access tokens and secret keys to copy
That's right, there are four tokens/keys, and we're going to need all of them!

The Function

I followed the quickstart guide in the Azure docs to create an Azure Function in Visual Studio Code, selecting "JavaScript" as the language. I then ran tsc --init to add TypeScript, and was ready to start coding!

Running code every minute

For each function you create, you need to specify what will trigger it (find all the options here). In my case I wanted to trigger the function every minute, which could be done with the following code in the function's function.json file:

{
  "disabled": false,
  "bindings": [
    {
      "schedule": "0 * * * * *",
      "type": "timerTrigger",
      "direction": "in",
      "name": "everyMinuteTimer"
    }
  ]
}
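
For reference, the schedule value is a six-field NCRONTAB expression (second, minute, hour, day, month, day of week), so "0 * * * * *" fires at second 0 of every minute. The sketch below only labels the fields to make that easier to read; describeSchedule and ScheduleFields are hypothetical names used for illustration and are not part of Azure Functions.

```typescript
// Hypothetical helper (not part of Azure Functions): labels the six
// fields of an NCRONTAB schedule expression.
interface ScheduleFields {
  second: string
  minute: string
  hour: string
  day: string
  month: string
  dayOfWeek: string
}

function describeSchedule(expression: string): ScheduleFields {
  const parts = expression.trim().split(/\s+/)
  if (parts.length !== 6) {
    throw new Error('Expected 6 fields but found ' + parts.length)
  }
  const [second, minute, hour, day, month, dayOfWeek] = parts
  return { second, minute, hour, day, month, dayOfWeek }
}

// "0 * * * * *": second is fixed at 0, every other field is a wildcard,
// so the timer fires once at the start of every minute.
const everyMinute = describeSchedule('0 * * * * *')
console.log(everyMinute)
```

Changing the first two fields to "0 */5" would, under the same format, run the function every five minutes instead.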

Scraping dev.to

I want to thank Sarthak Sharma for his amazing Vue CLI - check it out if you haven't already. The code he used to scrape dev.to gave me a lot of pointers for this project.



Like Sarthak, I used the x-ray library to help with scraping. Every minute I want to load up https://dev.to/t/vue/latest and scrape details of the latest VueJS articles. With the help of the Chrome devtools, grabbing things like the article title, author details, and a link to the article is quite straightforward - querying the page uses syntax very similar to CSS.

import Xray = require('x-ray')

const xray = Xray()
const articleScrapeData = await xray(
  'https://dev.to/t/vue/latest',
  '#substories .single-article',
  [
    {
      title: '.index-article-link .content h3',
      author: 'h4 a',
      link: '.index-article-link@href',
      tags: ['.tags .tag'], // need the tags so we can strip them out of the title
      authorLink: '.small-pic-link-wrapper@href'
    }
  ]
)

That will return an array of the articles on the page. I needed to do a bit of cleanup of the data (here's the code if you're interested). Of course, it makes sense to tweet out the author's Twitter handle if they've provided it, so I scraped that from the author's profile page.

  const socialLinks: string[] = await xray(article.authorLink, [
    '.profile-details .social a@href'
  ])
  const twitterLink = socialLinks.find(x => x.includes('twitter.com/'))
  if (twitterLink != null) {
    // twitterLink is in the form https://twitter.com/{blah}
    const twitterURL = new URL(twitterLink)
    // Keep the handle on the article so it can be used when building the tweet
    article.twitterHandle = '@' + twitterURL.pathname.substring(1)
  }
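
The handle extraction can also be pulled out into a small pure function, which makes it easy to try against sample links without scraping anything. This is only a sketch: twitterHandleFromLinks is a hypothetical name, and it assumes the scraped links are absolute URLs of the form https://twitter.com/{handle}.

```typescript
// Hypothetical helper: returns '@handle' for the first Twitter link found,
// or undefined when the author has not listed a Twitter account.
// Assumes every link is an absolute URL (new URL throws otherwise).
function twitterHandleFromLinks(socialLinks: string[]): string | undefined {
  const twitterLink = socialLinks.find(link => link.includes('twitter.com/'))
  if (twitterLink == null) {
    return undefined
  }
  // Take the first non-empty path segment so a trailing slash is tolerated.
  const handle = new URL(twitterLink).pathname.split('/').filter(Boolean)[0]
  return handle ? '@' + handle : undefined
}

console.log(
  twitterHandleFromLinks([
    'https://github.com/danielelkington',
    'https://twitter.com/TheVueDev'
  ])
)
```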

For each article, I could then construct the tweet. I decided that tweets would contain the article title, then the author's Twitter handle or name, and finally a link to the article (which dev.to very nicely formats with a beautiful card).

const tweetContent =
`${article.title}

{ ${article.twitterHandle || article.authorName} }
${article.link}` // Yay for Template Literals!

Now all we need to do is tweet the articles!

The dilemma

The problem was that the function can't just tweet all the "recent articles" we find each minute, or we'll end up tweeting heaps of duplicates. Compounding the problem is that dev.to doesn't seem to provide an easy way to see when an article was published to anything more granular than the nearest day. I came up with three possible solutions:

  1. Only run the function once every 24 hours, and tweet out any articles with yesterday's date. Problem is that if the function fails for some reason we'll miss a day, and if it works followers need to wait up to 24 hours for new articles!
  2. Implement some sort of storage to keep track of the articles that have been tweeted. But this seems like overkill for a project like this. (You could argue that TypeScript is also overkill for a project like this, but I wanted to use it for learning purposes ☺️)
  3. Use the Twitter API to get the latest articles that the account has tweeted, do a comparison with the latest ones found on dev.to, and tweet out any that haven't yet been tweeted.

Option 3 seemed the most straightforward. Scraping dev.to/t/vue/latest only returns the last 8 articles, while the Twitter API can return up to the last 200 tweets from the account in a single request, so the function should always be able to accurately determine whether an article has been tweeted.

Using the Twitter API

Omar Sinan's article on building a Twitter bot was very helpful - check it out!



Like Omar, I used the Twit library to help with tweeting. Connecting it to @TheVueDev was as simple as pasting in the four keys and tokens from the Twitter app page (as environment variables of course!).

this.t = new Twit({
  consumer_key: process.env.twitter_consumer_key,
  consumer_secret: process.env.twitter_consumer_secret,
  access_token: process.env.twitter_access_token,
  access_token_secret: process.env.twitter_access_token_secret
})
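
Because a missing environment variable would otherwise only surface later as a failed API call, it can help to check all four values up front. The following is a minimal sketch under that assumption; readTwitterConfig and TwitterConfig are hypothetical names, not part of the Twit library, and the variable names are the ones used above.

```typescript
// Hypothetical helper: collects the four Twitter credentials from an
// environment-like object and fails fast if any of them is missing.
interface TwitterConfig {
  consumer_key: string
  consumer_secret: string
  access_token: string
  access_token_secret: string
}

function readTwitterConfig(
  env: Record<string, string | undefined>
): TwitterConfig {
  const names = [
    'twitter_consumer_key',
    'twitter_consumer_secret',
    'twitter_access_token',
    'twitter_access_token_secret'
  ]
  // Treat unset and empty values the same way.
  const missing = names.filter(name => !env[name])
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '))
  }
  return {
    consumer_key: env['twitter_consumer_key'] as string,
    consumer_secret: env['twitter_consumer_secret'] as string,
    access_token: env['twitter_access_token'] as string,
    access_token_secret: env['twitter_access_token_secret'] as string
  }
}
```

With a helper like this, the constructor call could become new Twit(readTwitterConfig(process.env)).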

Then we just tweet out our articles if they haven't yet been tweeted!

// Get recent tweets
const res = await this.t.get('statuses/user_timeline', {
  screen_name: 'TheVueDev'
})
const recentTweets = res.data as any[]
// Determine if article has been tweeted yet
// Need to look at entities.urls.expanded_url because Twitter
// automatically shortens the URL in the tweet itself.
const alreadyTweeted = recentTweets.some(x =>
  x.entities.urls.some((u: any) => u.expanded_url === articleLink)
)
if (!alreadyTweeted) {
  // Article hasn't yet been tweeted - tweet it!
  await this.t.post('statuses/update', { status: tweetContent })
}

(Find the more efficient version of this code here.)
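
The linked version is not reproduced here. As a rough sketch of one way to avoid rescanning every tweet for each article, the tweeted URLs can be collected into a Set once and each article checked against it. findUntweeted, TweetLike and ArticleLike are hypothetical names for illustration, and the interfaces only describe the fields this check relies on.

```typescript
// Minimal shapes for the fields used here; real tweet objects carry many more.
interface TweetLike {
  entities: { urls: { expanded_url: string }[] }
}

interface ArticleLike {
  title: string
  link: string
}

// Hypothetical helper: returns the articles whose link does not appear
// as an expanded URL in any of the recent tweets.
function findUntweeted<T extends ArticleLike>(
  articles: T[],
  recentTweets: TweetLike[]
): T[] {
  // Build the lookup once so each article is a single Set check.
  const tweetedUrls = new Set<string>()
  for (const tweet of recentTweets) {
    for (const url of tweet.entities.urls) {
      tweetedUrls.add(url.expanded_url)
    }
  }
  return articles.filter(article => !tweetedUrls.has(article.link))
}

// Made-up sample data: only the second article is still waiting to be tweeted.
const sampleTweets: TweetLike[] = [
  { entities: { urls: [{ expanded_url: 'https://dev.to/a/first' }] } },
  { entities: { urls: [] } }
]
const sampleArticles: ArticleLike[] = [
  { title: 'First', link: 'https://dev.to/a/first' },
  { title: 'Second', link: 'https://dev.to/b/second' }
]
console.log(findUntweeted(sampleArticles, sampleTweets).map(a => a.title))
```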

Deployment

Deploying the function to Azure can be done right from within Visual Studio Code with a few button clicks using the Azure Functions extension. Neat!

And with that, I have a functioning Twitter bot, and can now easily know about all the new VueJS articles as they appear on dev.to!

A tweet from my Twitter bot

Here's the Twitter account, and here's the full code, with step-by-step instructions for deploying your own bot to tweet out any dev.to tag of your choice!

danielelkington / twitter-vue-dev

A bot that lives in Azure Functions and tweets all dev.to VueJS articles

twitter-vue-dev

A bot that lives in Azure Functions and tweets all dev.to VueJS articles. Find it @TheVueDev.

dev.to article explaining how this was created.

This same code can be used to create a Twitter bot that tweets new articles from any valid dev.to tag.

Deployment

Follow these instructions to create your own Twitter bot that tweets dev.to articles for the tag of your choice. It will probably cost you a few cents a month in Azure Storage fees.

Set up Twitter account

  1. Go to twitter.com and make a Twitter account for your bot.

  2. While signed into your bot's Twitter account, go to developer.twitter.com, click "Apply", and then follow the prompts to apply for a developer account.

    • When asked to "Select a user profile to associate", ensure your bot's account is selected.
    • Fill in the requested information about what your developer account will be used for, accept the terms and…

Discussion


Nice!

CC: @peter @michaeltharrington

As we expand our Twitter distribution we could definitely make use of some of this automation and community involvement would be pretty great. Just getting this on your radar.

 

Should probably point out that this implementation is quite fragile - if dev.to at some stage added an API it would make this sort of thing much more robust than the current scraping solution 😉

 

Duly noted. We have some endpoints

dev.to/api/articles?tag=vue

But they’re pretty scattered. We’ll have more order to the madness in the near future.

 

Loved it. Awesome article. I’ve used puppeteer for automations like this before but it’s usually overkill. I’ll try to replicate this by building a bot for React :) Thank you for the article.

 

Thanks for the kind words; all the best with your bot!

 

Bot up and running on azure as well. The twitter profile is twitter.com/TheReactDev. I'll wrap it up tomorrow and put the code on GitHub. This stuff is fun!

One thing to note is that checking the extended_url didn't work for me. Sometimes the Twitter API hands me a twitter.com link instead of a dev.to link. Any ideas on why?

Do you mean the expanded_url in tweet.entities.urls? I didn't have this problem 🤔. If you get stuck feel free to put your code on GitHub and I'm happy to take a look!

@frontendwizard Just saw your React bot - it looks great! Especially like how you turned the dev.to tags into Twitter hashtags. Great work.

Thanks. Yeah, I mean the tweet.entities.urls. For some posts it does not return the dev.to URL but instead returns a twitter.com link to the tweet itself. I ended up fetching the tweets in extended mode and checking the title + author in the full_text. Not great, but it works.

I found the problem. You have to ask for the tweets with tweet_mode: "extended" on the statuses/user_timeline endpoint, otherwise you don't always get the posted url. Might be a quirk of the twitter API, I'm not sure about it. Was writing about this when I finally understood the behavior. That's why writing about what you're doing is so important! 😄

 

Thank you for this!

 

I was surprised to see this. The idea is awesome.

 

Great job not just on the idea, but bringing it to life! Proof it's working:

 

An RSS feed! An RSS feed! My kingdom for an RSS feed!