Justin Yoo for Microsoft Azure

Originally published at devkimchi.com

Durable Functions to Schedule Publish to Dev.To

There's a tool called PublishToDev, built by one of my colleagues, Todd, which schedules articles for publication on Dev.To. It's super useful because I can schedule my posts for whenever I want them published. As soon as I saw this tool, I wanted to clone it in .NET because it would be good practice.

Let's walk through how I made it.

You can find the entire source code of this application in this GitHub repository:

devrel-kr / devto-article-publish-scheduler

DevTo Article Publish Scheduler

This is a scheduler that publishes an article to https://dev.to at a given date and time.

Getting Started

To use this scheduler, 1) deploy the app then 2) send an HTTP API request to the app by following the instructions.

Keys/Secrets

To deploy this app onto Azure, you need a few things beforehand.

  • API key for dev.to: Go to https://dev.to/settings/account and generate a new API key.
  • Azure credentials: Use your existing Azure account or create a free account, if you don't have one. Then run the following Azure CLI command to get your Azure credentials
    az ad sp create-for-rbac \
        --name "<service_principal_name>" \
        --sdk-auth \
        --role contributor
  • Azure resource group: Use your existing Azure resource group or create a new one by running the Azure CLI command
    az group create \
        -n "<resource_group_name>" \
        -l "<location>"

Web Pages Scraping

Once you write a blog post on Dev.To, you can get a preview URL before publishing it. The preview URL has the format https://dev.to/<username>/xxxx-****-temp-slug-xxxx?preview=xxxx. All you need from the preview page is the article ID, which is found on the HTML element with the attribute id="article-body".

HTML document view on Dev.To preview page

As the picture above shows, that element has a data-article-id attribute. Its value is the article ID.

Using either Puppeteer or Playwright to scrape a web page is super simple. Both have .NET ports, Puppeteer Sharp and Playwright Sharp respectively. Unfortunately, they don't work on Azure Functions. More precisely, they work in your local dev environment but not on an Azure instance. This post would be useful for your Node.js Azure Functions app, but it's not that helpful for a .NET application. I'll keep looking for a way to make them work correctly on an Azure Functions instance.

Therefore, I had to fall back to the traditional way of scraping: HttpClient and regular expressions.
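The idea can be sketched as follows. This is only a minimal illustration, not the code from the repository: the class name PreviewPageScraper, the method names and the exact regular expression are my assumptions, and it assumes the data-article-id value is numeric and double-quoted.

```csharp
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative, not taken from the repository.
public static class PreviewPageScraper
{
    // Captures the digits in data-article-id="12345".
    // Assumption: the value is numeric and wrapped in double quotes.
    private static readonly Regex ArticleIdPattern =
        new Regex("data-article-id=\"(?<id>\\d+)\"", RegexOptions.Compiled);

    // Pure helper: pulls the article ID out of already-downloaded HTML.
    public static int ExtractArticleId(string html)
    {
        var match = ArticleIdPattern.Match(html);
        if (!match.Success)
        {
            throw new InvalidOperationException("data-article-id was not found in the page.");
        }

        return int.Parse(match.Groups["id"].Value);
    }

    // Downloads the preview page with HttpClient, then extracts the ID.
    public static async Task<int> GetArticleIdAsync(HttpClient http, string previewUrl)
    {
        var html = await http.GetStringAsync(previewUrl);
        return ExtractArticleId(html);
    }
}
```

Keeping the extraction separate from the download makes the regular expression easy to check against a saved copy of the preview page.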

The end justifies the means. - Niccolo Machiavelli

You've got the article ID of your post. Let's move on.

Dev.To API Document – Open API

Dev.To is a blogging platform for developer communities. Blog posts covering a broad range of development topics are published there every day. It also provides well-documented APIs to publish and manage blog posts. On the documentation page you can also find the Open API document, from which you can quickly build a wrapper SDK.

Wrapper SDK Generation with AutoRest

As long as you've got an Open API document, generating an SDK is a piece of cake with AutoRest. I created a .NET SDK with it, setting the namespace to Aliencube.Forem.DevTo and the output directory to output. The --v3 option at the end of the command indicates that the Open API document conforms to the v3 spec version.

AutoRest generates SDKs not only for .NET but also for Go, Java, Python, Node.js, TypeScript, Ruby and PHP, so you can generate the SDK in the language of your choice. The wrapper SDK repository can be found at:

aliencube / forem-sdk

API wrapper for https://dev.to in C#, Go, Java, Python, node.js, TypeScript, Ruby and PHP

Forem/Dev.To SDK

This is an UNOFFICIAL wrapper package of the Forem/DevTo API used for https://dev.to, using AutoRest.

Acknowledgement

Official API Document

Generating SDK

Make sure you have AutoRest installed on your machine.

npm install -g autorest

Run the following command to generate SDK.

autorest config-file.yaml --input-file=forem.swagger-<version>.json

Getting Started

  • .NET SDK
  • 🔲 JavaScript SDK: TBD
  • 🔲 Python SDK: TBD
  • 🔲 Java SDK: TBD
  • 🔲 Go SDK: TBD
  • 🔲 PHP SDK: TBD

TO-DO List

Package Status Version
.NET SDK on NuGet
🔲 JavaScript SDK on npm
🔲 Python SDK on PyPI
🔲 Java SDK on Maven
🔲 Go

Blog Post Markdown Document Download

To use the API, you need to have an API key, of course. In the account settings page, generate a new API key.

Dev.To API Key

Then, use the wrapper SDK generated above with that API key, and you'll get the markdown document of your post.

Frontmatter Update

All the blog posts published to Dev.To contain metadata called frontmatter at the top of the markdown document. The frontmatter is written in YAML and sits between two lines of three hyphens (---).

In the frontmatter, you'll see the key/value pair of published: false. Updating this value to true and saving the post means that your blog post will be published. Therefore, all you need to do is update that value in the frontmatter area. The first step is to extract the frontmatter from the markdown document.

The frontmatter string needs to be deserialised to a strongly-typed FrontMatter instance, using the YamlDotNet library. Then, change the Published value to true.

Once the frontmatter instance is updated, serialise it again and concatenate it with the existing markdown body.
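The extract, update and concatenate steps can be sketched like this. Note that this sketch deliberately swaps the YamlDotNet deserialisation for a plain regular expression so that it has no package dependency; the class name FrontMatterEditor and the patterns are my assumptions, and it assumes the document starts with a frontmatter block delimited by --- lines.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; a regex-based stand-in for the YamlDotNet approach.
public static class FrontMatterEditor
{
    // Splits the document into the YAML between the first pair of '---' lines
    // and the markdown body that follows.
    private static readonly Regex Document = new Regex(
        @"\A---\r?\n(?<yaml>.*?)\r?\n---\r?\n(?<body>.*)\z",
        RegexOptions.Singleline);

    // Matches a whole 'published: false' line inside the frontmatter.
    private static readonly Regex PublishedFalse = new Regex(
        @"^published:[ \t]*false[ \t]*(?=\r?$)",
        RegexOptions.Multiline);

    public static string MarkAsPublished(string markdown)
    {
        var match = Document.Match(markdown);
        if (!match.Success)
        {
            throw new FormatException("No frontmatter block found at the top of the document.");
        }

        // Only the frontmatter is touched; the body is copied through unchanged.
        var yaml = PublishedFalse.Replace(match.Groups["yaml"].Value, "published: true", 1);

        // The delimiter lines are written back with '\n' line endings.
        return "---\n" + yaml + "\n---\n" + match.Groups["body"].Value;
    }
}
```

Deserialising to a strongly-typed object, as the post does, is the more robust choice when you need to read or change more than one field; the regex version is only meant to show the shape of the transformation.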

Blog Post Markdown Document Update

Now, make another API call with this updated markdown document, and your post will be published.
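For illustration, here is roughly what that call looks like without the generated SDK. This is a sketch under assumptions: it uses a raw HttpClient request instead of the AutoRest wrapper, the class and method names are hypothetical, and the endpoint (PUT /api/articles/{id}), the api-key header and the article.body_markdown body shape reflect my reading of the Dev.To API document, so check them against the current documentation.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper; the post itself uses the AutoRest-generated SDK instead.
public static class DevToRequests
{
    // Builds (but does not send) the update request for one article.
    public static HttpRequestMessage BuildUpdateRequest(int articleId, string apiKey, string markdown)
    {
        // Body shape assumed here: { "article": { "body_markdown": "..." } }
        var payload = JsonSerializer.Serialize(new
        {
            article = new { body_markdown = markdown }
        });

        var request = new HttpRequestMessage(
            HttpMethod.Put, $"https://dev.to/api/articles/{articleId}")
        {
            Content = new StringContent(payload, Encoding.UTF8, "application/json")
        };

        // The API key generated on the account settings page.
        request.Headers.Add("api-key", apiKey);
        return request;
    }
}
```

Sending it is then a matter of passing the request to HttpClient.SendAsync and checking the response status code.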

This is how your Dev.To blog post is published via their API. Let's move on to the scheduling part.

Azure Durable Functions for Scheduling

It helps to understand that Azure Durable Functions is a combination of three kinds of functions: the API endpoint function (or durable client function), the orchestrator function and the activity function. Each plays its own role in the following workflow.

  1. The API endpoint function accepts the API requests. It then calls the orchestrator function to manage the entire workflow and returns a response with the 202 status code.
  2. The orchestrator function controls when and how activity functions are called, and aggregates their state.
  3. Individual activity functions do their jobs and share the result with the orchestrator function.

Azure Durable Functions Workflow

The orchestrator function also offers a timer as one of its ways of controlling activity functions. This timer is what makes scheduling possible. In other words, we save the blog post unpublished first, then schedule its publication by setting a timer.

API Endpoint Function

The endpoint function is the only type exposed to the outside. It's basically the same as an HTTP trigger function, but it has an additional parameter with the durable function binding.

What does it do, by the way?

  1. The function accepts API requests from outside, with a request payload. In this post, the request payload is a JSON object that includes a schedule value, which should follow the ISO 8601 format (e.g. 2021-01-20T07:30:00+09:00).

  2. Deserialise the request payload.

  3. Start a new orchestrator function instance and pass the request payload to it.

  4. As the orchestrator function works asynchronously, the endpoint function responds with the HTTP status code of 202.
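The four steps above can be sketched as below. Treat it as an illustration under assumptions rather than the repository's code: it assumes the in-process Durable Functions 2.x extension (Microsoft.Azure.WebJobs.Extensions.DurableTask), and the payload class SchedulingRequest, its previewUri property, the route and the function names SetSchedule and ScheduleOrchestrator are all hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Newtonsoft.Json;

// Hypothetical payload shape; the property names are assumptions.
public class SchedulingRequest
{
    [JsonProperty("previewUri")]
    public string PreviewUri { get; set; }

    // Parsed from the ISO 8601 string, e.g. 2021-01-20T07:30:00+09:00.
    [JsonProperty("schedule")]
    public DateTimeOffset Schedule { get; set; }
}

public static class SchedulingHttpTrigger
{
    [FunctionName("SetSchedule")]
    public static async Task<IActionResult> SetSchedule(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "schedules")] HttpRequest req,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        // Steps 1 and 2: read and deserialise the request payload.
        string json;
        using (var reader = new StreamReader(req.Body))
        {
            json = await reader.ReadToEndAsync();
        }

        var payload = JsonConvert.DeserializeObject<SchedulingRequest>(json);

        // Step 3: start a new orchestration instance with the payload as its input.
        var instanceId = Guid.NewGuid().ToString("N");
        await starter.StartNewAsync("ScheduleOrchestrator", instanceId, payload);

        // Step 4: respond with 202 Accepted, including the status query URLs.
        return starter.CreateCheckStatusResponse(req, instanceId);
    }
}
```

The [DurableClient] parameter is the only difference from a plain HTTP trigger function; everything else is ordinary request handling.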

Orchestrator Function

The orchestrator function takes care of the entire workflow. Through its binding, it receives an IDurableOrchestrationContext instance, which knows the request payload passed from the endpoint function.

Activate a timer, using the schedule from the request payload.

Once the timer is activated, the orchestrator function is suspended until the timer expires. Once the timer expires, the orchestrator function resumes and calls the activity function.

Finally, it returns the result aggregated from the activity function.
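Put together, the orchestrator might look like the sketch below. As before, this assumes the in-process Durable Functions 2.x extension; the SchedulingRequest class, the GetFireTimeUtc helper and the function names ScheduleOrchestrator and PublishArticle are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

// Hypothetical payload shape; the property names are assumptions.
public class SchedulingRequest
{
    public string PreviewUri { get; set; }
    public DateTimeOffset Schedule { get; set; }
}

public static class SchedulingOrchestrator
{
    // Durable timers are expressed in UTC, so the offset is normalised here.
    public static DateTime GetFireTimeUtc(SchedulingRequest input)
    {
        return input.Schedule.UtcDateTime;
    }

    [FunctionName("ScheduleOrchestrator")]
    public static async Task<string> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // The request payload passed from the endpoint function.
        var input = context.GetInput<SchedulingRequest>();

        // The orchestrator is suspended here until the timer expires.
        await context.CreateTimer(GetFireTimeUtc(input), CancellationToken.None);

        // Once the timer has fired, hand the work over to the activity function.
        var result = await context.CallActivityAsync<string>("PublishArticle", input);

        // Return the result aggregated from the activity function.
        return result;
    }
}
```

Orchestrator code is replayed by the runtime, so it should stay deterministic: the fire time comes from the input payload rather than from the current system clock.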

Activity Function

While neither the endpoint function nor the orchestrator function looks after the blog post itself, the activity function does all the actual work, including web page scraping, the Dev.To API calls and the markdown document update. It receives its input through the activity trigger binding.

Add the code for scraping, the API calls and the markdown update mentioned above.

Finally, it returns the result.


So far, we've walked through implementing an Azure Durable Functions app that schedules articles for publication on the Dev.To platform. Along the way, I hope you've picked up the core workflow of Azure Durable Functions: API request, orchestration and individual activities. The power of Durable Functions is that it overcomes the limitations of stateless functions by storing state. I hope you feel this power and convenience, too.

Discussion (2)

Shaiju T

Nice 😄, Questions.

  1. Why there is a need to do auto publish to Dev.to, any use case ?
  2. To schedule we have to pass time parameter to Azure Functions API , right ?

Justin Yoo (Author)

@shaijut Good question!

  1. There are many use cases to publish articles to dev.to automatically. This post is the typical example–I want to publish my post on a specific date/time, rather than immediately.

  2. That's correct. You should pass a timestamp for scheduling.