There's a tool called PublishToDev, built by one of my colleagues, Todd, which schedules articles for publication on Dev.To. It's super useful because I can schedule my posts for whenever I want them published. As soon as I saw this tool, I wanted to clone it in .NET because it would be good practice for:
- Scraping web pages using either Puppeteer or Playwright,
- Using the Dev.To API,
- Handling the frontmatter programmatically, and
- Writing an Azure Durable Functions app
Let's walk through how I made it.
You can find the entire source code of this application at this GitHub repository:
DevTo Article Publish Scheduler
This is a scheduler that publishes an article to https://dev.to at a given date and time.
To use this scheduler, 1) deploy the app then 2) send an HTTP API request to the app by following the instructions.
To deploy this app onto Azure, you need a few things beforehand.
- API key for dev.to: Go to https://dev.to/settings/account and generate a new API key.
- Azure credentials: Use your existing Azure account or create a free account if you don't have one. Then run the following Azure CLI command to get your Azure credentials:
  az ad sp create-for-rbac --name "<service_principal_name>" --sdk-auth --role contributor
- Azure resource group: Use your existing Azure resource group or create a new one by running the Azure CLI command:
  az group create -n "<resource_group_name>" -l "<location>…
Once you write a blog post on Dev.To, you can get a preview URL before publishing it. The preview URL has the format of
https://dev.to/<username>/xxxx-****-temp-slug-xxxx?preview=xxxx. All you need from the preview page is the article ID, which is found on the HTML element that has the data-article-id attribute.
As the picture above shows, the value of the data-article-id attribute is the article ID.
Using either Puppeteer or Playwright to scrape a web page is super simple. Both have their own .NET ports, Puppeteer Sharp and Playwright Sharp respectively. Unfortunately, they don't work on Azure Functions. More precisely, they work in your local dev environment, but not on an Azure instance. This post would be useful for your node.js Azure Functions app, but it's not that helpful for your .NET application. I'll keep looking for a way to make them work correctly on an Azure Functions instance.
Therefore, I had to fall back to the traditional way of scraping, using HttpClient and regular expressions.
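As a minimal sketch of that approach, the snippet below downloads the preview page with HttpClient and pulls the data-article-id value out with a regular expression. The class and method names are hypothetical, and the pattern assumes the attribute value is written as plain digits in double quotes; it is not the original implementation.

```csharp
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class PreviewPageScraper
{
    // Assumption: the attribute appears as data-article-id="<digits>".
    private static readonly Regex ArticleIdPattern =
        new Regex("data-article-id=\"(?<id>\\d+)\"", RegexOptions.Compiled);

    // Pure helper: returns the article ID found in the HTML, or null if there is none.
    public static int? ExtractArticleId(string html)
    {
        var match = ArticleIdPattern.Match(html);
        return match.Success ? int.Parse(match.Groups["id"].Value) : (int?)null;
    }

    // Downloads the preview page and extracts the article ID from its HTML.
    public static async Task<int?> GetArticleIdAsync(HttpClient http, Uri previewUri)
    {
        var html = await http.GetStringAsync(previewUri);
        return ExtractArticleId(html);
    }
}
```

Keeping the regular expression in a pure helper means the parsing can be checked against a saved HTML string without calling Dev.To at all.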
You've got the article ID of your post. Let's move on.
Dev.To is a blog platform for developer communities. Dozens of blog posts covering a broad range of development topics are published every day. It also provides APIs to publish and manage blog posts, and those APIs are well documented. Within the documentation page, you can also find the Open API document, with which you can build a wrapper SDK almost instantly.
As long as you've got an Open API document, generating an SDK with AutoRest is a piece of cake. I created a .NET SDK with the following command. I set the namespace to Aliencube.Forem.DevTo and the output directory to output. The last option, --v3, indicates that the Open API document conforms to the v3 spec version.
AutoRest generates SDKs not only in .NET but also in Go, Java, Python, node.js, TypeScript, Ruby and PHP. Therefore, you can generate the SDK in the language of your choice. The wrapper SDK repository can be found at:
API wrapper for https://dev.to in C#, Go, Java, Python, node.js, TypeScript, Ruby and PHP
- This is a wrapper package of the APIs offered by Forem/DevTo, using AutoRest.
- This is NOT affiliated with Forem/DevTo by any means.
Official API Document
- Official API document: https://docs.forem.com/api/
- The document page provides an Open API specification complying with v3.0.3.
- Current API version is
Make sure you have AutoRest installed on your machine.
npm install -g autorest
Run the following command to generate SDK.
autorest config-file.yaml --input-file=forem.swagger-<version>.json
🔲Python SDK: TBD
🔲Java SDK: TBD
🔲Go SDK: TBD
🔲PHP SDK: TBD
To use the API, you need to have an API key, of course. In the account settings page, generate a new API key.
Then, use the wrapper SDK generated above, and you'll get the markdown document.
All the blog posts published to Dev.To contain metadata called frontmatter at the top of the markdown document. The frontmatter is written in YAML. Your blog post markdown might look like:
In the frontmatter, you'll see the key/value pair of published: false. Updating this value to true and saving the post means that your blog post will be published. Therefore, all you need to do is update that value in the frontmatter area. The first step is to extract the frontmatter from the markdown document.
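The extraction step can be sketched as follows, assuming the frontmatter is the block between the first two --- delimiter lines at the top of the document. The MarkdownSplitter name is hypothetical, and this is an illustration rather than the original code.

```csharp
using System;

public static class MarkdownSplitter
{
    private const string Delimiter = "---";

    // Splits a markdown document into its YAML frontmatter and its body.
    // Returns an empty frontmatter when the document has no leading --- block.
    public static (string FrontMatter, string Body) Split(string markdown)
    {
        if (!markdown.StartsWith(Delimiter, StringComparison.Ordinal))
        {
            return (string.Empty, markdown);
        }

        // Look for the closing delimiter at the start of a later line.
        var end = markdown.IndexOf("\n" + Delimiter, Delimiter.Length, StringComparison.Ordinal);
        if (end < 0)
        {
            return (string.Empty, markdown);
        }

        var frontMatter = markdown.Substring(Delimiter.Length, end - Delimiter.Length).Trim();
        var body = markdown.Substring(end + 1 + Delimiter.Length).TrimStart('\r', '\n');

        return (frontMatter, body);
    }
}
```

For a document such as "---\ntitle: Hello\npublished: false\n---\n\nBody text\n", the frontmatter part is "title: Hello\npublished: false" and the body part is "Body text\n".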
The frontmatter string needs to be deserialised to a strongly-typed FrontMatter instance, using the YamlDotNet library. Then, change the Published value to true.
Once the frontmatter instance is updated, serialise it again and concatenate it with the existing markdown body.
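To keep the example dependency-free, the sketch below swaps in a regular expression for the YamlDotNet round trip described above: it flips published: false to published: true directly in the frontmatter text, then rebuilds the document. The FrontMatterPublisher name is hypothetical, and a real implementation would deserialise to the FrontMatter type instead.

```csharp
using System.Text.RegularExpressions;

public static class FrontMatterPublisher
{
    // Matches a "published: false" entry at the start of a line.
    private static readonly Regex UnpublishedFlag =
        new Regex(@"^published:[ \t]*false\b", RegexOptions.Multiline);

    // Flips the first published flag to true; other keys are left untouched.
    public static string MarkAsPublished(string frontMatter) =>
        UnpublishedFlag.Replace(frontMatter, "published: true", 1);

    // Wraps the frontmatter in --- delimiters and puts the body back after it.
    public static string Compose(string frontMatter, string body) =>
        "---\n" + frontMatter.Trim() + "\n---\n\n" + body;
}
```

The text-level swap only touches the one flag, whereas the deserialise-and-serialise route gives you a typed object at the cost of possibly reformatting the YAML.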
Now, make another API call with this updated markdown document, and your post will be published.
This is how your Dev.To blog post is published via their API. Let's move on to the scheduling part.
It helps to understand that an Azure Durable Functions app combines three kinds of functions: the API endpoint function (or durable client function), the orchestrator function and the activity function. Each has its own role in the following scenarios.
- The API endpoint function accepts the API requests. It then calls the orchestrator function to manage the entire workflow and returns a response with the 202 status code.
- The orchestrator function controls when and how activity functions are called, and aggregates their states.
- Individual activity functions do their jobs and share the result with the orchestrator function.
The orchestrator function also offers a timer as one of its ways of controlling activity functions. With this timer, we can do the scheduling. In other words, we save the blog post as a draft first, then schedule its publication by setting a timer.
What does the API endpoint function do, by the way?
The function accepts API requests from outside, with a request payload. In this post, the request payload is a JSON object whose schedule value should follow the ISO 8601 format.
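A payload like that could be modelled as below. This is a sketch under assumptions: only the schedule field is named in this post, so the previewUri field name and the ScheduleRequest type are hypothetical. System.Text.Json reads ISO 8601 timestamps into DateTimeOffset out of the box.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class ScheduleRequest
{
    // Assumption: the preview URL of the draft post is sent under this name.
    [JsonPropertyName("previewUri")]
    public string PreviewUri { get; set; } = string.Empty;

    // ISO 8601 timestamp, for example 2021-01-20T07:30:00+09:00.
    [JsonPropertyName("schedule")]
    public DateTimeOffset Schedule { get; set; }
}

public static class ScheduleRequestParser
{
    // Deserialises the raw JSON payload into a strongly-typed request.
    public static ScheduleRequest Parse(string json) =>
        JsonSerializer.Deserialize<ScheduleRequest>(json);
}
```

Using DateTimeOffset rather than DateTime keeps the UTC offset from the payload, so a schedule written in local time still resolves to one unambiguous instant.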
Deserialise the request payload.
Create a new orchestrator function and call it with the request payload.
As the orchestrator function works asynchronously, the endpoint function responds with the HTTP status code of 202.
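Put together, the three steps above might look like the following sketch. It assumes the in-process model with the Microsoft.Azure.WebJobs.Extensions.DurableTask package; the function names, the route and the ScheduleRequest shape are hypothetical rather than taken from the original app.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;

// Hypothetical payload type; the previewUri name is an assumption.
public class ScheduleRequest
{
    public string PreviewUri { get; set; }
    public DateTimeOffset Schedule { get; set; }
}

public static class ScheduleHttpTrigger
{
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    [FunctionName("ScheduleHttpTrigger")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "schedules")] HttpRequest req,
        [DurableClient] IDurableClient client)
    {
        // 1. Deserialise the request payload.
        var input = await JsonSerializer.DeserializeAsync<ScheduleRequest>(req.Body, Options);

        // 2. Start a new orchestration and pass the payload to it.
        var instanceId = await client.StartNewAsync<ScheduleRequest>("ScheduleOrchestrator", input);

        // 3. Respond with 202 Accepted plus the status query URLs.
        return client.CreateCheckStatusResponse(req, instanceId);
    }
}
```

CreateCheckStatusResponse builds the 202 response, so the caller can poll the returned URLs while the orchestration waits for its timer.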
The orchestrator function takes care of the entire workflow. It is bound to the orchestration trigger.
The IDurableOrchestrationContext instance holds the request payload passed from the endpoint function.
Activate a timer, using the schedule from the request payload.
Once the timer is activated, the orchestrator function is suspended until the timer expires. Once the timer expires, the orchestrator function resumes and calls the activity function.
Finally, it returns the result aggregated from the activity function.
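The orchestrator steps above can be sketched as follows, again assuming the in-process Durable Functions extension. The function names ScheduleOrchestrator and PublishArticle and the ScheduleRequest type are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

// Hypothetical payload type; the previewUri name is an assumption.
public class ScheduleRequest
{
    public string PreviewUri { get; set; }
    public DateTimeOffset Schedule { get; set; }
}

public static class ScheduleOrchestrator
{
    [FunctionName("ScheduleOrchestrator")]
    public static async Task<string> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Read the payload passed from the endpoint function.
        var input = context.GetInput<ScheduleRequest>();

        // Suspend the orchestration until the scheduled time (durable timers take UTC).
        await context.CreateTimer(input.Schedule.UtcDateTime, CancellationToken.None);

        // The timer has expired: call the activity function and return its result.
        return await context.CallActivityAsync<string>("PublishArticle", input);
    }
}
```

Orchestrator code has to be deterministic because it is replayed, which is why the wait goes through context.CreateTimer instead of Task.Delay.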
While neither the endpoint function nor the orchestrator function looks after the blog post itself, the activity function does all the work, including the web page scraping, the Dev.To API call and the markdown document update. It is bound to the activity trigger.
Add the code for scraping, the API call and the markdown update mentioned above.
And, it finally returns the result.
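A bare-bones activity function could look like the sketch below. The PublishAsync helper is a hypothetical placeholder for the scraping, Dev.To API call and frontmatter update; it does no real work here, and the function and type names are assumptions as well.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

// Hypothetical payload type; the previewUri name is an assumption.
public class ScheduleRequest
{
    public string PreviewUri { get; set; }
    public DateTimeOffset Schedule { get; set; }
}

public static class PublishArticleActivity
{
    [FunctionName("PublishArticle")]
    public static async Task<string> Run(
        [ActivityTrigger] IDurableActivityContext context)
    {
        // Read the payload handed over by the orchestrator function.
        var input = context.GetInput<ScheduleRequest>();

        // Do the actual work, then return the result to the orchestrator.
        return await PublishAsync(input);
    }

    // Placeholder only: a real version would scrape the article ID, fetch the
    // markdown through the Dev.To API, set published to true and save it.
    private static Task<string> PublishAsync(ScheduleRequest input) =>
        Task.FromResult("Processed " + input.PreviewUri);
}
```

Unlike the orchestrator, the activity function is free to perform I/O such as HTTP calls, so all the network work belongs here.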
So far, we've walked through implementing an Azure Durable Functions app that schedules articles for publication on the Dev.To platform. Along the way, I hope you've picked up the core workflow of Azure Durable Functions: API request, orchestration and individual activities. The power of Durable Functions is that it overcomes the limitations of stateless functions by storing state. I hope you feel this power and convenience, too.