Steve Layton

Interacting with the Dev.to Article API

ATLG Sidebar

Uhh OK

Hey all! If this is your first time, welcome! If not, welcome back! This week I'm starting a new "series" here on Dev.to. I decided to try to flex my muscle memory and write a small utility that interacts with an API, in this case the dev.to/api/articles endpoint. This dovetails into another side project I want to start poking around with, and I need a chunk of data on hand for that (we'll come back to it in a different sidebar post if it ever comes together).


Code Walkthrough

package main

import (
  "encoding/json"
  "fmt"
  "io/ioutil"
  "net/http"
  "sync"
  "time"
)

If you write Go, JSON-to-Go is going to be your friend. When we need to unmarshal JSON we can simply paste in a sample and it will give us a struct. Nice. We don't actually need the entire struct; we could have kept only ID int32 `json:"id"`, as it's the only field we use. I've included the entire thing for now since we may use it in the future.

// Articles array JSON struct
type Articles []struct {
  TypeOf                 string    `json:"type_of"`
  ID                     int32     `json:"id"`
  Title                  string    `json:"title"`
  Description            string    `json:"description"`
  CoverImage             string    `json:"cover_image"`
  PublishedAt            time.Time `json:"published_at"`
  TagList                []string  `json:"tag_list"`
  Slug                   string    `json:"slug"`
  Path                   string    `json:"path"`
  URL                    string    `json:"url"`
  CanonicalURL           string    `json:"canonical_url"`
  CommentsCount          int       `json:"comments_count"`
  PositiveReactionsCount int       `json:"positive_reactions_count"`
  User                   struct {
    Name            string      `json:"name"`
    Username        string      `json:"username"`
    TwitterUsername string      `json:"twitter_username"`
    GithubUsername  interface{} `json:"github_username"`
    WebsiteURL      string      `json:"website_url"`
    ProfileImage    string      `json:"profile_image"`
    ProfileImage90  string      `json:"profile_image_90"`
  } `json:"user"`
}
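
As a minimal sketch of the "ID only" variant mentioned above: the type articleID, the helper parseIDs, and the sample payload are all made up for illustration and are not part of the original utility. Fields that are missing from the struct are simply ignored by encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// articleID is a hypothetical trimmed-down type: only the id field is decoded,
// every other field in the response is skipped.
type articleID struct {
	ID int32 `json:"id"`
}

// parseIDs extracts the article IDs from one page of the articles listing.
func parseIDs(body []byte) ([]int32, error) {
	var page []articleID
	if err := json.Unmarshal(body, &page); err != nil {
		return nil, err
	}
	ids := make([]int32, 0, len(page))
	for _, a := range page {
		ids = append(ids, a.ID)
	}
	return ids, nil
}

func main() {
	// Made-up sample payload, shaped like the listing response.
	sample := []byte(`[{"id": 101, "title": "First"}, {"id": 102, "title": "Second"}]`)
	ids, err := parseIDs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // prints [101 102]
}
```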

I started to lay this out as if we were going to use it as an importable package. It could have been done mostly inline in main() but maybe we'll flesh this out into a full DEV API client. I'll have to take a look at the v0 API a bit closer and see what is actually supported.

// DevtoClient struct
type DevtoClient struct {
  DevtoAPIURL string
  Client      *http.Client
}

// New returns our DevtoClient
func New(apiurl string, client *http.Client) *DevtoClient {
  if client == nil {
    client = http.DefaultClient
  }
  return &DevtoClient{
    apiurl,
    client,
  }
}

Formatting our requests this way might be overkill but, for a start, it puts us in a good spot. The requests can take a few different parameters but we aren't concerned with any of those. At least not this time around, we only want the articles themselves.

// FormatPagedRequest returns *http.Request ready to do() to get one page
func (dtc DevtoClient) FormatPagedRequest(param, paramValue string) (r *http.Request, err error) {
  URL := dtc.DevtoAPIURL + "articles/?" + param + "=" + paramValue
  fmt.Printf("%v\n", URL)
  r, err = http.NewRequest(http.MethodGet, URL, nil)
  if err != nil {
    return nil, err
  }
  return r, nil
}

// FormatArticleRequest returns http.Request ready to do() and get an article
func (dtc DevtoClient) FormatArticleRequest(i int32) (r *http.Request, err error) {
  URL := fmt.Sprintf(dtc.DevtoAPIURL+"articles/%d", i)
  r, err = http.NewRequest(http.MethodGet, URL, nil)
  if err != nil {
    return nil, err
  }
  return r, nil
}
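
If more parameters get added later, string concatenation gets fiddly because values need escaping. One possible alternative, sketched here with net/url, lets url.Values do the encoding. buildArticlesURL is a hypothetical helper, not part of the code above, and the tag parameter is only an assumed example of an extra parameter. Note this version builds articles?... without the trailing slash used in FormatPagedRequest.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildArticlesURL is a hypothetical helper that appends escaped query
// parameters to the articles endpoint under the given base URL.
func buildArticlesURL(base string, params map[string]string) (string, error) {
	u, err := url.Parse(base + "articles")
	if err != nil {
		return "", err
	}
	q := u.Query()
	for k, v := range params {
		q.Set(k, v)
	}
	// Encode escapes the values and sorts the parameters by key.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// "tag" is an assumed parameter name, used only to show a second value.
	s, err := buildArticlesURL("https://dev.to/api/", map[string]string{"page": "2", "tag": "go"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints https://dev.to/api/articles?page=2&tag=go
}
```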

This time around I am experimenting with sync.WaitGroup. A WaitGroup allows us to kick off a series of goroutines and wait for them to finish before moving on with the code. We'll see further on in the code where getArticle() executes as a goroutine. It is what actually gets the article from the API and writes it to disk. This way we grab one set of 30 article IDs, start fetching the articles as we walk through those IDs, and only move on to the next set once we've received them all.

func getArticle(dtc *DevtoClient, i int32, wg *sync.WaitGroup) {
  defer wg.Done()
  r, err := dtc.FormatArticleRequest(i)
  if err != nil {
    panic(err)
  }

  resp, err := dtc.Client.Do(r)
  if err != nil {
    panic(err)
  }
  defer resp.Body.Close()

  body, err := ioutil.ReadAll(resp.Body)
  if err != nil {
    panic(err)
  }
  fileName := fmt.Sprintf("%d.json", i)
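  // The ./out directory must already exist; stop if the write fails.
  if err := ioutil.WriteFile("./out/"+fileName, body, 0666); err != nil {
    panic(err)
  }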
}
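
To see the WaitGroup pattern on its own, here is a small network-free sketch. doubleAll is a made-up stand-in for "do one piece of work per article ID"; it is not the fetch logic from getArticle(). Each goroutine writes only to its own slot in the results slice, so no extra locking is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// doubleAll is a hypothetical stand-in for one page of work: it starts one
// goroutine per ID and returns only after every goroutine has finished.
func doubleAll(ids []int32) []int32 {
	results := make([]int32, len(ids))

	var wg sync.WaitGroup
	wg.Add(len(ids)) // one count per goroutine we are about to start

	for i, id := range ids {
		go func(i int, id int32) {
			defer wg.Done() // subtract one when this goroutine returns
			results[i] = id * 2
		}(i, id)
	}

	wg.Wait() // blocks until the counter is back to zero
	return results
}

func main() {
	fmt.Println(doubleAll([]int32{1, 2, 3})) // prints [2 4 6]
}
```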

main() is straightforward enough. We create our client, using http.DefaultClient. We've provided the ability to use an alternate configuration if we need it in the future. doit and c will be controlling our for loop and the main body of the program.

func main() {
  dtc := New("https://dev.to/api/", nil)
  doit := true
  c := 1

On each pass through the loop, we get a single page of articles. We then set up our WaitGroup and our articles variable. Once we have unmarshalled the articles JSON we get the length of that array. That length tells the WaitGroup how many goroutines to wait for. Note that we call defer wg.Done() as the first line of getArticle(). This subtracts one from the WaitGroup counter when the function returns, allowing us to move on once every article is finished. The current Dev.to article API returns an empty array, [], when there is no data for a page. We check to see if we have that as a response and stop if so.

  for doit {
    req, err := dtc.FormatPagedRequest("page", fmt.Sprintf("%d", c))
    if err != nil {
      panic(err)
    }
    resp, err := dtc.Client.Do(req)
    if err != nil {
      panic(err)
    }
    body, err := ioutil.ReadAll(resp.Body)
    // Close the body now: a defer inside this loop would not run until main() returns.
    resp.Body.Close()
    if err != nil {
      panic(err)
    }

    var wg sync.WaitGroup
    var articles Articles

    json.Unmarshal(body, &articles)
    wg.Add(len(articles))

    for i := range articles {
      go getArticle(dtc, articles[i].ID, &wg)
    }
    wg.Wait()

    if string(body) != "[]" {
      c++
      continue
    }
    doit = false
  }
}
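
Comparing the raw body to the string "[]" works as long as the API sends exactly those two bytes. A slightly more forgiving sketch, assuming nothing beyond "the last page is an empty JSON array", decodes the body and checks the length instead. isLastPage is a hypothetical helper and is not part of the code above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isLastPage is a hypothetical helper: it reports whether a page body decodes
// to an empty JSON array, regardless of surrounding whitespace.
func isLastPage(body []byte) (bool, error) {
	// json.RawMessage leaves each element undecoded; we only need the count.
	var page []json.RawMessage
	if err := json.Unmarshal(body, &page); err != nil {
		return false, err
	}
	return len(page) == 0, nil
}

func main() {
	last, err := isLastPage([]byte("[ ]\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println(last) // prints true

	last, err = isLastPage([]byte(`[{"id": 1}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(last) // prints false
}
```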

Wrapping Up

There we go, the first "sidebar"! I'll probably be adding the code for this into the main "Attempting To Learn Go" repo over on GitHub. Now that I think about it, I still need to post last week's over there!

Have you done any API work in Go? Anything you would do a bit differently? Let me know in the comments below. Constructive comments are always welcome!


You can find the code for this and most of the other Attempting to Learn Go posts in the repo on GitHub.



Top comments (8)

Benjamin Cable • Edited

Hey there!

I'll point out a couple things, firstly, in the following snippet:

// FormatArticleRequest returns http.Request ready to do() and get an article
func (dtc DevtoClient) FormatArticleRequest(i int32) (r *http.Request, err error) {
  URL := fmt.Sprintf(dtc.DevtoAPIURL+"articles/%d", i)
  r, err = http.NewRequest(http.MethodGet, URL, nil)
  if err != nil {
    return nil, err
  }
  return r, nil
}

You are returning a *http.Request and an error which means you can simplify it like this:

// FormatArticleRequest returns http.Request ready to do() and get an article
func (dtc DevtoClient) FormatArticleRequest(i int32) (r *http.Request, err error) {
  URL := fmt.Sprintf(dtc.DevtoAPIURL+"articles/%d", i)
  return http.NewRequest(http.MethodGet, URL, nil)
}

Later on in your worker, you are forgetting to check the errors on json.Unmarshal(body, &articles)

Other than that, good job on parameterizing the HTTP client instance. Were I to use your API client, I would certainly love to set my own timeouts etc. so this is the right way to go!

Good job!

Steve Layton

Thanks for the comment! Ungh! I keep forgetting that I can shorten the returns when I do it that way. One of these days! Good catch on the json.Unmarshal, not sure how that snuck through. I'll make sure to fix that up before putting the code on GitHub.

James Turner • Edited

Cool, another post about the Dev.to API! 🙂

Correct me if I am wrong though but that will go over every single page till it runs out of data right? If so, you might want to be careful actually running that as there would be hundreds of pages and you're grabbing the contents for each article on every page. Maybe chuck a "max articles" count on there somewhere? Or maybe limit it to a specific tag?

Steve Layton

The articles API endpoint looks to only return 30 articles per page and seems to cut off around ~460 pages or so. The idea is indeed to grab all public posts as I need a good amount of text for something else I'm toying around with so, in this case, I have a rough idea of how many articles we'll be getting. However, we're only getting 30 articles worth of content in one go which should not cause any issues on the server side or locally. You definitely wouldn't want to be running this repeatedly in a short span or anything though.

James Turner

Yeah as it is still about 13,000 requests - if that ran often, it could easily hammer the site.

Steve Layton

The articles endpoint is cached as far as I know so, as is, the code shouldn't cause any significant load on Varnish. Heh. But yeah, imagine trying to simply download the articles one at a time, api/articles/1, that's what, like ~90000 hits. Yuck.

Aivan Monceller

Is there documentation for the API somewhere? I can't seem to find it anywhere. I want to retrieve my articles.

Steve Layton

There is no documentation as of yet, I pulled the details out of the code itself on GitHub. If all you need is a dump of your own articles you can request an emailed export at the bottom of the settings page, dev.to/settings/misc.