jeikabu

Posted on Jul 5, 2021 • Edited on Jul 7, 2021 • Originally published at rendered-obsolete.github.io

Cross-posting to Tumblr with AWS Lambda

#rust #serverless #productivity #cli

I really enjoyed making a CLI tool to automate cross-posting blog posts to different places, so I started looking for another platform to add, and Tumblr seemed like a good candidate. Again, rough code is on GitHub.

Tumblr API

The Tumblr API is well documented. But before you get started you need to register an "application". For "Default callback URL" put any valid URL; you can always edit this later (via Account > Settings > Apps > ✎). This will get you an OAuth "consumer" key and secret (sometimes referred to as "client" credentials).

Some parts of the Tumblr API can be called with just the consumer key. For example, to retrieve published posts:

curl -G https://api.tumblr.com/v2/blog/{blog-identifier}/posts?api_key={oauth_consumer_key}
# PowerShell Example:
Invoke-WebRequest https://api.tumblr.com/v2/blog/rendered-obsolete/posts?api_key=xxx

But most of the interesting ones, like creating a new post, require OAuth.

OAuth

Tumblr's docs are a bit light on OAuth 1.0a details, but their implementation closely resembles Twitter's and there's also an RFC. Twitter calls it "3-legged" OAuth authentication, the RFC calls it "redirection-based" authentication. Whatever you call it, the gist is:

1. POST https://www.tumblr.com/oauth/request_token signed with client credentials (i.e. consumer key/secret) -> receive "temporary credentials"
2. Direct the user to the https://www.tumblr.com/oauth/authorize website for "resource owner" approval
3. Once they approve access, Tumblr redirects them to our callback URL -> callback receives "verifier"
4. GET https://www.tumblr.com/oauth/access_token signed with client credentials, temporary credentials, and the verifier -> receive user credentials (i.e. token and secret)

First, I'll go over the CLI client changes, then we'll deal with the callback since it's an excuse to write an AWS Lambda in Rust again. The final resulting user token/secret can be re-used, so this only needs to be done once per account, thankfully.

Client

I initially implemented the gory details of OAuth 1.0a using percent-encoding for string encoding along with hmac and sha-1 for the required signing. But in the end I settled on oauth1-request since it made the client much simpler.
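To give a sense of what those gory details involve, here is a minimal sketch of the string handling from RFC 5849 (sections 3.4.1, 3.4.2, and 3.6). It is not the code from the original implementation, nor what oauth1-request does internally. The function names are hypothetical, and the HMAC-SHA1 step itself is left out (it would run over the base string using the signing key).

```rust
/// Percent-encode per RFC 5849 section 3.6: every byte except
/// ALPHA / DIGIT / "-" / "." / "_" / "~" becomes an uppercase `%XX`.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the signature base string (RFC 5849 section 3.4.1).
/// `params` holds the OAuth protocol parameters plus any query/body
/// parameters, excluding `oauth_signature` itself.
fn signature_base_string(method: &str, base_uri: &str, params: &[(&str, &str)]) -> String {
    let mut encoded: Vec<(String, String)> = params
        .iter()
        .map(|(k, v)| (percent_encode(k), percent_encode(v)))
        .collect();
    // Sort by encoded name, then by encoded value
    encoded.sort();
    let normalized = encoded
        .iter()
        .map(|(k, v)| format!("{}={}", k, v))
        .collect::<Vec<_>>()
        .join("&");
    format!(
        "{}&{}&{}",
        method.to_uppercase(),
        percent_encode(base_uri),
        percent_encode(&normalized)
    )
}

/// HMAC key (RFC 5849 section 3.4.2): encoded consumer secret, `&`, encoded token secret.
/// The token secret is empty when requesting temporary credentials.
fn signing_key(consumer_secret: &str, token_secret: &str) -> String {
    format!(
        "{}&{}",
        percent_encode(consumer_secret),
        percent_encode(token_secret)
    )
}
```

Getting any of the encoding or ordering wrong yields a signature the server rejects, which is a good argument for leaning on a crate.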

First, use client/consumer credentials to obtain temporary credentials:

let uri = format!("{}/oauth/request_token", WWW);
let client_credentials =
    oauth1_request::Credentials::new(&self.consumer_key, &self.consumer_secret);
// Sign request using only client/consumer credentials
let auth_header =
    oauth1_request::Builder::<_, _>::new(client_credentials, oauth1_request::HmacSha1)
        // Can optionally specify callback URL here, otherwise Tumblr will use the application default
        //.callback("https://callback_url")
        .post(uri.clone(), &());
let resp = self
    .client
    .post(uri)
    .header(reqwest::header::AUTHORIZATION, auth_header)
    .send()
    .await?;

let resp_body = resp.text().await?;
// Parse `key0=value0&key1=value1&...` in response body for temporary credentials
let mut resp_body_pairs = resp_body
    .split('&')
    .map(|pair| pair.split_once('='))
    .flatten();
let temp_token = get_value(&mut resp_body_pairs, "oauth_token")?.to_owned();
let temp_token_secret = get_value(&mut resp_body_pairs, "oauth_token_secret")?.to_owned();
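The `get_value` helper isn't shown above. As a rough idea of the parsing step, here is a self-contained sketch that collects the pairs into a map instead of walking an iterator. `parse_form_body` and `require` are hypothetical names, not the helper used in the actual client, and values are not percent-decoded here.

```rust
use std::collections::HashMap;

/// Parse a `key0=value0&key1=value1` body into a map.
/// Pairs without `=` are skipped; values are left as-is (no percent-decoding).
fn parse_form_body(body: &str) -> HashMap<&str, &str> {
    body.split('&')
        .filter_map(|pair| pair.split_once('='))
        .collect()
}

/// Look up a required key, or describe which one is missing.
fn require<'a>(pairs: &HashMap<&'a str, &'a str>, key: &str) -> Result<&'a str, String> {
    pairs
        .get(key)
        .copied()
        .ok_or_else(|| format!("response is missing `{}`", key))
}
```

With a map, lookups don't depend on the order in which the server returns the keys.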

Now we have the "temporary credentials". We still need the oauth_verifier that will be sent to our callback URL. To receive this from our callback we'll use AWS Simple Queue Service (SQS).

To interact with AWS from Rust there's the recently announced official AWS SDK for Rust, which requires AWS access credentials. To obtain them, in the AWS console click $YOUR_NAME (upper-right) > My Security Credentials > Access keys > Create New Access Key (keep these safe). Then set the following environment variables (PowerShell syntax shown):

$Env:AWS_ACCESS_KEY_ID="xxx"
$Env:AWS_SECRET_ACCESS_KEY="yyy"
$Env:AWS_DEFAULT_REGION="us-east-1"

Now, continuing with the client:

// Create temporary SQS queue `bullhorn-{temporary token}` to receive oauth_verifier from lambda
let queue_name = format!("bullhorn-{}", temp_token);
let client = aws_sqs::Client::from_env();
let output = client.create_queue().queue_name(queue_name).send().await?;
let queue_url = output.queue_url.unwrap();

// Show "resource owner" approval website in system default web browser
let query = format!("{}/oauth/authorize?oauth_token={}", WWW, temp_token);
let exit_status = open::that(query)?;

The "open" crate makes it easy to show the Tumblr resource owner approval page in a browser. While the user is approving our Tumblr application, we wait for our callback to send the oauth verifier via SQS:

// Receive oauth_verifier from lambda via SQS
let messages = loop {
    let output = client
        .receive_message()
        .queue_url(&queue_url)
        .send()
        .await?;
    if let Some(msgs) = output.messages {
        if !msgs.is_empty() {
            break msgs;
        }
    }
};
// Delete the temporary SQS queue
let _ = client.delete_queue().queue_url(queue_url).send().await?;
let verifier = messages[0].body.as_ref().unwrap();

// Exchange client/consumer and temporary credentials for user credentials
let uri = format!("{}/oauth/access_token", WWW);
let temp_credentials = oauth1_request::Credentials::new(&temp_token, &temp_token_secret);
// Must authenticate with both client and temporary credentials
let token = oauth1_request::Token::new(client_credentials, temp_credentials);
let auth_header =
    oauth1_request::Builder::<_, _>::with_token(token, oauth1_request::HmacSha1)
        // Must include `oauth_verifier`
        .verifier(verifier.as_ref())
        .get(uri.clone(), &());
let resp = self.client
    .get(uri)
    .header(reqwest::header::AUTHORIZATION, auth_header)
    .send()
    .await?;
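The receive loop above keeps polling until a message shows up, with no delay and no way to give up if the user never approves. One option is to bound the wait. The following is a synchronous, std-only sketch of that idea; `poll_until` is a hypothetical helper, and in the async client the sleep would be something like `tokio::time::sleep` instead. SQS also supports long polling via the `WaitTimeSeconds` receive parameter, which reduces the number of empty responses.

```rust
use std::time::{Duration, Instant};

/// Call `poll` until it yields `Some`, sleeping `interval` between attempts.
/// Returns `None` if `timeout` elapses first.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut poll: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = poll() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        std::thread::sleep(interval);
    }
}
```

On timeout the caller can still delete the temporary queue before reporting an error.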

The user token and secret we receive are valid until revoked (via the user's Settings > Apps), so we can store them somewhere safe to avoid having to do this again. We can now use Tumblr on the user's behalf.

Callback

That covers the important parts of the client. The remaining piece of OAuth authentication is the callback that Tumblr's resource owner approval webpage redirects to. We could leave an HTTP server running all the time, but that's a bit silly for something we only need occasionally. This looks like a job for "serverless" and AWS Lambda.

Basic Lambda

As before, we'll use the AWS Lambda Rust runtime to implement our lambda function. MUSL is used to create a fully static Rust binary.

In Cargo.toml:

# Rename binary for AWS Lambda custom runtime:
# https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html
[[bin]]
name = "bootstrap"
path = "src/main.rs"

[dependencies]
anyhow = "1.0"
aws_sqs = { git = "https://github.com/awslabs/aws-sdk-rust", tag = "v0.0.10-alpha", package = "aws-sdk-sqs" }
futures = "0.3"
lambda_runtime = "0.3"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1.5.0", features = ["full"] }
tracing = "0.1"
tracing-subscriber = "0.2"

In .cargo/config:

[build]
# Override default build target
target = "x86_64-unknown-linux-musl"

# Needed on Mac otherwise build fails with `ld: unknown option: --eh-frame-hdr`
[target.x86_64-unknown-linux-musl]
linker = "x86_64-linux-musl-gcc"

For starters, put something like the basic example in src/main.rs:

use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct Request {
    oauth_token: String,
    oauth_verifier: String,
}

#[derive(Serialize)]
struct Response {
    msg: String,
}

#[tokio::main]
async fn main() -> Result<(), lambda_runtime::Error> {
    let func = lambda_runtime::handler_fn(handler);
    lambda_runtime::run(func).await
}

async fn handler(event: Request, ctx: lambda_runtime::Context) -> anyhow::Result<Response> {
    Ok(Response{ msg: "ok".to_owned() })
}

In order to compile this lambda, we also need to build OpenSSL with musl (based on this post):

### Install pre-requisites
# For Mac/OSX: install command line tools and cross-compiler
xcode-select --install
brew install FiloSottile/musl-cross/musl-cross

rustup target add x86_64-unknown-linux-musl

### Build OpenSSL with musl
wget https://github.com/openssl/openssl/archive/OpenSSL_1_1_1f.tar.gz
tar xzf OpenSSL_1_1_1f.tar.gz
cd openssl-OpenSSL_1_1_1f
export CROSS_COMPILE=x86_64-linux-musl-
# `-DOPENSSL_NO_SECURE_MEMORY` is to avoid `define OPENSSL_SECURE_MEMORY` which needs `#include <linux/mman.h>` (which OSX doesn't have).
./Configure CFLAGS="-DOPENSSL_NO_SECURE_MEMORY -fpie -pie" no-shared no-async --prefix=output_abs_path/musl --openssldir=output_abs_path/musl/ssl linux-x86_64
make depend
# Use `sysctl -n hw.physicalcpu` or `hw.logicalcpu` with `-j` if so inclined
make
# Install required stuff to `output_abs_path/musl/` (exclude man-pages, etc.)
make install_sw
# Set value from `--prefix` above
export OPENSSL_DIR=output_abs_path/musl

### Build Rust lambda function
cargo build --release

We could now create our lambda with the AWS console, but the aws-lambda-rust-runtime docs show how to do it via the AWS CLI.

# Set AWS credentials: https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html
aws configure

# Create package with layout required for custom lambda runtime:
# https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html
# -j/--junk-paths avoids directory structure
zip --junk-paths lambda.zip ../target/x86_64-unknown-linux-musl/release/bootstrap

# Get ARN for desired existing role
aws iam list-roles

# Create new lambda function
# (`--cli-binary-format raw-in-base64-out` is needed with CLI V2)
aws lambda create-function \
    --function-name bullhorn \
    --handler doesnt.matter \
    --zip-file fileb://./lambda.zip \
    --runtime provided \
    --role arn:aws:iam::01234:role/service-role/bullhorn \
    --tracing-config Mode=Active \
    --cli-binary-format raw-in-base64-out \
    --cli-connect-timeout 10000

# Run the lambda
# (`--cli-binary-format raw-in-base64-out` is needed with CLI V2)
aws lambda invoke \
    --function-name bullhorn \
    --payload '{"oauth_token": "xxx", "oauth_verifier": "yyy"}' \
    --cli-binary-format raw-in-base64-out \
    response.json

# To update lambda binary
aws lambda update-function-code \
    --function-name bullhorn \
    --zip-file fileb://./lambda.zip \
    --cli-connect-timeout 10000

--cli-connect-timeout deals with commands failing with Error: Connection was closed before we received a valid response (see issue)
Add --region <region> for a region other than the default set with aws configure

API Gateway

To get Tumblr to call our lambda via the callback URL we use AWS API Gateway to trigger our lambda via an HTTP endpoint.

Open the lambda in the AWS console, then select + Add trigger > API Gateway > Create an API. For "API type" choose REST API ("HTTP" likely also works) and for "Security" choose Open.

From the list of triggers expand Details and note "API endpoint". This can be set as "Default callback URL" of the Tumblr application as well as the oauth_callback parameter in the client.

When http://tumblr/authorize redirects to the callback you get https://your_callback_url?oauth_token=xxx&oauth_verifier=yyy. Apparently lambdas don't accept query strings (the ?xxx=yyy bits), so we need to move the query into the body.

In the AWS console open API Gateway and /bullhorn ANY > Integration Request, deselect Use Lambda Proxy integration, and expand Mapping Templates. Select When there are no templates defined, click Add mapping template, enter application/json, and click ☑. Now there are two options:

From "Generate template" pick Method request passthrough. This will result in the input to your lambda containing the entire request:

transformations: {
"body-json" : {},
"params" : {
    "path" : {}
    ,"querystring" : {
        "oauth_verifier" : "yyy",
        "oauth_token" : "xxx"
    }
    ,"header" : {
    }
},
"stage-variables" : {
},
"context" : {
    ...

Create a simple mapping to move the query parameters into the request body:

{
    "oauth_token": "$input.params('oauth_token')",
    "oauth_verifier": "$input.params('oauth_verifier')"
}

The latter is easier. See the docs for template syntax details.

Don't forget to Save!

In API Gateway you can verify everything works: for "Method" select GET, for "Query Strings" enter oauth_token=xxx&oauth_verifier=yyy, then click Test. In the log output you'll see the original query, the mapped endpoint request body, and the output from the lambda.

Once this is working you must select Actions > Deploy API, set "Deployment stage" to default and then Deploy. Failure to do this will result in your lambda receiving the query string from the initial API Gateway configuration.

CloudWatch Logs isn't enabled by default for API Gateway. To enable logging you'll need to create a role and turn it on in API Gateway > {select API} > Stages > default > Logs/Tracing.

Final Lambda

In order to work with SQS queues our lambda needs some additional permissions:

From AWS console open IAM
Select the role used by the lambda function and expand the policy
Edit policy > Add additional permissions
- "Service" = SQS
- "Actions" = Read / GetQueueUrl and Write / SendMessage
- "Resources" = arn:aws:sqs:us-east-1:01234:bullhorn-* (the naming convention used for our SQS queues)

A working version of the lambda is just the send portion of SQS:

// Note: `Response` is updated to carry `message_id` and `sequence_number` instead of `msg`
async fn handler(event: Request, _ctx: lambda_runtime::Context) -> anyhow::Result<Response> {
    let client = aws_sqs::Client::from_env();
    // Get SQS queue based on name set by client
    let queue = client
        .get_queue_url()
        .queue_name(format!("bullhorn-{}", event.oauth_token))
        .send()
        .await?;

    // Send oauth_verifier to client so it can retrieve user token/secret
    let res = client
        .send_message()
        .message_body(event.oauth_verifier)
        .set_queue_url(queue.queue_url)
        .send()
        .await?;
    let response = Response {
        message_id: res.message_id,
        sequence_number: res.sequence_number,
    };
    Ok(response)
}

Our client will receive the verifier and complete OAuth authentication as described earlier.
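The client and the lambda only find each other because both build the same `bullhorn-{token}` queue name. A small shared helper could make that convention explicit and catch names SQS would reject (standard queue names are limited to 80 characters of alphanumerics, hyphens, and underscores). This is a sketch only; `QUEUE_PREFIX` and `queue_name` are hypothetical and not part of the actual code.

```rust
/// Prefix shared by the CLI client and the lambda.
const QUEUE_PREFIX: &str = "bullhorn-";

/// Build the queue name for a temporary OAuth token.
/// Returns `None` if the token is empty or the result would not be a valid
/// standard SQS queue name (at most 80 characters; alphanumerics, `-`, `_`).
fn queue_name(temp_token: &str) -> Option<String> {
    let name = format!("{}{}", QUEUE_PREFIX, temp_token);
    let valid_chars = name
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if !temp_token.is_empty() && name.len() <= 80 && valid_chars {
        Some(name)
    } else {
        None
    }
}
```

Both sides would then call `queue_name(&token)` rather than repeating the `format!`.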

Since managed services like SQS generally exhibit eventual consistency, I wonder if get_queue_url() can temporarily fail and needs to be retried. It's worked fine the way it is (so far), so maybe that's accounted for when the client creates the queue?
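If `get_queue_url()` did turn out to fail intermittently, one common remedy is retrying with exponential backoff. Below is a synchronous, std-only sketch of that pattern; `retry_with_backoff` is a hypothetical helper, nothing in the lambda currently does this, and whether it is needed at all is an open question.

```rust
use std::time::Duration;

/// Run `op` at least once and at most `max_attempts` times, doubling the
/// delay after each failure. Returns the first `Ok`, or the last `Err`.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                std::thread::sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```

In the async handler the sleep would need an async equivalent, and the total delay has to stay within the lambda's timeout.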

Posting to Tumblr

Thus far everything was related to OAuth and obtaining user credentials. Once we have them we can interact with Tumblr.

The Tumblr API has both "legacy" and "neue" forms. Legacy supports markdown, but not canonical URLs. Neue has canonical URLs but not markdown; posts must be composed from Neue Post Format (NPF) blocks. It might work to render the markdown as HTML and embed it in a block, but I'll look into that later. Until there's a better solution, we'll create a "link" post that's just a hyperlink to the original article.

// Check if the article already exists
let posts: Posts = self
    .client
    .get(format!("{}/blog/{}/posts", URL, self.blog_id))
    // Only requires api_key authentication, get response in "Neue Post Format"
    .query(&[("api_key", &self.consumer_key), ("npf", &"true".to_owned())])
    .send()
    .await?
    .json()
    .await?;
let existing = posts.response.posts.iter().find_map(|p| {
    p.content
        .iter()
        // Find block that is a "link" and contains canonical URL, and return its ID
        .find(|block| match block {
            ContentBlock::Link { display_url, .. } => {
                display_url == canonical_url
            }
            _ => false,
        })
        .map(|_| p.id_string.clone())
});

// There's a series of structs for serde to parse the JSON response.
// The following are the important ones:

#[derive(Debug, serde::Deserialize)]
struct Post {
    id_string: String,
    content: Vec<ContentBlock>,
    // <SNIP>
}

// Serde will deserialize these from JSON like, e.g.:
// {"type": "link", "display_url": "xxx", ...}
#[derive(Debug, serde::Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum ContentBlock {
    Link { display_url: String, title: String },
}
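To see the lookup in isolation, here is a stripped-down sketch with plain structs (no serde, no HTTP). The `Text` variant and the `find_existing` function are assumptions for illustration; the real types have more fields and variants.

```rust
/// Simplified stand-ins for the deserialized response types.
enum ContentBlock {
    Link { display_url: String },
    Text { text: String },
}

struct Post {
    id_string: String,
    content: Vec<ContentBlock>,
}

/// Return the ID of the first post that contains a link block for `canonical_url`.
fn find_existing(posts: &[Post], canonical_url: &str) -> Option<String> {
    posts.iter().find_map(|post| {
        post.content
            .iter()
            .any(|block| {
                matches!(block, ContentBlock::Link { display_url } if display_url == canonical_url)
            })
            .then(|| post.id_string.clone())
    })
}
```

A `None` result means no existing post links to the article, so a new one should be created.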

Create/update an article:

// Must authenticate using both client/consumer and user tokens/secrets
let token = oauth1_request::Token::from_parts(
    &self.consumer_key,
    &self.consumer_secret,
    &self.token,
    &self.token_secret,
);
let tags = post.front_matter.tags.map(|tags| RequestTags { tags });
let request = LinkRequest {
    // If we found existing article this will be Some and we'll update.  Otherwise this is None and we create.
    id: existing.clone(),
    title: Some(post.front_matter.title),
    url: canonical_url,
    tags,
    ..Default::default()
};

// To create an article: POST {blog_id}/post
// To update: POST {blog_id}/post/edit
let uri = format!(
    "{}/blog/{}/post{}",
    URL,
    self.blog_id,
    if existing.is_some() { "/edit" } else { "" }
);
// Sign the request and create `Authorization` HTTP header
let auth_header =
    oauth1_request::post(uri.clone(), &request, &token, oauth1_request::HmacSha1);
// For POST, request body contains `application/x-www-form-urlencoded`
let body = oauth1_request::to_form_urlencoded(&request);

let resp = self
    .client
    .post(uri)
    .header(reqwest::header::AUTHORIZATION, auth_header)
    .header(
        reqwest::header::CONTENT_TYPE,
        "application/x-www-form-urlencoded",
    )
    .body(body)
    .send()
    .await?;

// HTTP request to create/update "link" type post
#[derive(oauth1_request::Request)]
struct LinkRequest {
    /// Must be `Some` when updating an existing article, `None` when creating a new one
    id: Option<String>,
    #[oauth1(rename = "type")]
    r#type: String,
    tags: Option<RequestTags>,

    // <SNIP>
}

// Helper to serialize Vec<_>
struct RequestTags {
    tags: Vec<String>,
}

// Need to impl Display so oauth1_request knows how to serialize a Vec<_>
impl std::fmt::Display for RequestTags {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
        let tag_param = self.tags.join(",");
        write!(f, "{}", tag_param)
    }
}
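Two small pieces of the code above are easy to check on their own: the comma-separated tag formatting and the create-versus-edit endpoint choice. The following is a self-contained sketch; `post_uri` is a hypothetical helper that wraps the `format!` call from earlier, and `RequestTags` is repeated so the block compiles by itself.

```rust
use std::fmt;

/// Helper to serialize a list of tags as one form value.
struct RequestTags {
    tags: Vec<String>,
}

impl fmt::Display for RequestTags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Tags are sent as a single comma-separated string
        write!(f, "{}", self.tags.join(","))
    }
}

/// Pick the legacy endpoint: `/post` to create, `/post/edit` to update.
fn post_uri(base: &str, blog_id: &str, existing: Option<&str>) -> String {
    let suffix = if existing.is_some() { "/edit" } else { "" };
    format!("{}/blog/{}/post{}", base, blog_id, suffix)
}
```

For example, `post_uri("https://api.tumblr.com/v2", "rendered-obsolete", None)` gives the create endpoint, and passing `Some(id)` gives the edit endpoint.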

The resulting post looks like this:

DEV Community