Aurelia Specker for XDevelopers

Posted on Feb 4, 2020

Migrate to Twitter’s newly released Labs recent search endpoint

#migrate #python #twitter

In January, Twitter announced the launch of a new search endpoint. This endpoint is available in Twitter Developer Labs and is meant to preview what the Tweet search endpoints will look like in the future.

The Labs recent search endpoint is structured much like the historical search endpoints that Twitter plans to offer in the next generation of the Twitter API. It returns the same new, modernized Tweet and user JSON objects as the other endpoints available in Twitter Developer Labs, and it supports expansions and selectable response formats.

To test this new endpoint, I refactored my London commute app and replaced the Premium 30-Day Search API with the new Labs search endpoint. This article highlights the changes the migration required. It is best read alongside the original blog post, which walks you through the steps required to build the initial London commute app.

You can find the refactored code that uses the new endpoint on GitHub. Make sure to select the labs-1-tweets-search branch.
You can find the original code using the Premium 30-Day Search API on GitHub on the master branch.
A full comparison of the changes required to migrate from using the Premium 30-Day Search API to using the new Labs search endpoint can be found on GitHub.
If you already have code written for the standard, premium, or enterprise versions of the Search API, you can also check out the "Compare" section of the Twitter developer docs.

Set up

Follow these steps to gain access to the new Labs search endpoint:

Head over to the Labs section of the Twitter developer portal and click "Join Labs"
Select "Activate" next to Recent search, then select a Twitter developer app (your app should be linked to the Twitter account of the notification sender, in this example the Twitter account @maddie_testing).

The previous version of this app used the Python wrapper searchtweets to access the Premium Search API. It is no longer required, so remove the following lines from alerts.py:

from searchtweets import ResultStream, gen_rule_payload, load_credentials, collect_results
import pandas as pd

Instead, insert the following lines at the top of alerts.py:

import urllib.parse
import requests
from requests.auth import AuthBase

The new version looks like this:

from requests_oauthlib import OAuth1Session
import urllib.parse
import requests
from requests.auth import AuthBase
import datetime as dt
import yaml
import json

Handling credentials

Open your credentials.yaml file and replace the previous version of your credentials with the new one:

Previous version

search_tweets_api:
    account_type: premium
    endpoint: https://api.twitter.com/1.1/tweets/search/30day/{ENV}.json
    consumer_key: XXXXXXXXXXX
    consumer_secret: XXXXXXXXXXX
    access_token: XXXXXXXXXXX
    access_token_secret: XXXXXXXXXXX

New version

labs_search_tweets_api:
    consumer_key: XXXXXXXXXXX
    consumer_secret: XXXXXXXXXXX
    access_token: XXXXXXXXXXX
    access_token_secret: XXXXXXXXXXX

The consumer_key and consumer_secret will be used to generate a bearer_token to access the Labs search endpoint.

The consumer_key, consumer_secret, access_token, and access_token_secret are required to access the POST statuses/update endpoint.

Remember to always add the credentials.yaml file to your .gitignore file, to avoid accidentally pushing your private credentials to GitHub.
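Because all four values now live under a renamed key, a quick check after loading the file can catch a missing or empty entry before any request is made. The following is a minimal sketch, assuming the YAML file has already been parsed into a dictionary; the helper name missing_credentials and the REQUIRED_KEYS constant are hypothetical and not part of the original app.

```python
# Hypothetical helper: the names below are illustrative, not from the original app.
REQUIRED_KEYS = ("consumer_key", "consumer_secret", "access_token", "access_token_secret")


def missing_credentials(data, section="labs_search_tweets_api"):
    """Return the required keys that are absent or empty in the given section."""
    creds = data.get(section) or {}
    return [key for key in REQUIRED_KEYS if not creds.get(key)]


# In-memory dict shaped like the parsed credentials.yaml (placeholder values only)
example = {
    "labs_search_tweets_api": {
        "consumer_key": "XXXXXXXXXXX",
        "consumer_secret": "XXXXXXXXXXX",
        "access_token": "XXXXXXXXXXX",
    }
}
print(missing_credentials(example))  # ['access_token_secret']
```

In alerts.py, the same function could be called on the result of yaml.safe_load before the credentials are read.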

Handling authentication

In alerts.py, remove the following lines of code that refer to the Python wrapper searchtweets:

creds = load_credentials(filename="./credentials.yaml",
                       yaml_key="search_tweets_api",
                       env_overwrite=False)

Make sure to replace the four references to "search_tweets_api" with "labs_search_tweets_api" to reflect the changes made in your credentials.yaml file.

Keep the section that uses OAuth1Session to manage credentials for the POST statuses/update endpoint (as described in the original blog post).

In order to authenticate with the Labs search endpoint, we will use OAuth 2.0 Bearer Token authentication.

This is what the authentication section of your code should look like in this newer version:

# Authentication
with open("./credentials.yaml") as file:
    data = yaml.safe_load(file)

consumer_key = data["labs_search_tweets_api"]["consumer_key"]
consumer_secret = data["labs_search_tweets_api"]["consumer_secret"]
access_token = data["labs_search_tweets_api"]["access_token"]
access_token_secret = data["labs_search_tweets_api"]["access_token_secret"]

oauth = OAuth1Session(
    consumer_key,
    client_secret=consumer_secret,
    resource_owner_key=access_token,
    resource_owner_secret=access_token_secret,
)

# Generate bearer token with consumer key and consumer secret via https://api.twitter.com/oauth2/token
class BearerTokenAuth(AuthBase):
    def __init__(self, consumer_key, consumer_secret):
        self.bearer_token_url = "https://api.twitter.com/oauth2/token"
        self.consumer_key = consumer_key
        self.consumer_secret = consumer_secret
        self.bearer_token = self.get_bearer_token()

    def get_bearer_token(self):
        response = requests.post(
            self.bearer_token_url,
            auth=(self.consumer_key, self.consumer_secret),
            data={"grant_type": "client_credentials"},
            headers={"User-Agent": "LabsMetTutorialPython"})

        if response.status_code != 200:
            raise Exception(f"Cannot get a Bearer token (HTTP {response.status_code}): {response.text}")

        body = response.json()
        return body["access_token"]

    def __call__(self, r):
        r.headers["Authorization"] = f"Bearer {self.bearer_token}"
        r.headers["User-Agent"] = "LabsMetTutorialPython"
        return r

# Create Bearer Token for authenticating
bearer_token = BearerTokenAuth(consumer_key, consumer_secret)
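Passing auth=(consumer_key, consumer_secret) to requests.post makes requests send the pair using HTTP Basic authentication. As a rough illustration of what ends up in the Authorization header of the token request, here is a minimal standard-library sketch; the function name basic_auth_header and the placeholder values are assumptions for illustration only.

```python
import base64


def basic_auth_header(consumer_key, consumer_secret):
    """Build an HTTP Basic Authorization header value from a key and secret."""
    pair = f"{consumer_key}:{consumer_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")


# Placeholder values, not real credentials
header = basic_auth_header("my_key", "my_secret")
print(header)
```

You do not need this helper in alerts.py, since requests builds the header for you. It is only meant to show that the consumer key and secret are what the token endpoint exchanges for the bearer token.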

Request parameters

Timestamps

The new Labs search endpoint offers two optional timestamp parameters (start_time and end_time) in the ISO 8601/RFC 3339 date format: YYYY-MM-DDTHH:mm:ssZ

As this format is different to the format required with the Premium Search API, you will have to edit the output format of datetime like so:

# Generate start_time and end_time parameters
utc = dt.datetime.utcnow() + dt.timedelta(minutes = -1)
utc_time = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
print("end_time:", utc_time)

two_hours = dt.datetime.utcnow() + dt.timedelta(hours = -2, minutes = -1)
two_hours_prior = two_hours.strftime("%Y-%m-%dT%H:%M:%SZ")
print("start_time:", two_hours_prior)
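The snippet above calls utcnow() twice, so the two timestamps are derived from slightly different instants. A minimal sketch of an alternative, assuming you are happy to derive both values from a single timezone-aware "now", is shown here; the function name search_window and the TIME_FORMAT constant are hypothetical.

```python
import datetime as dt

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def search_window(now, hours=2):
    """Return (start_time, end_time) strings for a window ending one minute before now."""
    end = now - dt.timedelta(minutes=1)
    start = end - dt.timedelta(hours=hours)
    return start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)


# now must be expressed in UTC, because the format string appends a literal "Z"
now = dt.datetime.now(dt.timezone.utc)
start_time, end_time = search_window(now)
print("start_time:", start_time)
print("end_time:", end_time)
```

Taking now as an argument also makes the window easy to check with a fixed date.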

Generating a query

As we’re no longer using the searchtweets Python wrapper, we can delete the following lines of code:

rule = gen_rule_payload("from:metline -has:mentions",from_date=str(two_hours_prior), to_date=str(utc_time), results_per_call=100)
print("rule:", rule)

tweets = collect_results(rule, 
                        max_results=100,
                        result_stream_args=creds)

[print(tweet.created_at_datetime, tweet.all_text, end='\n\n') for tweet in tweets[0:10]];

Instead, we’re going to use the Python standard library module urllib.parse to percent-encode the parameters, which makes them safe to use as URL components:

# Generate query and other parameters
query = urllib.parse.quote(f"from:metline -has:mentions")
print(query)
start_time = urllib.parse.quote(f"{two_hours_prior}")
print(start_time)
end_time = urllib.parse.quote(f"{utc_time}")
print(end_time)
tweet_format = urllib.parse.quote(f"compact")
print(tweet_format)

Note the use of Python 3’s f-Strings formatting syntax to pass in the timestamp variables. We are going to use this syntax for all instances of string formatting in this version of the tutorial.

The request URL is generated by appending the different encoded parameters to the endpoint.

# Request URL
endpoint = "https://api.twitter.com/labs/1/tweets/search" 
url = f"{endpoint}?query={query}&start_time={start_time}&end_time={end_time}&format={tweet_format}"
print(url)
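As an alternative to quoting each parameter by hand, the whole query string can be encoded in one step with urllib.parse.urlencode. The sketch below is one possible way to do this, not the approach used in the refactored app; the function name build_search_url and the example timestamps are assumptions.

```python
import urllib.parse


def build_search_url(endpoint, query, start_time, end_time, tweet_format="compact"):
    """Return the endpoint URL with all parameters percent-encoded."""
    params = {
        "query": query,
        "start_time": start_time,
        "end_time": end_time,
        "format": tweet_format,
    }
    # quote_via=quote encodes spaces as %20 rather than "+"
    encoded = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return f"{endpoint}?{encoded}"


url = build_search_url(
    "https://api.twitter.com/labs/1/tweets/search",
    "from:metline -has:mentions",
    "2020-02-04T09:59:00Z",  # example values only
    "2020-02-04T11:59:00Z",
)
print(url)
```

requests can also do this encoding itself if you pass the unencoded dictionary through the params argument of requests.get.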

Define the request headers. The Accept-Encoding header tells the server that your client can accept gzip-compressed responses, which requests decompresses for you:

# Request headers
headers = {
    "Accept-Encoding": "gzip"
}

Getting a response

Store the response in a variable called response:

response = requests.get(url, auth = bearer_token, headers = headers)

if response.status_code != 200:
    raise Exception(f"Request returned an error: {response.status_code}, {response.text}")

And then convert the response to JSON. This will enable you to pull out the Tweet text and store it in a new variable called combined_tweet_text:

# Convert response to JSON & pull out Tweet text
parsed_response = json.loads(response.text)
print(parsed_response)
try:
    tweet_text = [tweet["text"] for tweet in parsed_response["data"]]
    combined_tweet_text = " ".join(tweet_text)
    print(combined_tweet_text)
except KeyError:
    # No "data" key in the response, so there are no Tweets to analyse
    combined_tweet_text = " "
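The same extraction can be written without a try/except block by supplying a default for the "data" key. This is a minimal sketch, assuming that a response without matching Tweets simply has no "data" key; the function name combine_tweet_text and the sample payload are hypothetical and only mimic the fields used above.

```python
def combine_tweet_text(parsed_response):
    """Join the text of every Tweet in the response, or return "" if there are none."""
    tweets = parsed_response.get("data", [])
    return " ".join(tweet["text"] for tweet in tweets)


# Hypothetical payload containing only the fields this tutorial reads
sample = {
    "data": [
        {"id": "1", "text": "Minor delays on the line"},
        {"id": "2", "text": "Good service has resumed"},
    ]
}
print(combine_tweet_text(sample))
print(repr(combine_tweet_text({})))
```

An empty string behaves the same as the single space used above once it is lowercased and split into words.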

Delete the previous version of this:

tweet_text = []
tweet_date = []
combined_tweet_text = ''

for tweet in tweets: 
   tweet_text.append(tweet.all_text)
   tweet_date.append(tweet.created_at_datetime)
   combined_tweet_text += tweet.all_text

Analyse Tweets

The Tweet analysis section that determines if the commuter has to be notified of a delay remains similar to the first version of this application. This is where you can input details specific to your use case.

In order to provide greater flexibility, we can pull out the Twitter @handles of the commuters into two variables (commuter_1 and commuter_2). However, this change is not mandatory.

If a notification is required, we use the POST statuses/update endpoint to send a Tweet to the commuter.

# Analyse Tweets & notify commuter (details specific to use case)
all_trigger = {"closure", "wembley", "delays", "disruption", "cancelled", "sorry", "stadium"}

david_trigger = {"hillingdon", "harrow"}

aurelia_trigger = {"baker"}

tweet_words = set(combined_tweet_text.lower().split())

commuter_1 = "@AureliaSpecker"
commuter_2 = "@dormrod"

if len(tweet_words.intersection(all_trigger)) != 0: 
    message = f"{commuter_1} and {commuter_2} 👋 check https://twitter.com/metline for possible delays, [{utc_time}]"
elif len(tweet_words.intersection(david_trigger)) != 0:
    message = f"{commuter_2} 👋 check https://twitter.com/metline for possible delays, [{utc_time}]" 
elif len(tweet_words.intersection(aurelia_trigger)) != 0:
    message = f"{commuter_1} 👋 check https://twitter.com/metline for possible delays, [{utc_time}]"
else:
    message = "There are no delays"

print("Message:", message)

params = {"status": message}

oauth.post(
    "https://api.twitter.com/1.1/statuses/update.json", params=params
)
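One thing to be aware of is that split() keeps punctuation attached to words, so "delays," would not match the trigger "delays". The sketch below shows one possible variation that extracts alphabetic words with a regular expression and returns the commuters to notify; the function names tokenize and pick_recipients are hypothetical, and the trigger sets are copied from the code above.

```python
import re

ALL_TRIGGER = {"closure", "wembley", "delays", "disruption", "cancelled", "sorry", "stadium"}
DAVID_TRIGGER = {"hillingdon", "harrow"}
AURELIA_TRIGGER = {"baker"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, dropping punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_recipients(text, commuter_1="@AureliaSpecker", commuter_2="@dormrod"):
    """Return the handles to notify, or an empty list if no trigger word is present."""
    words = tokenize(text)
    if words & ALL_TRIGGER:
        return [commuter_1, commuter_2]
    if words & DAVID_TRIGGER:
        return [commuter_2]
    if words & AURELIA_TRIGGER:
        return [commuter_1]
    return []


print(pick_recipients("Severe delays, sorry!"))
print(pick_recipients("Signal failure at Harrow."))
print(pick_recipients("Good service"))
```

Returning an empty list when nothing matches would also let the caller decide whether to post a "There are no delays" Tweet or skip the request entirely.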

Conclusion

I used several libraries and services beyond the Twitter API to make this tutorial, but you may have different needs and requirements and should evaluate whether those tools are right for you.

If this inspires you to build anything, let us know on the Twitter community forums or by Tweeting us at @TwitterDev. You can also give us feedback on our feedback platform.
