By Vikram Vaswani, Developer Advocate
This tutorial was originally published at https://docs.rev.ai/resources/tutorials/integrate-topic-extraction-api-nodejs/ on Jun 13, 2022.
Introduction
Topic extraction attempts to detect the topics or subjects of a document. It is useful in a number of different scenarios, including:
- Auto-generated agendas for meetings and phone calls
- Automated classification or keyword indexing for digital media libraries
- Automated tagging for Customer Service (CS) complaints or support tickets
Rev AI offers a Topic Extraction API that identifies important keywords and corresponding topics in transcribed speech. For application developers, it provides a fast and accurate way to retrieve and rank the core subjects in a transcribed conversation and then take further actions based on this information.
This tutorial explains how to integrate the Rev AI Topic Extraction API into your Node.js application.
Assumptions
This tutorial assumes that:
- You have a Rev AI account and access token. If not, sign up for a free account and generate an access token.
- You have a properly configured Node.js development environment with Node.js v16.x or v17.x. If not, download and install Node.js for your operating system.
- You have a JSON transcript generated from the Asynchronous Speech-to-Text API. If not, use this example JSON transcript.
NOTE: The Topic Extraction API is under active development. Always refer to the API documentation for the most up-to-date information.
Step 1: Install Axios
The Topic Extraction API is a REST API and, as such, you will need an HTTP client to interact with it. This tutorial uses Axios, a popular Promise-based HTTP client for Node.js.
Begin by installing Axios into your application directory:
npm install axios
Within your application code, initialize Axios as below:
const axios = require('axios');
const token = '<REVAI_ACCESS_TOKEN>';
// create a client
const http = axios.create({
  baseURL: 'https://api.rev.ai/topic_extraction/v1beta/',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json'
  }
});
Here, the Axios HTTP client is initialized with the base endpoint for the Topic Extraction API, which is https://api.rev.ai/topic_extraction/v1beta/.
Every request to the API must be in JSON format and must include an Authorization header containing your API access token. The code shown above also attaches these required headers to the client.
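As a minimal sketch, the client options can also be built by a small helper that fails early when no token is supplied, rather than hard-coding the token in the source. The helper name buildClientConfig and the environment variable name REVAI_ACCESS_TOKEN are assumptions made for illustration; only the base URL and the two headers come from the listing above.

```javascript
// Hypothetical helper (not part of Axios or the Rev AI API):
// returns the options object that Step 1 passes to axios.create().
const buildClientConfig = (token) => {
  if (!token) {
    // Fail early instead of sending requests with "Bearer undefined".
    throw new Error('Missing Rev AI access token');
  }
  return {
    baseURL: 'https://api.rev.ai/topic_extraction/v1beta/',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    }
  };
};

// Possible usage, assuming the token was exported as REVAI_ACCESS_TOKEN:
// const http = axios.create(buildClientConfig(process.env.REVAI_ACCESS_TOKEN));
```

Keeping the token out of the source file in this way is a common practice, but the tutorial's hard-coded placeholder works just as well for a quick test.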
Step 2: Submit transcript for topic extraction
To perform topic extraction on a transcript, you must begin by submitting an HTTP POST request containing the transcript content, in either plaintext or JSON, to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs.
The code listings below perform this operation using the HTTP client initialized in Step 1, for both plaintext and JSON transcripts:
const submitTopicExtractionJobText = async (textData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      text: textData
    }))
    .then(response => response.data)
    .catch(console.error);
};
const submitTopicExtractionJobJson = async (jsonData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      json: jsonData
    }))
    .then(response => response.data)
    .catch(console.error);
};
If you were to inspect the return value of the functions shown above, here is an example of what you would see:
{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  status: 'in_progress',
  type: 'topic_extraction'
}
The API response contains a job identifier (id field). This job identifier will be required to check the job status and obtain the job result.
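The two submission functions differ only in the request field they use (text or json). As a rough sketch, a single helper could choose the field from the type of the transcript. The name buildJobPayload is hypothetical and not part of the Rev AI API; the sketch assumes that a plaintext transcript is always a string and a JSON transcript is always an object.

```javascript
// Hypothetical helper: picks the request field based on the transcript type.
const buildJobPayload = (transcript) => {
  if (typeof transcript === 'string') {
    // Plaintext transcripts go in the "text" field.
    return { text: transcript };
  }
  if (transcript !== null && typeof transcript === 'object') {
    // Rev AI JSON transcripts go in the "json" field.
    return { json: transcript };
  }
  throw new TypeError('Transcript must be a string or a JSON object');
};

// Possible usage with the client from Step 1:
// const job = await http.post('jobs', buildJobPayload(transcript))
//   .then(response => response.data);
```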
Learn more about submitting a topic extraction job in the API reference guide.
Step 3: Check job status
Topic extraction jobs usually complete within 10-20 seconds. To check the status of the job, you must submit an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs/<ID>, where <ID> is a placeholder for the job identifier.
The code listing below demonstrates this operation:
const getTopicExtractionJobStatus = async (jobId) => {
  return await http.get(`jobs/${jobId}`)
    .then(response => response.data)
    .catch(console.error);
};
Here is an example of the API response to the previous request after the job has completed:
{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  completed_on: '2022-04-13T09:16:07.17Z',
  word_count: 13,
  status: 'completed',
  type: 'topic_extraction'
}
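Application code usually needs to turn the status field into a decision about what to do next. The sketch below is one possible way to do that. The function name jobOutcome and the labels it returns are hypothetical, and treating every status other than in_progress and completed as a failure is an assumption; check the API reference for the full list of status values.

```javascript
// Hypothetical helper: maps a job status response to a next action.
const jobOutcome = (job) => {
  if (!job || typeof job.status !== 'string') {
    return 'unknown';   // e.g. the status request itself failed
  }
  if (job.status === 'in_progress') {
    return 'pending';   // check again later
  }
  if (job.status === 'completed') {
    return 'ready';     // safe to request the result (Step 4)
  }
  // Assumption: any other status means the job did not succeed.
  return 'failed';
};
```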
Learn more about retrieving the status of a topic extraction job in the API reference guide.
Step 4: Retrieve topic extraction report
Once the topic extraction job's status changes to completed, you can retrieve the results by submitting an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1beta/jobs/<ID>/result, where <ID> is a placeholder for the job identifier.
The code listing below demonstrates this operation:
const getTopicExtractionJobResult = async (jobId) => {
  return await http.get(`jobs/${jobId}/result`,
    { headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' } })
    .then(response => response.data)
    .catch(console.error);
};
If the job status is completed, the return value of the above function is a JSON-encoded response containing a sentence-wise topic extraction report. If the job status is not completed, the API returns an error instead; the function logs that error to the console and resolves to undefined.
Here is an example of the topic extraction report returned from a completed job:
{
  "topics": [
    {
      "topic_name": "incredible team",
      "score": 0.9,
      "informants": [
        {
          "content": "We have 17 folks and, uh, I think we have an incredible team and I just want to talk about some things that we've done that I think have helped us get there.",
          "ts": 71.4,
          "end_ts": 78.39
        },
        {
          "content": "Um, it's sort of the overall thesis for this one.",
          "ts": 78.96,
          "end_ts": 81.51
        },
        {
          "content": "One thing that's worth keeping in mind is that recruiting is a lot of work.",
          "ts": 81.51,
          "end_ts": 84
        },
        {
          "content": "Some people think that you can raise money and spend a few weeks building your team and then move on to more",
          "ts": 84.21,
          "end_ts": 88.47
        }
      ]
    },
    {
      ...
    }
  ]
}
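To show how such a report might be consumed, here is a small sketch that ranks the topics by score and reports the time span covered by each topic's informants. The function name summarizeTopics and the output format are hypothetical; the field names (topics, topic_name, score, informants, ts, end_ts) are taken from the example report above.

```javascript
// Hypothetical helper: one summary line per topic, highest score first.
const summarizeTopics = (report) =>
  [...report.topics]                        // copy so the report is not mutated
    .sort((a, b) => b.score - a.score)
    .map(({ topic_name, score, informants }) => {
      if (!informants || informants.length === 0) {
        return `${topic_name} (score ${score})`;
      }
      // Earliest start and latest end among the supporting sentences.
      const start = Math.min(...informants.map(i => i.ts));
      const end = Math.max(...informants.map(i => i.end_ts));
      return `${topic_name} (score ${score}): ${start}s to ${end}s`;
    });

// Possible usage with the function from this step:
// const report = await getTopicExtractionJobResult(job.id);
// summarizeTopics(report).forEach(line => console.log(line));
```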
It's also possible to filter the result set to return only topics that score above a certain value by adding a threshold query parameter to the request.
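A minimal sketch of adding that parameter is shown below. The helper name buildResultPath is hypothetical, and the value 0.8 in the usage note is only an example; refer to the API reference for the accepted range of threshold values.

```javascript
// Hypothetical helper: builds the relative result path, optionally
// appending the threshold query parameter.
const buildResultPath = (jobId, threshold) => {
  const path = `jobs/${encodeURIComponent(jobId)}/result`;
  if (threshold === undefined) {
    return path;
  }
  const query = new URLSearchParams({ threshold: String(threshold) });
  return `${path}?${query.toString()}`;
};

// Possible usage with the client from Step 1:
// const report = await http.get(buildResultPath(job.id, 0.8),
//   { headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' } })
//   .then(response => response.data);
```

Axios also accepts a params option on the request configuration, which would achieve the same result without building the query string by hand.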
Learn more about obtaining a topic extraction report in the API reference guide.
Step 5: Create and test a simple application
Using the code samples shown previously, it's possible to create a simple application that accepts a JSON transcript and returns a list of topics detected in it, as shown below:
const main = async (jsonData) => {
  const job = await submitTopicExtractionJobJson(jsonData);
  console.log(`Job submitted with id: ${job.id}`);
  await new Promise((resolve, reject) => {
    const interval = setInterval(() => {
      getTopicExtractionJobStatus(job.id)
        .then(r => {
          console.log(`Job status: ${r.status}`);
          if (r.status !== 'in_progress') {
            clearInterval(interval);
            resolve(r);
          }
        })
        .catch(e => {
          clearInterval(interval);
          reject(e);
        });
    }, 15000);
  });
  const jobResult = await getTopicExtractionJobResult(job.id);
  console.log(jobResult);
};
// extract topics from example Rev AI JSON transcript
http.get('https://www.rev.ai/FTC_Sample_1_Transcript.json')
  .then(response => main(response.data));
This example application begins by fetching Rev AI's example JSON transcript and passing it to the main() function as input to be analyzed. The main() function submits this data to the Topic Extraction API using the submitTopicExtractionJobJson() function. It then uses setInterval() to poll the API every 15 seconds to obtain the status of the job. Once the job status is no longer in_progress, it uses the getTopicExtractionJobResult() function to retrieve the job result and prints it to the console.
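The polling loop described above runs until the job leaves the in_progress state, with no upper limit. As a rough sketch, the same idea can be written as a reusable helper that gives up after a fixed number of checks. The names pollUntilDone and sleep and the default of 20 attempts are assumptions made for illustration; the 15-second interval matches the listing above.

```javascript
// Resolves after the given number of milliseconds.
const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: calls checkStatus() until the job is no longer
// in progress, or throws after maxAttempts checks.
const pollUntilDone = async (checkStatus, { intervalMs = 15000, maxAttempts = 20 } = {}) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await checkStatus();
    // An undefined job (e.g. a logged request error) is treated as
    // "not finished yet" and retried.
    if (job && job.status !== 'in_progress') {
      return job;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Job still in progress after ${maxAttempts} checks`);
};

// Possible usage inside main():
// const finished = await pollUntilDone(() => getTopicExtractionJobStatus(job.id));
```

Passing the status check in as a function keeps the helper independent of the HTTP client, which also makes it easy to exercise with a stub.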
Here is an example of the output returned by the code above:
Job submitted with id: xgKIzeODYYba
Job status: completed
{
  topics: [
    { topic_name: 'quick overview', score: 0.9, informants: [Array] },
    { topic_name: 'concert tickets', score: 0.9, informants: [Array] },
    { topic_name: 'dividends', score: 0.9, informants: [Array] },
    { topic_name: 'quick background', score: 0.6, informants: [Array] }
  ]
}
NOTE: The code listing above polls the API repeatedly to check the status of the topic extraction job. This is presented for illustrative purposes only and is strongly discouraged in production scenarios. For production scenarios, use webhooks to asynchronously receive notifications once the topic extraction job completes.
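With webhooks, the application receives an HTTP request when the job finishes and must extract the job details from its body. The sketch below shows one way that parsing step might look. The function name parseJobNotification is hypothetical, and the payload shape (a JSON object with a job property containing id and status) is an assumption that is not confirmed by this tutorial; verify it against the webhooks documentation before relying on it.

```javascript
// Hypothetical helper: extracts the job id and status from a webhook body.
// ASSUMPTION: the body is JSON shaped like { "job": { "id": ..., "status": ... } }.
const parseJobNotification = (rawBody) => {
  const payload = JSON.parse(rawBody);
  const job = payload && payload.job;
  if (!job || typeof job.id !== 'string' || typeof job.status !== 'string') {
    throw new Error('Unexpected notification payload');
  }
  return { id: job.id, status: job.status };
};

// Possible usage inside a webhook request handler:
// const { id, status } = parseJobNotification(requestBody);
// if (status === 'completed') {
//   const report = await getTopicExtractionJobResult(id);
// }
```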
Next steps
Learn more about the topics discussed in this tutorial by visiting the following links:
- Documentation: Topic Extraction API reference
- Documentation: Topic Extraction API webhooks
- Tutorial: Get Started with Topic Extraction
- Tutorial: Get Started with Rev AI Webhooks