Telegram news bot using Datanews API
Introduction
Telegram is a popular messaging app. It is well-known for its security and efficiency. Apart from being able to send messages to each other, its users can also create bots to automate certain routine tasks. In this tutorial, we will use Telegram's bot API to create a news bot using Datanews API.
Telegram Bot API overview
We will use the python-telegram-bot wrapper around the official API. This library makes writing a bot much simpler. It is always easier to learn something new by reading through a couple of examples. Here is one:
from telegram.ext import Updater, CommandHandler

USAGE = '/greet <name> - Greet me!'

def start(update, context):
    update.message.reply_text(USAGE)

def greet_command(update, context):
    update.message.reply_text(f'Hello {context.args[0]}!')

def main():
    updater = Updater("TOKEN", use_context=True)
    dp = updater.dispatcher
    # on different commands - answer in Telegram
    dp.add_handler(CommandHandler("start", start))
    dp.add_handler(CommandHandler("greet", greet_command))
    # Start the Bot
    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()
This small piece of code creates a bot that recognizes two commands:
- /start - the bot responds to this one with the help page.
- /greet - this command receives an argument (e.g. Datanews) and responds with Hello Datanews!
Let's go through each line in detail and discuss what this code does.
We start with the main function:
def main():
    updater = Updater("TOKEN", use_context=True)
    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start))
    dp.add_handler(CommandHandler("greet", greet_command))
    updater.start_polling()
    updater.idle()
This function sets up the machinery our bot needs to work. In particular, it creates an instance of the Updater class. Note that you need a Telegram token to use the Telegram bot API. Check out the official guide on how to create bots here.
Back to the code! The purpose of the Updater is to deliver updates (e.g. messages sent by users) to the Dispatcher. When the latter receives an update, it tries to dispatch some of the user-specified callbacks to handle it. Each of those callbacks is managed by some handler.
You can think of a handler as a function that handles an update and is only executed when some condition is met. The condition depends on the handler type and can be specified by the programmer. In our case, we have two instances of the CommandHandler class.
dp.add_handler(CommandHandler("start", start))
dp.add_handler(CommandHandler("greet", greet_command))
Each of them handles a particular command supported by our bot: /start and /greet respectively.
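To make the idea of a handler more concrete, here is a toy model in plain Python. It is only a sketch and not how python-telegram-bot is implemented: ToyCommandHandler and ToyDispatcher are hypothetical names invented for this illustration, and updates are reduced to plain strings.

```python
class ToyCommandHandler:
    """Hypothetical stand-in for a command handler: a condition plus a callback."""

    def __init__(self, command, callback):
        self.command = command
        self.callback = callback

    def matches(self, text):
        # The condition: the first word of the message is "/<command>".
        parts = text.split()
        return bool(parts) and parts[0] == '/' + self.command

    def handle(self, text):
        # Everything after the command name is passed on as arguments.
        return self.callback(text.split()[1:])


class ToyDispatcher:
    """Hypothetical dispatcher: runs the first handler whose condition holds."""

    def __init__(self):
        self.handlers = []

    def add_handler(self, handler):
        self.handlers.append(handler)

    def dispatch(self, text):
        for handler in self.handlers:
            if handler.matches(text):
                return handler.handle(text)
        return None  # no handler was interested in this update


dp = ToyDispatcher()
dp.add_handler(ToyCommandHandler('start', lambda args: 'usage'))
dp.add_handler(ToyCommandHandler('greet', lambda args: 'Hello ' + args[0] + '!'))
```

In this model, dp.dispatch('/greet Datanews') runs the second callback, while a message that is not a registered command is ignored.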
Then we call the start_polling method.
updater.start_polling()
This will make our bot periodically fetch updates from the Telegram server. This method will internally create two threads: one will poll updates from the Telegram server, the other one will be used by the dispatcher to handle those updates.
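The two-thread arrangement can be pictured with a small producer/consumer sketch. This is a simplified model under our own assumptions, not the library's code: run_toy_bot is a hypothetical function, the list of incoming updates stands in for the Telegram server, and a queue connects the polling thread to the handling thread.

```python
import queue
import threading


def run_toy_bot(incoming, handle):
    """Hypothetical model: one thread 'polls', another handles the updates."""
    updates = queue.Queue()
    results = []
    stop = object()  # sentinel that tells the worker to finish

    def poller():
        # Stands in for fetching updates from the Telegram server.
        for update in incoming:
            updates.put(update)
        updates.put(stop)

    def worker():
        # Stands in for the dispatcher consuming the queued updates.
        while True:
            update = updates.get()
            if update is stop:
                break
            results.append(handle(update))

    threads = [threading.Thread(target=poller), threading.Thread(target=worker)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because a single worker consumes a FIFO queue, updates are handled in the order in which they were polled.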
The next line makes sure our bot correctly handles various interruption signals (e.g. SIGINT).
updater.idle()
It blocks the main thread until such a signal arrives and then stops the bot gracefully. This matters in particular when we want our bot to have persistent state. You can learn more about it and other cool library features in their wiki.
Let's now discuss the two callback functions that handle the bot's commands:
def start(update, context):
    update.message.reply_text(USAGE)

def greet_command(update, context):
    update.message.reply_text(f'Hello {context.args[0]}!')
Each of these functions takes two arguments:
- update - an update received by our bot from the Telegram servers.
- context - contains various useful methods and information. For example, it has a user_data dictionary which can store various user-related information.
Additionally, each of these functions sends a text message back to the user.
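One detail worth noting: context.args holds the words that follow the command, so it is empty when the user sends /greet without a name, and context.args[0] then raises an IndexError. A minimal sketch of a more defensive reply builder follows; build_greeting is a hypothetical helper that is not part of the example above.

```python
USAGE = '/greet <name> - Greet me!'


def build_greeting(args):
    """Return the reply text for /greet given the list of command arguments."""
    if not args:
        # No name supplied: fall back to the usage text instead of failing.
        return USAGE
    return f'Hello {args[0]}!'
```

Inside the callback this would be used as update.message.reply_text(build_greeting(context.args)).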
You can check out more elaborate examples of Telegram bots in the library's official repo.
Let's now move on to the main topic of our discussion.
Datanews API overview
Datanews.io provides an API for retrieving and monitoring news from more than a thousand different newspapers, news aggregators and other websites. We collect and process more than 100k news articles a day. Naturally, we provide a flexible and easy-to-use API for querying those articles. For our small project, though, we only need a small part of that API. In particular, we want our bot to be able to:
- Retrieve articles based on a query string, sent by the user.
- Retrieve articles from some particular publisher.
These use cases can be handled by a single endpoint: /headlines. You can learn more about the provided API in our official documentation.
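As a rough illustration of how both use cases map onto one endpoint, the sketch below builds a query string for /headlines. The size, sortBy, page, language and source parameters are taken from the library calls used later in this tutorial; the name q for the search term, the helper headlines_query, and the way the parameters are encoded in the request are assumptions, so check the official documentation before relying on them.

```python
from urllib.parse import urlencode


def headlines_query(query=None, source=None, size=10):
    """Build a query string for a /headlines request (partly assumed names)."""
    params = {'size': size, 'sortBy': 'date', 'page': 0, 'language': 'en'}
    if query is not None:
        params['q'] = query        # assumption: the search term is sent as "q"
    if source is not None:
        params['source'] = source  # publisher domain, e.g. "techcrunch.com"
    return '/headlines?' + urlencode(params)
```

Searching by query and filtering by publisher differ only in which of the two optional parameters is set.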
Now we can go straight to the implementation of our bot.
The implementation
First of all, let's define a callback that handles the /start command.
def get_usage():
    return '''This bot allows you to query news articles from Datanews API.
Available commands:
/help, /start - show this help message.
/search <query> - retrieve news articles containing <query>.
Example: "/search covid"
/publisher <domain> - retrieve newest articles by publisher.
Example: "/publisher techcrunch.com"
'''

def help_command(update, context):
    update.message.reply_markdown(get_usage())
As you can see, the implementation closely resembles our bot example above: we simply return the help information to the user. Notice that our bot supports four commands. The help_command function implements the first two of them, /help and /start. Let's now discuss the other two.
def search_command(update, context):
    def fetcher(query):
        return datanews.headlines(query, size=10, sortBy='date', page=0, language='en')
    _fetch_data(update, context, fetcher)

def publisher_command(update, context):
    def fetcher(query):
        return datanews.headlines(source=query, size=10, sortBy='date', page=0, language='en')
    _fetch_data(update, context, fetcher)
These functions look very similar. Both use the /headlines API endpoint discussed earlier (we are using the official Datanews library for Python here), and both delegate their work to the helper _fetch_data. The only difference is in the arguments we pass to the Datanews API: search_command retrieves articles matching a certain query, whereas publisher_command fetches all articles published by a specific source. Note, however, that in both cases we only get the 10 most recent articles.
Let's now take a look at the helper that does all the work.
def _fetch_data(update, context, fetcher):
    if not context.args:
        help_command(update, context)
        return
    query = '"' + ' '.join(context.args) + '"'
    result = fetcher(query)
    if not result['hits']:
        update.message.reply_text('No news is good news')
        return
    last_message = update.message
    for article in reversed(result['hits']):
        text = article['title'] + ': ' + article['url']
        last_message = last_message.reply_text(text)
This function simply checks that the user has indeed specified required arguments to the command, fetches the data from the Datanews API and sends it in reverse order to the user. A couple of comments here:
- We surround the query with double quotes (") so that Datanews returns only articles containing the complete query and not just a single word from it. You can learn more about the query syntax in the documentation.
- We also handle the case when no articles were found - it wouldn't be good to just sit silently in this situation.
- We send the articles in reverse order so that the last message the user receives is the most recent article.
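The quoting and ordering steps are easy to check in isolation. The sketch below pulls them out into two small functions. The names build_query and format_articles are ours and not part of the bot's source, and the sample hits are made-up data that only mimic the title and url fields used above.

```python
def build_query(args):
    """Join the command arguments and wrap them in double quotes."""
    return '"' + ' '.join(args) + '"'


def format_articles(hits):
    """Return one 'title: url' line per article, oldest first."""
    return [article['title'] + ': ' + article['url'] for article in reversed(hits)]


# Made-up sample data, newest first, as expected when sorting by date.
hits = [
    {'title': 'Newest', 'url': 'https://example.com/2'},
    {'title': 'Older', 'url': 'https://example.com/1'},
]
```

For the message "/search covid vaccine", build_query receives ['covid', 'vaccine'] and produces the quoted phrase "covid vaccine", while format_articles(hits) puts the newest article last, which is where the user's chat window ends up.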
With this out of the way, let's take a look at the main function.
import re
from telegram.ext import Updater, CommandHandler, MessageHandler, Filters

def main():
    # use_context=True matches the (update, context) callback signature used above
    updater = Updater(token='TOKEN', use_context=True)
    updater.dispatcher.add_handler(CommandHandler('start', help_command))
    updater.dispatcher.add_handler(CommandHandler('help', help_command))
    updater.dispatcher.add_handler(CommandHandler('search', search_command))
    updater.dispatcher.add_handler(CommandHandler('publisher', publisher_command))
    updater.dispatcher.add_handler(
        MessageHandler(
            Filters.text & Filters.regex(pattern=re.compile('help', re.IGNORECASE)),
            help_command
        )
    )
    updater.start_polling()
    updater.idle()
This function is very similar to the one from the example. The only major difference is in the following lines:
updater.dispatcher.add_handler(
    MessageHandler(
        Filters.text & Filters.regex(pattern=re.compile('help', re.IGNORECASE)),
        help_command
    )
)
The MessageHandler is used to catch messages sent by the user. You can think of it as a CommandHandler on steroids: it processes any message that satisfies a specified filter. In our case, we want to print the help information every time the user sends a text message containing the word help.
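It helps to see what this pattern actually matches. The sketch below uses only Python's re module and assumes the filter searches for the pattern anywhere in the message text, which is our understanding of Filters.regex. The functions wants_help and wants_help_word are hypothetical names for illustration.

```python
import re

HELP_PATTERN = re.compile('help', re.IGNORECASE)


def wants_help(text):
    """True if the text contains "help" anywhere, ignoring case."""
    return HELP_PATTERN.search(text) is not None


def wants_help_word(text):
    """Stricter variant: "help" must appear as a whole word."""
    return re.search(r'\bhelp\b', text, re.IGNORECASE) is not None
```

Note that the plain pattern also matches words such as "helpful". If that is not desired, a word-boundary pattern like the one in wants_help_word is one possible alternative.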
That's it. You now have a fully functional news bot.
Conclusion
Well, this was fun! We discussed the Telegram bot API and how to use it from Python. We also gave a brief overview of the Datanews API and built our own news bot that uses it. However, this is only the tip of the iceberg: we can add support for news monitoring and many other cool features to our bot just as easily.
You can check out the source code here and the working example at https://t.me/realDatanewsBot.
This post was originally published on our official blog. You may see more here.