DEV Community

Andrea Benfatti

Async programming in Python with asyncio

For people coming from JavaScript, asynchronous programming is nothing new, but for Python developers, getting used to async functions and futures (the equivalent of promises in JS) may not be trivial.

Concurrency vs Parallelism

Concurrency and parallelism sound similar, but in programming there is an important difference.
Imagine you are writing a book while cooking. Even if it seems like you are doing both tasks at the same time, you are actually switching between them: while you wait for the water to boil you write your book, but while you chop vegetables you pause your writing. This is called concurrency. The only way to do the two tasks in parallel is to have two people, one writing and one cooking, which is what multicore CPUs do.
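
The cooking analogy can be sketched in a few lines of Python. This is only an illustration: the `cook` and `write` coroutines, the delays, and the shared `log` list are made up for the example, and `asyncio.run` (Python 3.7+) is used just to drive it. Both jobs share one thread, and the writer makes progress while the cook is waiting.

```python
import asyncio

async def cook(log):
    log.append("cook: put water on the stove")
    # waiting for the water to boil; other work can run in the meantime
    await asyncio.sleep(0.05)
    log.append("cook: water is boiling")

async def write(log):
    for page in (1, 2):
        log.append(f"write: page {page}")
        # a short pause between pages
        await asyncio.sleep(0.01)

async def main():
    log = []
    # run both jobs concurrently in a single thread
    await asyncio.gather(cook(log), write(log))
    return log

events = asyncio.run(main())
for event in events:
    print(event)
```

Both pages are written between the two cooking steps, which is concurrency rather than parallelism.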

Why asyncio

Async programming allows you to write concurrent code that runs in a single thread. The first advantage over multiple threads is that you decide where the scheduler can switch from one task to another, which makes sharing data between tasks safer and easier.

def queue_push_back(x):
    if len(queue) < max_size:
        queue.append(x)

If we run the code above in a multithreaded program, two threads can both pass the check on line 2 before either of them appends. Both items are then added, potentially making the queue bigger than max_size. This is a race condition.
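
For comparison, here is a minimal sketch of what the threaded version needs in order to be safe. The names `items`, `MAX_SIZE`, and `lock` are placeholders chosen for this example; the idea is that the check and the append have to happen under the same lock.

```python
import threading

MAX_SIZE = 100          # hypothetical capacity for the example
items = []              # the shared queue
lock = threading.Lock()

def queue_push_back(x):
    # holding the lock makes "check the size, then append" one atomic step
    with lock:
        if len(items) < MAX_SIZE:
            items.append(x)

# 500 threads compete to push into a queue that holds at most 100 items
threads = [threading.Thread(target=queue_push_back, args=(i,)) for i in range(500)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(items))
```

With asyncio no lock is needed for this function, because there is no await between the check and the append, so no other task can run in between.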

Another advantage of async programming is memory usage. Every new thread needs its own stack, plus the bookkeeping used for context switching. With async programming this overhead is avoided, since all the code runs in a single thread.
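
As a rough illustration, the sketch below starts ten thousand coroutines and checks that they all ran on the same thread. The `tick` coroutine and the count are arbitrary choices for the example, and `asyncio.run` requires Python 3.7 or later.

```python
import asyncio
import threading

async def tick():
    # give the event loop a chance to switch to another coroutine
    await asyncio.sleep(0)
    # report which thread this coroutine is running on
    return threading.get_ident()

async def main():
    # schedule 10,000 coroutines; none of them gets its own thread
    return await asyncio.gather(*(tick() for _ in range(10_000)))

thread_ids = asyncio.run(main())
print(len(thread_ids), "coroutines ran on", len(set(thread_ids)), "thread")
```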

How to write async code in python

asyncio has three main components: coroutines, the event loop, and futures.

Coroutine

A coroutine is the result of calling an asynchronous function, which is declared by putting the keyword async before def.

async def my_task(args):
    pass

my_coroutine = my_task(args)

When we call a function declared with the async keyword, the function body is not run; instead, a coroutine object is returned.
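
A small experiment makes this visible. In the sketch below, `calls` is a helper list added only to record when the body actually runs, and `asyncio.run` (Python 3.7+) is used as a convenient way to drive the coroutine to completion.

```python
import asyncio

calls = []  # records every time the function body actually runs

async def my_task(x):
    calls.append(x)
    return x * 2

coro = my_task(21)                    # nothing has run yet
ran_before = list(calls)              # still empty at this point
is_coro = asyncio.iscoroutine(coro)   # we only hold a coroutine object

result = asyncio.run(coro)            # now the body executes
print(ran_before, is_coro, result, calls)
```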

There are two ways to get the output of an async function from a coroutine.
The first is to use the await keyword. This is possible only inside async functions; it waits for the coroutine to finish and returns the result.

result = await my_task(args)

The second way is to add it to an event loop as we will see in the next sections.

Event loop

The event loop is the object that executes our asynchronous code and decides how to switch between async functions. After creating an event loop we can add multiple coroutines to it; these coroutines will all run concurrently once run_until_complete or run_forever is called.

# create loop
loop = asyncio.new_event_loop()
# add coroutine to the loop
future = loop.create_task(my_coroutine)
# block until the task completes; every coroutine added
# to the loop runs concurrently in the meantime
loop.run_until_complete(future)
loop.close()

Future

A future is an object that works as a placeholder for the output of an asynchronous function and gives us information about the function's state.
A future is created when we add a coroutine to an event loop. There are two ways to do this:

future1 = loop.create_task(my_coroutine)
# or
future2 = asyncio.ensure_future(my_coroutine)

The first method adds a coroutine to the loop and returns a Task, which is a subtype of Future. The second method is very similar: it takes a coroutine and adds it to the current event loop. The only difference is that it can also accept a future, in which case it does nothing and returns the future unchanged.
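
The following sketch shows a future acting as a placeholder: it starts out not done and later holds the result. The `compute` coroutine is a stand-in for real work, and the code is written against Python 3.7+ so that `asyncio.run` is available.

```python
import asyncio

async def compute():
    await asyncio.sleep(0.01)  # stand-in for real asynchronous work
    return 42

async def main():
    task = asyncio.ensure_future(compute())  # wraps the coroutine in a Task
    same = asyncio.ensure_future(task)       # already a future: returned as is
    done_before = task.done()                # the task has not run yet
    await task
    return (isinstance(task, asyncio.Future), same is task,
            done_before, task.done(), task.result())

is_future, same_object, done_before, done_after, value = asyncio.run(main())
print(is_future, same_object, done_before, done_after, value)
```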

A simple program

import asyncio

async def my_task(name):
    # placeholder body; returns a value so the task has a result to print
    return name + ' done'

def main():
    loop = asyncio.new_event_loop()
    coroutine1 = my_task('first')
    coroutine2 = my_task('second')
    task1 = loop.create_task(coroutine1)
    task2 = loop.create_task(coroutine2)
    loop.run_until_complete(asyncio.wait([task1, task2]))
    print('task1 result:', task1.result())
    print('task2 result:', task2.result())
    loop.close()

main()

As you can see, to run an asynchronous function we first need to create a coroutine, then we add it to the event loop, which creates a future/task. Up to this point none of the code inside our async function has been executed. Only when we call loop.run_until_complete does the event loop start executing all the coroutines that have been added to it with loop.create_task or asyncio.ensure_future.
loop.run_until_complete blocks your program until the future you gave as argument is completed. In the example we used asyncio.wait() to create a future which completes only when all the futures in the argument list are completed.
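
On Python 3.7 and later, the same flow can be written more compactly with asyncio.run, which creates the loop, runs a coroutine to completion, and closes the loop. The version below is a sketch of that alternative, not the approach used in the rest of this post; `my_task` is given a trivial body for the example.

```python
import asyncio

async def my_task(name):
    await asyncio.sleep(0)  # a point where the loop may switch tasks
    return f"{name} finished"

async def main():
    # gather runs both coroutines concurrently and
    # returns their results in argument order
    return await asyncio.gather(my_task("task1"), my_task("task2"))

results = asyncio.run(main())
print(results)
```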

Async functions

One thing to keep in mind while writing asynchronous functions in Python is that using async before def doesn't by itself mean that your function will run concurrently. If you take a normal function and add async in front of it, the event loop will run it without interruption, because you didn't specify where the loop is allowed to pause your function and run another coroutine. Specifying where the event loop may switch coroutines is really simple: every time you use the keyword await, the event loop can stop running your function and run another coroutine registered with the loop.

import asyncio

async def print_numbers_async1(n, prefix):
    for i in range(n):
        print(prefix, i)

async def print_numbers_async2(n, prefix):
    for i in range(n):
        print(prefix, i)
        if i % 5 == 0:
            await asyncio.sleep(0)

loop1 = asyncio.new_event_loop()
count1_1 = loop1.create_task(print_numbers_async1(10, 'c1_1'))
count2_1 = loop1.create_task(print_numbers_async1(10, 'c2_1'))
loop1.run_until_complete(asyncio.wait([count1_1, count2_1]))
loop1.close()

loop2 = asyncio.new_event_loop()
count1_2 = loop2.create_task(print_numbers_async2(10, 'c1_2'))
count2_2 = loop2.create_task(print_numbers_async2(10, 'c2_2'))
loop2.run_until_complete(asyncio.wait([count1_2, count2_2]))
loop2.close()

If we execute this code we will see that loop1 first prints all the numbers with prefix c1_1 and then those with prefix c2_1, while in the second loop the two tasks take turns, switching every time i is a multiple of 5.

Real world example

Now that we know the basics of asynchronous programming in python let's write some more realistic code which will download a list of pages from the internet and print a preview containing the first 3 lines of the page.

import aiohttp
import asyncio

async def print_preview(url):
    # connect to the server
    async with aiohttp.ClientSession() as session:
        # create get request
        async with session.get(url) as response:
            # wait for the response body
            text = await response.text()

            # print the first 3 non-empty lines
            lines = [line for line in text.split('\n') if len(line) > 0]
            print('-'*80)
            for line in lines[:3]:
                print(line)
            print()

def print_all_pages():
    pages = [
        'http://textfiles.com/adventure/amforever.txt',
        'http://textfiles.com/adventure/ballyhoo.txt',
        'http://textfiles.com/adventure/bardstale.txt',
    ]

    tasks =  []
    loop = asyncio.new_event_loop()
    for page in pages:
        tasks.append(loop.create_task(print_preview(page)))

    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()

This code should be pretty easy to understand. We start by creating an asynchronous function which downloads a URL and prints its first 3 non-empty lines. Then we create a function which, for each page in a list of pages, calls print_preview, adds the coroutine to the loop, and stores the future in a list of tasks. Finally, we run the event loop, which runs the coroutines we added to it and prints the preview of all the pages.
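
To see the downloads overlapping without touching the network, the sketch below swaps the HTTP request for asyncio.sleep. The `fake_download` coroutine, its names, and its delays are invented for the illustration; only the scheduling behaviour carries over to the aiohttp version.

```python
import asyncio

async def fake_download(name, delay, log):
    log.append(f"start {name}")
    # stands in for waiting on the network
    await asyncio.sleep(delay)
    log.append(f"done {name}")

async def main():
    log = []
    await asyncio.gather(
        fake_download("a", 0.09, log),
        fake_download("b", 0.05, log),
        fake_download("c", 0.01, log),
    )
    return log

log = asyncio.run(main())
print(log)
```

All three downloads start before any of them finishes, and the quickest one completes first even though it was started last.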

Async generator

The last feature I want to talk about is the asynchronous generator. Implementing one is quite simple: write an async function that uses yield, then consume it with async for.

import asyncio
import random

async def is_prime(n):
    if n < 2:
        # 0 and 1 are not prime
        return False
    for i in range(2, n):
        # allow event_loop to run other coroutine
        await asyncio.sleep(0)
        if n % i == 0:
            return False
    return True

async def prime_generator(n_prime):
    counter = 0
    n = 0
    while counter < n_prime:
        n += 1
        # wait for is_prime to finish
        prime = await is_prime(n)
        if prime:
            yield n
            counter += 1

async def check_email(limit):
    for i in range(limit):
        if random.random() > 0.8:
            print('1 new email')
        else:
            print('0 new email')
        await asyncio.sleep(2)

async def print_prime(n):
    async for prime in prime_generator(n):
        print('new prime number found:', prime)

def main():
    loop = asyncio.new_event_loop()
    prime = loop.create_task(print_prime(3000))
    email = loop.create_task(check_email(10))
    loop.run_until_complete(asyncio.wait([prime, email]))
    loop.close()
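
Stripped down to its essentials, the pattern looks like the sketch below. The `countdown` generator is a made-up example; it also shows that an async generator can feed an async list comprehension as well as an async for loop.

```python
import asyncio

async def countdown(start):
    for n in range(start, 0, -1):
        # let other coroutines run between values
        await asyncio.sleep(0)
        yield n

async def main():
    seen = []
    async for n in countdown(3):
        seen.append(n)
    # async comprehensions consume async generators too
    doubled = [n * 2 async for n in countdown(3)]
    return seen, doubled

seen, doubled = asyncio.run(main())
print(seen, doubled)
```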

Exception handling

When an unhandled exception is raised inside a coroutine, it doesn't break our program as it would in normal synchronous code. Instead, it's stored inside the future, and if you never retrieve it before the program exits you will get the following error

Task exception was never retrieved

There are two ways to fix this: catch the exception when you access the future's result, or call the future's exception method.

try:
    # this will raise the exception raised during the coroutine execution
    my_promise.result()
except Exception:
    pass

# this will return the exception raised during the coroutine execution
my_promise.exception()
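
Here is a runnable sketch that exercises both approaches on a task that always fails. The `failing` coroutine is hypothetical, and `asyncio.run` assumes Python 3.7 or later.

```python
import asyncio

async def failing():
    raise ValueError("boom")

async def main():
    task = asyncio.ensure_future(failing())
    # wait returns once the task has finished; it does not re-raise
    await asyncio.wait([task])

    # option 1: ask the future for the stored exception
    err = task.exception()

    # option 2: result() re-raises the stored exception
    try:
        task.result()
    except ValueError as exc:
        caught = exc
    return err, caught

err, caught = asyncio.run(main())
print(repr(err), err is caught)
```

Because the exception was retrieved, the "Task exception was never retrieved" message is not logged.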

Going deeper

If you have read everything up to this point you should know how to use asyncio to write concurrent code, but if you wish to go deeper and understand how asyncio works I suggest you watch the following video: youtube.com/watch?v=M-UcUs7IMIM

If you would like to see more complex uses of asyncio, or if you have any questions, leave a comment and I will reply as soon as possible.

Top comments (13)

Pankaj Kumar Katiyar

This is so far the best and most informative article on async concurrency in python. Love it. Thanks a lot for posting online.

Pierre • Edited

Hello and thanks for this great article.

Just one remark about multithreading though.
Each of Python threads will not execute line 2 at the same time. But something like this can happen:
thread 1 execute line 2 -> thread 2 execute line 2 -> thread 1 execute line 3 -> thread 2 execute line 3 !!! (size overflow)
Race condition.

Because of the GIL (Global Interpreter Lock), no two threads can execute Python code at the same time when doing multithreading.

Andrea Benfatti

That's totally true. I didn't talk about it because it wasn't the main topic. Anyway, on this subject you can use processes instead of threads, which don't suffer from the GIL, but I think they bring some overhead with them.

Mapogo Lekgwathi

I think this is somewhat unhelpful, what if I write mainly synchronous code and want some tasks to execute asynchronously? (I don't care about what the async function returns)

E.g I want to run a http request in the background every time something happens, but I don't want to create a thread, and run it

Andrea Benfatti

Unfortunately, I think that what you are asking is not possible: if the program runs in a single process/thread, it needs to know when to switch between tasks. I think the best solution for your problem is to use multiple threads, which keeps the code really simple, and you don't need to decide when to switch context, with just minimal overhead.

Eljay-Adobe

On a related note: pythonclock.org/

Andrea Benfatti

cannot wait for it! I have a strong unmotivated hate for python2

Eljay-Adobe

I started using Python 2 around 2001, and had a very long laundry list of Things I Hate About Python.

When Python 3 came out, I had all but given up on Python. Took a look at Python 3, and out of my list of hate, everything-except-one-thing was fixed in Python 3.

The only thing that wasn't "fixed" was offside rule. I know offside rule is part of the heart and soul of Python. It will never change. Even after all this time, it still feels like fingernails on a chalkboard to me.

I can accept that one point of agree-to-disagree. (Heck, my Things I Hate About C++ list is about x1000 times longer than my Python 2 hate list. Python 3 with but a single hate item is bliss in comparison to C++!)

Lauri Elias

asycnio <- typo

Andrea Benfatti

fixed, thanks

Abdur-Rahmaan Janhangeer

I suggest you watch the following video

which one?

Andrea Benfatti

Sorry, I may have forgotten to copy it from my website. This is the video, I will update the post as well
youtube.com/watch?v=M-UcUs7IMIM

Andrea Benfatti

I cannot see the link anymore, anyway I didn't talk about ensure_future to avoid creating confusion and so far I have never needed it.