
Muhammad Faizan Asghar


Why is My Multi-Threaded API Still Slow?

I'm facing an issue with my API, and I'm hoping someone can help. Despite adding multi-threading, the performance gains are far from what I expected. Ideally, if one thread takes 1 second to complete a task, then 10 threads running concurrently should also take about 1 second (that's my understanding). However, my API response times are still very slow.
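As a minimal sketch of that expectation (not code from the project; `fake_io_task` and `run_in_threads` are hypothetical names, and `time.sleep` stands in for a blocking I/O wait), ten threads that each wait 0.2 s should finish in roughly 0.2 s total. This holds for waiting work because `time.sleep` releases the GIL; in standard CPython it generally does not hold for CPU-bound pure-Python work, where only one thread executes bytecode at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id: int, delay: float = 0.2) -> int:
    """Stand-in for a blocking I/O call (assumption: sleep models the wait)."""
    time.sleep(delay)  # releases the GIL, so other threads can run meanwhile
    return task_id


def run_in_threads(n_tasks: int = 10, delay: float = 0.2):
    """Run n_tasks waits in parallel threads; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        # pool.map preserves input order in its results
        results = list(pool.map(lambda i: fake_io_task(i, delay), range(n_tasks)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads()
    print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```

If the measured time is close to `n_tasks * delay` instead, the tasks are being serialized somewhere, which is the symptom described in this post.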

The Problem

I'm using FastAPI along with libraries like Playwright, MongoDB, and ThreadPoolExecutor. The goal was to use threading for CPU-bound tasks and async-await for IO-bound tasks. Still, my response times are not improving as expected.

Book Automation Example

One part of my project involves automating book queries using Playwright to interact with an EPUB viewer. The following function uses Playwright to open a browser, navigate to a book's page, and perform searches:

from playwright.async_api import async_playwright
import asyncio

async def search_with_playwright(search_text: str, book_id: str):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        book_id = book_id.replace("-1", "")
        book_url = f"http://localhost:8002/book/{book_id}"
        await page.goto(book_url)
        await page.fill("#searchInput", search_text)
        await page.wait_for_selector("#searchResults")
        search_results = await page.evaluate('''
            () => {
                let results = [];
                document.querySelectorAll("#searchResults ul li").forEach(item => {
                    let excerptElement = item.querySelector("strong:nth-of-type(1)");
                    let cfiElement = item.querySelector("strong:nth-of-type(2)");

                    if (excerptElement && cfiElement) {
                        let excerpt = excerptElement.nextSibling ? excerptElement.nextSibling.nodeValue.trim() : "";
                        let cfi = cfiElement.nextSibling ? cfiElement.nextSibling.nodeValue.trim() : "";
                        results.push({ excerpt, cfi });
                    }
                });
                return results;
            }
        ''')
        await browser.close()
        return search_results

The function above is meant to be async to avoid blocking other tasks. However, even with this async setup, the performance is still not as expected.
Note: I measured the time taken to open a single book and run a query against it at approximately 0.0028s.
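To illustrate what "async to avoid blocking other tasks" means in practice, here is a small sketch that does not use Playwright at all. `fake_search` is a hypothetical stand-in whose `asyncio.sleep` models the time spent waiting on the browser, and `search_many` and `timed_search` are illustrative names. It shows that coroutines only overlap when they are scheduled together, for example with `asyncio.gather`; awaiting them one after another would take the sum of the delays.

```python
import asyncio
import time


async def fake_search(search_text: str, delay: float = 0.2) -> dict:
    """Hypothetical stand-in for an async browser search."""
    await asyncio.sleep(delay)  # yields to the event loop while "waiting"
    return {"query": search_text, "hits": []}


async def search_many(queries, delay: float = 0.2):
    """Schedule all searches at once so their waits overlap."""
    return await asyncio.gather(*(fake_search(q, delay) for q in queries))


def timed_search(queries, delay: float = 0.2):
    """Run the searches and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(search_many(queries, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_search(["alpha", "beta", "gamma", "delta", "epsilon"])
    print(f"{len(results)} searches in {elapsed:.2f}s")
```

With five queries and a 0.2 s delay, the overlapping version should finish in well under the 1 s that sequential awaits would need.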

Refactor Example

I used run_in_executor() to execute functions in ProcessPoolExecutor, trying to avoid the GIL and properly manage workloads.

from concurrent.futures import ThreadPoolExecutor, as_completed

async def query_mongo(query: str, id: str):
    query_vector = generate_embedding(query)

    results = db[id].aggregate([
        {
            "$vectorSearch": {
                "queryVector": query_vector,
                "path": "embedding",
                "numCandidates": 2100,
                "limit": 50,
                "index": id
            }
        }
    ])

    # Helper function for processing each document
    def process_document(document):
        try:
            chunk = document["chunk"]
            chapter = document["chapter"]
            number = document["chapter_number"]
            book_id = id

            results =, book_id))
            return {
                "content": chunk,
                "chapter": chapter,
                "number": number,
                "results": results,
            }
        except Exception as e:
            print(f"Error processing document: {e}")
            return None

    # Using ThreadPoolExecutor for concurrency
    all_data = []
    with ThreadPoolExecutor() as executor:
        futures = {executor.submit(process_document, doc): doc for doc in results}

        for future in as_completed(futures):
            try:
                result = future.result()
                if result:  # Append result if it's not None
                    all_data.append(result)
            except Exception as e:
                print(f"Error in future processing: {e}")

    return all_data
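For reference, the `run_in_executor()` pattern mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `blocking_lookup` and `process_all` are hypothetical names, `time.sleep` stands in for the blocking per-document work, and a `ThreadPoolExecutor` is used so the example stays self-contained. Swapping in a `ProcessPoolExecutor` would additionally require the submitted function and its arguments to be picklable, which rules out nested functions such as `process_document`.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(doc_id: int, delay: float = 0.2) -> dict:
    """Hypothetical blocking call (stand-in for per-document processing)."""
    time.sleep(delay)
    return {"doc_id": doc_id}


async def process_all(doc_ids, delay: float = 0.2):
    """Offload blocking calls to a pool and await them without blocking the loop."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max(1, len(doc_ids))) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_lookup, doc_id, delay)
            for doc_id in doc_ids
        ]
        # gather returns results in the same order as the submitted futures
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(process_all([1, 2, 3])))
```

The difference from waiting on `as_completed()` inside an `async def` is that `await asyncio.gather(...)` hands control back to the event loop while the pool works, so other requests can be served in the meantime.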


Even after these changes, my API is still slow. What am I missing? Has anyone faced similar issues with Python's GIL, threading, or async setups? Any advice would be greatly appreciated!
