Felipe de Godoy

Posted on Jun 14

FastAPI for Data Applications: From Concept to Creation. Part I

#fastapi #dataengineering #datascience #llmops

In this blog post, we'll explore how to create an API using FastAPI, a modern, high-performance Python framework for building APIs. We will create a simple API that lets users add, update, and query items stored temporarily in memory. Along the way, we'll discuss how you can extend this example to expose machine learning models and perform online processing in decision engines, and which practices help keep the API robust and secure.

Pre-requisites: Installation of FastAPI and Uvicorn

Before diving into the code, we need to install FastAPI and Uvicorn, an ASGI server to run our application. Run the following command in your terminal:

pip install fastapi uvicorn

If you prefer, install these packages inside a virtual environment to isolate the project's dependencies.

Project Structure

We'll start by creating a file named main.py, which will contain all the code for our application.

1. Setting Up FastAPI

The first step is to import the necessary modules and initialize the FastAPI app.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List

app = FastAPI()

FastAPI: The core library for building the API.
HTTPException: Raised inside an endpoint to return an HTTP error response, such as a 404.
BaseModel: Part of Pydantic, used for data validation.
List: Used for type hinting.

2. Defining the Data Model

Next, we'll define the data model using Pydantic's BaseModel. This model will describe the structure of the data we want to store and process.

from typing import Optional

class Item(BaseModel):
    id: int
    name: str
    description: Optional[str] = None  # optional field; defaults to None

Item: Our data model with three fields: id, name, and description. The description field is optional.
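To see what the model buys us, here is a minimal sketch of Pydantic validating data on its own, outside of any endpoint. The sample values are made up for illustration; the behavior shown (a default of None for the optional field and a ValidationError for a bad type) is standard Pydantic.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    id: int
    name: str
    description: Optional[str] = None  # optional; defaults to None


# A payload without a description is accepted.
item = Item(id=1, name="Item 1")
print(item.description)  # None

# A payload with a non-numeric id is rejected before it reaches your code.
try:
    Item(id="not-a-number", name="Broken item")
except ValidationError as exc:
    error_count = len(exc.errors())
    print(f"validation failed with {error_count} error(s)")
```

FastAPI runs this same validation on every request body typed as Item and answers with a 422 response when it fails.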

3. In-Memory Database

For simplicity, we'll use a list to simulate a database. In a real application, this is where you would plug in the connector for your own data store.

fake_db: List[Item] = []

fake_db: This list will store instances of Item during the application's runtime.
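A list keeps the example simple, but every lookup has to scan it. As a hedged alternative (not what the rest of this post uses), the store could be a dictionary keyed by item ID. The names fake_db_by_id, save_item, and find_item are hypothetical and exist only for this sketch.

```python
from typing import Dict, Optional

from pydantic import BaseModel


class Item(BaseModel):
    id: int
    name: str
    description: Optional[str] = None


# Hypothetical alternative store: item id -> Item
fake_db_by_id: Dict[int, Item] = {}


def save_item(item: Item) -> None:
    # Insert or overwrite the item stored under its id
    fake_db_by_id[item.id] = item


def find_item(item_id: int) -> Optional[Item]:
    # Constant-time lookup; returns None when the id is unknown
    return fake_db_by_id.get(item_id)


save_item(Item(id=1, name="Item 1"))
found = find_item(1)
missing = find_item(99)
```

The trade-off is that the dictionary only supports lookups by ID directly; the list version is easier to return as-is from the "retrieve all" endpoint.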

4. Define the Root Endpoint

We'll start by defining a simple root endpoint to ensure our application is running.

@app.get("/")
async def root():
    return {"message": "Index Endpoint"}

This endpoint returns a JSON message saying "Index Endpoint" when accessed.

5. Implementing API Endpoints

a) Retrieve All Items

To retrieve all items in our "database", we'll define a GET endpoint.

@app.get("/items/", response_model=List[Item])
async def get_items():
    return fake_db

This endpoint returns the list of all items.

b) Retrieve a Specific Item

To retrieve a specific item by its ID:

@app.get("/items/{item_id}", response_model=Item)
async def get_item(item_id: int):
    for item in fake_db:
        if item.id == item_id:
            return item
    raise HTTPException(status_code=404, detail="Item not found")

item_id: The item's ID passed as a path parameter.
This function searches the list for an item with the given ID and returns it if found. If not, it raises a 404 error.

c) Create a New Item

To add new items to our list:

@app.post("/items/", response_model=Item)
async def create_item(item: Item):
    for existing_item in fake_db:
        if existing_item.id == item.id:
            raise HTTPException(status_code=400, detail="Item already exists")
    fake_db.append(item)
    return item

create_item: This function checks if an item with the same ID already exists. If it does, it raises a 400 error. Otherwise, it adds the item to the list.

d) Update an Existing Item

To update an existing item by its ID:

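@app.put("/items/{item_id}", response_model=Item)
async def update_item(item_id: int, updated_item: Item):
    # Reject a body whose id disagrees with the path, so stored ids stay consistent
    if updated_item.id != item_id:
        raise HTTPException(status_code=400, detail="Item ID in body does not match path")
    for idx, existing_item in enumerate(fake_db):
        if existing_item.id == item_id:
            fake_db[idx] = updated_item
            return updated_item
    raise HTTPException(status_code=404, detail="Item not found")

update_item: The function replaces an item if it exists and otherwise raises a 404 error. It also raises a 400 error when the ID in the request body differs from the ID in the path, which would otherwise leave an item stored under a mismatched ID.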

Running the Application

Save your file as main.py. To run the application, execute the following command:

uvicorn main:app --reload

The --reload flag enables auto-reloading, meaning the server will restart whenever you make changes to the code. This is useful during development.

Testing Your API

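You can test the API with tools like curl or Postman, or through the interactive documentation that FastAPI serves at http://127.0.0.1:8000/docs. A web browser is enough for the GET endpoints. Here are some curl commands for testing: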

a) Add New Items

curl -X POST "http://127.0.0.1:8000/items/" -H "Content-Type: application/json" -d '{"id": 1, "name": "Item 1", "description": "This is item 1"}'

curl -X POST "http://127.0.0.1:8000/items/" -H "Content-Type: application/json" -d '{"id": 2, "name": "Item 2", "description": "This is item 2"}'

b) Update an Item

curl -X PUT "http://127.0.0.1:8000/items/1" -H "Content-Type: application/json" -d '{"id": 1, "name": "Updated Item 1", "description": "This is the updated item 1"}'

c) Retrieve All Items

curl -X GET "http://127.0.0.1:8000/items/"

d) Retrieve a Specific Item

curl -X GET "http://127.0.0.1:8000/items/1"

Advanced Use Cases

Exposing a Machine Learning Model

One of the most exciting applications of FastAPI is exposing machine learning models. Imagine you want to serve a trained model for real-time predictions. Here’s an example of how to load a machine-learning model and expose it via an endpoint.

Firstly, ensure you have a trained machine-learning model stored in a pickled file. For this example, we'll assume you have a model.pkl file.

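import pickle

# Load the model once at startup (scikit-learn must be installed to unpickle it).
# Only unpickle files you trust: pickle can execute arbitrary code on load.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict/")
async def predict(data: List[float]):
    # predict() expects a 2D input, so wrap the single sample in a list
    prediction = model.predict([data])
    # Convert the NumPy result to plain Python types so it can be serialized to JSON
    return {"prediction": prediction.tolist()[0]}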

Data Validation: Validate input data properly to avoid unexpected errors. Pydantic models can help structure the input data for the prediction endpoint.
Performance: Be mindful of the model's loading time and how often it's accessed. For high-frequency endpoints, consider model optimization and caching mechanisms.
Risks: Validate inputs before they reach the model (for example, the expected number and type of features), and only load model files from a trusted source, since unpickling a file can execute arbitrary code.

Online Processing in a Decision Engine

Another practical scenario is online data processing behind an API. For example, a financial application might need to make a decision on each piece of incoming data as it arrives, such as approving or declining a transaction.

@app.post("/process/")
async def process_transaction(transaction: dict):
    # Perform some real-time computation/decision
    decision = "approved" if transaction["amount"] < 1000 else "declined"
    return {"transaction_id": transaction["transaction_id"], "decision": decision}

Efficiency: Keep the decision logic lightweight, since the endpoint may need to handle a high volume of transactions in real time.
Consistency: Maintain the consistency and idempotence of your decision logic to handle scenarios where the same request might be processed multiple times.
Security: Always validate incoming data to protect against malicious payloads.

Conclusion

In this blog post, we've built a simple API using FastAPI that allows you to add, update, and fetch items stored in memory. Additionally, we discussed how you can extend this architecture to expose machine learning models and perform online processing in decision engines.

This example serves as a foundational understanding of how to use FastAPI for basic CRUD operations and more advanced use cases. For more complex scenarios, consider connecting to a real database, implementing robust authentication, and ensuring high availability.

Feel free to explore and expand upon this example to suit your needs. Happy coding!

Looking Ahead to Part 2: Dockerizing and Running in Kubernetes

In the next part of this series, we will prepare this FastAPI application for production deployment. We will walk through Dockerizing the application so that it runs consistently and portably in a container. Then we'll orchestrate it with Kubernetes for scalability and reliability: writing the Kubernetes manifests, deploying the application to a cluster, and managing its lifecycle in a cloud-native environment. Stay tuned for Part 2.

Github repo: https://github.com/felipe-de-godoy/FastAPI-for-Data-Delivery

Credits: Image from "Real Python"
