DEV Community

Josh

Posted on • Originally published at joshchoo.com

How actix-web's application state and Data extractor work internally

When developing web servers, we sometimes need a mechanism to share application state, such as configuration or database connections. actix-web lets us inject shared data into the application state with the app_data method, then retrieve it and pass it to route handlers with the Data extractor.

Let's explore just enough to understand how it works under the hood!

Note: The examples are based on zero2prod and actix-web v4 source code.

The following is an example of an actix-web server that exposes a /subscriptions route. subscribeHandler accepts a Postgres database connection (PgConnection), which we added to the application state via the app_data method.

pub fn run(listener: TcpListener, connection: PgConnection) -> Result<Server, std::io::Error> {
    let connection = web::Data::new(connection);
    let server = HttpServer::new(move || {
        App::new()
            .route("/subscriptions", web::post().to(subscribeHandler))
            // vvvvv Store PgConnection app data here
            .app_data(connection.clone())
    })
    .listen(listener)?
    .run();
    Ok(server)
}

...

// Extract PgConnection from application state
pub async fn subscribeHandler(_connection: web::Data<PgConnection>) -> HttpResponse {
    //...
}

Question: Since it is possible to store other data types, how does actix-web know to retrieve and pass a PgConnection value to subscribeHandler instead of a different type?

Firstly, let's take a look at the web::post().to(...) implementation:

// in actix-web's src/web.rs

pub fn to<F, I, R>(handler: F) -> Route
where
    F: Handler<I, R>,
    I: FromRequest + 'static, // <--- Notice this!
    R: Future + 'static,
    R::Output: Responder + 'static,
{
    Route::new().to(handler)
}

We could dive deeper into the function calls, but the details get complex and aren't needed here. Instead, take note of the FromRequest bound on I, the type of the handler's argument.
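To make the role of that bound concrete, here is a minimal, synchronous sketch of the idea. It is not actix-web's implementation: Request, UserAgent, call_handler and greet are hypothetical names invented for illustration, and the toy FromRequest trait is far simpler than the real one. It only shows that the handler's argument type decides which from_request gets called.

```rust
use std::collections::HashMap;

// Toy stand-in for HttpRequest (hypothetical): just a bag of headers.
struct Request {
    headers: HashMap<String, String>,
}

// Simplified, synchronous analogue of actix-web's FromRequest trait.
trait FromRequest: Sized {
    fn from_request(req: &Request) -> Result<Self, String>;
}

// A hypothetical extractor that pulls out the "user-agent" header.
struct UserAgent(String);

impl FromRequest for UserAgent {
    fn from_request(req: &Request) -> Result<Self, String> {
        req.headers
            .get("user-agent")
            .cloned()
            .map(UserAgent)
            .ok_or_else(|| "missing user-agent".to_string())
    }
}

// The handler's argument type `I` selects which `from_request` runs.
fn call_handler<F, I, R>(handler: F, req: &Request) -> Result<R, String>
where
    F: Fn(I) -> R,
    I: FromRequest,
{
    let arg = I::from_request(req)?;
    Ok(handler(arg))
}

// A handler only declares what it needs in its signature.
fn greet(agent: UserAgent) -> String {
    format!("hello, {}", agent.0)
}

fn main() {
    let mut headers = HashMap::new();
    headers.insert("user-agent".to_string(), "curl".to_string());
    let req = Request { headers };
    println!("{:?}", call_handler(greet, &req));
}
```

Because greet takes a UserAgent, the compiler infers I = UserAgent and calls UserAgent::from_request. Nothing in call_handler names the extractor explicitly.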

Next, we shall take a look at web::Data<T>, which subscribeHandler takes as a parameter, and see that it implements the FromRequest trait with the from_request function:

// in actix-web's src/data.rs

pub struct Data<T: ?Sized>(Arc<T>);

impl<T: ?Sized + 'static> FromRequest for Data<T> {
    type Config = ();
    type Error = Error;
    type Future = Ready<Result<Self, Error>>;

    #[inline]
    fn from_request(req: &HttpRequest, _: &mut Payload) -> Self::Future {
        if let Some(st) = req.app_data::<Data<T>>() {
            ok(st.clone())
        } else {
            log::debug!(
                "Failed to construct App-level Data extractor. \
                 Request path: {:?} (type: {})",
                req.path(),
                type_name::<T>(),
            );
            err(ErrorInternalServerError(
                "App data is not configured, to configure use App::data()",
            ))
        }
    }
}

from_request calls req.app_data::<Data<T>>(), which returns an Option. If the data is found, from_request returns a clone of it (st). Otherwise, it returns an Internal Server Error.

So, how exactly does calling req.app_data::<Data<T>> return the correct PgConnection data that we specified in the generic type parameter?

Let's look at the app_data method implementation:

// in actix-web's src/request.rs

pub struct HttpRequest {
    pub(crate) inner: Rc<HttpRequestInner>,
}

pub(crate) struct HttpRequestInner {
    pub(crate) app_data: SmallVec<[Rc<Extensions>; 4]>,
}

impl HttpRequest {
    pub fn app_data<T: 'static>(&self) -> Option<&T> {
        // container has type `Extensions`
        for container in self.inner.app_data.iter().rev() {
            if let Some(data) = container.get::<T>() {
                return Some(data);
            }
        }

        None
    }
}

The incoming HttpRequest holds a list of app_data containers. The method iterates over them in reverse order and calls container.get::<T>() on each one until a value is found.
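The reverse iteration means that the container added last is consulted first. The following sketch isolates just that loop. It is a simplification under stated assumptions: Layer and lookup are hypothetical names, and string keys stand in for the type-based keys that the real containers use.

```rust
use std::collections::HashMap;

// Each layer stands in for one container. String keys are a
// simplification; the real containers are keyed by type.
type Layer = HashMap<&'static str, &'static str>;

// Walk the layers from last to first and return the first hit,
// mirroring the shape of the loop in HttpRequest::app_data.
fn lookup<'a>(layers: &'a [Layer], key: &str) -> Option<&'a str> {
    for layer in layers.iter().rev() {
        if let Some(value) = layer.get(key) {
            return Some(*value);
        }
    }
    None
}

fn main() {
    let outer: Layer = HashMap::from([("db", "primary"), ("cache", "redis")]);
    let inner: Layer = HashMap::from([("db", "replica")]);
    let layers = vec![outer, inner];
    println!("{:?}", lookup(&layers, "db"));
    println!("{:?}", lookup(&layers, "cache"));
}
```

A key present in several layers resolves to the value from the last layer, while a key present only in an earlier layer is still found.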

// in actix-http's src/extensions.rs

pub struct Extensions {
    map: AHashMap<TypeId, Box<dyn Any>>,
}

impl Extensions {
    pub fn get<T: 'static>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref())
    }
}

get::<T>() tries to get the value at the key TypeId::of::<T>() from the hashmap, map. What does TypeId::of::<T>() do?

// in Rust's core library, library/core/src/any.rs

/// A `TypeId` represents a globally unique identifier for a type.
pub struct TypeId {
    t: u64,
}

impl TypeId {
    /// Returns the `TypeId` of the type this generic function has been
    /// instantiated with.
    pub const fn of<T: ?Sized + 'static>() -> TypeId {
        TypeId { t: intrinsics::type_id::<T>() }
    }
}

Aha! Given a type, such as PgConnection, Rust lets us obtain a unique value (a u64 in the version shown above) that identifies it! Hence, calling TypeId::of::<PgConnection>() gives us a value that we can use as a key for the hashmap inside app_data.
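A quick way to see this behaviour is to compare TypeId values directly. In this small sketch, DbPool, Config and same_type are hypothetical names used only for illustration.

```rust
use std::any::TypeId;
use std::sync::Arc;

// Two hypothetical state types, used only for illustration.
struct DbPool;
struct Config;

// True when A and B are the same type, judged purely by TypeId.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    // The same type always yields the same TypeId.
    println!("{}", same_type::<DbPool, DbPool>());
    // Distinct types yield distinct TypeIds.
    println!("{}", same_type::<DbPool, Config>());
    // A wrapper is a different type from the value it wraps.
    println!("{}", same_type::<Arc<DbPool>, DbPool>());
}
```

The last comparison matters for lookups: a value stored inside a wrapper is keyed by the wrapper's type, not by the inner type.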

But wait a moment! Inside these functions, isn't T just a generic type parameter? So, at runtime, how does Rust ensure that calling TypeId::of::<T>() gives us the value associated with PgConnection and not some other type?

Rust generates different copies of generic functions during compilation for each concrete type needed, such as PgConnection. This process is known as monomorphization. Given that we use web::Data<PgConnection>, Rust will compile copies of the from_request, app_data, get and TypeId::of functions that are specific to the PgConnection type.

At runtime, when a request hits the /subscriptions route, the PgConnection-specific version of the TypeId::of function will return a concrete TypeId value associated with PgConnection. This value is then used to search app_data's internal hashmap and return the stored PgConnection data that we added to the application state.
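Monomorphization can be observed with any generic function whose result depends only on its type parameter. The sketch below uses a hypothetical describe function: it takes no arguments, yet each instantiation returns values that are fixed for that particular T.

```rust
use std::any::{type_name, TypeId};
use std::mem::size_of;

// One generic definition. The compiler emits a separate copy for each
// concrete T used, with that type's TypeId and size built in.
fn describe<T: 'static>() -> (TypeId, usize) {
    (TypeId::of::<T>(), size_of::<T>())
}

fn main() {
    // No runtime argument carries the type: it is chosen at compile time.
    println!("{}: {:?}", type_name::<u8>(), describe::<u8>());
    println!("{}: {:?}", type_name::<u64>(), describe::<u64>());
}
```

The describe::<u8> and describe::<u64> calls run different compiled functions, which is why no type information needs to be passed around at runtime.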

So, we've figured out how actix-web retrieves PgConnection from the application state. Now, to complete our understanding, let's look at how actix-web stores data in the application state:

// Server code

App::new()
    .route("/subscriptions", web::post().to(subscribeHandler))
    // vvvvv Store PgConnection app data here
    .app_data(connection.clone())

Digging into the app_data method's implementation:

// in actix-web's src/app.rs

impl App {
    pub fn app_data<U: 'static>(mut self, ext: U) -> Self {
        self.extensions.insert(ext);
        self
    }
}

App inserts the value (here, the PgConnection wrapped in web::Data) into its own extensions, which contains a hashmap, as we saw earlier:

// in actix-http's src/extensions.rs

impl Extensions {
    pub fn insert<T: 'static>(&mut self, val: T) -> Option<T> {
        self.map
            .insert(TypeId::of::<T>(), Box::new(val))
            .and_then(downcast_owned)
    }
}

Hey! It's the same Extensions type! The method inserts the value under the key TypeId::of::<T>(). Since we passed a web::Data<PgConnection>, T here is Data<PgConnection>, which is exactly the type that from_request looks up with req.app_data::<Data<T>>().
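Putting insert and get together, the whole mechanism fits in a few lines of standard-library Rust. This is a simplified stand-in rather than actix-http's code: TypeMap and Settings are hypothetical names, std's HashMap replaces AHashMap, and Arc plays the role of web::Data.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for Extensions: at most one value per type.
#[derive(Default)]
struct TypeMap {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMap {
    // Store `val` under its own type and return the value it replaced.
    fn insert<T: 'static>(&mut self, val: T) -> Option<T> {
        self.map
            .insert(TypeId::of::<T>(), Box::new(val))
            .and_then(|old| old.downcast::<T>().ok().map(|boxed| *boxed))
    }

    // Look up the value stored under T and downcast it back to &T.
    fn get<T: 'static>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

// Hypothetical application state.
struct Settings {
    port: u16,
}

fn main() {
    let mut state = TypeMap::default();
    state.insert(Arc::new(Settings { port: 8080 }));
    state.insert(String::from("first"));
    // Inserting a second String replaces the first one.
    let replaced = state.insert(String::from("second"));
    println!("{:?}", replaced);
    println!("{:?}", state.get::<Arc<Settings>>().map(|s| s.port));
}
```

Two consequences follow from keying by type. A second value of the same type replaces the first, and a value stored as Arc<Settings> can only be found by asking for Arc<Settings>, not for Settings.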

There we have it! actix-web manages application state by inserting values into, and retrieving them from, a hashmap whose keys are derived from TypeId::of::<T>(), which returns a unique identifier for each type. And this is made possible by monomorphization in Rust!

Oldest comments (4)

Rom's

Reading your article, I'm wondering: if I ever stored two PgConnection values in app data, for example two connections to different Postgres databases, would there be some kind of type conflict with app_data?

fakeshadow • Edited

Data is type-based, so you can always use a newtype for uniqueness, like PgConnection1(PgConnection) and PgConnection2(PgConnection). After that, you can impl Deref and DerefMut to auto-dereference to the inner type.

Josh • Edited

Indeed! If we add multiple values of the same type via .app_data(...), the most recently inserted one overrides the rest. Here's the relevant chapter from The Rust Programming Language book that @fakeshadow is talking about:

doc.rust-lang.org/stable/book/ch15...

Rom's

utterly makes sense actually, thanks @fakeshadow @joshchoo !