Fauna for Fauna, Inc.

Posted on Jul 25, 2022 • Originally published at fauna.com

Modernization of the database: DynamoDB to Fauna

#serverless #fauna #aws #database

Serverless databases like Fauna fill a crucial role for modern applications. Given the amount of cloud-based traffic and the varying workloads that organizations from startups to enterprises manage, serverless databases are a natural fit for the rest of your serverless architecture, adding seamless flexibility, scaling, and low latency for your applications.

DynamoDB has been one of the most popular serverless databases. In addition to coming from AWS, it has benefited from being one of the earliest offerings in this space. Its popularity is hard to dispute, backed up by its proven ability to effectively handle traffic spikes and workload variations without overloading infrastructure or racking up unnecessary costs during periods of low traffic. Though DynamoDB has much to offer, Fauna is a much newer offering that comes with a host of powerful and unique features that enhance the serverless experience for organizations of all sizes.

In this article, we’ll learn about serverless databases, compare key differences between DynamoDB and Fauna, and provide insight on which one to choose for your next big project.

What makes a good serverless database?

A great serverless database should offer the following values:

Lower operation costs: One of the main reasons for choosing software is cost. Traditionally, organizations need to budget for the continuous management of their database(s) and infrastructure, and not just for application development. Serverless architectures promise “zero ops,” which eliminates these management concerns. Because headcount is often your highest spend, trimming operations overhead ultimately lowers your TCO (total cost of ownership).
Pay as you go: It should also offer a simple and transparent pricing model without any traps or unexpected costs. You should only have to pay for the resources you use, with billing being proportional to the amount of data stored and volume of transactions sent to the database.
High availability: There should be no maintenance windows or unplanned downtimes. Data and compute resources should be automatically distributed and replicated, offering high durability and resiliency to region/zone outages. The more these attributes are provided to the operator without any additional configuration, the better.
Low latency: Ensuring a good user experience is key to user engagement, ultimately impacting how successful your products are with customers. Serverless architectures should not pay for the ability to scale elastically with cold-start delays or with increased latency due to speed-of-light constraints. A great serverless database should be always on, scale instantly, and provide multiple region choices so that it can be accessed as close to your compute resources as possible.

Why should you choose Fauna?

Serverless databases are critical components for managing unpredictable and rapidly changing workloads, essentially following a pay-as-you-use model. They are also a good fit for companies with a small workforce, enabling them to consume compute workloads and infrastructure without manual overhead. Using a serverless database allows you to simplify database operations and eliminate problems common with traditional databases, such as maintenance, upgrades and patching, and cost of operations.

Fauna is accessed as an API, and there is nothing to install, maintain, or operate. You can deploy a database in three button clicks, start coding, and immediately connect to the database. You can scale databases without any limitations and create an unlimited number of databases. It is chock full of cloud-native features: you can log in with your GitHub account and integrate with third-party services like Auth0, Netlify, and Vercel, and it has built-in streaming, user authentication, and fine-grained authorization. It supports user-defined functions, similar to stored procedures, which help eliminate duplicated code and keep Fauna Query Language (FQL) logic consistent. And it has a native GraphQL API, allowing organizations adopting GraphQL to get up and running with a data source in minutes.

Fauna vs. DynamoDB

Fauna offers numerous features that distinguish it from DynamoDB and similar serverless databases. The following is a comparison between Fauna and DynamoDB and the components that each offers.

Geo-distribution

Businesses with global audiences need to ensure fast response and high performance for customers in multiple regions across the globe. This requires a database that allows data to be replicated globally and served from specified locations — the closer the proximity of the data, the better the experience and performance.

By default, DynamoDB is replicated across multiple availability zones in a single region. With the “global tables” feature, however, DynamoDB provides a fully managed solution for deploying a multi-region, multi-active database. You specify the regions you want the database replicated to, and DynamoDB propagates all data changes to them. Besides reducing read latency, multi-region replication ensures that a single-region failure does not result in an outage.

Using global tables comes with quite a few caveats. Your application can read and write to any replica table, but if it requires strongly consistent reads, it must perform all its strongly consistent reads and writes against the same region. AWS documentation states that any newly written item is propagated to all replica tables within a second (not an insignificant latency) and that ACID guarantees only apply within the AWS Region where the write was originally made. When an AWS region suffers an outage or degradation, your application can redirect to a healthy region, but the redirect isn’t automatic: AWS documentation prompts developers to apply custom business logic to determine when to redirect requests to other regions.

Global tables are also more costly to operate. Your costs multiply with every replica you add. In addition, global tables must use either the higher-priced on-demand capacity mode or provisioned capacity with auto-scaling.
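To make the "custom business logic" caveat concrete, here is a minimal sketch of the kind of region-selection code an application might carry when it uses global tables. The function name, the region list, and the hard-coded health data are all assumptions made for illustration; a real application would probe each regional endpoint and would likely add retries and backoff.

```python
from typing import Callable, Sequence


def pick_region(preferred: Sequence[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy region, scanning in preference order."""
    for region in preferred:
        if is_healthy(region):
            return region
    raise RuntimeError("no healthy region available")


# Hypothetical health data standing in for real endpoint probes.
health = {"us-east-1": False, "us-west-2": True, "eu-west-1": True}

chosen = pick_region(
    ["us-east-1", "us-west-2", "eu-west-1"],
    lambda region: health.get(region, False),
)
```

With the sample data above, the preferred region is marked unhealthy, so the request would be redirected to the next region in the list.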

Unlike DynamoDB, Fauna is multi-region-distributed by default: every database you create is replicated and distributed across geographic regions. Fauna also provides Region Groups, which allow developers to control which regions their data resides in. Other than selecting the Region Group, there is no additional configuration and no custom business logic to implement. When your application makes a request to the Fauna API, it is automatically routed to the closest region. Reads are served immediately out of that region, and writes are automatically propagated to the replicas in the other regions. Fauna publishes its “internal latency,” separated into reads and writes, on status.fauna.com. There is no separate pricing scheme for single- vs. multi-region with Fauna, since your databases are always multi-region, which keeps pricing straightforward and transparent.

Transactionality

While you want to ensure the real-time availability of data, your data needs to be consistent and accurate. Not all serverless databases can provide this.

DynamoDB, for instance, offers serializable read and write transactions but is only ACID-compliant within the region where the transaction occurs. This is a problem for multi-region deployments, because it can lead to errors during transactions that run simultaneously in different regions.

One of Fauna’s most touted innovations is its approach to isolation and distribution, which is derived from Calvin, an algorithm for achieving distributed consistency at scale. In oversimplified terms, queries are sequenced before they interact with the storage system, and this deterministic ordering, which avoids the need for expensive atomic clocks, provides serializability, ensuring that no two concurrent transactions can commit to the same data. Thanks to its Calvin implementation, Fauna is one of the few serverless databases that offer ACID guarantees while being globally distributed. Transactions are globally distributed, ACID-compliant, and serializable, with no additional configuration or separate pricing scheme.
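The idea of deterministic ordering can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Fauna's implementation: every transaction is assumed to carry an agreed sequence number (`txn_id`, a hypothetical field), and every replica applies transactions strictly in that order regardless of the order in which they arrived.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass(frozen=True)
class Txn:
    txn_id: int                      # position in the agreed global order
    apply: Callable[[State], None]   # deterministic state change


def transfer(src: str, dst: str, amount: int) -> Callable[[State], None]:
    """Build a transfer that only succeeds if the source has enough funds."""
    def run(state: State) -> None:
        if state[src] >= amount:
            state[src] -= amount
            state[dst] += amount
    return run


def replay(log: List[Txn], initial: State) -> State:
    """Apply transactions in sequence order, ignoring arrival order."""
    state = dict(initial)
    for txn in sorted(log, key=lambda t: t.txn_id):
        txn.apply(state)
    return state


initial = {"alice": 100, "bob": 0, "carol": 0}
t1 = Txn(1, transfer("alice", "bob", 80))
t2 = Txn(2, transfer("alice", "carol", 80))

# Two replicas receive the same transactions in different arrival orders.
replica_a = replay([t1, t2], initial)
replica_b = replay([t2, t1], initial)
```

Because both replicas apply the same log in the same order, they end in the same state: the first transfer succeeds and the second, conflicting one is rejected on both.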

Streaming

Real-time applications require an endpoint that enables persistent connectivity, allowing the server to push information to the client requesting it.

With DynamoDB, you’ll need to combine several services in order to implement streaming. The first part is setting up a DynamoDB Stream: DynamoDB integrates with AWS Lambda, where you set up triggers that respond to changes (DynamoDB Stream events) in your table(s). This takes a bit of configuration, plus some coding to write the Lambda function. From there, you need to implement the persistent endpoint. AWS offers several ways to do this: an EC2 instance, an API Gateway WebSocket API, or AppSync (via GraphQL streaming). Whichever you choose, these are additional components that need to be implemented, configured, and maintained. On top of that are the costs of running the additional services.

In contrast, Fauna’s API endpoints support streaming right out of the box. To consume a stream, all you need to do is write a few lines of code on the client side of your application using a driver in your language of choice. For example, Fauna’s documentation includes instructions for instantiating a stream client in JavaScript. Fauna offers two kinds of streaming: document streaming and set streaming. With document streaming, an event notification is sent to the subscriber whenever a document is created, updated, or deleted. With set streaming, events are sent whenever one or more documents enter or exit the set as the result of a create, delete, or update.
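The difference between the two kinds of streaming is easiest to see for sets. The following toy function is only an illustration of the concept, not Fauna's driver API: the event names "enter" and "exit" and the document IDs are hypothetical. It derives the events a subscriber would expect when the membership of a set changes.

```python
from typing import List, Set, Tuple


def set_events(before: Set[str], after: Set[str]) -> List[Tuple[str, str]]:
    """Return (event, document_id) pairs for documents entering or leaving a set."""
    entered = [("enter", doc_id) for doc_id in sorted(after - before)]
    exited = [("exit", doc_id) for doc_id in sorted(before - after)]
    return entered + exited


# Hypothetical set of "open orders" before and after a write.
open_orders_before = {"order-1", "order-2"}
open_orders_after = {"order-2", "order-3"}

events = set_events(open_orders_before, open_orders_after)
```

Note that a document which stays in the set (here `order-2`) produces no set event, even if its contents changed; observing that kind of change is what document streaming is for.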

Security

Because they are accessed as APIs, great serverless databases must provide robust security models, so that developers do not have to build their own authentication and authorization.

Access to DynamoDB is based on AWS’s highly robust Identity and Access Management (IAM), in which granular permissions apply to DynamoDB resources (tables, indexes, and streams). You can also specify IAM policies that grant access to perform specific actions (e.g., read, add, update, batch update, etc.) on specific resources for specific users.

Fauna supports both API keys and identity-based access tokens, and it integrates with external identity providers that support the OAuth flow (such as Auth0, Okta, OneLogin, etc.). Keys and tokens inherit the permissions of the roles to which they’re assigned. Roles are configured with granular permissions to access every single resource in the database, including collections (tables), indexes, user-defined functions, other roles, and schema. Fauna also provides attribute-based access control (ABAC), allowing permissions to be assigned dynamically based on the attributes of the identity behind a token. With ABAC, you can define custom business logic, creating dynamic rules that control access to all resources, all the way down to specific documents in a collection.
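As a rough illustration of what attribute-based rules look like, the sketch below models a role as a list of rules, each pairing an action and a collection with a predicate over the caller's identity and the target document. This is a conceptual model in plain Python, not Fauna's role syntax; the `Rule` class, the `is_allowed` helper, and the sample identities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Identity = Dict[str, Any]
Document = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    action: str
    collection: str
    predicate: Callable[[Identity, Document], bool]


def is_allowed(rules: List[Rule], identity: Identity, action: str,
               collection: str, document: Document) -> bool:
    """Grant access if any rule matches the action, collection, and attributes."""
    return any(
        rule.action == action
        and rule.collection == collection
        and rule.predicate(identity, document)
        for rule in rules
    )


rules = [
    # Anyone may read their own orders.
    Rule("read", "orders", lambda who, doc: doc["owner"] == who["id"]),
    # Managers may read every order in their own region.
    Rule("read", "orders",
         lambda who, doc: who.get("role") == "manager"
         and doc["region"] == who.get("region")),
]

alice = {"id": "alice", "role": "customer"}
bob = {"id": "bob", "role": "customer"}
mona = {"id": "mona", "role": "manager", "region": "eu"}
order = {"owner": "alice", "region": "eu"}
```

Here the decision depends on attributes of both the identity and the document, which is what allows access to be narrowed to individual documents rather than whole collections.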

Automated functionalities

In DynamoDB, a host of configuration options allow you to operate various capacity modes, single- vs. multi-region, types of indexes and index-partitioning strategy used, logging, and more. The tradeoff for this flexibility is more manual work and time.

Fauna, however, automates multiple functions for you, such as database provisioning, maintenance, sharding, scaling, and replication. The “zero ops” model of Fauna lets you focus on building the critical aspects of your applications without worrying about the complexity of a distributed architecture. This is especially useful if your business needs to carefully manage limited engineering resources while dealing with unpredictable workloads.

DynamoDB includes an automated backup feature. AWS charges for backups separately, and you choose between on-demand and continuous backups. Backups are optional and are configured per table.

In Fauna, you can schedule daily backups on any database in the form of snapshots of the entire data-tree. You can also configure a retention period for them. Databases can be “restored” (overridden) in place from any backup, or you can use backups to seed new databases. Fauna also supports temporality, allowing you to go back through history and query data at any arbitrary point in time. This is because in Fauna, data is stored as snapshots across time. You can use this feature to implement point-in-time recovery and targeted data repairs in your database.
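The point-in-time idea can be sketched with a toy versioned document. This is a simplified model for illustration only, not how Fauna stores data internally: the `VersionedDoc` class is hypothetical, and it assumes writes arrive with strictly increasing timestamps.

```python
import bisect
from typing import Any, List, Optional


class VersionedDoc:
    """Keeps every version of a document so it can be read as of any time."""

    def __init__(self) -> None:
        self._timestamps: List[int] = []
        self._versions: List[Any] = []

    def write(self, ts: int, value: Any) -> None:
        # Assumes timestamps are strictly increasing across writes.
        self._timestamps.append(ts)
        self._versions.append(value)

    def read_at(self, ts: int) -> Optional[Any]:
        """Return the latest version written at or before ts, if any."""
        index = bisect.bisect_right(self._timestamps, ts)
        return self._versions[index - 1] if index else None


doc = VersionedDoc()
doc.write(10, {"status": "new"})
doc.write(20, {"status": "paid"})

as_of_15 = doc.read_at(15)   # version that was current at time 15
as_of_5 = doc.read_at(5)     # before the first write, so nothing exists yet
```

A targeted repair then amounts to reading the value as of a time before the bad write and writing it back as a new version.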

GraphQL API support

GraphQL is designed to make API development faster and more flexible for software engineers. API developers use GraphQL to define a schema that describes all the data clients may want to query through a specific endpoint. Unlike most other serverless databases, such as DynamoDB and Upstash, Fauna provides a native GraphQL API for data access in addition to its query language, FQL.

If you are looking to launch your product in less time and expect a lot of changes on your API endpoint, then using a serverless database that supports GraphQL APIs will be the better choice for you.

💡 Check out our GraphQL workshop that introduces you to Fauna. In two hours, you’ll build an application in either Next.js or SvelteKit that has authentication, user-defined functions, custom resolvers, and more.

Scalability and pricing

With some serverless databases like DynamoDB, automatic scaling is an option but isn’t set by default. There are basically three modes in DynamoDB:

The default, called provisioned capacity, lets you set the volume and scale that you need up front. You then either monitor and manage these parameters manually or are automatically throttled once the load hits your predefined capacity. This mode is ideal when you have steady, predictable loads and need to stay within a narrow budget.

Provisioned capacity with auto-scaling is the same as provisioned capacity but also dynamically adjusts throughput capacity based on actual traffic, allowing your application to handle sudden bursts in traffic without being throttled. Pricing is a function of the baseline capacity you’ve configured and any load your application experiences above that baseline. It is important to note that because provisioned capacity (with or without auto-scaling) involves setting a baseline desired capacity, you are always paying for that capacity, even when your traffic is zero.

On-demand capacity is priced significantly higher (on a per-unit read/write/etc. basis) than provisioned capacity, but is truly serverless and will scale to zero (where you pay nothing) if no traffic is observed or as high as needed when bursts are experienced.
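The trade-off between the modes comes down to simple arithmetic, sketched below for request charges only. The unit prices used here are made-up placeholders, not AWS or Fauna list prices, and the function names are hypothetical; substitute current figures from the relevant pricing page before drawing conclusions.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def provisioned_monthly_cost(baseline_units: int, price_per_unit_hour: float) -> float:
    """Baseline capacity is billed for every hour, whether or not it is used."""
    return baseline_units * price_per_unit_hour * HOURS_PER_MONTH


def on_demand_monthly_cost(requests: int, price_per_million: float) -> float:
    """On-demand request charges scale with traffic and drop to zero with no traffic."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(baseline_units: int, price_per_unit_hour: float,
                        price_per_million: float) -> float:
    """Monthly request volume at which both modes cost the same."""
    provisioned = provisioned_monthly_cost(baseline_units, price_per_unit_hour)
    return provisioned / price_per_million * 1_000_000


# Placeholder prices chosen only to make the arithmetic easy to follow.
idle_on_demand = on_demand_monthly_cost(0, price_per_million=1.0)
idle_provisioned = provisioned_monthly_cost(10, price_per_unit_hour=0.001)
threshold = break_even_requests(10, price_per_unit_hour=0.001, price_per_million=1.0)
```

Below the break-even volume, on-demand is cheaper despite its higher per-unit price; above it, a provisioned baseline starts to pay off.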

The one and only mode in Fauna is on-demand. It autoscales without any intervention from you, and you don’t need to manage desired-throughput settings or monitor the database to keep it from being saturated. As such, pricing is much more straightforward and depends only on how much data you store, how many transactions occur, and how large (complex) the transactions and queries are. How Fauna and DynamoDB compare on price depends heavily on context and use case, and especially on how you plan to configure DynamoDB, given its flexibility. We cover these topics in “Comparing Fauna and DynamoDB: Pricing and features.”

Conclusion

Serverless databases offer numerous benefits, including lower costs, less operational workload, faster time to market, and high scalability. When choosing a serverless database, be sure to select one that offers you maximum consistency, high performance, and low latency.

While there are many serverless databases to choose from, Fauna offers the largest range of features for your organization’s needs. Automated scaling, multi-region ACID-compliant transactions, and GraphQL support will help you optimize your workload and improve your product for users. Your developers can focus on improving your applications and core services while worrying less about managing the database or infrastructure. This leads to a shorter development time and faster application delivery.

Interested in learning more about Fauna? Reach out to our team or sign up and start using Fauna for free.
