Tips to Effectively Design in DynamoDB

CodeStax.Ai
6 min read · Jun 27, 2022

Over half of the tech customers across the “Big 3” cloud computing platforms, Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), have gone serverless. With the growth of serverless applications, the demand for serverless databases has also grown. Traditional connection-based databases get overrun by connection storms when used with serverless applications, since every short-lived function instance opens its own connection. DynamoDB has become the go-to database for serverless applications because of its connectionless, request-based interface. But designing a DynamoDB schema is not a straightforward endeavor.

We are sharing a few gotchas to keep in mind while using DynamoDB for your serverless applications.

Discover and compile the query patterns

Understand the business use cases and identify the data entities. Then list the access patterns your application needs and pick out the most important and most frequently used ones. These translate into your table’s schema as the partition key and sort key.
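As a rough sketch, suppose the most important access pattern is “fetch the orders for a product” (the example used later in this post). The table and attribute names below are assumptions for illustration, using boto3:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Hypothetical table keyed for the access pattern
# "fetch all orders for a product": PK = ProductId, SK = OrderId.
dynamodb.create_table(
    TableName="Orders",
    AttributeDefinitions=[
        {"AttributeName": "ProductId", "AttributeType": "S"},
        {"AttributeName": "OrderId", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "ProductId", "KeyType": "HASH"},  # partition key
        {"AttributeName": "OrderId", "KeyType": "RANGE"},   # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)
```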

Avoid using scans

Scans go through the entire table without regard for any partitions. Any filters you apply to a scan are only evaluated after the data is read from disk, and the cost of the scan includes all the data originally read from disk.

Data can be read from the table much more efficiently using queries, which return data using the partition key and, optionally, the sort key. Comparison operators (>, <, etc.) and begins_with can be used on the sort key, which makes the query return a range of data from the specified partition.
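For instance, here is a minimal sketch against the hypothetical Orders table above, assuming OrderId values are prefixed with a date so begins_with can select one month of orders:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")

# A query touches only the "prod-123" partition and a sort-key range,
# unlike a scan, which reads (and bills for) every item in the table.
response = table.query(
    KeyConditionExpression=Key("ProductId").eq("prod-123")
    & Key("OrderId").begins_with("2022-06")
)
items = response["Items"]
```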

Don’t have too many tables

In an RDBMS, we usually represent each entity and relationship as its own table for ease of access. In DynamoDB, it becomes very cumbersome to query across multiple tables because there are no joins. Instead, create tables for the primary entities and fold the data for relevant relationships and related entities into the same table by varying the key structure.
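As an illustrative sketch of this single-table approach (the key layout here is an assumption, not a prescription), a user profile and that user’s orders can share a partition so that one query returns them together:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("AppData")  # hypothetical table

# Both entity types live in the same partition, distinguished by the sort key.
table.put_item(Item={"PK": "USER#u1", "SK": "PROFILE", "Name": "Alice"})
table.put_item(Item={"PK": "USER#u1", "SK": "ORDER#o100", "Total": 42})

# One query fetches the user profile and all related orders, no join needed.
response = table.query(KeyConditionExpression=Key("PK").eq("USER#u1"))
```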

Utilize Local Secondary Indexes when different attributes need to be either sorted or queried frequently

When querying for the orders on a product made today, where (Product ID, Order ID) is the (PK, SK), we would have to query for all orders and use a filter expression to get the results. If time of purchase is added as an LSI (Local Secondary Index), we can specify the condition in the query itself, which greatly reduces the number of items that get read.
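A sketch of such a query, assuming an LSI named PurchaseTimeIndex was declared at table creation with time of purchase as its sort key:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")

# The LSI shares the table's partition key (ProductId) but sorts
# by PurchaseTime, so today's orders can be selected directly.
response = table.query(
    IndexName="PurchaseTimeIndex",
    KeyConditionExpression=Key("ProductId").eq("prod-123")
    & Key("PurchaseTime").begins_with("2022-06-27"),
)
```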

Be aware that LSIs must be specified when the table is created, and you can have a maximum of 5 LSIs per table.

Utilize Global Secondary Indexes for querying across partitions

Consider the example where you need to query all the orders for a given user. Using the above primary key structure (Product ID, Order ID), we would have to scan all the orders and then filter them by User ID. Instead, we can create a GSI (Global Secondary Index) with User ID as the hash key and Order ID as the range key. This lets us get the orders of a user with a simple query.
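A sketch of the GSI query, assuming the index is named UserOrdersIndex:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")

# The GSI re-partitions the data by UserId, so one query returns
# every order a user has placed, across all product partitions.
response = table.query(
    IndexName="UserOrdersIndex",
    KeyConditionExpression=Key("UserId").eq("user-42"),
)
```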

Keep in mind that GSIs incur additional throughput and storage costs, so project only the attributes you need into the index. Moreover, you can have a maximum of 20 GSIs per table.

Use reverse indexes and transactions for maintaining attribute uniqueness

In DynamoDB, there is no out-of-the-box way to enforce uniqueness on attributes that are not part of the primary key. To simulate it, write a second item whose primary key encodes the unique attribute (e.g. name#uname1) alongside the usual data item, inside a single transaction. Be sure to also specify attribute_not_exists as a condition on the primary key of the unique-attribute item. This causes the transaction to succeed only when the name value will still be unique after insertion. A detailed example is provided in an AWS blog: https://aws.amazon.com/blogs/database/simulating-amazon-dynamodb-unique-constraints-using-transactions/
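A minimal sketch of the pattern (table and attribute names are assumptions): both puts go through one transaction, and the condition on the marker item makes the whole transaction fail if the name is already taken:

```python
import boto3

client = boto3.client("dynamodb")

client.transact_write_items(
    TransactItems=[
        {
            "Put": {  # the actual user item
                "TableName": "Users",
                "Item": {"PK": {"S": "USER#u1"}, "Name": {"S": "uname1"}},
                "ConditionExpression": "attribute_not_exists(PK)",
            }
        },
        {
            "Put": {  # marker item that reserves the unique name
                "TableName": "Users",
                "Item": {"PK": {"S": "NAME#uname1"}},
                # Fails the whole transaction if the name is already taken.
                "ConditionExpression": "attribute_not_exists(PK)",
            }
        },
    ]
)
```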

Use DynamoDB Streams or Kinesis Data Streams when events need to be triggered on data writes

Many business applications require large-scale analytics. Given how DynamoDB is built, performing frequent scans over the entire table can be too expensive. Instead, upon each insert, update or delete, a DynamoDB stream can capture the event along with the data and pass it to an AWS Lambda function. There, you can transform the data to match the analytics schema and perform the corresponding insert/update/delete in the SQL/analytics table. DynamoDB Streams can also be used for aggregation across multiple tables, archiving/auditing, reporting, notifications/messaging and search.
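As a sketch, a stream-triggered Lambda handler might look like this (the analytics side is left as a stub; the event shape is what DynamoDB Streams delivers with NEW_AND_OLD_IMAGES enabled):

```python
def handler(event, context):
    for record in event["Records"]:
        action = record["eventName"]            # INSERT | MODIFY | REMOVE
        keys = record["dynamodb"]["Keys"]
        new_image = record["dynamodb"].get("NewImage")  # absent on REMOVE

        # Transform new_image to the analytics schema and perform the
        # matching insert/update/delete in the analytics store (stubbed).
        print(action, keys, new_image)
```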

Understand reading and writing costs

When using provisioned capacity, read/write cost is measured in RCUs (Read Capacity Units) and WCUs (Write Capacity Units).

DynamoDB charges one WCU per second for each standard write of up to 1 KB, and two WCUs per second for each transactional write. For reads of up to 4 KB, it charges one RCU per second for each strongly consistent read, two RCUs for each transactional read, and one-half of an RCU for each eventually consistent read. Larger items consume additional units in 1 KB (write) or 4 KB (read) increments.

When using on-demand capacity, reading/writing cost is based on Read Request Units and Write Request Units.

DynamoDB charges one write request unit for each write of up to 1 KB, and two write request units for each transactional write. For reads of up to 4 KB, it charges one read request unit for each strongly consistent read, two read request units for each transactional read, and one-half of a read request unit for each eventually consistent read.
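To make the arithmetic concrete, here is a small sketch of how the rounding works; capacity is consumed in 1 KB increments for writes and 4 KB increments for reads:

```python
import math

def read_units(item_kb: float, consistency: str = "eventual") -> float:
    # Reads are billed in 4 KB increments, scaled by consistency level.
    factor = {"eventual": 0.5, "strong": 1, "transactional": 2}[consistency]
    return math.ceil(item_kb / 4) * factor

def write_units(item_kb: float, transactional: bool = False) -> int:
    # Writes are billed in 1 KB increments; transactions cost double.
    return math.ceil(item_kb) * (2 if transactional else 1)

# An eventually consistent read of a 7 KB item: ceil(7/4) * 0.5 = 1 unit.
# A transactional write of a 2.5 KB item: ceil(2.5) * 2 = 6 units.
print(read_units(7), write_units(2.5, transactional=True))
```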

Utilize DynamoDB TTL

When you need to expire an item in a table after a specific point in time, you can enable Time to Live (TTL) on the table. You specify an attribute when enabling it, and DynamoDB takes care of deleting expired items from the table; the process itself is free of charge. Note that expired items are not deleted instantly but by a background process that may take up to two days after expiration.
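Enabling it is a one-off call; the table and attribute names below are illustrative:

```python
import boto3

client = boto3.client("dynamodb")

# Items whose "ExpiresAt" attribute (epoch seconds) is in the past
# become eligible for background deletion.
client.update_time_to_live(
    TableName="Sessions",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ExpiresAt"},
)
```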

Limits

Finally, there are some important limits in DynamoDB such as:

LSI — 5 per table

GSI — 20 per table

Item Size — 400 KB including attribute names

Expression Length — 4 KB

Transaction Size — 25 items per transaction

Query and Scan — 1 MB of results per operation

BatchGetItem — 100 items or 16 MB, whichever is smaller

BatchWriteItem — 25 items or 16 MB, whichever is smaller

About CodeStax.Ai

At CodeStax.Ai, we stand at the nexus of innovation and enterprise solutions, offering technology partnerships that empower businesses to drive efficiency, innovation, and growth, harnessing the transformative power of no-code platforms and advanced AI integrations.

But the real magic? It’s our tech tribe behind the scenes. If you’ve got a knack for innovation and a passion for redefining the norm, we’ve got the perfect tech playground for you. CodeStax.Ai offers more than a job — it’s a journey into the very heart of what’s next. Join us, and be part of the revolution that’s redefining the enterprise tech landscape.
