DynamoDB Single Table Design: A Practical Guide
TL;DR: Single Table Design puts all your entities in one DynamoDB table. It sounds insane, but with the right key design and well-defined access patterns, it gives you predictable performance at any scale. Just know the trade-offs before you commit.
What to expect:
- How DynamoDB works under the hood (partitions, keys, indexes)
- What Single Table Design is and why it exists
- The real downsides nobody talks about (ETLs, analytics, backups)
- A full multi-tenant, multi-branch architecture with real code
- When to use it and when to stay away
Introduction
I believe everyone starts studying databases with one of these two: MySQL or PostgreSQL.
Those are the core databases when you talk about SQL (Structured Query Language) databases.
They are pretty good, but, as with everything, they are not silver bullets: lots of tables, constraints, rollbacks, seeds, migrations, and everything else you need to configure to keep track of your data.
New table, new migration. New column, new migration. That's a good thing, but can be pretty annoying. When you want flexibility, it can be a pain.
That's why NoSQL databases exist, and you need to understand them first, before we talk about Single Table Design.
Let's dive deep into what they are and what they're for.
What is a NoSQL database
NoSQL ("not only SQL") typically refers to databases that are non-relational. Don't get confused, though: you can still relate data; you're just not forced into those relations.
Unlike SQL databases that store data in tables with columns, NoSQL databases can store data in several formats: documents, key-value, wide-column stores, and graphs. Nowadays, even multi-model (can store multiple formats) databases exist.
I'll briefly explain each type for you to have a better understanding.
Document-oriented databases
A document-oriented database stores data in documents that are pretty similar to JSON. Each document contains pairs of fields and values, and those values can be of several types: strings, numbers, booleans, arrays, and other objects.
{ "_id": "12345", "name": "Giovane Saes", "email": "giovanes.dev@gmail.com", "role": "product-engineer", "title": "One of the best product engineers out there!", "hobbies": ["games", "software", "coding", "writing"] }
Examples of databases that follow this structure: MongoDB, CouchDB, Firestore.
Key-value databases
A key-value store is a simpler type of database where each item consists of a key and its value.
Each key is unique and associated with a single value. Of course, that doesn't mean each key can support only numbers or strings, they can support JSON objects, lists, etc. as values.
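To make that concrete, here's a minimal in-memory sketch of key-value semantics in TypeScript. This is an illustrative stand-in, not a real store:

```typescript
// A key-value store maps a unique string key to an arbitrary value.
// Values can be primitives, objects, or lists; the store doesn't care.
class KeyValueStore {
  private data = new Map<string, unknown>();

  put(key: string, value: unknown): void {
    this.data.set(key, value); // overwrites any existing value for this key
  }

  get(key: string): unknown | undefined {
    return this.data.get(key); // O(1) lookup by key
  }
}

const store = new KeyValueStore();
store.put("user:12345", { name: "Giovane Saes", hobbies: ["games", "coding"] });
store.put("session:abc", "expires-2026-01-01");
```

The whole API surface is "put by key, get by key": there is no query planner and no joins, which is exactly what makes this model so fast and so scalable.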
Some key-value stores like Redis provide ultra-fast reads and writes because they store data in memory. Others, like DynamoDB, use SSDs for durable storage while still achieving single-digit millisecond latency.
Examples of databases that follow this structure: DynamoDB, Redis.
Wide-column stores
Wide-column stores store data in tables, rows, and dynamic columns. Like traditional SQL databases, the data is stored in tables, but unlike them, they are flexible, and different rows can have different sets of columns.
| Row Key | name | email | role | hobbies |
|---|---|---|---|---|
| user:12345 | "Giovane Saes" | "giovanes.dev@..." | "product-eng" | ["games", ...] |
| user:67890 | "John Doe" | "john@..." | | |
| event:001 | "Tech Meetup" | | | |
Notice how user:67890 doesn't have a role or hobbies column, and event:001 only has name. That's the flexibility of wide-column stores: different rows can have completely different sets of columns.
Examples of databases that follow this structure: Cassandra, HBase, ScyllaDB.
Graph databases
A graph database stores data in the form of nodes and edges. Nodes typically store information about the entities, while edges store information about the relationships between nodes.
Examples of databases that follow this structure: Neo4J, Amazon Neptune.
What is DynamoDB
DynamoDB is a serverless, fully managed, distributed NoSQL database provided by Amazon.
To support a wide variety of use cases, DynamoDB supports both key-value and document data models, and its promise is to scale to virtually any workload while maintaining single-digit millisecond latency, whether you have 100 or 100 million records.
Unfortunately, DynamoDB is not open source, so there's a limit to what you can and cannot know about it under the hood.
Instead of providing screenshots of the terrifying AWS console interface, I'll provide examples using IaC. I'll use Serverless Framework YML files as examples, but in general, it's pretty easy to understand the concept.
Here's how you create a table in DynamoDB, using Serverless Framework:
resources:
  Resources:
    MyTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: MyAwesomeTable
        AttributeDefinitions:
          - AttributeName: PK # Partition Key
            AttributeType: S
          - AttributeName: SK # Sort Key
            AttributeType: S
        KeySchema:
          - AttributeName: PK
            KeyType: HASH
          - AttributeName: SK
            KeyType: RANGE
        BillingMode: PAY_PER_REQUEST

You'll notice that I haven't defined any attributes. That's because DynamoDB is schemaless: you don't need to define a schema for your items when creating a table.
If you didn't understand what partition keys and sort keys are, there's a section for that.
How DynamoDB Works
In DynamoDB, data is organized into tables; each table holds multiple items that represent individual records, and each item is made up of attributes.
- Tables: the top-level data structure, each defined by a required primary key that uniquely identifies its items. Tables support secondary indexes, enabling queries on non-primary key attributes for versatility on data retrieval.
- Items: the same as rows in an SQL database, each item can have up to 400KB of data, including all of its attributes.
- Attributes: key-value pairs that store the data within an item. They can have a wide variety of types, such as: strings, numbers, booleans, string sets, number sets, etc. Attributes can also be nested, allowing complex data structures for a single item.
Now, you might think, how the hell do they pull that off?
That's because of the DynamoDB Partitions and a technique called Consistent Hashing.
Consistent Hashing
I'll try to explain this briefly, because it's a whole other topic, just enough for you to understand the concept. If you want to dive deeper, you might need to do some research.
Consistent Hashing is a technique used primarily in distributed systems to distribute data across different databases (physical locations) in a way that minimizes data movement when adding or removing instances. The original Dynamo paper (2007) described this approach, and while AWS hasn't disclosed the exact mechanism used in the current DynamoDB service, the concept illustrates how partitioned databases distribute data at scale.
To understand why it matters, consider the naive approach: modulo hashing. With 3 databases, you'd do hash(key) % 3 to pick where data goes. The problem? Add a 4th database and hash(key) % 4 gives completely different results, and almost every item needs to move. At scale, that's a disaster.
Consistent Hashing fixes this:
- Create a hash ring with a fixed number of points.
- Place databases (partitions) on the ring. Each one sits at a specific position.
- Hash the key and move clockwise. To determine which database stores an item, hash the item's key to a position on the ring, then move clockwise until you hit the next database.
The key insight: when you add or remove a database, only the items between the affected node and its neighbor need to move, roughly 1/N of the total data, instead of almost everything. That's what makes it practical at scale.
In practice, systems also use virtual nodes (multiple positions per physical database on the ring) to achieve a more even data distribution.
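The steps above can be sketched in TypeScript. This is a toy illustration (the FNV-1a hash and the single position per node are my simplifications; real systems use stronger hashes and virtual nodes), but the clockwise lookup is the same idea:

```typescript
// Toy consistent-hash ring: nodes sit at fixed positions on a ring of size 2^32.
const RING_SIZE = 2 ** 32;

// Simple FNV-1a hash, for illustration only; not what DynamoDB actually uses.
function ringHash(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % RING_SIZE;
}

class HashRing {
  // Ring positions kept sorted so lookup can walk "clockwise".
  private positions: { pos: number; node: string }[] = [];

  addNode(node: string): void {
    this.positions.push({ pos: ringHash(node), node });
    this.positions.sort((a, b) => a.pos - b.pos);
  }

  // Hash the key, then move clockwise to the first node at or after it.
  lookup(key: string): string {
    const keyPos = ringHash(key);
    const owner = this.positions.find((p) => p.pos >= keyPos);
    return (owner ?? this.positions[0]).node; // wrap around the ring
  }
}

const ring = new HashRing();
["partition-A", "partition-B", "partition-C"].forEach((n) => ring.addNode(n));
```

Note the payoff: if you remove `partition-B`, only the keys that landed on it are re-routed to its clockwise neighbor; keys owned by the other nodes don't move at all.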
Partitions and Keys
DynamoDB tables are defined by a primary key, which can consist of one or two attributes: Partition Key and Sort Key.
Partition Keys (PKs) and Sort Keys (SKs) are the foundation of how DynamoDB scales.
- Partition Key: an attribute that, along with the sort key (if present), uniquely identifies each item on a table. The name "partition" is because that key defines the hash to find the physical storage location (SSD) where that item will be stored.
- Sort Key: an optional additional attribute that, when combined with the partition key, forms a composite primary key.
But what's happening under the hood?
As mentioned before, DynamoDB uses a hashing mechanism to route to the right partition. Within each partition, sort keys enable efficient range queries and ordering (similar to how B-trees work, though AWS hasn't disclosed the exact internal data structure).
When querying with both keys, DynamoDB first uses the partition key hash to find the right partition, then uses the sort key to efficiently locate and retrieve the items.
This approach allows both horizontal scaling and efficient querying within partitions, enabling DynamoDB to handle massive amounts of data while providing fast and predictable performance.
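As a rough mental model (assuming nothing about DynamoDB internals beyond what's public), you can picture a partition as a sorted collection of items keyed by sort key, and a query as "hash to the partition, then range-scan the sorted keys":

```typescript
// Toy model of a partitioned table: each partition holds items sorted by SK.
// Illustrative only; DynamoDB's real storage engine is not disclosed.
type Item = { pk: string; sk: string; data: unknown };

class ToyTable {
  private partitions = new Map<string, Item[]>();

  put(item: Item): void {
    const partition = this.partitions.get(item.pk) ?? [];
    partition.push(item);
    partition.sort((a, b) => a.sk.localeCompare(b.sk)); // keep SKs ordered
    this.partitions.set(item.pk, partition);
  }

  // "Query": locate the partition by PK, then range-filter the sorted SKs.
  query(pk: string, skFrom: string, skTo: string): Item[] {
    const partition = this.partitions.get(pk) ?? [];
    return partition.filter((i) => i.sk >= skFrom && i.sk <= skTo);
  }
}

const table = new ToyTable();
table.put({ pk: "USER#1", sk: "2026-01-15#txn-1", data: {} });
table.put({ pk: "USER#1", sk: "2026-02-01#txn-2", data: {} });
```

Because the sort keys within a partition are kept ordered, a date-range query never touches items outside the range, which is what makes range queries cheap.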
For example, let's imagine that I want to group everything of a given user. If I group all my items with a PK USER#id, I can retrieve everything of the user in a single query.
Although that's possible, we shouldn't do that, because a partition has limits: 10GB of storage and 3,000 RCUs / 1,000 WCUs of throughput. In practice, DynamoDB's adaptive capacity can burst beyond these throughput limits by borrowing unused capacity from other partitions, but the 10GB storage limit is hard. If one partition key gets too much traffic (a "hot partition"), requests will be throttled. So, if a user has a lot of data or high traffic, and you're following that approach, you might have problems.
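A standard mitigation for hot partitions (a general DynamoDB pattern, not something specific to this article) is write sharding: spread one logical partition key across N suffixed physical keys, then fan reads out across the shards. A minimal sketch of the key construction, where the shard count and key format are my own illustrative choices:

```typescript
// Write sharding: split one logical partition key into N physical ones.
const SHARD_COUNT = 10;

// On write: pick a shard deterministically from the item id.
function shardedPK(userId: string, itemId: string): string {
  let sum = 0;
  for (const ch of itemId) sum += ch.charCodeAt(0);
  return `USER#${userId}#SHARD#${sum % SHARD_COUNT}`;
}

// On read: the caller must query every shard and merge the results.
function allShardPKs(userId: string): string[] {
  return Array.from({ length: SHARD_COUNT }, (_, i) => `USER#${userId}#SHARD#${i}`);
}
```

The trade-off is explicit: writes spread evenly across N partitions, but a "get everything for this user" read becomes N queries that you merge in application code.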
Now, you might wonder, what if I want to query data by something that isn't the PK?
DynamoDB allows that with Secondary Indexes, which will be the next topic.
Secondary Indexes (SIs)
There are 2 types of secondary indexes, Global and Local Secondary Indexes:
- Global Secondary Index (GSI): an index with a partition key and an optional sort key that differ from the table's. This is useful when you need access patterns that differ from the table's primary key. For example, if your PK is USER#id and you want to query by email, you can add a GSI with USER#email as its partition key. You could search by attribute instead, but that requires iterating over all items, which doesn't scale. Since GSIs use a different partition key, the data is replicated (entirely or partially) to a separate physical location.
- Local Secondary Index (LSI): an index with the same partition key as the table's primary key, but with a different sort key. LSIs enable range queries and sorting within a partition. Since LSIs use the same partition key as the base table, they stay in the same physical location, so you need to be careful at scale.
An important thing to notice is that you have a limited number of indexes you can add to a table: 20 GSIs (can request an increase) and 5 LSIs, so you should avoid creating an index for each property. Also, be careful with LSIs since they stay in the same partition, due to the limit of 10GB of data in a single partition. In general, always prefer GSIs over LSIs.
One critical trade-off: GSI reads are always eventually consistent. You can't do strongly consistent reads on a GSI. This means that if you create an item and immediately query it via a GSI, it might not be there yet. For most use cases this is fine, but it's something to keep in mind for flows where you write and then immediately read through a GSI.
resources:
  Resources:
    MyTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: MyAwesomeTable
        AttributeDefinitions:
          - AttributeName: PK
            AttributeType: S
          - AttributeName: SK
            AttributeType: S
          - AttributeName: GSI1PK
            AttributeType: S
          - AttributeName: GSI1SK
            AttributeType: S
        KeySchema:
          - AttributeName: PK
            KeyType: HASH
          - AttributeName: SK
            KeyType: RANGE
        GlobalSecondaryIndexes:
          - IndexName: GSI1
            KeySchema:
              - AttributeName: GSI1PK
                KeyType: HASH
              - AttributeName: GSI1SK
                KeyType: RANGE
            Projection:
              ProjectionType: ALL
        BillingMode: PAY_PER_REQUEST

Single Table Design
Okay, enough talking. Now that you have a better understanding of how DynamoDB works, let's shift into the main topic: Single Table Design.
The name is pretty suggestive and straightforward. ALL YOUR DATA LIVES IN THE SAME TABLE. Remember: partitions split data between different physical spaces, so they are in the same table, but not in the same physical space.
Imagine your PostgreSQL project, with dozens of tables and relations, squeezed into a single table.
I know, it's mind-twisting and hard to even think about the transition, but I'll guide you through it so you can understand better.
First things first, let's break down the mindset you need before even working with a Single Table Design architecture.
Access Patterns First
When working with Single Table Design, you can't even think about starting to implement things without having really well-defined access patterns.
And why is that? Because defining your PKs, SKs, and Secondary Indexes is the most important part of this approach.
There are two reasons for that:
- You need to think systematically about how to split your data, and how to access data, to avoid 2 things: hitting the limit of Indexes and creating bottlenecks of data in the same physical location.
- You need to think about how you want to query your data.
In DynamoDB, there is no ON DELETE SET NULL approach. If you change an access pattern and need to change an SK, you need to do an ETL (Extract, Transform, Load). This process, depending on the size of the project, can take hours and can be a real pain.
It's better to lose hours thinking about your data than dozens of hours running ETLs.
Of course, eventually, you might need to run an ETL, and that's okay, but the more you know about your business logic and how data will communicate, the more money and time you save.
How It Works
Now, let's see some real-world examples for you to understand it better.
For that, I'll consider a hypothetical multi-tenant, multi-branch system to store transactions.
Let's take a look at what a schema for that would look like:
| PK | SK | Entity |
|---|---|---|
| TENANT#abc123 | METADATA | Tenant |
| TENANT#abc123#BRANCH | branch-001 | Branch |
| TENANT#abc123#BRANCH | branch-002 | Branch |
| TENANT#abc123#ACCOUNT | acc-001 | Account |
| TENANT#abc123#BRANCH#branch-001#TRANSACTION | 2026-01-15#txn-001 | Transaction |
| TENANT#abc123#BRANCH#branch-001#TRANSACTION | 2026-01-16#txn-002 | Transaction |
| TENANT#abc123#BRANCH#branch-001#REPORTING | MONTHLY#2026-01 | Report |
Now, you might wonder: why didn't I separate everything with just the Tenant ID as the PK?
Well, that's because of the access patterns. Let's think about it: given a tenant (e.g. McDonald's), if I have a branch in California and one in Philadelphia, what matters is the data inside the branches.
So, by separating the data like that, we solve two problems: expensive queries (needing to sort by the SK), and well-separated partitions. Each branch has its own data well separated.
So, let's break it down:
- Tenant: PK is TENANT#tenantId, SK is METADATA. It's the root entity; there's only one metadata item per tenant, so the SK is a static value.
- Branch: PK is TENANT#tenantId#BRANCH, SK is branchId. With this design, we can't list all branches of a tenant in a single query from the primary table; to do that, we'd need a GSI. In the project, we use GSI2 with BRANCH#document as the key, which lets us find a branch by its document number. If we needed to list all branches for a tenant, we could add another GSI with TENANT#tenantId as the partition key. This is a common trade-off in Single Table Design: you prioritize the most frequent access patterns on the primary keys and handle the rest with GSIs.
- Account: PK is TENANT#tenantId#ACCOUNT, SK is accountId. Accounts are scoped to a tenant. We use GSI1 with ACCOUNT#email as the partition key and TENANT#tenantId as the sort key, so we can find an account by email without knowing the tenant ID (useful for login flows), and the result immediately tells us which tenant the account belongs to.
- Transaction: PK is TENANT#tenantId#BRANCH#branchId#TRANSACTION, SK is YYYY-MM-DD#transactionId. Transactions are scoped by branch, so each branch has its own partition. The SK is date-prefixed, which means DynamoDB naturally sorts them by date, enabling efficient range queries.
- Report: PK is TENANT#tenantId#BRANCH#branchId#REPORTING, SK is periodType#periodValue. Pre-computed aggregations scoped by branch. The SK lets us query by period type (DAILY, MONTHLY, YEARLY) and do range queries across periods.
Also, it's worth noting that in DynamoDB, you should consider using sortable IDs like KSUID or ULID, since sort keys are already sorted by default.
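To see why sortable IDs help, here's a simplified illustration of the idea (not a spec-compliant KSUID or ULID; real implementations use Crockford base32 encoding and far more randomness):

```typescript
// A sortable ID: fixed-width timestamp prefix + random suffix.
// Because the timestamp is zero-padded to a fixed width, lexicographic
// order of the IDs matches creation order.
function sortableId(now: number = Date.now()): string {
  const timePart = now.toString(36).padStart(10, "0"); // fixed-width base36 timestamp
  const randomPart = Math.random().toString(36).slice(2, 12).padEnd(10, "0");
  return `${timePart}${randomPart}`;
}
```

Used as (or inside) a sort key, these IDs give you chronological ordering for free, without needing a separate date prefix.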
The Good And The Bad
Before you go all-in and try to switch your 50-table SQL database and delete your migrations forever, we need to have a heart-to-heart. Single Table Design is a superpower, but every superpower has its Kryptonite. It's not about finding the "perfect" database; it's about choosing which trade-offs you're willing to live with.
The Upsides
- Single query, multiple entities: You can fetch related data in one request
- Cost efficiency: One table means fewer read/write capacity units to manage
- Simplified infrastructure: One table to monitor, one table to back up
- Atomic transactions: TransactWriteItems lets you write to multiple entities atomically in a single call
- Consistent performance at any scale: DynamoDB's partition-based architecture
- Scalability isn't a concern: DynamoDB will handle that for you
The Downsides
- Analytics are painful: You can't just SELECT * FROM transactions WHERE category = 'X' AND date BETWEEN ... across your entire dataset. You need ETL pipelines to export data to S3/Athena for analytics.
- Backup granularity: Point-in-time recovery restores the entire table, not specific entity types. If you need to restore just your Transactions but not touch Reports, you're out of luck. You'd need to restore to a new table and cherry-pick what you need.
- Hard to deal with changes: Prepare hours and some cronjobs to backfill data if you ever need to change an access pattern that you didn't expect.
- Steep learning curve: The mental model is completely different from relational databases. Your team needs to understand access patterns, key design, and GSI projections.
- Hard to evolve: Adding a new access pattern after the fact can be painful. You might need a new GSI or even a data migration.
- Tooling and debugging: Looking at a DynamoDB table in the AWS Console with mixed entities is not fun. Everything looks like a mess until you understand the key patterns.
Real Example: Multi-Tenant, Multi-Branch Architecture
Now that we've covered the theory, let me show you how all of this looks in a real project.
I've been working on a financial management application that supports multiple tenants (think of a restaurant chain or a franchise), where each tenant can have multiple branches (locations in different cities). Each branch manages its own financial transactions and generates reports.
The domain is straightforward: a Tenant has Branches and Accounts (users). Each Branch owns Transactions (income and expenses) and Reports (pre-computed aggregations). Let's see how this translates into a Single Table Design.
The Table Definition
Here's the actual table definition using Serverless Framework. Notice how we only define the key attributes (PK, SK, and GSI keys), but no business attributes. That's DynamoDB being schemaless:
MainTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: main-table
    BillingMode: PAY_PER_REQUEST
    PointInTimeRecoverySpecification:
      PointInTimeRecoveryEnabled: true
    DeletionProtectionEnabled: true
    AttributeDefinitions:
      - AttributeName: PK
        AttributeType: S # String
      - AttributeName: SK
        AttributeType: S
      - AttributeName: GSI1PK
        AttributeType: S
      - AttributeName: GSI1SK
        AttributeType: S
      - AttributeName: GSI2PK
        AttributeType: S
      - AttributeName: GSI2SK
        AttributeType: S
    KeySchema:
      - AttributeName: PK
        KeyType: HASH
      - AttributeName: SK
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: GSI1
        KeySchema:
          - AttributeName: GSI1PK
            KeyType: HASH
          - AttributeName: GSI1SK
            KeyType: RANGE
        Projection:
          ProjectionType: ALL # Copy all attributes
      - IndexName: GSI2
        KeySchema:
          - AttributeName: GSI2PK
            KeyType: HASH
          - AttributeName: GSI2SK
            KeyType: RANGE
        Projection:
          ProjectionType: ALL

Entity Key Design
In the project, each entity has its own Item class that encapsulates the key construction. This pattern is commonly called key overloading: using generic attribute names (PK, SK) to store different entity types with different key structures in the same table. I really like it, because it keeps all the key logic in one place and makes it easy to understand how each entity is stored.
Now let's see how each entity builds its keys in TypeScript.
Here's how the Account entity builds its keys, including the GSI for email lookups:
export class AccountItem {
  static getPK(tenantId: string) {
    return `TENANT#${tenantId}#ACCOUNT`;
  }

  static getSK(accountId: string) {
    return accountId;
  }

  static getGSI1PK(email: string) {
    return `ACCOUNT#${email}`;
  }

  static getGSI1SK(tenantId: string) {
    return `TENANT#${tenantId}`;
  }
}

The GSI1 here is what allows us to find an account by email without knowing the tenant ID. When a user logs in, we know their email but not which tenant they belong to, so we query GSI1 instead of the primary table. Notice how the SK carries the tenant ID; this means the query result immediately tells us which tenant the account belongs to, without needing to fetch additional attributes.
Here's the Transaction entity, the most interesting one because of the date-prefixed SK:
export class TransactionItem {
  static getPK({ tenantId, branchId }: { tenantId: string; branchId: string }) {
    return `TENANT#${tenantId}#BRANCH#${branchId}#TRANSACTION`;
  }

  static getSK({
    createdAt,
    transactionId,
  }: {
    createdAt: Date;
    transactionId: string;
  }) {
    const year = createdAt.getUTCFullYear();
    const month = String(createdAt.getUTCMonth() + 1).padStart(2, "0");
    const day = String(createdAt.getUTCDate()).padStart(2, "0");
    return `${year}-${month}-${day}#${transactionId}`;
  }
}

Notice how the SK combines the date with the transaction ID. This means that when we query all transactions for a branch, they come back naturally sorted by date. And if we want transactions from a specific date range, we can use the SK in a BETWEEN condition.
And the Report entity, which uses period type and value as the SK:
export class ReportItem {
  static getPK({ tenantId, branchId }: { tenantId: string; branchId: string }) {
    return `TENANT#${tenantId}#BRANCH#${branchId}#REPORTING`;
  }

  static getSK({
    periodType,
    periodValue,
  }: {
    periodType: string;
    periodValue: string;
  }) {
    return `${periodType}#${periodValue}`;
  }
}

This allows us to query reports by period type (DAILY, MONTHLY, YEARLY) and do range queries like "give me all monthly reports from January to June".
Multi-Tenant Isolation
One of the best things about this approach is that tenant isolation is baked into the key design itself.
Every single query requires a tenantId in the partition key. As long as your application code uses Query operations (not Scan), there's no way to accidentally read another tenant's data, because you can't query a partition without knowing the tenant ID.
If I want to query transactions, I need to provide TENANT#tenantId#BRANCH#branchId#TRANSACTION as the PK. If I pass the wrong tenant ID, I just get an empty result, because the data lives in a completely different item collection.
The hierarchy is straightforward:
- A Tenant is the root entity. Everything belongs to a tenant.
- Branches and Accounts are scoped to a tenant. You can't have a branch without a tenant.
- Transactions and Reports are scoped to a branch within a tenant. You can't access them without knowing both the tenant and the branch.
This is more robust than SQL's WHERE tenant_id = ?, because the isolation is built into the key structure itself. In SQL, forgetting that WHERE clause leaks data. In DynamoDB, the tenant ID is required to even construct the query. It's isolation by design.
Querying in Practice
Let's see how the actual queries look using the AWS SDK.
Getting a specific item
The simplest operation. You know exactly what you want, you provide the PK and SK, and DynamoDB gives it to you:
async get({ branchId, createdAt, tenantId, transactionId }) {
  const getCommand = new GetCommand({
    TableName: "main-table",
    Key: {
      PK: TransactionItem.getPK({ tenantId, branchId }),
      SK: TransactionItem.getSK({ transactionId, createdAt }),
    },
  });

  const { Item } = await dynamoClient.send(getCommand);

  return Item;
}

That's it. One call, one item. No JOINs, no WHERE clauses. If the item exists, you get it. If it doesn't, you get undefined.
Listing with pagination and filters
This is where DynamoDB shines. We query by PK, sort descending (most recent first), and paginate with cursors. You'll notice ExpressionAttributeNames aliasing attributes with # prefixes — DynamoDB requires this to avoid conflicts with reserved words (and status, name, type are all reserved):
async getAll({ tenantId, branchId, limit = 50, exclusiveStartKey }) {
  const queryCommand = new QueryCommand({
    TableName: "main-table",
    KeyConditionExpression: "#PK = :pk",
    ExpressionAttributeNames: { "#PK": "PK" },
    ExpressionAttributeValues: {
      ":pk": TransactionItem.getPK({ tenantId, branchId }),
    },
    Limit: limit,
    ScanIndexForward: false,
    ExclusiveStartKey: exclusiveStartKey,
  });

  const { Items = [], LastEvaluatedKey } = await dynamoClient.send(queryCommand);

  return {
    transactions: Items,
    lastEvaluatedKey: LastEvaluatedKey,
    hasMore: !!LastEvaluatedKey,
  };
}

ScanIndexForward: false is what makes the transactions come back sorted by most recent first, since our SK is date-prefixed. And ExclusiveStartKey is DynamoDB's way of handling pagination; think of it as a cursor.
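Since LastEvaluatedKey is a plain object, a common companion pattern (my own sketch, not code from the project) is to base64-encode it into an opaque cursor for your HTTP API, so clients never see or depend on your key structure:

```typescript
// Encode DynamoDB's LastEvaluatedKey as an opaque, URL-safe cursor string.
type LastEvaluatedKey = Record<string, unknown>;

function encodeCursor(key: LastEvaluatedKey): string {
  return Buffer.from(JSON.stringify(key)).toString("base64url");
}

// The client echoes the cursor back; decode it into ExclusiveStartKey.
function decodeCursor(cursor: string): LastEvaluatedKey {
  return JSON.parse(Buffer.from(cursor, "base64url").toString("utf8"));
}

const cursor = encodeCursor({
  PK: "TENANT#abc#BRANCH#b1#TRANSACTION",
  SK: "2026-01-15#txn-001",
});
```

This keeps the pagination contract stable even if you later change your key design; only the encode/decode functions would need to change.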
If you want to add filters (by transaction type, amount range, category), you can use FilterExpression:
FilterExpression: "#transactionType = :type AND #amount BETWEEN :min AND :max",
ExpressionAttributeNames: {
  "#transactionType": "transactionType",
  "#amount": "amount",
},
ExpressionAttributeValues: {
  ":type": "EXPENSE",
  ":min": 100,
  ":max": 500,
},

One important thing to note: filters are applied after the query, not before. DynamoDB first fetches all items matching the PK/SK condition, then applies the filter. This means you still consume read capacity for the filtered-out items. That's why key design matters so much.
Querying by GSI
Sometimes you need to find data by an attribute that isn't the PK. That's what GSIs are for. For example, finding an account by email:
async findByEmail(email: string) {
  const queryCommand = new QueryCommand({
    IndexName: "GSI1",
    TableName: "main-table",
    Limit: 1,
    KeyConditionExpression: "#GSI1PK = :GSI1PK",
    ExpressionAttributeNames: {
      "#GSI1PK": "GSI1PK",
    },
    ExpressionAttributeValues: {
      ":GSI1PK": AccountItem.getGSI1PK(email),
    },
  });

  const { Items = [] } = await dynamoClient.send(queryCommand);

  return Items[0] || null;
}

This is the email lookup pattern from earlier in action. The GSI lets us bypass the primary key entirely and find the account with just the email. Since the GSI1SK contains the tenant ID, the returned item already tells us which tenant this account belongs to; no extra lookups needed.
Conditional writes
In a financial application, preventing race conditions is critical. DynamoDB's ConditionExpression lets you enforce constraints at write time. For example, preventing duplicate entries:
async create({ tenantId, branchId, transaction }) {
  const putCommand = new PutCommand({
    TableName: "main-table",
    Item: {
      PK: TransactionItem.getPK({ tenantId, branchId }),
      SK: TransactionItem.getSK({
        transactionId: transaction.id,
        createdAt: transaction.createdAt,
      }),
      ...transaction,
    },
    ConditionExpression: "attribute_not_exists(PK)",
  });

  await dynamoClient.send(putCommand);
}

attribute_not_exists(PK) ensures the item doesn't already exist. If it does, DynamoDB throws a ConditionalCheckFailedException instead of silently overwriting the data. This is a conditional write, DynamoDB's way of enforcing constraints at write time, and it's essential for any write where you care about data integrity.
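In practice you'll want to catch that failure and translate it into a domain error. In AWS SDK v3 the thrown error's name is ConditionalCheckFailedException; here's a hedged sketch of the handling, where DuplicateTransactionError and safeCreate are my own illustrative names:

```typescript
// Translate DynamoDB's conditional-write failure into a domain error.
// ConditionalCheckFailedException is the real SDK v3 error name; the
// surrounding types are illustrative.
class DuplicateTransactionError extends Error {
  constructor(id: string) {
    super(`Transaction ${id} already exists`);
    this.name = "DuplicateTransactionError";
  }
}

async function safeCreate(
  send: () => Promise<void>, // e.g. () => dynamoClient.send(putCommand)
  transactionId: string,
): Promise<void> {
  try {
    await send();
  } catch (err) {
    if (err instanceof Error && err.name === "ConditionalCheckFailedException") {
      throw new DuplicateTransactionError(transactionId);
    }
    throw err; // anything else is unexpected, rethrow as-is
  }
}
```

Mapping the low-level SDK error to a named domain error keeps your HTTP layer simple: a duplicate becomes a 409, everything else stays a 500.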
Pre-Computed Aggregations
One of the coolest patterns in this architecture is the Report entity. Instead of running expensive queries to calculate totals every time someone opens a dashboard, we pre-compute the aggregations whenever a transaction is created or deleted.
The idea is simple: every time a transaction happens, we update the corresponding report (daily, monthly, yearly) with atomic counters. DynamoDB handles this with UpdateCommand and if_not_exists, meaning the report is automatically created the first time and incremented from there:
async updateCounters({ tenantId, branchId, periodType, periodValue, transactionType, amount }) {
  const isIncome = transactionType === "INCOME";

  const updateCommand = new UpdateCommand({
    TableName: "main-table",
    Key: {
      PK: ReportItem.getPK({ tenantId, branchId }),
      SK: ReportItem.getSK({ periodType, periodValue }),
    },
    UpdateExpression: `
      SET
        #totalIncome = if_not_exists(#totalIncome, :zero) + :incomeAmount,
        #totalExpense = if_not_exists(#totalExpense, :zero) + :expenseAmount,
        #transactionCount = if_not_exists(#transactionCount, :zero) + :countDelta
    `,
    ExpressionAttributeNames: {
      "#totalIncome": "totalIncome",
      "#totalExpense": "totalExpense",
      "#transactionCount": "transactionCount",
    },
    ExpressionAttributeValues: {
      ":zero": 0,
      ":incomeAmount": isIncome ? amount : 0,
      ":expenseAmount": isIncome ? 0 : amount,
      ":countDelta": 1,
    },
  });

  await dynamoClient.send(updateCommand);
}

The if_not_exists is the key here. It means: "if this attribute doesn't exist yet, start from zero, then add the value." This way, we don't need to check if the report exists before updating it. DynamoDB handles the initialization automatically.
The reports also support breakdown maps, where we track totals by category, cost center, and payment method. This uses nested map updates:
// First, ensure the entry exists with default values
UpdateExpression: `
  SET #byCategory.#entryKey = if_not_exists(#byCategory.#entryKey, :defaultEntry)
`,
ExpressionAttributeValues: {
  ":defaultEntry": { totalIncome: 0, totalExpense: 0, transactionCount: 0 },
},

// Then, increment the values atomically
UpdateExpression: `
  SET
    #byCategory.#entryKey.#totalIncome = #byCategory.#entryKey.#totalIncome + :incomeAmount,
    #byCategory.#entryKey.#totalExpense = #byCategory.#entryKey.#totalExpense + :expenseAmount,
    #byCategory.#entryKey.#transactionCount = #byCategory.#entryKey.#transactionCount + :countDelta
`,

This two-step approach (initialize then increment) is necessary because DynamoDB can't use if_not_exists and arithmetic on nested attributes in the same expression. It's a small trade-off for having real-time aggregations without any external processing.
When someone wants to see the monthly report for a branch, we just do a simple GetCommand with the PK and SK. No scanning, no aggregation queries, no waiting. The data is already there.
In practice, you'd want the transaction creation and report update to happen atomically. If one succeeds and the other fails, your data is inconsistent. DynamoDB's TransactWriteItems handles this:
async createWithReport({ tenantId, branchId, transaction }) {
  const transactCommand = new TransactWriteCommand({
    TransactItems: [
      {
        Put: {
          TableName: "main-table",
          Item: {
            PK: TransactionItem.getPK({ tenantId, branchId }),
            SK: TransactionItem.getSK({
              transactionId: transaction.id,
              createdAt: transaction.createdAt,
            }),
            ...transaction,
          },
          ConditionExpression: "attribute_not_exists(PK)",
        },
      },
      {
        Update: {
          TableName: "main-table",
          Key: {
            PK: ReportItem.getPK({ tenantId, branchId }),
            SK: ReportItem.getSK({
              periodType: "DAILY",
              periodValue: formatDate(transaction.createdAt),
            }),
          },
          UpdateExpression: `
            SET #totalIncome = if_not_exists(#totalIncome, :zero) + :incomeAmount,
                #totalExpense = if_not_exists(#totalExpense, :zero) + :expenseAmount,
                #transactionCount = if_not_exists(#transactionCount, :zero) + :one
          `,
          ExpressionAttributeNames: {
            "#totalIncome": "totalIncome",
            "#totalExpense": "totalExpense",
            "#transactionCount": "transactionCount",
          },
          ExpressionAttributeValues: {
            ":zero": 0,
            ":incomeAmount": transaction.type === "INCOME" ? transaction.amount : 0,
            ":expenseAmount": transaction.type === "EXPENSE" ? transaction.amount : 0,
            ":one": 1,
          },
        },
      },
    ],
  });

  await dynamoClient.send(transactCommand);
}

If either operation fails, both are rolled back. This is especially important for financial data where you can't afford orphaned transactions or stale reports. One thing to keep in mind: transactional operations consume 2x the write capacity of standard operations, so use them where atomicity matters, not everywhere.
An alternative approach would be to use DynamoDB Streams with a Lambda function to update reports asynchronously whenever a transaction is written. This decouples the report update from the transaction creation but adds eventual consistency to your reporting data. Both approaches are valid. Synchronous is simpler and keeps reports immediately consistent, while streams scale better for high write throughput.
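As context for the streams option, a handler could look roughly like this. The typed attribute shape ({ S: "..." }, { N: "..." }) is how DynamoDB Streams delivers records; the delta logic is a simplified sketch using this article's attribute names:

```typescript
// Simplified DynamoDB Streams record: attribute values arrive in
// DynamoDB's typed wire format ({ S: "..." }, { N: "..." }).
type StreamRecord = {
  eventName: "INSERT" | "MODIFY" | "REMOVE";
  dynamodb: { NewImage?: Record<string, { S?: string; N?: string }> };
};

type ReportDelta = { income: number; expense: number; count: number };

// Compute the report delta for one inserted transaction record.
// A real handler would also handle REMOVE (negative deltas) and then
// issue the same UpdateCommand shown earlier.
function deltaFor(record: StreamRecord): ReportDelta | null {
  if (record.eventName !== "INSERT" || !record.dynamodb.NewImage) return null;
  const image = record.dynamodb.NewImage;
  const amount = Number(image.amount?.N ?? "0");
  const isIncome = image.type?.S === "INCOME";
  return {
    income: isIncome ? amount : 0,
    expense: isIncome ? 0 : amount,
    count: 1,
  };
}
```

The key property of this design is that the write path stays a single cheap PutCommand, and the aggregation cost moves to an asynchronous consumer that can batch and retry on its own schedule.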
Conclusion
And that's Single Table Design. Not a silver bullet, but a really fun path to learn.
The core takeaway: invest the time upfront in your access patterns and key design. The partition key structure defines your tenant isolation, the sort key design enables your query flexibility, and the GSIs cover the rest. Get those right, and DynamoDB handles the scaling for you.
If your application needs flexible ad-hoc queries, your access patterns aren't well-defined yet, or your team isn't familiar with DynamoDB, a relational database will probably serve you better. But if you have clear access patterns and need predictable performance at scale, Single Table Design is worth the investment.
If you've never used it, I invite you to try it on a new project. I can guarantee you're going to have a lot of headaches, as I did. But once it clicks, you'll never look at data modeling the same way again.
If this post was helpful, consider leaving a like and sharing it with your friends. It helps me know what resonates with you.
Feel free to leave a comment about your experience with Single Table Design!
Farewell!