Dealing with Relational Data

With the basic structure of DynamoDB in mind, how do we go about building an application that has related data that would normally be split up in a relational database? For this example, imagine we have a software consultancy firm with a number of employees working on projects for clients. In addition, we want to track the technology used within these projects. Because everyone loves a good entity relationship diagram, the database may be structured like this:

Approaching the Problem

When using a relational database, these entities would be normalized into different tables, and we would use SQL queries to combine them dynamically into views our application can use. DynamoDB tries to reduce the computational overhead of such queries, and instead tries to store data in the format it would be consumed. In fact, according to AWS a well-designed application requires only one table! How can this be so?

If we tried to use Dynamo like RDBMS

You may imagine that we would create an individual table for clients, projects, employees, and technologies, however, this pattern doesn’t leverage DynamoDB in the most effective way. If we have a scenario where we need to delete a client, we would want their projects deleted as well. If we used multiple tables, our process would look like:

  • Find all client projects
  • Delete all projects
  • Delete the client

Problems:

  • Multiple API calls
  • Very error-prone as code gets refactored and extended
  • We lose all advantages of working with a NoSQL solution and have to deal with all the disadvantages

New Application Requirements

Let’s rethink the way our application is structured to leverage a NoSQL paradigm instead. If we use partition keys, sort keys, and indexes effectively, we can model that relational data in a way that allows complex queries, deletions, and insertions. Our new table may look like:

database item partition key sort key data
client (type+client Name) (type+client Id) (client contact info)
client:acme_shipping client_id:01 345 Maple Ave. Jeffersonville KY
client:some_healthcare_co client_id:02 ……
client relationship to project (type+client Name) (type+project name) (technologies)
client:some_healthcare_co project:claims_dashboard [ Node, AngularJS, REST ]
client:some_healthcare_co project:smart_data_intake [ Node, Angular, GraphQl ]
client:acme_shipping project:project_flyover [ Node, Angular, GraphQl ]
client:acme_shipping project:intersate_sync [ Express, React, GraphQl ]
client:acme_shipping project:iot_pilot [ Golang , Angular, AppSync ]
     
     

Note how the partition key is the same for both projects and clients, but the sort key is different. This allows you to set up relations to different items by querying on PK for the equivalent of a join. This pattern of design is called the Adjacency List Design Pattern, and you can read more about it in the DynamoDB context here.

Review

Modeling Relationships

  • Just like relational databases, there are guidelines for effectively dealing with one-to-one, many-to-one, and many-to-many data.
  • We store related items close together using partition keys.

Taking advantage of Sort Keys

  • The sort key lets us control the granularity of our data and the order in which it’s returned.
  • Partition keys often tell us what “cabinet” or “bucket” our item belongs to.
  • Sort keys are like folders in the cabinet, containing files.

Access Patterns

  • Unlike relational databases, we need to know what kind of queries we’re going to execute in advance to understand how to store our data.
  • We design our keys and any additional indexes based on the business questions we expect to ask most frequently.

Conclusions

Relational database systems (RDBMS) and DynamoDB have different strengths and weaknesses.

RDBMS

  • Pros
    • Ensures schema and relationships in data
    • Easier to maintain transactions
    • Data can be queried flexibly and store data efficiently
  • Cons

DynamoDB

  • Pros
    • Highly available and fast
    • Scales automatically
    • Schemaless so easier to change
  • Cons
    • Need to rethink normalization and consistency
    • Must understand the queries and materialized views when designing
    • Can only sort on sort keys or secondary indexes

As with any solution, there is no silver bullet. There are many ways to solve the problems at hand, and we try to find the solution that fits that need the best. For more information on DynamoDB and best practices, please visit: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html

 

Click Here to Read Part 1

Click Here to Read Part 2