Comparing AWS Cloud Database Technologies: Non-Relational Databases

Omotayo Akinbode
Data Engineer

A substantial number of database types are now offered in the cloud to suit a variety of business use cases. This adds flexibility to data and application solution design but increases the difficulty of deciding which type is best suited to a given scenario. In a previous post, we looked at the relational databases available on AWS and how they can best be leveraged for a variety of solutions. This post explores the opposite set of databases, those that fall into the non-relational category, and disentangles the services offered on AWS.

  1. Non-Relational Databases
    1. Key-Value
    2. Document
    3. In-Memory
    4. Graph
    5. Search
    6. Time-Series
    7. Ledger
    8. Table Comparison

Non-Relational Databases

A non-relational database is the opposite of a relational database: the data is not stored in a tabular structure (columns and rows), and no predefined relationships exist among tables. Instead, the data model is based on the type of data it's storing, which makes it very flexible and adaptable.

TYPES OF NON-RELATIONAL DATABASES: Key-Value, Document, In-Memory, Graph, Search, Time-Series, Ledger

1) Key-Value

A key-value database stores data as key-value pairs. An example is a dictionary, where the key is the unique identifier and the value holds the associated attributes. The two major key-value databases available on AWS are Amazon DynamoDB and Amazon Keyspaces.
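The dictionary analogy can be made concrete with a few lines of Python. This is only a sketch of the data model, not of any AWS service; the `put` and `get` helpers and the sample keys are made up for illustration.

```python
# A key-value store modelled as a plain dictionary (illustrative only).
# Keys are unique identifiers; values hold the associated attributes.
store = {}

def put(key, value):
    """Insert or overwrite the value stored under key."""
    store[key] = value

def get(key, default=None):
    """Look up a value by its key; return default if the key is absent."""
    return store.get(key, default)

put("user#1001", {"name": "Ada", "plan": "premium"})
put("user#1002", {"name": "Grace", "plan": "free"})

print(get("user#1001"))  # {'name': 'Ada', 'plan': 'premium'}
print(get("user#9999"))  # None
```

Every read and write goes through the key, which is why lookups by key are fast and lookups by anything else are not.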

A) Amazon DynamoDB

This is a fully managed, serverless NoSQL database on AWS. It is one of the most commonly used NoSQL databases because it supports ACID transactions that update items across multiple tables and allows in-memory caching with DAX. These are some of its features:

  1. Global Tables: Multi-region, multi-master database
  2. Backups are supported, including point-in-time recovery
  3. Single-digit millisecond performance at any scale
  4. Supports CRUD (Create/Read/Update/Delete) operations through APIs
  5. No direct analytical queries (joins are not allowed)
  6. Access patterns must be known ahead of time for efficient design and performance
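The last two points are easier to see with a small model. The sketch below is a plain Python stand-in, not the DynamoDB API: it assumes items are addressed by a partition key and a sort key, and the `put_item` and `query` helpers and the key names are hypothetical. Because joins are unavailable, related items are stored under the same partition key so that one query answers a known access pattern ("all orders for a customer").

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a table addressed by a
# partition key (pk) and a sort key (sk). This is not the DynamoDB API.
table = defaultdict(dict)

def put_item(pk, sk, attributes):
    """Store attributes under the (pk, sk) pair."""
    table[pk][sk] = attributes

def query(pk, sk_prefix=""):
    """Return items in one partition whose sort key starts with sk_prefix,
    ordered by sort key."""
    items = table.get(pk, {})
    return [items[sk] for sk in sorted(items) if sk.startswith(sk_prefix)]

# The customer profile and orders share a partition key, so no join is needed.
put_item("CUSTOMER#42", "PROFILE", {"name": "Ada"})
put_item("CUSTOMER#42", "ORDER#2023-01-15", {"total": 30})
put_item("CUSTOMER#42", "ORDER#2023-02-03", {"total": 45})

orders = query("CUSTOMER#42", "ORDER#")
print(orders)  # [{'total': 30}, {'total': 45}]
```

A question the key design did not anticipate, such as "all orders over 40 across every customer", would require reading every partition, which is the cost of not knowing the access pattern up front.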

B) Amazon Keyspaces

Amazon Keyspaces is a fully managed, serverless database used to run Cassandra workloads on AWS. Cassandra is an open-source NoSQL database. Keyspaces is available in both on-demand and provisioned capacity modes.

C) Amazon S3

S3 is an object storage service, but it can also be viewed as a key-value store for huge volumes of semi-structured or unstructured data. For each uploaded object, the key is the unique object name within the bucket, and the value is the content of the file.


2) Document

A document database is a NoSQL database used for storing and managing JSON-like documents. This data model is commonly used by developers because it has the same data format used in their application code. Documents store data in field-value pairs. There is only one dedicated document database on AWS.

A) Amazon DocumentDB

This is a fully managed NoSQL document database for running MongoDB workloads. JSON documents are stored in collections; a collection is a group of documents, similar to a table. Its architecture is similar to that of Amazon Aurora.
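To illustrate documents and collections, here is a minimal sketch in plain Python. It does not use a MongoDB or DocumentDB driver; the `products` collection and the `find` helper are hypothetical, and only exact matches on top-level fields are supported.

```python
import json

# A collection is modelled as a list of JSON-like documents (dicts).
# Documents in the same collection need not share the same fields.
products = [
    {"_id": 1, "name": "Laptop", "specs": {"ram_gb": 16}, "tags": ["work"]},
    {"_id": 2, "name": "Phone", "specs": {"ram_gb": 8}},
    {"_id": 3, "name": "Desk", "material": "oak"},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# Documents serialize directly to JSON, the same shape the application uses.
print(json.dumps(find(products, name="Phone"), indent=2))
```

Note that the third document has a `material` field the others lack; unlike rows in a relational table, documents are not forced into one schema.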


3) In-Memory

In-memory databases are used mainly in situations where accessing data on disk might be an expensive operation. Keeping data in memory saves time and improves performance compared with pulling the same information from disk.

A) Amazon ElastiCache

This is the main fully managed in-memory service available on AWS. It runs on its own dedicated caching instances (a remote cache). ElastiCache supports two in-memory engines, Redis and Memcached. Redis is suitable for complex applications, including message queues, session caching, and leaderboards. Memcached, on the other hand, is suitable for simple caches and is also useful when working with a multithreaded architecture.

B) DynamoDB Accelerator (DAX)

DAX is an in-memory caching service used with DynamoDB; it allows faster in-memory reads and better performance when working with DynamoDB. There are two types of DAX caches: the item cache and the query cache. The item cache stores the results of GetItem and BatchGetItem operations, while the query cache stores the results of Query and Scan operations. A typical use case for DAX is when users access a small number of items more frequently than others.

Useful Scenario

If you have an application that is accessed very often for the same information, e.g. a gaming leaderboard, the best solution is an in-memory database (Amazon ElastiCache for Redis), since it keeps frequently accessed data in memory, mainly to serve read operations.
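A leaderboard of this kind can be sketched in plain Python. Redis offers a sorted-set data type suited to this; the code below is only an in-process stand-in, and the `submit` and `top` helpers and player names are hypothetical.

```python
import heapq

scores = {}  # player -> best score, held entirely in memory

def submit(player, score):
    """Record a score, keeping only each player's highest."""
    if score > scores.get(player, float("-inf")):
        scores[player] = score

def top(n):
    """Return the n highest (player, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])

submit("ana", 120)
submit("ben", 95)
submit("cy", 150)
submit("ben", 130)  # ben improves on his earlier score

print(top(2))  # [('cy', 150), ('ben', 130)]
```

Reads of the top entries never touch a disk, which is why this workload suits an in-memory store.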


4) Graph

A graph database shows how data is interconnected. It captures the relationships between the data in a database using nodes (which store data entities) and edges (which store the relationship information between nodes).

A) Amazon Neptune

Amazon Neptune is the fully managed graph database service available on AWS. This database makes it easy to quickly traverse complex relationships between connected datasets. It supports Apache TinkerPop Gremlin and SPARQL (for RDF) as graph query languages.

Useful Scenario

Fraud detection and recommendation engines. For fraud detection, transactions are stored as a graph, which helps identify related entities in a dataset. Once the patterns are detected, it becomes easier to find the fraudulent ones.
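The fraud scenario can be sketched with an adjacency list and a breadth-first traversal. This is plain Python rather than Gremlin or SPARQL, and the accounts, devices, and cards are invented for illustration. The idea is that accounts sharing a device or card, directly or through intermediaries, end up connected in the graph.

```python
from collections import defaultdict, deque

# Undirected graph stored as an adjacency list: node -> set of neighbours.
graph = defaultdict(set)

def add_edge(a, b):
    """Record a relationship between two nodes in both directions."""
    graph[a].add(b)
    graph[b].add(a)

# Hypothetical data: accounts linked to the devices and cards they used.
add_edge("account:alice", "device:D1")
add_edge("account:bob", "device:D1")
add_edge("account:bob", "card:C9")
add_edge("account:carol", "card:C9")
add_edge("account:dave", "device:D2")

def connected_accounts(start):
    """Breadth-first search for every other account reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(n for n in seen if n.startswith("account:") and n != start)

print(connected_accounts("account:alice"))  # ['account:bob', 'account:carol']
```

If Alice's account is flagged, Bob and Carol become candidates for review, while Dave, who shares nothing with them, does not.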


5) Search

A search database makes it easy to search for any kind of information in your data (including log files, text files, messages, etc.) and to provide near real-time visualizations and analytics of that data.

A) Amazon OpenSearch Service

This is the fully managed search service available on AWS. OpenSearch is an open-source fork of Elasticsearch and Kibana, and the service was renamed from Amazon Elasticsearch Service to Amazon OpenSearch Service in 2021.

Useful Scenario

This is mostly used by developers for full-text search and log analytics. An example is searching documents for a particular word: the service returns the matching documents and can provide an aggregate count of the word to summarize the data.
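The word-search example can be sketched with an inverted index, the general structure behind full-text search. This is a toy illustration in plain Python, not the OpenSearch API; the log lines and the `search` helper are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical log lines keyed by document id.
documents = {
    1: "ERROR timeout while connecting to database",
    2: "INFO request completed",
    3: "ERROR database connection refused, retrying database call",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index: term -> {doc_id: occurrences of the term in that doc}.
index = defaultdict(Counter)
for doc_id, text in documents.items():
    for term in tokenize(text):
        index[term][doc_id] += 1

def search(term):
    """Return (matching doc ids, total occurrences) for a single term."""
    postings = index.get(term.lower(), Counter())
    return sorted(postings), sum(postings.values())

print(search("database"))  # ([1, 3], 3)
```

The index is built once, so each search is a dictionary lookup instead of a scan over every document.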


6) Time-Series

A time-series database is used to efficiently store and retrieve trillions of events in near real time. Each data point is stored as a timestamp paired with an associated value, which makes it easy to analyze how values change over time.
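The time-and-value pairing can be sketched with two parallel sorted lists and a binary search for range queries. This is a plain Python illustration, not the Timestream API; the sensor readings and helper names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sensor readings: timestamps (seconds) kept in sorted order,
# with the value recorded at each timestamp in the matching position.
timestamps = [1000, 1060, 1120, 1180, 1240]
values = [20.0, 21.0, 23.0, 22.0, 24.0]

def range_query(start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end."""
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return list(zip(timestamps[lo:hi], values[lo:hi]))

def average(start, end):
    """Average value over a time window, or None if the window is empty."""
    points = range_query(start, end)
    return sum(v for _, v in points) / len(points) if points else None

print(range_query(1060, 1180))  # [(1060, 21.0), (1120, 23.0), (1180, 22.0)]
print(average(1060, 1180))      # 22.0
```

Because the data is ordered by time, a window query only touches the points inside the window.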

A) Amazon Timestream

This is a fully managed serverless time-series database service. It is used to process a huge volume of data over time.

Useful Scenario

Stock market and high-volume IoT device data, where time is the most important dimension for analyzing trends and patterns. Amazon Timestream has an in-memory store for recent data, which makes real-time use cases (analyzing the most recent data) extremely performant.


7) Ledger

A ledger database is an append-only NoSQL database, i.e. it provides an immutable, transparent, and cryptographically verifiable log of changes.

A) Amazon QLDB

This is a fully managed, serverless ledger database that uses PartiQL as the query language and stores data in the Amazon Ion format. The three main components of QLDB are the ledger, the journal, and tables.

  • A ledger contains a journal and a list of tables.
  • The journal holds the ordered, cryptographically verifiable history of every change made to the tables.
  • Tables are sets of documents, i.e. the actual data, stored in the Amazon Ion format.
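What "append-only and cryptographically verifiable" means can be sketched with a simple hash chain, where each journal entry includes the hash of the one before it. This is a simplified illustration and not a description of how QLDB is implemented internally; the `append` and `verify` helpers and the vehicle records are hypothetical.

```python
import hashlib
import json

journal = []  # append-only list of entries; earlier entries are never edited

def append(change):
    """Append a change, chaining it to the hash of the previous entry."""
    prev_hash = journal[-1]["hash"] if journal else "0" * 64
    payload = json.dumps(change, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    journal.append({"change": change, "prev_hash": prev_hash, "hash": entry_hash})

def verify():
    """Recompute every hash; return False if any entry was altered."""
    prev_hash = "0" * 64
    for entry in journal:
        payload = json.dumps(entry["change"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

append({"vin": "VIN123", "owner": "Alice"})
append({"vin": "VIN123", "owner": "Bob"})
print(verify())  # True

journal[0]["change"]["owner"] = "Mallory"  # simulate tampering with history
print(verify())  # False
```

Changing any past entry breaks the chain of hashes, so tampering is detectable even though the current owner is still read from the latest entry.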

Useful Scenario

A government agency requires a method to track the history of vehicle ownership. In this case, an immutable record-based system such as Amazon QLDB, which stores the history of vehicle ownership over time, could be a perfect fit.

Lastly, I have highlighted the differences between the NoSQL databases in the table below:

Database | Data Type | Workloads | Data Size | Performance
--- | --- | --- | --- | ---
Amazon DynamoDB | Semi-structured | Transactional Key-Value / Document Store | High TB range | Ultra-high throughput, low latency (ultra-low latency with DAX)
Amazon Keyspaces | Semi-structured | Cassandra | N/A | Low latency
Amazon DocumentDB | Semi-structured | MongoDB | Up to 64 TB | High throughput, low latency
Amazon ElastiCache | Semi-structured / Unstructured | In-memory caching | Low TB range | High throughput, ultra-low latency
Amazon Neptune | Graph-structured | Highly connected graph datasets | Mid TB range | High throughput, low latency
Amazon QLDB | Structured / Semi-structured | Transactional | N/A | High throughput, low latency


In Conclusion

The explosion of non-relational database services has simplified and optimized backend architectures for a variety of old and new use cases. These non-relational databases cater to specific use cases, and AWS has been bringing them all onto its platform, making them easy to access in one environment. For example, Amazon QLDB is used for developing ledger databases and Amazon Keyspaces is used to run Cassandra workloads. If you are looking for a cloud solution that fits your business, you can reach out to us directly.

Indellient takes a customer-first approach to help you build a modern cloud strategy on Amazon Web Services, Windows Azure, and Google Cloud Platform. Our team can help you build, replatform, migrate and integrate applications, so you can benefit from the scalability, agility, and performance available through cloud technologies.

Indellient is an IT Professional Services Company that specializes in Data Analytics, Cloud Services, DevOps Services, and Business Process Management.

About The Author

Hello, my name is Akinbode Omotayo, Senior Data Engineer at Indellient. I have honed my skill set over more than 7 years in the IT field, after completing my Master’s in the University of Ottawa’s Systems Science program. I have an enthusiasm for Big Data programming projects (data warehousing and Big Data analytics projects).