AWS cloud practitioner notes - Storage and databases

Last updated Apr 6, 2024 Published Jan 7, 2021

The content here is under the Attribution 4.0 International (CC BY 4.0) license

This module describes the AWS services for storage and databases. AWS offers services that range from file storage to serverless databases.

Module 5 - Instance stores and Amazon Elastic Block Store

Block-level storage volumes are places to store files. EC2 offers two types of block storage:

  • Instance store volumes (physically attached to the EC2 host) - temporary storage [1]
  • Amazon EBS volumes (virtual hard drives) - persistent storage

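Incremental snapshots (backups) can be taken of EBS volumes and restored later. The first snapshot copies all the data on the volume; later snapshots save only the blocks that changed since the previous one.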
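To illustrate what "incremental" means, here is a minimal Python sketch. It is a toy in-memory model, not the EBS API: `take_snapshot`, `restore`, and the list-of-blocks volume are hypothetical names chosen for illustration, and the volume is assumed to keep a fixed number of blocks.

```python
def take_snapshot(volume, previous_state):
    """Return only the blocks that differ from the previous state.

    volume: list of bytes blocks (toy stand-in for an EBS volume)
    previous_state: blocks as of the last snapshot, or None for the first one
    """
    if previous_state is None:
        # First snapshot: every block must be saved.
        return {i: block for i, block in enumerate(volume)}
    # Later snapshots: save only the blocks that changed.
    return {i: block for i, block in enumerate(volume) if block != previous_state[i]}


def restore(snapshots, num_blocks):
    """Rebuild the volume by replaying snapshots from oldest to newest."""
    volume = [b""] * num_blocks
    for snap in snapshots:
        for i, block in snap.items():
            volume[i] = block
    return volume


volume = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
snap1 = take_snapshot(volume, None)

state_at_snap1 = list(volume)
volume[2] = b"CCCC"  # change a single block
snap2 = take_snapshot(volume, state_at_snap1)

print(len(snap1), len(snap2))  # 4 1
restored = restore([snap1, snap2], len(volume))
```

The second snapshot holds one block instead of four, yet replaying both snapshots in order reproduces the current volume.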

Module 5 - Amazon Simple Storage Service (Amazon S3)

Amazon S3 is a storage service that allows you to store and retrieve files at any scale and pay only for what you use.

  • Store data as an object
  • Store objects in buckets
  • Maximum object size of 5 TB
  • Version objects
  • Create multiple buckets

S3 storage classes (or tiers)

  • S3 Standard: 99.999999999% (11 nines) durability
  • S3 Standard-Infrequent Access (S3 Standard-IA), for backups and disaster recovery files
  • S3 One Zone-Infrequent Access (S3 One Zone-IA)
  • S3 Intelligent-Tiering, for unknown or changing access patterns
  • S3 Glacier, for archiving data (objects can be retrieved within a few minutes to hours)
  • S3 Glacier Deep Archive (objects can be retrieved within 12 hours)

Objects can be moved between tiers automatically with S3 lifecycle management, for example from S3 Standard to S3 Standard-IA.
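As a rough sketch of such a rule, the Python below builds a lifecycle configuration in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, the 30-day threshold, and the helper names `build_lifecycle_config` and `apply_lifecycle` are assumptions made for illustration; they do not come from these notes.

```python
def build_lifecycle_config(days_to_ia=30, prefix=""):
    """Build a lifecycle configuration that moves objects to S3 Standard-IA."""
    return {
        "Rules": [
            {
                "ID": "standard-to-standard-ia",  # hypothetical rule name
                "Filter": {"Prefix": prefix},     # empty prefix = all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket_name, config):
    """Send the configuration to S3. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )


config = build_lifecycle_config()
print(config["Rules"][0]["Transitions"])
```

Only `build_lifecycle_config` runs here; calling `apply_lifecycle` would require a real bucket name and valid credentials.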

Comparing Amazon EBS and Amazon S3

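EBS                               | S3
----------------------------------|------------------------------
Up to 16 TiB per volume           | Unlimited storage
Survives EC2 termination          | Individual objects up to 5 TB
Solid-state drives (SSD) by default | Write once/read many
HDD options                       | 99.999999999% durability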

Use case 1 - App to upload a photo file

S3 is the preferred approach here, for the following reasons:

  • Web-enabled
  • Regionally distributed
  • Offers cost savings
  • Serverless

Use case 2 - Video editing on a file

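Object storage treats every file as a complete, discrete object, which is perfect for files that are consumed as a whole. Changing any part of an object means uploading the entire object again.

Block storage breaks files into smaller pieces (blocks), so a small edit only rewrites the affected blocks. For many small changes to a large file, EBS is preferable. In short:

  • Complete objects with occasional changes = S3
  • Complex read, write, change functions = EBS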
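The difference can be sketched with a toy comparison of how many bytes get rewritten after a small edit. This is an illustrative model only: the 4-byte block size and both function names are made up for the example and do not reflect real S3 or EBS internals.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example stays readable


def bytes_rewritten_as_object(old, new):
    """Object storage: any change means re-uploading the whole object."""
    return len(new) if old != new else 0


def bytes_rewritten_as_blocks(old, new, block_size=BLOCK_SIZE):
    """Block storage: only blocks whose contents changed are rewritten."""
    rewritten = 0
    for start in range(0, max(len(old), len(new)), block_size):
        old_block = old[start:start + block_size]
        new_block = new[start:start + block_size]
        if old_block != new_block:
            rewritten += len(new_block)
    return rewritten


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBBBCxCCDDDD"  # a one-byte edit inside the third block

print(bytes_rewritten_as_object(old, new))  # 16
print(bytes_rewritten_as_blocks(old, new))  # 4
```

With a multi-gigabyte video file and frequent small edits, the same gap is what makes block storage the better fit.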

Module 5 - Amazon Elastic File System (EFS)

Amazon EFS is a managed file system. Multiple instances can access the data in EFS at the same time, and it scales up and down as needed. The differences between EBS and EFS are:

EBS                                                              | EFS
-----------------------------------------------------------------|------------------------------------------------------
Volumes attach to EC2 instances                                  | Multiple instances reading and writing simultaneously
Availability Zone-level resource                                 | True file system across multiple Availability Zones
Must be in the same Availability Zone as the attached EC2 instance | Regional resource
Volumes do not scale automatically                               | Automatically scales up and down

Module 5 - Amazon Relational Database Service (RDS)

Amazon RDS is a managed relational database service. It provides:

  • Automated patching
  • Backups
  • Redundancy
  • Failover
  • Disaster recovery

Amazon Aurora

  • MySQL or PostgreSQL support
  • 1/10th the cost of commercial databases
  • Data replication
  • Up to 15 read replicas
  • Automated backup to S3

Module 5 - Amazon DynamoDB

DynamoDB is serverless in the sense that you don’t have to provision, install, maintain, or operate the servers that the database runs on. DynamoDB scales automatically as the size of the database grows or shrinks.

  • Non-relational database
  • Millisecond response time
  • Fully managed
  • Highly scalable

Comparing Amazon RDS and DynamoDB

RDS                          | DynamoDB
-----------------------------|--------------------------------
Automatic high availability  | Key-value
Customer ownership of data   | Massive throughput capabilities
Customer ownership of schema | PB size potential
Customer control of network  | Granular API access

Use case 1 - sales supply chain application

RDS is the better choice here, because this kind of application is built for analytics and requires complex relationships between the data.

Use case 2 - Employee contact list application

DynamoDB is the better choice here. This is single-table territory: the data could be modeled relationally, but that is not required, and maintaining the relationships would only add overhead.
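A contact list maps naturally onto key-value access: one item per employee, looked up by a single key. The sketch below uses a hypothetical in-memory `ContactTable` class as a stand-in. It is not the DynamoDB API, and the attribute names (`employee_id`, `name`, `phone`, `email`) are invented for the example.

```python
class ContactTable:
    """Toy key-value table: one item per employee, looked up by a single key."""

    def __init__(self, key_name):
        self.key_name = key_name
        self._items = {}

    def put_item(self, item):
        # Every item needs the key attribute; all other attributes are optional.
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key):
        # Returns None when no item has this key.
        return self._items.get(key)


contacts = ContactTable(key_name="employee_id")
contacts.put_item({"employee_id": "e-001", "name": "Ana", "phone": "555-0100"})
contacts.put_item({"employee_id": "e-002", "name": "Ben", "email": "ben@example.com"})

print(contacts.get_item("e-002")["name"])  # Ben
```

Note that the two items carry different attributes and no joins are involved, which is why a relational schema adds little for this use case.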

Module 5 - Amazon Redshift

Amazon Redshift is a data warehouse service used for analytics. You can collect data from many sources and see the relationships across the data.

Module 5 - AWS Database Migration Service (AWS DMS)

AWS Database Migration Service helps you migrate databases into AWS. The source database remains operational during the migration.

Homogeneous migrations

The first type of migration is homogeneous: the source and target databases are of the same type. For example:

  • MySQL to Amazon RDS for MySQL
  • Microsoft SQL Server to Amazon RDS for SQL Server
  • Oracle to Amazon RDS for Oracle

Heterogeneous migrations

The second type of migration is heterogeneous: the source and target databases are of different types. This type of migration takes two steps. First, the AWS Schema Conversion Tool converts the source schema and code to match the target database. Then AWS DMS migrates the data.
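The two migration types can be summarized in a small decision sketch. The function name `migration_steps` is hypothetical, and comparing plain engine names is a simplification: it assumes the caller passes the engine (for example "MySQL") rather than a service name such as "Amazon RDS for MySQL".

```python
def migration_steps(source_engine, target_engine):
    """Return the ordered steps for migrating between two database engines."""
    if source_engine.strip().lower() == target_engine.strip().lower():
        # Homogeneous: same engine on both sides, so no conversion step.
        return ["migrate data with AWS DMS"]
    # Heterogeneous: convert schema and code first, then migrate the data.
    return [
        "convert schema and code with AWS Schema Conversion Tool",
        "migrate data with AWS DMS",
    ]


print(migration_steps("MySQL", "MySQL"))
print(migration_steps("Oracle", "PostgreSQL"))
```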

Module 5 - Additional database services

  • Amazon DocumentDB - A document database service with MongoDB compatibility
  • Amazon Neptune - A graph database service
  • Amazon Quantum Ledger Database (Amazon QLDB) - A ledger database service that lets you review a complete history of all the changes made to your application data
  • Amazon Managed Blockchain - A service used to create and manage blockchain networks
  • Amazon ElastiCache - A service that adds a caching layer on top of your databases to improve application response time
  • Amazon DynamoDB Accelerator (DAX) - An in-memory cache for DynamoDB

References

  [1] AWS, “Amazon EC2 instance store,” 2021. [Online]. Available at: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html. [Accessed: 07-Jan-2021]