AZ-204 Developer Associate: Exploring Azure Storage Solutions for Developers
The content here is under the Attribution 4.0 International (CC BY 4.0) license
Storage is one of the main topics to get familiar with for AZ-204 (and other exams as well). In this section, we will go over different aspects of the Azure storage account and its related services, more specifically: access keys, the AzCopy tool, blobs, redundancy and Cosmos DB.
Storage solutions
To follow along with the code examples, the .NET SDK must be installed. For Linux, instructions can be found in Microsoft’s official documentation.
- Azure storage account provides storage on the cloud
- blob: general purpose (images, videos)
- files are stored as objects (blobs) and require a container
- a container's access level can be private (no anonymous access), blob (anonymous read access for blobs only) or container (anonymous read access for containers and blobs)
- table: table data
- queue: pub/sub
- file: shared files between vms
Access keys, shared access signatures and Azure Active Directory
- Azure Storage Explorer (desktop app)
- access keys
- SAS (shared access signatures)
- can define the expiration date
- limited access to specific services
- can't be revoked on its own (an ad-hoc SAS remains valid until it expires or the account key is regenerated)
- stored access policies
- can be revoked, fine-grained access
- Azure Active Directory
- Access control (IAM - Identity and Access Management)
Revoke SAS
- Revoke the user delegation key
- Remove the role assignment for the security principal
Access tiers
hot
- selected by default when creating a storage account
cool (lower storage cost, but higher access cost)
- the default access tier (hot or cool) is set at the account level when you create the storage account
archive
- it adds a rehydration step to retrieve the file (it takes time: up to 15 hours for standard priority, or under 1 hour for high priority)
- single blobs can be sent to the archive tier
- to retrieve an archived blob, edit it and change the tier back to hot or cool, choosing a rehydration priority of standard or high (a small SDK sketch follows below)
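The tier of an individual blob can also be changed programmatically. Below is a minimal sketch using the Azure.Storage.Blobs SDK (covered later in this section), assuming an async context; the connection string, container and blob names are placeholders.
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// placeholder connection string, container and blob names
var blobClient = new BlobClient("<storage-account-connection-string>", "my-container", "report.pdf");

// send the blob to the archive tier
await blobClient.SetAccessTierAsync(AccessTier.Archive);

// later: rehydrate it back to hot with high priority (rehydration can still take time to complete)
await blobClient.SetAccessTierAsync(AccessTier.Hot, rehydratePriority: RehydratePriority.High);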
General-purpose v1 storage accounts do not support Event Grid.
Performance
- Standard
- Premium (optimized for a single workload type, chosen at account creation)
- Block blobs
- File shares
- Page blobs
Redundancy
- LRS (Locally Redundant Storage)
- GRS (Geo-Redundant Storage)
- ZRS (Zone-Redundant Storage)
- GZRS (Geo-Zone-Redundant Storage)
Data protection
soft deletes
Resources marked as deleted (with soft delete) are retained for a specified retention period (for blobs, configurable from 1 to 365 days). During that period the service provides a mechanism for recovering the deleted object, essentially undoing the deletion.
Blobs
- Lifecycle rules
- add a rule to all blobs or limit it to a subset of blobs with filters
- example of a rule
- if the blob has not been modified for one day, move it to the cool tier
- rules can take up to 24 hours to take effect
- Blob versioning
- previous versions are kept as immutable (read-only) blobs, which adds to the storage cost
- to enable it: Data protection -> check the “Turn on versioning for blobs” box
- Blob snapshots
- a read-only picture of a particular blob at a point in time
- deleting a blob requires deleting its snapshots first (or deleting the blob together with its snapshots)
Soft delete
- Data protection -> check “Turn on soft delete for blobs” and set the retention period in days
- soft delete also covers blob snapshots
Using the blob storage SDK
Note: to follow the code examples, the Azure.Storage.Blobs package is required.
The exam refers to versions 11/12, but in the mock exams versions 12 and 13 can also be found. A short usage sketch follows at the end of this list.
- Azure.Storage.Blobs - v12
- BlobServiceClient class
- package Azure.Storage.Blobs
- Work with Azure storage resources and blob containers (create container)
- BlobContainerClient class
- package Azure.Storage.Blobs
- Work with storage containers and blobs (uploads a blob to a container sync or async, with a flag to overwrite if the blob already exists in the container; lists all blobs in a container)
- BlobClient class
- Work with Storage blobs (download blob)
- BlobDownloadInfo class
- Represents the content returned from a downloaded blob
- BlobSasBuilder
- namespace Azure.Storage.Sas (part of the Azure.Storage.Blobs package)
- enables setting shared access signatures (SAS) programmatically
- Each blob can hold metadata
- accessing metadata programmatically can be achieved through GetProperties / GetPropertiesAsync on BlobClient
- Lease (Exclusive lock)
- To acquire the lease, the method GetBlobLeaseClient is used to retrieve the lease representation
- Under the lease representation the method Acquire is called with the time of the lease
- in the end, call the Release method to release the lease; if Release is never called, a lease acquired for a finite duration (15-60 seconds) expires automatically after that time, while an infinite lease must be released or broken explicitly
- Streams can be used to make changes to a file in memory
- MemoryStream
- StreamReader
- StreamWriter
- ARM templates
- Automation of the storage account creation process
- Change feed (a stream of the changes, stored in Apache Avro format)
- General-purpose v2 and Blob storage accounts are supported
- Data protection -> turn on change feed (it will create a container in the storage account named $blobchangefeed)
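Putting the classes above together, here is a minimal v12 sketch, assuming an async context; the connection string, container and blob names are placeholders.
using System;
using System.Collections.Generic;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized; // GetBlobLeaseClient extension
using Azure.Storage.Sas;

// placeholder connection string taken from the storage account access keys
string connectionString = "<storage-account-connection-string>";

var service = new BlobServiceClient(connectionString);
BlobContainerClient container = service.GetBlobContainerClient("demo");
await container.CreateIfNotExistsAsync();

// upload (overwrite if the blob already exists) and list the blobs in the container
BlobClient blob = container.GetBlobClient("hello.txt");
await blob.UploadAsync("hello.txt", overwrite: true);
await foreach (BlobItem item in container.GetBlobsAsync())
    Console.WriteLine(item.Name);

// download through BlobDownloadInfo
BlobDownloadInfo download = await blob.DownloadAsync();
using (var reader = new StreamReader(download.Content))
    Console.WriteLine(await reader.ReadToEndAsync());

// metadata round trip via SetMetadata / GetProperties
await blob.SetMetadataAsync(new Dictionary<string, string> { ["category"] = "samples" });
BlobProperties props = await blob.GetPropertiesAsync();
Console.WriteLine(props.Metadata["category"]);

// read-only SAS valid for one hour (works because the client was built from the account key)
var sasBuilder = new BlobSasBuilder
{
    BlobContainerName = container.Name,
    BlobName = blob.Name,
    Resource = "b",
    ExpiresOn = DateTimeOffset.UtcNow.AddHours(1)
};
sasBuilder.SetPermissions(BlobSasPermissions.Read);
Uri sasUri = blob.GenerateSasUri(sasBuilder);

// exclusive lock (lease) for 30 seconds, released explicitly
BlobLeaseClient lease = blob.GetBlobLeaseClient();
await lease.AcquireAsync(TimeSpan.FromSeconds(30));
// ... work with the blob ...
await lease.ReleaseAsync();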
Authorization for blob or queue
The authentication via API to consume storage account services is based on permissions that are defined when a new application is registered (when authenticating via Azure AD credentials).
- type of grant: user_impersonation
- Authorize access to blob or queue data from a native or web application
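As a sketch of what Azure AD authorization looks like from code (instead of access keys), the Azure.Identity package can supply the token; the account name and the RBAC role mentioned in the comment are assumptions.
using System;
using Azure.Identity;
using Azure.Storage.Blobs;

// DefaultAzureCredential picks up the signed-in identity (Azure CLI, managed identity, etc.)
var service = new BlobServiceClient(
    new Uri("https://<storage-account>.blob.core.windows.net"),
    new DefaultAzureCredential());

// the identity needs a data-plane RBAC role such as Storage Blob Data Reader/Contributor on the account
BlobContainerClient container = service.GetBlobContainerClient("demo");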
ARM template
ARM template to deploy three storage accounts:
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "storageCount": {
      "type": "int",
      "defaultValue": 3
    }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2019-04-01",
      "name": "[concat(copyIndex(),'storage', uniqueString(resourceGroup().id))]",
      "location": "[resourceGroup().location]",
      "sku": {
        "name": "Standard_LRS"
      },
      "kind": "Storage",
      "properties": {},
      "copy": {
        "name": "storagecopy",
        "count": "[parameters('storageCount')]"
      }
    }
  ]
}
AzCopy tool
Copies files from one storage account to another; installation instructions are available in Microsoft’s official documentation.
From local to azure storage account container (Upload)
- azcopy make CONTAINER_URL - creates a container (it requires a URL)
- azcopy copy FILE_NAME CONTAINER_URL
- azcopy copy "dir/*" CONTAINER_URL (not recursive, will not include subfolders)
- azcopy copy "dir/*" CONTAINER_URL --recursive (will include subdirectories) - the form --recursive=true is also accepted
From Azure storage account container to local (Download)
- azcopy copy FILE_AND_CONTAINER_URL my_local_file.txt
- azcopy copy CONTAINER_URL "." --recursive (will include subdirectories)
Copy between two storage accounts
- azcopy copy SOURCE_STORAGE_ACCOUNT_URL DESTINATION_STORAGE_ACCOUNT_URL
- azcopy copy SOURCE_STORAGE_ACCOUNT_URL DESTINATION_STORAGE_ACCOUNT_URL --recursive (to copy everything in the storage account)
From one Azure container to another Azure container (sync)
azcopy sync FROM_STORAGE TARGET_STORAGE
Azure CLI tool
- az storage blob copy
- az storage blob delete
- az storage blob download
- az storage blob sync
- az storage blob upload
- az storage blob generate-sas (creates a SAS token)
Official documentation for these Azure CLI storage commands can be found in Microsoft’s official documentation.
File shares (kind of a dropbox)
- Types
- Hot - for frequent access
- Cool - for files that are not consumed frequently - lower storage cost but higher access cost
- Premium - only available when the storage account uses premium performance (disabled if performance is standard)
- Transaction optimized - general purpose tier for transaction-heavy workloads
- it is supported across Windows, Linux and macOS
- once the file share is created it can be mounted as a drive on the host to share files
- Multiple machines can use the same file share
- Firewall in place usually blocks access
Table storage
- stores data based on a key and attribute values
- non-relational structured data
- partition key
- divides the logical data into different partitions to speed up lookups
- row key
- Microsoft.Azure.Cosmos.Table package (a usage sketch follows below)
- CloudStorageAccount represents the storage account connection
- from the account, the CloudTableClient class gives a reference to the table
- map TableEntity to a custom entity
- TableOperation performs the CRUD operations
- batch operations are supported through the TableBatchOperation class
- TableOperation.Retrieve is used to fetch data
Tip: TableOperation.Retrieve requires both the partition key and the row key; fetching data without a partition key is possible with a TableQuery, but it results in a full table scan and is much slower.
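A minimal sketch with the Microsoft.Azure.Cosmos.Table package, assuming an async context; the table name, entity shape and keys are illustrative.
using System;
using Microsoft.Azure.Cosmos.Table;

// connect: account -> table client -> table reference
CloudStorageAccount account = CloudStorageAccount.Parse("<storage-account-connection-string>");
CloudTableClient tableClient = account.CreateCloudTableClient();
CloudTable table = tableClient.GetTableReference("books");
await table.CreateIfNotExistsAsync();

// insert (or replace) an entity
var book = new BookEntity("fiction", "978-0000000000") { Title = "Sample" };
await table.ExecuteAsync(TableOperation.InsertOrReplace(book));

// fetch it back: Retrieve needs both the partition key and the row key
TableResult result = await table.ExecuteAsync(TableOperation.Retrieve<BookEntity>("fiction", "978-0000000000"));
Console.WriteLine(((BookEntity)result.Result).Title);

// custom entity mapped to the table: PartitionKey = genre, RowKey = isbn
public class BookEntity : TableEntity
{
    public BookEntity() { }
    public BookEntity(string genre, string isbn) { PartitionKey = genre; RowKey = isbn; }
    public string Title { get; set; }
}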
Storage queue
- Storage queue service
- queues are used to decouple applications
- a simpler solution compared with Service Bus
- to interact with a queue in C#, the package used is Azure.Storage.Queues (a usage sketch follows this list)
- QueueClient connects to the queue (takes the connection string and the queue name)
- use the SendMessage method to send items to the queue
- PeekMessage or PeekMessages (Azure.Storage.Queues.Models) reads messages without removing them from the queue
- ReceiveMessage returns a QueueMessage; it requires a manual delete with the DeleteMessage method
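A minimal sketch of the queue SDK, assuming an async context; the connection string and queue name are placeholders.
using System;
using Azure.Storage.Queues;
using Azure.Storage.Queues.Models;

var queue = new QueueClient("<storage-account-connection-string>", "orders");
await queue.CreateIfNotExistsAsync();

// send a message
await queue.SendMessageAsync("order-42");

// peek does not remove or hide messages
PeekedMessage[] peeked = await queue.PeekMessagesAsync(maxMessages: 5);

// receive hides messages for the visibility timeout; delete removes them for good
QueueMessage[] messages = await queue.ReceiveMessagesAsync(maxMessages: 5);
foreach (QueueMessage message in messages)
{
    Console.WriteLine(message.MessageText);
    await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt);
}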
It is possible to create an Azure Function triggered by a queue (a sketch follows below).
- Azure Functions expects the queue message to be base64 encoded (both to read and to push)
- the function will retry a message 5 times (the default maxDequeueCount); if it keeps failing, the message is moved to a poison queue named <queue-name>-poison
- The package used for functions and queues is Microsoft.Azure.WebJobs
It is also possible to read from a queue and store the information in Table storage via Azure Functions.
- Azure Service Bus is the alternative for more advanced messaging scenarios
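As referenced above, a minimal sketch of a queue-triggered function using the in-process model (Microsoft.Azure.WebJobs with the storage bindings); the queue name and connection setting are placeholders.
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrderQueueFunction
{
    // fires for every message on the "orders" queue; after 5 failed attempts
    // (the default maxDequeueCount) the runtime moves the message to "orders-poison"
    [FunctionName("OrderQueueFunction")]
    public static void Run(
        [QueueTrigger("orders", Connection = "AzureWebJobsStorage")] string message,
        ILogger log)
    {
        log.LogInformation($"Processing queue message: {message}");
    }
}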
Azure Cosmos DB
Some of the common use cases described in the official documentation for Azure Cosmos DB include:
- IoT solution
- Gaming
- Mobile applications
- Fully managed NoSQL database
- there are no foreign keys or relationships in Cosmos DB; instead there is the concept of embedded data (like nested objects)
- highly available (see the availability options below)
- the APIs available for Cosmos DB are the SQL API, Table API, MongoDB API, Gremlin API and Cassandra API
- you choose which one to consume when creating the Cosmos DB instance
- Capacity mode
- the two capacity modes are provisioned throughput and serverless
- in both you are charged for storage and for request units (RUs)
- 400 RU/s and 5 GB of storage are offered for free (free tier)
- the package to interact with Cosmos DB is Microsoft.Azure.Cosmos (a usage sketch follows at the end of this list)
- to connect to Cosmos DB, the connection string is found under Keys
- CosmosClient is used to connect to the database
- Cosmos db provides a change feed (when updating a document)
- Cosmos db supports stored procedures and triggers
- Composite indexes
- are required when ordering data by two fields, otherwise an error will be raised
- to add a composite index
- Under container
- Settings
- Indexing Policy
- Time to live (TTL)
- Cosmos DB uses partitions to enable efficient queries
- Change feed design patterns in Azure Cosmos DB
The full Microsoft documentation for creating a Cosmos DB instance from the Azure CLI is also good to keep in mind.
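As referenced above, a minimal sketch with the Microsoft.Azure.Cosmos package (SQL API), assuming an async context; the database, container and partition key names are illustrative.
using System;
using Microsoft.Azure.Cosmos;

// connection string comes from the Keys blade of the Cosmos DB account
var client = new CosmosClient("<cosmos-connection-string>");

// create (if needed) a database and a container partitioned by /category with 400 RU/s
DatabaseResponse dbResponse = await client.CreateDatabaseIfNotExistsAsync("appdb");
Database database = dbResponse.Database;
ContainerResponse containerResponse = await database.CreateContainerIfNotExistsAsync("items", "/category", throughput: 400);
Container container = containerResponse.Container;

// insert a document (no foreign keys: related data is embedded)
var item = new { id = Guid.NewGuid().ToString(), category = "books", title = "AZ-204 notes" };
await container.CreateItemAsync(item, new PartitionKey("books"));

// query documents within a single partition
var query = new QueryDefinition("SELECT * FROM c WHERE c.category = @cat").WithParameter("@cat", "books");
FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(query);
while (iterator.HasMoreResults)
{
    foreach (var doc in await iterator.ReadNextAsync())
        Console.WriteLine(doc);
}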
Consistency
Cosmos DB offers different levels of consistency, named:
- Strong (always in sync across replicas, increases latency)
- Bounded Staleness (replicates data asynchronously; readers in the write region see strong consistency)
- reads may lag behind writes, but only within a configured tolerance window (for example 5 seconds)
- or by at most a configured number of versions
- Session (client-centred: clients sharing the same session read their own writes)
- Consistent Prefix (readers never see out-of-order writes)
- Eventual (no ordering guarantee for reads; readers may see out-of-order writes)
The Microsoft documentation depicts the consistency levels as a spectrum:
Strong --- Bounded Staleness --- Session --- Consistent Prefix --- Eventual
(moving from Strong towards Eventual trades consistency for higher availability and higher throughput)
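The default consistency level is set at the account level (see the az cosmosdb create example later in this section); a client can only relax it, never strengthen it. A minimal sketch, assuming the Microsoft.Azure.Cosmos package:
using Microsoft.Azure.Cosmos;

// request a weaker consistency than the account default for this client only
var client = new CosmosClient(
    "<cosmos-connection-string>",
    new CosmosClientOptions { ConsistencyLevel = ConsistencyLevel.Eventual });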
Partition key
Partition keys are used to spread the workload evenly across partitions in Cosmos DB. In some cases a good partition key is hard to find: for example, a property may have hundreds of values yet too few distinct (or too unevenly distributed) values to split the data well.
The alternatives to that end are:
- concatenating multiple property values, optionally with a random suffix (known as synthetic partition keys - apparently this applies to the SQL API)
- Using a hash suffix that is appended to a property value
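A minimal sketch of building a synthetic partition key by concatenating a property value with a random suffix; the document shape and the suffix range are illustrative.
using System;

// spread documents for the same device across 10 logical partitions
string deviceId = "device-42";
int suffix = new Random().Next(0, 10);
string syntheticPartitionKey = $"{deviceId}-{suffix}";

var telemetry = new
{
    id = Guid.NewGuid().ToString(),
    deviceId,
    partitionKey = syntheticPartitionKey,
    temperature = 21.5
};
// the container would be created with /partitionKey as its partition key path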
Triggers
Cosmos DB offers triggers that are scoped to specific operations through the TriggerOperation enum (a creation sketch follows the list):
- Delete
- Update
- Create
- Replace
- All
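A minimal sketch of registering a pre-trigger with the Microsoft.Azure.Cosmos package, assuming an async context; the trigger id and the JavaScript body are illustrative.
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Scripts;

Container container = new CosmosClient("<cosmos-connection-string>")
    .GetContainer("appdb", "items");

// pre-trigger executed only for Create operations
await container.Scripts.CreateTriggerAsync(new TriggerProperties
{
    Id = "addTimestamp",
    TriggerType = TriggerType.Pre,
    TriggerOperation = TriggerOperation.Create,
    Body = @"function addTimestamp() {
        var item = getContext().getRequest().getBody();
        item.createdAt = new Date().toISOString();
        getContext().getRequest().setBody(item);
    }"
});
// note: triggers must be requested explicitly when creating items,
// e.g. new ItemRequestOptions { PreTriggers = new[] { "addTimestamp" } }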
CLI with cosmos
- az cosmosdb create
resourceGroup=my-group
accountName=my-name
databaseName=my-db
# --max-interval only applies to the BoundedStaleness consistency level
consistencyLevel=BoundedStaleness
az cosmosdb create --name $accountName \
--resource-group $resourceGroup \
--max-interval 5 \
--enable-automatic-failover true \
--default-consistency-level=$consistencyLevel \
--locations regionName=southcentralus failoverPriority=0 isZoneRedundant=False \
--locations regionName=northcentralus failoverPriority=1 isZoneRedundant=True
refs az cosmosdb create
CosmosDB RBAC
Data factory
- ETL (Extract, Transform, Load) tool used to handle data
- Resources -> data factory
- container -> upload csv file
- data factory
- source (can even be Amazon S3)
- destination
Blob-triggered Azure Function with Cosmos DB output
- package for the Cosmos DB output binding: Microsoft.Azure.WebJobs.Extensions.CosmosDB (the blob trigger itself comes from the storage bindings)
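A minimal sketch using the in-process model and version 3 of the Cosmos DB extension (parameter names differ in v4); the container path, database, collection and connection setting names are placeholders.
using System;
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class BlobToCosmosFunction
{
    [FunctionName("BlobToCosmosFunction")]
    public static void Run(
        // fires when a blob lands in the "uploads" container
        [BlobTrigger("uploads/{name}", Connection = "AzureWebJobsStorage")] Stream blob,
        string name,
        // output binding: the assigned object becomes a new Cosmos DB document
        [CosmosDB(databaseName: "appdb", collectionName: "files",
            ConnectionStringSetting = "CosmosDbConnection")] out dynamic document,
        ILogger log)
    {
        document = new { id = Guid.NewGuid().ToString(), fileName = name, size = blob.Length };
        log.LogInformation($"Stored metadata for blob {name}");
    }
}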
Got a question?
If you have a question or feedback, don't think twice and click here to leave a comment. Just want to support me? Buy me a coffee!