Design Pattern – Azure Storage

Storage Decision #

Storage capabilities #

  • Scenario: I have an SMB or NFS file share.
    Suggested services: Azure Files or Azure NetApp Files.
    Considerations: The choice of premium or standard Azure Files tiers depends on IOPS, throughput, and your need for latency consistency. If you have an on-premises deployment of NetApp, consider using Azure NetApp Files. If you need to migrate your access control lists and timestamps to the cloud, Azure File Sync can bring these settings to your Azure file shares.

  • Scenario: I have an on-premises object storage system for petabytes of data, such as Dell-EMC ECS.
    Suggested services: Blob Storage.
    Considerations: Azure Blob Storage provides premium, hot, cool, and archive tiers to match your workload performance and cost needs.

  • Scenario: I have a tape library for backup and disaster recovery or long-term data retention.
    Suggested services: Blob Storage.
    Considerations: The Blob Storage archive tier has the lowest possible cost, but it can take hours to rehydrate offline data to a cool, hot, or premium tier for access. Cool tiers provide instantaneous access at low cost.

  • Scenario: I have file or object storage configured to receive my backups.
    Suggested services: Blob Storage or Azure File Sync.
    Considerations: To back up data for long-term retention with lowest-cost storage, move data to Blob Storage and use the cool and archive tiers. To enable fast disaster recovery for file data on a server, sync shares to individual Azure file shares by using Azure File Sync. With Azure file share snapshots, you can restore earlier versions and sync them back to connected servers, or access them natively in the Azure file share.

  • Scenario: I run data replication to a disaster recovery site.
    Suggested services: Azure Files or Azure File Sync.
    Considerations: Azure File Sync removes the need for a disaster recovery server and stores files in native Azure SMB shares. Fast disaster recovery rebuilds any data on a failed on-premises server quickly. You can even keep multiple server locations in sync or use cloud tiering to store only relevant data on-premises.

  • Scenario: I manage data transfer in disconnected scenarios.
    Suggested services: Azure Stack Edge or Data Box Gateway.
    Considerations: Using Azure Stack Edge or Data Box Gateway, you can copy data in disconnected scenarios. When the gateway is offline, it saves all files you copy in the cache, then uploads them when you’re connected.

  • Scenario: I manage an ongoing data pipeline to the cloud.
    Suggested services: Azure Stack Edge or Data Box Gateway.
    Considerations: Move data to the cloud from systems that constantly generate data by having them copy that data to the storage gateway.

  • Scenario: I have bursts of data that arrive at the same time.
    Suggested services: Azure Stack Edge or Data Box Gateway.
    Considerations: Manage large quantities of data that arrive at the same time, such as when an autonomous car pulls into the garage or a gene-sequencing machine finishes its analysis. Copy all that data to Data Box Gateway at fast local speeds, then let the gateway upload it as your network allows.

  • Scenario: I have a data lake such as a Hadoop cluster for HDFS data.
    Suggested services: Data Lake Storage Gen2, Standard SSD, or Premium SSD.
    Considerations: The Data Lake Storage Gen2 feature of Blob Storage provides server-side HDFS compatibility and petabyte scale for parallel analytics. It also offers high availability and reliability. Software like Cloudera can use Premium SSD or Standard SSD on controller/worker nodes, if needed.

  • Scenario: I have a disaster recovery site with strict RPO/RTO that syncs from my primary servers.
    Suggested services: Azure page blobs.
    Considerations: Replication software uses page blobs to enable low-cost replication to Azure without the need for compute virtual machines until failover occurs. For more information, see Backup and disaster recovery for Azure IaaS disks. Note: Page blobs support a maximum of 8 TB.
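The scenario-to-service mapping above can be condensed into a small lookup helper. This is purely illustrative; the scenario keys and the `suggest_services` function are hypothetical names, not part of any Azure SDK:

```python
# Illustrative encoding of the decision table above. The scenario keys and the
# helper function are hypothetical, chosen only to make the mapping concrete.
SCENARIO_TO_SERVICES = {
    "smb_or_nfs_file_share": ["Azure Files", "Azure NetApp Files"],
    "object_storage_petabytes": ["Blob Storage"],
    "tape_library_backup": ["Blob Storage"],
    "backup_target": ["Blob Storage", "Azure File Sync"],
    "dr_replication": ["Azure Files", "Azure File Sync"],
    "disconnected_transfer": ["Azure Stack Edge", "Data Box Gateway"],
    "ongoing_pipeline": ["Azure Stack Edge", "Data Box Gateway"],
    "burst_ingest": ["Azure Stack Edge", "Data Box Gateway"],
    "hdfs_data_lake": ["Data Lake Storage Gen2", "Standard SSD", "Premium SSD"],
    "strict_rpo_rto_dr": ["Azure page blobs"],
}

def suggest_services(scenario: str) -> list[str]:
    """Return the suggested Azure services for a known scenario key."""
    try:
        return SCENARIO_TO_SERVICES[scenario]
    except KeyError:
        raise ValueError(f"Unknown scenario: {scenario!r}") from None
```

A table like this is easy to extend as new scenarios come up, and it keeps the decision logic in one reviewable place.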
  • Will you need to provide downloadable images, documents, or other media as part of your workloads? Azure Blob Storage hosts static files, which are then accessible for download over the internet. For more information, see Static website hosting in Azure Storage.

  • Will you need a location to store virtual machine logs, application logs, and analytics data? You can use Blob Storage to store Azure Monitor log data. See Storage Analytics.

  • Will you need to provide a location for backup, disaster recovery, or archiving workload-related data? Blob Storage provides backup and disaster recovery capabilities. For more information, see Backup and disaster recovery for Azure IaaS disks.

You can also use Blob Storage to back up other resources, like on-premises or IaaS virtual machine-hosted SQL Server data. See SQL Server Backup and Restore.

  • Will you need to support big data analytics workloads? Azure Data Lake Storage Gen2 is built on Azure Blob Storage. Data Lake Storage Gen2 supports large-enterprise data lake functionality. It also can handle storing petabytes of information while sustaining hundreds of gigabits of throughput.

  • Will you need to provide cloud-native file shares? Azure has two services that provide cloud-hosted file shares:
    • Azure NetApp Files provides high-performance NFS shares that are well suited to common enterprise workloads like SAP.
    • Azure Files provides file shares accessible over SMB 3.0 and HTTPS.

  • Will you need to support hybrid cloud storage for on-premises high-performance computing (HPC) workloads? Avere vFXT for Azure is a hybrid caching solution. You can expand your on-premises storage capabilities by using cloud-based storage. Avere vFXT for Azure is optimized for read-heavy HPC workloads that involve 1,000 to 40,000 CPU cores. Avere vFXT for Azure can integrate with on-premises hardware network attached storage (NAS), Blob Storage, or both.

  • Will you need to perform large-scale archiving and syncing of your on-premises data? Azure Data Box products are designed to help you move large amounts of data from your on-premises environment to the cloud.
    • Azure Data Box Gateway is a virtual device that is on-premises. Data Box Gateway helps you manage large-scale data migration to the cloud.
    • Azure Stack Edge accelerates processing and the secure transfer of data to Azure. If you need to analyze, transform, or filter data before you move it to the cloud, use Azure Stack Edge.

Azure Storage services #

Blob Storage #

Blob Storage is Microsoft’s object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn’t adhere to a specific data model or definition, such as text or binary data.

  1. Block blobs store text and binary data. Block blobs are made up of blocks of data that can be managed individually. Block blobs can store up to about 190.7 TiB.
  2. Append blobs are made up of blocks like block blobs, but are optimized for append operations. Append blobs are ideal for scenarios such as logging data from virtual machines.
  3. Page blobs store random access files up to 8 TiB in size. Page blobs store virtual hard drive (VHD) files and serve as disks for Azure virtual machines. For more information about page blobs, see Overview of Azure page blobs.
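The block blob model described in item 1 can be sketched locally: data is uploaded as individually managed blocks, each identified by a base64-encoded block ID, and a final commit of an ordered block list assembles the blob. The class below is a local simulation of that Put Block / Put Block List pattern, not the Azure SDK:

```python
# Local simulation of block blob assembly: stage blocks with IDs, then commit
# an ordered block list. This mimics the REST Put Block / Put Block List flow
# without calling Azure; the class name is hypothetical.
import base64

class SimulatedBlockBlob:
    def __init__(self):
        self._staged = {}      # block_id -> bytes, not yet committed
        self._committed = []   # ordered block IDs after commit

    def stage_block(self, block_id: str, data: bytes) -> None:
        # Azure expects base64-encoded block IDs of equal length within a blob.
        encoded = base64.b64encode(block_id.encode()).decode()
        self._staged[encoded] = data

    def commit_block_list(self, block_ids: list[str]) -> None:
        encoded_ids = [base64.b64encode(b.encode()).decode() for b in block_ids]
        missing = [b for b in encoded_ids if b not in self._staged]
        if missing:
            raise ValueError(f"Unstaged blocks referenced: {missing}")
        self._committed = encoded_ids

    def content(self) -> bytes:
        return b"".join(self._staged[b] for b in self._committed)

blob = SimulatedBlockBlob()
payload = b"hello azure block blobs"
chunk = 8
ids = []
for i in range(0, len(payload), chunk):
    block_id = f"{i:08d}"          # fixed-width IDs, as Azure requires
    blob.stage_block(block_id, payload[i:i + chunk])
    ids.append(block_id)
blob.commit_block_list(ids)
assert blob.content() == payload
```

Because blocks are staged independently, uploads can run in parallel and failed blocks can be retried individually before the single atomic commit.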


Use Blob Storage for the following needs:

  • Serving images or documents directly to a browser.
  • Storing files for distributed access.
  • Streaming video and audio.
  • Writing to log files.
  • Storing data for backup and restore, disaster recovery, and archiving.
  • Storing data for analysis by an on-premises or Azure-hosted service.

Data Lake Storage Gen2 #

Blob Storage supports Data Lake Storage Gen2, Microsoft’s enterprise big data analytics solution for the cloud. Data Lake Storage Gen2 offers a hierarchical file system, with the advantages of Blob Storage, including low-cost tiered storage, high availability, strong consistency, and disaster recovery capabilities.
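One practical consequence of the hierarchical file system is that renaming a directory is a single metadata operation, whereas in a flat blob namespace "directories" are only name prefixes, so a rename must rewrite every blob under the prefix. A rough sketch of that difference, using a hypothetical class rather than the SDK:

```python
# Sketch of the flat-namespace rename cost that a hierarchical namespace
# avoids. In a flat namespace, renaming a "directory" rewrites every blob
# whose name shares the prefix; the class below is hypothetical.
class FlatNamespace:
    def __init__(self, names):
        self.names = set(names)

    def rename_prefix(self, old: str, new: str) -> int:
        """Rename by rewriting each matching blob name; returns ops used."""
        matches = {n for n in self.names if n.startswith(old + "/")}
        for n in matches:
            self.names.remove(n)
            self.names.add(new + n[len(old):])
        return len(matches)  # one operation per blob

flat = FlatNamespace(["logs/2024/a.json", "logs/2024/b.json", "logs/2025/c.json"])
ops = flat.rename_prefix("logs", "archive")
# Three blobs under the prefix -> three rewrite operations; a hierarchical
# namespace would instead perform one directory-level rename.
```

For analytics workloads that reorganize data frequently, this is a large part of why Data Lake Storage Gen2's hierarchical namespace matters.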

Azure disk storage #

Azure disk storage offers persistent, high-performance block storage to power Azure Virtual Machines. Azure disks are highly durable and secure, and they offer the industry’s only single-instance service-level agreement (SLA) for virtual machines that use Premium SSD or Ultra Disk Storage. Azure disks provide high availability when used with availability sets and Availability Zones for your Azure Virtual Machines fault domains. Azure manages disks as a top-level resource, so Azure Resource Manager capabilities, such as Azure role-based access control (Azure RBAC), policy, and tagging, are available by default.

Azure Files #

Azure Files provides fully managed, native SMB file shares, without the need to run a virtual machine. You can mount an Azure Files share as a network drive to any Azure virtual machine or on-premises computer.

Azure NetApp Files #

The Azure NetApp Files service is an enterprise-class, high-performance, metered file storage service. Azure NetApp Files supports any workload type and is highly available by default. You can select service and performance levels and set up snapshots through the service.

Azure Stack Edge #

Azure Stack Edge is an on-premises network device that moves data into and out of Azure. Azure Stack Edge has AI-enabled edge compute to pre-process data during upload. Data Box Gateway is a virtual version of the device with the same data transfer capabilities.

Data Box Gateway #

Data Box Gateway is a storage solution that enables you to seamlessly send data to Azure. Data Box Gateway is a virtual device based on a virtual machine provisioned in your virtualized environment or hypervisor. The virtual device is on-premises and you write data to it by using the NFS and SMB protocols. The device then transfers your data to Azure block blobs or page blobs, or to Azure Files.

Avere vFXT for Azure #

Avere vFXT for Azure is a filesystem caching solution for data-intensive HPC tasks. Take advantage of cloud computing’s scalability to make your data accessible, even for data that’s stored in your own on-premises hardware.

Azure Storage – Security #

Azure Storage provides the following capabilities for data security and encryption:

  • Secure the storage account by using Azure RBAC and Azure Active Directory (Azure AD).
  • Secure data in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.0.
  • Set data to be encrypted when it’s written to Azure Storage by using Azure Storage encryption.
  • Grant delegated access to the data objects in Azure Storage by using shared access signatures.
  • Use analytics to track the authentication method that someone is using when they access storage in Azure.
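As background for the recommendations below, Shared Key authorization works by signing each request with an HMAC-SHA256 over a canonicalized string-to-sign, keyed with the base64-decoded account key. The sketch below is a simplified illustration with a made-up key; the real canonicalization rules include headers and query parameters omitted here:

```python
# Simplified sketch of Shared Key request signing for Azure Storage:
# HMAC-SHA256 over a canonicalized string-to-sign, keyed with the
# base64-decoded account key. Real canonicalization has more rules.
import base64
import hashlib
import hmac

def sign_request(account: str, account_key_b64: str, string_to_sign: str) -> str:
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode()
    return f"SharedKey {account}:{signature}"

# Made-up key and abbreviated string-to-sign, for illustration only.
fake_key = base64.b64encode(b"0" * 32).decode()
header = sign_request("mystorageacct", fake_key,
                      "GET\n\n\n...\n/mystorageacct/container")
# header -> "SharedKey mystorageacct:<base64 signature>"
```

Because anyone holding the account key can produce valid signatures for any operation, the guidance below favors Azure AD and SAS over handing out account keys.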
  • Use the Azure Resource Manager deployment model. Create new storage accounts by using the Azure Resource Manager deployment model for important security enhancements, including superior Azure role-based access control (Azure RBAC) and auditing, Resource Manager-based deployment and governance, access to managed identities, access to Azure Key Vault for secrets, and Azure AD-based authentication and authorization for access to Azure Storage data and resources. If possible, migrate existing storage accounts that use the classic deployment model to Azure Resource Manager. For more information, see Azure Resource Manager overview.
  • Enable Microsoft Defender for all of your storage accounts. Microsoft Defender for Storage provides an additional layer of security intelligence that detects unusual and potentially harmful attempts to access or exploit storage accounts. Security alerts are triggered in Microsoft Defender for Cloud when anomalies in activity occur. They’re also sent via email to subscription administrators, with details of the suspicious activity and recommendations on how to investigate and remediate threats. For more information, see Configure Microsoft Defender for Storage.
  • Turn on soft delete for blobs. Soft delete for blobs enables you to recover blob data after it has been deleted. For more information, see Soft delete for Azure Storage blobs.
  • Turn on soft delete for containers. Soft delete for containers enables you to recover a container after it has been deleted. For more information, see Soft delete for containers.
  • Lock the storage account to prevent accidental or malicious deletion or configuration changes. Apply an Azure Resource Manager lock to your storage account to protect it from accidental or malicious deletion or configuration changes. Locking a storage account doesn’t prevent data within that account from being deleted; it only prevents the account itself from being deleted. For more information, see Apply an Azure Resource Manager lock to a storage account.
  • Store business-critical data in immutable blobs. Configure legal holds and time-based retention policies to store blob data in a WORM (write once, read many) state. Blobs stored immutably can be read but can’t be modified or deleted for the duration of the retention interval. For more information, see Store business-critical blob data with immutable storage.
  • Require secure transfer (HTTPS) to the storage account. When you require secure transfer for a storage account, all requests to the storage account must be made over HTTPS. Any requests made over HTTP are rejected. Microsoft recommends that you always require secure transfer for all of your storage accounts. For more information, see Require secure transfer to ensure secure connections.
  • Limit shared access signature (SAS) tokens to HTTPS connections only. Requiring HTTPS when a client uses a SAS token to access blob data helps to minimize the risk of eavesdropping. For more information, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
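The time-based retention behavior behind the immutable-blob recommendation can be sketched as a simple policy check: reads always succeed, but deletes and overwrites are rejected until the retention interval has elapsed. The helper below is hypothetical and only models the rule, not the service:

```python
# Sketch of the time-based retention (WORM) rule enforced by immutable blob
# storage: a delete is allowed only once the retention interval has elapsed.
# The helper function is hypothetical.
from datetime import datetime, timedelta, timezone

def delete_allowed(created_at, retention_days, now=None):
    """Return True if the retention interval for a blob has elapsed."""
    now = now or datetime.now(timezone.utc)
    return now >= created_at + timedelta(days=retention_days)

created = datetime(2024, 1, 1, tzinfo=timezone.utc)
# Mid-retention: the delete is rejected.
assert not delete_allowed(created, 365, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
# After the interval elapses, the delete succeeds.
assert delete_allowed(created, 365, now=datetime(2025, 1, 2, tzinfo=timezone.utc))
```

In the real service this check happens server-side per blob version, so no client, not even the account owner, can bypass it during the retention interval.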

Azure Storage – Identity and access management #

  • Use Azure Active Directory (Azure AD) to authorize access to blob data. Azure AD provides superior security and ease of use over Shared Key for authorizing requests to Blob Storage. For more information, see Authorize access to data in Azure Storage.
  • Keep in mind the principle of least privilege when assigning permissions to an Azure AD security principal via Azure RBAC. When assigning a role to a user, group, or application, grant that security principal only the permissions necessary to perform its tasks. Limiting access to resources helps prevent both unintentional and malicious misuse of your data.
  • Use a user delegation SAS to grant limited access to blob data to clients. A user delegation SAS is secured with Azure Active Directory (Azure AD) credentials and also by the permissions specified for the SAS. A user delegation SAS is analogous to a service SAS in terms of its scope and function, but offers security benefits over the service SAS. For more information, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
  • Secure your account access keys with Azure Key Vault. Microsoft recommends using Azure AD to authorize requests to Azure Storage. However, if you must use Shared Key authorization, secure your account keys with Azure Key Vault. You can retrieve the keys from the key vault at runtime instead of saving them with your application. For more information, see Azure Key Vault overview.
  • Regenerate your account keys periodically. Rotating the account keys periodically reduces the risk of exposing your data to malicious actors.
  • Disallow Shared Key authorization. When you disallow Shared Key authorization for a storage account, Azure Storage rejects all subsequent requests to that account that are authorized with the account access keys. Only secured requests that are authorized with Azure AD will succeed. For more information, see Prevent Shared Key authorization for an Azure Storage account.
  • Keep in mind the principle of least privilege when assigning permissions to a SAS. When creating a SAS, specify only the permissions the client requires to perform its function. Limiting access to resources helps prevent both unintentional and malicious misuse of your data.
  • Have a revocation plan in place for any SAS that you issue to clients. If a SAS is compromised, you’ll want to revoke it as soon as possible. To revoke a user delegation SAS, revoke the user delegation key to quickly invalidate all signatures associated with that key. To revoke a service SAS that’s associated with a stored access policy, you can delete the stored access policy, rename the policy, or change its expiry time to a time in the past. For more information, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
  • If a service SAS is not associated with a stored access policy, set the expiry time to one hour or less. A service SAS that isn’t associated with a stored access policy can’t be revoked. For this reason, limit the expiry time so that the SAS is valid for one hour or less.
  • Disable anonymous public read access to containers and blobs. Anonymous public read access to a container and its blobs grants read-only access to those resources to any client. Avoid enabling public read access unless your scenario requires it. To learn how to disable anonymous public access for a storage account, see Configure anonymous public read access for containers and blobs.
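The SAS guidance above (HTTPS-only tokens, one-hour expiry when no stored access policy is attached) can be enforced with a small client-side check before a token is issued. The query parameter names (`spr`, `se`, `si`) are real SAS parameters, but the helper function itself is a hypothetical sketch:

```python
# Sketch of a pre-issuance policy check for SAS tokens: require the HTTPS-only
# protocol parameter (spr=https) and, when no stored access policy (si) is
# attached, reject expiry times (se) more than one hour out. Hypothetical helper.
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs

def sas_policy_ok(sas_query: str, now: datetime) -> bool:
    params = parse_qs(sas_query)
    if params.get("spr") != ["https"]:
        return False                      # must be limited to HTTPS
    if "si" in params:
        return True                       # stored access policy governs expiry
    expiry = datetime.fromisoformat(params["se"][0].replace("Z", "+00:00"))
    return expiry - now <= timedelta(hours=1)

now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
good = "sv=2022-11-02&spr=https&se=2024-05-01T12:30:00Z&sig=abc"
bad = "sv=2022-11-02&spr=https&se=2024-05-02T12:00:00Z&sig=abc"
assert sas_policy_ok(good, now)       # 30-minute expiry, HTTPS-only: passes
assert not sas_policy_ok(bad, now)    # 24-hour expiry, no stored policy: fails
```

Checks like this belong wherever tokens are minted, so that an overly broad or long-lived SAS never leaves your service in the first place.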

Azure Storage – Networking #

  • Configure the minimum required version of Transport Layer Security (TLS) for a storage account. Require that clients use a more secure version of TLS to make requests against an Azure Storage account by configuring the minimum version of TLS for that account. For more information, see Configure minimum required version of Transport Layer Security (TLS) for a storage account.
  • Enable the Secure transfer required option on all of your storage accounts. When you enable the Secure transfer required option, all requests made against the storage account must take place over secure connections. Any requests made over HTTP will fail. For more information, see Require secure transfer in Azure Storage.
  • Enable firewall rules. Configure firewall rules to limit access to your storage account to requests that originate from specified IP addresses or ranges, or from a list of subnets in an Azure Virtual Network (VNet). For more information, see Configure Azure Storage firewalls and virtual networks.
  • Allow trusted Microsoft services to access the storage account. Turning on firewall rules for your storage account blocks incoming requests for data by default, unless the requests originate from a service operating within an Azure Virtual Network (VNet) or from allowed public IP addresses. Blocked requests include those from other Azure services, from the Azure portal, from logging and metrics services, and so on. You can permit requests from other Azure services by adding an exception that allows trusted Microsoft services to access the storage account. For more information, see Configure Azure Storage firewalls and virtual networks.
  • Use private endpoints. A private endpoint assigns a private IP address from your Azure Virtual Network (VNet) to the storage account and secures all traffic between your VNet and the storage account over a private link. For more information, see Connect privately to a storage account using Azure Private Endpoint.
  • Use VNet service tags. A service tag represents a group of IP address prefixes from a given Azure service. Microsoft manages the address prefixes encompassed by the service tag and automatically updates the service tag as addresses change. For more information about service tags supported by Azure Storage, see Azure service tags overview. For a tutorial that shows how to use service tags to create outbound network rules, see Restrict access to PaaS resources.
  • Limit network access to specific networks. Limiting network access to the networks that host clients requiring access reduces the exposure of your resources to network attacks.
  • Configure a network routing preference. You can configure a network routing preference for your Azure storage account to specify how network traffic is routed to your account from clients over the internet: via the Microsoft global network or via internet routing. For more information, see Configure network routing preference for Azure Storage.
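The account-level minimum-TLS setting rejects outdated clients server-side; a client can also enforce the same floor locally. As a sketch using Python's standard `ssl` module:

```python
# Client-side counterpart of the minimum-TLS recommendation: build a TLS
# context that refuses to negotiate anything below TLS 1.2, so connections
# to a storage endpoint can never silently downgrade.
import ssl

ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # reject TLS 1.0 and 1.1 locally
assert ctx.minimum_version == ssl.TLSVersion.TLSv1_2
```

Enforcing the floor on both ends means a misconfigured account or a legacy client fails fast instead of falling back to a weaker protocol.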