Storage Decision #
Storage capabilities #
Scenario | Suggested Azure services | Considerations for suggested services |
I have an SMB or NFS file share. | Azure Files or Azure NetApp Files | The choice of premium or standard Azure Files tiers depends on IOPS, throughput, and your need for latency consistency. If you have an on-premises deployment of NetApp, consider using Azure NetApp Files. If you need to migrate your access control lists and timestamps to the cloud, Azure File Sync can bring these settings to your Azure file shares. |
I have an on-premises object storage system for petabytes of data, such as Dell-EMC ECS. | Blob Storage | Azure Blob Storage provides premium, hot, cool, and archive tiers to match your workload performance and cost needs. |
I have a tape library for backup and disaster recovery or long-term data retention. | Blob Storage | The Blob Storage archive tier has the lowest storage cost, but the data is offline: copying it to a cool, hot, or premium tier for access can take hours. Cool tiers provide immediate access at low cost. |
I have file or object storage configured to receive my backups. | Blob Storage or Azure File Sync | To back up data for long-term retention with lowest-cost storage, move data to Blob Storage and use cool and archive tiers. To enable fast disaster recovery for file data on a server, sync shares to individual Azure file shares by using Azure File Sync. With Azure file share snapshots, you can restore earlier versions. Sync them back to connected servers or access them natively in the Azure file share. |
I run data replication to a disaster recovery site. | Azure Files or Azure File Sync | Azure File Sync removes the need for a disaster recovery server and stores files in native Azure SMB shares. Fast disaster recovery rebuilds any data on a failed on-premises server quickly. You can even keep multiple server locations in sync or use cloud tiering to store only relevant data on-premises. |
I manage data transfer in disconnected scenarios. | Azure Stack Edge or Data Box Gateway | Using Azure Stack Edge or Data Box Gateway, you can copy data in disconnected scenarios. When the gateway is offline, it saves all files you copy in the cache, then uploads them when you’re connected. |
I manage an ongoing data pipeline to the cloud. | Azure Stack Edge or Data Box Gateway | Move data to the cloud from systems that are constantly generating data by having them copy that data to the storage gateway. |
I have bursts of data that arrive at the same time. | Azure Stack Edge or Data Box Gateway | Manage large quantities of data that arrive at the same time, such as when an autonomous car pulls into the garage or a gene sequencing machine finishes its analysis. Copy all that data to Data Box Gateway at fast local speeds, then let the gateway upload it as your network allows. |
I have a data lake such as a Hadoop cluster for HDFS data. | Data Lake Storage Gen2 or Standard SSD or Premium SSD | The Data Lake Storage Gen2 feature of Blob Storage provides server-side HDFS compatibility and petabyte scale for parallel analytics. It also offers high availability and reliability. Software like Cloudera can use Premium SSD or Standard SSD on controller/worker nodes, if needed. |
I have a disaster recovery site with strict RPO/RTO that syncs from my primary servers. | Azure page blobs | Replication software uses page blobs to enable low-cost replication to Azure without the need for compute virtual machines until failover occurs. For more information, see Backup and disaster recovery for Azure IaaS disks. Note: Page blobs support a maximum size of 8 TiB. |
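The disconnected and burst scenarios in the table share a store-and-forward pattern: local copies are always accepted into a cache, and uploads happen later as connectivity and bandwidth allow. The following is a minimal sketch of that pattern only. The `GatewayCache` class, its methods, and the per-window `budget_gb` parameter are hypothetical names chosen for illustration and are not part of any Azure SDK or device interface.

```python
from collections import deque


class GatewayCache:
    """Toy store-and-forward cache (hypothetical; not an Azure API)."""

    def __init__(self):
        self.pending = deque()  # (name, size_gb) accepted locally, not yet uploaded
        self.uploaded = []      # names that have reached cloud storage
        self.online = False     # whether the uplink is currently available

    def copy(self, name, size_gb):
        # Local copies are always accepted, even while the uplink is down.
        self.pending.append((name, size_gb))

    def drain(self, budget_gb):
        """Upload queued files in arrival order; return the GB sent this window."""
        sent = 0.0
        if not self.online:
            return sent
        # Stop at the first file that would exceed this window's budget.
        while self.pending and sent + self.pending[0][1] <= budget_gb:
            name, size_gb = self.pending.popleft()
            self.uploaded.append(name)
            sent += size_gb
        return sent


gateway = GatewayCache()
for i in range(4):                 # a burst of four 50 GB files arrives at once
    gateway.copy(f"run-{i}.dat", 50)
gateway.drain(budget_gb=120)       # offline: nothing leaves the cache
gateway.online = True
gateway.drain(budget_gb=120)       # online: two files fit in this window
```

In this model the burst is absorbed immediately at local speed, and the remaining files stay queued until a later upload window.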
- Will you need to provide downloadable images, documents, or other media as part of your workloads? Azure Blob Storage hosts static files, which are then accessible for download over the internet. For more information, see Static website hosting in Azure Storage.
- Will you need a location to store virtual machine logs, application logs, and analytics data? You can use Blob Storage to store Azure Monitor log data. See Storage Analytics.
- Will you need to provide a location for backup, disaster recovery, or archiving workload-related data? Blob Storage provides backup and disaster recovery capabilities. For more information, see Backup and disaster recovery for Azure IaaS disks.
You can also use Blob Storage to back up other resources, like on-premises or IaaS virtual machine-hosted SQL Server data. See SQL Server Backup and Restore.
- Will you need to support big data analytics workloads? Azure Data Lake Storage Gen2 is built on Azure Blob Storage. Data Lake Storage Gen2 supports large-enterprise data lake functionality. It also can handle storing petabytes of information while sustaining hundreds of gigabits of throughput.
- Will you need to provide cloud-native file shares? Azure has two services that provide cloud-hosted file shares:
- Azure NetApp Files provides high-performance NFS shares that are well suited to common enterprise workloads like SAP.
- Azure Files provides file shares accessible over SMB 3.0 and HTTPS.
- Will you need to support hybrid cloud storage for on-premises high-performance computing (HPC) workloads? Avere vFXT for Azure is a hybrid caching solution. You can expand your on-premises storage capabilities by using cloud-based storage. Avere vFXT for Azure is optimized for read-heavy HPC workloads that involve 1,000 to 40,000 CPU cores. Avere vFXT for Azure can integrate with on-premises hardware network attached storage (NAS), Blob Storage, or both.
- Will you need to perform large-scale archiving and syncing of your on-premises data? Azure Data Box products are designed to help you move large amounts of data from your on-premises environment to the cloud.
- Azure Data Box Gateway is a virtual device that runs on-premises. Data Box Gateway helps you manage large-scale data migration to the cloud.
- Azure Stack Edge accelerates processing and the secure transfer of data to Azure. If you need to analyze, transform, or filter data before you move it to the cloud, use Azure Stack Edge.
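When you weigh the large-scale transfer options above, a rough estimate of how long a dataset takes to cross your network link is a useful first input. The sketch below is plain arithmetic. The function name `transfer_days` and the default 80% link utilization are assumptions for illustration; real throughput depends on protocol overhead, latency, and competing traffic.

```python
def transfer_days(dataset_tb, link_mbps, utilization=0.8):
    """Estimate days needed to move dataset_tb (decimal terabytes) over a link.

    link_mbps is the nominal link speed in megabits per second.
    utilization is the assumed usable fraction of that link (0 < value <= 1).
    """
    if dataset_tb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid dataset size, link speed, or utilization")
    bits = dataset_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)  # bits / usable bits per second
    return seconds / 86_400                           # seconds -> days


# 100 TB over a 1 Gbps link at the assumed 80% utilization
print(f"{transfer_days(100, 1000):.1f} days")
```

If the estimate runs to weeks, that suggests staging data through a local device rather than relying on direct uploads alone.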
Azure Storage services #
Blob Storage #
Blob Storage is Microsoft’s object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn’t adhere to a specific data model or definition, such as text or binary data.
- Block blobs store text and binary data. Block blobs are made up of blocks of data that can be managed individually. Block blobs can store up to about 190.7 TiB.
- Append blobs are made up of blocks like block blobs, but are optimized for append operations. Append blobs are ideal for scenarios such as logging data from virtual machines.
- Page blobs store random access files up to 8 TiB in size. Page blobs store virtual hard drive (VHD) files and serve as disks for Azure virtual machines. For more information about page blobs, see Overview of Azure page blobs.
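The size limits in the list above can be checked with a little arithmetic. The sketch below assumes two documented service limits that the text does not state: a block blob holds at most 50,000 blocks, and recent service versions allow blocks of up to 4,000 MiB. It also assumes the 512-byte page size that page blob writes must align to. The helper `is_page_aligned` is a hypothetical name for illustration.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

# Assumed limits (not stated in the text above): blocks per blob and block size.
MAX_BLOCKS_PER_BLOB = 50_000
MAX_BLOCK_SIZE = 4000 * MIB

max_block_blob_bytes = MAX_BLOCKS_PER_BLOB * MAX_BLOCK_SIZE
print(round(max_block_blob_bytes / TIB, 1))  # 190.7

# Page blobs: 8 TiB maximum, addressed in 512-byte pages (assumed page size).
PAGE_SIZE = 512
MAX_PAGE_BLOB_BYTES = 8 * TIB


def is_page_aligned(offset, length):
    """Return True if a write range starts and ends on 512-byte page boundaries."""
    return offset % PAGE_SIZE == 0 and length % PAGE_SIZE == 0 and length > 0


print(MAX_PAGE_BLOB_BYTES // PAGE_SIZE)  # number of pages in a maximum-size page blob
```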
Use Blob Storage for the following needs:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
- Writing to log files.
- Storing data for backup and restore, disaster recovery, and archiving.
- Storing data for analysis by an on-premises or Azure-hosted service.
Data Lake Storage Gen2 #
Blob Storage supports Data Lake Storage Gen2, Microsoft’s enterprise big data analytics solution for the cloud. Data Lake Storage Gen2 offers a hierarchical file system, with the advantages of Blob Storage, including low-cost tiered storage, high availability, strong consistency, and disaster recovery capabilities.
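To see why a hierarchical file system matters for analytics workloads, compare renaming a directory in a flat key space with renaming it in a tree. The sketch below is a toy model built on in-memory dictionaries. It is not how the service is implemented, and the function names `rename_flat` and `rename_hier` are hypothetical.

```python
def rename_flat(store, old_prefix, new_prefix):
    """Flat namespace: a 'directory' is only a shared key prefix,
    so every matching object key has to be rewritten."""
    touched = 0
    for key in [k for k in store if k.startswith(old_prefix)]:
        store[new_prefix + key[len(old_prefix):]] = store.pop(key)
        touched += 1
    return touched


def rename_hier(root, old_name, new_name):
    """Hierarchical namespace: the directory is a single node,
    so one entry in the parent is relinked regardless of its contents."""
    root[new_name] = root.pop(old_name)
    return 1


flat = {"raw/a.csv": 1, "raw/b.csv": 2, "curated/c.csv": 3}
tree = {"raw": {"a.csv": 1, "b.csv": 2}, "curated": {"c.csv": 3}}

print(rename_flat(flat, "raw/", "staged/"))  # entries touched grows with file count
print(rename_hier(tree, "raw", "staged"))    # always one entry
```

In the flat model the cost of a rename grows with the number of objects under the prefix, while the tree model touches one entry. Analytics jobs that commit output by renaming directories benefit from the second behavior.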
Azure disk storage #
Azure disk storage offers persistent, high-performance block storage to power Azure Virtual Machines. Azure disks are highly durable and secure, and they offer the industry’s only single-instance service-level agreement (SLA) for virtual machines that use Premium SSD or Ultra Disk Storage. Azure disks provide high availability through availability sets and availability zones, which map to the fault domains of your Azure Virtual Machines. Azure manages disks as a top-level resource, so Azure Resource Manager capabilities such as Azure role-based access control (Azure RBAC), policy, and tagging are provided by default.
Azure Files #
Azure Files provides fully managed, native SMB file shares, without the need to run a virtual machine. You can mount an Azure Files share as a network drive to any Azure virtual machine or on-premises computer.
Azure NetApp Files #
The Azure NetApp Files service is an enterprise-class, high-performance, metered file storage service. Azure NetApp Files supports any workload type and is highly available by default. You can select service and performance levels and set up snapshots through the service.
Azure Stack Edge #
Azure Stack Edge is an on-premises network device that moves data into and out of Azure. Azure Stack Edge has AI-enabled edge compute to pre-process data during upload. Data Box Gateway is a virtual version of the device with the same data transfer capabilities.
Data Box Gateway #
Data Box Gateway is a storage solution that enables you to seamlessly send data to Azure. Data Box Gateway is a virtual device based on a virtual machine provisioned in your virtualized environment or hypervisor. The virtual device is on-premises and you write data to it by using the NFS and SMB protocols. The device then transfers your data to Azure block blobs or page blobs, or to Azure Files.
Avere vFXT for Azure #
Avere vFXT for Azure is a filesystem caching solution for data-intensive HPC tasks. Take advantage of cloud computing’s scalability to make your data accessible, even for data that’s stored in your own on-premises hardware.
Azure Storage – Security #
Azure Storage provides the following options for data security and encryption:
- Secure the storage account by using Azure RBAC and Azure Active Directory (Azure AD).
- Secure data in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.0.
- Set data to be encrypted when it’s written to Azure Storage by using Azure Storage encryption.
- Grant delegated access to the data objects in Azure Storage by using shared access signatures.
- Use analytics to track the authentication method that someone is using when they access storage in Azure.
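The idea behind delegated access with shared access signatures can be illustrated with a keyed signature over a resource, a permission set, and an expiry time. The sketch below is conceptual only. The string layout, the `sign` and `verify` function names, and the parameter names are assumptions for illustration and do not reproduce the actual Azure Storage signature format. In practice, generate signatures with the Azure SDK or tooling.

```python
import base64
import hashlib
import hmac


def sign(key: bytes, resource: str, permissions: str, expiry: int) -> str:
    """Sign the delegated grant (hypothetical layout, not Azure's string-to-sign)."""
    message = f"{resource}\n{permissions}\n{expiry}".encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def verify(key: bytes, resource: str, permissions: str, expiry: int,
           signature: str, now: int, needed: str) -> bool:
    """Accept a request only if the grant is authentic, unexpired, and sufficient."""
    expected = sign(key, resource, permissions, expiry)
    return (
        hmac.compare_digest(expected, signature)  # grant was not tampered with
        and now < expiry                          # grant has not expired
        and needed in permissions                 # grant covers the operation
    )


account_key = b"example-key-not-a-real-secret"
token = sign(account_key, "/container/report.pdf", "r", expiry=1_700_003_600)

# The holder of the token can read until expiry but cannot write or extend the grant.
print(verify(account_key, "/container/report.pdf", "r", 1_700_003_600,
             token, now=1_700_000_000, needed="r"))
```

The design point is that the account key never leaves the issuer: the client holds only a narrowly scoped, time-limited grant that the service can validate.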