Migration & Architecture Options for Microsoft SQL Server on AWS FSx and FCI

Deploying and managing SQL Server Always On Failover Cluster Instances (FCIs) in the cloud has traditionally required third-party software and a lot of overhead ... until now. With the introduction of Amazon FSx for Windows File Server, we have a simple, AWS-native option for integrating fully managed, highly available shared storage to host our SQL databases.

Body:

As part of the SQL Server Always On offering, Always On Failover Cluster Instances (FCIs) leverage Windows Server Failover Clustering (WSFC) functionality to provide local high availability through redundancy at the server-instance level – namely a failover cluster instance (FCI). An FCI is a single instance of SQL Server that is installed across WSFC nodes, typically across multiple subnets and Availability Zones (AZs) in AWS. On the network, the FCI appears to be an instance of SQL Server running on a single computer. But the FCI actually provides failover from one WSFC node to another, if the active node becomes unavailable.

The SQL Server Always On FCI deployment option has been traditionally difficult to deploy and manage on AWS due to the shared storage requirement (shared storage needs to be accessible from all participating nodes). To get around this, a common deployment approach involved integrating third-party replication solutions such as SIOS DataKeeper or StarWind Virtual SAN, which allowed us to build the SQL Server cluster by simulating shared storage (disk data on the primary node was replicated to the secondary node). The solutions integrated with and extended WSFC functionality to manifest the FCI deployment scenario in AWS, but with increased management, overhead, and cost.

The introduction of AWS FSx for Windows File Server is a game changer. This fully managed service provides shared storage, can automatically replicate the data synchronously across two Availability Zones and provide high availability with automatic failure detection, failover and failback, and fully supports the SMB Continuous Availability (CA) feature. You can use Amazon FSx for Windows File Server as a shared storage tier for FCIs in two ways: as an SMB File Share Witness, and/or for Active SQL Server Data Files.

Using Amazon FSx as an SMB File Share Witness - WSFC deployments commonly deploy an SMB file share witness to maintain quorum of the cluster's resources. The job of the witness is to provide an additional quorum vote when necessary to ensure that a cluster continues to run if there is a site outage. The witness file share can run on a Single-AZ file system and requires only a small amount of storage (32 GB should be sufficient) to hold quorum information (see diagram below).

Using Amazon FSx for Active SQL Server Data Files - Microsoft SQL Server can be deployed with an SMB file share as the storage option for active data files. Amazon FSx is optimized to provide shared storage by supporting continuously available (CA) file shares, which are specifically designed for applications like SQL Server that require uninterrupted access to shared file data. Note that it is required that you use CA shares on Multi-AZ file systems for all SQL Server deployments, whether HA or not. Amazon FSx provides file shares with the following characteristics and capabilities to host SQL Server databases:

  • Support for SMB versions 2.0 through 3.1.1.
  • Encryption in transit using SMB Kerberos.
  • Customizable performance (throughput and IOPS).
  • Multi-AZ deployment (HA) with support for SMB Continuous Availability (also known as SMB Transparent Failover).
  • Predictable cost with no charge for Inter-AZ data replication.
  • Single-AZ file system can be used as a target for SQL backups.

Configuration

The configuration of this setup is also significantly less complex than previous deployment methodologies. You can follow these high-level steps as a general guide:

  1. Create your domain-joined, Multi-AZ Amazon FSx file system.
  2. Determine your required file system characteristics.
  3. Determine applicable network settings.
  4. Select your Active Directory.
  5. Create the Microsoft SQL Server file share on your file system and set proper permissions and availability settings.
  6. Map a share (eg. D$) using the FSx systems' DNS name and create a folder (eg. SQL) within it.
  7. Sett the ContinuouslyAvailable property of the file share to "True".
  8. Grant SMB share permissions to the appropriate identities (the SQL service account, a global group for DBAs, and another Global Group in which all SQL Servers are members).
  9. Edit and set NTFS permissions, granting "Full Access" to the target folder to the same set of identities (the SQL service account, a global group for DBAs, and another Global Group in which all SQL Servers are members).
  10. Install SQL Server as failover cluster instance on the first node.
  11. Install SQL Server as failover cluster instance on the second node.

Benefits

There are several benefits in migrating to Amazon FSx for shared SQL storage which can apply to customers in many different use cases. Consider these scenarios:

  • Customers who use Enterprise Edition licenses to support AG groups can now use Standard Edition licenses which will save ~50–60% in license costs, not to mention simplifying the complexity and ongoing management of the SQL deployment (although you can run Basic AGs on SQL Server Standard Edition starting from SQL Server 2016, only one database per AG is supported which becomes a challenge when reconciling multiple databases, listeners and private IPs, EC2 instance type and ENI private IP constraints, etc.).
  • Customers with third-party storage replication solutions, will be able to save on licensing, administration and support, while simplifying and consolidating their technology footprint by using another AWS native service instead.
  • Customers with existing on-premise or hybrid deployments are likely using a combination of FCI and AG (FCI for high availability within a primary data center site, and AG to provide a disaster recovery solution across sites). AWS's Availability Zone architecture and Amazon FSx's support for highly available shared storage deployed across multiple Availability Zones now make it possible for them to eliminate the need for separate high availability and disaster recovery solutions, again reducing costs and simplifying deployment complexities.

For additional information on the technologies referenced in this article, please feel free to contact us directly. You can also reference this links:

https://aws.amazon.com/sql/

https://aws.amazon.com/fsx/windows/

https://docs.aws.amazon.com/fsx/latest/WindowsGuide/sql-server.html

https://www.sios-apac.com/tag/sql-server-failover-cluster-instance/

https://www.starwindsoftware.com/resource-library/starwind-virtual-san-aws-ec2-deployment-guide/