This session covers basic to advanced vSAN topics. Watch this video if you want to learn the basics and a few of the advanced areas of vSAN. NOTE: This video is roughly 60 minutes in length, so it is worth blocking out some time to watch it!
The VMware Validated Design for SDDC is a blueprint for the private cloud that results in an SDDC that is consistent, thoroughly documented, extensively tested from end-to-end, and continuously validated to incorporate new releases of software components. In this video, we will demonstrate the configuration of VMware Virtual SAN for use in this design. Learn more about VMware Validated Designs at www.vmware.com/go/vvd or follow updates on Twitter @VMwareSDDC
The Storage and Availability solutions team is excited to announce that vSAN 6.2 was named the overall winner in the Software-Defined Technology category for the CRN 2016 Products of the Year Awards! We are extremely humbled by this recognition and couldn’t be more proud of the team.
CRN’s annual Products of the Year Awards are given to standout products and services that represent best-in-breed technological innovation backed by a supportive channel partner program. For the first time, CRN did the judging a little differently. CRN editors selected five finalists among 17 technology categories and then asked solution providers to rate the products to determine a winner based on the subcategories of Technology, Revenue and Profit, and Customer Demand. The finalists and winners were originally selected from a survey that netted more than 5,000 responses. This captured real-world satisfaction among customers and partners.
vSAN 6.2 introduced key space efficiency features such as deduplication, compression, and erasure coding, along with Quality of Service (QoS) and software checksum. Continuing vSAN’s fast pace of innovation, the Storage and Availability Solutions team introduced vSAN 6.5, which GA’d on November 15, 2016, and added features such as: 2-node direct connect, which can save customers up to 20% per ROBO site; full-featured PowerCLI for scalability and ease of enterprise-class automation; and support for next-generation hardware, including large-capacity drives with 512e support.
This rapid innovation has led to rapid adoption of vSAN. We’re adding roughly 100 customers per week, and this recognition is consistent with the feedback we’ve received from customers who have adopted vSAN 6.2. Yellow Pages Canada is one such customer, using all-flash vSAN to power its front-end apps, search engines, BI, and SQL databases, and to support its mixed workload environments. Learn more about Yellow Pages Canada here:
Storage and Availability Business Unit
v 6.2.0 / March 2016 / version 0.30
VMware Virtual SAN 6.1, shipping with vSphere 6.0 Update 1, introduced a new feature called VMware Virtual SAN Stretched Cluster. Virtual SAN Stretched Cluster is a specific configuration implemented in environments where disaster/downtime avoidance is a key requirement. This guide was developed to provide additional insight and information for installation, configuration and operation of a Virtual SAN Stretched Cluster infrastructure in conjunction with VMware vSphere. This guide will explain how vSphere handles specific failure scenarios and discuss various design considerations and operational procedures.
Virtual SAN Stretched Clusters with Witness Host refers to a deployment where a user sets up a Virtual SAN cluster with 2 active/active sites with an identical number of ESXi hosts distributed evenly between the two sites. The sites are connected via a high bandwidth/low latency link.
The third site hosting the Virtual SAN Witness Host is connected to both of the active/active data-sites. This connectivity can be via low bandwidth/high latency links.
Stretched Cluster Configuration
Each site is configured as a Virtual SAN Fault Domain. The nomenclature used to describe a Virtual SAN Stretched Cluster configuration is X+Y+Z, where X is the number of ESXi hosts at data site A, Y is the number of ESXi hosts at data site B, and Z is the number of witness hosts at site C. Data sites are where virtual machines are deployed. The minimum supported configuration is 1+1+1 (3 nodes). The maximum configuration is 15+15+1 (31 nodes).
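The X+Y+Z nomenclature above can be captured in a short sketch. This is an illustrative Python check, not a VMware tool; the function name is hypothetical, and the limits (exactly one witness, 1+1+1 minimum, 15+15+1 maximum, identical host counts per data site as described above) come directly from this guide.

```python
# Sketch: validate an X+Y+Z Virtual SAN Stretched Cluster configuration,
# where X/Y are host counts at data sites A/B and Z is the witness count.
# Limits taken from the guide: 1+1+1 minimum, 15+15+1 maximum.

def is_supported_stretched_config(x: int, y: int, z: int) -> bool:
    """Return True if X+Y+Z is a supported stretched-cluster configuration."""
    if z != 1:                 # exactly one witness host in any configuration
        return False
    if x < 1 or y < 1:         # each data site needs at least one ESXi host
        return False
    if x > 15 or y > 15:       # 15+15+1 (31 nodes) is the maximum
        return False
    if x != y:                 # the guide describes an identical number of
        return False           # hosts distributed evenly between the sites
    return True

print(is_supported_stretched_config(1, 1, 1))    # minimum supported: True
print(is_supported_stretched_config(15, 15, 1))  # maximum supported: True
print(is_supported_stretched_config(16, 15, 1))  # over the limit: False
```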
In Virtual SAN Stretched Clusters, there is only one witness host in any configuration. A virtual machine deployed on a Virtual SAN Stretched Cluster will have one copy of its data on site A, a second copy of its data on site B, and any witness components placed on the witness host in site C. This configuration is achieved through fault domains alongside Host/VM groups and affinity rules. In the event of a complete site failure, there will still be a full copy of the virtual machine data as well as greater than 50% of the components available, allowing the virtual machine to remain available on the Virtual SAN datastore. If the virtual machine needs to be restarted on the other site, vSphere HA will handle this task.
Virtual SAN Stretched Cluster configurations require vSphere 6.0 Update 1 (U1) or greater. This implies both vCenter Server 6.0 U1 and ESXi 6.0 U1. This version of vSphere includes Virtual SAN version 6.1. This is the minimum version required for Virtual SAN Stretched Cluster support.
vSphere & Virtual SAN
Virtual SAN version 6.1 introduced features including both All-Flash and Stretched Cluster functionality. There are no limitations on the edition of vSphere used for Virtual SAN. However, for Virtual SAN Stretched Cluster functionality, vSphere DRS is very desirable. DRS will provide initial placement assistance and will automatically migrate virtual machines to their correct site in accordance with Host/VM affinity rules. It can also help relocate virtual machines to their correct site when a site recovers after a failure. Without DRS, the administrator will have to carry out these tasks manually. Note that DRS is only available in the Enterprise edition and higher of vSphere.
Hybrid and All-Flash support
Virtual SAN Stretched Cluster is supported on both hybrid configurations (hosts with local storage comprised of both magnetic disks for capacity and flash devices for cache) and all-flash configurations (hosts with local storage made up of flash devices for capacity and flash devices for cache).
VMware supports Virtual SAN Stretched Cluster with the v2 on-disk format only. The v1 on-disk format is based on VMFS and is the original on-disk format used for Virtual SAN. The v2 on-disk format is the version which comes by default with Virtual SAN version 6.x. Customers that upgraded from the original Virtual SAN 5.5 to Virtual SAN 6.0 may not have upgraded the on-disk format from v1 to v2, and are thus still using v1. VMware recommends upgrading the on-disk format to v2 for improved performance and scalability, as well as stretched cluster support. In Virtual SAN 6.2 clusters, the v3 on-disk format allows for additional features, discussed later, specific to 6.2.
Features supported on VSAN but not VSAN Stretched Clusters
The following is a list of products and features supported on Virtual SAN but not on a stretched cluster implementation of Virtual SAN.
SMP-FT, the new Fault Tolerance mechanism introduced in vSphere 6.0, is supported on standard VSAN 6.1 deployments, but it is not supported on stretched cluster VSAN deployments at this time. The exception to this rule is 2-node configurations in the same physical location.
The maximum value for NumberOfFailuresToTolerate in a Virtual SAN Stretched Cluster configuration is 1. This is the limit due to the maximum number of Fault Domains being 3.
In a Virtual SAN Stretched Cluster, there are only 3 Fault Domains. These are typically referred to as the Preferred, Secondary, and Witness Fault Domains. Standard Virtual SAN configurations can be comprised of up to 32 Fault Domains.
The Erasure Coding feature introduced in Virtual SAN 6.2 requires 4 Fault Domains for RAID5 type protection and 6 Fault Domains for RAID6 type protection. Because Stretched Cluster configurations only have 3 Fault Domains, Erasure Coding is not supported on Stretched Clusters at this time.
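The fault-domain arithmetic above can be summarized in a small sketch. This is illustrative Python, not VMware tooling; the dictionary keys are hypothetical labels, while the fault-domain minimums (3 for FTT=1 mirroring, 4 for RAID-5, 6 for RAID-6) and the 3-fault-domain stretched cluster limit come from the text above.

```python
# Sketch: minimum fault domains required per protection method, and why a
# 3-fault-domain stretched cluster rules out erasure coding.

MIN_FAULT_DOMAINS = {
    "mirroring-ftt1": 3,   # RAID-1 with NumberOfFailuresToTolerate = 1
    "raid5": 4,            # erasure coding, RAID-5 type protection
    "raid6": 6,            # erasure coding, RAID-6 type protection
}

STRETCHED_CLUSTER_FAULT_DOMAINS = 3  # Preferred, Secondary, Witness

def supported_on_stretched(scheme: str) -> bool:
    """True if the scheme fits within a stretched cluster's fault domains."""
    return MIN_FAULT_DOMAINS[scheme] <= STRETCHED_CLUSTER_FAULT_DOMAINS

for scheme in MIN_FAULT_DOMAINS:
    print(scheme, supported_on_stretched(scheme))
```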
VMware Virtual SAN 6.5 is the latest release of the market-leading, enterprise-class storage solution for hyper-converged infrastructure (HCI). Virtual SAN 6.5 builds on the existing features introduced in 6.2 by enhancing automation, further reducing total cost of ownership (TCO), and setting the stage for next-generation cloud native applications.
Virtual SAN continues to see rapid adoption with more than 5000 customers utilizing the solution for a number of use cases including mission-critical production applications and databases, test and development, management infrastructures, disaster recovery sites, virtual desktop deployments, and remote office implementations. Virtual SAN is used by 400+ Fortune-1000 organizations across every industry vertical in more than 100 countries worldwide.
Let’s take a look at the new features included with Virtual SAN 6.5…
The Virtual SAN API and vSphere PowerCLI have been updated in this release. It is now possible to automate the configuration and management of cluster settings, disk groups, fault domains, and stretched clusters. Activities such as maintenance mode and cluster shutdown can also be scripted. This video demonstrates some of the capabilities of the Virtual SAN API and PowerCLI: Creating a Cluster and Configuring Virtual SAN. PowerCLI can also be used to monitor the health of a Virtual SAN cluster, and health issue remediation and re-sync activities can be automated with this latest release.
20-50% Additional TCO Savings
Now that flash devices have become the preferred choice for storage, it makes sense to adjust the Virtual SAN licensing model to account for this change in the industry. All Virtual SAN 6.5 licenses include support for both hybrid and all-flash configurations. Please note, however, that deduplication, compression, and erasure coding still require Virtual SAN Advanced or Enterprise licenses. Adding support for the use of all-flash configurations with all licensing editions provides organizations more deployment options and the ability to take advantage of increased performance while minimizing licensing costs.
Virtual SAN supports the use of network crossover cables in 2-node configurations. This is especially beneficial in use cases such as remote office and branch office (ROBO) deployments where it can be cost prohibitive to procure, deploy, and manage 10GbE networking equipment at each location. This configuration also reduces complexity and improves reliability.
While we are on the subject of ROBO deployments, it is also important to mention a related Virtual SAN licensing change. The existing Virtual SAN for ROBO license previously did not support the use of all-flash Virtual SAN cluster configurations and the corresponding space efficiency features. A new license, Virtual SAN for ROBO Advanced, has been added with the release of Virtual SAN 6.5. This new license includes support for using deduplication, compression, and erasure coding. Using these features lowers the cost-per-usable-GB of flash storage, which further reduces TCO. Organizations get the best of both worlds: the extreme performance of flash at a cost that is on par with or lower than similar hybrid solutions.
Virtual SAN 6.5 extends workload support to physical servers and clustered applications with the introduction of an iSCSI target service. Virtual SAN continues its track record of being radically simple by making it easy to access Virtual SAN storage using the iSCSI protocol with just a few vSphere Web Client mouse clicks. iSCSI targets on Virtual SAN are managed the same as other objects with Storage Policy Based Management (SPBM). Virtual SAN functionality such as deduplication, compression, mirroring, and erasure coding can be utilized with the iSCSI target service. CHAP and Mutual CHAP authentication are supported.
Enable vSAN iSCSI target service
Utilizing Virtual SAN for physical server workloads and clustered applications can reduce or eliminate the dependency on legacy storage solutions while providing the benefits of Virtual SAN such as simplicity, centralized management and monitoring, and high availability.
Scale To Tomorrow
New application architectures and development methods have emerged that are designed to run in today’s mobile-cloud era. For example, “DevOps” is a term that describes how these next-generation applications are developed and operated. “Container” technologies such as Docker and Kubernetes are a couple of the many solutions that have emerged as options for deploying and orchestrating these applications. Cloud native applications require persistent storage just the same as traditional applications. Virtual SAN is an excellent choice for next-generation cloud native applications. Here are a few examples of the efforts that are underway:
vSphere Integrated Containers Engine is a container runtime for vSphere, allowing developers familiar with Docker to develop in containers and deploy them alongside traditional virtual machine workloads on vSphere clusters. vSphere Integrated Containers Engine enables these workloads to be managed through the vSphere GUI in a way familiar to vSphere admins. Availability and performance features in vSphere and Virtual SAN can be utilized by vSphere Integrated Containers Engine just the same as traditional virtual machine environments.
Docker Volume Driver for vSphere enables users to create and manage Docker container data volumes on vSphere storage technologies such as VMFS, NFS, and Virtual SAN. This driver makes it very simple to use containers with vSphere storage and provides the following key benefits:
– DevOps-friendly API for provisioning and policy configuration.
– Seamless movement of containers between vSphere hosts without moving data.
– Single platform to manage: run virtual machines and containers side by side.
Next-Gen Hardware Support
vSphere 6.5 and Virtual SAN 6.5 also introduce support for 512e drives, which will enable larger capacities to meet the constantly growing space requirements of today’s and tomorrow’s applications. New hardware innovations such as NVMe provide dramatic performance gains for Virtual SAN with up to 150k IOPS per host. This level of performance combined with the ability to scale up to 64 hosts in a single cluster sets the stage for running any app, any scale on Virtual SAN.
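The "any app, any scale" claim above rests on simple arithmetic: up to 150K IOPS per host multiplied by up to 64 hosts per cluster. A quick sketch (the result is a theoretical headline ceiling; real-world throughput depends on hardware and workload):

```python
# Quick arithmetic for the scaling claim above: up to 150,000 IOPS per host
# and up to 64 hosts in a single cluster. This is only the headline ceiling;
# actual performance depends on hardware and workload.

iops_per_host = 150_000
max_hosts_per_cluster = 64

aggregate_iops = iops_per_host * max_hosts_per_cluster
print(f"Theoretical cluster ceiling: {aggregate_iops:,} IOPS")
```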
This video demonstrates how to troubleshoot Virtual SAN on-disk format upgrade to 3.0, which may fail in small Virtual SAN clusters or ROBO/stretched clusters.
Attempting an on-disk upgrade in certain VSAN configurations may result in failure. Configurations that can cause these errors include:
The stretched VSAN Cluster consists of two ESXi Hosts and the Witness Node (ROBO configuration)
Each Host in the Stretched Cluster contains a single VSAN Disk Group
A Virtual SAN cluster consists of three normal nodes, with one disk group per node
A Virtual SAN cluster is very full, preventing the “full data migration” disk-group decommission mode
To allow an upgrade to proceed in these configurations, a compromise on availability must be made. Data accessibility will be maintained, but the redundant copy of the data will be lost and rebuilt during the upgrade process. As a result, data will be exposed to faults and failures; for example, the loss of a disk on another node may result in data loss. This exposure to additional failure risk is referred to as “reduced redundancy,” and must be manually specified in the Ruby vSphere Console (RVC) to allow the upgrade to proceed. It is not possible to specify reduced redundancy when using the vSphere Web Client to start the upgrade.
Caution: During upgrade, a single point of failure is exposed. Follow all VMware best practices, and your business practices, regarding the backup of important data and virtual machines.
The purpose of this document is to explain how to size bandwidth requirements for Virtual SAN in Stretched Cluster configurations. This document only covers the Virtual SAN network bandwidth requirements.
In Stretched Cluster configurations, two data fault domains have one or more hosts, and the third fault domain contains a witness host or witness appliance. In this document each data fault domain will be referred to as a site.
Virtual SAN Stretched Cluster configurations can be spread across distances, provided bandwidth and latency requirements are met.
Stretched Cluster Configuration
The bandwidth requirement between the main sites is highly dependent on the workload to be run on Virtual SAN, amount of data, and handling of failure scenarios. Under normal operating conditions, the basic bandwidth requirements are:
Basic bandwidth requirements
Bandwidth Requirements Between Sites
Workloads are seldom all reads or writes, and normally include a general read to write ratio for each use case.
A good example of this would be a VDI workload. During peak utilization, VDI often behaves with a 70/30 write-to-read ratio; that is, 70% of the IO is due to write operations and 30% is due to read operations. As each solution has many factors, the true ratio should be measured for each workload.
Consider a general case where the total IO profile requires 100,000 IOPS, of which 70% are writes and 30% are reads. In a Stretched configuration, the write IO is what inter-site bandwidth requirements are sized against.
With Stretched Clusters, read traffic is, by default, serviced by the site that the VM resides on. This concept is called Read Locality.
The required bandwidth between two data sites (B) is equal to Write bandwidth (Wb) * data multiplier (md) * resynchronization multiplier (mr):
B = Wb * md * mr
The data multiplier is comprised of overhead for Virtual SAN metadata traffic and miscellaneous related operations. VMware recommends a data multiplier of 1.4.
The resynchronization multiplier is included to account for resynchronizing events. It is recommended to allocate bandwidth capacity on top of required bandwidth capacity for resynchronization events.
Making room for resynchronization traffic, an additional 25% is recommended.
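The formula B = Wb * md * mr can be worked through with the 100,000 IOPS / 70% write profile from the example above. This is an illustrative Python sketch, not VMware tooling; the 4 KB average write size is an assumption added for the sake of the arithmetic, and the function name is hypothetical.

```python
# Sketch of the inter-site bandwidth formula B = Wb * md * mr from the text.
# The 100,000 IOPS / 70% write profile comes from the example above; the
# 4 KB average write size is an assumption for illustration only.

def intersite_bandwidth_gbps(write_iops: float, io_size_bytes: float,
                             data_multiplier: float = 1.4,    # md (VMware-recommended)
                             resync_multiplier: float = 1.25  # mr (+25% for resync)
                             ) -> float:
    wb_bits_per_s = write_iops * io_size_bytes * 8  # write bandwidth Wb, bits/s
    return wb_bits_per_s * data_multiplier * resync_multiplier / 1e9

total_iops = 100_000
write_iops = total_iops * 0.70   # 70/30 write-to-read ratio
b = intersite_bandwidth_gbps(write_iops, 4096)
print(f"Required inter-site bandwidth: {b:.2f} Gbps")
```

With these assumptions the link between data sites would need roughly 4 Gbps; a larger average IO size or heavier write load scales the requirement linearly.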
Bandwidth Requirements Between Witness & Data Sites
Witness bandwidth isn’t calculated in the same way as inter-site bandwidth requirements. Witnesses do not maintain VM data, but rather only component metadata.
It is important to remember that data is stored on Virtual SAN in the form of objects. Objects are comprised of one or more components, for items such as:

VM Home or namespace
VM Swap object

Objects can be split into more than one component when the size is greater than 255GB, and/or when a Number of Stripes (stripe width) policy is applied. Additionally, the number of objects/components for a given virtual machine is multiplied when a Number of Failures to Tolerate (FTT) policy is applied for data protection and availability.
The required bandwidth between the Witness and each site is equal to ~1138 B x Number of Components / 5s.
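The rule of thumb above, ~1138 B per component every 5 seconds, can be worked through in a short sketch. This is illustrative Python, not VMware tooling; the function name is hypothetical and the 1,000-component cluster is a made-up example.

```python
# Sketch of the witness bandwidth rule of thumb from the text:
# ~1138 B of metadata per component, exchanged over a 5-second interval.

def witness_bandwidth_mbps(num_components: int,
                           bytes_per_component: int = 1138,
                           interval_s: float = 5.0) -> float:
    """Approximate witness-to-data-site bandwidth in Mbps."""
    return num_components * bytes_per_component * 8 / interval_s / 1e6

# e.g. a cluster whose objects total 1,000 components (hypothetical):
print(f"Required witness bandwidth: {witness_bandwidth_mbps(1000):.2f} Mbps")
```

Because only component metadata is exchanged, even a cluster with thousands of components needs only a few Mbps to the witness site, which is why low-bandwidth/high-latency links are acceptable there.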