What Is Storage Virtualization | Introduction and Implementation
In the past, enterprises often struggled with the cost of managing ever-growing volumes of data and storage devices. Today, storage virtualization offers a solution to this problem. In this article, I will introduce what storage virtualization is and its main types.
Why we need storage virtualization
If you are familiar with virtual environments and products, you may have heard of the many types of virtualization. Storage virtualization is one of the most significant.
Storage virtualization is the pooling of physical storage from multiple storage devices into a seemingly single storage device that is managed by a central console. It is commonly used for virtual machines (VMs) in virtual environments.
Compared to traditional storage technologies, storage virtualization has the following advantages:
- Higher disk utilization: While disk utilization with traditional storage technologies is typically only 30%-70%, disk utilization with storage virtualization can reach 70%-90%.
- Greater flexibility: Storage virtualization can accommodate different vendors and different classes of heterogeneous storage platforms, providing greater flexibility in storage resource management.
- Simpler IT environments: Storage virtualization reduces the amount of hardware needed to run applications, which reduces complexity in the datacenter.
- Easier management: Storage virtualization gives users a centralized way to manage large-capacity storage systems, which avoids much of the hassle caused by adding storage devices.
- Higher performance: A virtualized storage system spreads the bandwidth required for each data access across more storage modules, increasing overall access speed.
- Lower cost: Storage virtualization requires fewer resources (hardware, physical storage, IT support, etc.) than the complex infrastructure and network of a traditional datacenter, which reduces both up-front investment and ongoing maintenance costs.
- Readiness for modernization: Transitioning to storage virtualization opens up new opportunities for organizations to take advantage of the latest developments in virtualization, software-defined technologies and hyperconvergence.
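To make the utilization point concrete, here is a minimal sketch with made-up numbers (three hypothetical 100 GB devices). It assumes that a traditional volume must fit on a single device while a pooled volume may span devices. The figures are illustrative only and are not measurements.

```python
def fits_in_silo(free_per_device, request_gb):
    """A siloed volume must fit entirely on one device."""
    return any(free >= request_gb for free in free_per_device)


def fits_in_pool(free_per_device, request_gb):
    """A pooled volume may span devices, so only total free space matters."""
    return sum(free_per_device) >= request_gb


# Hypothetical numbers: three 100 GB devices with 30, 80 and 40 GB in use.
capacity_gb = [100, 100, 100]
used_gb = [30, 80, 40]
free_gb = [c - u for c, u in zip(capacity_gb, used_gb)]  # [70, 20, 60]

request_gb = 90
siloed_ok = fits_in_silo(free_gb, request_gb)
pooled_ok = fits_in_pool(free_gb, request_gb)

# Utilization if the request can be placed in the pool.
utilization_before = sum(used_gb) / sum(capacity_gb)
utilization_after = (sum(used_gb) + request_gb) / sum(capacity_gb)

print(siloed_ok, pooled_ok)                                  # False True
print(f"{utilization_before:.0%} -> {utilization_after:.0%}")  # 50% -> 80%
```

In this toy case, the 90 GB request cannot be placed on any single device even though 150 GB is free in total, so the siloed setup would need new hardware while the pooled setup would not.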
These advantages are why many enterprises adopt storage virtualization. Next, I will introduce how storage virtualization works and its different types.
How storage virtualization works
In essence, storage virtualization is a technology that abstracts physical storage resources to remove the traditional boundaries of physical storage devices.
It separates the storage management software from the underlying hardware infrastructure, abstracting arrays and disks into a vast pool of virtual storage, in order to provide system administrators with a seamless virtual view of storage resources.
To provide access to the data stored on the physical storage devices, the virtualization software either creates a map using metadata or uses an algorithm to locate the data dynamically. The virtualization software then intercepts read and write requests from applications and uses the map or algorithm to find or save the data on the appropriate physical device.
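The two lookup strategies described above can be sketched as follows. This is a simplified illustration, not how any particular product works: the class names (`MetadataMap`, `AlgorithmicMap`, `VirtualVolume`), the placement policy, and the use of Python dicts as stand-ins for physical devices are all assumptions made for the example.

```python
class MetadataMap:
    """Lookup-table mapping: virtual block -> (device, physical block)."""

    def __init__(self, devices):
        self.devices = devices      # device name -> dict standing in for a disk
        self.table = {}             # virtual block -> (device name, physical block)
        self.next_free = {name: 0 for name in devices}

    def locate(self, vblock, allocate=False):
        if vblock not in self.table:
            if not allocate:
                raise KeyError(f"virtual block {vblock} was never written")
            # Toy placement policy: use the device with the fewest allocated blocks.
            name = min(self.next_free, key=self.next_free.get)
            self.table[vblock] = (name, self.next_free[name])
            self.next_free[name] += 1
        return self.table[vblock]


class AlgorithmicMap:
    """Computed mapping: the location is derived from the block number itself."""

    def __init__(self, devices):
        self.devices = devices
        self.names = sorted(devices)

    def locate(self, vblock, allocate=False):
        # Round-robin striping: no table has to be stored or updated.
        name = self.names[vblock % len(self.names)]
        return name, vblock // len(self.names)


class VirtualVolume:
    """Intercepts reads and writes and forwards them to the mapped device."""

    def __init__(self, mapping):
        self.mapping = mapping

    def write(self, vblock, data):
        name, pblock = self.mapping.locate(vblock, allocate=True)
        self.mapping.devices[name][pblock] = data

    def read(self, vblock):
        name, pblock = self.mapping.locate(vblock)
        return self.mapping.devices[name][pblock]


devices = {"array_a": {}, "array_b": {}}
volume = VirtualVolume(AlgorithmicMap(devices))
for i in range(4):
    volume.write(i, f"block-{i}".encode())
print(volume.read(3))          # b'block-3'
print(sorted(devices["array_a"].items()))
```

The table-based variant can place any block anywhere (useful for migration) at the cost of maintaining metadata, while the computed variant needs no metadata but fixes the layout in advance.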
2 types of storage virtualization
In general, there are 2 basic storage virtualization technologies: block-level and file-level storage virtualization.
✦ Block-level storage virtualization: the most common storage virtualization type. It abstracts the system’s logical storage (drive partitions) from its physical components (memory blocks and storage media).
The storage virtualization engine discovers all available blocks on multiple arrays and individual media, regardless of the storage system’s physical location, logical partitions, or manufacturer. The engine leaves data in its physical location and maps the address to the virtual storage pool. This enables the engine to present multi-vendor storage system capacity to servers, as if the storage were a single array.
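As a rough sketch of this idea, the example below concatenates extents from several arrays into one linear address space and translates pool addresses back to physical ones without moving any data. The `Extent` and `StoragePool` names, the array names, and the simple concatenation layout are assumptions for illustration; real engines use more elaborate layouts.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    array: str    # physical array that owns the blocks (hypothetical names below)
    start: int    # first physical block of the extent on that array
    length: int   # number of blocks in the extent


class StoragePool:
    """Presents extents from several arrays as one linear block address space."""

    def __init__(self, extents):
        self.extents = list(extents)
        self.offsets = []         # virtual start address of each extent
        total = 0
        for extent in self.extents:
            self.offsets.append(total)
            total += extent.length
        self.size = total

    def resolve(self, virtual_block):
        """Translate a pool address to (array, physical block); data stays in place."""
        if not 0 <= virtual_block < self.size:
            raise IndexError(f"block {virtual_block} is outside the pool")
        i = bisect_right(self.offsets, virtual_block) - 1
        extent = self.extents[i]
        return extent.array, extent.start + (virtual_block - self.offsets[i])


pool = StoragePool([
    Extent("vendor_x_array", start=0, length=1000),
    Extent("vendor_y_array", start=500, length=2000),
    Extent("local_disk", start=0, length=250),
])
print(pool.size)            # 3250
print(pool.resolve(999))    # ('vendor_x_array', 999)
print(pool.resolve(1000))   # ('vendor_y_array', 500)
print(pool.resolve(3100))   # ('local_disk', 100)
```

A server addressing this pool sees 3250 contiguous blocks and never needs to know which vendor's array holds a given block.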
✦ File-level storage virtualization: NAS devices are physically and logically independent of each other and need to be managed, optimized and configured separately. Therefore, managing multiple NAS devices can be time-consuming and costly, especially when migrating data between them.
File-level storage virtualization breaks the dependency between the data being accessed and its physical storage location on a conventional NAS array. Pooling NAS resources makes it easier to handle file migration in the background, which greatly simplifies managing multiple NAS devices through a single management console and helps improve performance.
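One way to picture this is a global namespace that maps client-visible paths to whichever NAS device currently holds the file. The sketch below is a toy model under that assumption: `GlobalNamespace`, the device names, and the dict-based "NAS devices" are hypothetical, and real systems must also handle locking, open files and failures during migration.

```python
class GlobalNamespace:
    """Maps client-visible paths to the NAS device that currently holds the file."""

    def __init__(self, nas_devices):
        self.nas = nas_devices    # device name -> {path: bytes}
        self.location = {}        # path -> device name

    def write(self, path, data, device=None):
        # Existing files stay where they are; new files go to the requested
        # device, or to the device currently holding the fewest files.
        target = self.location.get(path) or device or min(
            self.nas, key=lambda name: len(self.nas[name]))
        self.nas[target][path] = data
        self.location[path] = target

    def read(self, path):
        return self.nas[self.location[path]][path]

    def migrate(self, path, destination):
        """Move a file between devices; the client-visible path does not change."""
        source = self.location[path]
        if source == destination:
            return
        self.nas[destination][path] = self.nas[source][path]  # copy first
        self.location[path] = destination                      # then switch the pointer
        del self.nas[source][path]                             # then free the old copy


ns = GlobalNamespace({"nas1": {}, "nas2": {}})
ns.write("/projects/report.txt", b"q3 numbers", device="nas1")
ns.migrate("/projects/report.txt", "nas2")
print(ns.read("/projects/report.txt"))   # b'q3 numbers'
print(ns.location["/projects/report.txt"])  # nas2
```

Because clients only ever use the path, the file can be moved from `nas1` to `nas2` in the background without any change on the client side.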
Block-level storage virtualization implementation approaches
There are 3 block-level storage virtualization implementation approaches: host-based, storage device-based, and network-based storage virtualization.
Here is a summarized comparison of them:
| Host compatibility | Storage compatibility | Scalability | Implementation | Performance |
Next, I will describe each of these 3 approaches in more detail, along with example solutions.
Host-based storage virtualization
Virtualization and management are implemented at the host level via agents or software installed on one or more hosts. The physical storage can be almost any device or array.
Advantages:
- Supports heterogeneous storage systems.
- Low equipment cost, as no additional hardware is required.
- Wide choice of products and service vendors.
- Good workload balancing in host and small SAN structures.
Disadvantages:
- Consumes host resources and tends to affect performance.
- May lead to compatibility problems between operating systems (OSs) and applications.
- Makes host maintenance, upgrades and expansion more difficult, and can affect system stability.
- Requires a complex data migration process, which may also affect business continuity.
Host-based storage virtualization solutions:
Symantec Veritas Volume Manager, LVM
Storage device-based storage virtualization
Virtualization is implemented in the device controller, in close proximity to the physical disk drives.
A primary storage controller provides the pooling, metadata management, replication, and migration services across controllers, and allows direct attachment to other storage controllers from the same or different vendors.
Advantages:
- No additional hardware or infrastructure requirements.
- Does not add latency to individual I/Os.
- Does not consume host resources.
- Abundant data management features.
Disadvantages:
- Storage utilization is optimized only across the connected controllers.
- Replication and data migration are possible only across the connected controllers, and long-distance support requires devices from the same vendor.
- Increases the overall cost of the data management software needed to configure multiple storage devices.
Storage device-based storage virtualization solutions:
Network-based storage virtualization
The network-based storage virtualization device uses iSCSI or Fibre Channel (FC) networks to connect as a SAN, and provides a layer of abstraction between the hosts performing the I/O and the storage controllers providing the storage capacity.
There are two commonly available implementations of network-based storage virtualization: appliance-based and switch-based.
Advantages:
- True heterogeneous storage virtualization.
- Does not consume host resources.
- Data caching is possible when the device is in-band.
- A single unified management interface with good scalability for all virtualized storage.
Disadvantages:
- Data management is limited by vendor support.
- Fast metadata updates are difficult to implement in switch-based devices.
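The in-band caching point can be illustrated with a small sketch: because an in-band device sits in the data path, every read passes through it, so it can serve repeated reads from its own cache. The `InBandAppliance` class, the LRU eviction policy, the write-through behavior and the tiny cache size are assumptions chosen for the example, not a description of any specific product.

```python
from collections import OrderedDict


class InBandAppliance:
    """Sits in the data path between hosts and a backend, with an LRU read cache."""

    def __init__(self, backend, cache_blocks=2):
        self.backend = backend            # dict standing in for the storage controllers
        self.cache = OrderedDict()        # block -> data, least recently used first
        self.cache_blocks = cache_blocks
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)     # mark as most recently used
            return self.cache[block]
        self.misses += 1
        data = self.backend[block]            # only misses reach the backend
        self._remember(block, data)
        return data

    def write(self, block, data):
        self.backend[block] = data            # write-through: backend stays authoritative
        self._remember(block, data)

    def _remember(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)    # evict the least recently used block


appliance = InBandAppliance(backend={0: b"a", 1: b"b", 2: b"c"})
for block in (0, 1, 0, 2, 1):
    appliance.read(block)
print(appliance.hits, appliance.misses)   # 1 4
```

With a two-block cache, only the repeated read of block 0 is a hit here; a larger cache or a more local access pattern would raise the hit rate.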
Network-based storage virtualization solutions:
Due to its significant advantages over traditional storage, virtualized storage became a major trend in the IT industry soon after it was developed.
In addition, not only will storage virtualization software be developed further, but standards such as the Storage Management Initiative Specification (SMI-S) will also be more widely adopted, making virtualization products more compatible with a wide range of storage systems. This will give enterprises more options.