Storage Glossary: Basic Storage Terms
Published: March 8, 2005
Asynchronous Replication
After data has been written to the primary storage site, new writes to that site can be accepted without waiting for the secondary (remote) storage site to finish its writes. Asynchronous replication does not have the latency impact that synchronous replication does, but it risks data loss if the primary site fails before the data has been written to the secondary site. See also replication.
Backup/Restore
A two-step process. Information is first copied to non-volatile disk or tape media. In the event of computer problems (such as disk drive failures, power outages, or virus infection) resulting in data loss or damage to the original data, the copy is subsequently retrieved and restored to a functional system.
Basic Disk
A basic disk is a physical disk that can be accessed by MS-DOS and all Windows-based operating systems. Basic disks can contain up to four primary partitions, or three primary partitions and an extended partition with multiple logical drives. Compare to dynamic disks.
Block Data
Raw data that does not have a file structure imposed on it. Database applications such as Microsoft SQL Server and Microsoft Exchange Server transfer data in blocks. Block transfer is the most efficient way to write to disk.
Business Continuity
The ability of an organization to continue to function even after a disastrous event, accomplished through the deployment of redundant hardware and software, the use of fault-tolerant systems, and a solid backup and recovery strategy.
Cluster
A group of servers that together act as a single system, enabling load balancing and high availability. A cluster can be housed in a single physical location (basic cluster) or distributed across multiple sites (geo-dispersed cluster) for disaster recovery.
DAS (Direct Attached Storage)
DAS is storage that is directly connected to a server by connectivity media such as parallel SCSI cables. This direct connection provides fast access to the data; however, the storage is accessible only from that server. DAS includes internally attached local disk drives as well as externally attached RAID (redundant array of independent disks) or JBOD (just a bunch of disks) enclosures. Although Fibre Channel can be used for direct-attached storage, it is more commonly used in storage area networks.
DFS (Distributed File System)
DFS allows administrators to group shared folders located on different servers by transparently connecting them to one or more DFS namespaces. A DFS namespace is a virtual view of shared folders in an organization.
Disaster Recovery
The ability to recover from the loss of a complete site, whether due to natural disaster or malicious intent. Disaster recovery strategies include replication and backup/restore.
Dynamic Disk
A dynamic disk is a physical disk that provides features that basic disks do not, such as support for volumes spanning multiple disks. Dynamic disks use a hidden database to track information about dynamic volumes on the disk and other dynamic disks in the computer.
Fabric
A Fibre Channel (or iSCSI) topology with at least one switch present on the network.
Failover
In the event of a physical disruption to a network component, data is immediately rerouted to an alternate path so that services remain uninterrupted. Failover applies both to clustering and to multiple paths to storage. In the case of clustering, one or more services (such as Exchange) are moved to a standby server in the event of a failure. In the case of multiple paths to storage, a path failure results in data being rerouted to a different physical connection to the storage.
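The path-failover behavior described above can be sketched in a few lines. This is only an illustration, not how any particular multipath driver is implemented; the path names and the `pick_path` helper are hypothetical.

```python
def pick_path(paths, healthy):
    """Return the first healthy path in priority order.

    paths   -- path names, most preferred first (hypothetical labels)
    healthy -- dict mapping path name to True/False
    """
    for path in paths:
        if healthy.get(path, False):
            return path
    raise RuntimeError("no healthy path to storage")


paths = ["hba0-switchA", "hba1-switchB"]

# Both paths are up: I/O uses the preferred path.
active = pick_path(paths, {"hba0-switchA": True, "hba1-switchB": True})

# The preferred path fails: I/O is rerouted to the alternate path.
after_failure = pick_path(paths, {"hba0-switchA": False, "hba1-switchB": True})
```

A real implementation would also need to detect the failure and retry in-flight requests; the sketch covers only the path-selection step.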
Fault Tolerance
Fault tolerance is the ability of computer hardware or software to ensure data integrity when hardware failures occur. Fault-tolerant features appear in many server operating systems and include mirrored volumes, RAID-5 volumes, and server clusters.
File Data
Data that has an associated file system.
Fibre Channel
A high-speed interconnect used in storage area networks (SANs) to connect servers to shared storage. Fibre Channel components include HBAs, hubs, switches, and cabling. The term Fibre Channel also refers to the storage protocol.
FRS (File Replication Service)
File Replication Service (FRS) is a technology that replicates files and folders stored in the SYSVOL shared folder on domain controllers and in Distributed File System (DFS) shared folders. When FRS detects that a change has been made to a file or folder within a replicated shared folder, FRS replicates the updated file or folder to other servers.
Geo-Dispersed Cluster
A geo-dispersed, or multi-site, cluster is a cluster configuration used to help ensure high system and application availability in the event of site disaster. In this configuration, servers are separated geographically and the physical storage (quorum disk) is synchronously replicated between sites.
Global File System
In some configurations, as with clusters or multiple NAS boxes, it is useful to have a means to make the file systems on multiple servers or devices look like a single file system. A global or dispersed file system would enable storage administrators to globally build or make changes to file systems. To date this remains an emerging technology.
High Availability
A continuously available computer system is characterized as having essentially no downtime in any given year. A system with 99.999% availability experiences only about five minutes of downtime per year. In contrast, a high availability system is defined as having 99.9% uptime, which translates into roughly 8.8 hours of planned or unplanned downtime per year.
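The downtime figures follow directly from the availability percentage. The sketch below shows the arithmetic, assuming a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an availability figure."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(99.999)  # about 5.3 minutes
three_nines = downtime_minutes_per_year(99.9)   # about 525.6 minutes (8.76 hours)
```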
HBA (Host Bus Adapter)
The HBA is the intelligent hardware residing on the host server which controls the transfer of data between the host and the target storage device.
ILM (Information Lifecycle Management)
The process of managing information growth, storage, and retrieval over time, based on its value to the organization. Sometimes referred to as data lifecycle management.
Initiator
An initiator is the device (usually contained within a server) that makes the application requests, which are then sent to the target device.
iSCSI (Internet SCSI)
A protocol that enables transport of block data over IP networks, without the need for a specialized network infrastructure, such as Fibre Channel.
JBOD (Just a Bunch of Disks)
As the name suggests, a group of disks housed in its own box; JBOD differs from RAID in not having any storage controller intelligence or data redundancy capabilities.
Load Balancing
Load balancing is the ability to redistribute load (read/write requests) to an alternate path between the server and the storage device. It helps maintain high-performance networking.
LUN (Logical Unit Number)
A logical unit is a conceptual division (a subunit) of a storage disk or a set of disks. Logical units can directly correspond to a volume drive (for example, C: can be a logical unit). Each logical unit has an address, known as the logical unit number (LUN), which allows it to be uniquely identified.
LUN Masking
A method to restrict server access to storage not specifically allocated to that server. LUN masking is similar to zoning, but is implemented in the storage array, not the switch.
Mount Point
A mount point is a directory on a volume that an application can use to "mount" (set up for use) a different volume. Mount points overcome the limitation on drive letters and allow more logical organization of files and folders.
Multipathing
Multipathing is the use of redundant storage network components to transfer data between the server and storage. These components include cabling, adapters, and switches, along with the software that manages the redundant paths.
NAS (Network Attached Storage)
A NAS device is a server that runs an operating system specifically designed for handling files (rather than block data). Network-attached storage is accessible directly on the local area network (LAN) through LAN protocols such as TCP/IP. Compare to DAS and SAN.
NTFS File System
A file system that provides performance, security, reliability, and advanced features that are not found in any version of the file allocation table (FAT) file system. For example, NTFS guarantees volume consistency by using standard transaction logging and recovery techniques. If a system fails, NTFS uses its log file and checkpoint information to restore the consistency of the file system. NTFS also provides advanced features, such as file and folder permissions, encryption, disk quotas, and compression.
Partition
A partition is the portion of a physical disk or LUN that functions as though it were a physically separate disk. Once the partition is created, it must be formatted and assigned a drive letter before data can be stored on it. On basic disks, partitions can contain basic volumes, which include primary partitions and logical drives. On dynamic disks, partitions are known as dynamic volumes, which include simple, striped, spanned, mirrored, and RAID-5 (striped with parity) volumes.
Port
The physical connection point on computers, switches, storage arrays, and other devices, used to connect to other devices on a network. Ports on a Fibre Channel network are identified by their World Wide Port Name (WWPN) IDs; on iSCSI networks, ports are commonly given an iSCSI name. Not to be confused with TCP/IP ports, which are used as virtual addresses assigned to each IP address.
RAID (Redundant Array of Independent Disks)
A way of storing the same data over multiple physical disks to ensure that if a hard disk fails, a redundant copy of the data can be accessed instead. Example schemes include mirroring and RAID-5.
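As a rough illustration of how a parity scheme such as RAID-5 recovers from a single disk failure, the sketch below computes an XOR parity block and uses it to rebuild a lost block. It is a simplification: a real RAID-5 array rotates parity across all member disks and works on fixed-size stripes, and the block contents and the `xor_blocks` helper here are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block per data disk (illustrative contents).
data_blocks = [b"DISK", b"DATA", b"SAFE"]
parity = xor_blocks(data_blocks)

# Simulate losing the second disk, then rebuild it from the
# surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields exactly the missing block. Mirroring, by contrast, simply keeps a full second copy.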
Redundancy
The duplication of information or hardware equipment components to ensure that should a primary resource fail, a secondary resource can take over its function.
Replication
Replication is the process of duplicating mission-critical data from one highly available site to another. The replication process can be synchronous or asynchronous; duplicates are known as clones, point-in-time copies, or snapshots, depending on the type of copy being made.
SAN (Storage Area Network)
A storage area network (SAN) is a specialized network that provides access to high performance and highly available storage subsystems using block storage protocols. The SAN is made up of specific devices, such as host bus adapters (HBAs) in the host servers, switches that help route storage traffic, and disk storage subsystems. The main characteristic of a SAN is that the storage subsystems are generally available to multiple hosts at the same time, which makes them scalable and flexible. Compare with NAS and DAS.
SCSI (Small Computer System Interface)
A set of standards allowing computers to communicate with attached devices, such as storage devices (disk drives, tape libraries, and so on) and printers. SCSI also refers to a parallel interconnect technology that implements the SCSI protocol.
Shadow Copy
A shadow copy is a high-fidelity point-in-time copy of the original data. In the Windows environment, shadow copies are created using the Volume Shadow Copy Service (VSS); third-party applications can also create shadow copies.
Storage Subsystem
A subsystem that houses a group of disks (or tapes), together controlled by software usually housed within the subsystem.
Storage Controller
The controller provides the intelligence for the storage subsystem, including functions such as disk aggregation (RAID), I/O routing, and error detection and recovery. Each storage subsystem contains one or more storage controllers.
Switch
An intelligent network device that directs data from one or more sources (such as a server) to a specific target device (such as a specific storage device) with minimum delay. Switches differ in their capabilities; a director-class switch, for example, is a high-end switch that provides advanced management and availability features.
Synchronous Replication
In synchronous replication, each write to the primary disk and the secondary (remote) disk must be complete before the next write can begin. The advantage of this approach is that the two sets of data are always synchronized. The disadvantage is that if the distance between the two storage disks is substantial, the replication process can take a long time and slow down the application writing the data. See also asynchronous replication.
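The trade-off between the two replication modes can be shown with a toy model. The `ReplicatedStore` class below is hypothetical and ignores networking entirely; it only tracks which writes have reached the secondary site, so that the writes at risk when the primary fails can be seen.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a primary site replicating writes to a secondary site."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []
        self.secondary = []
        self.pending = deque()  # writes not yet shipped to the secondary

    def write(self, item):
        self.primary.append(item)
        if self.synchronous:
            # The write completes only after the secondary has it too.
            self.secondary.append(item)
        else:
            # The write completes immediately; replication happens later.
            self.pending.append(item)

    def drain(self, count=1):
        """Ship up to `count` pending writes to the secondary."""
        for _ in range(min(count, len(self.pending))):
            self.secondary.append(self.pending.popleft())

    def lost_if_primary_fails(self):
        """Writes that exist only on the primary right now."""
        return list(self.pending)


async_store = ReplicatedStore(synchronous=False)
for item in ["a", "b", "c"]:
    async_store.write(item)
async_store.drain(1)  # only "a" has reached the secondary so far

sync_store = ReplicatedStore(synchronous=True)
for item in ["a", "b", "c"]:
    sync_store.write(item)
```

In this model the asynchronous store would lose "b" and "c" if the primary failed at this point, while the synchronous store would lose nothing; the price is that every synchronous write waits on the remote site.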
Target
A target is the device to which the initiator sends data. Most commonly the target is the storage array, but the term also applies to bridges, tape libraries, tape drives, or other devices.
Tiered Storage
Data is stored according to its intended use. For instance, data intended for restoration in the event of data loss or corruption is stored locally, for fast recovery. Data required to be kept for regulatory purposes is archived to lower-cost disks.
VDS (Virtual Disk Service)
VDS is a set of application programming interfaces (APIs) that provides a single interface for managing disks in Windows Server 2003 operating systems. VDS provides a means of managing storage hardware and disks, and for creating volumes on those disks.
Virtualization
In storage, virtualization is a means by which multiple physical storage devices are viewed as a single logical unit. Virtualization can be accomplished in-band (in the data path) or out-of-band. Out-of-band virtualization does not compete for host resources, and can virtualize storage resources irrespective of whether they are DAS, NAS, or SAN.
Volume
A volume is an area of storage on a hard disk. A volume is formatted by using a file system, such as file allocation table (FAT) or NTFS, and typically has a drive letter assigned to it. A single hard disk can have multiple volumes, and volumes can also span multiple disks.
VSS (Volume Shadow Copy Service)
The Volume Shadow Copy Service provides the backup infrastructure for the Microsoft Windows XP and Microsoft Windows Server 2003 operating systems, as well as a mechanism for creating consistent point-in-time copies of data known as shadow copies.
Zoning
A method used to restrict server access to storage resources that are not allocated to that server. Zoning is similar to LUN masking, but is implemented in the switch and operates on the basis of port identification (either port numbers on the switch or the WWPNs of the attached initiators and targets).
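Zoning and LUN masking are complementary checks, and the sketch below models them as two lookups. The zone names, port labels, and LUN numbers are invented for illustration (real WWPNs are 64-bit identifiers), and no switch or array is configured this way in practice.

```python
# Zoning (enforced in the switch): which ports may talk to each other.
ZONES = {
    "zone_a": {"wwpn:host1", "wwpn:array1"},
    "zone_b": {"wwpn:host2", "wwpn:array1"},
}

# LUN masking (enforced in the storage array): which LUNs each initiator may see.
LUN_MASKS = {
    "wwpn:host1": {0, 1},
    "wwpn:host2": {2},
}


def can_reach(initiator: str, target: str) -> bool:
    """True if some zone contains both the initiator and the target port."""
    return any(initiator in members and target in members
               for members in ZONES.values())


def can_access_lun(initiator: str, target: str, lun: int) -> bool:
    """True if zoning permits the connection and the array exposes the LUN."""
    return can_reach(initiator, target) and lun in LUN_MASKS.get(initiator, set())
```

With these tables, `can_access_lun("wwpn:host1", "wwpn:array1", 0)` is allowed, while the same host asking for LUN 2 is refused by the mask even though zoning lets it reach the array.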