
HP-UX 10.0

How HP improved the performance, reliability, and ease of use of its flagship PA-RISC operating system

John Sontag

HP-UX 10.0, the latest operating-system release for Hewlett-Packard's PA-RISC family of servers and workstations, should have just shipped by the time you read this. HP designed it to perform better on SMP (symmetric multiprocessing) systems and under heavy loads.

To increase reliability, the new HP-UX has a journaled file system and the ability to replicate services. It is also easier to install, update, and manage thanks to a new software distribution utility, better system administration tools, and a simpler bundling scheme that rolls features formerly available only on workstations or servers or as add-ons into a single standard package. Finally, it's more standard. Version 10.0 complies with most of the COSE SPEC 1170, the Unix System V release 4.0 file system layout, NFS 4.2 with diskless support, and the Posix real-time interfaces.

Making It Faster

Processors keep getting faster, but overall system performance can't scale accordingly unless I/O throughput does too. Version 10.0 attacks the I/O bottleneck in several ways. It coalesces I/O requests that access sequential areas of a disk into a single I/O operation, which can improve disk throughput by up to 30 percent. Its read-ahead routines are tuned to predict file access behavior more accurately and exploit I/O coalescence. On SMP systems, disk throughput on a 12-CPU system was improved sixfold by more intelligent management of I/O initiations and interrupts across the set of processors.
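The coalescing idea can be illustrated with a short sketch. This is not HP-UX kernel code: the `(start block, block count)` request format and the `coalesce` function are assumptions made for illustration, and a real driver would also have to respect device limits and request ordering.

```python
from typing import List, Tuple

# Hypothetical request format: (starting block, number of blocks).
Request = Tuple[int, int]


def coalesce(requests: List[Request]) -> List[Request]:
    """Merge requests whose block ranges touch or overlap into single requests."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged:
            last_start, last_count = merged[-1]
            last_end = last_start + last_count
            if start <= last_end:
                # This request continues the previous one: extend it.
                new_end = max(last_end, start + count)
                merged[-1] = (last_start, new_end - last_start)
                continue
        merged.append((start, count))
    return merged


queue = [(100, 8), (108, 8), (116, 4), (300, 8)]
print(coalesce(queue))  # [(100, 20), (300, 8)]
```

In this example, three back-to-back requests become one 20-block transfer, so the disk sees two operations instead of four.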

Traditionally in Unix, a high-priority process in need of more physical memory can force a lower-priority process to swap to disk. In the case of a large application, the swap can involve hundreds of megabytes, and even with a fast wide SCSI device, the high-priority process might have to wait 5 or 6 seconds for the swap to complete.

In version 10.0, the virtual memory manager achieves much smoother operation under high load by replacing process swap with process deactivation. When physical memory runs low or processes begin to thrash memory, low-priority processes can be deactivated--that is, taken off the run queue. The system then need only displace small clusters of pages when a high-priority process requests memory. Instead of waiting 5 seconds or more, the process waits only about 20 milliseconds--a major improvement in response time.

To speed up systems with a large memory load, version 10.0 adds the serialize command (and system call). Consider two concurrent instances of a simulation program, each of which randomly touches a large data array. In earlier versions of HP-UX, these two instances would thrash physical memory because the least recently used and priority page replacement algorithms would not be able to predict which pages to keep in memory for best throughput. The serialize command tells the virtual memory system to make a process eligible to be serialized behind other processes. It's analogous to the Unix nice command. When there is plenty of memory, serialize has no effect. But when memory runs low, the system runs serialized processes one at a time. Once the highest-priority serialized processes run to completion, the lower-priority processes can complete. In some test cases where two instances touch large arrays of data randomly, this technique cuts clock time by a factor of eight. As with the nice command, you can return a serialized process to normal priority.
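The policy described above can be modeled in a few lines. The sketch below is a toy model, not the HP-UX scheduler: the `Proc` class, the `runnable_now` function, and the use of list order to stand in for priority are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Proc:
    name: str
    working_set_mb: int
    serialized: bool = False  # True if the process was marked with serialize


def runnable_now(procs: List[Proc], physical_mb: int) -> List[str]:
    """Return the names of processes allowed to run at this moment.

    List order stands in for priority (an assumption of this sketch).
    """
    demand = sum(p.working_set_mb for p in procs)
    if demand <= physical_mb:
        # Plenty of memory: serialization has no effect.
        return [p.name for p in procs]
    runnable: List[str] = []
    serialized_slot_taken = False
    for p in procs:
        if not p.serialized:
            runnable.append(p.name)
        elif not serialized_slot_taken:
            # Memory is short: only one serialized process runs at a time.
            runnable.append(p.name)
            serialized_slot_taken = True
    return runnable


sims = [Proc("sim_a", 600, serialized=True), Proc("sim_b", 600, serialized=True)]
print(runnable_now(sims, physical_mb=2048))
print(runnable_now(sims, physical_mb=1024))
```

With 2 GB of memory both simulations run together; with 1 GB, only the first runs, and the second waits instead of competing for pages.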

With Process Resource Manager, an alternate process scheduler, the system administrator can create groups of users and guarantee each group a minimum share of the total CPU time. Mission-critical applications must respond to users in a hurry. This technique ensures that background activities don't get in the way, even if they run at a higher priority than the mission-critical software.
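A minimum-share guarantee can be sketched as follows. The group names, share values, and the `allocate_cpu` function are hypothetical. The rule that idle groups' shares are divided among the active groups is an assumption of this sketch, not a documented description of Process Resource Manager.

```python
from typing import Dict, Set


def allocate_cpu(shares: Dict[str, int], active: Set[str]) -> Dict[str, float]:
    """Split CPU time among active groups in proportion to their shares.

    Because idle groups drop out of the denominator, an active group never
    receives less than its configured fraction of the total.
    """
    active_total = sum(shares[group] for group in active)
    if active_total == 0:
        return {}
    return {group: shares[group] / active_total for group in sorted(active)}


# Hypothetical groups and shares (they sum to 100 here for readability).
shares = {"orders": 60, "reports": 30, "batch": 10}

print(allocate_cpu(shares, {"orders", "reports", "batch"}))
print(allocate_cpu(shares, {"orders", "batch"}))
```

When every group is busy, "orders" gets its guaranteed 60 percent; when "reports" is idle, "orders" gets more than 60 percent, never less.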

Making It More Reliable

LVM (Logical Volume Manager), which came from OSF/1, manages collections of disk drives. It partitions drives, mirrors data for redundancy, and stripes data across multiple disks for higher performance. HP-UX 10.0 upgrades these features to improve the resiliency of mass-storage subsystems. When you're running a 24x7 (24 hours a day, 7 days a week) operation, there's never a convenient time to do a backup. Now you can remove a drive from a mirrored pair to enable off-line backup from any node with no interruption of service. LVM also lets you take one copy of a data set spread across many disks off-line and back it up. The backup utility can access a frozen set of data and operate at high speed. When backup is complete, the disks are brought back on-line and synchronized with the live system data. LVM exploits RAID disks with multiple controllers, offering an automatic fail-over capability.

Version 10.0 further protects data integrity with VxFS, the journaled Veritas file system (see "The Great Little File System," February BYTE). Compared to the BSD 4.2 HFS and NFS, VxFS has superior data integrity, recovery, and performance. It maintains an intent log of uncommitted meta-data transactions. If the system crashes, recovery is a simple process of reading the intent log and applying or backing out changes. With add-on products, it is possible to resize and reorganize file systems on-line, control caching options, and use the intent log for fast, synchronous writes. All these features add up to a file system that is much more resilient across system failures and, potentially, much faster.
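Intent-log recovery can be illustrated with a generic journaling sketch. It does not reflect the VxFS on-disk format: the record layout, the "update" and "commit" record kinds, and the `recover` function are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical log record: (transaction id, kind, payload).
# kind is "update" with payload (key, value), or "commit" with payload None.
LogRecord = Tuple[int, str, object]


def recover(metadata: Dict[str, str], log: List[LogRecord]) -> Dict[str, str]:
    """Replay the log after a crash.

    Updates from transactions that reached their commit record are applied;
    updates from transactions that did not are backed out (ignored).
    """
    committed = {txid for txid, kind, _ in log if kind == "commit"}
    recovered = dict(metadata)
    for txid, kind, payload in log:
        if kind == "update" and txid in committed:
            key, value = payload
            recovered[key] = value
    return recovered


on_disk = {"inode7.size": "0"}
log = [
    (1, "update", ("inode7.size", "4096")),
    (1, "commit", None),
    (2, "update", ("inode9.links", "2")),  # crash happened before commit
]
print(recover(on_disk, log))  # {'inode7.size': '4096'}
```

Recovery touches only the records in the log, so its cost depends on the size of the log rather than the size of the file system.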

Resilience to memory faults becomes increasingly important as system memories grow larger and denser. In version 10.0, the diagnostic system and the operating system can mark bad pages and then avoid using them, thereby preventing system panics. If a page shows two occurrences of recoverable errors at the same address or one unrecoverable error, it's removed from service. Information about bad pages resides in a nonvolatile RAM, where it survives across system boots.
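The retirement rule stated above (two recoverable errors at the same address, or one unrecoverable error) is simple enough to sketch. The `BadPageTracker` class and the 4-KB page size are assumptions made for illustration, and the in-memory set merely stands in for the nonvolatile record.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size for this sketch


class BadPageTracker:
    """Track memory errors and decide when a page leaves service."""

    def __init__(self) -> None:
        self.recoverable_errors = Counter()  # error count per address
        self.retired_pages = set()           # stands in for nonvolatile RAM

    def report(self, address: int, recoverable: bool) -> bool:
        """Record one error; return True if the page is now retired."""
        page = address // PAGE_SIZE
        if not recoverable:
            # One unrecoverable error is enough.
            self.retired_pages.add(page)
        else:
            self.recoverable_errors[address] += 1
            if self.recoverable_errors[address] >= 2:
                # Second recoverable error at the same address.
                self.retired_pages.add(page)
        return page in self.retired_pages


tracker = BadPageTracker()
print(tracker.report(0x1000, recoverable=True))   # False
print(tracker.report(0x1000, recoverable=True))   # True
print(tracker.report(0x5008, recoverable=False))  # True
```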

Ultimately, of course, reliability means keeping applications up and running no matter what. To that end, version 10.0 offers MC/ServiceGuard, a facility for mutual backup of services across clusters of up to four servers. When a protected service (or the system supporting it) fails, MC/ServiceGuard resurrects it on another system in the cluster. Packaging makes applications highly available without requiring them to be rewritten.

A package defines the set of resources an application needs to run, including disks and network resources. When a system, network connection, or application fails, a clusterwide monitor notices the service interruption and launches a package on a backup system. Using multiple disk connections, the backup system can commandeer the failed system's disk drives, and its networking interfaces can adopt the failed system's IP address.
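The package concept can be sketched as a small data structure plus a placement rule. This is not MC/ServiceGuard configuration syntax: the `Package` fields, the node names, the address, and the "first healthy node in preference order" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Package:
    name: str
    volume_groups: List[str]  # disks the application needs
    ip_address: str           # address that moves with the package
    nodes: List[str]          # preference order: primary first, then backups


def place(package: Package, healthy_nodes: Set[str]) -> Optional[str]:
    """Pick the first healthy node in the package's preference order."""
    for node in package.nodes:
        if node in healthy_nodes:
            return node
    return None  # no node in the cluster can run the package


# Hypothetical package; 192.0.2.10 is a documentation-only address.
orders = Package("orders_db", ["vg_orders"], "192.0.2.10", ["alpha", "beta"])

print(place(orders, {"alpha", "beta"}))  # alpha
print(place(orders, {"beta"}))           # beta
```

After "alpha" fails, the package lands on "beta", which would then take over the listed volume groups and the package's IP address.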

When a mirrored disk or a redundant network interface fails, repair can occur in under 10 seconds. If an entire system fails, users will be able to continue in 1 to 2 minutes, once application recovery is complete. Depending on the application, you might lose some data entry, or you might even have to reenter the application--this isn't nonstop computing. But it's an extremely cost-effective way to have servers back up each other. It offers peak performance when all is well. When a failure occurs, service remains available with some degradation of performance because one system now must do the work of two.

Making It Easier to Use

HP-UX's system management tools share a common interface thanks to the OBAM (Object-Action Manager), which encapsulates diverse disciplines, including X Window System, Motif, international and CDE (Common Desktop Environment) support, on-line help, regression testing, and character-based terminal support. Using the GUI-based SAM (System Administration Manager), an HP-UX administrator can configure and manage auditing and security, backup and recovery, disks and file systems, diskless cluster configuration, the kernel and devices, networks, peripherals, printers, processes, and user and group accounts. New with version 10.0 is a major reorganization of SAM, with emphasis on typical administrative tasks. The administrator can also delegate such tasks to other users--with appropriate security restrictions--and add user-defined utilities to the SAM menus.

Version 10.0's Software Distributor, or SD-UX, includes tools that package, distribute, and manage applications and operating-system software, as well as data. With SD-UX, users can pull software off the network and install it locally. Using an add-on product, administrators will be able to push software across the network to any of the nodes in the system. SD-UX runs on top of a DCE (Distributed Computing Environment) RPC (remote procedure call) and will exploit a secure RPC, as well as DCE authorization, authentication, and directory services where available. SD-UX also lets customers define bundles made of products and partial products, install them on their systems, interrogate systems to determine what is installed, and remove software from systems.

Making It More Standard

HP-UX 10.0 adds support for many industry standards. It complies with most of the COSE SPEC 1170 and lacks only System V signals and internationalized curses, both of which are due later this year. The HP-UX real-time scheduler, available since 1986, complies with Posix 1003.1b, which defines interfaces to a real-time scheduler and a set of high-resolution timers.

HP-UX 10.0 also converts to the standard Unix SVR4 file system layout, so SVR4-oriented users can find files in the directories where they're traditionally kept. (Links to the HP-UX 9.0-style directories ensure compatibility with the prior HP-UX tradition.)

HP-UX 10.0 bundles the DCE client. The base technology, from OSF 1.0.3, has been upgraded to include the security and RPC features of OSF 1.1. The disk footprint for the DCE client shrinks by 75 percent, and the RPC code has been tuned for about a 30 percent performance boost. Version 10.0 also bundles the Streams architecture with the base HP-UX product.

With NFS 4.2 support, version 10.0 enables diskless systems to be served using the NFS diskless protocol. Initially for HP systems only, this capability will later extend to other vendors' systems, too. Version 10.0 also offers full support for the 4-byte EUC (Extended Unix Code) code sets, so programmers can internationalize their applications.

HP considers HP-UX a unified product--one that is equally at home on a small uniprocessor desktop machine or a 12-way SMP superserver in the data center.

High Availability with MC/ServiceGuard


(A) Administrators define packages of resources that applications need to run, and they assign packages to servers. (B) When a server fails, MC/ServiceGuard migrates packages to backup servers in the cluster.

John Sontag is the chief architect for HP-UX at Hewlett-Packard and has been working with HP-UX on PA-RISC systems since 1983. You can reach him on the Internet at sontag@cup.hp.com or on BIX c/o "editors."
Copyright © 1994-1996