CPU monitoring and tuning

Get rid of your CPU bottlenecks and improve performance

Wayne Huang (huangw@us.ibm.com), Senior Consultant, IBM
Wayne Huang is a Senior Consultant for the IBM eServer pSeries and AIX systems, with a focus on e-business, banking, finance, and securities industries. He provides AIX support to ISVs in the areas of application design, problem determination, system performance tuning, and application benchmarks. He holds a BS in Physics from National Taiwan University and an MS in Computer Science from the University of Texas in Austin, TX. You can reach him at huangw@us.ibm.com.
Lee Cheng (chenglc@us.ibm.com), Senior Consultant, IBM
Lee Cheng currently works as a Senior Consultant for pSeries (RS/6000) and AIX software vendors. She provides support to them in the areas of application benchmarks, performance tuning, application porting, and internationalization. Before joining the RS/6000 ISV Technical Support group, she was a developer for compilers and the AIX system management component. She holds an MS in Computer Science from the University of Kentucky. You can reach her at chenglc@us.ibm.com.
Matthew Accapadi (accapadi@us.ibm.com), Senior Software Engineer, IBM
Matthew Accapadi is an IBM Senior Software Engineer responsible for performance of AIX and Oracle on AIX, tuning benchmarks for optimal performance, and teaching courses on AIX performance tuning. He works with other vendors to improve their application performance on AIX. He has a BS in Computer Science from Texas A&M University. You can reach him at accapadi@us.ibm.com.
Nam Keung, Senior Programmer, IBM
Nam Keung works as a Senior Programmer at IBM. Nam works in the area of AIX communication development, AIX multimedia, SOM/DSOM development, and Java™ performance. His current assignment involves helping ISVs in application design, deploying applications, performance tuning, and education for the pSeries platform. You can contact him at namkeung@us.ibm.com.

Summary:  Learn how standard AIX® tools can help you determine CPU bottlenecks. IBM performance experts show you how to interpret the reports generated by these tools for CPU utilization, thread priority, and scheduling to improve performance. They also provide two case studies to give you real-world examples.

Date:  28 Jul 2005 (Published 01 Mar 2002)
Level:  Introductory
Also available in:   Korean  Russian

Introduction

AIX 5L™ Version 5.3 is the latest version of the AIX® operating system that offers simultaneous multi-threading (SMT) on eServer™ p5 systems to deliver industry leading throughput and performance levels. With support for advanced virtualization, AIX 5L Version 5.3 helps you to dramatically increase your server utilization and consolidate workloads for more efficient management.

A review of computing history and operating systems shows that computer scientists have developed many CPU scheduling policies. First-in, first-out (FIFO), shortest job first, and round robin are just a few. Scheduling policies are important because a single policy might not be best suited to all applications. Some applications in certain workloads can run well in a default scheduling policy. However, the same applications with a different workload might require a scheduling policy adjustment in order to achieve the optimal performance.

Note: This article has been updated for AIX 5L Version 5.3 performance, with enhancements that emphasize its features, tools, and capabilities. Advanced virtualization is not discussed in this article.


What is SMT?

SMT is the ability of a single physical processor to concurrently dispatch instructions from more than one hardware thread. In AIX 5L Version 5.3, a dedicated partition created with one physical processor is configured as a logical two-way by default. Two hardware threads can run on one physical processor at the same time. SMT is a good choice when overall throughput is more important than the throughput of an individual thread. For example, Web servers and database servers are good candidates for SMT.

Viewing processor and attribute information

By default, SMT is enabled, as shown in Listing 1 below.


Listing 1. SMT
# smtctl

This system is SMT capable.
SMT is currently enabled.
SMT threads are bound to the same physical processor.

Proc0 has 2 SMT threads
Bind processor 0 is bound with proc0
Bind processor 2 is bound with proc0

Proc2 has 2 SMT threads
Bind processor 1 is bound with proc2
Bind processor 3 is bound with proc2

# lsattr -El proc0
frequency   1656376000     Processor Speed       False
smt_enabled true           Processor SMT enabled False
smt_threads 2              Processor SMT threads False
state       enable         Processor state       False
type        PowerPC_POWER5 Processor type        False

The smtctl command gives privileged users and applications the ability to control the utilization of processors with SMT support. With this command, you can turn SMT on or off, either immediately (now) or at the next boot (boot). The smtctl command syntax is:

smtctl [-m off | on [ -w boot | now] ]

What are shared processors?

Shared processors are physical processors that are allocated to partitions on a timeslice basis. Any physical processor in the shared processor pool can be used to meet the execution needs of any partition that uses the pool. An eServer p5 system can contain a mix of shared and dedicated partitions. A partition must be all shared or all dedicated, and you cannot use dynamic LPAR (DLPAR) commands to change between the two. You need to bring down the partition and switch it from using dedicated to shared, or vice versa.

Processing units

After a partition is configured, you can assign it an amount of processing units. A partition must have a minimum of 1/10 of a processor. After that requirement has been met, you can configure processing units at a granularity of 1/100 of a processor. A partition that uses shared processors is often called a shared partition. A dedicated partition is one that uses dedicated processors.

Each partition is configured with a percentage of execution dispatch time for each 10 milliseconds (ms) timeslice. For example:

  • A partition with 0.2 processing units is entitled to 20 percent capacity during each timeslice.
  • A partition with 1.8 processing units is entitled to 18ms processing time for each 10ms timeslice (using multiple processors).
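The entitlement arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rules stated in this section (a 10ms timeslice, a minimum of 1/10 of a processor, and a granularity of 1/100); the function name entitled_ms is hypothetical and is not an AIX interface.

```python
from fractions import Fraction

TIMESLICE_MS = 10  # length of one dispatch timeslice, as stated in the text


def entitled_ms(processing_units: str) -> Fraction:
    """Return the entitled dispatch time (ms) per 10ms timeslice.

    processing_units is a decimal string such as "0.2" or "1.8".
    """
    units = Fraction(processing_units)
    if units < Fraction(1, 10):
        raise ValueError("a partition needs at least 0.10 processing units")
    # Processing units are configured in steps of 1/100 of a processor.
    if (units * 100).denominator != 1:
        raise ValueError("processing units are configured in steps of 0.01")
    return units * TIMESLICE_MS


if __name__ == "__main__":
    for units in ("0.2", "1.8"):
        print(units, "processing units:", float(entitled_ms(units)), "ms per timeslice")
```

A value above 10ms, such as the 18ms for 1.8 processing units, can only be delivered by dispatching the partition on more than one physical processor during the same timeslice.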

There is no accumulation of unused cycles. If a partition does not use the entitled processing capacity, the excess processing time is ceded back to the shared processing pool.

Partitions with shared processors are either capped or uncapped. A capped partition is assigned a hard capacity limit and cannot exceed its entitled processing units. If an uncapped partition needs extra CPU cycles (more than its total processing units), it can use unused capacity in the shared pool.


Scheduling algorithms

AIX 5 implements the following scheduling policies: FIFO, round robin, and a fair round robin. The FIFO policy has three different implementations: FIFO, FIFO2, and FIFO3. The round robin policy is named SCHED_RR in AIX, and the fair round robin is called SCHED_OTHER. We discuss these policies in greater detail in the upcoming sections.

Scheduling policies can have a major impact on system performance (response time and throughput), depending on how you assign and manage them. For example, FIFO is a good choice for a job that uses a lot of CPU, but it can also choke out all of the other jobs waiting in line. A basic round robin gives a "timeslice" or "quantum" to each job in a time-shared manner. As a result, it tends to discriminate against I/O-intensive tasks, since those tasks often give up the CPU voluntarily due to I/O wait. The fair round robin is "fair" because scheduling priorities change as the jobs accumulate quantums of CPU time during execution. This allows the operating system to demote a CPU hugger so that an I/O-bound job has a fair chance to use the CPU resource.

Let's go over two important concepts before getting into the scheduling details: the nice value and the AIX priority and run queue structure.

The nice and renice commands

AIX has two important scheduling commands: nice and renice. A user job in AIX carries a base priority level of 40 and a default nice value of 20. Together, these two numbers form the default priority level of 60. This value applies to most of the jobs you see in a system.

When you start a job with a nice command, such as nice -n 10 myjob, the number 10 becomes the delta_NICE. This number is added to the default 20 to create the new nice value of 30. In AIX, the higher this number, the lower the priority. Using this example, your job now starts with a priority of 70, which is 10 levels worse in priority than the default.

The renice command applies to a job that has already started. For example, the renice -n 5 -p 2345 command causes process 2345 to have a nice value of 25. Note that the renice value is always applied to a base nice of 20, regardless of the current nice value of the process.
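The nice arithmetic described above can be written out as a short Python sketch. It follows only the numbers given in this section (base priority 40, default nice value 20) and gives the starting priority before any CPU usage penalty is applied; the function names are hypothetical.

```python
BASE_PRIORITY = 40  # base priority level of a user job
DEFAULT_NICE = 20   # default nice value


def nice_value(delta_nice: int = 0) -> int:
    """Nice value after 'nice -n delta_nice' or 'renice -n delta_nice'.

    The delta is always applied to the default of 20, which is why
    renice does not accumulate on top of an earlier nice value.
    """
    return DEFAULT_NICE + delta_nice


def starting_priority(delta_nice: int = 0) -> int:
    """Starting priority level; a higher number means a less favorable priority."""
    return BASE_PRIORITY + nice_value(delta_nice)


if __name__ == "__main__":
    print(starting_priority())    # default job
    print(starting_priority(10))  # nice -n 10 myjob
    print(nice_value(5))          # renice -n 5 -p 2345
```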

AIX priority and run queue structure

A thread carries a priority range from 0 to 255 (the range is from 0 to 127 on systems prior to AIX 5). Priority 0 is the highest or the most favorable, and 255 is the lowest or least favorable. AIX maintains a run queue in the form of a 256-level priority queue to efficiently support the 256 priority levels of threads.

AIX also implements a 256-bit array to map to the 256 levels of the queue. If a particular queue level is empty, the corresponding bit is set to 0. This design allows the AIX scheduler to quickly identify the first non-empty level and start the first ready-to-run job in that level. See the AIX run queue structure in Figure 1 below.
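The idea of pairing a multilevel queue with a bit array can be illustrated with a small Python model. This is a sketch of the technique only, not the AIX kernel data structure; the class and method names are hypothetical.

```python
from collections import deque


class RunQueue:
    """Toy model of a 256-level priority queue with a bitmap of non-empty levels."""

    LEVELS = 256

    def __init__(self):
        self.bitmap = 0  # bit i is 1 when level i holds at least one thread
        self.levels = [deque() for _ in range(self.LEVELS)]

    def enqueue(self, thread, priority):
        self.levels[priority].append(thread)
        self.bitmap |= 1 << priority

    def dispatch(self):
        """Remove and return the first thread of the best non-empty level."""
        if self.bitmap == 0:
            return None
        # The lowest set bit marks the most favorable level (0 is best).
        level = (self.bitmap & -self.bitmap).bit_length() - 1
        thread = self.levels[level].popleft()
        if not self.levels[level]:
            self.bitmap &= ~(1 << level)  # level is empty again
        return thread


if __name__ == "__main__":
    rq = RunQueue()
    rq.enqueue("a", 60)
    rq.enqueue("b", 40)
    rq.enqueue("c", 60)
    print(rq.dispatch(), rq.dispatch(), rq.dispatch())
```

The bitmap lets the lookup skip every empty level in one step instead of scanning all 256 queues.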


Figure 1. Scheduler run queue

In Figure 1, the scheduler maintains a run queue of all the threads that are ready to be dispatched. All dispatchable threads of a given priority occupy consecutive positions in the run queue.

AIX 5L implements one run queue for each CPU and a global queue. For example, there are 32 run queues and one global queue in an eServer pSeries® p590 machine. With a per-CPU run queue, a thread has a better chance to go back to the same CPU after a preemption, which is an affinity enhancement. Also, the contention among CPUs to lock the run queue structure is much reduced with multiple run queues.

However, for some situations, a multiple run queue structure might not be desirable. Exporting the environment variable RT_GRQ=ON causes a thread to be placed on the global run queue when it becomes runnable. This can improve performance for threads that are interrupt-driven and running SCHED_OTHER. If schedo -o fixed_pri_global=1 is run on AIX 5L Version 5.2 and later, threads running with a fixed priority are placed on the global run queue.

For local run queues, the dispatcher picks the best priority thread in the run queue when a CPU is available. When a thread has been running on a CPU, it tends to stay on that CPU's run queue. If that CPU is busy, then the thread can be dispatched to another idle CPU and assigned to that CPU's run queue.

FIFO

Although the FIFO policy is the simplest, it is rarely used because of its non-preemptive nature. A thread with this scheduling policy runs all the way to completion, unless one of the following happens:

  • It gives up the CPU voluntarily by executing a function that would put the thread to sleep, such as sleep() or select().
  • It gets blocked due to resource contention.
  • It has to wait for I/O completion.

The checkout lane at a grocery store uses a typical FIFO policy. Imagine yourself in the checkout lane with only one TV dinner (and you're hungry), but the person in front has a full load in his cart. What can you do? Not much. Since this is a FIFO, you must wait patiently for your turn.

Similarly, it is obvious that job response time can suffer severely if several tasks are running FIFO mode in AIX. Consequently, FIFO is rarely used in AIX. Only a process owned by root can set itself or another thread to FIFO with the thread_setsched() system call.

There are two variations of the FIFO policy: FIFO2 and FIFO3. With FIFO2, a thread is put at the head of its run queue if it was asleep for only a short period of time, that is, fewer than a predefined number of ticks (affinity_lim ticks, tunable with the schedo -p command). This gives the thread a good chance to reuse the cache content. With FIFO3, a thread is always put at the head of the queue when it becomes runnable.

Round robin

The well-known round robin scheduling policy is even older than UNIX® itself. AIX 5L implements round robin on top of its multilevel priority queue of 256 levels. At a given priority level, a round robin thread shares the CPU timeslices with all other entries of the same priority. A thread is scheduled to run until one of the following occurs:

  • It yields the CPU to other tasks.
  • It is blocked for I/O.
  • It uses up its timeslice.

When the timeslice is exhausted, if a thread of equal or better priority is available to run on that CPU, the thread that is currently running is then placed at the end of the queue for the next turn to own the processor. A thread can be preempted because of a higher priority job waking up or a device interrupt (for example, after an I/O is done).

For a round robin task only, this preempted thread is placed at the beginning of its queue level, because AIX wants to ensure that a round robin job has a full timeslice before it is moved to the end of the round robin chain. It is important to note that the priority of a round robin thread is fixed and does not change over time. This makes the priority of a round robin task persistent (as opposed to the changing priorities in fair round robin) and more predictable.
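The two requeue rules for a round robin thread can be sketched as follows. This is a simplified model of a single priority level using a Python deque; the function name requeue and its parameters are hypothetical.

```python
from collections import deque


def requeue(level: deque, thread: str, timeslice_used_up: bool) -> None:
    """Put a round robin thread back on its priority level.

    A thread that exhausted its timeslice goes to the end of the level.
    A thread that was preempted before its timeslice ran out goes to the
    beginning, so that it still gets a full timeslice.
    """
    if timeslice_used_up:
        level.append(thread)
    else:
        level.appendleft(thread)


if __name__ == "__main__":
    level = deque(["b", "c"])
    requeue(level, "a", timeslice_used_up=True)
    print(list(level))

    level = deque(["b", "c"])
    requeue(level, "a", timeslice_used_up=False)
    print(list(level))
```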

Since a round robin thread has special status, only root can set a thread to run with the round robin scheduling policy. To set SCHED_RR for a thread, use one of the following application programming interfaces (APIs): thread_setsched() or setpri().

SCHED_OTHER

This last scheduling policy is also the default. While trying to establish the fairest policy among tasks, this innovative SCHED_OTHER algorithm was created with a not so innovative POSIX™-defined name. The AIX SCHED_OTHER is a priority-queue round robin design at the core, with one major difference: the priority is no longer fixed. If a task is using an excessive amount of CPU time, its priority level should be downgraded to allow other jobs an opportunity to access the CPU.

If a task is at a priority level so low (a high number) that it does not have an opportunity to run, then its priority should be upgraded to a higher level (a lower number) so it can run to completion. A new concept was also implemented to further enhance the effectiveness of the nice value: If a task is nice (the UNIX nice value) at the beginning, the system then forces it to be nice all the time. We discuss this feature later.

Traditional CPU utilization

Prior to AIX 5.3, or with SMT disabled, AIX processor utilization uses a sample-based approach to approximate the percentage of processor time spent:

  • Executing user programs
  • Executing system code
  • Waiting for disk I/O
  • Idle

AIX produces 100 interrupts per second to take samples. At each interrupt, a local timer tick (10ms) is charged to the current running thread that is preempted by the timer interrupt. One of the following utilization categories is chosen based on the state of the interrupted thread:

  • If the thread was executing code in the kernel using a system call, the entire tick is charged to the process system time.
  • If the thread was executing application code, the entire tick is charged to the process user time.
  • Otherwise, if the current running thread was the operating system's idle process, the tick is charged to a separate variable.

The problem with this method is that the process receiving the tick most likely did not run for the entire timer period; it just happened to be executing when the timer expired.

With SMT enabled in AIX 5.3, the traditional utilization metrics are misleading because the two hardware threads are treated as two logical processors. If one thread is 100 percent busy and the other is idle, the result would be 50 percent utilization. But in reality, if one SMT thread is using all CPU resources, then that CPU is 100 percent busy, as reported using the new Processor Utilization Resource Register (PURR)-based method.

PURR

Beginning in AIX 5.3, the number of dispatch cycles for each thread can be measured using a new register called the PURR. The PURR is provided by the POWER5 processor and gives an actual count of the physical processing time units that a logical processor has used. Each physical processor has two PURRs (one for each hardware thread). All performance tools and APIs use this PURR value to report CPU utilization metrics for SMT systems. This special-purpose register can be read or written by the POWER™ Hypervisor™; however, it is read-only for the operating system. The hardware increments for the PURRs are based on how each thread is using the resources of the processor, including the dispatch cycles that are allocated to each thread. For a cycle in which no instructions are dispatched, the PURR of the thread that last dispatched an instruction is incremented. The register advances automatically so that the operating system can always get the current up-to-date value.

When the processor is in single-thread mode, the PURR increments by one every eight processor clock cycles. When the processor is in SMT mode, the thread that dispatches a group of instructions in a cycle increments the counter by 1/8 in that cycle. If no group dispatch occurs in a given cycle, both threads increment their PURR by 1/16. Over a period of time, the sum of the two PURR registers, when running in SMT mode, should be very close to, but not greater than, the number of timebase ticks.
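The SMT-mode increment rules in the preceding paragraph can be modeled in Python with exact fractions. This sketch covers only the two rules stated there (1/8 to the dispatching thread, 1/16 to both threads when nothing is dispatched), and it assumes that the timebase advances once every eight cycles, which is inferred from the single-thread behavior rather than stated in the article. The function name simulate_purr is hypothetical.

```python
from fractions import Fraction


def simulate_purr(dispatch_trace):
    """Return the two PURR values after replaying a per-cycle dispatch trace.

    Each trace entry is 0 or 1 (the hardware thread that dispatched a group
    of instructions in that cycle) or None (no group dispatch in that cycle).
    """
    purr = [Fraction(0), Fraction(0)]
    for who in dispatch_trace:
        if who is None:
            # No dispatch: both threads advance by 1/16.
            purr[0] += Fraction(1, 16)
            purr[1] += Fraction(1, 16)
        else:
            # The dispatching thread advances by 1/8.
            purr[who] += Fraction(1, 8)
    return purr


if __name__ == "__main__":
    trace = [0] * 8 + [1] * 4 + [None] * 4  # 16 cycles
    purr0, purr1 = simulate_purr(trace)
    timebase_ticks = Fraction(len(trace), 8)  # assumed: one tick per 8 cycles
    print(purr0, purr1, purr0 + purr1, timebase_ticks)
```

Because every cycle adds exactly 1/8 in total, the sum of the two counters in this model tracks the assumed timebase and never exceeds it.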

AIX 5.3 CPU utilization

In AIX 5L V5.3, the kernel collects new metrics that are state-based rather than sample-based. State-based means that information is collected based on PURR increments rather than on a set time of 10ms. AIX 5.3 uses the PURR for process accounting. Instead of charging the entire 10ms clock tick to the interrupted process as before, processes are charged based on the PURR delta for the hardware thread since the last interval. At each interrupt:

  • The elapsed PURR is calculated for the current sample period.
  • This value is added to the appropriate utilization category (user, sys, iowait, and idle), instead of the fixed-size increment (10ms) that was previously added.

There are two different things to measure: the thread's processor time and the elapsed time. To measure the elapsed time, the timebase register (TB) is still used. The physical resource utilization metrics for a logical processor are:

  • (delta PURR/delta TB) represents the fraction of the physical processor consumed by a logical processor.
  • (delta PURR/delta TB) * 100 over an interval represents the percentage of dispatch cycles given to a logical processor.
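As a minimal sketch, the two ratios above can be computed from a pair of register snapshots. The snapshot values in the usage lines are made-up numbers for illustration, and the function names are hypothetical; reading the real registers is done by the kernel and the performance tools.

```python
def physical_fraction(delta_purr: float, delta_tb: float) -> float:
    """Fraction of a physical processor consumed by one logical processor."""
    if delta_tb <= 0:
        raise ValueError("the timebase must advance during the interval")
    return delta_purr / delta_tb


def dispatch_percent(delta_purr: float, delta_tb: float) -> float:
    """Percentage of dispatch cycles given to one logical processor."""
    return 100.0 * physical_fraction(delta_purr, delta_tb)


if __name__ == "__main__":
    # Hypothetical snapshots taken at the start and end of an interval.
    purr_start, purr_end = 1_000, 1_500
    tb_start, tb_end = 10_000, 11_000
    print(physical_fraction(purr_end - purr_start, tb_end - tb_start))
    print(dispatch_percent(purr_end - purr_start, tb_end - tb_start))
```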

CPU utilization example

Assume two threads are running on one physical processor with SMT enabled. Both SMT threads of a physical CPU are busy. Using the old tick-based method, both SMT threads would be reported as 100 percent busy but, in reality, they are really sharing the CPU resources evenly. This means the new PURR-based method would show each SMT thread as 50 percent busy.

Using the PURR methods, each logical processor reports a utilization of 50 percent representing the proportion of physical processor resources that it used, assuming equal distribution of physical processor resources to both the hardware threads.

Additional CPU utilization metrics

The following metrics use the per-thread PURR method to measure the thread's processor time and use the TB register to measure the elapsed time.


Table 1. Per-thread PURR method
Additional CPU utilization metrics | Information provided
%sys = (delta PURR in system mode/entitled PURR) * 100, where entitled PURR = (ENT * delta TB) and ENT is the entitlement in number of processors (entitlement/100) | Physical CPU utilization metrics are calculated using the PURR-based samples and entitlement.
Sum of (delta PURR/delta TB) for each logical processor in a partition | The Physical Processor Consumed (PPC) over an interval.
(PPC/ENT) * 100 | The percentage of entitlement consumed.
(delta PIC/delta TB), where PIC is the Pool Idle Count, which represents the clock ticks where the POWER Hypervisor was idle | It provides the available pool of processors.
Sum of traditional 10ms tick-based %sys and %user | Logical processor utilization helps you to determine if more virtual processors should be added to a partition.
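The first three formulas in Table 1 can be combined into a short Python sketch. The input numbers in the usage lines are invented for illustration, and the function names are hypothetical; the tools listed later in this article report these values directly.

```python
def physical_consumed(delta_purrs, delta_tb):
    """PPC: sum of (delta PURR / delta TB) over all logical processors."""
    return sum(delta_purr / delta_tb for delta_purr in delta_purrs)


def entitlement_consumed_percent(ppc, ent):
    """Percentage of entitlement consumed: (PPC / ENT) * 100."""
    return ppc / ent * 100


def sys_percent(delta_purr_sys, ent, delta_tb):
    """%sys: delta PURR in system mode relative to the entitled PURR."""
    entitled_purr = ent * delta_tb
    return delta_purr_sys / entitled_purr * 100


if __name__ == "__main__":
    delta_tb = 1000                       # elapsed timebase ticks (made up)
    per_lcpu_purr = [300, 200, 100, 0]    # PURR deltas of four logical CPUs (made up)
    ent = 0.5                             # entitlement of half a processor
    ppc = physical_consumed(per_lcpu_purr, delta_tb)
    print(ppc, entitlement_consumed_percent(ppc, ent), sys_percent(50, ent, delta_tb))
```

In this made-up interval the partition consumes more than its entitlement, which is only possible for an uncapped partition.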

AIX 5.3 command changes

When AIX is running with SMT enabled, commands that display CPU information, such as vmstat, iostat, topas, and sar, display the PURR-based statistics rather than the traditional sample-based statistics. In SMT mode, additional columns of information are displayed, as shown in Table 2 below.


Table 2. SMT mode
Column | Description
pc or physc | Physical Processor Consumed by the partition
pec or %entc | Percentage of Entitlement Consumed by the partition

The trace and trcrpt commands, along with several other tools that are based on the trace utility, also needed modification. In an SMT environment, trace can optionally collect PURR register values at each trace hook, and trcrpt can display the elapsed PURR.

Table 3 below shows the arguments to use for SMT.


Table 3. Arguments for SMT
Argument | Description
trace -r PURR | Collects the PURR register values. Only valid for a trace run on a 64-bit kernel.
trcrpt -O PURR=[on|off] | Tells trcrpt to show the PURR, along with any timestamps.
netpmon -r PURR | Uses the PURR time instead of timebase in percent and CPU calculation. Elapsed time calculations are unaffected.
pprof -r PURR | Uses the PURR time instead of timebase in percent and CPU calculation. Elapsed time calculations are unaffected.
gprof | GPROF is the new environment variable to support SMT.
curt -r PURR | Specifies the use of the PURR register to calculate CPU times.
splat -p | Specifies the use of the PURR register to calculate CPU times.

Thread priority formulas

You can calculate the priority of a thread using the formulas shown in Listing 2 below. The priority is a function of the nice value, the CPU usage C, and a tuning factor r.


How AIX calculates the new priority

The clock timer interrupt occurs every 10ms or 1 tick on each CPU. The timers are staggered so that a CPU's clock timer does not go off at the same time as another CPU's clock timer. When the CPU clock timer interrupt occurs (even before the thread has run for a full 10ms), the thread has its CPU usage value (the CPU charge) incremented by one, up to a maximum of 120. If a job does not get a full 10ms slice and is running RR policy, the system dispatcher changes the thread's priority in the run queue to allow it to run again soon.

The priority of most user processes varies with the amount of CPU time the process has used recently. The CPU scheduler's priority calculations are based on two parameters that are set with schedo: sched_R and sched_D. The sched_R and sched_D values are expressed in thirty-seconds (1/32). The scheduler uses the following formula to calculate the amount to add to a process's priority value as a penalty for recent CPU use:

CPU penalty = (recently used CPU value of the process) * (r/32)

The recalculation (once per second) of the recently used CPU value of each process is:

New recently used CPU value = (old recently used CPU value of the process) * (d/32)

Both r (sched_R parameter) and d (sched_D parameter) have default values of 16.
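The penalty and decay formulas can be expressed as two small Python functions. Integer (truncating) division is an assumption made here because the worked values in Listing 3 are whole numbers; the function names are hypothetical.

```python
DEFAULT_R = 16  # default sched_R
DEFAULT_D = 16  # default sched_D


def cpu_penalty(recent_cpu: int, r: int = DEFAULT_R) -> int:
    """Priority penalty for recent CPU use: recent_cpu * r / 32."""
    return recent_cpu * r // 32


def decay(recent_cpu: int, d: int = DEFAULT_D) -> int:
    """Once-per-second recalculation: recent_cpu * d / 32."""
    return recent_cpu * d // 32


if __name__ == "__main__":
    print(cpu_penalty(100))     # default r: two ticks cost one priority level
    print(cpu_penalty(100, 2))  # small r: the penalty grows slowly
    print(decay(100))           # default d: half of the charge is waived
```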

The recent CPU charge C is then used to determine the priority penalty and to recalculate the new thread priority. Using the first formula as a reference (see Listing 2), you know that a newly started user task, which carries a base priority 40, a default nice value of 20, and no CPU charge so far (C=0), begins with a priority level 60.

Also, in the first formula, the value r determines the penalty ratio with a range from zero to 32. An r value of zero means a no-charge penalty for the CPU, since it is always zero (C*r/32). If r=32, it yields the highest possible penalty charge for a CPU -- each tick (10ms) of CPU usage translates to one priority-level downgrade.

In most cases, the value of r lies near the middle between zero and 32. AIX defaults r to 16; that is, every two ticks of CPU charge become one level of priority penalty. When the r value is high, the impact of a nice value becomes less important since the CPU usage penalty prevails. A smaller r, on the contrary, makes the effect of the nice value more obvious.

Based on this discussion, the effectiveness of the nice value diminishes after a while. The reason for this is because the CPU charge grows in time and gradually becomes the main factor in determining the new priority.

This formula has been modified in AIX 5L to increase the weight of the nice value in calculating the priority level. Two new factors have been introduced: x_nice and x_nice_factor ("extra nice" and "extra nice factor"). See the second formula in Listing 2 below.


Listing 2. Thread priority formulas
<Formula 1 : The Basic Formula>
Priority = p_nice + (C * r/32)                 (1)

<Formula 2 : for AIX 5L>
Priority = x_nice + (C * r/32 * x_nice_factor) (2)
Where:
   p_nice = base_PRIORITY + NICE
      base_PRIORITY = 40
      NICE = 20 + delta_NICE
      (20 is the default nice value)
      That is,
   p_nice = 60 + delta_NICE

   C is the CPU usage charge
      The maximum value of C is 120
   If NICE <= 20 then x_nice = p_nice
   If NICE > 20 then
   x_nice = p_nice * 2 - 60 or
   x_nice = p_nice + delta_NICE, or         (3)
   x_nice = 60 + (2 * delta_NICE)           (3a)
   x_nice_factor = (x_nice + 4)/64          (4)
   Priority has a maximum value of 255

As you can see from Formulas 2 and 3, x_nice now counts the nice increase twice. The x_nice_factor further strengthens the r ratio. For example, an initial nice of 16, which gives a nice value of 36, results in an x_nice_factor of 1.5. This means a 50 percent higher penalty for the CPU usage part over the lifetime of the thread.
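Formula 2 and its supporting definitions from Listing 2 can be transcribed into Python as a sketch. Truncating the result to a whole priority level is an assumption, and the function names simply mirror the symbols in the listing.

```python
BASE_PRIORITY = 40
DEFAULT_NICE = 20
MAX_CPU_CHARGE = 120
MAX_PRIORITY = 255


def x_nice(delta_nice: int) -> int:
    """x_nice from Listing 2: the nice increase counts twice when NICE > 20."""
    nice = DEFAULT_NICE + delta_nice
    p_nice = BASE_PRIORITY + nice
    if nice <= DEFAULT_NICE:
        return p_nice
    return p_nice + delta_nice  # same as 60 + 2 * delta_NICE


def x_nice_factor(delta_nice: int) -> float:
    """x_nice_factor = (x_nice + 4) / 64."""
    return (x_nice(delta_nice) + 4) / 64


def priority(delta_nice: int, cpu_charge: int, r: int = 16) -> int:
    """Formula 2: x_nice + (C * r/32 * x_nice_factor), capped at 255."""
    c = min(cpu_charge, MAX_CPU_CHARGE)
    value = x_nice(delta_nice) + c * r / 32 * x_nice_factor(delta_nice)
    return min(int(value), MAX_PRIORITY)  # truncation is an assumption


if __name__ == "__main__":
    print(x_nice(16), x_nice_factor(16))  # the "nice 16" example from the text
    print(priority(0, 0), priority(16, 120))
```

With the default nice value, x_nice_factor is exactly 1, so Formula 2 reduces to Formula 1 for jobs that were not started with nice.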

Decaying the CPU usage

It is possible that a thread can get a priority so low that it never has a chance to run. This would occur if you use only Formulas 1 and 2 without a mechanism to push a thread's priority level back up.

When a thread runs with SCHED_OTHER, its priority is degraded for its use of CPU time. When it is not running and is waiting for its turn, AIX lets it regain its priority by "decaying" its CPU charge about once a second. The rule is simple: A CPU-bound job should be assigned a lower priority to allow other jobs to run, but it should not be discriminated against to the point that it cannot finish. The CPU charge of every thread is decayed by a predefined factor once per second, as follows:

New Charge C = (Old Charge C) * d / 32              (5)

A kernel process Swapper does this job. Once every second, Swapper wakes up and handles the CPU charge decaying for all the threads. The default decay factor is 0.5 or d=16, which "discounts" or "waives" half of the CPU charge.

With this mechanism, a CPU-intensive job accumulates CPU charge, gets to a lower priority level, and then advances to a much higher level at the end of a second. On the other hand, an I/O-intensive job does not vary its priority up and down as much, since it generally accumulates less CPU time.
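The up-and-down movement of a CPU-bound job's priority can be sketched with a tiny simulation. It uses Formula 1 with the default r and d, integer arithmetic as in Listing 3, and the simplifying assumption that the thread is charged for every one of the 100 ticks in each second; a real thread would be preempted as its priority worsens. The function name simulate is hypothetical.

```python
P_NICE = 60           # base priority 40 + default nice 20
MAX_CPU_CHARGE = 120


def simulate(seconds: int, r: int = 16, d: int = 16, ticks_per_second: int = 100):
    """Return (priority before decay, priority after decay) for each second."""
    charge = 0
    history = []
    for _ in range(seconds):
        # The thread accumulates one charge per tick, up to the maximum.
        charge = min(charge + ticks_per_second, MAX_CPU_CHARGE)
        before = P_NICE + charge * r // 32
        # Once per second, swapper decays the charge: C = C * d / 32.
        charge = charge * d // 32
        after = P_NICE + charge * r // 32
        history.append((before, after))
    return history


if __name__ == "__main__":
    for second, (before, after) in enumerate(simulate(3), start=1):
        print(f"second {second}: priority {before} before decay, {after} after")
```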


Have you exhausted your CPU?

Now that you understand how the AIX scheduler prioritizes the workload, let's look at several commonly used commands. If AIX seems to take too long to finish your workload or it does not respond quickly enough, try these commands to investigate whether your system is CPU-bound: vmstat, iostat, and sar.

We do not discuss all the possible ways to use these commands, but instead emphasize the information they convey to you. For a detailed description of these commands, see your AIX publications or visit the IBM System p and AIX Information Center at http://publib16.boulder.ibm.com/pseries/index.htm. Scroll down, if necessary, and click AIX 5L Version 5.3 information center to start using the AIX 5 publications.

The priority change history of a thread

Listing 3 shows how the CPU charge can change the priority of a thread:


Listing 3. Change of CPU charge and the priority of a thread
Base priority is 40
Default NICE value is 20, assume task was run using the
default nice value
p_nice = base_priority + NICE = 40 + 20 = 60
Assume r = 2 to slow down the penalty increase (default
r value is 16)
Priority = p_nice + C*r/32 = 60 + C * r / 32
Tick 0 P = 60 + 0 * 2 / 32 = 60
Tick 1 P = 60 + 1 * 2 / 32 = 60
Tick 2 P = 60 + 2 * 2 / 32 = 60
….
Tick 15 P = 60 + 15 * 2 / 32 = 60
Tick 16 P = 60 + 16 * 2 / 32 = 61
Tick 17 P = 60 + 17 * 2 / 32 = 61
….
….
Tick 100 P = 60 + 100 * 2 / 32 = 66
Tick 100 Swapper decays all CPU usage charges for all threads.
New C CPU Charge = (Current CPU Charge) * d / 32
Assume d = 16 (the default)
For the test thread, new C = 100 * 16 / 32 = 50

Tick 101 P = 60 + 51 * 2 / 32 = 63

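Listing 4 shows two small test programs: a CPU-bound job (fast) and a job that spends its time sleeping (slow):


Listing 4. Priority change of a typical CPU-bound job (fast versus slow)
fast.c:
/* CPU-bound: spins forever and accumulates CPU charge */
int main(void) { for (;;) ; }

slow.c:
/* Sleeps for 80 seconds and accumulates almost no CPU charge */
#include <unistd.h>
int main(void) { sleep(80); return 0; }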


Common commands

The vmstat, iostat, and sar commands are used frequently for CPU monitoring. You should be familiar with the usage and the meaning of the reports each command generates.

vmstat

The vmstat command provides an overview of resource utilization through a report of CPU, disk, and memory activity in a one-line-per-report format. The sample output in Listing 5 is generated on an AIX 5L Version 5.3 system running "vmstat 1 6". This report was generated every second, as requested. Since a count of six was specified following the interval, reporting stops after the sixth report. One popular way to run the vmstat command is to leave out the count parameter; vmstat then generates reports continuously until the command terminates.

Except for the avm and fre columns, the first report contains average statistics per second since system startup. Subsequent reports contain statistics collected during the interval since the previous report.

Beginning with AIX 5L Version 5.3, the vmstat command reports the number of physical processors consumed (pc) and the percentage of entitlement consumed (ec) in the Micro-Partitioning™ and SMT environments. These metrics only display on Micro-Partitioning and SMT environments.

AIX 5L adds a useful new option "-I" to vmstat that shows the number of threads waiting for the raw I/O to complete (p column) and the number of file pages paged in/out per second (fi/fo columns).

The following detailed descriptions of the columns convey useful information about CPU utilization. Listing 5 shows the output of the vmstat 1 6 command:


Listing 5. Output of the vmstat 1 6 command from a p520 system (two CPUs)
vmstat 1 6 
System configuration: lcpu=4 mem=15808MB  
kthr   memory     page     faults      cpu 
-----  -------    ------   --------    ----------- 
r b avm    fre    re  pi  po  fr  sr  cy  in   sy   cs  us sy id wa 
1 1 110996 763741 0   0   0   0    0   0 231   96   91  0  0  99  0 
0 0 111002 763734 0   0   0   0    0   0 332 2365  179  0  1  99  0 
0 0 111002 763734 0   0   0   0    0   0 330 2283  139  0  5  93  1 
0 0 111002 763734 0   0   0   0    0   0 310 2212  153  0  0  99  0 
1 0 111002 763734 0   0   0   0    0   0 314 2259  173  0  0  99  0 
0 0 111002 763734 0   0   0   0    0   0 321 2261  177  0  1  99  0

Figure 2 shows the output of the command vmstat -I 1 (issued during a software installation):


Figure 2. Output of the vmstat -I 1 command

See Table 4 below for a listing of relevant columns with descriptions.


Table 4. Description of relevant columns
Column  Description
kthr    Kernel thread state changes per second over the sampling interval.
r       Number of kernel threads placed in run queue.
b       Number of kernel threads placed in the Virtual Memory Manager (VMM) wait queue (awaiting resource, awaiting input/output).
p       Number of threads waiting on raw I/Os (bypassing journaled file system (JFS)) to complete. Note: This column is available only on AIX 5 and later systems.
fi/fo   Number of file pages paged in/out per second. Note: This column is available only on AIX 5 and later systems.
cpu     Breakdown of percentage usage of CPU time. For multiprocessor systems, CPU values are global averages among all processors. Also, the I/O wait state is defined system-wide and not per processor.
us      Average percentage of CPU time executing in the user mode.
sy      Average percentage of CPU time executing in the system mode.
id      Average percentage of time that CPUs were idle and the system did not have an outstanding disk I/O request.
wa      CPU idle time during which the system had outstanding disk/NFS I/O request(s). If there is at least one outstanding I/O to a disk when wait is running, the time is classified as waiting for I/O. Unless asynchronous I/O is being used by the process, an I/O request to disk causes the calling process to block (or sleep) until the request has been completed. Once an I/O request for a process completes, the process is placed on the run queue. If the I/Os were completing faster, more CPU time could be used.
pc      Number of physical processors consumed. Displayed only if the partition is running with shared processors.
ec      Percentage of entitled capacity consumed. Displayed only if the partition is running with shared processors.

A CPU is marked wio (the wa column in vmstat, %wio in sar) at the time of a clock interrupt (every 10 ms, or 1/100 of a second) if the CPU is idling and an outstanding I/O was initiated on that CPU. If a CPU is idling with no outstanding I/O from that CPU, it is marked as id instead of wa. For example, a system with four CPUs and one thread doing I/O reports a maximum of 25 percent wio time. A system with 12 CPUs and one thread doing I/O reports a maximum of 8.3 percent wio time. To be precise, wio measures the percentage of time the CPU is idle while it waits for an I/O to complete.
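
The 25 percent and 8.3 percent ceilings follow from simple arithmetic: one waiting CPU out of N contributes at most 100/N percent to the system-wide figure. The sketch below illustrates that arithmetic only; the function name is hypothetical, and it assumes each I/O-bound thread has its outstanding I/O charged to a different CPU.

```c
/* Upper bound on the system-wide wio percentage when io_threads threads
 * each leave one CPU idle with an outstanding I/O, on a system with
 * ncpus CPUs. Assumes every waiting thread is charged to a distinct CPU. */
double max_wio_percent(int ncpus, int io_threads)
{
    if (ncpus <= 0 || io_threads <= 0)
        return 0.0;
    if (io_threads > ncpus)
        io_threads = ncpus;   /* cannot mark more CPUs than exist */
    return 100.0 * io_threads / ncpus;
}
```

For example, max_wio_percent(4, 1) gives 25 and max_wio_percent(12, 1) gives roughly 8.3, matching the figures in the text.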

The four cpu columns (us, sy, id, and wa) should total 100 percent, or very close to it. If the sum of the user and system (us and sy) CPU-utilization percentages consistently approaches 100 percent, the system might be encountering a CPU bottleneck.

iostat

The iostat command is used primarily to monitor system input and output devices, but it can also provide CPU utilization data. Beginning with AIX 5.3, the iostat command reports the number of physical processors consumed (physc) and the percentage of entitlement consumed (% entc) in Micro-Partitioning and SMT environments. These metrics are displayed only in Micro-Partitioning and SMT environments. When SMT is enabled, iostat automatically uses new PURR-based data and formulas for:

  • %user
  • %sys
  • %wait
  • %idle

Listing 6 is generated on an AIX 5L Version 5.3 system by entering "iostat 5 3", as follows:


Listing 6. iostat report
System configuration: lcpu=4 drives=9

tty: tin tout avg-cpu: %user %sys %idle %iowait
     0.0 4.3        0.2   0.6  98.8   0.4
Disks: %tm_act Kbps tps      Kb_read Kb_wrtn
hdisk0   0.0   0.2  0.0       7993    4408
hdisk1   0.0   0.0  0.0       2179    1692
hdisk2   0.4   1.5  0.3      67548   59151
cd0      0.0   0.0  0.0          0       0
tty: tin tout cpu: %user %sys %idle %iowait
     0.0 30.3       8.8   7.2  83.9    0.2
Disks: %tm_act Kbps tps      Kb_read Kb_wrtn
hdisk0   0.2   0.8  0.2          4       0
hdisk1   0.0   0.0  0.0          0       0
hdisk2   0.0   0.0  0.0          0       0
cd0      0.0   0.0  0.0          0       0
tty: tin tout cpu: %user %sys %idle %iowait
     0.0 8.4        0.2   5.8   0.0   93.8
Disks: %tm_act Kbps tps      Kb_read Kb_wrtn
hdisk0   0.0   0.0  0.0          0       0
hdisk1   0.0   0.0  0.0          0       0
hdisk2  98.4  75.6 61.9        396    2488
cd0      0.0   0.0  0.0          0       0

Example iostat with SPLPAR configuration
# iostat -t 2 3
System Configuration: lcpu=4 ent=0.80
avg-cpu   %user   %sys    %idle     %iowait   physc     %entc
           0.1     0.2     99.7       0.0      0.0       0.9
           0.1     0.4     99.5       0.0      0.0       1.1
           0.1     0.2     99.7       0.0      0.0       0.9

Just as in the vmstat command report, the first report contains average statistics since the system started up. Subsequent reports contain statistics collected during the interval since the previous report.

The four columns that show the breakdown of CPU usage time convey the same information as in the vmstat command report. The columns should total approximately 100 percent. If the sum of the user and system (%user and %sys) CPU-utilization percentages consistently approaches 100 percent, the system might be encountering a CPU bottleneck.

On systems running one application, a high I/O wait percentage might be related to the workload. On systems with many processes, some will be running while others wait for I/O. In this case, the %iowait can be small or zero because running processes "hide" some wait time. Although %iowait is low, a bottleneck can still limit application performance. If the iostat command indicates that a CPU-bound situation does not exist and %iowait time is greater than 20 percent, you might have an I/O or disk-bound situation.

sar

The sar command has two forms: The first form samples, displays, and/or saves system statistics and the second form processes and displays previously captured data. The sar command can provide queue and processor statistics just like the vmstat and iostat commands. However, it has two additional features:

  • Each sample has a leading time stamp, and an overall average appears at the end of the samples.

  • The -P option can be used to generate per-processor statistics, in addition to the global averages among all processors. Listing 7 shows sample output from a system with four logical processors that resulted from entering two commands:
    • sar -o savefile 5 3 > /dev/null & 

      Note: This command collects the data three times at five-second intervals, saves the collected data in savefile, and redirects the report to null so that no report is written to the terminal.
    • sar -P ALL -u -f savefile 

      Note: -P ALL requests statistics for each individual processor, and -u requests CPU usage data. In addition, -f savefile tells sar to generate the report using the data saved in savefile. With SMT enabled, the sar -P ALL output for all logical processors shows the physical processor consumed, physc (delta PURR/delta TB). This column shows the relative SMT split between processors; in other words, it measures the fraction of time a logical processor was getting physical processor cycles. Whenever the percentage of entitled capacity consumed is under 100 percent, a line beginning with U is added to represent the unused capacity. When running in shared mode, sar displays the percentage of entitlement consumed, %entc, which is ((PPC/ENT)*100), where PPC is the number of physical processors consumed and ENT is the entitled capacity.


Listing 7. A typical sar report from a 2-way p520 system with dedicated LPAR configuration
AIX nutmeg 3 5 00CD241F4C00    06/14/05
 
System configuration: lcpu=4
 
11:51:33 cpu    %usr    %sys    %wio   %idle   physc
11:51:34  0        0       0       0     100    0.30
          1        1       1       1      98    0.69
          2        2       1       0      96    0.69
          3        0       0       0     100    0.31
          -        1       1       0      98    1.99
11:51:35  0        0       0       0     100    0.31
          1        0       0       0     100    0.69
          2        0       0       0     100    0.73
          3        0       0       0     100    0.31
          -        0       0       0     100    2.04
11:51:36  0        0       0       0     100    0.31
          1        0       0       0     100    0.69
          2        0       0       0     100    0.70
          3        0       0       0     100    0.31
          -        0       0       0     100    2.01
11:51:37  0        0       0       0     100    0.31
          1        0       0       0     100    0.69
          2        0       0       0     100    0.69
          3        0       0       0     100    0.31
          -        0       0       0     100    2.00
 
Average   0        0       0       0     100    0.31
          1        0       0       0      99    0.69
          2        1       0       0      99    0.70
          3        0       0       0     100    0.31
          -        0       0       0      99    2.01

mpstat

The mpstat command collects and displays performance statistics for all logical CPUs in the system. If SMT is enabled, the mpstat -s command displays the usage of physical processors as well as logical processors, as shown in Listing 8 below.


Listing 8. A typical mpstat report from a 2-way p520 system with SPLPAR configuration
System configuration: lcpu=4
 
Proc0           Proc1
63.65%          63.65%
 
cpu2    cpu0    cpu1    cpu3
58.15%   5.50%  61.43%   2.22%

lparstat

The lparstat command reports LPAR-related information and utilization statistics. It displays the current LPAR-related parameters and hypervisor information, as well as utilization statistics for the LPAR. An interval mechanism retrieves a specified number of reports at a given interval.

The following statistics are displayed only when the partition type is shared:

physc   Shows the number of physical processors consumed.
%entc   Shows the percentage of the entitled capacity consumed.
lbusy   Shows the percentage of logical processor(s) utilization that occurred while executing at the user and system level.
app     Shows the available physical processors in the shared pool.
phint   Shows the number of phantom (targeted to another shared partition in this pool) interruptions received.

The following statistics are displayed only when the -h flag is specified:

%hypv   Shows the percentage of time spent in hypervisor.
hcalls  Shows the number of hypervisor calls executed.

Listing 9. A typical lparstat report from a 2-way p520 machine
System configuration: type=Dedicated mode=Capped smt=On lcpu=4 mem=15808
 
%user  %sys  %wait  %idle
-----  ----  -----  -----
  0.0   0.1    0.0   99.9
  0.0   0.1    0.0   99.9
  0.4   0.2    0.1   99.3
 
# lparstat 1 3
 
System configuration: type=Shared mode=Uncapped smt=On lcpu=2 mem=2560 ent=0.50
 
%user  %sys  %wait  %idle physc %entc  lbusy   app  vcsw phint
-----  ----  -----  ----- ----- ----- ------   ---  ---- -----
  0.3   0.4    0.0   99.3  0.01   1.1    0.0     -   346     0
 43.2   6.9    0.0   49.9  0.29  58.4   12.7     -   389     0
  0.1   0.4    0.0   99.5  0.00   0.9    0.0     -   312     0
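
The %entc column that sar, iostat, and lparstat report in shared mode is the ratio of physical processors consumed to entitlement, expressed as a percentage. The following sketch shows that calculation; the function name and the -1.0 error value are illustrative assumptions, not part of any AIX API.

```c
/* Percentage of entitled capacity consumed: (physc / ent) * 100.
 * Returns -1.0 when the entitlement is not positive, for example when no
 * entitlement is reported because the partition is dedicated. */
double entc_percent(double physc, double ent)
{
    if (ent <= 0.0)
        return -1.0;
    return physc / ent * 100.0;
}
```

For the second sample of the shared-partition report in Listing 9 (physc of 0.29, ent=0.50), entc_percent(0.29, 0.50) gives about 58. The report shows 58.4, presumably because the displayed physc value is rounded to two decimal places.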


Improving system performance

For a CPU-bound system, you can improve the system performance by manipulating thread and process priorities of a specific process or tuning the scheduler algorithm to set a different system-wide scheduling policy.

Changing user-process priority

The commands to change or set user task priority include the nice and renice commands and two system calls that allow thread priority and scheduling policy to be changed through API calls.

Using the nice command

The standard nice value of a foreground process is 20; the standard nice value of a background process is 24, if started from ksh or csh (20, if started by tcsh and bsh). The system uses the nice value to calculate the priority of all threads associated with the process. Using the nice command, a user can specify an increment or decrement to the standard nice value so that a process can be started with a different priority. The thread priority is still non-fixed and gets different values based on the thread's CPU usage.

By using nice, any user can run a command at a lower priority than normal. Only root can use nice to run commands at a priority higher than normal. For example, the command nice -5 iostat 10 3 >iostat.out causes the iostat command to start with a nice value of 25 (instead of 20), resulting in a lower starting priority. The values of nice and priority can be viewed using the ps command with the -l flag. Listing 10 shows a typical output using the ps -l command:


Listing 10. Using ps -l to observe process priority
       F S UID   PID  PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  240001 A   0 15396  5746   1  60 20 393ce   732           pts/3  0:00 ksh
  200001 A   0 15810 15396   3  70 25 793fe   524           pts/3  0:00 iostat

As root, you can run iostat at a higher priority with # nice --5 iostat 10 3 >io.out. The iostat command then runs with a nice value of 15, resulting in a higher starting priority.

Using the renice command

If a process is already running, you can use the renice command to alter the nice value, and thus the priority. The processes are identified by process ID, process group ID, or the name of the user who owns the processes. The renice command cannot be used on fixed priority processes.

Using the setpri() and thread_setsched() subroutines

Two system calls allow individual processes or threads to be scheduled with a fixed priority. The setpri() system call is process-oriented, and thread_setsched() is thread-oriented. Use caution when calling these two subroutines, since improper use might cause the system to hang.

An application that runs under the root user ID can invoke the setpri() subroutine to set its own priority or the priority of another process. The target process is scheduled using the SCHED_RR scheduling policy with a fixed priority. The change is applied to all the threads in the process. Note the following two examples:

retcode = setpri(0, 45);

Gives the calling process a fixed priority of 45.

retcode = setpri(1234, 35);

Gives the process with PID of 1234 a fixed priority of 35.

If the change is intended for a specific thread, the thread_setsched() subroutine can be used:

retcode = thread_setsched(thread_id, priority_value, scheduling_policy);

The parameter scheduling_policy can be one of the following:

SCHED_OTHER, SCHED_FIFO, or SCHED_RR.

When SCHED_OTHER is specified as the scheduling policy, the second parameter (priority_value) is ignored.

Changing the scheduling algorithm globally

AIX allows users to make changes to the priority calculation formula using the schedo command.

Adjusting r and d

As mentioned earlier, the formula for calculating the priority value is as follows:

Priority = x_nice + (C * r/32 * x_nice_factor)

The recent CPU usage value is displayed as the C column in the ps command output. The maximum value of recent CPU usage is 120. Once every second, the CPU usage value for each thread is degraded using the following formula:

New Charge C = (Old Charge C) * d / 32

The default value of r is 16; therefore, the thread priority is penalized by recent CPU usage * 0.5. The d also has a default value of 16, which means the recent CPU usage value of every process is reduced to half of its original value once every second. For some users, the default values of sched_R and sched_D do not allow enough distinction between foreground and background processes. These two values can be tuned using sched_R and sched_D options to the schedo command. Note the following two examples:

  • # schedo -o sched_R=0


    With sched_R=0 (r/32 = 0, d/32 = 0.5), the CPU penalty is always 0. The priority value of the process is effectively fixed, although the process is not treated like an RR process.

  • # schedo -o sched_D=32


    With sched_D=32 (r/32 = 0.5, d/32 = 1), long-running processes reach a C value of 120 and stay there. The recent CPU usage value is not reduced once every second, so the priority of long-running processes does not fluctuate back to low numbers (higher importance) to compete with new processes.
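
To get a feel for how sched_R and sched_D interact, the following sketch models a thread that never leaves the CPU. It is a simplified model rather than the kernel's actual code: it assumes the thread is charged 100 ticks per second, that C is capped at 120 as described above, that the decay uses integer arithmetic, and that the priority follows the simplified form p_nice + C * r / 32. All names are hypothetical.

```c
#define MAX_CPU_CHARGE   120   /* cap on recent CPU usage (C) */
#define TICKS_PER_SECOND 100   /* one clock tick every 10 ms */

/* Recent CPU usage (C) of an always-running thread after `seconds` seconds.
 * Each second the thread gains 100 ticks (capped at 120), and then the
 * once-per-second decay is applied: C = C * d / 32. */
int charge_after(int seconds, int d)
{
    int c = 0;
    int s;

    for (s = 0; s < seconds; s++) {
        c += TICKS_PER_SECOND;
        if (c > MAX_CPU_CHARGE)
            c = MAX_CPU_CHARGE;
        c = c * d / 32;
    }
    return c;
}

/* Simplified priority value right after the decay: p_nice + C * r / 32. */
int priority_after(int seconds, int p_nice, int r, int d)
{
    return p_nice + charge_after(seconds, d) * r / 32;
}
```

Under these assumptions, the defaults (r=16, d=16) settle at C = 60 after each decay, for a priority value of 90 with p_nice = 60. With sched_R=0 the priority value stays at 60 regardless of CPU use, and with sched_D=32 the charge stays pinned at 120.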

Changing the timeslice

Although the schedo command can modify the length of the scheduler timeslice, the change applies only to RR threads. It does not affect threads running with other scheduling policies. The syntax for this command is:

schedo -o timeslice=n

Here, n is the number of 10ms clock ticks to be used as the timeslice. For example, schedo -p -o timeslice=2 sets the timeslice length to 20ms.

You must log on as root to make changes using the schedo command.


Using additional techniques

Other techniques that can help a CPU-bound system include the following.

Scheduling

Depending on the relative importance of applications, you could schedule less important ones for off-shift hours using at, cron, or batch commands.

Using the mkpasswd command

If your system has thousands of entries in the /etc/passwd file, you could use the mkpasswd command to create a hashed or indexed version of the /etc/passwd file to save the CPU time spent looking up a user ID.


Tuning individual applications

The following techniques can help you diagnose and improve the performance of specific applications running under AIX.

Using the ps command

The ps command or profiling can identify an application that is consuming large fractions of CPU time. This information can then be used to narrow the search for a CPU bottleneck. After you find the problem area, you can tune up or improve the application. You might need to recompile the application or change the source code.

Using the schedo command

The schedo command is used to set or display current or next boot values for all CPU scheduler tuning parameters. This command can only be executed by the root user. The schedo command can also make permanent changes or defer changes until the next reboot. Beginning with AIX 5L Version 5.3, several tuning parameters have been added to the schedo command. Listing 11 shows all the CPU scheduler parameters.


Listing 11. CPU scheduler parameters
# schedo -a
              %usDelta = 100
          affinity_lim = 7
         big_tick_size = 1
      fixed_pri_global = 0
             force_grq = 0
       hotlocks_enable = 0
idle_migration_barrier = 4
    krlock_confer2self = n/a
  krlock_conferb4alloc = n/a
         krlock_enable = n/a
    krlock_spinb4alloc = n/a
   krlock_spinb4confer = n/a
               maxspin = 16384
    n_idle_loop_vlopri = 100
              pacefork = 10
               sched_D = 16
               sched_R = 16
 search_globalrq_mload = 256
  search_smtrunq_mload = 256
  setnewrq_sidle_mload = 384
   shed_primrunq_mload = 64
    sidle_S1runq_mload = 64
    sidle_S2runq_mload = 134
    sidle_S3runq_mload = 134
    sidle_S4runq_mload = 4294967040
    slock_spinb4confer = 1024
      smt_snooze_delay = 0
     smtrunq_load_diff = 2
             timeslice = 1
        unboost_inflih = 1
         v_exempt_secs = 2
         v_min_process = 2
           v_repage_hi = 0
         v_repage_proc = 4
            v_sec_wait = 1

Upgrading

Upgrading the system to a faster CPU or more CPUs might be necessary if tuning does not improve the performance.


Case studies

Two real-world examples show how the performance experts from IBM implemented these theories and techniques.

Case 1

Symptoms: The user has a batch script that starts up 500 other batch scripts, and each of these scripts queries and updates a database. Each script also starts as a client request from another machine. Each client request creates a database user thread on the database server machine. The response time began at less than 10 seconds for a period of time. Then the response time gradually became worse. At times it was more than a minute -- sometimes two minutes.

Diagnosis: The run queue began growing until it reached into the hundreds. Another symptom included the CPU being 100 percent utilized (this was an eight-way SMP system), with 99 percent in user mode. By examining an AIX trace sample collected for a few seconds, we saw a pattern emerge. While a thread was using the CPU, a network packet would arrive and cause a network adapter interrupt. This would take the currently running thread off its CPU so the interrupt could be serviced.

After servicing the interrupt, the scheduler checks whether any other runnable threads have a better priority than the currently running thread. Since the currently running thread had already run for a few timeslices, its priority value had increased (that is, its priority had worsened) as it accumulated CPU ticks. Each of the 500 scripts began with priority 60. If they were runnable, they would preempt any currently running thread with a priority value higher than 60. The preempted thread would then be put at the end of the run queue and have to wait for the CPU until its priority improved again.

One effect of this preemption was that sometimes a thread would be preempted while holding a database lock. Since this type of lock is implemented at the application layer within the database software, the kernel does not know that the thread is holding a lock. If the lock were a kernel-level lock or a pthread library mutex lock, the kernel could perform priority boosting and raise the lock holder's priority to the same level as that of a running thread that is requesting the lock. This way, the requesting thread does not have to wait long for the lock holder to get the CPU again and release the lock.

Since the lock in this scenario was a user lock, the database thread would spin on the lock until it exhausted its spin count (a tunable database parameter), and then go to sleep. So the 99 percent used CPU was mostly due to the threads spinning on database locks.

Prescription: After determining that priority preemption was having a negative effect, we tuned the scheduler formula, which calculates the thread priority. This particular formula is:

pri = base_pri + NICE + (C * r/32) 

pri is the new priority, base_pri is 40, NICE is the nice value (20 in this case), C is the CPU usage in ticks, and r is 16.

As a thread accumulates CPU ticks, its priority value becomes larger, thereby making its priority lower.

The schedo command provides a way to change the value of r by using the sched_R option. Running schedo -p -o sched_R=0 causes r to be 0, which then causes the CPU penalty factor (C * r/32) to be 0. This prevents priorities from changing, unless the nice value is changed. If the nice value is the same for all threads, then threads can complete their timeslices without being preempted due to priority changes. This allows the thread that is currently running and holding the database lock to keep running and then release the lock.

Results: These changes had an instantaneous impact on the performance. The response time, which was over two minutes by this time, started getting better until all of the scripts were completing in just a few seconds. The C value in the priority formula is recalculated once a second by a CPU usage decay factor (C = C*d/32). Setting the d value to 0 when using the schedo command would have accomplished the same result. In this case, if d=0, then C*d/32 = 0. Since the CPU penalty factor is C*r/32, this also becomes 0 so that the priority will be just 40 + NICE.

Case 2

Symptoms: A pSeries machine was used as both a database and an application server. Users would input requests into a forms-based application and then submit the transactions. They noticed that at certain times the forms would take longer to get updated on their screens and their usual short-running queries would return in a longer time period.

Diagnosis: When this slowness was observed, there were also some long-running database batch jobs that were submitted to the system. Normally, such batch jobs would be run at night, but near the end of the month additional batch jobs were run during the day while the users were on the system. The batch jobs were CPU-intensive and constantly on the run queue. Therefore, users' threads had to compete with the threads of the batch jobs for the CPU.

With priorities degrading as CPU usage increased, the batch jobs' priorities became worse and allowed the users' threads to run. However, the kernel decays the CPU usage value C by half once a second. This allowed the priorities of the batch jobs to improve in a short time period. So the batch jobs would again compete for the CPU with the users' threads.

Prescription: By changing the decay factor (d/32) used to reduce CPU usage once a second, we improved performance for the users. We used the schedo command to set the d value to 31. The higher the value of d, the higher the value of C that remains after each decay (C=C*d/32).

Since C is used to calculate priorities (pri=40+NICE+C*r/32), the priority would get worse as C became larger. By setting the d value to a higher number, the C value is reduced at a slower than usual rate.
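
The effect of raising d can be sketched by counting how many once-per-second decay steps it takes for a thread's recent CPU usage to fall back toward zero. This is a simplified model, not the kernel's code: it assumes integer arithmetic and ignores any new ticks the thread accumulates while its charge is decaying. The function name and the threshold parameter are hypothetical.

```c
/* Number of once-per-second decay steps (C = C * d / 32) needed for the
 * recent CPU usage to drop below `threshold`, assuming the thread
 * accumulates no new ticks in the meantime.
 * Returns -1 if the charge never drops below the threshold, or if the
 * arguments are out of range. */
int seconds_until_below(int c, int d, int threshold)
{
    int seconds = 0;

    if (d < 0 || threshold <= 0)
        return -1;
    if (d >= 32)                      /* no decay at all */
        return c < threshold ? 0 : -1;
    while (c >= threshold) {
        c = c * d / 32;
        seconds++;
    }
    return seconds;
}
```

Starting from the maximum charge of 120, the default d=16 brings C below 10 in 4 seconds (120, 60, 30, 15, 7). With d=31 the same drop takes many times longer, which is why the batch jobs stayed at a worse priority long enough for the users' threads to run.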

Results: The users' threads get the CPU more often than the batch threads. As a result, the users saw an immediate improvement in performance. Of course, the batch jobs would be slowed down somewhat, but these jobs would get the CPU whenever the users had any "think" time or had to wait on I/O. The impact was minimal on the batch jobs, but performance improvement for the users was dramatic.

Case study notes: Tracing a pattern

A final tip describes some odd things that impact performance. During one of our benchmarks, we noticed that the CPU usage reached 100 percent, with most of the time being charged to "system". At that time, the application performance degraded noticeably.

After we collected an AIX trace, we noticed a repeating pattern. One application process would encounter a page fault on an address. That page fault caused a protection exception in the VMM, which in turn caused the kernel to send this process a SIGSEGV (segmentation violation) signal. When the process resumed, it faulted on the same address again, which then caused yet another protection exception and another SIGSEGV signal to be sent to the process. The default signal disposition for the SIGSEGV signal is to kill the process and generate a core dump, but in this case, the application continued on and stayed in this loop. Most of the CPU time was spent in this loop.

After investigation, we discovered the problem: A developer for another component had installed a signal handler to catch the SIGSEGV signal in the code during the test process. After the testing was completed, the developer had forgotten to remove the signal handler. That component then linked with the rest of the application and, during the benchmark, another unrelated component of the application caused a segmentation fault. This old signal handler caught the exception, ignored it, and caused the process to resume. The current instruction (the one which caused the exception) was then restarted, causing an infinite loop to occur.
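
One way to avoid this kind of loop is to make sure that a SIGSEGV handler never simply returns. The sketch below uses the standard POSIX sigaction() interface and is not the code from the application in question; the function names are hypothetical. The handler reports the fault and terminates the process, so the faulting instruction is never restarted.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler that reports the fault and terminates instead of returning.
 * Only async-signal-safe calls (write and _exit) are used. */
static void fatal_segv_handler(int signo)
{
    static const char msg[] = "fatal: SIGSEGV caught, exiting\n";
    ssize_t written = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)written;   /* nothing useful can be done if the write fails */
    (void)signo;
    _exit(128 + SIGSEGV);
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on failure. */
int install_fatal_segv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fatal_segv_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A handler like this gives up the core dump that the default disposition would produce. If the core file matters, removing the test-only handler before the component ships, as should have happened here, is the simpler fix.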


About the authors

Wayne Huang is a Senior Consultant for the IBM eServer pSeries and AIX systems, with a focus on e-business, banking, finance, and securities industries. He provides AIX support to ISVs in the areas of application design, problem determination, system performance tuning, and application benchmarks. He holds a BS in Physics from National Taiwan University and an MS in Computer Science from the University of Texas in Austin, TX. You can reach him at huangw@us.ibm.com.

Lee Cheng currently works as a Senior Consultant for pSeries (RS/6000) and AIX software vendors. She provides support to them in the areas of application benchmarks, performance tuning, application porting, and internationalization. Before joining the RS/6000 ISV Technical Support group, she was a developer for compilers and the AIX system management component. She holds an MS in Computer Science from the University of Kentucky. You can reach her at chenglc@us.ibm.com.

Matthew Accapadi is an IBM Senior Software Engineer responsible for performance of AIX and Oracle on AIX, tuning benchmarks for optimal performance, and teaching courses on AIX performance tuning. He works with other vendors to improve their application performance on AIX. He has a BS in Computer Science from Texas A&M University. You can reach him at accapadi@us.ibm.com.

Nam Keung works as a Senior Programmer at IBM. Nam works in the area of AIX communication development, AIX multimedia, SOM/DSOM development, and Java™ performance. His current assignment involves helping ISVs in application design, deploying applications, performance tuning, and education for the pSeries platform. You can contact him at namkeung@us.ibm.com.

Published July 28, 2005, in the developerWorks AIX and UNIX zone.
