
Debugging Performance Issues

This section describes the three common bounding scenarios of a live SBRC system: CPU bound, I/O bound, and contended. Each can be verified using the tools listed in Table 14.

The focus is primarily on Solaris tools, but the analysis techniques apply to Linux as well.

CPU Bound

This section shows output from a system in which the SBRC is making full use of the available CPU, with utilization spread across many threads (see Figure 7).

Note: The peak utilization rate is not much higher than the current rate.

Figure 7: CPU-Bound Utilization

prstat output

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
  1624 root      244M  129M cpu6    33    0 686:11:58  88% radius_generic/252
 29853 hadm      181M  141M sleep   59    0   0:53:12 0.1% mysqld/18
 29766 hadm      131M   26M sleep   59    0   0:51:51 0.1% ndb_mgmd/21
  2602 root     3808K 3072K cpu0    49    0   0:00:00 0.0% prstat/1
 19423 noaccess  174M   83M sleep   59    0   3:31:37 0.0% java/18
  2556 root     7456K 4744K sleep   59    0   0:00:00 0.0% sshd/1
 17497 root     3544K 1712K sleep   59    0   0:00:00 0.0% mountd/2
You can determine whether any particular thread is consuming too much CPU by running prstat -L:
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/LWPID      
  1624 root      244M  129M run     22    0  20:33:47 2.6% radius_generic/64
  1624 root      244M  129M sleep   33    0   5:07:37 1.7% radius_generic/82
  1624 root      244M  129M sleep   53    0   2:09:43 1.6% radius_generic/110
  1624 root      244M  129M run     23    0   3:58:26 1.6% radius_generic/112
  1624 root      244M  129M run     35    0   2:14:50 1.6% radius_generic/116
  1624 root      244M  129M run     42    0  11:09:58 1.4% radius_generic/65
  1624 root      244M  129M sleep   52    0   3:25:00 1.4% radius_generic/85
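A quick way to reason about per-thread output like this is to total the per-LWP CPU column. The sketch below is illustrative only: it embeds a few of the sample lines above in a temporary file, where on a live system you would pipe `prstat -L 1 1` instead.

```shell
# Illustrative sample copied from the prstat -L output above.
cat > /tmp/prstat_sample.txt <<'EOF'
  1624 root      244M  129M run     22    0  20:33:47 2.6% radius_generic/64
  1624 root      244M  129M sleep   33    0   5:07:37 1.7% radius_generic/82
  1624 root      244M  129M sleep   53    0   2:09:43 1.6% radius_generic/110
EOF

# CPU% is field 9; strip the trailing '%' and total it across threads.
awk '{ sub(/%$/, "", $9); total += $9 }
     END { printf "total CPU%% across threads: %.1f\n", total }' /tmp/prstat_sample.txt
```

If one LWP dominates the total, the workload is serialized on a single thread; an even spread, as in the output above, indicates good parallelism.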

The following iostat output shows no significant I/O usage:

  tty        sd0           sd1           sd2           nfs1           cpu
tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
   0   66   9   1    8  254  16   15    0   0    0    0   0   55   13  4  0 82
   0  236   0   0    0    0   0    0    0   0    0    0   0    0   68 23  0  8
   0   80   0   0    0    0   0    0    0   0    0    0   0    0   68 24  0  8
   0   80   0   0    0    1   1    4    0   0    0    0   0    0   68 23  0  9
   0   80   0   0    0    0   0    0    0   0    0    0   0    0   68 24  0  8

I/O Bound

Figure 8 shows output from a system that is more I/O bound: large packets are being written to a long accounting file with LogAccept=1, LogLevel=2, and TraceLevel=2. In this case, the CPU overhead of I/O is very high. Larger multiprocessor systems with more optimized I/O frameworks (such as Fibre Channel NASD devices and RAIDs) will see higher rates and lower CPU utilization when they are I/O bound.
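As a sketch of where those logging parameters live (assuming the conventional SBRC radius.ini [Configuration] section; confirm the file and section names against your release's reference guide), the I/O-heavy scenario corresponds to:

```ini
[Configuration]
; Settings from the scenario described above. Lowering LogLevel and
; TraceLevel, and setting LogAccept = 0, reduces accounting-log I/O.
LogLevel = 2
TraceLevel = 2
LogAccept = 1
```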

Figure 8: I/O-Bound Utilization

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
  1624 root      248M  131M sleep   51    0 687:01:29  76% radius_generic/253
 29853 hadm      181M  141M sleep   59    0   0:53:16 0.1% mysqld/18
 29766 hadm      131M   26M sleep   59    0   0:51:54 0.1% ndb_mgmd/21
 19423 noaccess  174M   83M sleep   59    0   3:31:37 0.0% java/18
  2693 root     3808K 3072K cpu1    59    0   0:00:00 0.0% prstat/1
  2556 root     7456K 4744K sleep   59    0   0:00:00 0.0% sshd/1
  2609 root     3456K 2344K sleep   49    0   0:00:00 0.0% bash/1
 17497 root     3544K 1712K sleep   59    0   0:00:00 0.0% mountd/2
 17499 daemon   2832K 1624K sleep   60  -20   0:04:46 0.0% nfsd/2

  tty        sd0           sd1           sd2           nfs1           cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
   0   66   9   1    8  254  16   15    0   0    0    0   0   55   13  4  0 82
   0  235   0   0    0  16333  17   12    0   0    0    0   0    0   58 24  0 18
   0   82   0   0    0  17678  38   13    0   0    0    0   0    0   58 24  0 17
   0   82   0   0    0  16693  18   12    0   0    0    0   0    0   58 24  0 18
   0   82   0   0    0  16394  17   11    0   0    0    0   0    0   59 24  0 17
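Here the sustained multi-megabyte-per-second rate on sd1 is the tell. As an illustrative sketch, the awk filter below counts intervals where sd1 throughput (field 6, in KB/s, per the column layout above) exceeds 10 MB/s, using the sample interval lines:

```shell
# Illustrative sample: the interval lines from the iostat output above.
cat > /tmp/iostat_io.txt <<'EOF'
   0  235   0   0    0  16333  17   12    0   0    0    0   0    0   58 24  0 18
   0   82   0   0    0  17678  38   13    0   0    0    0   0    0   58 24  0 17
   0   82   0   0    0  16693  18   12    0   0    0    0   0    0   58 24  0 18
   0   82   0   0    0  16394  17   11    0   0    0    0   0    0   59 24  0 17
EOF

# sd1 kps is field 6; count intervals above 10000 KB/s.
awk '$6 > 10000 { busy++ } END { print busy " of " NR " intervals above 10 MB/s on sd1" }' /tmp/iostat_io.txt
```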

Contended Systems

Figure 9 is an example of a contended system with many (artificially induced) errors.

Figure 9: Contended Systems Utilization

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
  3701 root      199M   83M sleep   59    0   0:00:07 1.3% radius_generic/80
 29853 hadm      181M  141M sleep   59    0   0:53:19 0.0% mysqld/18
 29766 hadm      131M   26M sleep   59    0   0:51:58 0.0% ndb_mgmd/21
  3717 root     3872K 3232K cpu6    59    0   0:00:00 0.0% prstat/1
  2556 root     7456K 4744K sleep   59    0   0:00:00 0.0% sshd/1

  tty        sd0           sd1           sd2           nfs1           cpu
tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
   0   66   9   1    8  254  16   15    0   0    0    0   0   55   13  4  0 82
   0  236   0   0    0    0   0    0    0   0    0    0   0    0    2  1  0 97
   0   80   0   0    0    0   0    0    0   0    0    0   0    0    2  1  0 98
   0   80   0   0    0    0   0    0    0   0    0    0   0    0    2  1  0 97
   0   80   0   0    0   11  17   20    0   0    0    0   0    0    2  1  0 97
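Note the pattern: CPU is nearly idle and disks are quiet, yet the server makes little progress, which suggests threads blocked on locks or external resources. On Solaris, `prstat -mL` exposes this directly in the LCK column (percent of time each LWP spent waiting on user-level locks). The sketch below parses a hypothetical `prstat -mL` sample (invented for illustration, not taken from this system) to flag threads spending over half their time in lock waits:

```shell
# Hypothetical prstat -mL sample; LCK is field 8.
cat > /tmp/prstat_mL.txt <<'EOF'
  3701 root     2.1 1.0 0.0 0.0 0.0  71  24 1.9  1K 120  5K   0 radius_generic/12
  3701 root     1.8 0.9 0.0 0.0 0.0  68  27 1.4  1K 110  4K   0 radius_generic/31
  3701 root     0.4 0.2 0.0 0.0 0.0   3  96 0.1 200  10 800   0 radius_generic/55
EOF

# Report threads spending more than half their time waiting on locks.
awk '$8 > 50 { print $NF ": " $8 "% lock wait" }' /tmp/prstat_mL.txt
```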

Modified: 2017-03-07