
Test Results Summary

This section summarizes the performance results measured on a given set of test systems.

Note: All of the results shown in this section are for one front-end application unless otherwise stated.

PAP Authentication, Accounting Start, Accounting Stop

Tested on four D nodes with zero sessions preloaded.

Note: The number of test client threads is varied to show the effect of simultaneous operations.

Table 17: PAP Authentication, Accounting Start, Accounting Stop

| Client Test Threads | CPS | Front-End Utilization | SSR Node Single Thread Utilization |
|---|---|---|---|
| 100 | 4290 | 91 % | 3.8 % |
| 80 | 4252 | 91 % | 3.8 % |
| 120 | 4239 | 90 % | 3.8 % |

PAP Authentication, Accounting Start, Accounting Stop—Sessions Preloaded

Tested on four D nodes with sessions preloaded.

Table 18: PAP Authentication, Accounting Start, Accounting Stop—Sessions Preloaded

| Number of Sessions Preloaded | CPS | Front-End Utilization | SSR Node Thread Utilization |
|---|---|---|---|
| 1M | 4210 | 89 % | 5.7–6.1 % ¹ |
| 4M | 4185 | 91 % | 6.5–6.8 % |
| 7M | 4064 | 89 % | 7.2–7.4 % |

¹ CPU utilization varied with Local and Global Checkpoints (LCP and GCP). Higher figures for the other results are shown below.

Accounting Only—Sessions Preloaded, One Start and One Stop, Four D Nodes

Tested on four D nodes with sessions preloaded.

Table 19: Accounting Only—Sessions Preloaded, One Start and One Stop, Four D Nodes

| Number of Sessions Preloaded | Accts/Sec | SSR Node Single Thread Utilization |
|---|---|---|
| 0 | 11,900 | |
| 4M | 11,446 | |
| 7M / 100 threads | 11,070 | 8.7 % |
| 7M / 120 threads | 11,160 | 8.8 % |

Accounting Only—Sessions Preloaded, One Start and One Stop, Two D Nodes

Tested on two D nodes with sessions preloaded, varying accounting output to the local file.

Table 20: Accounting Only—Sessions Preloaded, One Start and One Stop, Two D Nodes

| Number of Sessions Preloaded | Accts/Sec | SSR Node Single Thread Utilization |
|---|---|---|
| 0 sessions / no account logging | 13,900 | 4.2 % |
| 0 sessions / default account logging | 12,280 | 4.4 % |
| 1M sessions / default account logging | 12,340 | 6.6 % |
| 4M sessions / default account logging | 12,300 | 7.5 % |
| 4M sessions / minimal account logging | 12,660 | 8.0 % |
| 4M sessions / no account logging | 13,740 | 8.3 % |
| 7M sessions / no account logging | 13,400 | 8.6 % |

Standalone: Auth/start/stop CPS

Tested on standalone SBR for Auth/start/stop CPS.

Table 21: Standalone: Auth/start/stop CPS

| Case | CPS | CPU Utilization |
|---|---|---|
| 0 sessions | 6330 | 97 % |
| 1M sessions | 5930 | 96 % |
| 3M sessions | 5760 | 95 % |

Standalone: Accounting for 200 Threads

Tested on standalone SBR for accounting only with 200 threads.

Note: Because the standalone server is a 32-bit application, it is limited to a 4 GB working set.
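As a rough illustration of that working-set ceiling, the sketch below estimates how many sessions fit under a 4 GB cap. The 1 GB overhead figure and the per-session sizes are illustrative assumptions, not measured values from this report.

```python
# Rough capacity estimate for a 32-bit standalone SBR process.
# The overhead and per-session sizes below are assumptions for
# illustration, not figures from the test report.

GiB = 1024 ** 3
WORKING_SET_LIMIT = 4 * GiB   # 32-bit address-space ceiling
OVERHEAD = 1 * GiB            # assumed code, buffers, and heap overhead

def max_sessions(bytes_per_session: int) -> int:
    """Sessions that fit in the remaining working set."""
    return (WORKING_SET_LIMIT - OVERHEAD) // bytes_per_session

# With ~512 bytes stored per session, roughly 6M sessions fit;
# richer session data lowers the ceiling.
print(max_sessions(512))   # 6291456
print(max_sessions(1024))  # 3145728
```

This is why the achievable session count varies per use case, as noted in Table 22: the more data stored per session in the SSR, the fewer sessions fit under the 4 GB limit.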

Table 22: Standalone: Accounting for 200 Threads

| Case | Accts/s | CPU Utilization | Disk I/O |
|---|---|---|---|
| 1M sessions | 22,500 | 95 % | 1.8 M/s |
| 3M sessions (this maximum varies with each use case, depending on the data stored in the SSR relative to the 4 GB limit) | 20,300 | 95 % | 2.1 M/s |
| 3M sessions / Account.ini enabled | 15,200 | 96 % | 6.0 M/s |

Standalone: Authentication Only

Tested on standalone SBR for authentication only.

Table 23: Standalone: Authentication Only

| Case | Auths/s | CPU Utilization | Disk I/O |
|---|---|---|---|
| PAP | 14,800 | 98 % | |
| CHAP | 14,760 | 98 % | |
| Rejects/s | ~22,000 | 98 % | |
| LogAccept=1 | 13,170 | 96 % | 1.2 M/s |
| LogLevel=2 | 3,890 | 77 % | 6.8 M/s |
| LogLevel=1 | 9,590 | 86 % | 3.6 M/s |

LDAP Authentication Only

Tested for LDAP authentication.

Table 24: LDAP Authentication Only

| Case | Auths/Sec | Front-End Overall | Front-End Single-Thread |
|---|---|---|---|
| 1 client thread, maxconnect = 1 | 454 | 11 % | 3.4 % |
| 10 client threads, maxconnect = 1 | 1030 | 26 % | 6.2 % |
| 100 client threads, 10 [server/*] sections | 10,175 | 39 % ² | ³ |

² The CPUs on the LDAP server were maxed out, running at 166 percent of a two-CPU VMware instance at 3.3 GHz.

³ Evenly split, below the transport-thread level.

Oracle 11g

Tested on an Oracle 11g database running on an M3000, 2.52 GHz.

Table 25: Oracle 11g

| Case | Auths/Sec or CPS | Front-End % | Oracle DB Process Max % |
|---|---|---|---|
| Auth only—10 client threads | 9200 auths/sec | 24 % | 22 % overall |
| Auth only—15 client threads | 12,500 auths/sec | 37 % | 30 % overall |
| Directed realms, 20 client threads over two instances | 17,000–24,000 ⁴ auths/sec | 71 % | 70 % |
| Auth/start/stop, MaxConnections = 1 | 2610 CPS | 37 % | 4.6 % |
| Auth/start/stop, MaxConnections = 10 | 4255 CPS | 91 % | 6.9 % (spread over 8 Oracle processes) |

⁴ This result varied widely because of the latency of a single Oracle DB processing requests from too many simultaneous connections.

LCI Queries against the SSR through SBRC

The number of LCI server threads is hard-limited to eight by the server to avoid interference with regular RADIUS traffic processing.⁵
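The eight-thread cap can be pictured as a bounded worker pool. The sketch below shows one common way such a limit is enforced, with a semaphore gating concurrency; `run_query` is an illustrative stand-in, and this is not SBR's actual implementation.

```python
# Sketch: cap concurrent LCI query workers at eight so that regular
# RADIUS traffic threads keep most of the CPU. Illustrative only.
import threading

def run_query(query: str) -> str:
    # Stand-in for the real SSR lookup; illustrative only.
    return query.upper()

LCI_THREAD_LIMIT = 8  # matches the server's hard cap
_lci_slots = threading.BoundedSemaphore(LCI_THREAD_LIMIT)

def handle_lci_query(query: str) -> str:
    # At most LCI_THREAD_LIMIT queries execute concurrently;
    # callers beyond that block until a slot frees up.
    with _lci_slots:
        return run_query(query)

print(handle_lci_query("session lookup"))
```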

Table 26: LCI Queries against the SSR through SBRC

| Case | LCI Queries/Sec | Front-End CPU | SSR Node CPU |
|---|---|---|---|
| 100 query client threads | 1452 | 53 % | 3.7 % |
| 200 query client threads | 1716 | 62 % | 4.4 % |

⁵ SBRC 7.4.0 performance improvements permit greater throughput at lower front-end CPU utilization.

IP Address Allocation

Tested on two D nodes with IP address allocation.

Table 27: IP Address Allocation with Two D Nodes

| Case | CPS | SSR Node CPU |
|---|---|---|
| 1M sessions, 1 pool, 1 front end | 3050 | 18 %, 18 %, 5.2 %, 3.5 % |
| 1M sessions, sharing 1 pool, 2 front ends | 1600 + 1550 = 3150 | 19 %, 18 %, 5.3 %, 3.5 % |
| 1M sessions, 2 pools, each front end using its own | 1885 + 1857 = 3742 | 22 %, 20 %, 5.1 %, 3.5 % |
| 1M sessions, LCI queries, 1 front end | 2650 + 550 LCI TPS | 17 % |
| 1M sessions, LCI queries, 2 front ends | 3200 + 750 LCI TPS | 22 % |

IP Address Allocation with Eight Threads, Two D Nodes

Tested on two D nodes with eight execute threads on eight virtual processors. This test gives an indication of performance on the M4000: on the M3000 used here, two threads run on each physical processor, whereas on the M4000 one thread runs per physical processor.

Table 28: IP Address Allocation with Two D Nodes and Eight Execute Threads

| Case | CPS | Max Single Thread (out of 12.5 %), SSR Node | Front-End CPU |
|---|---|---|---|
| 0 sessions, 1 front end | 3635 | 5.3, 5.1, 5.1, 5.1, 3.3, 2.0 | 75 % |
| 0 sessions, 2 front ends, 2 pools | 3270 + 3390 = 6660 | 8.3, 8.2, 8.2, 8.2, 4.0, 2.0 | 68 % |
| 1M sessions, 2 front ends, 2 pools | 2509 + 2492 = 5001 | 10.0, 9.7, 9.7, 9.6, 3.4, 1.9 | |
| 1M sessions, 2 front ends, 1 pool ⁶ | 1853 + 1875 = 3728 | 8.4, 8.3, 8.0, 8.0, 3.3, 1.9 | |

⁶ In the two-front-end, one-pool case, the limiting factor is NDB lock collision caused by multiple threads reaping old IP addresses at the same time.

IP Address Allocation with Eight Threads, Four D Nodes

Tested on four D nodes with eight execute threads on eight virtual processors. This test case gives an indication of performance on the M4000. In this case, two threads are running on each physical processor.

Table 29: IP Address Allocation with Four D Nodes and Eight Execute Threads

| Case | CPS | Max Single Thread (out of 12.5 %), SSR Node | Front-End CPU |
|---|---|---|---|
| No IP, 1M sessions, 2 front ends ("No IP" is used as a baseline reference) | 5514 + 5472 = 10,986 ⁷ | 3.8, 3.8, 3.2, 3.0, 2.9, 1.7 | |
| 1M sessions, 1 front end | 3650 | 4.0, 3.9, 3.7, 3.7, 2.2, 1.9 | 74 % |
| 1M sessions, 2 front ends, 2 pools | 2550 + 2600 = 5150 | 5.4, 5.3, 5.1, 5.1, 2.7, 2.1 | 45 % |
| 1M sessions, 2 front ends, 1 pool | 2300 + 2400 = 4700 | 3.9, 3.9, 3.8, 3.7, 2.0, 1.6 | 35 % |

⁷ 10 percent variation due to GCP and LCP.

TTLS

TTLS tested with five RADIUS auth/challenge pairs per accept, using various cipher suites and DH key sizes.
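The hex codes in the table below are standard TLS cipher-suite code points. For orientation, the sketch maps a few of them per the IANA TLS registry (this mapping is added here and is not part of the original report). DHE suites pay an extra modular exponentiation per handshake, which is why accepts per second fall as the DH modulus grows and why the static-RSA suites run faster.

```python
# A few of the cipher-suite code points exercised below, mapped per
# the IANA TLS registry. Added for orientation; not from the report.
SUITES = {
    0x39: "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    0x33: "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    0x35: "TLS_RSA_WITH_AES_256_CBC_SHA",
    0x2F: "TLS_RSA_WITH_AES_128_CBC_SHA",
    0x0A: "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
    0x04: "TLS_RSA_WITH_RC4_128_MD5",
}

def uses_ephemeral_dh(code: int) -> bool:
    """DHE suites perform an extra modular exponentiation per
    handshake, so their accept rate drops as DH bits increase."""
    return "_DHE_" in SUITES[code]

print(uses_ephemeral_dh(0x39))  # True
print(uses_ephemeral_dh(0x2F))  # False
```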

Table 30: TTLS with Five RADIUS Auth/Challenge Pairs per Accept

| Case | Accepts/Second | Front-End % |
|---|---|---|
| 1024 DH bits, 0x39 | 565 | 95 % |
| 512 DH bits, 0x39 | 871 | 95 % |
| 1536 DH bits, 0x39 | 256 ⁸ | 73 % |
| 0x38 | 574 | maximum, around 95 % |
| 0x33 | 575 | maximum, around 95 % |
| 0x32 | 578 | maximum, around 95 % |
| 0x16 | 575 | maximum, around 95 % |
| 0x13 | 872 | maximum, around 95 % |
| 0x66 | 886 | maximum, around 95 % |
| 0x35 | 890 | maximum, around 95 % |
| 0x2f | 888 | maximum, around 95 % |
| 0x15 | 575 | maximum, around 95 % |
| 0x12 | 1112 | maximum, around 95 % |
| 0x0a | 1116 | maximum, around 95 % |
| 0x05 | 1114 | maximum, around 95 % |
| 0x04 | 1120 | maximum, around 95 % |
| 0x07 | 1116 | maximum, around 95 % |
| 0x09 | 1117 | maximum, around 95 % |

⁸ The test clients ran out of CPU.

TTLS Plus Storing Resumption Context

Tested on two D nodes to measure TTLS plus the overhead of storing resumption context.

Table 31: TTLS Plus Storing Resumption Context

| Case | Accepts/Second | NDB SSR Node Thread Utilization |
|---|---|---|
| Resumption (0x38) | 575 | 0.7, 0.7, 0.4, 0.4 |
| Resumption (0x09) | 1117 | 0.9, 0.8, 0.5, 0.5 |

WiMAX

WiMAX tested with TTLS plus one HA plus two starts and two stops; 10 RADIUS transactions = 6 NDB hits.
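Taking the stated per-accept costs at face value, the backend load implied by an accept rate can be computed directly:

```python
# Back-of-envelope load per accepted WiMAX flow: the text states each
# accept corresponds to 10 RADIUS transactions and 6 NDB hits.
RADIUS_PER_ACCEPT = 10
NDB_HITS_PER_ACCEPT = 6

def backend_load(accepts_per_sec: int) -> tuple[int, int]:
    """(RADIUS transactions/s, NDB hits/s) implied by an accept rate."""
    return (accepts_per_sec * RADIUS_PER_ACCEPT,
            accepts_per_sec * NDB_HITS_PER_ACCEPT)

# The 0x09 case in Table 32 ran at 475 accepts/s:
print(backend_load(475))  # (4750, 2850)
```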

Table 32: WiMAX

| Case | Accepts/Second | NDB SSR Node Thread Utilization | Front-End % |
|---|---|---|---|
| WiMAX (0x38) | 355 | 3.5, 3.5, 1.3, 1.1 | 91 % |
| WiMAX (0x09) | 475 | 4.7, 4.7, 1.6, 1.4 | 98 % |

Four D Node System: M5000

These results were tested on the M5000 with four CPUs (virtual CPUs disabled) at 2.66 GHz and switch-connected with 10G networking.

Simple accountings per second = 49,200 (minimal data per row stored):

  • Network bandwidth (NB) for D nodes recorded upwards of 42 MBps = 336 Mbps
  • Disk bandwidth 5 MBps = 40 Mbps

Realistic accountings per second (7M rows preloaded) and more data per transaction (approximately 512 bytes per row stored):

  • Accountings per second = 23,240
  • Bandwidth (up to 7 MBps) = 56 Mbps
  • Network bandwidth = 80 MBps
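The MBps-to-Mbps figures above are straight byte-to-bit conversions, multiplying by 8:

```python
# Convert megabytes per second to megabits per second (x8),
# matching the paired figures in the bullets above.
def mbps_from_MBps(megabytes_per_sec: float) -> float:
    return megabytes_per_sec * 8

print(mbps_from_MBps(42))  # 336.0
print(mbps_from_MBps(5))   # 40.0
print(mbps_from_MBps(7))   # 56.0
```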

IP address allocations: 7701 CPS

WiMAX with resumptions: 3300 CPS

Standalone: M9000

These results were tested on the M9000 (16 CPUs x 8 cores) at 3.0 GHz, running in one global zone or in six non-global zones.

Note: When SBR is running in non-global zones, the performance is limited because it is unable to set the RealTime priority on the receiving threads.

  • Maximum authentications per second for one global zone = 31,000
  • Maximum accountings per second for one global zone (with logging) = 38,000
  • Maximum authentications per second across six non-global zones = 105,804
  • Maximum accountings per second across six non-global zones = 91,470
  • CPS (without logging) across six non-global zones = 59,252
  • TTLS (512RSA):
    • Maximum TTLS per second for one global zone = 3900
    • Maximum TTLS per second for six non-global zones = 11,900

Standalone: T3

These results were tested on T3 (4 CPUs x 128 cores) at 1.65 GHz running in six non-global zones.

Note: Disabling virtual CPUs decreases performance in all cases.

  • Maximum authentications per second across six non-global zones = 69,850. Total CPU utilization is 18%.
  • Maximum accountings per second across six non-global zones = 18,800. Total CPU utilization is 15.6%.

Note: Accounting and local session performance is severely limited by single-CPU speed and spindle I/O performance. Consequently, the T3 is not recommended for accounting performance, but it has a high ROI for TLS/TTLS and WiMAX cases.

  • TTLS (512 DHE)—Maximum TTLS per second across six non-global zones = 12,100.

    Total CPU utilization is 78% and represents maximum server utilization available on this server.

  • TTLS (1024RSA)—Maximum TTLS per second across six non-global zones = 17,000.

    Total CPU utilization is 66%.

  • TTLS (1024 DHE)—Maximum TTLS per second across six non-global zones = 6300.

    Total CPU utilization is 52%.

  • TTLS (2048 DHE)—Maximum TTLS per second across six non-global zones = 2046.

    Total CPU utilization is 43.2%.

SSR and Standalone Performance: E4870 and X5687 CPUs

These results were tested on E4870 CPUs and X5687 CPUs.

Table 33: SSR Performance

| Case | 6D, E7-4870 x 1 CPU | 4D, E7-4870 x 1 CPU | 2D, E7-4870 x 1 CPU | 2D, X5687 x 2 CPU |
|---|---|---|---|---|
| Accountings per second | 137,900 | 97,000 * | 78,100 | 89,000 |
| Accountings per second, 512-byte sessions, 15M preload | 65,600 | 47,700 | 24,400 | |
| SBR 7.4.1 with new NDB optimizations | 140,700 | 120,000 * | 107,000 | 107,000 |

Table 34: Standalone Performance

| Case | X5687 x 2 CPUs | E7-4870 x 1 CPU (2 for TTLS) | E7-4870 x 8 CPUs (with 6 to 12 virtual machines) |
|---|---|---|---|
| PAP authentications per second | 44,900 | 48,000 | 179,000 |
| Accountings per second (CST and logging to disk) | 42,000 | 40,000 * | 190,000 |
| TTLS 512 DHE (0x39) authentications per second | 3,800 | 4,300 | |
| TTLS 1024 RSA (0x2F) authentications per second | 4,700 | 5,400 | 11,600 |
| TTLS 1024 DHE (0x39) authentications per second | 2,900 | 3,000 | |

Note: * These numbers are estimates.

Proxy authentications per second and accountings per second = 22,000. Round-robin is used across multiple downstream targets.
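A minimal sketch of the round-robin downstream selection mentioned above; the target names are illustrative and this is not SBR's actual proxy code.

```python
# Minimal round-robin selection over proxy downstream targets.
# Target names are illustrative.
from itertools import cycle

class RoundRobinProxy:
    def __init__(self, downstreams):
        self._next = cycle(downstreams)  # endless rotation over targets

    def pick(self):
        return next(self._next)

proxy = RoundRobinProxy(["radius-a", "radius-b", "radius-c"])
print([proxy.pick() for _ in range(4)])
# ['radius-a', 'radius-b', 'radius-c', 'radius-a']
```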

JDBC/MySQL authentication = 18,800. Total CPU utilization is 15.6%.

JavaScript Engine Overhead = 10%.

Modified: 2016-11-15