IP Pools through the SSR
SQL-derived systems have well-known difficulty implementing efficient, reliable queuing across processes. As a result, the current implementation of SSR has a performance limit when recovering addresses from very large IP pools on multiple threads. Multiple threads (running across multiple SBRC instances) that attempt to pre-fetch possible lookup keys to reclaim unused IP addresses can contend for the same rows. This lock contention can interfere with sustained high-throughput SBRC operation. Lock back-off is controlled by the TransactionDeadlockDetectionTimeout value in the config.ini file. A thread can wait up to this timeout value before retrying the reclaim operation, which can seriously limit throughput.
There are several ways to manage this contention:
Use different IP pools for different NAS devices, with each NAS device biased (either by its own configuration or by a load balancer) toward a given SBRC front-end application. This reduces contention between front ends.
Shard the users or profiles so that different users have a Framed-IP-Address set from different pools. This spreads the contention evenly, so that the usual random sleep used to manage contention (configured in dbclusterndb.gen as a period between CacheThreadSleepMin and CacheThreadSleepMax) is sufficient to ensure adequate back-off for a request to reclaim a chunk of IP addresses. An added benefit is that you can easily identify attributes such as class of service by tying given addresses to pools with well-defined ranges.
Provision enough addresses that you can set the CacheLowWater and CacheHighWater marks in dbclusterndb.gen very high, so that contended operations do not affect effective throughput (for example, so that a pause of several hundred milliseconds in reclaiming addresses causes no transaction errors). However, values over 20,000 may cause SBRC to take longer to shut down, because it must return the cached addresses to the available state. Also, a CacheHighWater value near the available size of the pool can let one SBRC cache all available addresses, leaving other SBRCs with none, which leads to incorrectly failed authentications.
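The trade-offs in the last option can be checked with simple arithmetic before any values are written to dbclusterndb.gen. The following Python sketch is illustrative only and is not part of SBRC: the CachePlan type and check_cache_plan function are hypothetical names, the rule that CacheLowWater should sit below CacheHighWater is an assumption, and the worst case assumed is that every SBRC instance fills its cache to the high-water mark at the same time.

```python
from dataclasses import dataclass

# Threshold taken from the text above: larger values may slow SBRC shutdown.
SLOW_SHUTDOWN_THRESHOLD = 20_000


@dataclass
class CachePlan:
    """Hypothetical container for a proposed cache sizing (not an SBRC type)."""
    pool_size: int         # total addresses in the IP pool
    sbrc_instances: int    # SBRC front ends sharing the pool
    cache_low_water: int   # proposed CacheLowWater
    cache_high_water: int  # proposed CacheHighWater


def check_cache_plan(plan: CachePlan) -> list[str]:
    """Return warnings for a proposed sizing; an empty list means none found."""
    warnings = []

    # Assumption: a low-water mark at or above the high-water mark is a mistake.
    if plan.cache_low_water >= plan.cache_high_water:
        warnings.append("CacheLowWater should be below CacheHighWater")

    if max(plan.cache_low_water, plan.cache_high_water) > SLOW_SHUTDOWN_THRESHOLD:
        warnings.append("values over 20,000 may slow SBRC shutdown")

    # Assumed worst case: every instance caches up to its high-water mark.
    worst_case_cached = plan.cache_high_water * plan.sbrc_instances
    if worst_case_cached >= plan.pool_size:
        warnings.append(
            f"caches could hold {worst_case_cached} of {plan.pool_size} pool "
            "addresses, starving other SBRCs"
        )
    return warnings


if __name__ == "__main__":
    plan = CachePlan(pool_size=10_000, sbrc_instances=2,
                     cache_low_water=500, cache_high_water=9_000)
    for warning in check_cache_plan(plan):
        print("WARNING:", warning)
```

A sizing that produces no warnings here is not guaranteed to be safe in production; the check only captures the two constraints stated above, so it is best treated as a first filter before load testing.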