For maximum efficiency, your code should create thread pools bound to the available CPUs. One or more "accepting threads" can then call accept() to get each new socket from the kernel and dispatch it to the thread pool for the CPU to which the socket's flow is bound.
The following pseudocode illustrates how this works:
main()
{
    int cpus, i, s, newconn, boundcpu;
    struct thread_pool *thread_pool_array;

    cpus = get_number_of_cpus_in_system();
    thread_pool_array = malloc(cpus * sizeof(*thread_pool_array));
    for (i = 0; i < cpus; i++) {
        thread_pool_array[i] = make_thread_pool();
    }
    s = bind_listen_socket();      /* make listen socket */
    for ( ;; ) {
        newconn = accept(s);       /* get new connection */
        boundcpu = msp_cpu_affinity(flow_selector);
            /* Call the flow affinity API to get the CPU to which
             * 'newconn' is bound. */
        thread_pool_queue_socket(newconn, boundcpu);
            /* This function calls pthread_attr_setcpuaffinity_np()
             * to send this socket to thread pool 'boundcpu'. */
    }
}
The following code from the IP Snooper sample application binds a thread to a CPU:
pthread_attr_t attr;
...
/* Create session thread and bind it to control CPU. */
pthread_attr_init(&attr);
if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) {
    logging(LOG_ERR, "%s: pthread_attr_setscope() on CPU %d ERROR!",
            __FUNCTION__, cpu);
    goto error;
}
if (pthread_attr_setcpuaffinity_np(&attr, cpu)) {
    logging(LOG_ERR, "%s: pthread_attr_setcpuaffinity_np() on CPU %d ERROR!",
            __FUNCTION__, cpu);
    goto error;
}
if (pthread_create(&thrd->ssn_thrd_tid, &attr,
                   (void *)&session_thread_entry, thrd)) {
    logging(LOG_ERR, "%s: pthread_create() on CPU %d ERROR!",
            __FUNCTION__, cpu);
    goto error;
}
This sample application is located at src/sbin/ipsnooper when you install the example in your development sandbox. The code shown is in the session_thread_create() function in the ipsnooper_server.c file.