In FreeBSD, the mbuf facility provides this mechanism; the JUNOS software implementation of the same functionality is provided by the jbuf library.
A jbuf is a data structure that describes a block of data that can vary in size depending on its contents. jbufs are used for holding packets, in addition to other data relevant to network protocol handling.
A packet can be comprised of multiple jbufs chained in a singly-linked list, which lets applications add or trim network headers with minimal overhead. This structure lets you link data together that is not stored contiguously without copying the data from one buffer to another. The next node in the chain is specified by the jb_next
pointer in the header.
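The chain layout can be sketched with a simplified stand-in structure. The field names jb_next, jb_len, and jb_total_len match the jbuf header fields described below, but the struct itself is a mock for illustration, not the definition from jbuf.h:

```c
#include <stddef.h>

/* Simplified stand-in for the real struct jbuf (illustration only). */
struct mock_jbuf {
    struct mock_jbuf *jb_next;      /* next buffer in this packet's chain */
    const char       *jb_data;      /* start of this buffer's data */
    size_t            jb_len;       /* length of data in this buffer */
    size_t            jb_total_len; /* total packet length (head jbuf) */
};

/* Walk a chain via jb_next and sum the per-buffer lengths; on a
 * consistent chain this equals the head jbuf's jb_total_len. */
size_t
mock_chain_len(const struct mock_jbuf *jb)
{
    size_t len = 0;

    for (; jb != NULL; jb = jb->jb_next)
        len += jb->jb_len;
    return len;
}
```

Because headers live in their own links of the list, prepending or trimming a header only touches the head of the chain; the payload buffers are never copied.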
When you are moving more data than will fit in a single packet, jbufs and jbuf chains can, in turn, be combined to form a larger singly-linked list that represents a stream of packets. You can use the jb_scratch field in the jbuf structure, described in the next section, to locate the next packet in the stream.
The jbuf data structure is defined in the sandbox/src/junos/lib/libmp-sdk/h/jnx/jbuf.h header file. The jbuf header has the following fields:
jb_next: A pointer to the next buffer in the chain
jb_data: A pointer to the data
jb_len: The length of the data
jb_flags: The jbuf flags
jb_total_len: The total length of the packet
jb_pad: Extra field for future use
jb_rcv_vrf: The ingress VRF
jb_rcvidx: The index of the ingress interface
jb_hdr_hash: The 5-tuple hash
jb_l3_hash: The 3-tuple hash
jb_xmit_subunit: The egress ms- interface subunit
jb_rcv_subunit: The ingress ms- interface subunit
jb_opq_data: Data relevant to the application
jb_scratch: Scratch space, declared as char jb_scratch[JBUF_SCRATCH_SIZE];
You can perform the following operations on jbufs:

- Allocate and free jbufs using the jbuf_get_chain() and jbuf_free() functions.
- Copy data into and out of jbufs using the jbuf_copy_to_buf() and jbuf_copy_from_buf() functions, together with the jbuf_to_d() function that converts a jbuf pointer to a data pointer of the correct type.
- Add and remove headers using the jbuf_prepend() and jbuf_adj() functions.
- Set the egress VRF using the jbuf_setvrf() function together with other functions from libssd (see Transmitting a jbuf).
- Make data contiguous using the jbuf_pullup() and jbuf_pulldown() functions; to see whether it is necessary to call these functions, you can check the length of data from a certain offset into the jbuf to make sure that the data to be read is all contained in the same jbuf data area. (Note that if there is already enough room at the start of the jbuf data area, it is not necessary to call jbuf_prepend(); you can check for this by invoking the jbuf_leadingspace() function.)
- Find the boundaries of the buffer using the jbuf_buffer_start() and jbuf_buffer_end() functions.
- Align data using the jbuf_align() function.
- Apply a function to the data in a jbuf using jbuf_apply(); this is typically used to calculate header checksums.
- Combine, duplicate, and split jbufs using jbuf_cat(), jbuf_dup(), and jbuf_split().

You can also modify jbufs to encapsulate or decapsulate packets by adding or removing headers, or to retrieve data. If you are modifying an existing jbuf, you should also use the checksum functions. If your modifications involve changes to packet header information (for example, the protocol field or address information), your application must swallow the packet entirely and reinject it.
Local jbufs are used for packets originated by data-handling applications (the data component). These applications use the jbuf for copying packets that need to be queued for later processing (for example, for IP reassembly).
System-owned jbufs are the same as local jbufs in all other respects.
To add a header to a packet, use the jbuf_prepend() function. The following code from the jnx-gateway sample application prepends an IP header to a packet. pkt_ctxt is declared to be of type jnx_gw_pkt_proc_ctxt_t, whose definition is shown first:
typedef struct jnx_gw_pkt_proc_ctxt_s {
    struct jnx_gw_data_cb_s    *app_cb;      /* Pointer to the control block */
    msp_data_handle_t           dhandle;     /* Data thread handle */
    struct jbuf                *pkt_buf;     /* Pointer to the packet received */
    u_int32_t                   ing_vrf;     /* Ingress VRF of the packet */
    u_int32_t                   eg_vrf;      /* Egress VRF of the packet */
    jnx_gw_data_gre_tunnel_t   *gre_tunnel;  /* Pointer to the GRE tunnel */
    jnx_gw_data_ipip_sub_tunnel_t *ipip_sub_tunnel; /* Pointer to the IP-IP sub tunnel */
    struct ip                  *ip_hdr;      /* Pointer to the IP header in the packet */
    jnx_gw_data_vrf_stat_t     *ing_vrf_entry; /* Pointer to the ingress VRF entry */
    jnx_gw_data_vrf_stat_t     *eg_vrf_entry;  /* Pointer to the egress VRF entry */
    jnx_gw_gre_key_hash_t       gre_key_info;  /* GRE tunnel key */
    jnx_gw_data_ipip_sub_tunnel_key_hash_t ipip_sub_key_info; /* IP-IP sub tunnel key */
    jnx_gw_gre_encap_header_t  *ip_gre_hdr;  /* Pointer to the outer IP & GRE header */
    jnx_gw_ipip_encap_header_t *ipip_hdr;    /* Pointer to the outer IP-IP header */
    jnx_gw_stat_type_t          stat_type;   /* Stat type to be incremented */
} jnx_gw_pkt_proc_ctxt_t;

...

if (pkt_ctxt->gre_tunnel->tunnel_type == JNX_GW_TUNNEL_TYPE_IPIP) {

    /*
     * We have a pre-computed IP header in the IP-IP tunnel
     * to be prepended to the packet.
     */
    pkt_ctxt->pkt_buf = jbuf_prepend(pkt_ctxt->pkt_buf,
                                     sizeof(struct ip));

    /* Copy the outer IP header to the packet start. */
    jbuf_copy_from_buf(pkt_ctxt->pkt_buf, 0, sizeof(struct ip),
                       (char *)&pkt_ctxt->gre_tunnel->ip_hdr);
Removing a header or trailing encapsulation fields can be done using the jbuf_adj()
function, which adjusts a jbuf's total length by the number of bytes you specify. The following code from the sample application removes the outer IP header and GRE header from the packet and then passes the packet to another function to process the IP layer.
decap_len
in the following code is calculated as follows when the GRE packet is processed:
decap_len = (pkt_ctxt->ip_hdr->ip_hl << 2) + GRE_FIELD_WIDTH;
jbuf_adj(pkt_ctxt->pkt_buf, decap_len);

/*
 * Get the information about the egress VRF and how to encapsulate
 * the received packet.
 */
pkt_ctxt->eg_vrf_entry = pkt_ctxt->gre_tunnel->eg_vrf_stat;

/* Get the pointer to the inner IP header. */
pkt_ctxt->ip_hdr = jbuf_to_d(pkt_ctxt->pkt_buf, typeof(pkt_ctxt->ip_hdr));

/* Do the processing for the IP layer. */
if (jnx_gw_data_process_ip_packet(pkt_ctxt, TRUE) == JNX_GW_DATA_DROP_PKT)
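To unpack the decap_len arithmetic: ip_hl holds the IP header length in 32-bit words, so shifting it left by 2 multiplies by 4 to get bytes, and GRE_FIELD_WIDTH covers the GRE fields being stripped. A minimal sketch, assuming a 4-byte base GRE header; the mock macro and function names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/* Base GRE header: flags + protocol, 4 bytes (assumed value for
 * illustration; the real constant is GRE_FIELD_WIDTH). */
#define MOCK_GRE_FIELD_WIDTH 4

/* Bytes to trim from the front of the packet: the outer IP header
 * (ip_hl 32-bit words, i.e. ip_hl << 2 bytes) plus the GRE fields. */
size_t
mock_decap_len(uint8_t ip_hl)
{
    return ((size_t)ip_hl << 2) + MOCK_GRE_FIELD_WIDTH;
}
```

For a minimal 20-byte outer IP header (ip_hl of 5), this yields 24 bytes to remove.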
After calls to jbuf_prepend() and jbuf_copy_from_buf(), the packet data could be placed in physically discontiguous memory. You can use jbuf_pullup() or jbuf_pulldown() to make sure that your memory references cover the entire length of the data to be read in the same jbuf. A sample function call is
pkt_ctxt->pkt_jbuf = jbuf_pullup (pkt_ctxt->pkt_jbuf, pkt_ctxt->pkt_jbuf->jb_total_len);
It is then safe to typecast the memory area of such a length by calling jbuf_to_d()
, as the following sample illustrates.
jb = pkt_ctxt->pkt_jbuf;
if (jb->jb_len < sizeof(struct ip)) {
    if ((jb = jbuf_pullup(jb, sizeof(struct ip))) == NULL) {
        syslog(LOG_EMERG, "jbuf_pullup returned NULL\n");
    }
}
pip = jbuf_to_d(jb, struct ip *);
printf("SIP=0x%x -> DIP=0x%x\n", pip->ip_src.s_addr, pip->ip_dst.s_addr);
printf("LEN=0x%x TOS=0x%x\n", pip->ip_len, pip->ip_tos);
printf("ID=0x%x FRGOFF=0x%x\n", pip->ip_id, pip->ip_off);
printf("TTL=0x%x\n", pip->ip_ttl);
To control where a packet is transmitted, set the jb_xmit_subunit field. The jb_xmit_subunit field is overloaded: applications can specify either the outgoing VRF or the subunit.
If you do not know the value to set, use the functions in libvrfutil (defined in sandbox/src/lib/libvrfutil/h/jnx/vrf_util_pub.h) as follows:
- vrf_getindexbyvrfname() to return the VRF index according to the interface name or the VRF name
- jbuf_setvrf() to set the value that is returned

The kernel will use the first available subunit in the VRF for transmission. If you know the subunit, use code like the following:
jb->jb_flags &= ~JBUF_FLAG_XMIT_VRF;
jb->jb_xmit_subunit = xmit_subunit;
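The flag arithmetic can be sketched in isolation: the JBUF_FLAG_XMIT_VRF bit selects whether jb_xmit_subunit is read as a subunit number or as a VRF index. Only JBUF_FLAG_XMIT_VRF, jb_flags, and jb_xmit_subunit come from the source; the flag value, struct, and helper names below are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_JBUF_FLAG_XMIT_VRF 0x1 /* illustrative flag value */

/* Mock of the two header fields involved (illustration only). */
struct mock_hdr {
    uint16_t jb_flags;
    uint16_t jb_xmit_subunit; /* subunit, or VRF index if flag is set */
};

/* Clear the flag: jb_xmit_subunit is a subunit number. */
void
mock_set_subunit(struct mock_hdr *h, uint16_t subunit)
{
    h->jb_flags &= ~MOCK_JBUF_FLAG_XMIT_VRF;
    h->jb_xmit_subunit = subunit;
}

/* Set the flag: jb_xmit_subunit is a VRF index. */
void
mock_set_vrf(struct mock_hdr *h, uint16_t vrf)
{
    h->jb_flags |= MOCK_JBUF_FLAG_XMIT_VRF;
    h->jb_xmit_subunit = vrf;
}

bool
mock_is_vrf(const struct mock_hdr *h)
{
    return (h->jb_flags & MOCK_JBUF_FLAG_XMIT_VRF) != 0;
}
```

Note that &= ~FLAG clears only the one bit while leaving the other flag bits intact, which is why the subunit path uses it rather than assigning zero to jb_flags.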
The following examples show some configurations with and without known VRFs or subunits, and correlate them to your application code.
                  [ Routing Engine ]
                  [ Control MS PIC at ms-5/1/0 ]
IPv4 client ---> [ I/O MS PIC     Data MS PIC    I/O MS PIC ] ----> IPv4 server
                   at so-5/1/1    at ms-1/0/0    at so-5/1/2
The configuration is as follows:
user@host# show system
system {
    extensions {
        providers {
            jnx;
        }
    }
}
syslog {
    file {
        messages {
            any;
            debug;
        }
    }
}
[edit]
user@host# show chassis
fpc 1 {
    pic 0 {
        adaptive-services {
            service-package {
                extension-provider {
                    control-cores 1;
                    data-cores 7;
                    package jnx-test-ctrl;
                    package jnx-test-data;
                }
            }
        }
    }
}
fpc 5 {
    pic 1 {
        adaptive-services {
            service-package {
                extension-provider {
                    control-cores 7;
                    data-cores 1;
                    package jnx-test-ctrl;
                    package jnx-test-data;
                }
            }
        }
    }
}
[edit]
user@host# show interfaces
so-5/1/1 {
    unit 0 {
        family inet {
            address 10.10.10.10/24;
        }
    }
    sonet-options {
        fcs 32;
    }
}
so-5/1/2 {
    unit 0 {
        family inet {
            address 20.20.20.20/24;
        }
    }
    sonet-options {
        fcs 32;
    }
}
ms-1/0/0 {
    unit 0 {
        family inet;
    }
    unit 1 {
        family inet;
    }
}
ms-5/1/0 {
    unit 1 {
        family inet {
            address 5.5.5.20/24;
        }
    }
}
[edit]
user@host# show policy-options
policy-statement dummy {
    then reject;
}
The following are optional settings for this configuration:
user@host# show routing-options
static {
    route 30.31.32.33 {
        next-hop ms-1/0/0;
    }
    route 100.100.100.100 {
        next-hop so-5/1/1;
    }
}
This code forwards ingress packets in jbufs to a certain egress ms- interface subunit:
/* init xmit using current ms interface subunit number */
static inline void
jbuf_set_xmit_subunit (struct jbuf *jb, if_subunit_t msif_subunit)
{
    jb->jb_flags &= ~JBUF_FLAG_XMIT_VRF;
    jb->jb_xmit_subunit = msif_subunit;
}

A sample function call is

jbuf_set_xmit_subunit(jb, 0); /* remote client loopback */
                  [ Routing Engine ]
                  [ Control MS PIC at ms-5/1/0 ]
IPv4 client ---> [ I/O MS PIC     Data MS PIC    I/O MS PIC ] ---> IPv4 server
                   at so-5/1/1    at ms-1/0/0    at so-5/1/2
                   VRF vrf-1                     VRF vrf-2
The configuration is
user@host# show system
system {
    extensions {
        providers {
            jnx;
        }
    }
}
syslog {
    file {
        messages {
            any;
            debug;
        }
    }
}
[edit]
user@host# show chassis
fpc 1 {
    pic 0 {
        adaptive-services {
            service-package {
                extension-provider {
                    control-cores 1;
                    data-cores 7;
                    package jnx-test-ctrl;
                    package jnx-test-data;
                }
            }
        }
    }
}
fpc 5 {
    pic 1 {
        adaptive-services {
            service-package {
                extension-provider {
                    control-cores 7;
                    package jnx-test-ctrl;
                }
            }
        }
    }
}
[edit]
user@host# show interfaces
so-5/1/1 {
    unit 0 {
        family inet {
            address 10.10.10.10/24;
        }
    }
    sonet-options {
        fcs 32;
    }
}
so-5/1/2 {
    unit 0 {
        family inet {
            address 20.20.20.20/24;
        }
    }
    sonet-options {
        fcs 32;
    }
}
ms-1/0/0 {
    unit 0 {
        family inet;
    }
    unit 1 {
        family inet;
    }
}
ms-5/1/0 {
    unit 1 {
        family inet {
            address 5.5.5.20/24;
        }
    }
}
[edit]
user@host# show policy-options
policy-statement dummy {
    then reject;
}
In the following settings, the ingress routing instance is vrf-1
and the egress routing instance is vrf-2
. The final static route is optional.
user@host# show routing-instances
vrf-1 {
    instance-type vrf;
    interface ms-1/0/0.0;
    interface so-5/1/1.0;
    vrf-import dummy;
    vrf-export dummy;
    route-distinguisher 1:1;
    routing-options {
        static {
            route 5.5.5.5 next-hop ms-1/0/0.0;
            route 100.100.100.100 next-hop so-5/1/1;
        }
    }
}
vrf-2 {
    instance-type vrf;
    interface ms-1/0/0.1;
    interface so-5/1/0;
    vrf-export dummy;
    route-distinguisher 1:2;
}
In this example, the traffic received on any I/O PIC interface included in vrf-1 (so-5/1/1) is routed by performing a route lookup in the VRF routing table. Traffic is forwarded to the data PIC using optional static routes. Any routes in the inet.0 table are ignored.

Traffic that is received on any other I/O PIC interface not included in vrf-1 (so-5/1/0) is routed using a route lookup in the inet.0 table.

Routes are added to the inet.0 table in either of two ways:
set routing-options static route 100.100.100.100 next-hop so-5/1/1
/* init xmit using egress routing instance (vrf) with specific id */
static inline void
jbuf_setvrf (struct jbuf *jb, if_subunit_t vrf_id)
{
    jb->jb_flags |= JBUF_FLAG_XMIT_VRF;
    jb->jb_xmit_subunit = vrf_id;
}

A sample function call is as follows:

(void) jbuf_setvrf(jb, 0); /* for client remote loopback */
/*
 * init xmit using egress routing instance (vrf) uniquely identified
 * by specific vrf name and protocol family
 */
static inline void
jbuf_set_xmit_vrf_name (struct jbuf *jb, const char *vrfname,
                        const char *lrname, int af)
{
    jb->jb_flags |= JBUF_FLAG_XMIT_VRF;
    jb->jb_xmit_subunit = vrf_getindexbyvrfname(vrfname, lrname, af);
}

Sample function calls:

(void) jbuf_set_xmit_vrf_name(jb, "vrf-1", NULL, AF_INET); /* for client remote loopback */
(void) jbuf_set_xmit_vrf_name(jb, "vrf-2", NULL, AF_INET); /* for forwarding to the server */
/*
 * init xmit using egress routing instance (vrf) uniquely identified
 * by specific ifname and protocol family
 */
static inline void
jbuf_set_xmit_ifname (struct jbuf *jb, const char *ifname, int af)
{
    jb->jb_flags |= JBUF_FLAG_XMIT_VRF;
    jb->jb_xmit_subunit = vrf_getindexbyifname(ifname, af);
}

Sample function calls:

(void) jbuf_set_xmit_ifname(jb, "vrf-1", AF_INET); /* for client remote loopback */
(void) jbuf_set_xmit_ifname(jb, "vrf-2", AF_INET); /* for forwarding to the server */
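The name-to-index lookup behind vrf_getindexbyvrfname() can be sketched with a hypothetical static table. The table contents and the mock function are illustrative only; the real libvrfutil function consults the system's VRF state:

```c
#include <string.h>

/* Hypothetical name-to-index table standing in for system VRF state. */
struct mock_vrf_entry {
    const char *name;
    int         index;
};

static const struct mock_vrf_entry mock_vrf_table[] = {
    { "vrf-1", 5 },
    { "vrf-2", 6 },
};

/* Look up a VRF index by name, in the spirit of vrf_getindexbyvrfname();
 * returns -1 when the name is unknown. */
int
mock_getindexbyvrfname(const char *vrfname)
{
    size_t i;

    for (i = 0; i < sizeof(mock_vrf_table) / sizeof(mock_vrf_table[0]); i++) {
        if (strcmp(mock_vrf_table[i].name, vrfname) == 0)
            return mock_vrf_table[i].index;
    }
    return -1;
}
```

Whatever index the lookup returns is what ends up in jb_xmit_subunit with JBUF_FLAG_XMIT_VRF set, so a failed lookup should be checked before transmitting.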
Outgoing packets must respect the limits defined by MAX_JBUFS_PER_PACKET and JBUF_MAX_TOTAL_LEN. It could be necessary to call the jbuf_defrag() and jbuf_trim() functions to meet these requirements on egress: jbuf_defrag() compresses the jbuf data in the chain so that it fits in the least possible number of jbufs, and jbuf_trim() removes zero-length jbufs from a jbuf chain.

A local pool of jbufs backs the jbuf_get_chain() and jbuf_free() operations, as well as other jbuf calls that indirectly allocate jbufs. The system could also allocate from this local jbuf pool dynamically (no limits at this time) for IP fragmentation, IP defragmentation, or when using internal system plugins. Therefore, you should actively monitor usage of jbufs to ensure that enough jbufs are available for your monitoring or transit application.
Applications can track jbuf usage per CPU according to the configuration (pools are shared equally among the allocated data CPUs). jbufs are evenly distributed to data CPUs as follows:
Number of cores    Number of jbufs per data CPU
---------------    ----------------------------
       1           2500 (7500/3)
       2           1250 (7500/6)
       3            834 (7506/9)
       4            625 (7500/12)
       5            500 (7500/15)
       6            417 (7506/18)
       7            358 (7518/21)
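The table values are consistent with a fixed pool of roughly 7500 jbufs divided among three data CPUs per core, so each data CPU gets about 2500/n jbufs for n cores, rounded up (for example, 3 cores: 2500/3 = 833.3, rounded up to 834). The formula below is inferred from the table, not an SDK function:

```c
/* Jbufs per data CPU for a given number of cores, inferred from the
 * distribution table above: ceil(2500 / cores), computed with the
 * integer round-up idiom (a + b - 1) / b. */
int
mock_jbufs_per_data_cpu(int cores)
{
    return (2500 + cores - 1) / cores;
}
```

The slight overshoot in the totals (7506, 7518) is exactly this rounding up multiplied back out by the number of data CPUs.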
The debug CLI ( mspdbg-cli ) also indicates the total number of jbufs. For example:
MSP-DEBUG> show msp jbuf
Jbuf ref count    : 0
Local jbuf count  : 7518   <<< total number of jbufs
Jbuf max ref count: 6500
Allocation from a per-CPU pool enhances performance for the following operations, which are performed with no locking: allocating and freeing jbufs directly with jbuf_get_chain() and jbuf_free(), or indirectly via operations such as jbuf_cat(), jbuf_pulldown(), jbuf_defrag(), jbuf_prepend(), jbuf_dup(), jbuf_copy_from_buf(), or jbuf_copy_chain().

The following sample code determines the number of data CPUs and derives the maximum number of local jbufs per data CPU:
int
get_num_data_cpus (void)
{
    int num_dcpus = 0;
    int next_dcpu = MSP_NEXT_END;

    do {
        next_dcpu = msp_env_get_next_data_cpu(next_dcpu);
        printf("msp_env_get_next_data_cpu ret %d\n", next_dcpu);
        if (next_dcpu != MSP_NEXT_END) {
            num_dcpus++;
        }
    } while (next_dcpu != MSP_NEXT_END);

    return num_dcpus;
}

int
get_max_local_jbufs (void)
{
    int max = 0;
    int num_dcpus = 0;

    num_dcpus = get_num_data_cpus();    /* see previous function */
    max = jbuf_get_max_jbufs() / num_dcpus;
    return max;
}
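The sentinel-driven iteration above can be exercised against a mock of msp_env_get_next_data_cpu(): passing the end sentinel starts the walk, and the loop counts entries until the sentinel comes back. The mock CPU list, sentinel value, and function names are illustrative:

```c
#include <stddef.h>

/* Mock sentinel and data-CPU list standing in for the MSP environment. */
#define MOCK_NEXT_END (-1)

static const int mock_data_cpus[] = { 2, 3, 4 };

/* Return the data CPU after 'prev', or MOCK_NEXT_END when exhausted.
 * Passing MOCK_NEXT_END starts the iteration, mirroring the sentinel
 * protocol of msp_env_get_next_data_cpu(). */
int
mock_env_get_next_data_cpu(int prev)
{
    size_t i;

    if (prev == MOCK_NEXT_END)
        return mock_data_cpus[0];
    for (i = 0; i + 1 < sizeof(mock_data_cpus) / sizeof(mock_data_cpus[0]); i++) {
        if (mock_data_cpus[i] == prev)
            return mock_data_cpus[i + 1];
    }
    return MOCK_NEXT_END;
}

/* Same counting loop as get_num_data_cpus(), run against the mock. */
int
mock_num_data_cpus(void)
{
    int n = 0;
    int cpu = MOCK_NEXT_END;

    do {
        cpu = mock_env_get_next_data_cpu(cpu);
        if (cpu != MOCK_NEXT_END)
            n++;
    } while (cpu != MOCK_NEXT_END);
    return n;
}
```

With the count in hand, dividing the total jbuf count by it gives the per-CPU budget, as get_max_local_jbufs() does above.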