Debugging an SDK Application on the Routing Engine

This topic discusses various tools you can use to debug your application on the Routing Engine.

Note:
For information on debugging plugins, see Debugging Plugins.

Using Log and Trace Files

Logging operations use a system logging mechanism similar to the UNIX syslogd utility to record systemwide, high-level operations, such as interfaces going up or down and users logging in to or out of the router. You configure these operations by using the syslog statement at the [edit system] hierarchy level, and by using the options statement at the [edit routing-options] hierarchy level.

Each system log message identifies the JUNOS software process that generated the message and briefly describes the operation or error that occurred.

JUNOS logs SDK messages as EXTERNAL. Log files are kept in /var/log.

To view a log from the command-line interface (CLI), enter the command show log logfile-name, where logfile-name is the log file specified as part of your application's configuration. For example, the following configuration specifies a log file named messages:

syslog {
    file messages {
        any debug;
    }
}

The log file name defaults to the name of the application, with the suffix d designating a daemon. For example, if the application name is jnx-example, the log file name is jnx-exampled. To view the log file, enter show log ext/jnx-exampled.

Logging Programmatically

SDK daemons can use the ERRMSG macro to log to syslog. Logging requires a log tag, log level, and a log message. The log tag is just a string displayed in the syslog file.

You can define a macro to provide your log tag automatically; for example:

#define LOG(_level, _fmt ...) ERRMSG("HELLOWORLD", (_level), _fmt)

A sample usage is:

LOG(LOG_ERR, "Setup heartbeat failed.");
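
The following sketch illustrates how a tag-supplying wrapper of this kind expands. It is not the SDK implementation: demo_errmsg() and DEMO_LOG are hypothetical stand-ins that format the tagged message into a buffer instead of sending it to syslog, so that the behavior can be checked off the router.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical stand-in for ERRMSG (not the SDK macro): writes
 * "TAG: message" into buf. Returns 0 on success, -1 if buf is too small
 * to hold even the tag prefix. */
static int
demo_errmsg(char *buf, size_t len, const char *tag, int level,
            const char *fmt, ...)
{
    va_list ap;
    size_t used;

    assert(buf != NULL && tag != NULL && fmt != NULL);
    (void)level;                 /* a real logger would pass this to syslog */
    used = strlen(tag) + 2;      /* tag plus ": " */
    if (used >= len)
        return -1;
    snprintf(buf, len, "%s: ", tag);
    va_start(ap, fmt);
    vsnprintf(buf + used, len - used, fmt, ap);
    va_end(ap);
    return 0;
}

/* Wrapper in the style of the LOG macro above, using C99 __VA_ARGS__.
 * buf must be an array so that sizeof(buf) yields its capacity. */
#define DEMO_LOG(buf, level, ...) \
    demo_errmsg((buf), sizeof(buf), "HELLOWORLD", (level), __VA_ARGS__)
```

With a local array such as char out[64], a call like DEMO_LOG(out, LOG_ERR, "Setup %s failed.", "heartbeat") leaves the tagged text in out. The C99 form is used here only because it is portable; the named-variadic form shown above is a GNU extension.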

The log level determines whether the log message will be written to the syslog file; the system outputs all levels at or above the configured level.

The logging functions are in libjuniper, in the logging.h header file, and are documented in the Library Reference. You first call the logging_set_mode() function to specify whether to log to stderr or to syslog. For logging to syslog, the system logging process syslogd determines what to log depending on its global configuration in the file /etc/syslog.conf. You can set values in this file using the CLI command set system syslog.

Once the mode and global configuration level have been specified, applications can call the logging_set_level() function to specify what messages are sent to syslogd. (The logging level that is set from the CLI always takes precedence.)

The log displays the provider prefix from your partner certification and the package ID associated with the application. In the following example, the provider prefix is sync and the package ID is sync-apps:

Sep 14 20:03:56  router helloworldd[1234]: HELLOWORLD: setup heartbeat failed [sync:sync-apps]

Logging Levels

Logging levels are shown in the following table:

Log Level    Meaning

Emergency    Panic condition that terminates the process
Alert        Immediate action required
Critical     Critical condition
Error        Error condition
Warning      Warning message
Notice       Condition that should be handled
Info         Informational message
Debug        Message for debugging purposes

When you call logging_set_level(), you specify one of the level constants shown in the following set of defines. The integer value associated with the constant determines the priority of the message: the lower the number, the higher the priority. Thus, if you set the log level to LOG_ERR, all the levels with an integer value below 3 are also logged.

#define	LOG_EMERG     0
#define	LOG_ALERT     1
#define	LOG_CRIT      2
#define	LOG_ERR       3
#define	LOG_WARNING   4
#define	LOG_NOTICE    5
#define	LOG_INFO      6
#define	LOG_DEBUG     7

The syslog process never writes the log messages below the priority level that is configured.
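
The filtering rule can be sketched with the standard constants from syslog.h. The helper name level_is_logged() is hypothetical and is not part of libjuniper; it only restates the rule that a message passes when its priority is at or above the configured one, which means its integer value is less than or equal to the configured value.

```c
#include <assert.h>
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical helper: returns true when a message at msg_level passes a
 * filter configured at configured_level. Lower numbers are more severe,
 * so "at or above the configured priority" is a numeric <= comparison. */
static bool
level_is_logged(int msg_level, int configured_level)
{
    assert(msg_level >= LOG_EMERG && msg_level <= LOG_DEBUG);
    assert(configured_level >= LOG_EMERG && configured_level <= LOG_DEBUG);
    return msg_level <= configured_level;
}
```

For example, with the level configured as LOG_ERR, a LOG_CRIT message passes the filter and a LOG_WARNING message does not.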

Tracing

Tracing operations provide specialized logging in separate files to record details about a daemon's operation, such as messages about the operation of routing protocols, the various types of routing protocol packets sent and received, and routing policy actions. Tracing is normally used to debug or gather details about the behavior of a single software process. Whereas logging is configured in the CLI for the entire platform, tracing is configured per application.

You configure tracing operations using the traceoptions statement. You can define tracing operations in different portions of the router configuration.

A sample configuration is:

routing-options {
    traceoptions {
        file routing.trace;
        flag all;
    }
}

For a more detailed description of how logging and tracing work in JUNOS and how to configure them, see the System Basics Configuration Guide (beginning with Tracing and Logging Operations) and the JUNOS Routing Protocols Configuration Guide.

JUNOS logs all trace messages relevant to an SDK-based daemon on the Routing Engine to /var/log/ext.

Programmatically, the RE SDK facilitates writing trace messages, creating DDL macros to add tracing options, and using functions to automatically read the traceoptions configuration. The tracing APIs are in libjunos-sdk, in the junos_trace.h header file, and are documented in the API Reference documentation.

Using gdb

This section describes how to use gdb, the GNU Debugger, to debug a running SDK daemon that was installed on the router in the usual manner as part of a package. It assumes that you use the VBE as your build environment and that you have access to both the VBE and the router running the daemon. It also assumes you are debugging remotely from a host that is not the router.

The first step in debugging a running daemon (called the target) takes place on the router. You will need access to the router console, which can be via an SSH connection.

Setting Up the Router for Debugging

First, you will need to ensure that your daemon is running. If it is not running, it must be enabled or configured.

Once the daemon is running, retrieve its process ID (PID) number. To find the PID of your daemon on the router, issue the following command in operational mode:

show system processes

You should see output like:

  PID  TT  STAT      TIME COMMAND
0 ?? DLs 0:00.00 (swapper)
1 ?? SLs 0:01.00 /packages/mnt/jbase/sbin/init --
2 ?? DL 0:00.00 (taskqueue)
3 ?? DL 0:00.69 (pagedaemon)
4 ?? DL 0:00.00 (vmdaemon)
5 ?? DL 0:08.06 (bufdaemon)
6 ?? DL 0:03.19 (vnlru)
7 ?? DL 0:38.21 (syncer)
8 ?? DL 0:00.00 (netdaemon)
9 ?? IL 0:00.05 (if_pfe_listen)
10 ?? DL 0:00.00 (scs_housekeeping)
11 ?? IL 0:00.00 (if_pic_listen)
12 ?? DL 0:00.00 (cb_poll)
13 ?? DL 0:03.82 (vmuncachedaemon)
14 ?? IL 0:00.00 (kern_dump_proc)
157 ?? SLs 0:02.76 mfs -o noauto /dev/ad1s1b /tmp (newfs)
164 ?? ILs 0:00.61 mfs -o noauto /dev/ad1s1b /mfs (newfs)
3010 ?? Is 0:00.00 pccardd
3088 ?? Ss 0:01.47 /usr/sbin/cron
3111 ?? S 0:02.14 /sbin/watchdog -t-1
3112 ?? I 0:00.06 /usr/sbin/tnetd -N
3114 ?? S 276:48.26 /usr/sbin/chassisd -N
3115 ?? S 0:57.96 /usr/sbin/alarmd -N
3116 ?? I 0:00.23 /usr/sbin/craftd -N
3117 ?? I 0:01.77 /usr/sbin/mgd -N
3119 ?? I 0:05.83 /usr/sbin/mib2d -N
3120 ?? S 5:00.63 /usr/sbin/rpd -N
3121 ?? S 4:12.44 /usr/sbin/l2ald -N
3122 ?? I 0:00.23 /usr/sbin/inetd -N
3123 ?? I< 0:00.05 /usr/sbin/apsd -N
.
.
.
3143 ?? S 0:01.57 /usr/sbin/lacpd -N
3144 ?? S 0:16.48 /usr/sbin/lfmd -N
3145 ?? I 0:00.09 /usr/sbin/ssd -N
3146 ?? I 0:00.02 /usr/sbin/lrmuxd
3147 ?? S 0:34.54 /usr/sbin/rpd -JLEDGE
3148 ?? S 0:33.06 /usr/sbin/rpd -JLTHMA1
3153 ?? S 0:47.05 /usr/sbin/xntpd -j -N -g (ntpd)
3154 ?? S 1:37.39 /sbin/dcd -N
3155 ?? S 0:20.69 /usr/sbin/snmpd -N
3282 ?? IL 0:03.01 (peer proxy)
3545 ?? SL 0:03.54 (peer proxy)
7843 ?? Is 0:00.56 telnetd
7846 ?? Is 0:02.47 mgd: (mgd) (jamesk)/dev/ttyp0 (mgd)
8639 ?? S 3:50.56 /bin/sh /sbin/cleanup-pkgs
12785 ?? Ss 0:01.14 telnetd
12788 ?? Ss 0:01.71 mgd: (mgd) (jamesk)/dev/ttyp1 (mgd)
13043 ?? I 0:21.23 /opt/sbin/acmeped -N
14037 ?? I 0:00.01 /opt/sbin/helloworldd -N
16814 ?? I 0:05.80 /opt/sbin/acmepsd -N
67049 ?? S 0:00.00 sleep 60
67050 ?? R 0:00.00 /bin/ps -ax
7844 p0 Is 0:00.03 login [pam] (login)
7845 p0 I 0:00.93 -cli (cli)
17258 p0 I 0:00.01 sh -c /bin/csh
17259 p0 I 0:00.05 /bin/csh
59853 p0 I+ 0:00.07 _su (csh)
12786 p1 Is+ 0:00.03 login [pam] (login)
12787 p1 S+ 0:02.45 -cli (cli)
3026 d0- S 1:01.00 /usr/sbin/eventd -N -r -s
3149 d0 Is+ 0:00.02 /usr/libexec/getty std.9600 ttyd0

For example, if your process is named helloworldd, the PID is 14037. You can recognize helloworldd as part of an SDK application package because it is installed in /opt/sbin/, as are all SDK daemons installed on the router.
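
If you capture this output on a host and want to pick out the PID automatically, the lookup can be sketched as follows. This is a minimal illustration that assumes the five-column layout shown above (PID, TT, STAT, TIME, COMMAND); find_pid() is a hypothetical helper and is not part of the SDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scans ps-style lines and returns the PID of the
 * first line whose command contains name, or -1 if no line matches. */
static long
find_pid(const char *ps_output, const char *name)
{
    const char *line = ps_output;

    assert(ps_output != NULL && name != NULL);
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[256];
        char cmd[200];
        long pid;

        if (len < sizeof(buf)) {
            memcpy(buf, line, len);
            buf[len] = '\0';
            /* Read the PID, skip TT, STAT and TIME, then capture the rest
             * of the line as the command. Header and "." lines fail here. */
            if (sscanf(buf, "%ld %*s %*s %*s %199[^\n]", &pid, cmd) == 2 &&
                strstr(cmd, name) != NULL)
                return pid;
        }
        line = end ? end + 1 : line + len;
    }
    return -1;
}
```

Matching on a substring of the command is a simplification; a name that is a prefix of another daemon's name would match the first of the two lines.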

Now that you know the PID of the process, you can attach a gdb debugging stub to it. gdbserver is a control program for Unix-like systems, which allows you to connect your program with a remote gdb instance without linking in the usual debugging stub. gdbserver does not require creating a custom stub, is lightweight, and is already installed on the router. (For more information about gdbserver, see the gdb online documentation.)

The steps to set up gdbserver on the router are quite simple, but you must have root access to bind gdbserver to your running daemon. You will also need to choose a TCP port number on which gdbserver will listen for connections from hosts wishing to debug the daemon.

First, start a shell on the router. Use the following command:

> start shell

Now become root.

Using the command template gdbserver host:port --attach pid, issue the following command to start gdbserver:

root@router% gdbserver localhost:23045 --attach 14037

In some cases, you might want to initiate your debugging session at startup instead of attaching to the running daemon. In this case, you can use the following command to start gdbserver:

root@router% gdbserver localhost:23045 /opt/sbin/helloworldd -N

This completes your setup on the router, but you should leave gdbserver running because it is listening for connections. The next sections show you how to connect to it to debug your daemon.

Setting Up Debugging on the VBE

This section explains how to use gdb to connect to the instance of gdbserver running on the router. You should have completed the previous section before proceeding with the following steps.

In the virtual build environment (VBE), place a build of the package containing the daemon to which gdbserver is attached. The daemon needs to be unstripped to contain symbol and debugging information.

Navigate to the sandbox where you installed the package and then to the obj-i386/sbin/daemon directory. For example:

cd ~/sandboxes/helloworld/obj-i386/sbin/helloworldd/

Now start up a gdb instance (as usual) using the name of the local copy of your program as the first argument. For example:

gdb helloworldd

You will now be in the gdb shell and you should see (gdb) as the prompt.

To connect gdb to the gdbserver instance on the router, issue the target remote ip:port command at this prompt, where ip is the IP address of the router and port is the port number that gdbserver is listening on. For example:

(gdb) target remote 10.227.7.124:23045

For more information on this gdb command and connecting to a remote target, see Using the gdbserver Program.

You have now started a debugging session with the remote target, your running daemon. From this point you can use gdb as usual, as if you had attached it to a process running locally.

Inserting a Breakpoint and Stepping Through Code

This discussion uses the helloworldd sample application, which you can find at sandbox/src/juniper/sdk/junos/examples/src/sbin/helloworldd. The output will be similar for your own daemons.

Insert the breakpoint from the host in gdb as follows:

(gdb) break helloworldd_show_messages

You'll see a confirmation like:

Breakpoint 1 at 0x8053378: file ../../../src/sbin/helloworldd/helloworldd_ui.c, line 73.

To reach the breakpoint in the code, open another terminal to the router, log in, and execute one of the configured CLI commands, such as:

show helloworld message

The message is already configured because the daemon is running, and you have gdbserver attached to it in another terminal window.

Notice that the message is not immediately printed out. Switch back to the VBE and gdb.

Enter the following command:

(gdb) continue

The daemon resumes, hits the breakpoint, and you see:

Continuing.
Breakpoint 1, helloworldd_show_messages (msp=0x80a2000, csb=0xbfbfe7f0,
unparsed=0x0) at ../../../src/sbin/helloworldd/helloworldd_ui.c:73
73 XML_OPEN(msp, ODCI_MESSAGES);

Now you can single step through the lines of executing code with the n (next) command; output appears as follows:

(gdb) n
75 for (data = first_message(); data != NULL; data = next_message(data)) {
(gdb) n
76 XML_ELT(msp, ODCI_MESSAGE, "%s", data->message);
(gdb) n
75 for (data = first_message(); data != NULL; data = next_message(data)) {
(gdb) n
79 XML_CLOSE(msp, ODCI_MESSAGES);
(gdb) continue

On your router terminal where you executed the command, you will see output like:

Message: Hello world

Now you've successfully stepped through the code and you can detach the gdb session with the detach command. If it doesn't work immediately, execute the same command again in the router window to show the Hello World message; when it appears, the VBE displays:

(gdb) detach
Ending remote debugging.

Debugging an SDK Application Using a Core Dump

This section explains how to examine a core dump file from an SDK daemon that terminated unexpectedly while it was running on the router. This typically happens because of a segmentation fault (your daemon tried to access a memory segment that did not belong to it).

First, you need to get the core file from the router. Core files are typically saved in /var/tmp on the Routing Engine.

You can also generate a core dump at any point of execution. If you set up debugging as described earlier, you can generate a core dump for your daemon at any point using the gcore [file] or generate-core-file [file] gdb command. The optional argument file specifies the file name in which to put the core dump. If not specified, the file name defaults to core.pid, where pid is the inferior process ID (the PID of your daemon).

When you have a core dump file, FTP it to the VBE. For example, if you are debugging helloworldd, you put the core file in obj-i386/sbin/helloworldd/ (in the corresponding sandbox).

Run ldd on your binary to find the shared libraries it uses. Copy these libraries from the Routing Engine or backing sandbox to /usr/lib on the VBE (or tell gdb where they are with the gdb command set solib-absolute-prefix).

Run gdb as follows in obj-i386/sbin/helloworldd/:

gdb helloworldd helloworldd.core.0

You should see output like:

Core was generated by `helloworldd'.
Program terminated with signal 11, Segmentation fault.
/usr/lib/libjunos-sdk.so.1: No such file or directory.
#0  0x881a6355 in ?? ()

At this point, you can use gdb to diagnose where the problem occurred. For example, if you insert data = NULL; at line 77 in helloworldd_ui.c, and then run the operational command to execute this code, the daemon will fail when it gets to the next line containing:

XML_ELT(msp, ODCI_MESSAGE, "%s", data->message);
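
To see why that assignment is fatal, consider the following self-contained sketch of the same loop pattern. The demo_msg structure and demo_show_messages() function are hypothetical; the real helloworldd data structures and XML macros may differ. The loop condition is the only thing that guarantees data is non-NULL when data->message is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical message list, standing in for the helloworldd data. */
struct demo_msg {
    const char      *message;
    struct demo_msg *next;
};

/* Writes each message into out, one per line, and returns the number of
 * messages written. */
static int
demo_show_messages(const struct demo_msg *head, char *out, size_t len)
{
    const struct demo_msg *data;
    size_t used = 0;
    int count = 0;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (data = head; data != NULL; data = data->next) {
        /* Setting data = NULL at this point would make the dereference
         * on the next line fault, producing a core dump. */
        int n = snprintf(out + used, len - used, "%s\n", data->message);

        if (n < 0 || (size_t)n >= len - used)
            break;               /* output buffer full; stop cleanly */
        used += (size_t)n;
        count++;
    }
    return count;
}
```

An empty list is handled by the same condition: the loop body never runs, so nothing is dereferenced.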

Typically, if you do not know the location of the failure, you can determine it by looking at the stack. To do this, issue the backtrace command and your output will look like this:

(gdb) backtrace
#0 0x881a6355 in ?? ()
#1 0x88171d41 in ?? ()
#2 0x80544cd in xml_send_all ()
#3 0x80533ac in helloworldd_show_messages (msp=0x80a2000, csb=0xbfbfe7f0,
unparsed=0x0) at ../../../src/sbin/helloworldd/helloworldd_ui.c:77
#4 0x8062439 in ms_parse_substring ()
#5 0x8062642 in ms_parse_line ()
#6 0x805e909 in ms_parser ()
#7 0x8064265 in msev_reader ()
#8 0x880b140b in ?? ()
#9 0x880b18c2 in ?? ()
#10 0x880b18ff in ?? ()
#11 0x880a8516 in ?? ()
#12 0x8053363 in main (argc=2, argv=0xbfbfee00)
at ../../../src/sbin/helloworldd/helloworldd_main.c:99

This shows that the problem occurred at line 77 of helloworldd_ui.c in the helloworldd_show_messages() function. You can also move to different levels in the stack trace, and examine their variables, using the gdb commands up and down. For example, if you issue the up command three times to get back into the helloworldd code, you see (on the last up):

(gdb) up
#3 0x80533ac in helloworldd_show_messages (msp=0x80a2000, csb=0xbfbfe7f0, unparsed=0x0)
at ../../../src/sbin/helloworldd/helloworldd_ui.c:76
76 XML_ELT(msp, ODCI_MESSAGE, "%s", data->message);

Additional details on using gdb are available in the gdb online documentation.

Adding Juniper Library (Shared Object) Files for More Complete Debugging

This section describes how to obtain and set up the Juniper libraries so that more symbols are found by gdb. You might want to set this up before debugging.

Looking at the stack trace output from the backtrace gdb command just described, you will notice that most of the stack frames are associated with functions where the name is not known. That is because the symbols haven't been loaded from the Juniper libraries containing those functions. This section explains how to set up your environment better so gdb finds all the function names.

First, FTP to a router and go to /usr/lib/. There you will see many .so (shared library) files. The files of particular interest are those used in the SDK. Copy the following files from this location to /usr/lib/ on the VBE:

libddl-access.so.1
libisc.so.2
libjipc.so.1
libjunos-sdk.so.1

You might want to copy only some of these (as needed). For example, if you find that gdb cannot find a specific library when you load a core dump file, you can retrieve that library from the router and copy it to the VBE.

After doing this, you should no longer see the question marks (??) in the gdb backtrace. For example, the output shown in the previous section would now look like this:

-bash2-2.05b$ gdb helloworldd helloworldd.core.0
...
Core was generated by `helloworldd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libjunos-sdk.so.1...done.
Reading symbols from /usr/lib/libisc.so.2...done.
Reading symbols from /usr/lib/libddl-access.so.1...done.
Reading symbols from /usr/lib/libkvm.so.2...done.
Reading symbols from /usr/lib/libutil.so.3...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0 0x881a6355 in vfprintf () from /usr/lib/libc.so.4
(gdb) bt
#0 0x881a6355 in vfprintf () from /usr/lib/libc.so.4
#1 0x88171d41 in innetgr () from /usr/lib/libc.so.4
#2 0x80544cd in xml_send_all ()
#3 0x80533ac in helloworldd_show_messages (msp=0x80a2000, csb=0xbfbfe7f0, unparsed=0x0)
at ../../../src/sbin/helloworldd/helloworldd_ui.c:76
#4 0x8062439 in ms_parse_substring ()
#5 0x8062642 in ms_parse_line ()
#6 0x805e909 in ms_parser ()
#7 0x8064265 in msev_reader ()
#8 0x880b140b in __evDispatch () from /usr/lib/libisc.so.2
#9 0x880b18c2 in __evMainLoopSync () from /usr/lib/libisc.so.2
#10 0x880b18ff in __evMainLoopSyncSighdl () from /usr/lib/libisc.so.2
#11 0x880a8516 in junos_daemon_init () from /usr/lib/libjunos-sdk.so.1
#12 0x8053363 in main (argc=2, argv=0xbfbfee00) at ../../../src/sbin/helloworldd/helloworldd_main.c:99

Notice that the function names are now supplied.

While in gdb, you can also use the gdb sharedlibrary command to see the libraries that are loaded and needed. An error appears if something is missing.

Debugging the SDK Service Daemon

For background on ssd, see ssd-section.

Basics on SSD Functionality

The SDK Service Daemon provides an interface that enables applications to manage the following routing functionality:

The following figure shows the interface between an SDK client, ssd, and the JUNOS Routing Protocol Daemon and kernel:

[Figure: libssd in Context]

The functionality shown here is as follows:

  1. The SDK client interacts with the SSD server using the APIs provided in libssd.

  2. The SSD server maintains the client list, and propagates the route addition, deletion, and management data to:

    1. The Routing Protocol Daemon (rpd), using the APIs in librpd, for simple routes.

    2. The JUNOS Kernel and rpd, for routes pointing to service PICs.

  3. rpd performs the actual route update and management.

In previous releases, ssd debugging was available in terms of the syslog messages printed for various failure scenarios from within the ssd code, with the various ERRMSG tags defined in the ssd.errmsg file.

This release adds the following enhancements:

Tracing for ssd

Trace options for ssd are based on the following areas of functionality:

The following example shows how to configure ssd traceoptions from the CLI:

system {
    processes {
        sdk-service {
            traceoptions {
                flag [all | ssd-infrastructure | ssd-server |
                      client-management | route-management |
                      nexthop-management];
                level <level>;
            }
        }
    }
}

Note:
For more information on GENCFG, see gencfg-intro.

Tracing in Application Code

To enable tracing levels for ssd, call junos_trace_level(), which is defined as follows:

void junos_trace_level(int msg_type, u_int32_t trace_level, const char *fmt, ...);

For msg_type, pass one of the following ssd values, which correspond to the flags in the configuration:

SSD_DBGSRC_INFRASTRUCTURE
SSD_DBGSRC_SERVER
SSD_DBGSRC_CLIENT
SSD_DBGSRC_ROUTE
SSD_DBGSRC_NEXTHOP
SSD_DBGSRC_ALL

trace_level represents one of the debugging levels (TRACE_LEVEL_ERR, TRACE_LEVEL_WARN, TRACE_LEVEL_INFO, TRACE_LEVEL_VERBOSE).

fmt represents the message string.

For more information, see the API Reference for junos_trace.h in libjunos-sdk.
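
The way a message type and trace level combine to decide whether a message is written can be sketched as follows. This is an illustration under stated assumptions, not the SDK implementation: demo_trace_level() mirrors the argument order of junos_trace_level() but takes an explicit configuration, the DEMO_* constants and their values are hypothetical, and the assumption that a lower level value means a more important message is the sketch's own.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical values; the real SSD_DBGSRC_* and TRACE_LEVEL_* constants
 * come from the SDK headers and may differ. */
enum demo_src {
    DEMO_SRC_SERVER = 1 << 0,
    DEMO_SRC_CLIENT = 1 << 1,
    DEMO_SRC_ROUTE  = 1 << 2,
    DEMO_SRC_ALL    = 0x7
};

enum demo_level {
    DEMO_LEVEL_ERR,
    DEMO_LEVEL_WARN,
    DEMO_LEVEL_INFO,
    DEMO_LEVEL_VERBOSE
};

struct demo_trace_cfg {
    int      enabled_srcs;   /* bitmask built from the configured flags */
    uint32_t max_level;      /* configured level */
    char     last[128];      /* last emitted message, kept for inspection */
};

/* Stand-in for a trace call: emits the message only if its source flag is
 * enabled and its level is within the configured level.
 * Returns 1 if the message was emitted, 0 if it was filtered out. */
static int
demo_trace_level(struct demo_trace_cfg *cfg, int msg_type,
                 uint32_t trace_level, const char *fmt, ...)
{
    va_list ap;

    assert(cfg != NULL && fmt != NULL);
    if ((cfg->enabled_srcs & msg_type) == 0 || trace_level > cfg->max_level)
        return 0;
    va_start(ap, fmt);
    vsnprintf(cfg->last, sizeof(cfg->last), fmt, ap);
    va_end(ap);
    return 1;
}
```

With a configuration that enables only the route source at warning level, an error-level route message is emitted, while a client message or a verbose route message is dropped.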

The trace messages for ssd are logged to /var/log.


2007-2009 Juniper Networks, Inc. All rights reserved. The information contained herein is confidential information of Juniper Networks, Inc., and may not be used, disclosed, distributed, modified, or copied without the prior written consent of Juniper Networks, Inc. in an express license. This information is subject to change by Juniper Networks, Inc. Juniper Networks, the Juniper Networks logo, and JUNOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. All other trademarks, service marks, registered trademarks, or registered service marks are the property of their respective owners.
Generated on Sun May 30 20:26:47 2010 for Juniper Networks Partner Solution Development Platform JUNOS SDK 10.2R1 by Doxygen 1.4.5