Multithreaded Debugging Made Easier by Forcing Core Dumps
From Texas Instruments Embedded Processors Wiki
Introduction
Many applications on DaVinci are multithreaded in nature to ensure that all system resources are used to their full potential. Unfortunately, traditional debugging methods for multithreaded programs are rarely successful. In fact, many of us who are savvy with gdb command-line debugging have been forced back into the caveman days of instrumenting our code with printf() statements due to poor support for multithreaded applications in our debugger of choice.
If you've ever been frustrated while trying to list the running threads in gdb or have been unable to successfully display a stack trace for any of them, you'll be happy to hear about my recent discovery: gdb's support for debugging multithreaded programs is much more reliable when you are debugging a core file.
Example Debugging Session
If your DaVinci platform has enough memory to run gdb, I wouldn't bother messing with gdbserver -- it is much easier to bring up gdb on the running target system. This is especially true if gdb needs to load symbols from a bunch of shared objects, or if you are debugging a shared object. In this example, I'll be running gdb directly on the target, but if you need to use gdbserver I would look at this page for more information.
Step 1: Make sure your system will generate core files
Because core files can be huge, many systems disable core file generation by default, so you may need to turn it on. First take a look at your current settings:
% ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) 32
max memory size       (kbytes, -m) unlimited
open files                    (-n) 1024
pipe size          (512 bytes, -p) 8
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 960
virtual memory        (kbytes, -v) unlimited
The above report shows that the core file size is set to 0, which means that core files will not be generated. You need to change this setting to "unlimited" as follows:
% ulimit -c unlimited
You should now see "unlimited" for the core file size if you run ulimit -a again.
Even if ulimit -a reports "unlimited" for core file size when your system first boots, I would probably go ahead and set it again anyway. I seem to remember once seeing a system that reported "unlimited" at boot time, but didn't generate a core file. When I set it again, it started getting the core files.
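The same limit can also be raised from inside the program, so it no longer depends on the shell that launched it. The following is a minimal sketch, assuming a POSIX system. The helper name enable_core_dumps is hypothetical and not part of the original example. It raises the soft core-file limit to the hard limit, which is the most an unprivileged process can request.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file limit to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
static int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise its soft limit only up to the
     * hard limit, so request exactly that. */
    lim.rlim_cur = lim.rlim_max;

    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

Calling enable_core_dumps() early in main() should have the same effect as running ulimit -c with the hard limit value. If the hard limit itself is 0, core files will still not be generated.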
Step 2: Forcing your program to core dump
You may be thinking "How do I get a program to core dump at a particular point in time?". The first thing that may come to your mind is to create a NULL pointer and dereference it, but I prefer to be a little more precise.
First, you need to make sure your program includes these header files:
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>
Then, at the point where you want to core dump, you send your program the "SIGSEGV" (segmentation fault) signal:
kill(getpid(), SIGSEGV);
You could also get fancy and only core dump when certain conditions are met:
ret = VIDDEC_process((VIDDEC_Handle) gdecoder->decoder,
                     &(gdecoder->inBufDesc),
                     &(gdecoder->outBufDesc),
                     &vidDecInArgs, &vidDecOutArgs);
if (ret < 0) kill(getpid(), SIGSEGV);
Once you've sabotaged your program in the right place, build it, run it, and let the segmentation fault happen:
Segmentation fault (core dumped) gst-launch-0.10 filesrc location=$1 !
tidemux_avi name=t ! queue max-size-buffers=60 ! gdecoder Codec=2 !
fbvideosink device=/dev/fb/3 t. ! queue max-size-buffers=180 ! adecoder Engine=1 !
osssink
Note that the segmentation fault reported "(core dumped)" because we had core files enabled.
Step 3: Using your core file in gdb
When your program core dumped, it created a file named "core.n", where n is a number indicating the process ID of your program when it was running. Next we will bring up gdb with this core file by specifying it on the command line after the program:
% gdb ./gst-launch-0.10 ./core.1831
Once gdb comes up, you can inspect your program at the point when the core dump occurred. Note that you cannot resume your program or use the "next" or "step" commands, because your program is no longer running.
If you now try to use gdb's commands for debugging threads, you'll be pleasantly surprised to see that they now seem to work as expected. First, let's list all the threads that were running when the program was terminated:
(gdb) info threads
  9 process 1833  0x401fc8ec in ?? () from /lib/tls/libpthread.so.0
  8 process 1834  0x402b7294 in ioctl () from /lib/tls/libc.so.6
  7 process 1835  0x40235ac4 in kill () from /lib/tls/libc.so.6
  6 process 1836  0x401fc8ec in ?? () from /lib/tls/libpthread.so.0
  5 process 1837  0x401fc8ec in ?? () from /lib/tls/libpthread.so.0
  4 process 1838  0x402004a4 in ?? () from /lib/tls/libpthread.so.0
  3 process 1839  0x401fcc18 in ?? () from /lib/tls/libpthread.so.0
  2 process 1840  0x401ff830 in ?? () from /lib/tls/libpthread.so.0
* 1 process 1831  0x402b5548 in poll () from /lib/tls/libc.so.6
Note that thread 7 is the thread that called "kill". We need to use the "thread" command to switch to that thread:
(gdb) thread 7
[Switching to thread 7 (process 1835)]#0 0x40235ac4 in kill () from /lib/tls/libc.so.6
We are now inside the "kill" function. We need to use the "up" command to go back to the function that called "kill" so we can examine the variables in that function:
(gdb) up
#1 0x407a83e8 in gst_gdecoder_decode (gdecoder=0x79000, outbuf=0x423431d4)
at gstgdecoder.c:634
if (ret < 0) kill(getpid(), SIGSEGV);
We can now dump variables in this function:
(gdb) set print pretty
(gdb) print vidDecOutArgs
$5 = {
  size = 156,
  extendedError = 1032,
  bytesConsumed = 544,
  decodedFrameType = 1,
  outputID = 646152,
  displayBufs = {
    numBufs = 1,
    width = 720,
    ...
As shown above, we are able to see the values of all the members of a large structure without coding up a million printfs!
Core Dumping on Ctrl-C
You may also have a situation where your program seems to hang and you want to inspect it while it's hanging. No problem -- just make your program core dump when you hit Ctrl-C.
First, write a function that will cause a core dump:
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>

static void core_dump(int sigid)
{
    kill(getpid(), SIGSEGV);
}
Next, override the SIGINT signal to call this function (SIGINT is the signal that gets sent when you hit Ctrl-C):
int main()
{
    signal(SIGINT, core_dump);
    ...
}
Now hit Ctrl-C when your program is running and get an instant core dump!
Note that the man page for the signal function states "Use of this function is unspecified in a multi-threaded process". However, it has worked for me so far -- at least for this purpose. If it works, great! We're just using this for debugging anyway.
Don Darling 18:04, 31 July 2008 (CDT)