Oprofile User's Guide

From Texas Instruments Wiki

Supported Platforms

Oprofile is supported on all ARM devices.

Oprofile Overview

  • OProfile is a statistical continuous profiler. In other words, profiles are generated by regularly sampling the current registers on each CPU (from an interrupt handler, the saved PC value at the time of interrupt is stored), and converting that runtime PC value into something meaningful to the programmer.
  • Regularly sampling the PC value like this approximates what actually was executed and how often - more often than not, this statistical approximation is good enough to reflect reality. In common operation, the time between each sample interrupt is regulated by a fixed number of clock cycles. This implies that the results will reflect where the CPU is spending the most time; this is obviously a very useful information source for performance analysis.
  • Taken verbatim from http://oprofile.sourceforge.net/doc/internals/introduction.html
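
To make the sampling idea concrete, here is a toy shell model. It is not how OProfile is implemented. It assumes a hypothetical workload in which every 10-cycle loop spends 9 cycles in busy_wait and 1 cycle in do_work, and it records which function the simulated PC is in once every 7 cycles. The function names, cycle counts, and sampling interval are all illustrative assumptions.

```shell
#!/bin/sh
# Toy model of timer-based sampling (illustrative only, not OProfile code).
# Assumed workload: in each 10-cycle loop, cycles 0-8 run busy_wait and
# cycle 9 runs do_work.
simulate_sampling() {
    total_cycles=$1   # length of the simulated run, in cycles
    interval=$2       # cycles between two sample "interrupts"
    busy=0
    work=0
    t=0
    while [ "$t" -lt "$total_cycles" ]; do
        # Look up which function the simulated PC is in at cycle t.
        if [ $((t % 10)) -lt 9 ]; then
            busy=$((busy + 1))
        else
            work=$((work + 1))
        fi
        t=$((t + interval))
    done
    echo "busy_wait $busy"
    echo "do_work $work"
}

simulate_sampling 1000 7
```

With these assumed numbers the run takes 143 samples, 129 of them in busy_wait (about 90%). That is close to the true 9:1 split even though only one cycle in seven is observed.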

This guide covers only a small set of examples showing how Oprofile can be used. For more information, see http://oprofile.sourceforge.net/

Oprofile Limitations

Oprofile can use the ARM hardware performance counters, but because of a bug in the Cortex-A8 this feature has been disabled. Only the internal timer mode is used for Oprofile.

Oprofile Applications Overview

The sample applications show how Oprofile can be used to discover where the CPU is spending the most time in an application. In this series of examples, a bottleneck is identified in a "signal_parent" application and a better solution is implemented, showing a dramatic difference in CPU utilization in an optimized version of the same application.

There are three types of applications in this Oprofile example.

  • Init - Initialize Oprofile to be able to use kernel debug information. This only needs to be done once.
  • Profile session - run an application and collect profiling data.
  • Reports - Run one of the variations of report generation.
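
The three steps can be sketched as shell functions built only from the opcontrol and opreport commands quoted in this guide. The function names (op_init, op_profile, op_report) are hypothetical and are not the runOp* scripts shipped in the SDK, so treat this as a minimal sketch under that assumption.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the commands quoted in this guide.
# They are NOT the SDK's runOp* scripts.

# Init: point Oprofile at the vmlinux that matches the running kernel.
op_init() {
    opcontrol --vmlinux="$1"
}

# Profile session: clear old samples, then sample while the command runs.
op_profile() {
    opcontrol --reset || return 1
    opcontrol --start || return 1
    "$@"
    status=$?
    opcontrol --stop
    return "$status"
}

# Report: summarize the samples collected in the last session.
op_report() {
    opreport
}

# Example use on the target (commented out so that loading this file has
# no side effects; assumes /boot/vmlinux* matches a single file):
#   op_init /boot/vmlinux*
#   op_profile signal_parent
#   op_report
```

op_profile returns the exit status of the profiled command, so a failing application is still visible to the caller after profiling has been stopped.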

The signal_parent application source code can be found in your SDK on your host at $(SDK_HOME)/ti-sdk-amxxx-evm-xx.xx.xx.xx/example-applications/oprofile_example. It is a single application that can be compiled two different ways: one in which the parent uses polling, and another in which the parent uses SIGNALS. Building it generates two executables:

  1. Debug/signal_parent - The polling version
  2. Debug1/signal_parent.opt - The optimized version which uses SIGNALS rather than polling.

The scripts used to support these apps are found in /usr/bin/ on your target file system. To list them:

ls /usr/bin/runOp*

Initial Boot Up

When you first boot a target system containing a Sitara Software Development Kit (SDK), Matrix should start automatically. Select the Oprofile button to bring up all the Oprofile applications.

[Figure: Matrix Oprofile Applications Page]

Applications Detailed Information

  • Oprofile Init - This command is required only the first time you run Oprofile or if you build and boot a new kernel. You need to point Oprofile to the vmlinux file that matches your kernel.
    • vmlinux is found by default in the /boot directory in your target file system. If you rebuild the kernel yourself, run this step again to point to the vmlinux from your build that corresponds to the kernel you are using.
      • If you use a vmlinux that was not built along with the kernel you are using, then the debug information will be incorrect.
    • Actual command:
opcontrol --vmlinux=/boot/vmlinux*
  • Run Signal Parent - Running this demo will profile a simple unoptimized application called "Signal Parent".
    • Running this application causes profile data to be stored internally. To view the data you must generate a report.
    • In this application a parent process forks a child process which does a little work and then signals the parent when it is complete.
    • The parent busy waits by constantly polling status waiting for the child to complete.
    • Actual commands:
opcontrol --reset
opcontrol --start
signal_parent
opcontrol --stop
  • Simple Report - This will generate a simple report based on the last application you profiled.
    • Notice below how signal_parent accounts for 86.2% of the total samples taken. Your numbers may vary depending on platform differences.
Profiling through timer interrupt
         TIMER:0|
 samples|      %|
------------------
     909 86.2429 signal_parent
     136 12.9032 vmlinux-3.2.0
       6  0.5693 ld-2.12.2.so
       1  0.0949 busybox
       1  0.0949 libc-2.12.2.so
       1  0.0949 thttpd
    • Actual command:
opreport
  • Run Signal Parent Optimized - This is the same application as "Run Signal Parent". However, after seeing how much time was wasted polling, it was redesigned: the parent now goes to sleep and waits for a signal from the child indicating that it has completed.
    • After running this application, you can run the Simple Report again to see that the signal_parent.opt application now accounts for only 0.38% of the samples.
Profiling through timer interrupt
         TIMER:0|
 samples|      %|
------------------
    1035 98.7595 vmlinux-3.2.0
       4  0.3817 ld-2.12.2.so
       4  0.3817 signal_parent.opt
       2  0.1908 busybox
       2  0.1908 libc-2.12.2.so
       1  0.0954 lighttpd
    • Actual commands:
opcontrol --reset
opcontrol --start
signal_parent.opt
opcontrol --stop
opreport
  • Additional reports - There are additional reports available to show the flexibility of the report tool. However, these variations are only a small subset of the capabilities of Oprofile.
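
The percentage column in the two reports above is each image's sample count divided by the total number of samples. As a rough check, the sketch below recomputes that column with awk from the counts in the first report. The function name sample_share and its three-column input layout are assumptions made for this illustration, not part of opreport.

```shell
#!/bin/sh
# Recompute each image's share of the samples from the raw counts.
# Assumed input lines: "<samples> <percent> <image name>".
sample_share() {
    awk '{ count[NR] = $1; name[NR] = $3; total += $1 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%s %.4f\n", name[i], 100 * count[i] / total
         }'
}

# Sample counts copied from the first report in this guide.
sample_share <<'EOF'
909 86.2429 signal_parent
136 12.9032 vmlinux-3.2.0
6 0.5693 ld-2.12.2.so
1 0.0949 busybox
1 0.0949 libc-2.12.2.so
1 0.0949 thttpd
EOF
```

The first output line is "signal_parent 86.2429", that is, 909 of 1054 samples. Feeding in the counts from the second report gives 4 of 1048 samples, or 0.3817%, for signal_parent.opt.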