System Analyzer Tutorial 4D

How to use System Analyzer to view and analyze events from multiple cores

Timelines: Global time vs. CPU time

For multicore devices with multicore event correlation enabled, the timeline in all views is the ‘global time’, based on the global timestamp. Each CPU’s local timestamp is converted to the equivalent global time based on the last sync point event that was logged by that CPU and the clock frequency value(s) specified in the platform file (specified in the RTSC tab of the General Build Settings for the project). The Time column in the Log view shows the global time in nanoseconds.
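The conversion described above can be sketched in C as follows. This is only an illustrative model of the arithmetic, not System Analyzer's actual implementation: the SyncPoint struct and the function names are hypothetical, and it assumes that a sync point records the local and global timestamp counters at the same instant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of one sync point: both counters sampled together. */
typedef struct {
    uint64_t local_ticks;   /* local CPU timestamp at the sync point */
    uint64_t global_ticks;  /* global timestamp at the sync point    */
} SyncPoint;

/* Convert a tick count to nanoseconds for a counter running at freq_hz.
 * Whole seconds and the remainder are converted separately so that the
 * multiplication does not overflow for frequencies of a few GHz. */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return (ticks / freq_hz) * 1000000000ULL
         + ((ticks % freq_hz) * 1000000000ULL) / freq_hz;
}

/* Global time (ns) of an event stamped with local_ticks, based on the last
 * sync point and the clock frequencies taken from the platform file. */
uint64_t local_to_global_ns(const SyncPoint *sp, uint64_t local_ticks,
                            uint64_t local_hz, uint64_t global_hz)
{
    assert(local_ticks >= sp->local_ticks);
    uint64_t sync_ns    = ticks_to_ns(sp->global_ticks, global_hz);
    uint64_t elapsed_ns = ticks_to_ns(local_ticks - sp->local_ticks, local_hz);
    return sync_ns + elapsed_ns;
}
```

In this model, any error in local_hz grows linearly with the time elapsed since the last sync point, which is why the accuracy of the platform clock frequency matters.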

Ensure that your Platform clock frequency settings are accurate

In order for events from multiple cores to be properly correlated in time, it is critical that the clock frequency values specified in the platform file be as accurate as possible. This is especially important for heterogeneous multicore devices (e.g. DM816x, OMAP, etc.) where the CPUs run at different clock rates.

The attached project (Tutorial4D_measureCPUfreq.zip) can be used to measure the local CPU timestamp frequency relative to the global timestamp frequency. The zip file contains two files: the application C code (main.c) and a generic .cfg file (measureCPUfreq.cfg). To build and run the project in CCStudio for the CPU whose timestamp frequency you wish to check:

  • File / New / CCS Project
  • Provide a project name
  • Device: select the Device Family and Variant for your CPU
  • Project templates and Examples: select Empty RTSC Project
  • Click Next
  • Deselect all Products and Repositories except for SYS/BIOS and System Analyzer (UIA Target)
  • Select the platform file for your device
  • Click Finish
  • Unzip Tutorial4D_measureCPUfreq.zip and copy the two files into your project
  • Right click on the project and select Build Configurations / Set Active / Release
  • Build the project
  • Launch a debug session for your device
  • Load the .out file from your project and run it
  • The measured CPU frequency will be displayed in the console (screenshot: MeasureCPUFreq.gif)

If the "Provisioned CPU Timestamp Frequency" differs from the "Measured CPU Timestamp Frequency" by more than 0.001%, you should create a custom platform file specifying the measured CPU clock frequency ( = cpuCyclesPerTick * Measured CPU Timestamp Frequency). To do this in CCStudio:

  • File / New / Other / RTSC / New RTSC Platform
  • Enter a package name (e.g. myBoard)
  • Specify the Platform Package Repository folder you wish to use, ensuring that it points to a folder named 'packages' (e.g. C:\Users\<your user folder>\myRepository\packages)
  • Select the Device Family and Device Name for your device
  • Click Next
  • Click the Import button and select the platform file you normally use; this copies all of the memory configuration options from the selected platform file
  • Enter the Clock Speed (MHz) value shown in the console when you ran the measureCPUFreq project
  • Click Finish; this should build the platform file automatically

You can then use this for your project's platform file as follows:

  • In the Project Explorer, right click on the project and select Show Build Settings
  • In the left pane, select General, and then click on the RTSC tab
  • Click the Add... button and browse to the Platform Package Repository folder in which you stored the platform file (make sure the path name ends in 'packages')
  • Click on the Platform: field and select your custom platform from the drop-down list that is displayed
  • Click OK
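The 0.001% check and the clock speed calculation used in this section can be sketched in C as follows. This is a minimal sketch, not the code from main.c in the zip file: the function names are hypothetical, and it assumes that the percentage difference is taken relative to the measured frequency.

```c
#include <assert.h>

/* Difference between the provisioned and measured timestamp frequencies,
 * as a percentage of the measured value (assumed reference). */
double freq_error_percent(double provisioned_hz, double measured_hz)
{
    assert(measured_hz > 0.0);
    double diff = provisioned_hz - measured_hz;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff / measured_hz * 100.0;
}

/* Returns 1 if the mismatch exceeds the 0.001% threshold from the text. */
int needs_custom_platform(double provisioned_hz, double measured_hz)
{
    return freq_error_percent(provisioned_hz, measured_hz) > 0.001;
}

/* Clock Speed (MHz) to enter in the platform wizard:
 * cpuCyclesPerTick * Measured CPU Timestamp Frequency, converted to MHz. */
double cpu_clock_mhz(double cpu_cycles_per_tick, double measured_timestamp_hz)
{
    return cpu_cycles_per_tick * measured_timestamp_hz / 1.0e6;
}
```

For example, a CPU provisioned at 1 GHz that measures 1.00002 GHz is off by about 0.002%, so a custom platform file would be warranted under this check.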

Working with multiple System Analyzer Views

Group and Arrange System Analyzer views

It's a good idea to arrange the System Analyzer views that you are working with so that you can see all of the ones of interest at the same time. To do this, click on the tab of the view and drag it to a new position in the IDE. While the mouse button is held down, a grey rectangle will show where the view will be docked if you release the mouse button at that moment (screenshot: DragViews.gif). Move the mouse with the mouse button held down until the grey rectangle appears over the position or 'tab group' that you wish to dock the view to, then release the mouse button to complete moving the view to its new location.

Clicking on the 'group' button for all of the views that you are working with allows them to automatically scroll in sync with each other, so that the start or end of each view displays the same global time. Clicking on an event in one view will highlight the same event in another view if the event exists in that view. E.g. clicking on a task switch event in the Log view will cause the cursor to be displayed at that event in the Execution Graph.
NOTE: If the event is not displayed by the view, the cursor will not be updated. E.g. clicking on a CPU load event in the Log view will not cause the cursor to move in the Execution Graph.

Tips for Improving CCStudio Performance

If your multicore device has many cores and several different .out files loaded, the overhead involved can cause CCStudio to slow down. In some cases, this puts a lot of pressure on the JVM heap, which can cause heavy paging of memory to disk. In the extreme case, CCS will report JVM Heap Low warnings.

You can display how much JVM heap space is being used in the CCS status bar by doing the following:

  • Window / Preferences / General
  • check the "Show heap status" checkbox

The JVM heap usage will be displayed in the bottom right corner of CCS (screenshot: HeapUsage.gif). Clicking on the garbage can icon will force garbage collection to run, which may in some cases free up memory.

Increase your JVM Heap Size

You can increase the size of the JVM heap that is available by editing the CCStudio.ini file located in the ccsv5/eclipse folder, changing -Xmx384m to -Xmx512m and -Xms40m to -Xms80m, as shown below:

-startup
plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
-product
com.ti.ccstudio.branding.product
...
-Xms80m
-Xmx512m
...

Close System Analyzer before Reloading .out Files

When working with multicore devices, it is recommended that you close System Analyzer before loading a new .out file. This avoids a number of problems, including memory management issues associated with symbol loading and symbol management. NOTE: Stopping and then restarting System Analyzer is not sufficient. You need to close System Analyzer, e.g. by clicking on the white X in the tab of the Live Sessions view (screenshot: CloseSystemAnalyzer.gif).

Event Correlation

Breakpoints and Console I/O cause a loss of synchronization between the local and global timestamps

Both CCS breakpoints and C I/O (CIO) operations cause a loss of synchronization between the local and global timestamps. If you call stdio.h's printf API, a hidden breakpoint is executed in order to notify CCS that it needs to update the console. Unfortunately, System Analyzer is not currently notified of these breakpoints, so it cannot tell the idleHook function that a new sync point event needs to be logged. You will need to explicitly log a sync point event after you call printf in order to establish a new sync point. Only events that occur AFTER both the breakpoint and the new sync point event can be correlated.

Use Log_print instead of stdio's printf or System_printf

A System_printf call is handled differently than a normal stdio printf call. If you use SysMin = xdc.useModule('xdc.runtime.SysMin'); in your .cfg file and do not call SysMin_flush(), then the console will only be updated when you exit your application and no breakpoints will be executed. If you use SysStd = xdc.useModule('xdc.runtime.SysStd'); or call SysMin_flush(), then a breakpoint will be executed in order to update the console and you will lose synchronization between the local CPU timestamp and the global timestamp. Please see this post for more info.

If you call Log_print (Log_print0 through Log_print6) in your code, it logs a normal UIA event that System Analyzer can receive and display. No breakpoints are involved, so there is no loss of sync. This is the preferred way to output print statements from your application, as it is thread safe and does not disturb your real-time performance.

For JTAG transports, ensure that the Idle task calls LogSync_idleHook

Whenever the target halts due to a breakpoint or whenever System Analyzer first starts capturing events, CCStudio updates a global variable in target memory (ti_uia_runtime_LogSync_gNumTimesHalted) that the LogSync module's LogSync_isSyncEventRequired() function can read to determine whether it needs to log a sync point event. To configure your code to automatically call this function when it goes idle, add the following to your application's .cfg file:

var Idle = xdc.useModule('ti.sysbios.knl.Idle');
Idle.addFunc('&ti_uia_runtime_LogSync_idleHook');

This allows System Analyzer to tell your application when to log a sync point event so that event correlation can be re-established (e.g. after a breakpoint has occurred or after System Analyzer has stopped and started, as discussed below).
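The handshake described above can be modelled in plain C as shown below. This is a host-side sketch of the idea only, not the LogSync module's source: all names are hypothetical stand-ins, with num_times_halted playing the role of ti_uia_runtime_LogSync_gNumTimesHalted and log_sync_point() standing in for whatever actually writes the sync point event.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the global variable that the debugger updates in target
 * memory each time the target halts or event capture starts. */
volatile uint32_t num_times_halted = 0;

/* Private state of this model. */
static uint32_t last_seen_halt_count = 0;
static uint32_t sync_points_logged = 0;

/* Placeholder for logging a sync point event; here it only counts calls. */
void log_sync_point(void)
{
    sync_points_logged++;
}

/* Idle hook: log a new sync point only if the halt counter has changed
 * since the last time it was checked. */
void idle_hook(void)
{
    uint32_t halted = num_times_halted;  /* read the debugger-written value once */
    if (halted != last_seen_halt_count) {
        last_seen_halt_count = halted;
        log_sync_point();
    }
}

/* Accessor so that callers can observe how many sync points were logged. */
uint32_t sync_points_logged_count(void)
{
    return sync_points_logged;
}
```

Because the hook only compares a counter, calling it on every pass through the Idle loop is cheap, and a sync point is logged once per halt rather than continuously.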

After a breakpoint is hit, Stop System Analyzer and then Start it again, then run the target

Whenever the target is halted, the CPU timestamp halts while the global timestamp keeps running. This breaks the synchronization between the local and global timestamps that the last sync point event reported to System Analyzer. In order to tell System Analyzer about the new timing relationship between the local and global timestamps for that CPU, a new sync point event needs to be logged, and System Analyzer needs to update its timelines with this new timing information. With the Idle hook function described above, you can do this by clicking the Stop button in the Live Session Logs view (screenshot: StopButton.gif), and then clicking the button again after it has changed into a Start button (screenshot: StartButton.gif).

How to skip waiting for Sync Point Events

If you are in a situation where a 'Waiting for Sync Points' message is displayed in the status bar and you want to view the events even though not all cores have logged a sync point event, you can skip waiting for the sync points by right-clicking on the Live Session Logs view and selecting Live Session / Skip sync points for correlation (screenshot: SkipSyncPoints.gif).
This will allow the events to be displayed in the Logs view, but the time values shown in the Time column will be based on the local CPU timestamps instead of the global timestamps for all cores that have not logged a sync point event. For cores that have logged a sync point event, the Time column will contain the global time. The ability to skip sync point events should only be used in situations where you know the target is not behaving normally and you are more concerned with looking at raw events and do not need events to be correlated across cores.

Using Stop-mode logging and Sync Groups

Event drop-outs can be caused by either bandwidth limitations or buffer-size limitations or, in many cases, by a combination of the two. In order to debug multicore interactions or concurrent operations it is often necessary to be able to see what is going on in all cores over a period of time, and event loss from one or more cores during this time window may render whatever events were captured useless. If it is not possible to simply reduce the number of events that are being logged (e.g. by setting LoggingSetup.sysbiosTaskLogging = false in the .cfg file), it is often best to configure each core to log events with stop-mode logging, and to use large buffers to capture as many events as possible. Here’s an example of a .cfg file that shows how:

// Assign the logger buffer to some large chunk of memory
Program.sectMap[".loggerBuffers"]       = "MSMCSRAM_MASTER";
var LoggerStopMode = xdc.useModule('ti.uia.runtime.LoggerStopMode');
var LoggerStopModeParams = new LoggerStopMode.Params();
LoggerStopModeParams.transferBufSize = 64*1024;
// If the memory that the logger is in is shared memory, specify how many
// cores are sharing the memory
//LoggerStopModeParams.numCores = 8;
/* specify the memory section to locate the logger buffer in */
LoggerStopModeParams.bufSection = '.loggerBuffers';
var loggerTask = LoggerStopMode.create(LoggerStopModeParams);
loggerTask.instance.name = "SYSBIOS System Logger";  // Prior to CCSv6, some System Analyzer views look for this name 
var LoggingSetup = xdc.useModule('ti.uia.sysbios.LoggingSetup');
// Assign the logger instance we created for use logging SysBios task switch events
LoggingSetup.sysbiosLogger = loggerTask;
// NOTE: when assigning a logger to LoggingSetup, the size of the logger
// is no longer controlled by LoggingSetup.  To make this clear, the
// following line is commented out. 
//LoggingSetup.sysbiosLoggerSize = 1024;  

// Allow LoggingSetup to manage all of the other loggers.
// specify the size of the loggers we wish to use or disable logging if it is not required.
LoggingSetup.mainLoggerSize = 32*1024;
LoggingSetup.loadLogging = false;
LoggingSetup.sysbiosTaskLogging = true;
// Change the eventUploadMode to JTAGSTOPMODE:
//LoggingSetup.eventUploadMode = LoggingSetup.UploadMode_JTAGRUNMODE;
LoggingSetup.eventUploadMode = LoggingSetup.UploadMode_JTAGSTOPMODE;

Note: When using Stop-mode logging, System Analyzer will not show any events until the CPU has been halted (either by clicking the Debug View's Halt button or by a breakpoint being executed).
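When choosing a buffer size, it can help to estimate how much history a stop-mode buffer holds before it fills. The sketch below is a rough back-of-the-envelope model, not a UIA API: the function names are hypothetical, and the per-event size and event rate are assumptions that you would need to determine for your own application (it also ignores any header overhead in the buffer).

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole events that fit in a logger buffer, assuming every
 * event occupies event_bytes (an assumed, fixed size). */
uint32_t events_that_fit(uint32_t buf_bytes, uint32_t event_bytes)
{
    assert(event_bytes > 0);
    return buf_bytes / event_bytes;
}

/* Approximate length of the time window (in microseconds) that the buffer
 * can hold when events are logged at a steady events_per_sec. */
uint64_t capture_window_us(uint32_t buf_bytes, uint32_t event_bytes,
                           uint32_t events_per_sec)
{
    assert(events_per_sec > 0);
    return (uint64_t)events_that_fit(buf_bytes, event_bytes) * 1000000ULL
           / events_per_sec;
}
```

For example, under the assumption of 32-byte events logged at 10,000 events per second, a 64*1024 byte buffer would hold 2048 events, or roughly 205 ms of history.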

When using Stop-mode logging, it is often useful to group all of the cores in the device into a 'sync group', so that whenever one core in the group halts or runs, the rest of the cores in the group will also halt or run. To do this, select all of the cores in the Debug View, right click and select "Sync Group Cores". The cores will then be collected under a top-level entry (typically named Group 1 (Synchronous)). With this node in the Debug View selected, you can apply actions to all of the cores at once (e.g. connect / disconnect, load a program, run, halt). If you set a breakpoint in one core, all of the cores will be halted when it hits. System Analyzer will then be able to read the events from all of the stop-mode loggers in all of the cores and provide a clean, unbroken view of the events leading up to the breakpoint. Once you have finished analyzing the events, be sure to stop System Analyzer and then start it again as described previously so that event correlation can be re-established.

One important consideration is that, if your program code is shared between multiple cores, you should ONLY use hardware breakpoints. Software breakpoints are not reliable in this case, as multiple cores may hit the breakpoint while the debugger is in the process of replacing the breakpoint opcode with the original opcode in order to process the breakpoint for another core in the device. Hardware breakpoints tell the debugger explicitly which core hit the breakpoint and avoid all of the potential problems associated with software breakpoints in shared memory.
