Template:C6-Integra Software Design Debugging

Debugging
Debugging your code is an integral part of code development and a skill that takes time to develop. The goal of this section is not to make you an expert software debugger, but rather to help you understand the steps you can perform on your application code to determine the root cause of your issue. This can be especially helpful if you are using package dependencies (such as drivers) that were not developed by you.

This section is organized as a crash course into a collection of basic philosophies about debugging your applications on the C6-Integra/C6748 devices.

This page is dedicated to helping the software development team find the "root cause" of the broken system code and provide explicit and useful feedback to others who can assist you in solving your integration challenges.

Typical Software Activity on C6-Integra Devices
In most C6-Integra/C6748 applications, the software development will be split between controlling multiple CPUs, DMAs, and various peripherals. Often, code is developed for each individual portion (CPU Interrupt Service Routines, DMA Transfers, 3rd Party I/O peripheral drivers) by various team members and then collectively integrated into the final application. This is especially true of C6-Integra/C6748 devices running operating systems.

While splitting code among a development team has its benefits, such as allowing you to test individual segments of your code, it encourages individuals to write code that is beneficial for their particular segment rather than considering the impact on the entire application. As a result, integration of the individual segments often results in a non-working application.

Why?

The assumptions that one can make when testing an individual I/O, a DMA transaction, or a CPU ISR typically do not carry over to system integration. When this occurs, it is not always easy to determine which part is at fault, especially since each developer has independently verified and validated their segment of the code.

So now what....?

As mentioned above, individuals test their segment of the code assuming they have highest priority and full system bandwidth. In reality, however, code integration often introduces various levels of system arbitration (delays) among master/slave peripherals using the same bus, and possibly additional wait states until CPU bandwidth becomes available (stalls).

Debug Approach
Debugging code is commonly done by eliminating variables (one at a time if necessary) to assist you in determining where the error occurs.

As mentioned above, code developed for C6-Integra/C6748 devices often operates on several different pieces of hardware (CPUs, DMAs, and I/O peripherals). Each of these individual hardware blocks may be running at a different clock rate and may require servicing at a different frequency. These hardware pieces are often interdependent, relying on each other to perform a function (or set of functions) in a well-specified time window. When these events do not occur in time, the system will typically break and leave you with little (or no) output on the I/O or a DMA transfer that does not happen.

Check the Basics
This seems silly, but taking a step back from the specific problem you are facing can be very beneficial.

 Verify device clocking trees are configured correctly: C6-Integra/C6748 devices do not have the run-of-the-mill clocking structure of a simple 8-bit microcontroller. The hardware domains inside the C6-Integra/C6748 devices do not run off a single clock frequency, but rather a tree of related clock domains that are derived from the main system clock inputs. Additionally, some peripherals (USB, EMAC, McASP, RTC) are often clocked from the main system clock as well as from a second clock source driven by an external crystal.

Incorrect configuration of the device clock tree could result in missed servicing of peripherals (even though the peripheral may be configured correctly otherwise).

For an overview of your device-specific clocking structure, refer to your C6-Integra/C6748 device-specific Technical Reference Manual.
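As a rough aid for this check, the sketch below models a clock tree as an input clock passed through a pre-divider, a multiplier, and a post-divider, and then fanned out through per-domain dividers. The structure, the names (`pll_cfg_t`, `pll_out_hz`, `domain_clock_hz`), and the example numbers are assumptions made for illustration; the actual formula and register encodings for your device are in its Technical Reference Manual.

```c
#include <stdint.h>

/* Hypothetical, simplified PLL description. All fields hold the actual
 * ratios (>= 1), not raw register values. */
typedef struct {
    uint32_t input_hz;  /* main system clock input */
    uint32_t prediv;    /* divide ratio applied before the multiplier */
    uint32_t mult;      /* multiplier ratio */
    uint32_t postdiv;   /* divide ratio applied after the multiplier */
} pll_cfg_t;

/* Returns the modeled PLL output in Hz, or 0 if the configuration is
 * invalid or the result does not fit in 32 bits. */
uint32_t pll_out_hz(const pll_cfg_t *cfg)
{
    if (cfg->prediv == 0u || cfg->mult == 0u || cfg->postdiv == 0u) {
        return 0u;
    }
    /* 64-bit intermediate so the multiplication cannot overflow. */
    uint64_t hz = ((uint64_t)cfg->input_hz / cfg->prediv) * cfg->mult;
    hz /= cfg->postdiv;
    return (hz > UINT32_MAX) ? 0u : (uint32_t)hz;
}

/* Returns the clock seen by one domain that divides the PLL output. */
uint32_t domain_clock_hz(uint32_t pll_hz, uint32_t divider)
{
    return (divider == 0u) ? 0u : pll_hz / divider;
}
```

In this model, a 24 MHz input with a multiplier of 25 and a post-divider of 2 gives 300 MHz, and a domain divider of 2 then gives 150 MHz. Comparing values computed this way against the clock rate each peripheral driver assumes is a quick way to spot a mismatch.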

 Verify Power Domains are Active: C6-Integra devices consist of multi-core, multi-frequency hardware blocks (CPUs, DMAs, peripherals). It is important to note that these blocks can be individually powered down to conserve power and reduce heat. The ROM code of C6-Integra/C6748 devices typically enables only as many blocks as are necessary to boot the device. As a result, it is up to the application developer to power on any additional hardware blocks needed to run the application.

For an overview of your device-specific power structure, refer to your C6-Integra/C6748 device-specific Technical Reference Manual.
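One way to make this check routine is to snapshot the status word of every hardware block the application depends on and confirm that each reports an enabled state. The sketch below does that on plain arrays so it can run anywhere. The bit layout (`MODULE_STATE_MASK`, `MODULE_STATE_ENABLED`) and the function names are assumptions for illustration and must be confirmed against your device's Technical Reference Manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: the low bits of a per-module status word hold the
 * module state, and one value means "enabled". Confirm in your TRM. */
#define MODULE_STATE_MASK    0x3Fu
#define MODULE_STATE_ENABLED 0x03u

/* Returns non-zero when the status word reports the enabled state. */
int module_is_enabled(uint32_t status_word)
{
    return (status_word & MODULE_STATE_MASK) == MODULE_STATE_ENABLED;
}

/* status_words: snapshot of per-module status values, indexed by module.
 * required:     indices of the modules the application depends on.
 * Returns the index of the first required module that is not enabled,
 * or -1 when all required modules are enabled. */
int first_disabled_module(const uint32_t *status_words,
                          const size_t *required, size_t required_count)
{
    for (size_t i = 0; i < required_count; ++i) {
        if (!module_is_enabled(status_words[required[i]])) {
            return (int)required[i];
        }
    }
    return -1;
}
```

On the target, the snapshot would be filled by reading the power controller's status registers before the application starts using the peripherals.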

''' Is this issue always reproducible? ''' If the issue is not always reproducible under the same conditions, this typically means it's a timing issue due to asynchronous timing between events. Unfortunately, this also means that the issue is typically harder to debug.

Testing the same code on a second hardware platform (if available) is also a quick sanity check to make sure your software issue is not board dependent.

Narrow the Scope of the Problem
Once you have determined that your device is operating and you have confirmed the basics are covered, the next step is to narrow down the scope of the problem to eliminate as many variables as possible.

Narrowing the scope of the problem is typically done by removing as many variables from your software application while still being able to reproduce the problem.

NOTE: Removing code segments is most easily done by using (#ifdef / #endif) around your existing code. The idea behind removing your code is not to make you do more work by rewriting your source code.

Some examples of this would be excluding unnecessary I/O operations, bypassing unnecessary DMA transfers, and/or masking off unnecessary interrupt service routines in the CPUs.

Once the unnecessary software operations are eliminated, it becomes much easier to see where the problem is occurring.
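The #ifdef / #endif approach from the note above might look like the following minimal sketch. The switch names (`ENABLE_UART_LOGGING`, `ENABLE_AUDIO_DMA`, `ENABLE_TIMER_ISR`) and the init routines are hypothetical placeholders; a real application would configure hardware where these stubs simply return a flag.

```c
/* Hypothetical build-time switches. Comment a line out to remove that
 * code path from the build without deleting any source. */
#define ENABLE_UART_LOGGING
/* #define ENABLE_AUDIO_DMA */   /* disabled while isolating the problem */
/* #define ENABLE_TIMER_ISR */   /* disabled while isolating the problem */

/* Bit flags recording which subsystems were started (illustration only). */
enum { STARTED_UART = 1, STARTED_DMA = 2, STARTED_TIMER = 4 };

/* Placeholder init routines standing in for real driver setup. */
int uart_logging_init(void) { return STARTED_UART; }
int audio_dma_init(void)    { return STARTED_DMA; }
int timer_isr_init(void)    { return STARTED_TIMER; }

/* Starts only the subsystems whose switch is defined and reports which
 * ones ran, so the reduced configuration is easy to confirm. */
int app_init(void)
{
    int started = 0;
#ifdef ENABLE_UART_LOGGING
    started |= uart_logging_init();
#endif
#ifdef ENABLE_AUDIO_DMA
    started |= audio_dma_init();
#endif
#ifdef ENABLE_TIMER_ISR
    started |= timer_isr_init();
#endif
    return started;
}
```

Re-enabling one switch at a time and retesting shows which subsystem brings the problem back, and the original source stays untouched.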

Divide and Conquer
Once the scope of the problem has been narrowed, it is important to identify all the hardware blocks that remain.

Once you identify the remaining variables in your program, you can systematically dissect the operation of each of the hardware blocks to verify correct operation.

 Example 1:  If you have a CPU with multiple interrupts, you can mask all but one interrupt at a time to determine whether that particular ISR is being executed as expected. This process can be repeated for the remaining interrupt service routines until you find the one that does not perform as expected.
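The masking loop in Example 1 can be sketched as two small helpers that work on a copy of the interrupt enable bits. The names `isolate_interrupt` and `next_interrupt` are hypothetical, and the `always_on` parameter is an assumption covering any enable bits that must stay set on your CPU; on the target, the resulting mask would be written to the CPU's interrupt enable register.

```c
#include <stdint.h>

/* Returns a mask that keeps only interrupt `keep` from `enabled`, plus
 * any bits in `always_on` that must remain set (an assumption; check
 * your CPU documentation for such bits). */
uint32_t isolate_interrupt(uint32_t enabled, uint32_t always_on,
                           unsigned keep)
{
    uint32_t one = (keep < 32u) ? (UINT32_C(1) << keep) : 0u;
    return (enabled & one) | always_on;
}

/* Returns the index of the next set bit in `enabled` above `after`
 * (pass -1 to start from bit 0), or -1 when none remain. */
int next_interrupt(uint32_t enabled, int after)
{
    if (after < -1) {
        after = -1;
    }
    for (int bit = after + 1; bit < 32; ++bit) {
        if (enabled & (UINT32_C(1) << bit)) {
            return bit;
        }
    }
    return -1;
}
```

A test session would call `next_interrupt` repeatedly, apply the mask from `isolate_interrupt` for each returned index, and observe whether the corresponding ISR runs as expected before moving to the next one.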

 Example 2:  If you have a DMA that is not making the necessary transfers, you can break the overall transfer down into various sets.


 * Step 1: Is the event that starts the DMA transfer reaching the DMA engine?
 * Step 2: Are the DMA transfer source and destination addresses properly configured?
 * Step 3: Is the source/destination RAM accessible by the DMA controller? Some Memory Mapped Registers are only addressable by certain device masters. It is important to check that the Memory Mapped Registers you write to are accessible by the DMA controller.
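Steps 2 and 3 can be partly automated by checking each configured address range against a table of regions the DMA controller is able to reach. The sketch below is a minimal version of that check. The type `mem_region_t`, the function `dma_can_access`, and especially the addresses in `example_dma_map` are placeholders for illustration and are not taken from any device documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t base;  /* first byte of the region */
    uint32_t size;  /* region length in bytes */
} mem_region_t;

/* Placeholder map of regions visible to the DMA controller. Replace
 * with the values from your device's memory map. */
const mem_region_t example_dma_map[] = {
    { 0x80000000u, 0x00020000u },
    { 0xC0000000u, 0x10000000u },
};

/* Returns true when [addr, addr + len) lies entirely inside one region.
 * Written with subtractions so the comparison cannot overflow. */
bool dma_can_access(const mem_region_t *map, size_t count,
                    uint32_t addr, uint32_t len)
{
    if (len == 0u) {
        return false;
    }
    for (size_t i = 0; i < count; ++i) {
        if (addr >= map[i].base && len <= map[i].size &&
            (addr - map[i].base) <= (map[i].size - len)) {
            return true;
        }
    }
    return false;
}
```

Running this check on both the source and the destination of every transfer before it is submitted catches buffers that were placed in memory the DMA controller cannot address.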

 Example 3:  If you have an I/O peripheral that is not transmitting/receiving in a consistent manner, there are several steps you can take.


 * Step 1: Are the clocks to the peripheral correctly set up?
 * Step 2: Are the buffers being serviced in time? Some peripherals (such as McASP) have a limited time window in which you need to service the buffers before a transmit underrun or receive overrun condition causes the state machines to generate an error. You can then use this information to backtrack and determine why the buffers are not being serviced in time by the CPU or DMAs.
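For Step 2, it helps to put a number on the service window. The sketch below assumes a simple model in which a buffer of a given number of sample frames is consumed at a fixed sample rate, so the window is the buffer length divided by the rate. The function names and the example figures are illustrative assumptions, and the real limits for a peripheral such as McASP depend on its configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Time available to refill or drain one buffer, in microseconds,
 * assuming the peripheral consumes one frame per sample period.
 * Returns 0 for an invalid (zero) sample rate. */
uint32_t service_window_us(uint32_t frames_per_buffer,
                           uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)frames_per_buffer * 1000000u)
                      / sample_rate_hz);
}

/* True when the measured worst-case service latency fits the window. */
bool meets_deadline(uint32_t worst_case_latency_us, uint32_t window_us)
{
    return worst_case_latency_us < window_us;
}
```

With 48 frames per buffer at 48 kHz, the window in this model is 1000 microseconds, so a worst-case service latency measured at 1200 microseconds would explain intermittent underruns or overruns.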

Obtaining Additional Assistance with your Investigation
Using the Divide and Conquer approach above, you will typically be able to determine the cause of the errant behavior and provide corrective action within your code. However, if you are using a driver or code package that was developed by a 3rd party, you may need to seek assistance from the developer.

It is important to remember that a problem that appears in your particular software application may not be reproducible in a different software and/or hardware environment. Because of this, it is critical that you provide evidence that you have tracked the culprit down to their specific code before pointing the finger.

Sending simple oscilloscope captures that show how your system is behaving is some of the best insight you can provide. Oscilloscope captures can be as simple as triggering on a GPIO toggle when a particular event or error occurs.

Oscilloscope captures allow multiple parties to concurrently debug the problem while:


 * Guaranteeing inherent protection of your code (no risk in allowing others to see the source code)
 * Showing the specific timing relationship between an event (or sequence of events)
 * Showing how specific events are mitigated or exacerbated with code tweaks.
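A GPIO trigger for such captures needs very little code. In the sketch below, `gpio_out` stands in for a memory-mapped GPIO output register and `DEBUG_PIN_MASK` for a spare pin. Both are hypothetical, as are the function names, and the real register address and pin assignment come from your device and board documentation.

```c
#include <stdint.h>

/* Hypothetical spare pin routed to a test point for the oscilloscope. */
#define DEBUG_PIN_MASK (UINT32_C(1) << 5)

/* Flips the debug pin; call at the start and end of an event to see its
 * duration and its timing relative to other signals. */
void debug_pin_toggle(volatile uint32_t *gpio_out, uint32_t mask)
{
    *gpio_out ^= mask;
}

/* Drives the debug pin high when `status` signals an error (non-zero),
 * giving the oscilloscope an edge to trigger on. Returns `status`
 * unchanged so the call can wrap an existing error check. */
int debug_flag_error(volatile uint32_t *gpio_out, uint32_t mask, int status)
{
    if (status != 0) {
        *gpio_out |= mask;
    }
    return status;
}
```

Wrapping an existing return-code check in `debug_flag_error` leaves the program flow unchanged while marking the exact moment the error is detected.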

In addition to sending conclusive evidence, it's also important to remember that the more information you can provide upfront about your specific application, the more likely you are to get fast help from the developer. '''Remember, someone who has written the code is an expert at the code, and has likely come across (and successfully debugged) many of the issues you are currently seeing. The more information you can provide them, the faster they will be able to provide useful feedback to help you solve your integration challenges.''' When providing evidence of an issue, you should provide at a bare minimum the following information:


 * Hardware Platform
 * Silicon Part Number (including Package, Version, and Additional Package Markings)
 * Are you working with a C6-Integra Hardware Development Platform or is this a custom board?
 * Software Version
 * Full Names and Versions of all software packages that were used to create the software application.
 * Full Names of IDEs, Tool Chains, and Debugging Equipment (JTAG Emulators)
 * Oscilloscope screen captures of the timing before, during, and after the error occurs.
 * Additional Observations that may lead to insight into the issue.

Debug and Trace Capabilities using Code Composer Studio

What is supported:


 * ARM9 has 4 KB Enhanced Trace Buffer (ETB) based trace and some AET based debug using the Unified Breakpoint Manager (UBM)
 * DSP has limited debug capability using UBM
 * BIOS Real-Time Analysis (RTA) tools and RTOS Object View (ROV) tools

What is not supported:
 * Directing trace data from the C674x DSP to ETB is not supported by this device.
 * System trace (software instrumentation)
 * System events monitoring
 * System throughput/latency analysis
 * System bus watch point

For a detailed description of debug and trace support in CCS, see Debug_and_Trace_Capabilities_on_OMAPL1X/AM1x/DA8x_devices