
Debugging
Debugging your code is an integral part of code development and a skill that takes time to develop. The goal of this section is not to make you an expert software debugger, but rather to help you understand the steps that you can perform on your application code to determine the root cause of your issue. This can be especially helpful if you are using package dependencies (such as drivers) that were not developed by you.

This section is organized as a crash course into a collection of basic philosophies about debugging your applications on TI devices.

This page is dedicated to helping the software development team find the "root cause" of the broken system code and provide explicit and useful feedback to others who can assist in solving the team's integration challenges.

Typical Software Systems
Often the software development is split into various software components; code is developed for each individual software component (CPU Interrupt Service Routines, DMA Transfers, 3rd Party I/O peripheral drivers) by various team members and then collectively integrated into the final application.

Why is this important?

TI provides a well-integrated and tested SDK; however, developers who are far along in the development process often need to upgrade one software component while keeping all the rest. In addition, developers may need to pull in software components developed by 3rd parties or taken from open source software. All of these scenarios open the door to software incompatibilities.

So now what?

For this reason, it is very important to read the software release notes for every software component when mixing and matching components. These documents often cite dependencies on particular versions of other software components as well as toolchains. Being aware of this information allows developers to make better decisions about tradeoffs and to choose the right components, rather than spending significant development time only to find out they went down the wrong path.

Debug Approach
Debugging code is commonly done by eliminating variables (one at a time if necessary) to assist you in determining where the error occurs.

Check the Basics
This seems silly, but taking a step back from the specific problem you are facing can be very beneficial.

'''Verify device clocking trees are configured correctly''' The 66AK2H does not have the simple clocking structure of an 8-bit microcontroller. The hardware domains inside these devices do '''not''' run off a single clock frequency, but rather off a tree of related clock domains derived from the main system clock inputs. Additionally, some peripherals (USB, EMAC, McASP, RTC) are often clocked from the main system clock as well as from a second clock source such as an external crystal.

Incorrect configuration of the device clock tree could result in missed servicing of peripherals (even though the peripheral may be configured correctly otherwise).

For an overview of your device-specific clocking structure, refer to the corresponding device-specific Technical Reference Manual.
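As a rough sketch of such a check, the helpers below compute the clock that a set of PLL settings should produce and compare it with the frequency the application expects. The relation `f_out = f_in * mult / (pre_div * post_div)` is a generic assumption for illustration; the exact formula, register encodings, and the function names used here are not taken from the device documentation.

```c
#include <stdint.h>

/* Generic PLL relation assumed for illustration only:
 *   f_out = f_in * mult / (pre_div * post_div)
 * The exact formula and register encodings are device specific. */
uint64_t pll_output_hz(uint64_t f_in_hz, uint32_t mult,
                       uint32_t pre_div, uint32_t post_div)
{
    if (pre_div == 0u || post_div == 0u) {
        return 0u;  /* invalid configuration */
    }
    return (f_in_hz * mult) / ((uint64_t)pre_div * post_div);
}

/* Nonzero if actual_hz is within tolerance_ppm of expected_hz. */
int clock_matches(uint64_t actual_hz, uint64_t expected_hz,
                  uint32_t tolerance_ppm)
{
    uint64_t diff = (actual_hz > expected_hz) ? actual_hz - expected_hz
                                              : expected_hz - actual_hz;
    return diff * 1000000u <= expected_hz * tolerance_ppm;
}
```

Running the same comparison for each derived clock domain that a peripheral depends on can show whether a servicing problem starts in the clock tree rather than in the peripheral setup.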

'''Verify Power Domains are Active''' The 66AK2H consists of multi-core, multi-frequency hardware blocks (CPUs, DMAs, peripherals). It's important to note that these blocks can be individually powered down to conserve power and reduce heat. The ROM code of these devices typically enables only as many blocks as are necessary to boot the device. As a result, it is up to the application developer to power on any additional hardware blocks needed to run the application.

For an overview of your device-specific power structure, refer to your 66AK2H Technical Reference Manual.
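One simple way to make this check explicit is to compare the set of blocks the application needs against the set that is actually powered. The sketch below assumes the power state has already been read into a plain bit mask; the bit assignments and the function name are hypothetical and do not correspond to real register fields.

```c
#include <stdint.h>

/* Hypothetical bit positions in a power-state word, for illustration only. */
enum { PD_DMA = 1u << 0, PD_EMAC = 1u << 1, PD_USB = 1u << 2 };

/* Return the required domains that are not yet powered on (0 if none). */
uint32_t missing_power_domains(uint32_t powered_on, uint32_t required)
{
    return required & ~powered_on;
}
```

A nonzero result names the blocks that still have to be enabled before the code that uses them can run.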

''' Is this issue always reproducible? ''' If the issue is not always reproducible under the same conditions, this typically means it's a timing issue due to asynchronous timing between events. Unfortunately, this also means that the issue is typically harder to debug.

Testing the same code on a second hardware platform (if available) is also a quick sanity check to make sure your software issue is not board dependent.

Narrow the Scope of the Problem
Once you have determined that your device is operating and you have confirmed the basics are covered, the next step is to narrow down the scope of the problem to eliminate as many variables as possible.

Narrowing the scope of the problem is typically done by removing as many variables as possible from your software application while still being able to reproduce the problem.

NOTE: Removing code segments is most easily done by placing #ifdef / #endif around your existing code. The idea is to disable code temporarily, not to make you do more work by rewriting your source code.
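A minimal sketch of this technique is shown below. The macro `ENABLE_OPTIONAL_STAGE` and the functions `optional_stage` and `process_sample` are hypothetical placeholders and are not part of any TI package.

```c
#include <stdint.h>

/* Hypothetical switch: delete or comment out this line to remove the
 * optional stage from the build without deleting any source code. */
#define ENABLE_OPTIONAL_STAGE

#ifdef ENABLE_OPTIONAL_STAGE
/* Hypothetical processing stage that may or may not be related to the issue. */
static int32_t optional_stage(int32_t sample)
{
    return sample / 2;
}
#endif

/* Core path that must still reproduce the problem. */
int32_t process_sample(int32_t sample)
{
#ifdef ENABLE_OPTIONAL_STAGE
    sample = optional_stage(sample);
#endif
    return sample + 1;
}
```

If the problem still reproduces with the macro undefined, the optional stage can be ruled out and left disabled while you continue narrowing the scope.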

Some examples of this include disabling unnecessary I/O operations, bypassing unnecessary DMA transfers, and/or masking off unnecessary interrupt service routines in the CPUs.

Once the unnecessary software operations are eliminated, it becomes much easier to see where the problem is occurring.

Divide and Conquer
Once the scope of the problem has been narrowed, it's important to identify all the hardware blocks that remain.

Once you identify the remaining variables in your program, you can systematically dissect the operation of each hardware block to verify that it works correctly.

'''Example 1:''' If you have a CPU with multiple interrupts, you can mask all but one interrupt at a time to determine whether that particular ISR is being executed as expected. This process can be repeated for the remaining interrupt service routines until you find the one that does not perform as expected.
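The idea in Example 1 can be sketched on a host machine with a simulated interrupt enable register. Everything here (`sim_irq_enable`, `NUM_IRQS`, and the function names) is a hypothetical stand-in; on real hardware the enable register and its access method are those described in the device documentation.

```c
#include <stdint.h>

#define NUM_IRQS 4u  /* hypothetical number of interrupt sources in use */

/* Stand-in for the CPU's interrupt enable register. */
static volatile uint32_t sim_irq_enable;

/* Counts how often each simulated ISR ran. */
static uint32_t isr_hits[NUM_IRQS];

/* Enable exactly one interrupt source and mask all the others. */
void irq_enable_only(uint32_t irq)
{
    sim_irq_enable = (irq < NUM_IRQS) ? (1u << irq) : 0u;
}

/* Simulated event delivery: the ISR runs only if its enable bit is set. */
void irq_raise(uint32_t irq)
{
    if (irq < NUM_IRQS && (sim_irq_enable & (1u << irq)) != 0u) {
        isr_hits[irq]++;
    }
}

/* Number of times the ISR for this source has run. */
uint32_t irq_hit_count(uint32_t irq)
{
    return (irq < NUM_IRQS) ? isr_hits[irq] : 0u;
}
```

Stepping `irq_enable_only` through each source in turn and comparing the hit counts with the number of events you generated shows which ISR is not being serviced as expected.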

'''Example 2:''' If you have a DMA that is not making the necessary transfers, you can break the overall transfer down into separate checks.


 * Step 1: Is the event that starts the DMA transfer reaching the DMA engine?
 * Step 2: Are the DMA transfer source and destination addresses properly configured?
 * Step 3: Are the source and destination memories accessible by the DMA controller? Some memory-mapped registers are only addressable by certain device masters, so it's important to check that the memory-mapped registers you write to are accessible by the DMA controller.
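Steps 2 and 3 can be turned into a small sanity check that runs before a transfer is submitted. The sketch below is only an illustration: the address windows in `dma_regions` are assumed values, and the real windows reachable by a given DMA master must be taken from the memory map in the device documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One address window that the DMA master is allowed to reach. */
typedef struct {
    uint32_t base;
    uint32_t size;
} dma_region_t;

/* Assumed windows for illustration only; not taken from a data manual. */
static const dma_region_t dma_regions[] = {
    { 0x0C000000u, 0x00600000u },  /* assumed shared on-chip RAM */
    { 0x80000000u, 0x10000000u },  /* assumed external memory window */
};

/* True if [addr, addr + len) lies entirely inside one reachable window. */
bool dma_can_access(uint32_t addr, uint32_t len)
{
    for (size_t i = 0; i < sizeof dma_regions / sizeof dma_regions[0]; i++) {
        const dma_region_t *r = &dma_regions[i];
        if (len != 0u && addr >= r->base && len <= r->size &&
            (addr - r->base) <= (r->size - len)) {
            return true;
        }
    }
    return false;
}

/* Both endpoints of the transfer must be reachable by the DMA master. */
bool dma_transfer_is_valid(uint32_t src, uint32_t dst, uint32_t len)
{
    return dma_can_access(src, len) && dma_can_access(dst, len);
}
```

A failed check points at a configuration problem (Step 2) or an unreachable address (Step 3) rather than at the event routing covered by Step 1.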

'''Example 3:''' If you have an I/O peripheral that is not transmitting/receiving in a consistent manner, there are several checks you can make:


 * Step 1: Are the clocks to the peripheral correctly set up?
 * Step 2: Are the buffers being serviced in time? Some peripherals have a limited time window in which you need to service the buffers before a transmit underrun or receive overrun condition causes the state machines to generate an error. You can then use this information to backtrack and determine why the buffers are not being serviced in time by the CPU or DMAs.
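For Step 2, it can help to estimate the service window and compare it with the worst-case latency you measure. The sketch below assumes a simple model in which a FIFO of a given depth fills or drains at a constant word rate; the function names are hypothetical and real peripherals may add further constraints.

```c
#include <stdbool.h>
#include <stdint.h>

/* Time in microseconds until a FIFO of depth_words entries fills (or
 * drains) at words_per_second. Returns 0 if the rate is 0. */
uint32_t fifo_deadline_us(uint32_t depth_words, uint32_t words_per_second)
{
    if (words_per_second == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)depth_words * 1000000u) / words_per_second);
}

/* True if the worst-case service latency is shorter than the window. */
bool service_meets_deadline(uint32_t worst_latency_us,
                            uint32_t depth_words,
                            uint32_t words_per_second)
{
    return worst_latency_us < fifo_deadline_us(depth_words, words_per_second);
}
```

For example, a 64-word FIFO at 48000 words per second gives a window of about 1333 microseconds, so a measured worst-case latency of 1500 microseconds would explain intermittent overruns or underruns.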

Obtaining Additional Assistance with your Investigation
Using the Divide and Conquer approach above, you will typically be able to determine the cause of the errant behavior and take corrective action within your code. However, if you are using a driver or code package that was developed by a 3rd party, you may need to seek assistance from the developer.

It is important to remember that a problem seen in your particular software application may not be reproducible on a different software and/or hardware setup. Because of this, it is critical that you provide evidence that you have tracked the culprit down to the developer's specific code.

Simple oscilloscope captures that show how your system is specifically behaving are some of the best insight you can provide. Oscilloscope captures can be as simple as triggering on a GPIO toggle when a particular event or error occurs.

Oscilloscope captures allow multiple parties to concurrently debug the problem while:
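A GPIO trigger of this kind usually needs only a few lines of code. The sketch below uses a simulated output register so it can run on a host machine; `sim_gpio_out`, `DEBUG_PIN_MASK`, and the function names are hypothetical, and on real hardware the write would go to the GPIO register given in the device documentation.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped GPIO output data register. */
static volatile uint32_t sim_gpio_out;

#define DEBUG_PIN_MASK (1u << 5)  /* hypothetical spare pin wired to the scope */

/* Flip the debug pin so the oscilloscope can trigger on the edge. */
void debug_pin_toggle(void)
{
    sim_gpio_out ^= DEBUG_PIN_MASK;
}

/* Toggle the pin only when one of the error bits is set in status. */
void check_status_and_mark(uint32_t status, uint32_t error_mask)
{
    if ((status & error_mask) != 0u) {
        debug_pin_toggle();
    }
}

/* Current level of the debug pin: 1 if high, 0 if low. */
uint32_t debug_pin_state(void)
{
    return (sim_gpio_out & DEBUG_PIN_MASK) != 0u;
}
```

Calling `check_status_and_mark` right where the peripheral status is read keeps the delay between the error and the edge seen on the scope small and repeatable.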


 * Protecting your code (there is no need to let others see the source code)
 * Showing the specific timing relationship between an event (or sequence of events)
 * Showing how specific events are mitigated or exacerbated with code tweaks.

In addition to sending conclusive evidence, it's also important to remember that the more information you can provide upfront about your specific application, the more likely you are to get fast help from the developer. '''Remember, someone who has written the code is an expert in that code, and has likely come across (and successfully debugged) many of the issues you are currently seeing. The more information you can provide them, the faster they will be able to provide useful feedback to help you solve your integration challenges.''' When providing evidence of an issue, you should provide at a bare minimum the following information:


 * Hardware Platform
 * Silicon Part Number (including Package, Version, & Additional Package Markings)
 * Are you working with a TI Hardware Development Platform -or- is this a custom board?
 * Software Version
 * Full Names & Versions of all software packages that were used to create the software application.
 * Full Names of IDEs, Toolchains, & Debugging Equipment (JTAG Emulators)
 * Oscilloscope screen captures of the timing before, during, and after the error occurs.
 * Additional Observations that may lead to insight into the issue.