OMAP-L138 Software Design Guide
- 1 Introduction
- 2 Software Design Timeline →
- 3 Determine Top Level Software Hierarchy
- 4 Software Hierarchy Providing specific for OMAP-L138/C6748
- 5 Download, Purchase, or Install Appropriate IDE & Tool Chain(s)
- 6 Download, Purchase, or Install Additional Packages / Dependencies
- 7 Procure Hardware Development Platforms & Emulation Tools
- 8 Code Development
- 9 Debugging
- 10 Product Integration
- 11 Software Performance
- 12 Software Training Resources
- 13 Demos
The purpose of this software design guide is to walk developers through the various stages of designing software for TI's C6-Integra ARM + DSP devices as well as the C6748 DSPs. The guide follows the structure shown in the Software Design Timeline below. Each design stage in the Timeline links to a collection of useful documentation, application notes, and design recommendations pertaining to that stage. Using this Guide, software designers can efficiently locate a resource (or collection of resources) they need at each stage in the software development process.
Note: This guide is not a replacement for the information specified in the data manual, nor is it a replacement for information on TI.com. Rather, it should be used as a companion guide that points to existing information.
Software Design Timeline →
Determine Top Level Software Hierarchy
Real Time Operating System (RTOS)
Real Time Operating Systems are specifically designed to schedule the handling of signals with minimal latency at the time each signal occurs. TI's DSP/BIOS (later renamed SYS/BIOS to reflect multi-core ARM <-> DSP interactions) is one example of an RTOS that has been specifically designed for, and successfully used on, multiple C6x DSPs for years. The links below direct you to more information about DSP/BIOS and SYS/BIOS, which are the officially supported RTOSs of TI and are available free of charge.
Note: DSP/BIOS is the precursor to SYS/BIOS, although both are supported. Those who are less familiar with RTOSs will benefit from starting with DSP/BIOS, as it has a greater number of drivers for new and legacy devices.
In addition to DSP/BIOS and SYS/BIOS, other RTOSs can be run on C6-Integra/C674x devices, each with its own benefits. Note, however, that these RTOSs are not always free of charge, nor are they directly supported by TI. The following matrix lists additional Real Time Operating Systems that are available for C6-Integra/C674x devices.
Generic Operating System (OS)
Developing on a generic operating system (OS) can jump start your development by providing a predefined code structure (and typically existing peripheral drivers), which allows you to focus directly on your end application code. While this may sound advantageous, it's important to understand that the benefits gained from a traditional operating system can quickly be overshadowed by non-optimized code that consumes unnecessary processing power and/or on-chip system memory, resulting in a higher overall system cost.
The following matrix lists Generic Operating Systems that are available for C6-Integra/C6748 devices.
No Operating System
Software development on a device without any operating system is typically done when a very minimal code footprint is needed, or when an operating system would add unnecessary complexity. Development with no operating system is typically discouraged because it requires the application to rely solely on interrupts for scheduling. While this approach does work, it quickly becomes a heavy burden on the developer to structure the code appropriately to manage all events, particularly when the code must also accommodate future, as yet undefined, functionality.
As the application scales to meet additional demands from your customers, the added burden of not using an operating system may sideline you into reworking your own code structure rather than differentiating your products from those your competitors provide.
Software Hierarchy Providing specific for OMAP-L138/C6748
TI provides the following software development packages specific for OMAP-L138/C6748
RTOS Support for OMAP-L138/C6748
DSP/BIOS Platform Support Package for OMAP-L138/C6748
OS Support for OMAP-L138
Linux Software Development Kit for OMAP-L138
Non-OS Examples for OMAP-L138/C6748
QuickStartOMAPL1x rCSL Register Level Chip Support Library Example Package
StarterWare Device Abstraction Layer
EVM BSL Board Support Library
Optimized DSP Software libraries
DSP Software Libraries
Download, Purchase, or Install Appropriate IDE & Tool Chain(s)
Once you have decided on the appropriate software hierarchy for your design, the next step is to pick the tool chain that will help you develop your end applications quickly. Integrated Development Environments (IDEs) provide a software developer with a base set of software that allows a programmer to get started writing and testing code. IDEs typically include a text editor, the software tool chain, and a debugging environment, all in a single application that typically runs on a host PC.
A tool chain typically consists of at least one C compiler, an Assembler, and a Linker. The tool chain is CPU architecture dependent, so the right tool chain is dependent on the CPU of choice as well as any additional constraints imposed by the host operating system.
C6x-Integra devices offer a wealth of tool chains and IDEs because they integrate both an ARM CPU core and a C674x DSP core, each of which has a different instruction set architecture (ISA).
TI provides both an IDE as well as a tool chain for development on C6x-Integra devices.
Integrated Development Environments supported by TI
- Code Composer Studio (CCS) Integrated Development Environment (IDE)
- Inclusive of TMS320 C6000 Optimizing Compiler, C6000 Assembler, & C6000 Linker
- Inclusive of TMS470 ARM Optimizing Compiler, ARM Assembler, & ARM Linker
- Inclusive of Cycle Accurate Simulators for both the ARM and C674x DSP Cores
- Integrated support for DSP/BIOS Real Time Operating System
- Integrated support for XDS Series Emulators/Debuggers
- Additional introductory information on Code Composer Studio can be found in the following wiki articles
Stand Alone Tool Chains supported by TI outside of CCS
- C674x DSP CPU Tool Chain
- TMS320 Optimizing C6000 C/C++ Compiler
- TMS320 Optimizing C6000 Assembler
- TMS320 Optimizing C6000 Linker
- ARM9, Cortex-A8 CPU Tool Chain
- TMS470 Optimizing ARM C/C++ Compiler
- TMS470 Optimizing Assembler
- TMS470 Optimizing Linker
- Programmable Realtime Unit (PRU) Tool Chain
- PRU Assembler
Additional 3rd Party Tool Chains available (not supported by TI)
Download, Purchase, or Install Additional Packages / Dependencies
TI offers a wealth of additional Application Programming Interfaces (APIs) for the C6-Integra devices, which abstract away the programming of each register at the bit level and provide a well-defined method for programming the devices. Additionally, there are various software packages available for download that can help you jump start your software development based upon your preferences.
For a list of all TI-provided APIs specific to the OMAP-L138/C6748, please visit the OMAP-L138 Project Support Page on TI.com
Procure Hardware Development Platforms & Emulation Tools
Hardware Development Platforms are designed, developed, and tested by 3rd Party Design Houses in order to provide the software architect team with the ability to develop their code in parallel with the hardware design of the Printed Circuit Board (PCB). Hardware Development platforms are preferable vs simulators because they actually reflect how the device will behave in the end product and take into account system latencies for cache misses and off chip memory accesses.
Texas Instruments provides a variety of reference designs for JTAG emulators, called the XDS series. These emulators come in different classes based upon the desired level of support.
TI XDS Hardware Emulators
XDS100 Class Emulators (Version 1, Version 2)
XDS510 Class Emulators
XDS560 Class Emulators
Adaptive Clocking JTAG Emulator Adaptors
3rd Party Emulator Drivers Updates
For additional updates and drivers please consult the Official Support Pages from 3rd Party Emulator Providers
For additional information on Emulation refer to the Emulators/Analyzers Page on TI.com and additional wiki articles about TI JTAG Emulators & JTAG Schematics.
The following hardware development kits are recommended by TI for the OMAP-L138/C6748
Software Development Platforms for OMAP-L138/C6748
Code Development
Since code development is application specific, and a variety of code development tools can be used to program the OMAP-L138/C6748, this section focuses on code development for the OMAP-L138/C6748 using the Code Composer Studio (CCS) Integrated Development Environment.
OMAP-L138/C6748 Peripheral Specific Code Development Guides
- Programming Asynchronous EMIF on OMAP-L13x/C674x
- This guide provides a spreadsheet for calculating the controller timing values needed to interface with an asynchronous NAND device.
- Programming mDDR/DDR2 on OMAP-L13x/C674x
- This guide provides a spreadsheet for calculating the controller timing values needed to interface with mDDR or DDR2 devices. In addition it can be used to decode the registers values into the individual bit fields for easier debugging.
- Programming the PLL Controllers on OMAP-L13x/C674x
- This wiki provides a spreadsheet for determining whether a given PLL configuration is valid using the frequency limitations for each clock domain.
Debugging
Debugging your code is an integral part of code development and a skill that takes time to develop. The goal of this section is not to make you an expert software debugger, but rather to help you understand the steps you can perform on your application code to determine the root cause of your issue. This can be especially helpful if you are using package dependencies (such as drivers) that were not developed by you.
This section is organized as a crash course into a collection of basic philosophies about debugging your applications on the C6-Integra/C6748 devices.
This page is dedicated to helping the software development team find the "root cause" of the broken system code and provide explicit and useful feedback for others who can assist you in solving your integration challenges.
Typical Software Activity on C6-Integra Devices
In most C6-Integra/C6748 applications, the software development will be split between controlling multiple CPUs, DMAs, and various peripherals. Often, code is developed for each individual portion (CPU Interrupt Service Routines, DMA Transfers, 3rd Party I/O peripheral drivers) by various team members and then collectively integrated into the final application. This is especially true of C6-Integra/C6748 devices running operating systems.
While splitting code among a development team does have its benefits, allowing you to test individual segments of your code, it encourages individuals to write code that is beneficial for their particular segment rather than considering the impact on the entire application. As a result, integration of the individual segments often results in a non-working application.
The assumptions that one can make when testing an individual I/O, a DMA transaction, or a CPU ISR typically do not hold during system integration. When this occurs, it is not always easy to determine which part is at fault, especially since each developer has independently verified / validated their segment of the code.
So now what....?
As mentioned above, individuals test their segment of the code assuming they have the highest priority and full system bandwidth. In reality, however, code integration often introduces various levels of arbitration (delays) amongst master/slave peripherals using the same bus, as well as possible additional wait states until CPU bandwidth becomes available (stalls).
Debugging code is commonly done by eliminating variables (one at a time if necessary) to assist you in determining where the error occurs.
As mentioned above, applications developed for C6-Integra/C6748 devices often have code operating on various different pieces of hardware (CPUs, DMAs, & I/O peripherals). Each of these individual hardware blocks may be running at a different clock rate and require servicing at a different frequency. These hardware pieces are often interdependent, relying on each other to perform a function (or set of functions) in a well-specified time window. When these events do not occur in time, the system will typically break and leave you with little (or no) output on the I/O or a DMA transfer that does not happen.
Check the Basics
This seems silly, but taking a step back from the specific problem you are facing can be very beneficial.
Verify device clocking trees are configured correctly
C6-Integra/C6748 devices do not have the run-of-the-mill clocking structure of a simple 8-bit microcontroller. The hardware domains inside the C6-Integra/C6748 devices do not run off a single clock frequency, but rather a tree of related clock domains that are derived from the main system clock inputs. Additionally, some peripherals (USB, EMAC, McASP, RTC) are often clocked from the main system clock as well as from a second clock source, such as an external crystal.
Incorrect configuration of the device clock tree could result in missed servicing of peripherals (even though the peripheral may be configured correctly otherwise).
For an overview of your device specific clocking structure, refer to your C6-Integra/C6748 Device Specific Technical Reference Manual
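As a rough illustration of how a clock tree derives several domain frequencies from one input, the sketch below computes a PLL output and a divided domain clock so that the result can be compared against what a peripheral expects. It assumes a generic pre-divider, multiplier, post-divider, and per-domain divider arrangement. The struct, the function names, and any values you feed it are hypothetical; the actual structure, register encodings, and frequency limits must be taken from the Technical Reference Manual.

```c
#include <stdint.h>

/* Hypothetical description of one PLL path. Each field holds the actual
 * ratio (not a register encoding) and must be non-zero. */
typedef struct {
    uint32_t input_hz;    /* crystal or oscillator input frequency */
    uint32_t pre_div;     /* divider applied before the multiplier */
    uint32_t multiplier;  /* PLL multiplier */
    uint32_t post_div;    /* divider applied after the multiplier */
} pll_config_t;

/* Returns the PLL output frequency in Hz, or 0 for an invalid config. */
uint32_t pll_output_hz(const pll_config_t *cfg)
{
    if (cfg->pre_div == 0u || cfg->multiplier == 0u || cfg->post_div == 0u) {
        return 0u;
    }
    /* Use a 64-bit intermediate so input_hz * multiplier cannot overflow. */
    uint64_t multiplied = (uint64_t)cfg->input_hz * cfg->multiplier;
    return (uint32_t)(multiplied / cfg->pre_div / cfg->post_div);
}

/* Returns the frequency of one clock domain that divides the PLL output,
 * or 0 if the divider is invalid. */
uint32_t domain_clock_hz(const pll_config_t *cfg, uint32_t domain_div)
{
    if (domain_div == 0u) {
        return 0u;
    }
    return pll_output_hz(cfg) / domain_div;
}
```

For example, a 24 MHz input with a pre-divider of 1, a multiplier of 25, and a post-divider of 2 gives 300 MHz under this model, and a domain divider of 2 gives 150 MHz. Comparing such computed values with the rate a peripheral driver assumes is a quick way to spot a clock tree mismatch.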
Verify Power Domains are Active
C6-Integra devices consist of multi-core, multi-frequency hardware blocks (CPUs, DMAs, peripherals). It's important to note that these blocks can be individually powered down to conserve power and reduce heat. The ROM code of C6-Integra/C6748 devices typically enables as few blocks as necessary to boot the device. As a result, it is up to the application developer to power on any additional hardware blocks needed to run the application.
For an overview of your device specific power structure, refer to your C6-Integra/C6748 device-specific Technical Reference Manual
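A minimal sketch of such a check is shown below: before touching a peripheral, the application walks a list of modules it depends on and reports the first one that is not enabled. The status words are passed in as a plain array so the sketch can run on a host PC. The state mask, the "enabled" encoding, and the function name are assumptions made for illustration; take the real register layout and module numbers from the Technical Reference Manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding (verify against the TRM): the low bits of each module
 * status word hold the module state, and MODULE_STATE_ENABLE means the
 * module is clocked and out of reset. */
#define MODULE_STATE_MASK   0x3Fu
#define MODULE_STATE_ENABLE 0x03u

/* Returns the module number of the first required module that is not
 * enabled, or -1 if every required module is enabled. */
int first_disabled_module(const volatile uint32_t *status_regs,
                          const unsigned *required, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t state = status_regs[required[i]] & MODULE_STATE_MASK;
        if (state != MODULE_STATE_ENABLE) {
            return (int)required[i];
        }
    }
    return -1;
}
```

Calling a check like this at start-up, and again when a peripheral stops responding, separates "the block was never powered" from genuine configuration errors.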
Is this issue always reproducible?
If the issue is not always reproducible under the same conditions, this typically means it is a timing issue caused by asynchronous events. Unfortunately, this also means that the issue is typically harder to debug.
Testing the same code on a second hardware platform (if available) is also a quick sanity check to make sure your software issue is not board dependent.
Narrow the Scope of the Problem
Once you have determined that your device is operating and you have confirmed the basics are covered, the next step is to narrow down the scope of the problem to eliminate as many variables as possible.
Narrowing the scope of the problem is typically done by removing as many variables as possible from your software application while still being able to reproduce the problem.
NOTE: Removing code segments is most easily done by using (#ifdef / #endif) around your existing code. The idea behind removing your code is not to make you do more work by rewriting your source code.
Some examples of this would be excluding unnecessary I/O operations, bypassing unnecessary DMA transfers, and/or masking off unnecessary interrupt service routines in the CPUs.
Once the unnecessary software operations are eliminated, it becomes much easier to see where the problem is occurring.
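The #ifdef / #endif approach from the note above can be sketched as follows. Each subsystem call is wrapped in its own build switch, so a subsystem can be removed from the build by commenting out one line and restored just as easily. The switch names and the stub service functions are hypothetical and stand in for your real I/O, DMA, and logging code.

```c
/* Hypothetical build switches: comment one out to remove that subsystem
 * from the build without deleting or rewriting its source code. */
#define ENABLE_AUDIO_IO
/* #define ENABLE_DISPLAY_DMA */
#define ENABLE_UART_LOG

/* Flags reporting which subsystems ran in one pass of the main loop. */
enum { RAN_AUDIO = 1, RAN_DISPLAY = 2, RAN_UART = 4 };

/* Stubs standing in for the real subsystem code. */
int service_audio(void)   { return RAN_AUDIO; }
int service_display(void) { return RAN_DISPLAY; }
int service_uart(void)    { return RAN_UART; }

/* One pass of the application loop; disabled subsystems compile to nothing. */
int run_iteration(void)
{
    int ran = 0;
#ifdef ENABLE_AUDIO_IO
    ran |= service_audio();
#endif
#ifdef ENABLE_DISPLAY_DMA
    ran |= service_display();
#endif
#ifdef ENABLE_UART_LOG
    ran |= service_uart();
#endif
    return ran;
}
```

If the problem still reproduces with the display path compiled out, the display DMA is no longer a suspect, and the next switch can be removed.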
Divide and Conquer
Once the scope of the problem has been narrowed, it's important to identify all the hardware blocks that remain.
Once you identify the remaining variables in your program, you can systematically dissect the operation of each of the hardware blocks to verify correct operation.
If you have a CPU with multiple interrupts, you can individually mask all but one interrupt at a time to determine whether that particular ISR is being executed as expected. This process can be repeated for the remaining interrupt service routines until you find the one that does not perform as expected.
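The masking procedure can be sketched as a pair of helpers that save the current interrupt enable mask, leave only one interrupt enabled, and later restore the original mask. The register is passed in as a pointer so the sketch runs on a host PC with an ordinary variable. The function names and the keep_mask parameter (for any bits that must stay set, such as a global or non-maskable enable) are assumptions for illustration, not part of any TI API.

```c
#include <stdint.h>

/* Leaves only interrupt `irq` enabled, plus any bits in `keep_mask` that
 * must always remain set. Returns the previous mask so it can be restored.
 * An out-of-range irq leaves the register unchanged. */
uint32_t isolate_interrupt(volatile uint32_t *ier, unsigned irq,
                           uint32_t keep_mask)
{
    uint32_t saved = *ier;
    if (irq < 32u) {
        *ier = keep_mask | (UINT32_C(1) << irq);
    }
    return saved;
}

/* Restores the interrupt enable mask saved by isolate_interrupt(). */
void restore_interrupts(volatile uint32_t *ier, uint32_t saved)
{
    *ier = saved;
}
```

Running the application once per interrupt, with only that interrupt isolated, shows which ISR fails on its own and which only fails when competing with the others.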
If you have a DMA that is not making the necessary transfers, you can break the overall transfer down into several steps.
- Step 1: Is the event that starts the DMA transfer reaching the DMA engine?
- Step 2: Are the DMA transfer source / destination addresses properly configured?
- Step 3: Is the source / destination memory accessible by the DMA controller? Some memory-mapped regions are only addressable by certain device masters. It's important to check that the addresses you read from and write to are accessible by the DMA controller.
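Steps 2 and 3 above can be partly automated with a small range check run before a transfer is submitted. The sketch below tests whether an entire buffer lies inside one region that the DMA controller can reach. The region table is a placeholder with example values only; populate it from the memory map and master/slave connectivity information in the data manual. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t base;  /* first address of the region */
    uint32_t size;  /* region length in bytes */
} mem_region_t;

/* Placeholder table of DMA-visible regions (example values only). */
static const mem_region_t dma_visible[] = {
    { 0x80000000u, 0x00020000u },  /* example: an on-chip shared RAM */
    { 0xC0000000u, 0x10000000u },  /* example: an external memory window */
};

/* Returns true if the whole buffer [addr, addr + len) lies inside a single
 * DMA-visible region. Written so the comparison cannot overflow. */
bool dma_can_access(uint32_t addr, uint32_t len)
{
    for (size_t i = 0; i < sizeof dma_visible / sizeof dma_visible[0]; i++) {
        const mem_region_t *r = &dma_visible[i];
        if (addr >= r->base && len <= r->size &&
            addr - r->base <= r->size - len) {
            return true;
        }
    }
    return false;
}
```

Checking both the source and the destination this way catches the case where a buffer was placed in memory that only the CPU can address.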
If you have an I/O peripheral that is not transmitting/receiving in a consistent manner, there are several steps you can take.
- Step 1: Are the clocks to the peripheral correctly set up?
- Step 2: Are the buffers being serviced in time? Some peripherals (such as McASP) have a limited time window in which you need to service the buffers before a transmit underrun or receive overrun condition causes the state machines to generate an error. You can then use this information to backtrack and determine why the buffers are not being serviced in time by the CPU or DMAs.
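To put a number on the service window in Step 2, the sketch below estimates how long a buffer lasts at a given sample rate and compares that with a measured worst-case service latency. It assumes a simple model in which one buffered sample is consumed per sample period; the function names are hypothetical, and the real buffer depth and error conditions depend on the peripheral configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Time in microseconds before `buffered_samples` are consumed at
 * `sample_rate_hz` (rounded down). Returns 0 for an invalid rate. */
uint32_t service_window_us(uint32_t buffered_samples, uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)buffered_samples * 1000000u) / sample_rate_hz);
}

/* True if the worst-case service latency fits inside the window. */
bool deadline_met(uint32_t worst_case_latency_us, uint32_t window_us)
{
    return worst_case_latency_us < window_us;
}
```

For example, 64 buffered samples at 48 kHz last about 1333 microseconds under this model, so a measured worst-case latency of 1500 microseconds would explain intermittent underruns.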
Obtaining Additional Assistance with your Investigation
Using the Divide and Conquer approach above, you will typically be able to determine the cause of the errant behavior and take corrective action within your code. However, if you are using a driver or code package that was developed by a 3rd party, you may need to seek assistance from the developer.
It is important to remember that just because the problem appears in your particular software application does not mean it can be reproduced in a different software and/or hardware setup. Because of this, it is critical that you provide evidence that you have tracked the culprit down to their specific code before pointing the finger.
Sending simple oscilloscope captures that show how your system is specifically behaving provides some of the best insight you can offer. Oscilloscope captures can be as simple as triggering on a GPIO toggle when a particular event or error occurs.
Oscilloscope captures allow multiple parties to concurrently debug the problem while:
- Guaranteeing inherent protection of your code (no risk in allowing others to see the source code)
- Showing the specific timing relationship between an event (or sequence of events)
- Showing how specific events are mitigated or exacerbated with code tweaks.
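A GPIO marker of the kind described above can be as small as the sketch below, which toggles a spare pin at an event of interest and raises it when an error status is seen, giving the oscilloscope something to trigger on. The output register is passed in as a pointer so the sketch runs on a host PC; the function names, the pin choice, and the convention that a non-zero status means an error are assumptions for illustration.

```c
#include <stdint.h>

/* Toggles one pin in a GPIO output register to mark an event. */
void scope_marker_toggle(volatile uint32_t *gpio_out, unsigned pin)
{
    *gpio_out ^= (UINT32_C(1) << pin);
}

/* Drives the pin high when `status` is non-zero (assumed to mean error),
 * so the oscilloscope can trigger on the rising edge. */
void scope_marker_on_error(volatile uint32_t *gpio_out, unsigned pin,
                           uint32_t status)
{
    if (status != 0u) {
        *gpio_out |= (UINT32_C(1) << pin);
    }
}
```

Placing a toggle at the entry and exit of an ISR, for instance, turns its execution time and its spacing relative to other signals into something directly visible on the capture.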
In addition to sending conclusive evidence, it's also important to remember that the more information you can provide upfront about your specific application, the more likely you will get fast help from the developer.
Remember, someone who has written the code is an expert on that code, and has likely come across (and successfully debugged) many of the issues you are currently seeing. The more information you can provide, the faster they will be able to provide useful feedback to help you solve your integration challenges.
When providing evidence of an issue, one should provide at the bare minimum the following information:
- Hardware Platform
- Silicon Part Number (including Package, Version, & Additional Package Markings)
- Are you working with a C6-Integra Hardware Development Platform -or- is this a custom board?
- Software Version
- Full Names & Versions of all software packages that were used to create the software application.
- Full Names of IDEs, Tool Chains, & Debugging Equipment (JTAG Emulators)
- Oscilloscope screen captures of the timing before, during, and after the error occurs.
- Additional Observations that may lead to insight into the issue.
Debug and Trace capabilities using Code Composer Studio
What is supported:
- ARM9 has a 4 KB Enhanced Trace Buffer (ETB) based trace and some AET based debug using Unified Break point Manager (UBM)
- DSP has limited debug capability using UBM
- BIOS Real time Analysis tools (RTA) and RTOS Object View(ROV) tools.
What is not supported:
- Directing trace data from the C674x DSP to ETB is not supported by this device.
- System trace (software instrumentation)
- System events monitoring
- System throughput/latency analysis
- System bus watch point
A detailed description of debug and trace support in CCS can be found in the wiki article Debug and Trace Capabilities on OMAPL1X/AM1x/DA8x devices.
Additionally, TI provides a quick utility to collect the status of important register settings via JTAG that can be run from any Code Composer Studio Emulator Tool.
The OMAP-L13x Debug GEL File can be used to help determine the current state of the OMAP-L138/C6748 during the debug phase.
Product Integration
Once the software is developed and debugged, the final step in the software development process is to take the code you've developed and port it to the actual hardware that will be integrated into your product.
IMPORTANT: If your software development made use of functions executed by a GEL file, this functionality must be incorporated into the software application itself before you can generate a bootable image. Since GEL files rely on an emulator for execution, this functionality must be replaced by application code in your end product.
Understanding how the OMAP-L138 Boot Sequence Works
A brief overview of the ROM Bootloader sequence for the OMAP-L138/C6748 can be found in the following wiki article:
Generating the Boot Image
The bootable image for the OMAP-L138/C6748 is an Application Image Script (AIS)-format binary converted from a COFF or ELF .out file. For details on generating AIS files and AIS commands, refer to the OMAP-L138/C6748 Bootloader Application Report
Additionally, TI provides several examples of creating a bootable image from your program. There are several LED Blink examples for booting from NAND, NOR, SPI, and UART on the OMAP-L138/C6748 EVM on the following wiki: OMAP-L138 Boot Examples
Programming the Image into non-volatile memory
Once a bootable image has been created, there are utilities provided by TI to flash your image into non-volatile memory that run on top of your Host PC.
Serial Flashing Utility
The Serial Flashing Utility can be used to flash parallel Flash (NAND or NOR), as well as serial Flash (SPI Flash) using a standard UART serial cable connected to a host PC running Windows or Linux.
- Fast Flashing Time
- No Connection is required to CCS or to JTAG via an Emulator
- The Serial Flash Utility only supports flashing EVMs without code modifications for user specific boards.
- The build environment needs to be downloaded to the user's PC.
Flashing via JTAG using CCS
Various CCS Projects have been developed to flash non-volatile memory via the JTAG connection using Code Composer Studio. These projects can be found in the Software Development Kit for NAND, NOR, or SPI Flash.
- Source Code is easy to modify
- No additional build environment tools to install
- Slow flashing times
- Requires a JTAG connection
Serial Port Boot Utility
When debugging, it is often useful to boot over the serial port rather than reflashing the image each time a modification is made. The UART Boot Host enables this using the following steps:
- Download the tool here
- Select the AIS boot image to flash
- Set the board to the appropriate UART boot mode and reset/power cycle
- Push Start
The existing AIS image will boot unless one of the following is true:
- NOR boot mode is used. In this case, it will be necessary to regenerate it with the AISgen tool and select the appropriate UART boot mode.
- AIS commands are used to modify the pinmux on the UART pins. In this case, the boot will not complete since the UART communication will be lost. Performing the pinmux configuration in the user code rather than AISgen will resolve this issue.
C674x DSP performance using software library functions
Software Training Resources
Code Composer Studio
Peripheral Throughput optimization
These Wiki articles describe the throughput performance of different peripherals on OMAP-L1x/C674x/AM1x devices. They also provide the factors that affect peripheral throughput and recommendations for optimum peripheral performance.
In most cases, an EDMA transfer was set up in the background to mimic system activity in a real system. This background activity is described in EDMA Background Activity for OMAP-L1x/C674x/AM1x Throughput Measurements.
LCD Controller (LCDC)
For a description of the throughput analysis of the LCDC module integrated in OMAP-L1x/C674x/AM1x devices see OMAP-L1x/C674x/AM1x LCD Controller (LCDC) Throughput and Optimization Techniques.
Universal Asynchronous Receiver/Transmitter (UART)
For a description of the throughput analysis of the UART module integrated in OMAP-L1x/C674x/AM1x devices see OMAP-L1x/C674x/AM1x UART Throughput and Optimization Techniques.
Multichannel Audio Serial Port (McASP)
See OMAP-L1x/C674x/AM1x Multichannel Audio Serial Port (McASP) Throughput and Optimization Techniques for a description of the throughput analysis done on the McASP module included in OMAP-L1x/C674x/AM1x devices.