Programming the EDMA3 using the Low-Level Driver (LLD)
From Texas Instruments Embedded Processors Wiki
The EDMA3 (Enhanced DMA version 3) peripheral provides an efficient method of moving information between memory locations. This hardware peripheral exists on most of the Application Processors that Texas Instruments (TI) has designed in the last few years.
The Low-Level Driver (LLD) is a group of software libraries provided by TI that make it easier to program and manage the EDMA3 peripheral. It consists of three libraries:
- EDMA3 Resource Manager (RM library)
- EDMA3 Driver (DRV library)
- Chip-specific configuration files (Sample library)
This discussion provides a quick introduction to these topics, then concludes with a series of examples that progressively demonstrate more of the LLD (and EDMA3) capabilities.
Go to the following link for the pdf file:
When should (and shouldn't) I use LLD to program the EDMA3?
There are several ways to program EDMA3 operations: (1) the hard way, i.e. writing directly to specific peripheral registers; (2) using the ACPY3 APIs (via a framework like Codec Engine); (3) using the Low Level Driver. You can do anything you like manually by programming the peripheral registers. However, with API libraries available, why do it the hard way?
If you plan to do only memory-to-memory transfers using the Quick DMA (QDMA), similar to a memcpy() in C, then all you need is the ACPY3 API library via a framework like Codec Engine (CE). ACPY3 uses the QDMA resources of the EDMA3 peripheral and provides no synchronization to peripherals. It is easy to set up and use if all you're doing is memory-to-memory transfers. These are quite common in video applications when large buffers need to be moved from on-chip to off-chip memories. If you are using an algorithm via Codec Engine and only want memory-to-memory transfers, stick with ACPY3 and don't use the LLD.
If you are building, for example, a peripheral driver that requires syncing transfers to a peripheral "data ready" signal, then the Low Level Driver (LLD) is the best choice. You can't use ACPY3 for this because the QDMA ignores sync events to the EDMA3, whereas the LLD was built specifically for syncing to peripheral events. You can also use the LLD for memory-to-memory transfers, as shown in the examples below. In short, if you need to respond to sync events from peripherals, use the LLD to program the EDMA3 peripheral.
Brief Overview of EDMA3
The EDMA3 peripheral acts much like a separate processor on most TI devices: it performs data transfers without intervention from the main CPU. Almost all systems require some type of data movement - either from one memory location to another or from a peripheral register to memory (or vice versa). The EDMA3 is a powerful co-processor that can handle almost any type of transfer - either synchronous or asynchronous.
The EDMA3 consists of two basic components - the DMA and the QDMA. The QDMA uses a trigger word (i.e. writing to the designated trigger register starts the transfer) and is used for memory-to-memory data movement. It cannot be synced to a peripheral event. The DMA is configured to respond to sync events from peripherals - i.e. the "data ready" signals. For example, a common system using a serial port hooked to an ADC gathers a sample every sample period. When this "data" is ready to be copied to memory, a sync event is sent to the DMA and the DMA copies the data from the peripheral register to a memory buffer independent of the CPU. Once the buffer is full, the DMA can interrupt the CPU to say "you have a buffer to process" ... and so on.
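The ADC scenario above can be modelled in a few lines of plain C. This is only a software sketch of the behaviour, not LLD or register-level code: the names dma_model_t, dma_model_init() and dma_model_on_sync_event() are hypothetical, and the "peripheral register" is just a variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one DMA channel servicing a peripheral. */
typedef struct {
    const volatile uint16_t *periph_reg; /* stands in for the serial port data register */
    uint16_t *buf;                       /* destination buffer in memory */
    size_t    capacity;                  /* samples per buffer */
    size_t    filled;                    /* samples copied so far */
    int       buffer_ready;              /* set where the CPU interrupt would fire */
} dma_model_t;

static void dma_model_init(dma_model_t *ch, const volatile uint16_t *reg,
                           uint16_t *buf, size_t capacity)
{
    assert(ch != NULL && reg != NULL && buf != NULL && capacity > 0);
    ch->periph_reg   = reg;
    ch->buf          = buf;
    ch->capacity     = capacity;
    ch->filled       = 0;
    ch->buffer_ready = 0;
}

/* Call once per "data ready" sync event: copies one sample from the
 * register to the buffer. Returns 1 when this event filled the buffer
 * (the point where the DMA would interrupt the CPU), otherwise 0. */
static int dma_model_on_sync_event(dma_model_t *ch)
{
    if (ch->filled == ch->capacity) {
        /* Previous buffer was full: start over at the beginning. */
        ch->filled = 0;
        ch->buffer_ready = 0;
    }
    ch->buf[ch->filled++] = *ch->periph_reg;
    if (ch->filled == ch->capacity) {
        ch->buffer_ready = 1;
        return 1;
    }
    return 0;
}
```

In the model, the caller plays the role of the peripheral by writing a new value to the register variable and then calling dma_model_on_sync_event(). On real hardware the copy happens without any CPU code running.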
DMA transfers can be triggered 3 different ways: (1) a manual START; (2) via a sync event from a peripheral; (3) via a chained event - i.e. the completion of channel x can kick off channel y to start.
A basic transfer requires the source address, destination address and a count value (how much to copy). Source and destination addresses are fairly obvious. The count value, however, is three-dimensional - ACNT, BCNT and CCNT. ACNT specifies the number of bytes in an "element". For example, for 16-bit audio data, ACNT would be 2 bytes. This is the "minimum" transfer size. BCNT specifies the number of "elements" in a "frame" or "line". CCNT specifies the "block" size or the number of frames in the block. Each of these count values is a 16-bit field, so each maxes out at 65535 (just under 64K). So, the max transfer is roughly 64K*64K*64K bytes. Big enough?
For more advanced transfers, the EDMA3 offers indexing between transfers for both the source and destination addresses. For example, if you'd like to bump the src address 4 bytes after each element transfer, you can set BIDX to 4. After the EDMA3 transfers the first element, the src address will be indexed by 4 bytes (the index is signed, so it can move forward or backward) prior to the next transfer. The destination address has similar indexing capabilities. Also, CIDX can be employed to bump the src or dst addresses after a "frame" or "line" is transferred. A combination of these indexes can automatically perform "channel sorting" on incoming or outgoing data "free of charge" - with no time penalty.
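The three-dimensional count is easy to check with a little arithmetic. The helper below is a hypothetical sketch (transfer_total_bytes() is not an LLD function) that multiplies the three counts, assuming each one fits in 16 bits and that a zero count is a caller error.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total bytes moved by one ACNT x BCNT x CCNT transfer.
 * acnt = bytes per element, bcnt = elements per frame, ccnt = frames per block. */
static uint64_t transfer_total_bytes(uint16_t acnt, uint16_t bcnt, uint16_t ccnt)
{
    /* This sketch treats a zero count as a mistake by the caller. */
    assert(acnt > 0 && bcnt > 0 && ccnt > 0);
    /* Widen before multiplying so the product cannot overflow. */
    return (uint64_t)acnt * bcnt * ccnt;
}
```

For example, 16-bit audio samples (ACNT = 2) arranged as 256 samples per frame (BCNT = 256) and 4 frames per block (CCNT = 4) come to transfer_total_bytes(2, 256, 4), i.e. 2048 bytes.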
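To make the indexing idea concrete, here is a software model of an indexed transfer, used to sort interleaved stereo samples into separate left and right blocks. It is a sketch under stated assumptions, not LLD code: indexed_copy() and deinterleave_stereo() are hypothetical names, and CIDX is interpreted as the offset from the start of one frame to the start of the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an indexed transfer. BIDX is the byte offset
 * between consecutive elements; CIDX is assumed here to be the byte
 * offset between the starts of consecutive frames. */
static void indexed_copy(const uint8_t *src, uint8_t *dst,
                         uint16_t acnt, uint16_t bcnt, uint16_t ccnt,
                         int16_t src_bidx, int16_t dst_bidx,
                         int16_t src_cidx, int16_t dst_cidx)
{
    for (uint16_t c = 0; c < ccnt; c++) {
        const uint8_t *s = src + (ptrdiff_t)c * src_cidx;
        uint8_t *d = dst + (ptrdiff_t)c * dst_cidx;
        for (uint16_t b = 0; b < bcnt; b++) {
            /* One element: ACNT contiguous bytes. */
            memcpy(d + (ptrdiff_t)b * dst_bidx,
                   s + (ptrdiff_t)b * src_bidx, acnt);
        }
    }
}

/* Channel sorting: L0 R0 L1 R1 ... becomes L0 L1 ... R0 R1 ... */
static void deinterleave_stereo(const int16_t *in, int16_t *out,
                                uint16_t samples_per_channel)
{
    const uint16_t elem = (uint16_t)sizeof(int16_t);
    /* Keep the destination frame offset within a signed 16-bit index. */
    assert(samples_per_channel > 0 && samples_per_channel <= 16383);
    indexed_copy((const uint8_t *)in, (uint8_t *)out,
                 elem,                 /* ACNT: one 16-bit sample        */
                 samples_per_channel,  /* BCNT: samples in one channel   */
                 2,                    /* CCNT: two channels             */
                 (int16_t)(2 * elem),  /* src BIDX: skip other channel   */
                 (int16_t)elem,        /* dst BIDX: pack contiguously    */
                 (int16_t)elem,        /* src CIDX: start of R samples   */
                 (int16_t)(samples_per_channel * elem)); /* dst CIDX     */
}
```

The sorting comes entirely from the index values. The copy loop itself is the same one that would perform a plain block copy if the source and destination indexes matched.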
The EDMA3 also allows for "linking" and "chaining" capabilities. Linking is the process of "reloading" a channel's peripheral configuration registers (or buffer descriptor) with another set after the first transfer is done. For example, when channel X is complete, maybe you'd like to transfer the same thing again. Extra "Reload" parameter sets are available to hold another configuration that can be "reloaded" automatically into Channel X's registers upon completion of a transfer. This does NOT start a transfer - it only configures the channel for the next transfer. "Chaining" is the ability for one channel's completion to trigger another transfer. For example, if channel X is chained to channel Y, when channel X completes, it triggers channel Y to start transferring.
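The difference between linking and chaining can be seen in a small software model. Everything here is a hypothetical sketch (param_set_t, channel_t and channel_trigger() are illustrative names, not LLD types): a link only reloads the channel's configuration, while a chain actually triggers another channel. The model assumes a channel with no reload set ends up with an empty configuration, and that chains do not form a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one transfer configuration. */
typedef struct param_set {
    const uint8_t *src;
    uint8_t *dst;
    size_t nbytes;                 /* 0 means "nothing configured"        */
    const struct param_set *link;  /* reload set, or NULL for no linking  */
} param_set_t;

/* Hypothetical model of a channel. */
typedef struct channel {
    param_set_t active;            /* configuration currently loaded      */
    struct channel *chain_to;      /* channel to trigger on completion    */
    unsigned runs;                 /* number of transfers performed       */
} channel_t;

static void channel_trigger(channel_t *ch)
{
    assert(ch != NULL);
    if (ch->active.nbytes == 0) {
        return;                    /* empty configuration: nothing to do  */
    }
    memcpy(ch->active.dst, ch->active.src, ch->active.nbytes);
    ch->runs++;

    /* Linking: reload the configuration. This does NOT start a transfer. */
    if (ch->active.link != NULL) {
        ch->active = *ch->active.link;
    } else {
        ch->active.nbytes = 0;     /* assumed: no reload leaves it empty  */
    }

    /* Chaining: completion of this channel triggers another channel.
     * Assumes the chain is not circular. */
    if (ch->chain_to != NULL) {
        channel_trigger(ch->chain_to);
    }
}
```

If channel X has a reload set and is chained to channel Y, one trigger of X performs X's first transfer, loads X's second configuration without running it, and starts Y. X's second transfer happens only when X is triggered again.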
This is only a brief overview of the EDMA3's capabilities. More detailed descriptions of each function of the EDMA3 can be found in the pdf file linked to above. Also, it is highly advised to download the examples below if you plan to use the LLD to program the EDMA. They are heavily commented and offer an outstanding starting point for people new to LLD.
Brief Overview of LLD
The EDMA3 Low-Level Driver (LLD) is a set of APIs that support programming the EDMA3 peripheral. Each of the capabilities outlined above requires specific API calls to set the source and destination addresses, count values, indexing, linking and chaining, as well as to configure what type of trigger starts the transfer. Under the hood, the LLD is simply programming the peripheral register sets for you. This spares the user from needing to know the peripheral register addresses and cumbersome programming techniques. The APIs are relatively easy to use and most are self-explanatory.
The best way to learn how to use the LLD is to download the pdf file shown below and work through the examples. Much time was spent working through each example and learning the LLD from the ground up. If you plan to use the LLD, working through the pdf document and perusing the examples will save you precious design time in getting up to speed quickly on the LLD.
The LLD is included in many versions and variants of the DVSDK, among other packages. It is also available for download as a separate component.
Getting to know LLD by example – 9 to show the way
| Example | Description |
|---|---|
| 1. Async | A simple asynchronous transfer; akin to a memcpy(). Uses polling to determine when the transfer has completed. |
| 2. IntGen | The EDMA generates a CPU interrupt when the async example transfer is complete. |
| 3. SyncAB | Adds synchronization to our example. The data transfer starts when a timer interrupt occurs. With “AB” sync, each timer event transfers one full frame (ACNT * BCNT bytes). |
| 4. SyncA | Similar to the SyncAB example, but uses “A” synchronization, where each event transfers a single element (ACNT bytes). |
| 5. Link | Two transfers are set up; when one completes, the “linked” channel is reconfigured automatically using another configuration. The terms linking, autoinitialization, and reload are used to describe the same thing. |
| 6. Chain | Two different channels are configured for transfers. After the first one completes, it triggers the next channel to start. (Similar to linking, but rather than reloading the same channel, another channel is triggered to run.) |
| 7. ChanSort | Explores the channel sorting features of the EDMA3. This involves using indexing (i.e. offsets) when incrementing (or decrementing) the source and destination addresses. |
| 8. Audio | Uses the DMA to transfer data in a simple audio application. This example includes synchronization, linking, and sorting. |
| 9. ConfigStruct | Examples 1-8 were simple examples that partitioned the LLD calls into separate functions (alloc resources, configure transfers, start, delete). These were “canned” functions, though, in that they were partially hard-coded to accomplish the example. Example 9 tries to abstract the configuration requirements so that the provided functions could be used (and reused) for a variety of purposes. To accomplish this, we created a structure to pass the transfer requirements to the various (now more generic) functions. |
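The practical difference between the SyncA and SyncAB examples is how much data each sync event moves, and therefore how many events a whole block needs. The helpers below are a hypothetical sketch of that arithmetic (sync_type_t, bytes_per_event() and events_needed() are illustrative names, not LLD identifiers).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two synchronization types. */
typedef enum { SYNC_A, SYNC_AB } sync_type_t;

/* Bytes moved in response to a single sync event. */
static uint32_t bytes_per_event(sync_type_t sync, uint16_t acnt, uint16_t bcnt)
{
    assert(acnt > 0 && bcnt > 0);
    /* A-sync: one element per event. AB-sync: one whole frame per event. */
    return (sync == SYNC_A) ? (uint32_t)acnt : (uint32_t)acnt * bcnt;
}

/* Sync events required to move the complete block. */
static uint32_t events_needed(sync_type_t sync, uint16_t bcnt, uint16_t ccnt)
{
    assert(bcnt > 0 && ccnt > 0);
    /* A-sync: one event per element. AB-sync: one event per frame. */
    return (sync == SYNC_A) ? (uint32_t)bcnt * ccnt : (uint32_t)ccnt;
}
```

With ACNT = 2, BCNT = 16 and CCNT = 4, A-sync needs 64 events of 2 bytes each, while AB-sync needs 4 events of 32 bytes each. Both move the same 128 bytes in total.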
EDMA3 LLD Download / Contributed Examples
The EDMA LLD itself can be downloaded at:
You can access the code examples at TI's external Gforge site:
- examples project page (with SVN, notes, files, etc.)
- zip-files and PDF document (PDF download requires a gforge login)