OMAP-L1x/C674x/AM1x Multichannel Audio Serial Port (McASP) Throughput and Optimization Techniques

From Texas Instruments Wiki


This article presents throughput measurements and optimization techniques for the Multichannel Audio Serial Port (McASP) of OMAP-L1x/C674x/AM1x devices. Different variables were explored to assess their impact on McASP performance. The maximum throughput that the McASP can achieve is also presented.

The information in this article deals mainly with OMAP-L1x8/C674m/AM18xx devices (where m is an even number). However, the data can also be extended to OMAP-L1x7/C674n/AM17xx devices (where n is an odd number), given that both sets of devices share a similar SOC architecture.

McASP Basics

The following sections provide a high-level overview of the McASP. Detailed information on the McASP can be obtained from the TMS320C674x/OMAP-L1x Multichannel Audio Serial Port (McASP) User's Guide (SPRUFM1).


Features

The McASP is a general-purpose serial port optimized for multichannel audio applications. It supports time-division multiplexed (TDM) streams, Inter-IC Sound (I2S) protocols, and direct connection to analog-to-digital converters (ADC) and digital-to-analog converters (DAC). The McASP includes independent transmit and receive sections with separate master clocks, bit clocks, and frame syncs. It provides up to 16 serializers that can be individually enabled for either transmit or receive operation; the number of serializers supported on each McASP varies from device to device (see your device data sheet). Each McASP also includes a 256-byte Read FIFO and Write FIFO.

McASP Block Diagram


Data Flow

The McASP generates transmit/receive DMA event (AXEVT/AREVT) requests when it needs data to be transferred to/from its serializer registers. When the FIFOs are enabled, the Read and Write FIFOs directly service these data transfer requests. When a specific amount of data is available or needed in the FIFOs, the FIFOs in turn generate a transmit/receive DMA event (AXEVT/AREVT). The EDMA must service the McASP FIFOs when AXEVT/AREVT are generated. The amount of data requested by the FIFOs depends on the number of serializers being used and on the FIFO configuration (see below).

FIFO Configuration

The McASP Write and Read FIFOs temporarily hold serializer data.

  • 256 bytes = up to four 32-bit words per serializer in the case of 16 active serializers
  • 256 bytes = up to 64 32-bit words in the case of one active serializer
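The two figures above follow from dividing the FIFO capacity evenly among the active serializers. A minimal sketch of that arithmetic, assuming 32-bit FIFO entries and an even split (the function and constant names are illustrative, not part of any TI API):

```python
FIFO_BYTES = 256   # size of the Read or Write FIFO, as stated above
WORD_BYTES = 4     # assumption: each FIFO entry is one 32-bit word


def fifo_words_per_serializer(active_serializers: int) -> int:
    """Return how many 32-bit FIFO words are available to each active serializer."""
    if not 1 <= active_serializers <= 16:
        raise ValueError("the McASP provides at most 16 serializers")
    total_words = FIFO_BYTES // WORD_BYTES          # 64 words in total
    return total_words // active_serializers        # floor when not evenly divisible


for n in (1, 4, 8, 16):
    print(f"{n:2d} active serializer(s): {fifo_words_per_serializer(n)} words each")
```

The fewer serializers are active, the more words of buffering each one gets, which is why a higher EVT DMA ratio is possible with fewer serializers.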

The WNUMDMA and WNUMEVT parameters are used to configure the Write FIFO.  Similarly RNUMDMA and RNUMEVT configure the Read FIFO.

WNUMDMA specifies the write word count per McASP transfer. This value must equal the number of McASP serializers used as transmitters. Upon a transmit DMA event from the McASP, WNUMDMA words are transferred from the Write FIFO to the McASP.

WNUMEVT specifies the write word count per DMA event. This value must be a non-zero integer multiple of the number of serializers enabled as transmitters. When the Write FIFO has space for at least WNUMEVT words of data, the FIFO generates an AXEVT (transmit DMA event) to the host/DMA controller.

RNUMEVT and RNUMDMA are the receive/read versions of the two parameters just described.
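The rules for the Write FIFO parameters can be checked before programming the hardware. The following is a rough sketch of such a check; the helper name is hypothetical, and the upper bound of 64 words is an assumption derived from the 256-byte FIFO size rather than a rule stated in this article. The same check would apply to RNUMDMA and RNUMEVT with the number of receive serializers.

```python
FIFO_WORDS = 64  # assumption: 256-byte FIFO holding 32-bit words


def check_write_fifo_config(tx_serializers: int, wnumdma: int, wnumevt: int) -> list:
    """Return a list of rule violations; an empty list means the values are consistent."""
    if tx_serializers < 1:
        raise ValueError("at least one transmit serializer is required")
    problems = []
    if wnumdma != tx_serializers:
        problems.append("WNUMDMA must equal the number of transmit serializers")
    if wnumevt <= 0 or wnumevt % tx_serializers != 0:
        problems.append("WNUMEVT must be a non-zero integer multiple "
                        "of the number of transmit serializers")
    if wnumevt > FIFO_WORDS:
        problems.append("WNUMEVT exceeds the assumed FIFO capacity of 64 words")
    return problems


# Four transmit serializers, eight words requested per DMA event
print(check_write_fifo_config(tx_serializers=4, wnumdma=4, wnumevt=8))
# WNUMEVT = 6 is not a multiple of 4, so one violation is reported
print(check_write_fifo_config(tx_serializers=4, wnumdma=4, wnumevt=6))
```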

McASP FIFO Use


When FIFO is disabled, EDMA writes data directly to serializers; WNUMDMA, WNUMEVT, RNUMDMA, and RNUMEVT are ignored.

The “TX/RX EVT DMA RATIO” is the multiplication factor between W/RNUMEVT and W/RNUMDMA.

  • WNUMEVT = TX EVT DMA RATIO * WNUMDMA
  • RNUMEVT = RX EVT DMA RATIO * RNUMDMA

TX/RX EVT DMA RATIO affects how often DMA requests are generated by the McASP. If TX EVT DMA RATIO = 1, the FIFO triggers a DMA request on every transmit event. However, if TX EVT DMA RATIO = 4, the FIFO triggers a DMA request only once every four transmit events. Reducing the number of DMA requests makes it easier to meet real-time deadlines.
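To see how the ratio relaxes the real-time deadline, the spacing between DMA requests can be estimated from the bit clock and the slot size. The sketch below assumes back-to-back slots whose size equals the word size, so that one transmit event occurs per slot; the function names and the 32 MHz / 16-bit operating point are illustrative choices only.

```python
def slot_time_s(bit_clock_hz: float, slot_bits: int) -> float:
    """Duration of one word slot, i.e. the assumed spacing of McASP transmit events."""
    return slot_bits / bit_clock_hz


def dma_request_interval_s(bit_clock_hz: float, slot_bits: int, evt_dma_ratio: int) -> float:
    """Time between DMA requests when the FIFO batches evt_dma_ratio transmit events."""
    if evt_dma_ratio < 1:
        raise ValueError("EVT DMA ratio must be at least 1")
    return evt_dma_ratio * slot_time_s(bit_clock_hz, slot_bits)


def words_per_dma_request(tx_serializers: int, evt_dma_ratio: int) -> int:
    """WNUMEVT = ratio * WNUMDMA, with WNUMDMA equal to the number of TX serializers."""
    return evt_dma_ratio * tx_serializers


for ratio in (1, 2, 4):
    interval_us = dma_request_interval_s(32e6, 16, ratio) * 1e6
    print(f"ratio={ratio}: one DMA request every {interval_us:.2f} us, "
          f"{words_per_dma_request(4, ratio)} words per request")
```

A larger ratio gives the EDMA proportionally more time per request, at the cost of moving more words each time.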

Clocking

The McASP serial clocks (ACLKR and ACLKX) can be generated internally or externally. In the internal case, ACLKR and ACLKX are generated from the high-frequency clocks (AHCLKR and AHCLKX) and the clock dividers (CLKRDIV and CLKXDIV). The high-frequency clocks are either supplied externally on the AHCLKX and AHCLKR pins or generated internally from AUXCLK and the clock dividers (HCLKXDIV and HCLKRDIV). In the external case, ACLKR and ACLKX are sourced directly on the McASP pins.
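For the fully internal case, the resulting bit clock is the source clock divided by both dividers in sequence. The following sketch illustrates this for the transmit side; the arguments are plain divide ratios (not raw register field encodings), and the 24 MHz AUXCLK value is a hypothetical example rather than a figure from this article.

```python
def internal_aclkx_hz(auxclk_hz: float, hclkxdiv: int, clkxdiv: int) -> float:
    """Bit clock derived from AUXCLK through the two transmit dividers.

    AHCLKX = AUXCLK / HCLKXDIV, then ACLKX = AHCLKX / CLKXDIV.
    Both divider arguments are divide ratios, assumed to be >= 1.
    """
    if hclkxdiv < 1 or clkxdiv < 1:
        raise ValueError("divide ratios must be at least 1")
    ahclkx_hz = auxclk_hz / hclkxdiv
    return ahclkx_hz / clkxdiv


# Hypothetical numbers: 24 MHz AUXCLK, divide by 2, then divide by 4
print(internal_aclkx_hz(24_000_000, hclkxdiv=2, clkxdiv=4))
```

The receive side follows the same pattern with HCLKRDIV and CLKRDIV. When AHCLKX is supplied on the pin instead, the same function applies with the pin frequency as the source and a high-frequency divide ratio of 1.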

McASP Transmit Clock Generator Block Diagram



McASP Receive Clock Generator Block Diagram



The McASP supports independent transmit and receive clock zones. However, the McASP receiver can also be configured to operate synchronously to the transmitter clock and frame signals (ACLKX and AFSX).

McASP Throughput Characterization

A vast amount of throughput data was collected on the McASP. Several knobs (or variables) were turned to get a full understanding of the McASP throughput performance under different scenarios.

One important variable considered during these throughput measurements was the impact of other master activity on McASP throughput. To simulate activity generated by other masters, a dummy continuous EDMA transfer was set up to compete for access to external memory. Several aspects of this background activity were also varied.

Throughput Data Test Variables
| Test Variable | Options |
|---|---|
| DSP/ARM Frequency | 300, 200, and 100 MHz |
| Pass/Fail Criteria | A test fails when the EDMA is not able to complete the transfer of data from the McASP or FIFO to the destination buffers |
| McASP Data Location | L2, Shared RAM, DDR2 (132 MHz) |
| Test Parameters | Serial clock frequency and word size (8, 16, and 32 bits); number of serializers (4 TX/RX, 8 TX/RX); Write/Read FIFO enabled/disabled; TX/RX EVT DMA RATIO; EDMA default burst size; background EDMA activity with different read rate levels and priority; DDR2 memory controller peripheral bus burst priority register (PBBPR) setting |

Notes:

  • The TX/RX EVT DMA RATIO value shown in the data slides is the minimum value for which the test always passes for a given set of test parameters. A lower ratio causes the test to fail.
  • All data collected using low-level software; BIOS and Linux were not used.


Test Environment

The following list describes the test environment under which the McASP throughput data was collected:

  • All data was collected with the DSP and ARM running at three different frequencies: 300, 200, and 100 MHz.
  • L2, Shared RAM, and DDR2 memory were used to hold the source (SRC) and destination (DST) data buffers.
  • No drivers or high-level operating systems (BIOS, Linux, etc.) were used for this testing.  All data was collected using low-level software.


Summary of McASP Throughput Data

The subsections below summarize the large amount of data collected by varying the parameters listed in the Throughput Data Test Variables table. To see the specific impact of these test parameters on McASP performance, refer to the backup slides in the presentations at the end of this wiki article.

NOTE: "Max throughput" is on a per serializer basis.
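Because the reported figures are per serializer, the aggregate data rate that the EDMA has to sustain scales with the number of serializers counted. A small sketch of that conversion, assuming every counted serializer runs at the quoted per-serializer rate (the function name is illustrative):

```python
def aggregate_throughput_mbps(per_serializer_mbps: float, serializers: int) -> float:
    """Total data rate across serializers, given the per-serializer figure."""
    if serializers < 1:
        raise ValueError("at least one serializer is required")
    return per_serializer_mbps * serializers


# Per-serializer values taken from the standalone summary tables in this article
print(aggregate_throughput_mbps(49, 4))   # 4 serializers, 32-bit elements, FIFO enabled
print(aggregate_throughput_mbps(32, 8))   # 8 serializers, 32-bit elements, FIFO enabled
```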

Standalone Transfers, No EDMA Background Activity

Summary of McASP Standalone Data, 4 TX/RX Serializers
| Test Scenario | Max Throughput (Mbps) | SRC and DST Buffers |
|---|---|---|
| Standalone test, element size = 32 bits, FIFO enabled | 49 | L2 with TX/RX EVT DMA ratio = 1 |
| | 49 | L3 with TX/RX EVT DMA ratio = 1 |
| | 49 | DDR2 with TX/RX EVT DMA ratio = 1 |
| Standalone test, element size = 16 bits, FIFO enabled | 32 | L2 with TX/RX EVT DMA ratio = 2 |
| | 32 | L3 with TX/RX EVT DMA ratio = 2 |
| | 32 | DDR2 with TX/RX EVT DMA ratio = 3 |
| Standalone test, element size = 8 bits, FIFO enabled | 16 | L2 with TX/RX EVT DMA ratio = 2 |
| | 16 | L3 with TX/RX EVT DMA ratio = 2 |
| | 16 | DDR2 with TX/RX EVT DMA ratio = 2 |


Summary of McASP Standalone Data, 8 TX/RX Serializers
| Test Scenario | Max Throughput (Mbps) | SRC and DST Buffers |
|---|---|---|
| Standalone test, element size = 32 bits, FIFO disabled | 21.5 | L2 |
| | 21.5 | L3 |
| | 21.5 | DDR2 |
| Standalone test, element size = 32 bits, FIFO enabled | 32 | L2 with TX/RX EVT DMA ratio = 2 |
| | 32 | L3 with TX/RX EVT DMA ratio = 2 |
| | 32 | DDR2 with TX/RX EVT DMA ratio = 2 |
| Standalone test, element size = 16 bits, FIFO disabled | 10.66 | L2 |
| | 10.66 | L3 |
| | 10.66 | DDR2 |
| Standalone test, element size = 16 bits, FIFO enabled | 16 | L2 with TX/RX EVT DMA ratio = 2 |
| | 16 | L3 with TX/RX EVT DMA ratio = 2 |
| | 16 | DDR2 with TX/RX EVT DMA ratio = 2 |
| Standalone test, element size = 8 bits, FIFO disabled | 4.8 | L2 |
| | 4.8 | L3 |
| | 4.8 | DDR2 |
| Standalone test, element size = 8 bits, FIFO enabled | 8 | L2 with TX/RX EVT DMA ratio = 2 |
| | 8 | L3 with TX/RX EVT DMA ratio = 2 |
| | 8 | DDR2 with TX/RX EVT DMA ratio = 2 |


Competing Traffic, EDMA Background Activity Using Different TC

Summary of McASP Concurrent Data, 4 TX/RX Serializers, Different TC Used for Background Activity


NOTE: All of the test scenarios above were run with background activity consisting of a continuous data transfer of 4K bytes.


Summary of McASP Concurrent Data, 8 TX/RX Serializers, Different TC Used for Background Activity


NOTE: All of the test scenarios above were run with background activity consisting of a continuous data transfer of 4K bytes.

Competing Traffic, EDMA Background Activity Using Same TC

Summary of McASP Concurrent Data, 4 TX/RX Serializers, Same TC Used for Background Activity


NOTE: All of the test scenarios above were run with background activity consisting of a continuous data transfer of 4K bytes.


Summary of McASP Concurrent Data, 8 TX/RX Serializers, Same TC Used for Background Activity


NOTE: All of the test scenarios above were run with background activity consisting of a continuous data transfer of 4K bytes.

Factors Affecting McASP Throughput

The following table describes the main factors affecting McASP throughput and general recommendations for handling those factors.


Factors Affecting McASP Throughput
| Factor | Impact | General Recommendation |
|---|---|---|
| SRC/DST Buffer Location | Different memories have different access latencies. The longer the access latency, the lower the sustainable McASP throughput. Internal memory (L2 and Shared RAM) has lower access latencies than external memory (DDR). | In general, try to keep McASP SRC/DST buffers in internal memory to meet McASP real-time requirements. |
| EDMA Queue/TC assignment for McASP | Assigning McASP transmit and receive events to the same Queue/TC might add delay in servicing individual events. | Assign McASP transmit and receive EDMA events to the same queue/TC during general usage to save EDMA queue/TC resources, as the performance impact is not significant. |
| McASP FIFO Use | Using the FIFO increases the McASP data rate and decreases the real-time burden on the EDMA. | Use the McASP FIFO whenever possible. The ratio between W/RNUMEVT and W/RNUMDMA should be increased as much as possible to take full advantage of the FIFOs and in turn increase the performance of the McASP. |
| Large, parallel data transfer on same EDMA Queue/TC | Assigning large paging transfers to the same EDMA Queue/TC as McASP transfers impacts McASP performance. | Move large paging transfers with non-real-time requirements to a different queue/TC and reduce the priority of that TC relative to the TC used for McASP transfers. |


McASP Throughput Presentation

The following presentation files summarize the results of all throughput measurements conducted on the OMAP-L1x8/C674x/AM18xx class of devices.

Omapl1x8_c674x_am18xx_mcaspThroughput.zip

Omapl1x8_c674x_am18xx_mcaspThroughput_backupSlides_Part1.zip

Omapl1x8_c674x_am18xx_mcaspThroughput_backupSlides_Part2a.zip

Omapl1x8_c674x_am18xx_mcaspThroughput_backupSlides_Part2b.zip

Omapl1x8_c674x_am18xx_mcaspThroughput_backupSlides_Part3.zip

