C6EZAccel ARM User Documentation

END OF LIFE

C6Accel is still available for download, but is no longer actively developed or maintained. Please consider the alternatives listed at the end of this page.

This article is part of a collection of articles describing the C6EZAccel included in DaVinci/OMAPL/OMAP3 devices. The C6EZAccel main page is the entry point to the full C6EZAccel reference guide.

Using C6EZAccel in an ARM application

User classification based on Level of control

The C6EZAccel package is designed for two classes of ARM SoC users who want different levels of control.

  • Basic User: An ARM SoC user who wants an abstracted view of the iUniversal and C6EZAccel implementation while using C6EZAccel. For such users, the package provides a C6EZAccel wrapper API library that hides both the C6EZAccel design and the iUniversal Codec Engine interface from the ARM application developer.
  • Advanced User: An ARM SoC user who has Codec Engine and iUniversal experience, or who does not mind learning them along with some details of the C6EZAccel implementation, in order to extract maximum performance from C6EZAccel.

The workflow for both classes of users is described in the section Interfacing C6EZAccel with the application.

Interfacing C6EZAccel with the application

The initial steps of interfacing C6EZAccel with the application vary with the user's experience with the tools offered by TI.


Initial configuration steps for users with no experience with XDCtools and Codec Engine

For users with no experience with XDC-based builds, the C6EZAccel package provides a prebuilt configuration. This is the simplest way of interfacing C6EZAccel with an ARM application. Prerequisite: build the C6EZAccel package.

  • To include the prebuilt C6EZAccel configuration in your application, add the following macros to the Makefile for your application:
# Location of Prebuilt configuration files
XDC_CFG		= $(C6ACCEL_INSTALL_DIR)/soc/app/c6accel_app_config
 
# Compiler options to be added to your compile step
XDC_CFLAGS	= $(XDC_CFG)/compiler.opt
 
# Linker file to be linked to your linker
XDC_LFILE	=  $(XDC_CFG)/linker.cmd
 
# C6EZAccel ARM side Library
C6ACCEL_LIB += $(C6ACCEL_INSTALL_DIR)/soc/c6accelw/lib/c6accelw_$(PLATFORM).a470MV
  • Add the XDC_CFLAGS to the rest of the CFLAGS used in the Makefile. XDC_CFLAGS names the compiler.opt file, so the contents of that file are expanded here:
CFLAGS += $(shell cat $(XDC_CFLAGS))
  • Add XDC_LFILE to the link step to include all libraries required to invoke C6EZAccel on the DSP.
$(TARGET):	$(OBJFILES) $(C6ACCEL_LIB) $(XDC_LFILE)

Note: For users of this mechanism, the alg name is c6accel and the engine name is the name of the platform (omapl138 or omap3530). The application code uses this information when it invokes C6accel_create().

After completing the initial configuration steps, go to the section Common Steps for all users of C6EZAccel for the steps needed to use C6EZAccel in application code.

Initial configuration steps for users familiar with XDCtools and Codec Engine

  • To support engines with remote codecs, a Codec Server must be created. The Codec Server integrates the various components necessary to house the codecs (e.g. DSP/BIOS, Framework Components, link drivers, codecs, Codec Engine) and generates an executable. For any user-defined application, the user needs to integrate C6EZAccel into the user-defined codec server. For details of integrating C6EZAccel into the codec server, refer to the Codec Engine Server Integrator's Guide.

The C6EZAccel package comes with a prebuilt, platform-specific unit server, which is used by the sample test application found under soc/app.

To integrate a codec server and invoke a codec, the application must contain:

  • A .cfg file that includes the codec server in the application, along with the configuration of Codec Engine and DSP Link. Refer to the sample application in the package under /soc/app.

The configuration file (.cfg) uses createFromServer() to integrate the specific server that needs to be invoked in the application:

var demoEngine = Engine.createFromServer(
    "omap3530",
    "./omap3530.x64P",
    "ti.c6accel_unitservers.omap3530"
    );
  • Add the C6EZAccel ARM-side library in the Makefile:
# C6EZAccel ARM side Library
C6ACCEL_LIB += $(C6ACCEL_INSTALL_DIR)/soc/c6accelw/lib/c6accelw_$(PLATFORM).a470MV

Link this library in the link step to invoke the C6EZAccel ARM-side APIs.

  • Add an XDC build step to your application that runs configuro on your .cfg file to configure the codec server into your application. Link the XDC-generated linker command file and add compiler.opt to the compiler flags, as described in the section above (Initial configuration steps for users with no experience with XDCtools and Codec Engine).
# Example XDC build step in the Makefile
$(XDC_LFILE) $(XDC_CFLAGS):	$(XDC_CFGFILE)
	@echo
	@echo ======== Building $(TARGET) ========
	@echo Configuring application using $<
	@echo
	$(VERBOSE) $(CONFIGURO) -o $(XDC_CFG) -t $(XDC_TARGET) -p $(PLATFORM_XDC) -b $(CONFIG_BLD) $(XDC_CFGFILE)

Once the engine is configured in the application it can be invoked from the application.

Note: The Makefile for the application must include the C6EZAccel install directory $(C6ACCEL_INSTALL_DIR).

Common Steps for all users of C6EZAccel.

  • In App_main, or in any file in which C6EZAccel needs to be invoked, follow these steps:

1. Include Codec Engine header files

#include <ti/sdo/ce/Engine.h>
#include <ti/sdo/ce/CERuntime.h>
#include <ti/sdo/ce/osal/Memory.h>

Note: If the application uses DMAI, the Memory include can be replaced by a DMAI include. These includes are necessary because C6EZAccel, like all DSP codecs, expects the application developer to allocate contiguous memory for the parameters and for the input and output buffers/vectors being passed.

2. Include C6EZAccel application codec header file iC6accel_ti.h or the C6EZAccel wrapper API file c6accelw.h

#include "../c6accelw/c6accelw.h"

OR

#include "ti/c6accel/iC6accel_ti.h"

Note: The codec package path must be set as an include path.


3. Declare a C6accel Handle

C6accel_Handle hC6accel = NULL;

4. Define the engine name (the same as configured in the .cfg file) and the alg name (default: c6accel). In this case:

#define ENGINENAME "omap3530"
#define ALGNAME "c6accel"

5. Before creating a C6EZAccel instance, the user must ensure that the Codec Engine runtime has been initialized. This can be done by calling the Codec Engine API CERuntime_init(), declared in CERuntime.h, before any of the C6EZAccel APIs are used in the code.

CERuntime_init();

6. Once Codec Engine is initialized, the user can call C6accel_create(), which generates the C6accel handle.

hC6accel = C6accel_create(ENGINENAME, NULL, ALGNAME, NULL);

Refer to C6accel_create() for details of the create API call.

7. Once C6accel_create() has been invoked successfully, basic users can call kernels in the codec using the API calls shown in the section C6EZAccel ARM side wrapper library Reference, and advanced users can use the chaining API feature explained in the section Chaining calls to kernels in a single API call to C6Accel.

Note: All input and output buffer parameters used in these API calls need to be contiguous in memory.

8. Once the C6EZAccel functionality in the application is complete, the user is expected to tear down the codec using the C6accel_delete() API.

Note: C6EZAccel can work with heap as well as pool CMEM memory allocations.
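
The create/use/delete sequence from steps 3 to 8 can be sketched as one function. This is a minimal illustration rather than code taken from the package: the USE_REAL_C6ACCEL switch and the host-side stand-ins in the #else branch are hypothetical, added only so that the sketch compiles without the SDK, and c6accel_session() is an illustrative name. The runtime initializer is written as CERuntime_init(), the function declared in ti/sdo/ce/CERuntime.h.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef USE_REAL_C6ACCEL
/* Target build: the headers named in steps 1 and 2. */
#include <ti/sdo/ce/Engine.h>
#include <ti/sdo/ce/CERuntime.h>
#include <ti/sdo/ce/osal/Memory.h>
#include "../c6accelw/c6accelw.h"
#else
/* HYPOTHETICAL host-side stand-ins (not part of the package). They only
 * mimic the documented return conventions so the sketch can be compiled. */
typedef char *String;
typedef void *Engine_Handle;
typedef void *UNIVERSAL_Handle;
typedef struct C6accel_Object { int unused; } C6accel_Object;
typedef C6accel_Object *C6accel_Handle;

static void CERuntime_init(void) {}

static C6accel_Handle C6accel_create(String engName, Engine_Handle hEngine,
                                     String algName, UNIVERSAL_Handle hUni)
{
    static C6accel_Object instance;
    (void)hEngine;
    (void)hUni;
    /* Documented convention: NULL on failure, a handle on success. */
    return (engName != NULL && algName != NULL) ? &instance : NULL;
}

static int C6accel_delete(C6accel_Handle handle)
{
    /* Documented convention: 1 on success, 0 on failure. */
    return handle != NULL;
}
#endif

#define ENGINENAME "omap3530"   /* step 4: must match the .cfg file */
#define ALGNAME    "c6accel"    /* step 4: default alg name */

/* Returns 0 when the whole sequence succeeds, -1 otherwise. */
int c6accel_session(void)
{
    C6accel_Handle hC6accel = NULL;              /* step 3 */

    CERuntime_init();                            /* step 5 */

    hC6accel = C6accel_create(ENGINENAME, NULL, ALGNAME, NULL);  /* step 6 */
    if (hC6accel == NULL) {
        fprintf(stderr, "C6accel_create failed\n");
        return -1;
    }

    /* Step 7: kernel calls go here. All buffers passed to them must be
     * contiguous in memory. */

    if (C6accel_delete(hC6accel) != 1) {         /* step 8 */
        fprintf(stderr, "C6accel_delete failed\n");
        return -1;
    }
    return 0;
}
```

Checking the handle against NULL before step 7 matters because every wrapper call takes the handle as its first argument.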

Accessing floating point kernels in C6Accel 1.01.00.06 or earlier

C6EZAccel contains fixed-point as well as floating-point kernels. However, in order to maintain portability between C64+ and C674x devices, the C6EZAccel package is configured by default to provide access to just the fixed-point kernel library. To access the floating-point kernels on C674x devices, the C6EZAccel package module provides a FLOAT Boolean flag that must be set to true. By default, this flag is set to false.

To access the floating-point kernels, add the following script to the .cfg file of the codec/unit server.

   var C6ACCEL = xdc.useModule('ti.c6accel.ce.C6ACCEL');
   C6ACCEL.alg.FLOAT=true;

For example, see the codec.cfg file of the omapl138 unit server included in the c6accel package under $(C6ACCEL_INSTALL_DIR)/soc/packages/ti/c6accel_unitservers/omapl138

When the application builds the codec server with the FLOAT flag set to true, the C6accel package links in the .l674 library, which contains the floating-point kernels in addition to the fixed-point kernels. A build error occurs on a C64+ device if the server configuration tries to set this flag.

Example Applications using C6EZAccel

  • Edge detection demo in DVSDK 4.x for DM3730 Coming soon!!
  • DSP benchmark demo in DVSDK 4.x for OMAPL138 Coming soon!!
  • How to create C6EZAccel based Demo application on DM6467. Coming soon!!
  • Using C6EZAccel with Multimedia codecs on DM6467. Coming soon!!

C6EZAccel ARM side wrapper library Reference

The C6EZAccel wrapper library is designed to abstract iUniversal and C6EZAccel design considerations and provides an interface that appears like a simple function call within an application. The wrapper library is available both as source and as an object library to be linked into the application. Refer to the package directory /soc/c6accelw for the wrapper library source.

Using C6EZAccel wrapper library in the application

In order to use the C6EZAccel wrapper APIs in the application, add the appropriate library from $(C6ACCEL_INSTALL_DIR)/soc/c6accelw/lib to the Makefile for the application.

For example, on the OMAP3530 platform, add the following to the Makefile to include the C6EZAccel wrapper library in the application:

OBJFILES += $(C6ACCEL_INSTALL_DIR)/soc/c6accelw/lib/c6accelw_omap3530.a470MV

Common wrapper calls and definitions

C6accel_Handle

C6accel_Handle is a handle to the C6Accel object. The C6Accel object is defined as

typedef struct C6accel_Object {
    Engine_Handle hEngine;
    UNIVERSAL_Handle hUni;
    E_CALL_TYPE callType;
} C6accel_Object;

The C6Accel object carries the engine handle and the iUniversal handle required for the current instance. E_CALL_TYPE is a custom data type that takes the value ASYNC or SYNC, depending on whether the application needs to make asynchronous or synchronous calls to the DSP.

C6accel_create()

C6accel_Handle C6accel_create(String engName, Engine_Handle hEngine,String algName, UNIVERSAL_Handle hUniversal);

Arguments:

  • engName : Engine Name
  • hEngine : Engine Handle
  • algName : Algorithm name
  • hUniversal: Universal Handle

Return: The API returns a C6Accel handle if invoked successfully, or NULL if the create call failed.

Description: This API returns a C6Accel handle built from the engine handle and universal handle passed from the application.

Note: By default, the C6Accel handle is configured to make synchronous calls to the DSP. To enable asynchronous calls to the DSP, refer to the section Making Asynchronous Calls to DSP kernels in C6EZAccel.

Details:

| Case   | Engine Handle                | Universal Handle  | Action |
|--------|------------------------------|-------------------|--------|
| Case 1 | NULL                         | NULL              | Creates the engine handle from the engine name and the universal handle from algName, and returns the C6accel handle |
| Case 2 | Passed from app              | NULL              | Creates the universal handle, passes the existing engine handle to the C6Accel object, and returns the C6Accel handle |
| Case 3 | Passed from app              | Passed from app   | Passes the engine and universal handles to the C6accel object and returns the C6Accel handle |
| Case 4 | NULL (no engine name passed) | X                 | Returns NULL |
| Case 5 | Passed from app              | NULL (no algName) | Returns NULL |

X: Don't care
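
The case table can also be read as a small decision function. The sketch below is a model of the table only, not the wrapper's implementation: the enum, the function name predict_create_outcome() and the NOT_IN_TABLE value are all hypothetical, and argument combinations that the table does not list are reported as NOT_IN_TABLE instead of being guessed.

```c
#include <stddef.h>

/* Hypothetical outcome labels, one per distinct action in the table. */
typedef enum {
    CREATE_ENGINE_AND_UNIVERSAL,  /* Case 1 */
    CREATE_UNIVERSAL_ONLY,        /* Case 2 */
    USE_PASSED_HANDLES,           /* Case 3 */
    RETURNS_NULL,                 /* Cases 4 and 5 */
    NOT_IN_TABLE                  /* combination the table does not cover */
} CreateOutcome;

/* haveEngine / haveUniversal are nonzero when the application passes an
 * existing handle, zero when it passes NULL. */
CreateOutcome predict_create_outcome(const char *engName, int haveEngine,
                                     const char *algName, int haveUniversal)
{
    if (!haveEngine) {
        if (engName == NULL) {
            return RETURNS_NULL;                 /* Case 4: universal is "X" */
        }
        if (!haveUniversal && algName != NULL) {
            return CREATE_ENGINE_AND_UNIVERSAL;  /* Case 1 */
        }
        return NOT_IN_TABLE;
    }
    if (haveUniversal) {
        return USE_PASSED_HANDLES;               /* Case 3 */
    }
    if (algName == NULL) {
        return RETURNS_NULL;                     /* Case 5 */
    }
    return CREATE_UNIVERSAL_ONLY;                /* Case 2 */
}
```

For example, the call used in the common steps, with both names given and both handles NULL, corresponds to predict_create_outcome("omap3530", 0, "c6accel", 0), which falls under Case 1.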

C6accel_delete()

int C6accel_delete(C6accel_Handle handle);

Arguments:

  • handle: C6Accel handle

Return: 1 on success, 0 on failure

Description: This API tears down the C6accel instance by closing the codec engine and IUNIVERSAL interface using the C6accel Handle.

Error Codes

For all C6EZAccel wrapper API kernels that invoke functionality on the DSP, the error codes returned from the API are documented below.

  • _EOK (0): No error occurred
  • _EFAIL (-1) : Error occurred in invoking codec engine interface

This error is most likely to occur if the buffers/vectors being passed to the codec are not allocated in contiguous memory. It can also occur if the application makes an asynchronous call to C6EZAccel while another asynchronous call is still pending.

Specific fail messages

  • _PARAMFAIL (-6) : Error due to invalid parameters

This error is likely to occur when the parameters passed do not satisfy the parameter specifications of the underlying kernel. Check the wrapper API documentation of that specific kernel for the range of permissible parameters.

  • _FXNIDFAIL (-7): Error in the function ID passed

This error is likely when the application passes a wrong function ID to the codec. It is unlikely to occur with the wrapper API calls, because the wrapper code passes the function ID internally. An advanced user who comes across this error should verify that the function ID being passed is defined in the application codec interface header file iC6accel_ti.h.
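
When logging the status of a wrapper call, it can help to turn the numeric code into a message. The helper below is a sketch and is not part of the wrapper library: c6accel_status_text() and the SKETCH_ constants are hypothetical names that simply mirror the values documented above (0, -1, -6, -7), so that the example does not depend on the package headers.

```c
/* Hypothetical local mirrors of the documented codes; a real application
 * would use the definitions from the C6EZAccel headers instead. */
enum {
    SKETCH_EOK       =  0,   /* _EOK */
    SKETCH_EFAIL     = -1,   /* _EFAIL */
    SKETCH_PARAMFAIL = -6,   /* _PARAMFAIL */
    SKETCH_FXNIDFAIL = -7    /* _FXNIDFAIL */
};

/* Maps a wrapper status code to a short description for log output. */
const char *c6accel_status_text(int status)
{
    switch (status) {
    case SKETCH_EOK:
        return "no error";
    case SKETCH_EFAIL:
        return "codec engine call failed (check buffer contiguity and pending async calls)";
    case SKETCH_PARAMFAIL:
        return "invalid parameters for the kernel";
    case SKETCH_FXNIDFAIL:
        return "unknown function ID";
    default:
        return "unrecognized status code";
    }
}
```

The default branch is deliberate: codes between -1 and -6 are not described in this section, so the helper does not attempt to name them.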

Reference based on categories of kernels in C6EZAccel

C6EZAccel organizes kernels into seven functional categories: Digital Signal Processing, Image Processing, Math, Analytics, Medical, Audio/Speech Processing and Power/Control.

Note: The initial version of the C6EZAccel provides only kernels in the Digital signal processing, Image Processing and Math category.

The references for these functional categories have been furnished below:

C6Accel Signal Processing API Reference guide

C6Accel Image Processing API Reference guide

C6Accel Math API Reference guide

Making Asynchronous Calls to DSP kernels in C6EZAccel


C6Accel async.JPG

The asynchronous calling feature of C6EZAccel enables parallel processing on the ARM and the DSP. To switch between synchronous and asynchronous calling, C6EZAccel defines the following APIs:

int C6accel_setAsync(C6accel_Handle hC6accel)

This sets the calling mode to asynchronous.

int C6accel_setSync(C6accel_Handle hC6accel)

This sets the calling mode to synchronous.

E_CALL_TYPE C6accel_readCallType(C6accel_Handle hC6accel)

This returns the calling mode currently set in the application.

Int C6accel_waitAsyncCall(C6accel_Handle hC6accel)

This waits for the asynchronous call to complete. The result from the DSP code is only available once the asynchronous call completes.

The ARM application can make an asynchronous call and then perform other processing until the call completes, which leaves headroom for adding new features and improving performance.


Important Notes:

  • The APIs for calling DSP functionality are the same in synchronous and asynchronous mode.
  • The application can have only one asynchronous call outstanding at a time. For an asynchronous call, the wrapper code saves the context of the call, which is then used by C6accel_waitAsyncCall().
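
The one-outstanding-call rule can be pictured as a small state model. The code below is not the wrapper's implementation. It is a host-side sketch with hypothetical names (ModelHandle, model_kernel_call() and so on) that reproduces the documented behavior: the handle starts in synchronous mode, and a second asynchronous call issued while one is pending fails with -1, the _EFAIL value. What the real wait call returns when nothing is pending is not stated here, so the -1 used for that case is an assumption.

```c
/* Hypothetical model of a C6accel handle's calling state. */
typedef enum { MODEL_SYNC, MODEL_ASYNC } ModelCallType;

typedef struct {
    ModelCallType callType;  /* current calling mode */
    int pending;             /* 1 while an asynchronous call is outstanding */
} ModelHandle;

void model_init(ModelHandle *h)
{
    h->callType = MODEL_SYNC;   /* documented default: synchronous */
    h->pending = 0;
}

void model_set_async(ModelHandle *h) { h->callType = MODEL_ASYNC; }
void model_set_sync(ModelHandle *h)  { h->callType = MODEL_SYNC; }

/* Stands in for any kernel call; the same call is used in both modes. */
int model_kernel_call(ModelHandle *h)
{
    if (h->callType == MODEL_SYNC) {
        return 0;            /* result is ready when the call returns */
    }
    if (h->pending) {
        return -1;           /* _EFAIL: an asynchronous call is already pending */
    }
    h->pending = 1;          /* context saved until the wait call */
    return 0;
}

/* Stands in for C6accel_waitAsyncCall(). */
int model_wait_async(ModelHandle *h)
{
    if (!h->pending) {
        return -1;           /* ASSUMPTION: nothing to wait for is an error */
    }
    h->pending = 0;          /* DSP result becomes available here */
    return 0;
}
```

In an application, the ARM-side work that should overlap with the DSP goes between the kernel call and the wait call.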

The performance improvement obtained from Asynchronous processing is depicted below. The test application code included in the package contains example code to showcase this asynchronous calling.



Async call.JPG

Advanced Features

Chaining calls to kernels in a single API call to C6Accel

Adding kernels/libraries to C6Accel

Integrating C6Accel in user defined codec server

Using C6Accel on DM6467

Benchmarking and Performance of the kernels

These are the results from the benchmarking tests for kernels in C6EZAccel

Benchmarking results for C6EZAccel functions called synchronously from ARM:

Due to the inter-processor overhead involved in calling the DSP from the ARM, C6EZAccel generally performs better as the size of the data being processed increases. To interpret the benchmarking data accurately, please read the following section on the overheads involved in C6EZAccel.

Note: Benchmarks assume that the scaling governor is set to userspace (with the device at maximum frequency) or to performance. [For applicable devices only]

C6EZAccel Overhead

In a dual-core processor environment like OMAP3 and OMAPL, there is some inherent overhead in processing a buffer on a remote core. As C6EZAccel builds on Codec Engine, the Codec Engine Overhead article describes overheads (and improvement strategies!) relevant to C6EZAccel as well.

In short, processing a buffer of data from the ARM on the DSP requires:

  1. Address translation from ARM-side virtual to DSP-side physical (fast)
  2. Transitioning execution from ARM to DSP-side processing (fast, < 100 microseconds)
  3. ARM side cache invalidation of buffer passed from the ARM to the DSP.
  4. Invalidating cache of the buffers so the DSP sees the right data (slow, especially with very large data buffers)
  5. Activating, processing, deactivating the C6EZAccel algorithm on the DSP (typically fast, but variable based on functionality invoked)
  6. Writing back the cache of the buffers so the ARM sees the right data (slow, especially with very large data buffers)
  7. Transitioning execution from DSP to ARM-side processing (fast, < 100 microseconds)
  8. Address translation back from DSP-side physical to ARM-side virtual (fast)


Due to this overhead, C6EZAccel cannot provide satisfactory performance on small data buffers. The following analysis of some key functions compares C6EZAccel performance with that of equivalent C code running natively on the ARM.



C6AccelvsARM convolution.JPG
Figure: Performance of C6Accel vs Native ARM running 8 bit 3x3 mask convolution function


C6AccelvsARM FFT.JPG
Figure: Performance of C6Accel vs Native ARM running 8K sample floating point FFT function

A general observation from this analysis is that any functionality that took about 1 ms or more on the ARM performed much better on the DSP through C6Accel.
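
The break-even reasoning behind this observation can be written as simple arithmetic: offloading pays off when the fixed call overhead, plus the cache maintenance cost for the buffers, plus the DSP kernel time is smaller than the native ARM time. The sketch below only illustrates that comparison. OverheadModel and both function names are hypothetical, the linear per-kilobyte cache cost is a simplifying assumption, and any numbers fed into it should come from measurements on the target device.

```c
/* Hypothetical cost model for one C6EZAccel call, in microseconds. */
typedef struct {
    double fixed_us;         /* address translation and ARM<->DSP transitions */
    double cache_us_per_kb;  /* cache invalidate/writeback cost per KB (assumed linear) */
} OverheadModel;

/* Estimated wall-clock time of running a kernel on the DSP via C6EZAccel. */
double offload_total_us(const OverheadModel *m, double buffer_kb,
                        double dsp_kernel_us)
{
    return m->fixed_us + m->cache_us_per_kb * buffer_kb + dsp_kernel_us;
}

/* Nonzero when the offloaded call is estimated to beat native ARM code. */
int offload_pays_off(const OverheadModel *m, double buffer_kb,
                     double dsp_kernel_us, double arm_us)
{
    return offload_total_us(m, buffer_kb, dsp_kernel_us) < arm_us;
}
```

With placeholder values of 200 us fixed overhead and 10 us per KB, a 4 KB buffer and a 50 us DSP kernel give an estimated 290 us in total, so under this model the call is only worthwhile if the ARM version takes longer than that.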

Alternatives to C6Accel: Codec Engine IUNIVERSAL support, OpenCL, or RCM.