Using DSP/BIOS on Multi-Core DSP Devices

From Texas Instruments Wiki

Originally authored by Arnie Reynoso and Judah Vang, edited by Randyp for Wiki and CCSv4 with updates

Although DSP/BIOS 5 applications typically run on single-core devices, and DSP/BIOS 5 does not explicitly provide multi-processor APIs, your DSP/BIOS 5 applications can still take advantage of the shared memory available in multi-core devices. This application note explains how to configure, build, and debug multi-core applications using DSP/BIOS 5 with Code Composer Studio v4.2. It describes several application scenarios for multi-core devices, ranging from a single application executed on multiple cores to different applications on each core that share a common partial DSP/BIOS image. The examples in this document are built for a TMS320C6472 six-core device, but the techniques described apply to any multi-core DSP device.

DSP/BIOS Multi-Core Device Support

This application note focuses on ways to build and run applications that demonstrate how to use both local and shared memory on devices with multiple DSP cores. Techniques used to communicate between processors are not the focus of this application note. Such communication techniques include shared memory, the DSP/BIOS MSGQ module, and software solutions such as DSP/BIOS Link. For inter-processor communication, see in particular the BIOS Multicore Software Development Kit (MCSDK); links are available on the Wiki pages under Category:Multicore.

TI devices are available with multiple DSP cores that share both internal memory and external memory. Using three instructive scenarios, this application note describes how you can modify the setup and code/data placement to take advantage of the multi-core device.

In order to better support the configuration and use of these multi-core devices, DSP/BIOS 5.30 (or higher) provides configuration platforms for such devices (for example, the C6472). DSP/BIOS 5.41.07 (or higher) improves some of the stop-mode Real-Time Analysis (RTA) capabilities, so the examples in this application note use Code Composer Studio (CCS) 4.2.4 and DSP/BIOS 5.41. In the DSP/BIOS configuration for the C6472, there are separate MEM sections for local L2 memory (LL2RAM), shared L2 memory (SL2RAM) and shared DDR2 memory (DDR2).

The CCSv4 code files associated with this application note provide simple examples that illustrate different multi-core memory-sharing scenarios. One basic application program is reused with minor changes to provide the various program scenarios. This basic application uses DSP/BIOS TSK threads that communicate within a single core using the DSP/BIOS SEM module.

Three different archive files are provided, one for each of the major scenarios presented. These archive files are made available at the end of this overview page. The applications are provided with complete application source within Code Composer Studio (CCSv4) archived project zip files for your convenience. Instructions for installing, building, loading, and running the applications are included in each of the Complete Description sections.

The following software is recommended to rebuild and run the examples:

  • Code Composer Studio v4.2.4.00033 or greater
  • Code Generation Tools 7.0.3 or greater
  • DSP/BIOS 5.41.07 or greater

NOTE: The examples read the processor ID at run-time on each core. This makes the debug output uniquely identifiable, so you can easily confirm which core produced which results. Also, any application that uses the MSGQ module for inter-processor communication must set a unique and sequential processor ID (starting at 0), so the technique used here can also be used with the MSGQ module. MSGQ is not used in these examples.
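
The run-time processor ID read described in the note can be sketched as follows. This is a minimal sketch, not the code from the example projects: it assumes the C64x+ DNUM control register (declared in the TI compiler's c6x.h) is the source of the ID, which this note does not confirm. The helper names read_core_id and format_core_tag are hypothetical, and the fallback value of 0 exists only so the sketch also builds off-target.

```c
#include <stdio.h>

#ifdef _TMS320C6400_PLUS
#include <c6x.h>            /* TI compiler header; declares the DNUM control register */
#endif

/* Hypothetical helper: returns this core's ID (0..5 on a six-core device). */
unsigned int read_core_id(void)
{
#ifdef _TMS320C6400_PLUS
    return DNUM;            /* assumption: C64x+ core-number register is the ID source */
#else
    return 0u;              /* stand-in value so the sketch builds off-target */
#endif
}

/* Hypothetical helper: builds a per-core prefix such as "Core3" for debug output. */
int format_core_tag(char *buf, size_t len, unsigned int core_id)
{
    return snprintf(buf, len, "Core%u", core_id);
}
```

Because the ID is read at run-time rather than compiled in, the same object code can tag its output correctly on every core.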

Single Application on Multiple Cores

The singleimage example shows how a single application can run simultaneously on all the device cores. The cores run the same application independently of each other.

Figure 1. Single Application on Each Core of C6472 Device

singleimage Example Application

The example for this scenario is in the singleimage project folder. The basic application uses a QUE (DSP/BIOS queue) to send messages and a SEM (DSP/BIOS semaphore) to synchronize between multiple writer() tasks and a single reader() task. The reader task, the three writer tasks, and the semaphore are statically created in the DSP/BIOS *.tcf configuration file.

In the application on each core, each of three identical writer tasks builds a message, sends it through the DSP/BIOS QUE mechanism, and then posts a SEM to tell the reader task that a message is ready. The reader task responds to the SEM by pulling a message off the QUE object and reporting the results using a LOG_printf message.
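
The writer/reader handshake described above can be sketched in portable C. This is an illustration of the protocol only: MsgQueue, CountingSem, writer_send, and reader_receive are hypothetical stand-ins, not the DSP/BIOS QUE and SEM APIs, and the sketch runs single-threaded, so the reader returns NULL where a real reader task would block on the semaphore.

```c
#include <stddef.h>

/* Stand-ins for the DSP/BIOS QUE and SEM objects; NOT the DSP/BIOS API. */
typedef struct Msg {
    struct Msg *next;   /* queue link */
    int writer_id;      /* which writer built the message */
    int value;          /* payload */
} Msg;

typedef struct { Msg *head; Msg *tail; } MsgQueue;
typedef struct { int count; } CountingSem;

/* Appends a message to the tail of the FIFO. */
static void queue_put(MsgQueue *q, Msg *m)
{
    m->next = NULL;
    if (q->tail) q->tail->next = m; else q->head = m;
    q->tail = m;
}

/* Removes and returns the message at the head of the FIFO, or NULL if empty. */
static Msg *queue_get(MsgQueue *q)
{
    Msg *m = q->head;
    if (m) {
        q->head = m->next;
        if (!q->head) q->tail = NULL;
    }
    return m;
}

/* Writer side: fill in a message, enqueue it, then signal the reader. */
void writer_send(MsgQueue *q, CountingSem *s, Msg *m, int writer_id, int value)
{
    m->writer_id = writer_id;
    m->value = value;
    queue_put(q, m);
    s->count++;                 /* plays the role of posting the semaphore */
}

/* Reader side: returns the next message, or NULL if nothing was signalled.
   A real reader task would block on the semaphore instead of returning. */
Msg *reader_receive(MsgQueue *q, CountingSem *s)
{
    if (s->count == 0) return NULL;
    s->count--;                 /* plays the role of pending on the semaphore */
    return queue_get(q);
}
```

Enqueuing before signalling matters: the semaphore count then never exceeds the number of messages actually on the queue, so the reader never wakes to an empty queue.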

Build the application using DSP/BIOS 5.41.07 or greater and CCS v4.2.0.09 or greater. With these versions, you can observe the results of execution on each core in the RTA->Printf Logs window. Earlier releases of DSP/BIOS and CCSv4 support Printf Logs poorly or not at all; with them, you may have to use the ROV tool and drill down into the log buffers to observe the results.

The C6472's DSP/BIOS memory configuration for this example allows a single application image to be loaded and executed on all cores by placing all code and read-only data in a single location in shared L2 and by placing the interrupt vector table and read/write data in local L2.
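
The placement rule in the paragraph above can be written down as a small lookup. This is only an illustration of the rule in C; the actual placement is done in the DSP/BIOS .tcf configuration, and ContentKind and segment_for are hypothetical names. The segment names SL2RAM and LL2RAM are the MEM segments mentioned earlier in this note.

```c
/* Kinds of program content named in the text. */
typedef enum {
    CONTENT_CODE,
    CONTENT_READONLY_DATA,
    CONTENT_VECTOR_TABLE,
    CONTENT_READWRITE_DATA
} ContentKind;

/* Returns the name of the MEM segment that holds each kind of content. */
const char *segment_for(ContentKind kind)
{
    switch (kind) {
    case CONTENT_CODE:
    case CONTENT_READONLY_DATA:
        return "SL2RAM";   /* one shared copy, used by every core */
    case CONTENT_VECTOR_TABLE:
    case CONTENT_READWRITE_DATA:
        return "LL2RAM";   /* private copy in each core's local L2 */
    }
    return "LL2RAM";       /* conservative default for unknown kinds */
}
```

Anything a core may write goes to its local L2, so identical code running on six cores never modifies another core's state.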

Complete Description and Step-By-Step Procedures for Single Application on Multiple Cores

Click here for Single Application on Multiple Cores

Multiple Independent Applications

The multipleimage example shows how separate applications can run on different cores completely independently. The shared memory in the C6472 is partitioned such that each partition is "owned" by one and only one core. This example demonstrates how to place memory, load, and run such applications. The intent is to illustrate one method of avoiding memory access collisions when running independent applications on a multi-core device. Figure 2 shows how six different applications could be configured to run on the six cores of the C6472. Equal private partitions are created in both the Shared L2 and DDR memory; one common partition is also created in DDR memory but it is not used in this example.

Figure 2. Multiple Application Images Independent on Each Core of a C6472 Device

multipleimage Example Applications

The example for this scenario comprises the multipleimage_appN (N=0..5) project folders. The same basic application code from the singleimage project is used, except that the number of writer tasks is N+1 for CoreN and multipleimage_appN, simply to make the applications differ. The underlying application again uses a QUE (DSP/BIOS queue) to send messages and a SEM (DSP/BIOS semaphore) to synchronize between one or more writer() tasks and a single reader() task. The reader task, the writer task(s), and the semaphore are statically created in the DSP/BIOS configuration .tcf file in each project folder. The memory locations for the code and data are also set in the .tcf files by giving the memory elements different starting addresses. The memory partitions are implemented as shown in Figure 2.

The writer tasks will load and send a message through the DSP/BIOS QUE mechanism, then post a SEM to tell the reader task that a message is ready. The reader task will respond to the SEM by pulling a message off the QUE object and reporting the results using a LOG_printf message.

Build the applications using DSP/BIOS 5.41.07 or greater and CCS v4.2.0.09 or greater. With these versions, you can observe the results of execution on each core in the RTA->Printf Logs window. Earlier releases of DSP/BIOS and CCSv4 support Printf Logs poorly or not at all; with them, you may have to use the ROV tool and drill down into the log buffers to observe the results.

Each project folder has a slightly different C source file and a slightly different DSP/BIOS Configuration .tcf file.

The mi_appN.c files differ from one another only by the number of messages allocated. Three (3) messages are allocated for each of the writer tasks defined in the .tcf file, so the #define NUMWRITERS constant is the only code difference.
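
The relationship between NUMWRITERS and the number of allocated messages can be sketched as below. Only NUMWRITERS and the three-messages-per-writer rule come from the text; MSGS_PER_WRITER, NUMMSGS, ExampleMsg, msg_pool, and msg_pool_count are hypothetical names, and the value 3 corresponds to N=2 purely as an example.

```c
#define NUMWRITERS      3    /* N+1 for CoreN; 3 is an example value (N = 2) */
#define MSGS_PER_WRITER 3    /* three messages per writer task, per the text */
#define NUMMSGS         (NUMWRITERS * MSGS_PER_WRITER)

/* Hypothetical message layout; the real one is defined in the example sources. */
typedef struct {
    int writer_id;
    int value;
} ExampleMsg;

/* Statically allocated pool sized from the single per-application constant. */
static ExampleMsg msg_pool[NUMMSGS];

/* Number of messages in the pool, derived from the array itself. */
unsigned int msg_pool_count(void)
{
    return (unsigned int)(sizeof msg_pool / sizeof msg_pool[0]);
}
```

Deriving the pool size from one constant keeps the C source in step with the number of writer tasks declared in the .tcf file.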

Each multipleimage_appN.tcf DSP/BIOS Configuration file defines N+1 writer tasks for CoreN, defines unique partitions in both SL2RAM and DDR2, defines a DDR2_COMMON memory partition that is identical for all cores, and places different parts of the application in different memory partitions. The partitioning of SL2RAM and DDR2 is what allows the cores to use shared memory at the same time without interfering with each other. In this example, some duplicate code segments are stored in shared memory for each core; if the applications differed more than these simple examples do, there would be a greater need to keep these segments separate. With this method, there is no need to worry about where the various sections of BIOS and application code and data are placed, because the memory partitioning keeps them all separate from the other cores. The memory partitioning was done manually by editing the *.tcf files and setting the memory section addresses and related values as needed for the example. The partitioning is even across the cores and could easily be reused by other applications.
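
The even partitioning described above reduces to simple address arithmetic, sketched here in C. The example projects set these values by hand in the .tcf files; Partition and core_partition are hypothetical names, and any region base or size passed in is the caller's choice rather than a value taken from the device data sheet.

```c
#include <stdint.h>

#define NUM_CORES 6u

typedef struct {
    uint32_t base;  /* first byte address of the partition */
    uint32_t len;   /* partition length in bytes */
} Partition;

/* Splits [region_base, region_base + region_len) into NUM_CORES equal
   partitions and returns the one owned by core_id. Any remainder bytes
   at the end of the region are left unused. */
Partition core_partition(uint32_t region_base, uint32_t region_len, uint32_t core_id)
{
    Partition p;
    p.len  = region_len / NUM_CORES;
    p.base = region_base + core_id * p.len;
    return p;
}
```

For instance, with an illustrative 0xC0000-byte region starting at 0x00200000, each core would own 0x20000 bytes, and core 1's partition would begin where core 0's ends.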

Complete Description and Step-By-Step Procedures for Multiple Independent Applications

Click here for Multiple Independent Applications

Multiple Applications Sharing a Partial Image

The sharedimage example shows how separate applications can run on different cores while sharing a single, common, partial DSP/BIOS code image. On the six-core C6472, sharing common code among the applications can reduce the overall memory requirement by up to five times the size of that partial image, because one shared copy replaces six private copies.
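
The memory saving follows from counting copies, as this small C sketch shows. The function name shared_image_savings is hypothetical, and the result is an upper bound that assumes every core would otherwise hold a full private copy of the partial image.

```c
#include <stddef.h>

/* Bytes saved when num_cores cores share one copy of an image instead of
   each holding a private copy: (num_cores - 1) copies are eliminated. */
size_t shared_image_savings(size_t image_bytes, unsigned int num_cores)
{
    if (num_cores == 0u) return 0u;
    return (size_t)(num_cores - 1u) * image_bytes;
}
```

With six cores, five of the six copies disappear, which is where the factor of five comes from.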

The shared memory in this example is partitioned differently from the purely independent partitions in the Multiple Independent Applications example. In this shared memory scenario, there is a common partition in both SL2RAM and DDR2, plus unique partitions "owned" separately by each core. This example demonstrates how to place memory, load, and run such applications. Most importantly, this section explains how to build a partial DSP/BIOS image and how to link that partial image into the example applications. Figure 3 shows how six different applications are instantiated on the six cores of the C6472, with partitions of the Shared L2 and DDR2 memory, and with a shared partial DSP/BIOS image.

Figure 3. Multiple Applications Sharing a Partial DSP/BIOS Image

sharedimage Example Applications

The example for this scenario comprises the sharedimage_appN (N=0..5) project folders and the BIOS_partial project folder. The same basic application code from the previous examples is used for this scenario, again with the number of writer tasks equal to N+1 for CoreN and sharedimage_appN. The basic application uses a QUE (DSP/BIOS queue) to send messages and a SEM (DSP/BIOS semaphore) to synchronize between one or more writer() tasks and a single reader() task. The reader task, the writer task(s), and the semaphore are statically created in the DSP/BIOS configuration .tcf file in each project folder. The memory locations for the code and data are also set in the .tcf files to implement the memory partition arrangement shown in Figure 3.

The writer tasks will load and send a message through the DSP/BIOS QUE mechanism, then post a SEM to tell the reader task that a message is ready. The reader task will respond to the SEM by pulling a message off the QUE object and reporting the results using a LOG_printf message.

Build the applications using DSP/BIOS 5.41.07 or greater and CCS v4.2.0.09 or greater. With these versions, you can observe the results of execution on each core in the RTA->Printf Logs window. Earlier releases of DSP/BIOS and CCSv4 support Printf Logs poorly or not at all; with them, you may have to use the ROV tool and drill down into the log buffers to observe the results.

Each project folder has a slightly different C source file and a slightly different DSP/BIOS Configuration .tcf file.

The sh_appN.c files differ from one another only by the number of messages allocated. There are three (3) messages allocated for each of the writer tasks that are defined in the .tcf file, so one #define NUMWRITERS constant is the only code difference.

The sharedimage_appN.tcf DSP/BIOS Configuration file defines N+1 writer tasks for CoreN, defines common plus unique partitions in both SL2RAM and DDR2, and places different parts of the application in different memory partitions. The partitioning of SL2RAM and DDR2 is what allows the cores to share a common partial image and also to use non-conflicting memory regions in the shared memory resources. Any DSP/BIOS code segments that are needed by any of the cores are stored in a single place in the SL2RAM shared memory in this example, minimizing the total memory requirements. Using this method, you will make the best use of shared memory for both common and unique uses by the several cores.
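
A layout with one common partition plus per-core private partitions can be sketched with the arithmetic below. This is a sketch under assumptions: placing the common partition at the start of the region is a guess, not something this note states, and SharedLayout, make_layout, and private_start are hypothetical names. The real values are set in the sharedimage_appN.tcf files.

```c
#include <stdint.h>

#define NUM_CORES 6u

typedef struct {
    uint32_t common_base;   /* start of the partition shared by all cores */
    uint32_t common_len;    /* length of the shared partition in bytes */
    uint32_t private_base;  /* start of core 0's private partition */
    uint32_t private_len;   /* length of each core's private partition */
} SharedLayout;

/* Reserves common_len bytes at the start of the region (an assumed position)
   for the shared partial image, then divides the rest equally among the
   cores. The caller must ensure common_len <= region_len. */
SharedLayout make_layout(uint32_t region_base, uint32_t region_len, uint32_t common_len)
{
    SharedLayout l;
    l.common_base  = region_base;
    l.common_len   = common_len;
    l.private_base = region_base + common_len;
    l.private_len  = (region_len - common_len) / NUM_CORES;
    return l;
}

/* Start address of the private partition owned by core_id. */
uint32_t private_start(const SharedLayout *l, uint32_t core_id)
{
    return l->private_base + core_id * l->private_len;
}
```

Every core links against the same common partition addresses, while its private partition address depends only on its core number.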

Complete Description and Step-by-Step Procedures for Multiple Applications Sharing a Partial Image

Click here for Multiple Applications Sharing a Partial Image

Example files referenced in this Wiki topic

The following example archive project files are referenced in the topics above. You may need to remove the "Randyp_" prefix from the filename when you extract the project files; this has not been fully tested yet and is still a work in progress.