C6000 Benchmarking Project

From Texas Instruments Wiki

Project Overview

The c6_benchmarking project provides a framework for developing tests and automating benchmarks of the C6000 DSP core in various Texas Instruments SoCs. At heart, the project attempts to establish baseline performance for the various incarnations of the TI C6000 VLIW architecture, both in simulation and on actual silicon.


Project Organization

This is the top-level directory of the c6_benchmarking project. The directory structure is as follows:

    \c6_benchmarking
    |
    |----\common
    |
    |----\projects
    |    |
    |    |----\arch1
    |    |    |
    |    |    |----\project1
    |    |    |
    |    |    |----\project2
    |    |         ...
    |    |
    |    |----\arch2
    |    |    |
    |    |    |----\project1
    |    |    |
    |    |    |----\project2
    |    |         ...
    |
    |----\scripts
    |
    |----\targets
         |
         |----\arch1
         |    |
         |    |----\device1
         |    |
         |    |----\device2
         |         ...
         |
         |----\arch2
         |    |
         |    |----\device1
         |    |
         |    |----\device2
         |         ...

The common path contains code (C source and header files) used by the various benchmarking projects, and is common to the entire benchmarking effort. This path is also the destination for required platform files that are generated at the time a platform is selected (GEL file, emulation configuration, etc.).

The projects directory contains the CCSv5 benchmarking project files that are used to build the target benchmark executable images. The projects are divided based on the lowest (or oldest) architecture on which they will run.

The scripts path contains scripts and utilities used to automate the execution of the benchmark programs on actual hardware and to gather the results.

The targets path contains the support files needed to run the benchmarks on a particular target platform. The targets are sub-divided by DSP architecture, and then further grouped by the particular silicon device (SoC) on which the platform is based (a platform defines a board, a board specifies the chip, the chip determines the architecture).


Installation and Setup

Each benchmarking project can be imported and opened in CCS version 5.1. However, the benchmarking tree must first be configured for a specific platform, so that certain files the projects rely on are created (such as linker command files that describe the platform memory map, and an emulator configuration). This configuration step is performed by a Makefile at the top level of the project tree. The Makefile should be used in an environment where the Bash shell is available. The environments covered in the Ruby install sections below are Ubuntu Linux, Cygwin, and MSYS.

One goal of this project is to automate the gathering of benchmark data. The automation is scripted using the Debug Server Scripting (DSS) engine that is part of the Eclipse-based CCSv5.1. The scripts also rely on the Ruby language, so Ruby 1.8 or higher must be installed and in your system path (testing was done against Ruby 1.8.7). In addition, the RubyGems package manager must be installed and used to install the required gems (currently 'json' and 'haml').

Ubuntu Ruby Install

Run the following command in a terminal window:

   $ sudo apt-get install ruby rubygems
   $ sudo -E gem install json haml

Cygwin Ruby Install

From the Cygwin setup program, make sure you install the Ruby package (under the Interpreters or Ruby category), and take the opportunity to verify that GNU make is installed (under the Devel category). You should also install the gcc-core and gcc-g++ packages (under the Devel category), as these are needed when installing the JSON RubyGem.

To install RubyGems, take the following steps:

  1. Download the RubyGems tarball from Ruby Forge (http://rubyforge.org/projects/rubygems/)
  2. Unpack the tarball
  3. In a bash terminal, navigate to the unpacked directory
  4. Run the following command: $ ruby setup.rb install
  5. Update RubyGems by running the following: $ gem update --system
  6. Install needed RubyGems modules: $ gem install json haml

Note that, for the rubygem commands above, the http_proxy variable should be set properly if your internet connection requires a proxy.

MSYS Ruby Install

You can install a native Win32 version of Ruby and add it to the path so that it can be used under the MSYS environment: http://rubyinstaller.org/downloads/

You can also get the RubyInstaller Development Kit as a self-extracting installer, which includes its own MSYS environment. Start a command shell with the Ruby executable in the path, navigate to the directory where the development kit was extracted, and run devkitvars.bat to add the MSYS components (make, bash, etc.) to the path.

Code Composer Studio version 5.1

The automation features of the benchmarking project rely on Code Composer Studio (CCS) v5.1. In addition, each project is delivered as a CCSv5 project, so it can be opened and built in the CCS GUI environment. To use the automation scripting features, install CCSv5.1 and set its install path in the Rules.mak file at the top level of the project tree.


Using the top-level Makefile

Configuring for a Platform:

  1. Edit the Rules.mak file with any permanent settings for development (typically this will be the EMULATOR variable and the CCS install directory, though any variable can be set)
  2. In a Bash shell, navigate to the top level of the benchmarking project tree: $ cd <c6_benchmarking_install_dir>
  3. View the list of available platforms: $ make list_platforms
  4. Configure a particular platform: $ make <PLATFORM>_config

Additional options that can be passed at configure time:

  • L2_CACHE=<VALID L2 CACHE OPTION>
  • L1D_CACHE=<VALID L1D CACHE OPTION>
  • L1P_CACHE=<VALID L1P CACHE OPTION>
  • CODE_LOCATION=<VALID CODE LOCATION OPTION>
  • DATA_LOCATION=<VALID DATA LOCATION OPTION>
  • CORE_FREQ=<VALID CORE FREQUENCY OPTION>
  • EXTMEM_FREQ=<VALID EXTERNAL MEMORY FREQUENCY OPTION>

If any of these options are not specified, platform defaults are used.
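
As a rough illustration of how a configure command line is put together, the Python sketch below assembles the arguments for make <PLATFORM>_config from a set of overrides. Only the option names come from the list above. The helper name config_command is hypothetical, and the values shown (256K, EXTRAM) are borrowed from the run_benchmarks.sh examples on the assumption that the Makefile accepts the same spellings.

```python
# Option names are copied from the list above; everything else is illustrative.
VALID_OPTIONS = (
    "L2_CACHE", "L1D_CACHE", "L1P_CACHE",
    "CODE_LOCATION", "DATA_LOCATION",
    "CORE_FREQ", "EXTMEM_FREQ",
)


def config_command(platform, **options):
    """Return the argument list for `make <PLATFORM>_config` (hypothetical helper)."""
    unknown = sorted(set(options) - set(VALID_OPTIONS))
    if unknown:
        raise ValueError("unknown configure option(s): " + ", ".join(unknown))
    argv = ["make", platform + "_config"]
    # Emit overrides in the documented order so the command line is reproducible.
    for name in VALID_OPTIONS:
        if name in options:
            argv.append(name + "=" + options[name])
    return argv


# Assumed values; options left out fall back to the platform defaults.
print(" ".join(config_command("evmOMAPL138", L2_CACHE="256K", CODE_LOCATION="EXTRAM")))
# prints: make evmOMAPL138_config L2_CACHE=256K CODE_LOCATION=EXTRAM
```

The sketch rejects misspelled option names up front. Whether make itself would report them is not stated in this page.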


Benchmark Automation

A Bash shell script is provided in the scripts directory of the project tree. The script is named 'run_benchmarks.sh'. The script accepts various arguments, including lists of parameters to iterate over. You can get more details by running

   $ ./run_benchmarks.sh --help

from the scripts directory.


At the very least, you must provide one platform (and only one) to run the tests on. For example, the command below will run all the applicable benchmarking projects for the specified platform and save the results into a file named results_<PLATFORM>.txt at the top level of the benchmarking tree:

   $ ./run_benchmarks.sh --platform=<PLATFORM>


If you want to limit the benchmarking run to only a certain set of projects, you can use one or more '--projects=' options. For example, the command

   $ ./run_benchmarks.sh --platform=<PLATFORM> --projects=DSP_MEM*

will run all projects whose names start with 'DSP_MEM'.
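
The wildcard selection described above can be pictured with the short Python sketch below. It assumes the script applies shell-style wildcard matching to project names, which this page implies but does not spell out. Apart from DSP_MEM_copyL2ToExtMem, the project names are made up for illustration, and select_projects is a hypothetical helper, not part of run_benchmarks.sh.

```python
from fnmatch import fnmatchcase


def select_projects(names, patterns):
    """Return the names that match at least one shell-style pattern, in order."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


# Only the first name appears in this page; the other two are placeholders.
projects = ["DSP_MEM_copyL2ToExtMem", "DSP_MEM_example", "DSPF_example"]

print(select_projects(projects, ["DSP_MEM*"]))
# prints: ['DSP_MEM_copyL2ToExtMem', 'DSP_MEM_example']
```

Passing several patterns models the use of more than one '--projects=' option: a project is selected if any pattern matches.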


If you don't have an emulator value hard-coded in the Rules.mak (or simply want to use a different emulator for this benchmarking run), that can be provided on the run_benchmarks.sh command line:

   $ ./run_benchmarks.sh --platform=<PLATFORM> --emulator=<EMULATOR>


As a real example:

   $ ./run_benchmarks.sh --platform=evmOMAPL138 --projects=DSPF* --emulator=SD510USB --l2_cache_settings="32K 128K 256K"

The above command line will run all floating-point benchmarks (specified by --projects=DSPF*) using the Spectrum Digital 510 USB emulator and will run each project at the three specified L2 cache sizes. All results will be saved in the results_evmOMAPL138.txt file.


A final example runs all benchmarking projects on the lcdkOMAPL138 platform using the Spectrum Digital 510 USB emulator:

   $ ./run_benchmarks.sh --platform=lcdkOMAPL138 --projects=DSP* --core_freqs="456MHz" \
   --extmem_freqs="150MHz" --code_locations="EXTRAM" --data_locations="EXTRAM" \
   --emulator=SD510USB --l2_cache_settings="256K" --l1d_cache_settings="32K" --l1p_cache_settings="32K"

Raw logging information from the runs can be found in the logs directory. Note that the run_benchmarks.sh script repeatedly reconfigures the tree using the make <PLATFORM>_config command. Any configuration you made before running run_benchmarks.sh will be overwritten, and the last configuration set by the script remains active after the run finishes.


Output Format

The results file from a benchmark run is a text file of JSON-formatted data (http://json.org/). The JSON format allows for simple and portable encapsulation of the results data and the object structure that the data fits into. For example, the result of one run of the following command line:

   $ ./run_benchmarks.sh --platform=lcdkOMAPL138 --projects=DSP_MEM_copyL2ToExtMem \
   --l2_cache_settings="256K" --extmem_freqs="150MHz" --core_freqs="456MHz" \
   --code_locations="EXTRAM" --data_locations="EXTRAM" --emulator=SD510USB

will look like:

    {
        "Platform" : "lcdkOMAPL138",
        "LinkingLocations" :
        {
            "CodeLocation" : "EXTRAM",
            "DataLocation" : "EXTRAM"
        },
        "FrequencySettings" :
        {
            "CoreFrequency" : "456MHz",
            "ExternalMemFrequency" : "150MHz"
        },
        "CacheSettings" :
        {
            "L1D" : "default",
            "L1P" : "default",
            "L2" : "256K"
        },
        "TestExecutable" :
        {
            "TestName" : "DSP_MEM_copyL2ToExtMem",
            "CodeGenVersion" : "7.3.1",
            "BuildDate" : "Feb 24 2012",
            "Endianness" : "little",
            "Format" : "ELF"
        },
        "InputParams" : ["Bytes Transferred"],
        "OutputParams" : ["Cycles/Byte (Cached)", "Cycles/Byte (Non-Cached)"],
        "Results" : [[128, 2.273438, 1.492188], [256, 2.199219, 1.871094], [512, 2.208984, 2.060547], [1024, 2.268555, 2.155273], [2048, 2.226074, 2.202637], [4096, 2.201416, 2.226318], [8192, 2.203247, 2.250366], [16384, 2.195618, 2.265686], [32768, 2.194672, 2.274628]]
    }

The JSON format can be translated into many different formats using freely available tools. The results file will be created at the top of the C6 Benchmarking tree in a file named results_<PLATFORM>.txt, where <PLATFORM> is the name of the platform specified in the run_benchmarks command line. Any previous results file will be renamed to results_<PLATFORM>_backup.txt.
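
As one example of working with a result record, the Python sketch below loads a trimmed copy of the record shown above and labels each Results row with its parameter names. It assumes each row lists the input parameters first and the output parameters second, which matches the example (one input, two outputs, three values per row) but is not stated explicitly. How multiple records are laid out in a real results file is also not described here, so the sketch handles a single record only. The helper name rows_as_dicts is hypothetical.

```python
import json

# Trimmed copy of the example record; only the fields used here are kept.
record_text = """
{
    "Platform": "lcdkOMAPL138",
    "TestExecutable": {"TestName": "DSP_MEM_copyL2ToExtMem"},
    "InputParams": ["Bytes Transferred"],
    "OutputParams": ["Cycles/Byte (Cached)", "Cycles/Byte (Non-Cached)"],
    "Results": [[128, 2.273438, 1.492188], [256, 2.199219, 1.871094]]
}
"""


def rows_as_dicts(record):
    """Label each Results row, assuming inputs come before outputs in a row."""
    columns = record["InputParams"] + record["OutputParams"]
    return [dict(zip(columns, row)) for row in record["Results"]]


record = json.loads(record_text)
for row in rows_as_dicts(record):
    print(row["Bytes Transferred"], row["Cycles/Byte (Cached)"])
```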

The results are also converted to CSV and HTML for easier importing and viewing. The CSV format can easily be imported into a program like Microsoft Excel. The HTML format is immediately ready for viewing in a web browser. The CSV file and HTML file are named results_<PLATFORM>.csv and results_<PLATFORM>.html respectively.
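
If you need a different column layout than the generated CSV file provides, a record can be flattened by hand. The Python sketch below is not the project's own converter, and its column layout (platform, test name, then inputs and outputs) is an assumption chosen for illustration. The helper name record_to_csv is hypothetical.

```python
import csv
import io


def record_to_csv(record):
    """Flatten one result record into CSV text (assumed column layout)."""
    columns = record["InputParams"] + record["OutputParams"]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Platform", "TestName"] + columns)
    test_name = record["TestExecutable"]["TestName"]
    for row in record["Results"]:
        writer.writerow([record["Platform"], test_name] + row)
    return buf.getvalue()


# Trimmed copy of the example record from the Output Format section.
record = {
    "Platform": "lcdkOMAPL138",
    "TestExecutable": {"TestName": "DSP_MEM_copyL2ToExtMem"},
    "InputParams": ["Bytes Transferred"],
    "OutputParams": ["Cycles/Byte (Cached)", "Cycles/Byte (Non-Cached)"],
    "Results": [[128, 2.273438, 1.492188], [256, 2.199219, 1.871094]],
}

print(record_to_csv(record), end="")
```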

Known Issues

  1. When running the run_benchmarks script on Linux, the headless calls into eclipse/CCS result in warnings of the form: "flxInitLoad: client has not been protected." This appears to have no adverse effects.