XDS Performance comparison


Overview

This page compares the performance of the different classes of XDS emulators. It is not an official benchmark and should be used for reference only: the procedure described here is experimental and does not reflect a real-world debugging scenario.

  • It uses Code Composer Studio’s Debug Server Scripting (DSS) feature, which allows running JTAG debug sessions from the command line.
  • Using DSS allows the benchmark process to be automated and reduces the influence of the graphical IDE on the measurements.
  • It focuses on two major types of tests: throughput and interactive.
  • The throughput tests perform data transfers to/from the target and measure the time elapsed to complete them.
  • The interactive tests perform a sequence of typical debug operations that are relevant to the overall user experience, such as single stepping or displaying a series of printf statements (console output).
  • The benchmarks were chosen to be useful for comparing the different JTAG debug probes.


For previous results, please check the 2015 Results

Test setup

The host PC used has the following specification:

Core i7-6820HQ (quad core, HT enabled, 2.70 GHz)
16GB 2133MHz DDR4 Non-ECC SDRAM
Samsung SSD 512GB M.2 PCIe NVMe Class 40
Windows 10 Professional 64-bit


Software components:

Code Composer Studio v7.2.0.00013
BlackHawk emupack 6.0.83.003
Spectrum Digital emupack 5.2.0.14
TI Emulators 6.0.628.3
Segger Jlink SW 6.14e
Segger Jlink CCS support 6.16.7.0
BH560v2 firmware rev 5.0.573.0
ICDI firmware revision 12630
XDS200 firmware rev 1.0.0.8
XDS110 firmware rev 2.3.0.8
MSP432 device support 7.2.4
C2000 Emulation Flash 1.0.0.5
C2000 Device support 4.2.0.0
C6000 device support 1.1.3
Sitara device support 1.1.3
TI Tiva/Stellaris device support 2.1.1.15071


All JTAG TCLK speeds are 10.368MHz except where noted
All Debug probes are operating in JTAG mode except where noted


Important note about JTAG TCLK frequencies:

Some of the emulators allow setting limits to the JTAG TCLK frequency from the CCS Target Configuration editor, which can potentially increase the debugging and loading speed.

However, at the moment of connection the emulator performs some tests on the JTAG serial connection and automatically chooses a reliable TCLK speed, which can be lower than the setting configured in CCS. The most obvious practical effect is that some of the tests below show the same throughput even when setting different TCLK speeds.

For details on the JTAG emulator and board design constraints that affect performance, check the Target Connection Guide.

The XDS560 emulators report the reliability of the JTAG TCLK measurements when the Test Connection button is used in the CCS Target Configuration editor. The report looks like this:

The IR/DR scan-path tests used 50.00MHz as the final frequency.

-----[Measure the source and frequency of the final JTAG TCLKR input]--------

The frequency of the JTAG TCLKR input is measured as 49.99MHz.

This output shows the actual TCLK speed being used by the JTAG debugger.
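
When many configurations are tested, it can be convenient to pull these two frequencies out of the saved Test Connection output automatically. The following is a minimal sketch that assumes the log contains lines worded exactly like the two quoted above; the function name `parse_tclk_report` and the dictionary keys are made up for this example, and other log formats are not handled.

```python
import re

# Matches a number immediately followed by "MHz", e.g. "50.00MHz".
_FREQ_MHZ = re.compile(r"(\d+(?:\.\d+)?)\s*MHz")


def parse_tclk_report(log_text):
    """Extract the final and measured TCLK frequencies (in MHz) from a log.

    Assumes the wording of the two lines quoted on this page; any line
    that does not match is ignored.
    """
    result = {}
    for line in log_text.splitlines():
        match = _FREQ_MHZ.search(line)
        if match is None:
            continue
        if "scan-path tests used" in line:
            result["final_mhz"] = float(match.group(1))
        elif "TCLKR input is measured" in line:
            result["measured_mhz"] = float(match.group(1))
    return result
```

Comparing `final_mhz` with the TCLK limit set in CCS is one way to spot cases where the emulator settled on a lower speed than requested.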

Throughput tests

The tests in this section focus on data throughput between the host PC and the target processor, to compare the performance of different emulators and JTAG settings.

Note: For MCUs, the executable is loaded to and executed from Flash and the binaries are loaded to SRAM, while for EPs the external RAM is used in all tests.
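
As a rough illustration of how a kB/s figure can be derived from a timed transfer, the sketch below times an arbitrary callable and divides the transferred size by the elapsed time. It is not the DSS script used for these measurements: `measure_throughput_kb_s`, the `transfer` callable and the choice of 1 kB = 1024 bytes are assumptions made for the example.

```python
import time


def measure_throughput_kb_s(transfer, size_bytes, clock=time.perf_counter):
    """Run transfer() once and return the throughput in kB/s.

    transfer   -- hypothetical callable that performs the load or save
    size_bytes -- number of bytes moved by that call
    clock      -- time source in seconds (replaceable for testing)
    Assumes 1 kB = 1024 bytes.
    """
    start = clock()
    transfer()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return size_bytes / 1024.0 / elapsed
```

The clock is injected so that the calculation can be checked without real hardware; in an actual run the callable would wrap whatever operation loads or saves the file on the target.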


Executable Load

The Executable Load throughput test performs a standard program load of an ELF executable to the target device’s program memory. The program contains a large data array whose size varies depending on the device's amount of available memory.

Results on Cortex M4

Target device: TM4C129NCPDT (TM4C129 connected Launchpad)

Executable (.out) load to Flash (kB/s)


Target device: MSP432P401 (MSP432 Launchpad Red)

Executable (.out) load to Flash (kB/s)


Results on Cortex A15

Target device: AM5728 (AM5728 IDK)

Executable (.out) load to DDR (kB/s)


Results on Cortex A8

Target device: AM3359 (BeagleBone Rev A4 with TI 20-pin connector)

Executable (.out) load to DDR (kB/s)


Results on C6600

Target device: C6678 (C6678 EVM)

Executable (.out) load to DDR (kB/s)


Results on C6740

Target device: C6748 (C6748 Experimenter's Kit)

Executable (.out) load to DDR (kB/s)


Results on F2800

Target device: TMS320F28377D (TMS320F28377D controlCARD)

Executable (.out) load to DDR (kB/s)


Target device: TMS320F28335 (TMS320F28335 controlCARD)

Executable (.out) load to DDR (kB/s)



Binary Load/Save

The Binary Load/Save throughput tests perform a direct memory load/save to and from the target device’s RAM. The binaries are created by applying the tiobj2bin utility to the executables of the previous test, and thus their sizes vary accordingly.

Note: results in red seem to have been skewed by cache mechanisms.


Results on Cortex M4

Target device: TM4C129NCPDT (TM4C129 connected Launchpad)

Binary (.bin) load to SRAM (kB/s)
Binary (.bin) save from SRAM (kB/s)


Target device: MSP432P401 (MSP432 Launchpad Red)

Binary (.bin) load to SRAM (kB/s)
Binary (.bin) save from SRAM (kB/s)


Results on Cortex A15

Target device: AM5728 (AM5728 IDK)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)


Results on Cortex A8

Target device: AM3359 (BeagleBone Rev A4 with TI 20-pin connector)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)


Results on C6600

Target device: C6678 (C6678 EVM)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)


Results on C6740

Target device: C6748 (C6748 Experimenter's Kit)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)


Results on F2800

Target device: TMS320F28377D (TMS320F28377D controlCARD)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)


Target device: TMS320F28335 (TMS320F28335 controlCARD)

Binary (.bin) load to DDR (kB/s)
Binary (.bin) save from DDR (kB/s)



Interactive tests

Console I/O comparison

This test performs a sequence of printf() calls issued by the well-known recursive Towers of Hanoi program. With 9 disks, it prints 511 console messages (2^9 - 1), one for each disk movement in sequence.

Note: this test is useful only for comparing debug probes against each other, as the actual elapsed time depends on the RTS implementation and the device itself.
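
To make the message count concrete, here is a minimal Python sketch of the recursion (the benchmark itself runs on the target and is not reproduced here). The message text, the peg names and the helper `ms_per_char` are assumptions for illustration; in particular, counting one extra character per message for the newline is a guess at how the ms/char figure could be normalized.

```python
def hanoi(n, src="A", dst="C", via="B", emit=print):
    """Move n disks from src to dst, calling emit() once per move.

    Returns the number of moves, which is 2**n - 1.
    """
    if n == 0:
        return 0
    moves = hanoi(n - 1, src, via, dst, emit)      # clear the way
    emit(f"Move disk {n} from {src} to {dst}")     # one console message
    moves += 1
    moves += hanoi(n - 1, via, dst, src, emit)     # restack on top
    return moves


def ms_per_char(elapsed_s, messages):
    """Average time per printed character in milliseconds.

    Counts one extra character per message for the newline (assumption).
    """
    chars = sum(len(m) + 1 for m in messages)
    return elapsed_s * 1000.0 / chars
```

Calling `hanoi(9, emit=lines.append)` with an empty list `lines` collects the 511 messages without printing them, which is handy when only the count or total character length is needed.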


Results on Cortex M4

Target device: TM4C129NCPDT (TM4C129 connected Launchpad)

Console I/O output (ms/char)


Target device: MSP432P401 (MSP432 Launchpad Red)

Console I/O output (ms/char)


Results on Cortex A15

Target device: AM5728 (AM5728 IDK)

Console I/O output (ms/char)


Results on Cortex A8

Target device: AM3359 (BeagleBone White Rev A4 with TI 20-pin connector)

Console I/O output (ms/char)


Results on C6600

Target device: C6678 (C6678 EVM)

Console I/O output (ms/char)


Results on C6740

Target device: C6748 (C6748 Experimenter's Kit)

Console I/O output (ms/char)


Results on F2800

Target device: TMS320F28377D (TMS320F28377D controlCARD)

Console I/O output (ms/char)


Target device: TMS320F28335 (TMS320F28335 controlCARD)

Console I/O output (ms/char)



Step comparison

The single step interactive test performs 500 assembly Step Into operations on a valid program loaded into the target device’s program memory.
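
The ms/step figure is simply the total time for the batch divided by the number of steps. A minimal sketch of that calculation follows; `time_steps` and the `step` callable are hypothetical names, standing in for whatever issues one assembly step on the target.

```python
import time


def time_steps(step, count=500, clock=time.perf_counter):
    """Call step() count times and return the average time in ms/step.

    step  -- hypothetical callable that performs one assembly step
    clock -- time source in seconds (replaceable for testing)
    """
    if count <= 0:
        raise ValueError("count must be positive")
    start = clock()
    for _ in range(count):
        step()
    elapsed = clock() - start
    return elapsed * 1000.0 / count
```

Timing the whole batch rather than each step individually keeps timer overhead out of the per-step figure.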

Results on Cortex M4

Target device: TM4C129NCPDT (TM4C129 connected Launchpad)

Single step operations (ms/step)


Target device: MSP432P401 (MSP432 Launchpad Red)

Single step operations (ms/step)


Results on Cortex A15

Target device: AM5728 (AM5728 IDK)

Single step operations (ms/step)


Results on Cortex A8

Target device: AM3359 (BeagleBone Rev A4 with TI 20-pin connector)

Single step operations (ms/step)


Results on C6600

Target device: C6678 (C6678 EVM)

Single step operations (ms/step)


Results on C6740

Target device: C6748 (C6748 Experimenter's Kit)

Single step operations (ms/step)


Results on F2800

Target device: TMS320F28377D (TMS320F28377D controlCARD)

Single step operations (ms/step)


Target device: TMS320F28335 (TMS320F28335 controlCARD)

Single step operations (ms/step)