C6000 Intrinsics and SIMD Operations

From Texas Instruments Wiki
Introduction

Intrinsics look and act like C function calls, but are (with very few exceptions) implemented in a single instruction. Use intrinsics to access processor functionality that is difficult to express in C. The acronym SIMD stands for Single Instruction Multiple Data. A SIMD instruction performs two or more operations at once. A SIMD intrinsic does the same, but is written in C. The C6000 compiler supports many SIMD intrinsics. This article is about how data is passed to and returned from C6000 SIMD intrinsics.

This article is of particular importance to those who use the C6000 host intrinsics package to execute their C6000 C code on a hosted system like a Windows laptop.

Do Not Use the Type double

Consider this SIMD instruction for C6600 ...

   DADDSP A1:A0, A3:A2, A5:A4

This single instruction performs two 32-bit floating-point additions ...

   A5 = A1 + A3
   A4 = A0 + A2
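
The lane-by-lane behavior can be sketched in plain C. This is only an illustrative model: the type float_pair_model and the function daddsp_model are hypothetical names invented here, not part of c6x.h or the host intrinsics package.

```c
/* Hypothetical model of a 64-bit register pair holding two floats. */
typedef struct {
    float hi;  /* odd register of the pair, e.g. A1 */
    float lo;  /* even register of the pair, e.g. A0 */
} float_pair_model;

/* Models DADDSP: two independent 32-bit float additions. */
static float_pair_model daddsp_model(float_pair_model src1,
                                     float_pair_model src2)
{
    float_pair_model dst;
    dst.hi = src1.hi + src2.hi;  /* A5 = A1 + A3 */
    dst.lo = src1.lo + src2.lo;  /* A4 = A0 + A2 */
    return dst;
}
```

Each lane is added on its own; nothing carries or mixes between the high and low halves.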

If you were to write this in C, what data type would you choose to represent the pairs of 32-bit floating-point values? The type "double" seems reasonable. It is 64 bits wide, so it can hold two 32-bit values. So, the prototype for the intrinsic could be ...

   double _daddsp(double, double);    // WRONG!

If you only ever ran this code on C6600 systems, then no problems would occur. But for a growing number of customers, that is no longer the case.

Many customers use the C6000 host intrinsics package to execute their C6000 C code on a hosted system, and these customers cannot use the type "double" here. Most hosted systems (Windows, Linux, Mac, ...) execute on an Intel x86 processor. On Intel, the x87 floating-point instructions for loading and storing double memory locations automatically convert between the 64-bit double format and the 80-bit "extended-real format". That is fine when the associated data really is of type double. But when the associated data is really two 32-bit floats packed together for a SIMD operation, the conversion is harmful: due to normalization, among other things, the bits stored back to memory can differ from the bits that were loaded.

Thus, the host intrinsics package cannot use the type double to model any input or output of a SIMD intrinsic.
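
One concrete hazard can be sketched in portable C. It is offered as an illustration, not as the complete list of problems. When an x87 load reads a signaling NaN with the invalid-operation exception masked, it produces a quiet NaN, so a packed pair whose 64-bit pattern looks like a signaling NaN would not survive a trip through a double. The helpers below are hypothetical names, not part of the host intrinsics package. They use integer operations only, so no floating-point load is ever performed on the packed value.

```c
#include <stdint.h>

/* Hypothetical helper: packs two 32-bit lane bit patterns into 64 bits. */
static uint64_t pack_lanes(uint32_t hi_bits, uint32_t lo_bits)
{
    return ((uint64_t)hi_bits << 32) | (uint64_t)lo_bits;
}

/* Returns 1 if the 64-bit pattern, read as an IEEE 754 double, would be a
 * signaling NaN: exponent all ones, quiet bit (bit 51) clear, and a
 * non-zero mantissa. Returns 0 otherwise. */
static int is_double_snan_bits(uint64_t bits)
{
    uint64_t exponent  = (bits >> 52) & 0x7FFu;
    uint64_t quiet_bit = (bits >> 51) & 1u;
    uint64_t mantissa  = bits & 0x000FFFFFFFFFFFFFull;
    return exponent == 0x7FFu && quiet_bit == 0 && mantissa != 0;
}
```

For example, a high lane of 0x7FF00000 (a float NaN) over a low lane of 0x3F800000 (the float 1.0) forms such a pattern, whereas two ordinary finite lanes such as 2.0 and 1.0 do not.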

Use __float2_t or long long Instead

The actual prototype for _daddsp is ...

   __float2_t _daddsp(__float2_t, __float2_t);   // RIGHT!

The C6000 compiler comes with the header file c6x.h. It is in the \include directory along with all the other header files. It contains the prototypes for all the intrinsics. In c6x.h, the type __float2_t is (as of this writing) defined ...

   typedef double __float2_t;

This definition is subject to change in a future release. It also explains why, at this time, you can use double as the input and output of _daddsp when executing on a C6600 system.

The host intrinsics package comes with the header file C6xSimulator.h. Among other things, it contains the prototypes for all the intrinsics. You must include C6xSimulator.h in any source file which uses C6000 intrinsics. In C6xSimulator.h (via other include files), the type __float2_t is defined ...

   typedef struct {
       uint32 word0;
       uint32 word1;
   } __float2_t;

It is very unlikely this definition will ever change. The host intrinsic implementation of _daddsp depends on this interface to avoid the issues the Intel processor poses when using the double type to model SIMD inputs or outputs.
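
To show why a struct of two 32-bit words sidesteps the problem, here is a minimal sketch of how a host-side model might add two such values. It is not the package's actual implementation. The names float2_model_t, make_float2_model, and daddsp_host_model are hypothetical, uint32_t from stdint.h stands in for the package's uint32, and the choice of word0 as the low lane is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the host-side __float2_t layout. */
typedef struct {
    uint32_t word0;  /* assumed here to hold the low lane */
    uint32_t word1;  /* assumed here to hold the high lane */
} float2_model_t;

/* Copy the raw bits between float and uint32_t without any conversion. */
static uint32_t float_to_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

static float2_model_t make_float2_model(float hi, float lo)
{
    float2_model_t v;
    v.word0 = float_to_bits(lo);
    v.word1 = float_to_bits(hi);
    return v;
}

/* Adds lane by lane; the packed value is never handled as a double. */
static float2_model_t daddsp_host_model(float2_model_t a, float2_model_t b)
{
    return make_float2_model(
        bits_to_float(a.word1) + bits_to_float(b.word1),
        bits_to_float(a.word0) + bits_to_float(b.word0));
}
```

Because each lane is only ever treated as a float or as a 32-bit integer, no 64-bit floating-point load or store touches the packed data.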

There are a number of SIMD intrinsics which are similar to _daddsp, except the operations are on integers, and not floats. These intrinsics use the type long long instead of __float2_t. For example ...

   long long _dadd(long long, long long);

This intrinsic is similar to _daddsp, except it performs two 32-bit integer additions instead of two 32-bit floating-point additions.
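
The difference from an ordinary 64-bit addition can be sketched as follows. The function dadd_model is a hypothetical name and is not the package's implementation. The sketch assumes each lane wraps modulo 2^32, with no saturation and no carry from the low lane into the high lane.

```c
#include <stdint.h>

/* Hypothetical model of _dadd: two independent 32-bit additions packed
 * into one 64-bit value. Assumes wraparound in each lane. */
static long long dadd_model(long long src1, long long src2)
{
    uint64_t a = (uint64_t)src1;
    uint64_t b = (uint64_t)src2;

    /* Low lanes: any carry out of bit 31 is discarded. */
    uint32_t lo = (uint32_t)((uint32_t)a + (uint32_t)b);

    /* High lanes: added separately, receiving no carry from the low lane. */
    uint32_t hi = (uint32_t)((uint32_t)(a >> 32) + (uint32_t)(b >> 32));

    /* Converting back to a signed type is implementation-defined in C for
     * large values, but behaves as expected on mainstream compilers. */
    return (long long)(((uint64_t)hi << 32) | (uint64_t)lo);
}
```

With inputs 0x00000001FFFFFFFF and 0x0000000000000001, this model yields 0x0000000100000000, whereas a plain 64-bit addition would carry into the high half and yield 0x0000000200000000.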