
C6000 Compiler: Memory Access Intrinsics

From Texas Instruments Wiki


This page collects advice and hints about using C6000 compiler intrinsics that directly read or write memory.

Unaligned Intrinsics Do Not Guarantee Unaligned Access

Problem Description

The unaligned memory access intrinsics, such as _mem2, _mem4, and _mem8, do not require the supplied address to be aligned on any particular boundary. However, if you supply the address of an object whose underlying type is aligned, the compiler may use an aligned memory access instruction. For example:

int *iptr;
// ...
_mem4(iptr) = ...  // memory store

The compiler may use STNW (unaligned store word), but it may use STW (aligned store word) instead. That's because aligned memory access instructions are cheaper. An unaligned memory access instruction cannot execute in parallel with another memory access instruction, whereas two aligned memory access instructions can. The compiler always seeks to use the cheapest instruction allowed. In this case, since iptr is known to point to int type memory locations, and such memory locations are known to be aligned on 4-byte boundaries, the cheaper STW instruction may be used. Use of an unaligned intrinsic does not override what the compiler knows about the alignment of int.
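The assumption described above only causes trouble when an address that is not 4-byte aligned ends up in an int pointer. As a rough debugging aid, the alignment of an address can be checked before it is used. The following is a minimal sketch in portable C; the helper name is_aligned4 is hypothetical and is not part of the TI compiler or its intrinsics.

```c
#include <stdint.h>

/* Hypothetical helper (not a TI API): returns nonzero when addr lies on a
   4-byte boundary, i.e. when an aligned word access to it would be valid. */
static int is_aligned4(const void *addr)
{
    /* The two low-order address bits are zero on a 4-byte boundary. */
    return ((uintptr_t)addr & 3u) == 0;
}
```

An address for which this check fails is exactly the kind that should never be held in an int pointer when it is passed to _mem4.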

Conservative Solution

A conservative solution to this problem imposes no constraints on how the input address is supplied. In particular, such an approach imposes no assumptions about alignment.

Always use pointers of type "char *" or "void *" as the argument to unaligned memory access intrinsics. Such pointers assume nothing about the address alignment. Sometimes the argument to the memory access intrinsic is the result of a computation. For example:

void fxn(unsigned int *base_ptr, unsigned int bit_len)
{
   char *work_ptr = (char *)base_ptr + bit_len;
   // ...
   _mem4(work_ptr) &= ...
   _mem4(work_ptr) |= ...
}

Techniques in this example worth noting include:

  • base_ptr points to unsigned int
  • work_ptr points to char
  • The first computation is to cast base_ptr to char *
  • All further arithmetic to compute work_ptr is done with type char *
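The techniques listed above can be put together in a small, complete function. This is only a sketch under stated assumptions: set_bits_at, load_u32, and store_u32 are hypothetical names, and the sketch assumes the C6000 compiler predefines the macro _TMS320C6X. When that macro is absent, a memcpy fallback stands in for _mem4 so the code can be built and tried on a host machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef _TMS320C6X
/* C6000 build (assumed macro): use the unaligned intrinsic through a void *,
   so the compiler has no alignment information to exploit. */
static uint32_t load_u32(void *p)              { return _mem4(p); }
static void     store_u32(void *p, uint32_t v) { _mem4(p) = v; }
#else
/* Host fallback: memcpy performs the access without assuming alignment. */
static uint32_t load_u32(void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
static void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
#endif

/* Hypothetical example: OR mask into the 32-bit word that starts
   byte_offset bytes past base_ptr, whatever its alignment. */
static void set_bits_at(uint32_t *base_ptr, size_t byte_offset, uint32_t mask)
{
    /* Cast to char * first; all further arithmetic is done in bytes. */
    char *work_ptr = (char *)base_ptr + byte_offset;

    uint32_t word = load_u32(work_ptr);
    word |= mask;
    store_u32(work_ptr, word);
}
```

For instance, calling set_bits_at(buf, 1, 0xFFFFFFFF) on a zeroed buffer sets bytes 1 through 4 and leaves bytes 0 and 5 untouched, which an aligned word store could not do.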