advice #30011: Consider adding assertions to indicate n-byte alignment of variables input1, input2, output if they are actually n-byte aligned: _nassert((int)(input1) % 8 == 0).
You may be able to get better loop performance by adding the "_nassert" statements in your code. This Advice is issued to alert you to this potential performance improvement in your code.
Most loops contain memory access instructions. The compiler tries to use wider loads, and aligned rather than non-aligned memory accesses, to reduce and balance the resources those instructions consume. One way to tell the compiler that "wider" loads are safe is the "_nassert" intrinsic.
An _nassert() makes a statement about the value of a variable at the point in the program where the _nassert() is located. From this, the compiler can often derive information about that variable at other locations in the program. For best performance, however, if the function contains multiple loops, repeat the _nassert() on entry to each loop. In the following code, the _nassert tells the compiler, for every invocation of f(), that ptr is aligned to an 8-byte boundary. Such an assertion often lets the compiler produce code that operates on multiple data values with a single instruction, also known as SIMD (single instruction, multiple data) optimization.
void f(short *ptr)
{
    _nassert((int) ptr % 8 == 0);
    /* a loop operating on data accessed by ptr */
}
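The advice to repeat the assertion before each loop can be sketched as follows. This is only an illustration: the function scale_then_sum and its parameters are hypothetical, and the fallback macro is an assumption added so the sketch also builds with compilers that lack the _nassert intrinsic (there the hint is simply discarded).

```c
/* Fallback for non-TI compilers (an assumption of this sketch):
   the _nassert intrinsic does not exist there, so the hint is dropped. */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(cond) ((void)0)
#endif

/* Hypothetical function with two loops over the same pointer:
   scale the buffer in place, then return the sum of its elements. */
int scale_then_sum(short *ptr, int n, short gain)
{
    int i;
    int sum = 0;

    _nassert((int) ptr % 8 == 0);   /* alignment hint for the first loop */
    for (i = 0; i < n; i++) {
        ptr[i] = (short)(ptr[i] * gain);
    }

    _nassert((int) ptr % 8 == 0);   /* repeated on entry to the second loop */
    for (i = 0; i < n; i++) {
        sum += ptr[i];
    }
    return sum;
}
```

Whether the second _nassert changes the generated code depends on the compiler version and options, so it is worth comparing the software pipeline information for both variants.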
The C64x and C64x+ processors support both aligned and non-aligned double-word loads and stores. If it is known that the function parameters are double-word aligned, switching to aligned, double-word memory accesses saves both D units and T address paths. In the example below, you can get the compiler to select the double-word versions of the memory access instructions by telling the compiler that the memory accesses are aligned. To tell the compiler that the memory accesses are aligned on double-word (64-bit) boundaries, use _nasserts() inside the function just prior to the loop of interest.
void BasicLoop(int *restrict output, int *restrict input1, int *restrict input2, int n)
{
    int i;
    _nassert((int) input1 % 8 == 0); // input1 is 8-byte aligned
    _nassert((int) input2 % 8 == 0); // input2 is 8-byte aligned
    _nassert((int) output % 8 == 0); // output is 8-byte aligned
    #pragma MUST_ITERATE(4,,4) // n >= 4, n % 4 = 0
    for (i=0; i<n; i++) {
        output[i] = input1[i] + input2[i];
    }
}
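The assertions are only valid if the caller really passes double-word aligned buffers. A minimal sketch of one way to arrange that is shown here, assuming a C11 compiler with _Alignas; on the TI toolchain the DATA_ALIGN pragma is commonly used for the same purpose. The buffer names, BUF_LEN, and the helper functions are hypothetical.

```c
#include <stdint.h>

#define BUF_LEN 8   /* hypothetical length, chosen as a multiple of 4 */

/* C11 _Alignas requests an 8-byte (double-word) boundary for each buffer. */
static _Alignas(8) int input_a[BUF_LEN];
static _Alignas(8) int input_b[BUF_LEN];
static _Alignas(8) int result[BUF_LEN];

/* Returns 1 if p lies on an `alignment`-byte boundary.
   `alignment` must be a power of two. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t) p & (alignment - 1)) == 0;
}

/* Returns 1 if all three buffers satisfy the 8-byte promise
   that the _nassert statements make to the compiler. */
int buffers_are_aligned(void)
{
    return is_aligned(input_a, 8)
        && is_aligned(input_b, 8)
        && is_aligned(result, 8);
}
```

Note that a pointer into the middle of an aligned buffer, such as input_a + 1, is generally not 8-byte aligned, so the assertion would be false for such a call.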
For further information on conveying information to the compiler to improve performance, see Tuning Software Pipelined Loops and Optimization Lab.
When you use mechanisms such as restrict, MUST_ITERATE, or _nassert, you are conveying extra information to the compiler. Always verify all such information is correct for every call.
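One way to verify these promises during development is to check them with a standard assert() before the optimized loop runs. The following is a sketch under stated assumptions: promises_hold and BasicLoopChecked are hypothetical names, the _nassert fallback macro exists only so the code builds on non-TI compilers, and the no-overlap promise made by restrict is not checked here.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback for non-TI compilers (an assumption of this sketch). */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(cond) ((void)0)
#endif

/* Hypothetical helper: returns 1 if the pointers are 8-byte aligned
   and n matches MUST_ITERATE(4,,4), i.e. n >= 4 and n % 4 == 0. */
static int promises_hold(const int *output, const int *input1,
                         const int *input2, int n)
{
    return (uintptr_t) output % 8 == 0
        && (uintptr_t) input1 % 8 == 0
        && (uintptr_t) input2 % 8 == 0
        && n >= 4
        && n % 4 == 0;
}

/* Hypothetical checked variant of BasicLoop. The assert() is active in
   debug builds and compiles away when NDEBUG is defined. */
void BasicLoopChecked(int *restrict output, const int *restrict input1,
                      const int *restrict input2, int n)
{
    int i;

    assert(promises_hold(output, input1, input2, n));

    _nassert((int) input1 % 8 == 0);
    _nassert((int) input2 % 8 == 0);
    _nassert((int) output % 8 == 0);
#ifdef __TI_COMPILER_VERSION__
#pragma MUST_ITERATE(4,,4)
#endif
    for (i = 0; i < n; i++) {
        output[i] = input1[i] + input2[i];
    }
}
```

Running the test suite with the checks enabled exercises every call site; a release build can then define NDEBUG and rely on the hints alone.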
Use "_nassert" statements wherever the alignment is known to hold.