This article describes how to optimize at levels higher than -o3 by letting the compiler optimize several source files together. The drawback is that it requires a change in how the code is built.
Use -o3 -pm on a Subset
When the compiler is invoked with the option combination -o3 -pm and multiple files are compiled in a single invocation …
% cl6x -o3 -pm file1.c file2.c file3.c …
The compiler combines all of those files into a single compilation unit and optimizes them together. This is called program mode compilation. Many optimization opportunities that arise infrequently within a single source file are common when working across multiple files. For example, the compiler may see that the arrays passed as arguments to some function never overlap, i.e. they do not alias. This in turn frees the compiler to choose a much more efficient schedule of instructions for the function than is otherwise possible.
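As a rough illustration of the aliasing case, consider the sketch below. The names scale_add, run_scale_add, in1, in2 and result are hypothetical and do not come from any TI example. In a real project the kernel and its caller would sit in different source files; they are shown together here only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel. Compiled on its own, the compiler has to assume
   that out might overlap a or b, so every store to out[i] could change
   a value that a later iteration loads. */
void scale_add(const short *a, const short *b, short *out, size_t n)
{
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = (short)(a[i] + 2 * b[i]);
}

/* Hypothetical caller, imagined to live in another file of the same
   program mode build. All three arrays are distinct objects, which is
   the kind of fact the compiler can only see across files. */
static const short in1[4] = { 1, 2, 3, 4 };
static const short in2[4] = { 10, 20, 30, 40 };
static short result[4];

const short *run_scale_add(void)
{
    scale_add(in1, in2, result, 4);
    return result;
}
```

If every call to scale_add in the combined compilation unit looks like this one, the compiler is in a position to conclude that the arguments never alias. Whether it actually reschedules the loop depends on the compiler version and the target.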
In an ideal world, all of the source code for the entire application is built with a single invocation of the compiler. This is almost always impractical. However, it is possible to use program mode compilation on a group of related source files, and achieve many of the same benefits as building everything at once.
The rest of this article describes how to apply program mode compilation to a related subset of C or C++ source files. This related subset of source files is termed "the subset". As already described, the subset must be built together with a single invocation of the compiler.
Specify Subset Entry Points
When using program mode compilation on the subset, two details need special attention in order to get the best optimization:
- The option -op2 or -op1 must be used
- The FUNC_EXT_CALLED pragma must be used
The options -op2 and -op1 both indicate that all the calls to the functions defined in the subset occur from within the subset. The only exception is functions marked with the FUNC_EXT_CALLED pragma. The difference between -op2 and -op1 is the presumption about how global data defined in the subset is modified. If that data is only modified by code within the subset, then use -op2. If that data is ever modified by code outside the subset, then use -op1. If you are unsure about how global data defined in the subset is modified, then use -op1.
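The choice between -op2 and -op1 can be pictured with the sketch below. The names g_gain, apply_gain and host_set_gain are hypothetical. The two halves would normally be separate source files, with the outside code referring to g_gain through an extern declaration; they share one listing here only so the sketch stands alone. Entry point marking is left out to keep the focus on data.

```c
#include <assert.h>

/* ---- inside the subset (hypothetical) ---- */

int g_gain = 2;                 /* global data defined in the subset */

int apply_gain(int sample)
{
    return sample * g_gain;     /* the subset only reads g_gain */
}

/* ---- outside the subset (hypothetical) ---- */

void host_set_gain(int new_gain)
{
    assert(new_gain > 0);
    g_gain = new_gain;          /* a write that the subset build cannot see */
}
```

Because host_set_gain writes g_gain from outside the subset, this arrangement calls for -op1. If the only writes to g_gain came from functions inside the subset, -op2 would be the appropriate choice.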
When the option set -o3 -pm -op is used, and the function main is not defined within the subset, the compiler automatically reverts to the less aggressive optimization level -op0. To prevent this, apply the FUNC_EXT_CALLED pragma to every function called from outside the subset. These functions are the entry points into the subset. Ideally, there are only a few entry points into a subset.
In C, the syntax of the pragma is …
#pragma FUNC_EXT_CALLED(function_name);
In C++, the pragma is written immediately before the declaration of the function …
#pragma FUNC_EXT_CALLED;
void function_name(a_type parameters);
The pragma must appear before any declaration or reference to the function.
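A minimal sketch of marking one entry point in C might look like the following. It assumes the C form of the pragma, which names the function in parentheses. The names filter_block and clip are hypothetical. The guard on __TI_COMPILER_VERSION__ is an addition for this sketch so that the file also builds cleanly with compilers that do not know the pragma.

```c
#include <assert.h>
#include <stddef.h>

/* The pragma comes before any declaration of, or reference to, the
   function it names. */
#ifdef __TI_COMPILER_VERSION__
#pragma FUNC_EXT_CALLED(filter_block);
#endif
int filter_block(const int *in, int *out, size_t n);

/* Helper used only inside the subset, so it is not marked. */
static int clip(int x)
{
    if (x > 100)
        return 100;
    if (x < -100)
        return -100;
    return x;
}

/* Entry point: called from code outside the subset.
   Returns the number of samples that had to be clipped. */
int filter_block(const int *in, int *out, size_t n)
{
    size_t i;
    int clipped = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        out[i] = clip(in[i]);
        if (out[i] != in[i])
            clipped++;
    }
    return clipped;
}
```

Only filter_block is marked, so the compiler remains free to treat clip as private to the subset.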
Restructure Your Build
Most methods of building code invoke the compiler on one file at a time. Going for supercharged optimization, however, requires you to reorganize the build so that multiple files are built at once. Note you do not have to put every source file in your application within a supercharged subset. You can apply this technique to performance critical portions of your code, and leave the rest to be built as before.
One detail to understand is the name of the object file produced when the subset is built. The name of the subset object file is the basename of the first C/C++ source file, with the extension changed to .obj.
Code Composer Studio can handle applying -pm -o3 to an entire project. But it cannot handle applying -pm -o3 to a subset of a project. The answer is to break off the subset into a library subproject. More information about library subprojects can be found in section 4.1.3 of the Code Composer Studio Development Tools v3.3 Getting Started Guide.
If you build with a make file, use make rules similar to:
file1.obj : file1.c file2.c file3.c
	cl6x -pm -o3 -op2 file1.c file2.c file3.c
Note how file2.obj and file3.obj are never mentioned. These files are never generated. file1.obj is the object file generated when file1.c, file2.c and file3.c are all compiled together. Note this also means that whenever any part of the subset changes, the entire subset is rebuilt. In the final link rule, write file1.obj where you would normally write file1.obj file2.obj file3.obj.
For more details, see the section titled "Performing Program-Level Optimization" in the Compiler User’s Guide for your toolset.