
Managing Symbols with the Linker

From Texas Instruments Wiki

No Longer Up-To-Date

The techniques in this article presume that partial linking is a practical alternative to a library. That is the case with the COFF object file format, which was dominant when this article was written. As of this update (February 2018), however, COFF has been replaced by ELF, except for C2000. For ELF, partial linking is not a practical alternative to a library.

Introduction

This article discusses techniques for managing symbols that can be performed when linking. You can control whether symbols are global or static, and you can hide symbols.

Version Information

The options discussed here are available in the linkers contained in these versions of the code generation tools.

Target    Version
C6000     v6.1.0
ARM       v4.5.0
C5500     v4.2.0
MSP430    v3.0.0
C2000     v5.1.0

Symbol Scoping vs. Symbol Hiding

First, a few concepts. The scope of a symbol refers to where in the system the symbol can be used. Hiding a symbol, on the other hand, refers to causing a symbol to disappear permanently. Scoping is a well-known concept in computer programming. Hiding a symbol, as described in this article, is not so widely known. It is likely that other toolsets supply similar features but describe them differently.

It is easy to confuse scoping and hiding. That is because scoping can cause a symbol to be hidden temporarily. Observe this C code example:

int data_sym;          // global scope
 
void fxn()
{
   int data_sym;       // fxn scope
   ...
}

During execution of fxn(), references to data_sym refer to the one defined in the scope of fxn(), and not the one defined in the global scope. It is natural to think of local data_sym as hiding global data_sym. Nonetheless, hiding a symbol, as described in this article, is different. As you will see, it is possible to make the global data_sym permanently hidden.

When the Symbols Change Properties

All of the linker options described in this article change properties of symbols. One thing in common among all the options is when the change occurs. None of these options affect symbols that are input to the linker. They only affect symbols created by the linker for the final linked output file.

-h and -g

The options -h and -g are not new with these versions of the tools. They are mentioned here because they are used to manage symbols in the linker.

These options affect the scoping of symbols. As typically used, -h changes the scope of all symbols from global to static, and -g then changes selected symbols back to global. If you never link against the output file again, this distinction in scope makes no difference. But if you do link again, the global symbols in the output file can be referenced from other code (for example, its global functions can be called), while the static symbols cannot.

The -h option (also documented as --make_static) makes all the global symbols static. If you plan to link again, that is not very useful. So the -h option is normally accompanied by one or more instances of -g symbol (also documented as --make_global). That causes the named symbol to change from static to global.

Example

For this simple assembly language code:

; sym_ex.asm
        .global api_fxn_sym, api_data_sym
 
        .bss    api_data_sym, 1
 
        .text
api_fxn_sym:
        nop

Here are some commands that build it, link it, and inspect the resulting symbols.

% cl6x sym_ex.asm
% lnk6x sym_ex.obj -o sym_ex.out
warning: no suitable entry-point found; setting to 0
% nm6x -l sym_ex.out | find "sym"
[index]  value      sclass   section name        symbol name
[33]    |0x00000008|C_EXT   |.bss               |api_data_sym
[32]    |0x00000020|C_EXT   |.text              |api_fxn_sym

The first two commands build and link the example. Ignore the linker warning; of course this code doesn't run. The last command invokes the name utility nm6x. You can read about that utility in the Assembly Language Tools book for your toolset. It displays information about the symbols in the file. The -l option says to show that information in a long format. The DOS command find "sym" filters out all the lines that do not have "sym" on them, thus showing only the lines relevant to the example. If you are on a Unix-like system, use grep instead. The column titled sclass shows the storage class for the symbol. A C_EXT symbol is global. As expected, api_data_sym and api_fxn_sym are global symbols.
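If you want to post-process the name utility output on a host machine rather than filter it with find or grep, the listing can be split into fields. The following is a minimal Python sketch, not part of the TI tools. It assumes the pipe-separated layout shown in the listings in this article, and the function name parse_nm_line is made up for illustration.

```python
# Sketch: parse lines of `nm6x -l` output as they appear in this article.
# Assumption: fields are separated by '|' exactly as in the listings here.

def parse_nm_line(line):
    """Return a dict for one symbol line, or None for header/blank lines."""
    if not line.startswith("[") or "|" not in line:
        return None
    fields = [f.strip() for f in line.split("|")]
    if len(fields) != 5 or not fields[0].endswith("]"):
        return None
    index, value, sclass, section, name = fields
    return {
        "index": int(index.strip("[]")),
        "value": int(value, 16),   # accepts the 0x prefix
        "sclass": sclass,          # C_EXT (global) or C_STAT (static)
        "section": section,
        "name": name,              # empty string when no name is shown
    }

listing = """\
[index]  value      sclass   section name        symbol name
[33]    |0x00000008|C_EXT   |.bss               |api_data_sym
[32]    |0x00000020|C_EXT   |.text              |api_fxn_sym
"""

symbols = [s for s in map(parse_nm_line, listing.splitlines()) if s]
global_names = [s["name"] for s in symbols if s["sclass"] == "C_EXT"]
print(global_names)  # ['api_data_sym', 'api_fxn_sym']
```

The same function can be applied to the later listings in this article to see which symbols are C_EXT and which are C_STAT.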

What if you add -h?

% cl6x sym_ex.asm
% lnk6x -h sym_ex.obj -o sym_ex.out
warning: no suitable entry-point found; setting to 0
% nm6x -l sym_ex.out | find "sym"
[index]  value      sclass   section name        symbol name
[33]    |0x00000008|C_STAT  |.bss               |api_data_sym
[32]    |0x00000020|C_STAT  |.text              |api_fxn_sym

Now they're static (C_STAT). Add -g for api_data_sym.

% cl6x sym_ex.asm
% lnk6x -h -g api_data_sym sym_ex.obj -o sym_ex.out
warning: no suitable entry-point found; setting to 0
% nm6x -l sym_ex.out | find "sym"
[index]  value      sclass   section name        symbol name
[33]    |0x00000008|C_EXT   |.bss               |api_data_sym
[32]    |0x00000020|C_STAT  |.text              |api_fxn_sym

Now api_data_sym is global again.

--localize and --globalize

These options affect the scoping of symbols just like -h and -g. The difference is in how you name the affected symbols. Each option accepts a limited regular expression and changes the scope of every symbol that matches the pattern. To do the same thing as the last example above:

% lnk6x --localize=* --globalize=api_data_sym sym_ex.obj -o sym_ex.out

In typical usage, however, you make everything static, except for the symbols in your API you want to expose:

% lnk6x --localize=* --globalize=api_* sym_ex.obj -o sym_ex.out

All the symbols are made static, except for those that begin "api_".

Do not use these options in combination with -h and -g. Use one technique or the other.

Further details about the regular expression matching are on the last slide of the presentation in the article C6x Code Generation Tools v6.1.
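The selection described in this section can be modeled on a host machine. The following Python sketch is only an illustration of the outcome, not the linker's implementation. It assumes that a --globalize match overrides a --localize match (which is the result of the examples above), and it uses Python's fnmatchcase as a stand-in for the linker's limited pattern matching. Only the '*' wildcard is exercised here; fnmatchcase supports other characters that the linker may not. The symbol helper_sym is hypothetical.

```python
from fnmatch import fnmatchcase

def scope_after_link(symbols, localize=(), globalize=()):
    """Model the scope of each symbol in the linked output.

    symbols maps a name to "global" or "static" (its scope on input).
    Assumption: a --globalize match wins over a --localize match.
    """
    result = {}
    for name, scope in symbols.items():
        if any(fnmatchcase(name, pattern) for pattern in localize):
            scope = "static"
        if any(fnmatchcase(name, pattern) for pattern in globalize):
            scope = "global"
        result[name] = scope
    return result

inputs = {
    "api_data_sym": "global",
    "api_fxn_sym": "global",
    "helper_sym": "global",   # hypothetical non-API symbol
}

# Models: lnk6x --localize=* --globalize=api_* ...
scopes = scope_after_link(inputs, localize=["*"], globalize=["api_*"])
for name, scope in scopes.items():
    print(name, scope)
```

In this model, the two api_ symbols remain global and helper_sym becomes static.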

--hide and --unhide

These options also accept a limited regular expression. However, they do not affect the scope of the symbols. Instead, --hide causes the matching symbols to disappear, and --unhide prevents matching symbols from disappearing.

Example

Here is how to hide everything except api_data_sym.

% cl6x sym_ex.asm
% lnk6x --hide=* --unhide=api_data_sym sym_ex.obj -o sym_ex.out
warning: no suitable entry-point found; setting to 0
% nm6x -l sym_ex.out | find "sym"
[index]  value      sclass   section name        symbol name
[33]    |0x00000008|C_EXT   |.bss               |api_data_sym

Poof! They're gone! Well, not really. Don't filter the name utility output.

% nm6x -l sym_ex.out

[index]  value      sclass   section name        symbol name

[0]     |0xffffffff|C_STAT  |ABSOLUTE           |
[1]     |0xffffffff|C_STAT  |ABSOLUTE           |
[2]     |0xffffffff|C_STAT  |ABSOLUTE           |
[3]     |0xffffffff|C_STAT  |ABSOLUTE           |
[4]     |0xffffffff|C_STAT  |ABSOLUTE           |
[5]     |0xffffffff|C_STAT  |ABSOLUTE           |
[6]     |0xffffffff|C_STAT  |ABSOLUTE           |
[7]     |0x00000020|C_STAT  |.text              |
[8]     |0x00000020|C_STAT  |.text              |
[9]     |0x00000040|C_STAT  |.text              |
[10]    |0x00000040|C_STAT  |.text              |
[11]    |0x00000008|C_STAT  |.data              |
[12]    |0x00000008|C_STAT  |.data              |
[13]    |0x00000008|C_STAT  |.data              |
[14]    |0x00000008|C_STAT  |.data              |
[15]    |0x00000008|C_STAT  |.bss               |
[16]    |0x00000008|C_STAT  |.bss               |
[17]    |0x00000009|C_STAT  |.bss               |
[18]    |0x00000009|C_STAT  |.bss               |
[19]    |0xffffffff|C_STAT  |ABSOLUTE           |
[20]    |0xffffffff|C_STAT  |ABSOLUTE           |
[21]    |0xffffffff|C_STAT  |ABSOLUTE           |
[22]    |0x00000008|C_STAT  |.bss               |
[23]    |0x00000000|C_STAT  |SYMBOLIC DEBUG     |
[24]    |0x00000020|C_STAT  |.text              |
[26]    |0x00000008|C_STAT  |.bss               |
[28]    |0x00000000|C_STAT  |.debug_line        |
[30]    |0x00000020|C_STAT  |.text              |
[31]    |0x0000009d|C_STAT  |.debug_info        |
[33]    |0x00000008|C_EXT   |.bss               |api_data_sym

Actually, the symbols are still present in the file, but the name field of each hidden symbol is changed to all zeros.

Again, in typical usage, you hide everything except the API related symbols.

% lnk6x --hide=* --unhide=api_* sym_ex.obj -o sym_ex.out
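The effect seen in the listings above, where hidden symbols keep their entries but lose their names, can be modeled as follows. This Python sketch is a simplified illustration and not the actual COFF symbol table layout. It assumes an --unhide match overrides a --hide match, as in the example, and again uses fnmatchcase as a stand-in for the linker's limited pattern matching.

```python
from fnmatch import fnmatchcase

def apply_hide(symbols, hide=(), unhide=()):
    """Blank the name of every hidden symbol; keep all entries.

    symbols is a list of (name, value, section) tuples.
    Assumption: an --unhide match wins over a --hide match.
    """
    linked = []
    for name, value, section in symbols:
        hidden = any(fnmatchcase(name, pattern) for pattern in hide)
        kept = any(fnmatchcase(name, pattern) for pattern in unhide)
        if hidden and not kept:
            name = ""   # the entry survives, only the name is cleared
        linked.append((name, value, section))
    return linked

table = [
    ("api_data_sym", 0x08, ".bss"),
    ("api_fxn_sym", 0x20, ".text"),
]

# Models: lnk6x --hide=* --unhide=api_data_sym ...
linked = apply_hide(table, hide=["*"], unhide=["api_data_sym"])

# Equivalent of filtering the listing with find "sym"
visible = [name for name, _, _ in linked if "sym" in name]
print(len(linked), visible)
```

The table still has two entries after hiding, but only api_data_sym survives the name filter, which matches the behavior shown with nm6x.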

Note that if you combine the old -h with the new --hide, the linker produces an error (recall that --make_static is the long-form equivalent of -h):

error: --hide option conflicts with previous option --make_static


Use-cases

Creating XDAIS-compliant libraries with localize and globalize

This FAQ shows how to create XDAIS-compliant libraries (from the namespace rule perspective) in a CCS environment using -h and -g to expose the key algorithm entry points.

Note that it may seem strange to discuss linking and symbol localizing or hiding in the context of a library. In reality it's not strange at all. Many codec libraries are built via the above partial linking technique since it creates an opportunity to group code together for cache conflict optimization purposes.

Naturally, you can continue to use -h and -g. Alternatively, you can achieve the same thing via:

--localize=*
--globalize=_FIR_TI_*

Running nm6x -lg on the library then yields:

[index]  value      sclass   section name        symbol name

[92]    |0x00000450|C_EXT   |.far               |_FIR_TI_IALG
[93]    |0x00000450|C_EXT   |.far               |_FIR_TI_IFIR
[85]    |0x00000360|C_EXT   |.text:algActivate  |_FIR_TI_activate
[90]    |0x000001e0|C_EXT   |.text:algAlloc     |_FIR_TI_alloc
[84]    |0x000003a0|C_EXT   |.text:algControl   |_FIR_TI_control
[86]    |0x000003e0|C_EXT   |.text:algDeactivate|_FIR_TI_deactivate
[81]    |0x000004c0|C_EXT   |.text:exit         |_FIR_TI_exit
[83]    |0x00000144|C_EXT   |.text:filter       |_FIR_TI_filter
[88]    |0x00000280|C_EXT   |.text:algFree      |_FIR_TI_free
[82]    |0x000004e0|C_EXT   |.text:init         |_FIR_TI_init
[91]    |0x00000300|C_EXT   |.text:algInit      |_FIR_TI_initObj
[87]    |0x00000480|C_EXT   |.text:algMoved     |_FIR_TI_moved
[89]    |0x000004a0|C_EXT   |.text:algNumAlloc  |_FIR_TI_numAlloc
[95]    |0x00000000|C_EXT   |UNDEFINED EXTERNAL |_IFIR_PARAMS
[94]    |0x00000000|C_EXT   |UNDEFINED EXTERNAL |_memcpy

Observe that all FIR_TI_* symbols now have global scope (C_EXT). There was no need to list each symbol separately via -g _FIR_TI_symName because the '*' wildcard matches them all.

All FIR_TI_* symbols are XDAIS-compliant by virtue of their <MODULE>_<VENDOR>_ prefix.
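The namespace check described above can be automated on the names reported by nm6x. The following Python sketch takes the defined global names from the listing above and confirms that each carries the module and vendor prefix. The function name xdais_prefix is made up for illustration, and the leading underscore is included only because that is how the names appear in the listing.

```python
def xdais_prefix(module, vendor):
    """Build the <MODULE>_<VENDOR>_ prefix as it appears in the listing."""
    return "_{}_{}_".format(module, vendor)

# Defined global symbols copied from the nm6x -lg listing above.
defined_globals = [
    "_FIR_TI_IALG", "_FIR_TI_IFIR", "_FIR_TI_activate", "_FIR_TI_alloc",
    "_FIR_TI_control", "_FIR_TI_deactivate", "_FIR_TI_exit",
    "_FIR_TI_filter", "_FIR_TI_free", "_FIR_TI_init", "_FIR_TI_initObj",
    "_FIR_TI_moved", "_FIR_TI_numAlloc",
]

prefix = xdais_prefix("FIR", "TI")
violations = [name for name in defined_globals if not name.startswith(prefix)]
print(violations)  # []
```

The undefined externals in the listing (_IFIR_PARAMS and _memcpy) are references to symbols defined elsewhere, so they are left out of the check.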

IP-Protection - using hide and unhide to remove symbols from codec libraries

NOTE - this is NOT intended to be a foolproof, legal guarantee that the techniques below protect IP. It simply highlights techniques that improve IP protection.

Third-party codec vendors want to ship libraries without compromising their IP or exposing details of their implementation. The concern is that if all the symbol names are exposed, hackers could use them to help reverse engineer the code from its object files.

In XDAIS algorithms, the key entry point is the _MODULE_VENDOR_IMODULE symbol. It is essential and cannot be hidden (or localized). For the FIR_TI algorithm we could therefore do:

--hide=*
--unhide=_FIR_TI_IFIR

Running nm6x -lg on the library then yields just:

[index]  value      sclass   section name        symbol name

[92]    |0x00000450|C_EXT   |.far               |_FIR_TI_IFIR
[95]    |0x00000000|C_EXT   |UNDEFINED EXTERNAL |_IFIR_PARAMS
[94]    |0x00000000|C_EXT   |UNDEFINED EXTERNAL |_memcpy

We have now hidden all other global symbols in the codec library, thus making it much more difficult to reverse engineer.
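If you build several algorithms, the two options can be generated from the module and vendor names. The following Python sketch simply applies the _MODULE_VENDOR_IMODULE pattern described above. The function name ip_protection_options is hypothetical, and only options already discussed in this article are produced.

```python
def ip_protection_options(module, vendor):
    """Return the --hide/--unhide pair for one XDAIS algorithm.

    The entry point name follows the _MODULE_VENDOR_IMODULE pattern.
    """
    entry_point = "_{m}_{v}_I{m}".format(m=module, v=vendor)
    return ["--hide=*", "--unhide=" + entry_point]

options = ip_protection_options("FIR", "TI")
print(" ".join(options))  # --hide=* --unhide=_FIR_TI_IFIR
```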

Reducing load-time and object-file size

This use-case is a little different. Some customers need to reduce load-time. For example, if a DSP executable is loaded by DSPLink and the executable is huge, the end-product may take some time to start from the user's perspective.

There are also (occasionally) reasons to keep the object file sizes smaller, even if the actual footprint loaded on the target stays the same.

Using --localize does nothing to help with the above problems. However, --hide does help a little. It removes the string table entries for the hidden symbols, thus making the object file smaller.
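To get a rough feel for the saving, you can estimate the string table space taken by a set of symbol names. The Python sketch below is an estimate only. It assumes the usual COFF convention that names of up to 8 characters are stored inline in the symbol table entry, while longer names are stored in the string table with a terminating null byte. The function name string_table_bytes is made up for illustration.

```python
def string_table_bytes(names, inline_limit=8):
    """Estimate string table bytes used by the given symbol names.

    Assumption: names longer than inline_limit characters live in the
    string table, each followed by one null terminator byte.
    """
    return sum(len(name) + 1 for name in names if len(name) > inline_limit)

before = string_table_bytes(["api_data_sym", "api_fxn_sym"])
after = string_table_bytes(["api_data_sym"])   # api_fxn_sym hidden
print(before - after)  # 12
```

For the small example in this article the saving is only a few bytes. It becomes more noticeable in a library with many long symbol names.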

A better technique for reducing load-time is to leverage the strip6x utility. This should typically be used on executables instead of libraries. Note however that strip6x does not have the wildcard / selective-removal options that --hide offers.