PRU Assembly Advanced Topics

From Texas Instruments Wiki

Content is no longer maintained and is being kept for reference only!

This article is part of a collection of articles describing software development on the PRU subsystem included in OMAP-L1x8/C674m/AM18xx devices (where m is an even number).

Advanced Topics

Using Macros

Macros are used to define custom instructions for the CPU. They are similar to in-line subroutines in C.

Defining a Macro

A macro is defined by first declaring the start of a macro block and specifying the macro name, then specifying the assembly code to implement the intended function, and finally closing the macro block.

    .macro  < macro name >
    .mparam < macro parameters >
        < lines of assembly code >
        < lines of assembly code >
        < lines of assembly code >
    .endm

The assembly code within a macro block is identical to that used outside a macro block with minor variances:

  • No dot-commands may appear within a macro block other than ".mparam".
  • Pre-processor definitions and conditional assembly are processed when the macro is defined.
  • Structure references are expanded when the macro is used.
  • Labels defined within a macro are considered local and can only be referenced from within the macro.
  • References to external labels from within a macro are allowed.

Macro Parameters

The macro parameters can be specified on one ".mparam" line or on several. They are processed in the order in which they are encountered. There are two types of parameters, mandatory and optional. Optional parameters are assigned a default value that is used if they are not specified when the macro is invoked. Since parameters are always processed in order, any optional parameters must come last, and once an optional parameter is omitted at invocation (so that its default is used), none of the remaining parameters may be specified.

For example:

    .macro  mv1             // Define macro "mv1"
    .mparam dst=r0, src=5   // Two optional parameters
        mov dst, src
    .endm

For the above macro, the following expansions are possible:

Macro Invocation    Result
mv1 r1, 7           mov r1, 7
mv1 r2              mov r2, 5
mv1                 mov r0, 5

Note that optional parameters cannot be skipped by using "empty" delimiters. For example, the following invocation of "mv1" is illegal:

    mv1      , 7     // Illegal attempt to do 'mov r0, 7'
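
The binding rules above can be modeled in a few lines of Python. This is only a sketch of the behavior described in this section, not PASM's implementation; the names `bind_params` and `expand` are hypothetical, and a mandatory parameter is represented here by a default of `None`.

```python
def bind_params(params, args):
    """Bind invocation arguments to macro parameters in declaration order.

    params: list of (name, default) pairs; default is None for a mandatory
            parameter (a convention of this sketch only).
    args:   list of argument strings supplied at the invocation.
    """
    if len(args) > len(params):
        raise ValueError("too many arguments")
    bound = {}
    for i, (name, default) in enumerate(params):
        if i < len(args):
            if args[i] == "":
                # "Empty" delimiters are not a way to skip a parameter.
                raise ValueError(f"empty argument for '{name}'")
            bound[name] = args[i]
        elif default is not None:
            bound[name] = default
        else:
            raise ValueError(f"missing mandatory parameter '{name}'")
    return bound


def expand(body, params, args):
    """Substitute the bound parameters into each line of the macro body."""
    bound = bind_params(params, args)
    out = []
    for line in body:
        tokens = line.replace(",", " , ").split()
        out.append(" ".join(bound.get(t, t) for t in tokens).replace(" ,", ","))
    return out


MV1_PARAMS = [("dst", "r0"), ("src", "5")]   # .mparam dst=r0, src=5
MV1_BODY = ["mov dst, src"]

for invocation in (["r1", "7"], ["r2"], []):
    print(expand(MV1_BODY, MV1_PARAMS, invocation))
```

Because arguments are matched strictly by position, the only way to reach the default of "src" is to stop supplying arguments before it, which is why "mv1 , 7" has no legal meaning.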

Example Macros

Example 1: Move 32-bit Value (mov32)

The mov32 macro is a good example of a simple macro that saves some typing and makes the source code look a little cleaner.


// mov32 : Move a 32bit value to a register
// Usage:
//     mov32   dst, src    
// Sets dst = src. Src must be a 32 bit immediate value.
.macro  mov32
.mparam dst, src
    mov     dst.w0, src & 0xFFFF
    mov     dst.w2, src >> 16
.endm

Example Invocation: The invocation for this macro is the same as the standard mov pseudo op:

    mov32   r0, 0x12345678

Example Expansion: The expansion of the above invocation uses two immediate value moves to accomplish the 32-bit load.

    mov     r0.w0, 0x12345678 & 0xFFFF
    mov     r0.w2, 0x12345678 >> 16
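
The two expressions in the expansion are evaluated by the assembler, so each mov receives a 16-bit immediate. The arithmetic can be checked with a short Python sketch; the function name `mov32_halves` is hypothetical and exists only for this illustration.

```python
def mov32_halves(value):
    """Return (w0, w2): the low and high 16-bit words of a 32-bit immediate."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    return value & 0xFFFF, value >> 16


w0, w2 = mov32_halves(0x12345678)
print(hex(w0), hex(w2))   # 0x5678 0x1234
```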

Example 2: Quick Branch If in Range (qbir)

Any label defined within a macro is altered upon expansion to be unique. Thus internal labels are local to the macro and code defined outside of a macro can not make direct use of a label that is defined inside a macro. However code contained within a macro can make free use of externally defined labels.

The qbir macro is a simple example that uses a local label. The macro instruction will jump to the supplied label if the test value is within the specified range.


// qbir : Quick branch in range
// Usage:
//     qbir    label, test, low, high  
// Jumps to label if (low <= test <= high).
// Test must be a register. Low and high can be 
// a register or a 8 bit immediate value.
.macro  qbir
.mparam label, test, low, high
        qbgt    out_of_range, test, low
        qbge    label, test, high
out_of_range:
.endm

Example Invocation: The example below checks the value in R5 for membership of two different ranges. Note that the range "low" and "high" values could also come from registers. They do not need to be immediate values:

    qbir    range1, r5,  1,  9   // Jump if (1 <= r5 <= 9)
    qbir    range2, r5, 25, 50   // Jump if (25 <= r5 <= 50)

Example Expansion: The expansion of the above invocation illustrates how external labels are used unmodified while internal labels are altered on expansion to make them unique.

    qbgt     _out_of_range_1_, r5, 1
    qbge     range1, r5, 9
_out_of_range_1_:
    qbgt     _out_of_range_2_, r5, 25
    qbge     range2, r5, 50
_out_of_range_2_:
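
The renaming of local labels can be illustrated with a small Python model. It is a sketch under stated assumptions: the `_name_N_` pattern is taken from the expansion above, but numbering by a per-expansion counter is an assumption, and `MacroExpander` is a hypothetical name, not part of PASM.

```python
import re


class MacroExpander:
    """Toy model: local labels get a unique name on every expansion."""

    def __init__(self):
        self.count = 0   # number of expansions performed so far (assumed scheme)

    def expand(self, body, local_labels, bindings):
        self.count += 1
        rename = {name: f"_{name}_{self.count}_" for name in local_labels}
        rename.update(bindings)   # parameters are replaced by their arguments
        return [
            re.sub(r"[A-Za-z_]\w*", lambda m: rename.get(m.group(0), m.group(0)), line)
            for line in body
        ]


QBIR_BODY = [
    "qbgt out_of_range, test, low",
    "qbge label, test, high",
    "out_of_range:",
]

expander = MacroExpander()
first = expander.expand(
    QBIR_BODY, ["out_of_range"],
    {"label": "range1", "test": "r5", "low": "1", "high": "9"})
second = expander.expand(
    QBIR_BODY, ["out_of_range"],
    {"label": "range2", "test": "r5", "low": "25", "high": "50"})
print("\n".join(first + second))
```

The external labels range1 and range2 pass through untouched because they arrive as arguments, while the internal label differs between the two expansions and therefore never collides.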

Using Structures and Scope

Basic Structures

Structures are used in PASM to eliminate the tedious process of defining structure offset fields for use in LBBO/SBBO, and the even more painful process of mapping structures to registers.

Declaring Structure Types

Structures are declared in PASM using the ".struct" dot command. This is similar to using a "typedef" in C. PASM automatically processes each declared structure template and creates an internal structure type. The named structure type is not yet associated with any registers or storage. For example, say the application programmer has the following structure in C:

typedef struct _PktDesc_ {
    struct _PktDesc_ *pNext;
    char            *pBuffer;
    unsigned short  Offset;
    unsigned short  BufLength;
    unsigned short  Flags;
    unsigned short  PktLength;
} PktDesc;

The equivalent PASM structure type is created using the following syntax:

.struct PktDesc
    .u32    pNext
    .u32    pBuffer
    .u16    Offset
    .u16    BufLength
    .u16    Flags
    .u16    PktLength
.ends

Assigning Structure Instances to Registers

The second function of the PASM structure is to allow the application developer to map structures onto the PRU register file without the need to manually allocate registers to each field. This is done through the ".assign" dot command. For example, say the application programmer performs the following assignment:

   .assign PktDesc, R4, R7, RxDesc   // Make sure this uses R4 thru R7

When PASM sees this assignment, it will perform three tasks for the application developer:

  1. PASM will verify that the structure perfectly spans the declared range (in this case R4 through R7). The application developer can avoid the formal range declaration by substituting ’*’ for ’R7’ above.
  2. PASM will verify that all structure fields are able to be mapped onto the declared range without any alignment issues. If an alignment issue is found, it is reported as an error along with the field in question. Note that assignments can begin on any register boundary.
  3. PASM will create an internal data type named "RxDesc", which is of type "PktDesc".

For the above assignment, PASM will use the following variable equivalencies. Note PASM uses the little endian register mapping.

Variable            Assignment
RxDesc              R4
RxDesc.pNext        R4
RxDesc.pBuffer      R5
RxDesc.Offset       R6.w0
RxDesc.BufLength    R6.w2
RxDesc.Flags        R7.w0
RxDesc.PktLength    R7.w2

For example the source line below will be converted to the output shown:

    // Input Source Line
    add     r20, RxDesc.pBuffer, RxDesc.Offset
    // Output Source Line
    add     r20, R5, R6.w0
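
The mapping that ".assign" performs can be approximated in Python. This is a sketch of the three checks listed above, not PASM's code; the function name `assign` and the rule used for the alignment check (a field may not straddle a register boundary) are assumptions made for this illustration.

```python
FIELD_BYTES = {".u32": 4, ".u16": 2}   # only the field types used in this article

PKT_DESC = [
    (".u32", "pNext"), (".u32", "pBuffer"),
    (".u16", "Offset"), (".u16", "BufLength"),
    (".u16", "Flags"), (".u16", "PktLength"),
]


def assign(fields, first_reg, last_reg, name):
    """Map each structure field onto a register field, starting at first_reg."""
    mapping = {name: f"R{first_reg}"}
    offset = first_reg * 4                 # byte offset within the register file
    for kind, field in fields:
        size = FIELD_BYTES[kind]
        reg, byte = divmod(offset, 4)
        if byte + size > 4:
            # Assumed alignment rule: a field must fit inside one register.
            raise ValueError(f"alignment error on field '{field}'")
        suffix = "" if size == 4 else f".w{byte}"
        mapping[f"{name}.{field}"] = f"R{reg}{suffix}"
        offset += size
    if offset != (last_reg + 1) * 4:
        raise ValueError("structure does not exactly span the declared range")
    return mapping


rx_desc = assign(PKT_DESC, 4, 7, "RxDesc")
for variable, register in rx_desc.items():
    print(f"{variable:20s}{register}")
```

Declaring the range as R4 through R6 in this model raises an error, which mirrors the first verification step: the structure has to span the declared range exactly.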

SIZE and OFFSET Operators

SIZE and OFFSET are two useful operators that can be applied to either structure types or structure assignments. The SIZE operator returns the byte size of the supplied structure or structure field. The OFFSET operator returns the byte offset of the supplied field from the start of the structure.

SIZE Operator Example

Using the assignment example from the previous section, the following SIZE equivalencies would apply:

Variable Operation          Result
SIZE(PktDesc)               16
SIZE(PktDesc.pNext)         4
SIZE(PktDesc.pBuffer)       4
SIZE(PktDesc.Offset)        2
SIZE(PktDesc.BufLength)     2
SIZE(PktDesc.Flags)         2
SIZE(PktDesc.PktLength)     2
SIZE(RxDesc)                16
SIZE(RxDesc.pNext)          4
SIZE(RxDesc.pBuffer)        4
SIZE(RxDesc.Offset)         2
SIZE(RxDesc.BufLength)      2
SIZE(RxDesc.Flags)          2
SIZE(RxDesc.PktLength)      2

OFFSET Operator Example

Using the assignment example from the previous section, the following OFFSET equivalencies would apply:

Variable Operation          Result
OFFSET(PktDesc)             0
OFFSET(PktDesc.pNext)       0
OFFSET(PktDesc.pBuffer)     4
OFFSET(PktDesc.Offset)      8
OFFSET(PktDesc.BufLength)   10
OFFSET(PktDesc.Flags)       12
OFFSET(PktDesc.PktLength)   14
OFFSET(RxDesc)              0
OFFSET(RxDesc.pNext)        0
OFFSET(RxDesc.pBuffer)      4
OFFSET(RxDesc.Offset)       8
OFFSET(RxDesc.BufLength)    10
OFFSET(RxDesc.Flags)        12
OFFSET(RxDesc.PktLength)    14
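
The same sizes and offsets can be reproduced outside the assembler, which is a convenient cross-check when a host program shares the structure layout. The sketch below uses Python's standard ctypes module; the helper names `size_of` and `offset_of` are hypothetical stand-ins for the SIZE and OFFSET operators, and the pointer fields are modeled as 32-bit integers to match the ".u32" declarations.

```python
import ctypes


class PktDesc(ctypes.Structure):
    """Host-side mirror of the PASM PktDesc structure type."""
    _fields_ = [
        ("pNext", ctypes.c_uint32),
        ("pBuffer", ctypes.c_uint32),
        ("Offset", ctypes.c_uint16),
        ("BufLength", ctypes.c_uint16),
        ("Flags", ctypes.c_uint16),
        ("PktLength", ctypes.c_uint16),
    ]


def size_of(struct, field=None):
    """Byte size of the structure, or of one field when a name is given."""
    return ctypes.sizeof(struct) if field is None else getattr(struct, field).size


def offset_of(struct, field=None):
    """Byte offset of a field from the start of the structure."""
    return 0 if field is None else getattr(struct, field).offset


print("SIZE(PktDesc) =", size_of(PktDesc))
for name, _ in PktDesc._fields_:
    print(f"{name:10s} size={size_of(PktDesc, name)} offset={offset_of(PktDesc, name)}")
```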

Using Variable Scopes

On larger PASM applications, it is common for different structures to be applied to the same register range for use at different times in the code. For example, assume the programmer uses three structures, one called "global", one called "init" and one called "work". Assume that the global structure is always valid, but that the init and work structures do not need to be used at the same time.

The programmer could assign the structures as follows:

    .assign struct_global,   R2, R8,  myGlobal
    .assign struct_init,     R9, R12, init      // Registers shared with "work"
    .assign struct_work,     R9, R13, work      // Registers shared with "init"

The program code may look something like the following:

        // Using R9 to R12 for "init" structure
        call  InitGlobalData
        mov   init.suff,
        call  InitProcessing
        qbbs  InitComplete, init.flags.fComplete

        // Using R9 to R13 for "work" structure
        call  LoadWorkRecord
        mov   r0, myGlobal.Status
        qbeq  type1, work.type, myGlobal.WorkType1

        // Using R9 to R12 for "init" structure
        mov   init.start, init.stuff
        set   init.flags.fComplete

The comments mark when the shared registers are being used for the "init" structure and when they are being used for the "work" structure. The above is quite legal, but in this example PASM does not enforce the register sharing. For example, assume the work section of the code contained a reference to the "init" structure:

        call  LoadWorkRecord
        mov   r0, myGlobal.Status
        set   init.flags.fWorkStarted     // The reference to "init" would not cause an assembly error!
        qbeq  type1, work.type, myGlobal.WorkType1

The above example would not result in an assembly error even though using the same registers for two different purposes at the same time would result in a functional error.

To solve this potential problem, named variable scopes can be defined in which the register assignments are to be made. For example, the above shared assignments can be revised as shown below to include the creation of variable scopes:

    .assign struct_global,   R2, R8,  myGlobal // Available in all scopes 

    .enter Init_Scope                          // Create new scope Init_Scope
      .assign struct_init, R9, R12, init       // Only available in Init_Scope
    .leave Init_Scope                          // Leave scope Init_Scope

    .enter Work_Scope                          // Create new scope Work_Scope
      .assign struct_work, R9, R13, work       // Only available in Work_Scope
    .leave Work_Scope                          // Leave scope Work_Scope

Once the scopes have been defined, the structures assigned within can only be accessed while the scope is open. Previously defined scopes can be reopened via the ".using" command.

.using Init_Scope                              // Using "Init_Scope"
        call  InitGlobalData
        mov   init.suff,
        call  InitProcessing
        qbbs  InitComplete, init.flags.fComplete
.leave Init_Scope

.using Work_Scope                              // Using "Work_Scope"
        call  LoadWorkRecord
        mov   r0, myGlobal.Status
        qbeq  type1, work.type, myGlobal.WorkType1
.leave Work_Scope

.using Init_Scope                              // Using "Init_Scope"
        mov   init.start, init.stuff
        set   init.flags.fComplete
.leave Init_Scope

When using scopes as in the above example, any attempted reference to a structure assignment made outside a currently open scope will result in an assembly error.
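
The visibility rule can be summarized with a toy Python model. It is only a sketch of the behavior described in this section: `ScopeTable` and `ScopeError` are hypothetical names, and the exception stands in for the assembly error PASM would report.

```python
class ScopeError(Exception):
    """Stands in for the assembly error described above."""


class ScopeTable:
    def __init__(self):
        self.global_names = set()   # assignments made outside any scope
        self.scopes = {}            # scope name -> names assigned inside it
        self.open_scopes = []       # scopes that are currently open

    def enter(self, scope):         # models ".enter": create and open a scope
        self.scopes[scope] = set()
        self.open_scopes.append(scope)

    def using(self, scope):         # models ".using": reopen an existing scope
        if scope not in self.scopes:
            raise ScopeError(f"unknown scope '{scope}'")
        self.open_scopes.append(scope)

    def leave(self, scope):         # models ".leave"
        self.open_scopes.remove(scope)

    def assign(self, name):         # models ".assign" of a structure instance
        if self.open_scopes:
            self.scopes[self.open_scopes[-1]].add(name)
        else:
            self.global_names.add(name)

    def resolve(self, name):
        visible = set(self.global_names)
        for scope in self.open_scopes:
            visible |= self.scopes[scope]
        if name not in visible:
            raise ScopeError(f"'{name}' is not visible in any open scope")
        return name


table = ScopeTable()
table.assign("myGlobal")            # available in all scopes
table.enter("Init_Scope")
table.assign("init")
table.leave("Init_Scope")
table.enter("Work_Scope")
table.assign("work")
table.leave("Work_Scope")

table.using("Work_Scope")
print(table.resolve("work"), table.resolve("myGlobal"))
```

With Work_Scope open, resolving "init" in this model raises ScopeError, which corresponds to the stray "set init.flags.fWorkStarted" now being caught at assembly time.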

Register Addressing and Spanning

Certain PRU instructions act upon or affect more than a single register field. These include MVIx, ZERO, SCAN, LBxO, and SBxO. It is important to understand how register fields are packed into registers, and how these fields are addressed when using one of these PRU functions.

Little Endian Register Mapping

The registers of the PRU are memory mapped with the little endian byte ordering scheme. For example, say we have the following registers set to the given values:

    R0 = 0x80818283
    R1 = 0x84858687

The following table shows the register mapping to byte offset in little endian:

Register Byte Mapping in Little Endian
Byte Offset       0      1      2      3      4      5      6      7
Register Field    R0.b0  R0.b1  R0.b2  R0.b3  R1.b0  R1.b1  R1.b2  R1.b3
Example Value     0x83   0x82   0x81   0x80   0x87   0x86   0x85   0x84
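
The byte mapping in the table can be reproduced with Python's standard struct module, which packs the two example register values in little endian order. This is only an illustration of the byte ordering; the variable name `register_file` is an assumption of the sketch.

```python
import struct

R0 = 0x80818283
R1 = 0x84858687

# "<II" packs two unsigned 32-bit values, least significant byte first.
register_file = struct.pack("<II", R0, R1)

for offset, value in enumerate(register_file):
    print(f"offset {offset}: R{offset // 4}.b{offset % 4} = 0x{value:02X}")
```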

Three factors are affected by the little endian register mapping: register spans, the first byte affected in a register field, and register addressing. In addition, there are some alterations in PRU opcode encoding.

Register Spans

The concept of how the register file is spanned can be best viewed using the table from the Little Endian Register Mapping section above. Registers are spanned by incrementing the byte offset from the start of the register file for each subsequent byte.

For example assume we have the following registers set to their indicated values:

    R0 = 0x80818283
    R1 = 0x84858687
    R2 = 0x00001000

If the instruction "SBBO R0.b2, R2, 0, 5" is executed, it will result in a memory write to memory address 0x1000 as shown in little endian:

SBBO Result for Little Endian
Byte Address    0x1000  0x1001  0x1002  0x1003  0x1004
Value           0x81    0x80    0x87    0x86    0x85
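
The span can be traced with a short Python simulation. It is a sketch rather than a model of the hardware: the function `sbbo` and the dictionary used as memory are hypothetical, and the first source byte is passed as a byte offset into the register file (2 for R0.b2).

```python
import struct


def sbbo(registers, first_byte, base, offset, count, memory):
    """Copy `count` bytes of the register file, starting at byte offset
    `first_byte`, to memory at address base + offset."""
    regfile = struct.pack(f"<{len(registers)}I", *registers)
    for i, value in enumerate(regfile[first_byte:first_byte + count]):
        memory[base + offset + i] = value


registers = [0x80818283, 0x84858687, 0x00001000]   # R0, R1, R2
memory = {}

# SBBO R0.b2, R2, 0, 5: the source starts at R0.b2, the base address is in R2.
sbbo(registers, 2, registers[2], 0, 5, memory)

for address in sorted(memory):
    print(f"0x{address:04X}: 0x{memory[address]:02X}")
```

The write runs past the end of R0 and continues into R1.b0 through R1.b2, which is exactly the spanning behavior the table shows.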

First Byte Affected

The first affected byte in a register field is literally the first byte to be altered when executing a PRU instruction. For example, in the instruction "LBBO R0, R1, 0, 4", the first byte to be affected by the LBBO is R0.b0 in little endian. The width of a field in a register span operation is almost irrelevant in little endian, since the first byte affected is independent of field width. For example, consider the following table:

First Byte Affected in Little Endian
Register Expression    First Byte Affected
R0                     R0.b0
R0.w0                  R0.b0
R0.w1                  R0.b1
R0.w2                  R0.b2
R0.b0                  R0.b0
R0.b1                  R0.b1
R0.b2                  R0.b2
R0.b3                  R0.b3

As can be seen in the table above, for any expression the first byte affected is always the byte offset of the field within the register. Thus in little endian, the expressions listed below all result in identical behavior.

    LBBO R0, R1, 0, 4
    LBBO R0.w0, R1, 0, 4
    LBBO R0.b0, R1, 0, 4

Register Address

The MVIx, ZERO, SCAN, LBxO, and SBxO instructions may use or require a register address instead of the direct register field in the instruction. In the assembler a leading ’&’ character is used to specify that a register address is to be used. The address of a register is defined to be the byte offset within the register file of the first affected byte in the supplied field.

Given the information already presented in this chapter, it should be straightforward to verify the following register address mappings:

Register Addressing in Little Endian
Register Address Expression    First Byte Affected    Register Address
&Rn                            Rn.b0                  (n*4)
&Rn.w0                         Rn.b0                  (n*4)
&Rn.w1                         Rn.b1                  (n*4) + 1
&Rn.w2                         Rn.b2                  (n*4) + 2
&Rn.b0                         Rn.b0                  (n*4)
&Rn.b1                         Rn.b1                  (n*4) + 1
&Rn.b2                         Rn.b2                  (n*4) + 2
&Rn.b3                         Rn.b3                  (n*4) + 3
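
The table reduces to one rule: four times the register number plus the index of the first affected byte. A minimal Python sketch of that rule follows; the function name `register_address` and the accepted expression syntax are assumptions made for the illustration.

```python
import re

_FIELD = re.compile(r"&?[Rr](\d+)(?:\.([bw])(\d))?")


def register_address(expr):
    """Return the byte offset in the register file of the first byte
    affected by a register expression such as "&R3.w2"."""
    match = _FIELD.fullmatch(expr)
    if match is None:
        raise ValueError(f"not a register expression: {expr!r}")
    reg = int(match.group(1))
    kind, index = match.group(2), match.group(3)
    if kind is None:
        return reg * 4                       # whole register starts at b0
    index = int(index)
    if index > (3 if kind == "b" else 2):    # b0..b3 and w0..w2 only
        raise ValueError(f"no such field: {expr!r}")
    return reg * 4 + index


for expr in ("&R0", "&R1.b3", "&R3.w2", "&R5.w1"):
    print(expr, "->", register_address(expr))
```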

Register addresses are very useful for writing endian agnostic code, or for overriding the declared field widths in a structure element.

PRU Opcode Generation

The PRU binary opcode formats for LBBO, SBBO, LBCO, and SBCO use a byte offset for the source/destination register in the PRU register file. For example, only the following destination fields can actually be encoded into a PRU opcode for register R1:

  • LBBO R1.b0, R0, 0, 4
  • LBBO R1.b1, R0, 0, 4
  • LBBO R1.b2, R0, 0, 4
  • LBBO R1.b3, R0, 0, 4
