Cortex-A8 Neon Architecture

From Texas Instruments Wiki

Content is no longer maintained and is being kept for reference only!

NEON Block Diagram

Neon block diagram.png

NEON Hardware Features

  • 16-entry instruction queue
  • Dual view register file
    • 32 x 64-bit
    • 16 x 128-bit
  • 6-stage execution pipeline
    • Integer
    • Single precision floating point
  • Load store permute
  • Non-pipelined IEEE vector floating point support
  • 12-entry load data queue
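
The dual-view register file listed above can be pictured as a single 256-byte storage area that is addressable either as 32 doubleword (D) registers or as 16 quadword (Q) registers, with Qn overlapping D(2n) and D(2n+1). The following is a minimal software sketch of that aliasing, not a description of the hardware implementation; the type and function names (`neon_regfile_t`, `write_d`, `read_q_half`) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative model only: one 256-byte storage area viewed two ways. */
typedef union {
    uint64_t d[32];      /* 32 x 64-bit view (D0-D31) */
    uint8_t  q[16][16];  /* 16 x 128-bit view (Q0-Q15), held as raw bytes */
} neon_regfile_t;

/* Write a 64-bit value into D register n. */
static void write_d(neon_regfile_t *rf, unsigned n, uint64_t value) {
    assert(n < 32);
    rf->d[n] = value;
}

/* Read the low (half = 0) or high (half = 1) 64 bits of Q register n.
   The low half occupies the same storage as D(2n), the high half D(2n+1). */
static uint64_t read_q_half(const neon_regfile_t *rf, unsigned n, unsigned half) {
    uint64_t out;
    assert(n < 16 && half < 2);
    memcpy(&out, &rf->q[n][half * 8], sizeof out);
    return out;
}
```

In this model, writing D2 and D3 and then reading the two halves of Q1 returns the same two values, which is the aliasing the two register-file views imply.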

NEON Interface Diagram

CortexA8-NEONInterfaces.png

NEON is skewed late in the pipeline, past the retire point

  • reduces interface complexity
    • exception handling not required
    • decoupling queues from integer machine
  • removes load-use penalty
  • negative impact on NEON -> ARM transfers
    • nonblocking ARM register file helps hide latency

Streaming to and from L2 memory system

  • up to 8 outstanding transactions
  • can receive 128 bits/cycle
  • can receive data from L1 or L2 memory system
  • independent NEON store buffer

NEON Media Engine Unit

CortexA8-NEONMediaEngine.png

  • Instruction issue
    • static scheduling with fire-and-forget issue
    • 1 LS + 1 NINT/NFP can issue each cycle
  • Execution pipelines
    • All pipelines are 64-bit SIMD
    • Floating-point MAC executed using both FADD and FMUL pipelines
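
To make the issue rule above concrete, here is a toy cycle counter for an in-order stream in which at most one load/store/permute (LS) instruction and one data-processing (NINT or NFP) instruction issue per cycle. It is a rough sketch under strong assumptions: it ignores data hazards, multi-cycle instructions, and any other pairing restrictions the real hardware applies, and it assumes the two instructions may pair in either order. The names `neon_op_class` and `issue_cycles` are hypothetical.

```c
#include <stddef.h>

/* LS: load/store/permute pipeline; DP: integer (NINT) or floating-point (NFP). */
typedef enum { OP_LS, OP_DP } neon_op_class;

/* Count issue cycles for an in-order stream, pairing two adjacent
   instructions in one cycle only when exactly one of them is LS. */
static unsigned issue_cycles(const neon_op_class *ops, size_t n) {
    unsigned cycles = 0;
    size_t i = 0;
    while (i < n) {
        if (i + 1 < n && ops[i] != ops[i + 1]) {
            i += 2;  /* one LS and one DP issue together */
        } else {
            i += 1;  /* same class back to back: single issue */
        }
        cycles++;
    }
    return cycles;
}
```

Under this model, a stream that alternates LS and DP instructions issues in half as many cycles as a stream made up of a single class, which is why interleaving loads with arithmetic is usually worthwhile on this kind of machine.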

Data Movement: NEON and Integer Unit

  • Treated similarly to loads and stores, and therefore handled by the load/store permute pipeline
  • Uses VMOV instructions
  • Separate 32-bit buses between the Load Store Unit and NEON
  • Load data is placed in the NEON load data queue
  • NEON stalls in M2 if the load data queue entry is not valid
  • NEON sends store data along with the integer logical register address

Neon System Registers

CortexA8-NEONSystemRegs.png


FPEXC Register

  • Accessed through the VMRS/VMSR instructions (FMRX/FMXR in the older assembler syntax)
  • Setting the EN bit to 1 activates the NEON and VFP coprocessor. Reset clears EN to 0
  • Accessible in privileged modes only

CortexA8-NEONFPEXC.png


The cp10 and cp11 fields in the CP15 c1 Coprocessor Access Control Register control access to the NEON and VFP coprocessor

  • Reset clears the cp10 and cp11 fields and disables the NEON and VFP clocks
  • Accessible in privileged modes only

CortexA8-NEONCoproc.png
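
Taken together, the two registers above imply a two-step enable sequence after reset: grant access to cp10 and cp11 in the Coprocessor Access Control Register, then set FPEXC.EN. The sketch below only computes the register values so that it can be compiled and checked on any host; the actual privileged accesses are shown in comments. The helper names (`cpacr_set_access`, `cpacr_enable_neon_vfp`, `fpexc_enable`) are hypothetical, and choosing full access (0b11) rather than privileged-only access (0b01) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define CPACR_FULL_ACCESS 0x3u        /* 0b11: privileged and user mode access */
#define FPEXC_EN          (1u << 30)  /* FPEXC.EN, bit 30 */

/* Return cpacr with the 2-bit access field of coprocessor cp replaced.
   The field for coprocessor n sits at bits [2n+1:2n], so cp10 is
   bits [21:20] and cp11 is bits [23:22]. */
static uint32_t cpacr_set_access(uint32_t cpacr, unsigned cp, uint32_t access) {
    unsigned shift = 2u * cp;
    assert(cp <= 13 && access <= 3);
    cpacr &= ~(0x3u << shift);
    return cpacr | (access << shift);
}

/* Step 1: value to write back to the Coprocessor Access Control Register.
   On the target:  MRC p15, 0, r0, c1, c0, 2   (read)
                   MCR p15, 0, r0, c1, c0, 2   (write), followed by ISB */
static uint32_t cpacr_enable_neon_vfp(uint32_t cpacr) {
    cpacr = cpacr_set_access(cpacr, 10, CPACR_FULL_ACCESS);
    return cpacr_set_access(cpacr, 11, CPACR_FULL_ACCESS);
}

/* Step 2: value to write to FPEXC.
   On the target:  VMSR FPEXC, r0 */
static uint32_t fpexc_enable(uint32_t fpexc) {
    return fpexc | FPEXC_EN;
}
```

Starting from the reset values (both fields and EN cleared), these helpers produce 0x00F00000 for the access control register and 0x40000000 for FPEXC. Both writes must be performed from a privileged mode, and the access control write has to come first, since FPEXC is itself reached through the coprocessor interface.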