The /unsw/projects/armvfp directory contains the full
implementation details of the ARM Vector Floating Point Co-processor Emulator.
This work was developed by Raviraj Joshi as part of the Accelerated Tutorial
Project for COMP1721, Higher Computing 1B, in 2003.
Project Title
ARM Vector Floating Point Co-processor Emulator
Project Abstract
Architecture Overview
The Vector Floating Point (VFP) architecture is a co-processor extension to
the ARM architecture. It provides single (Version 1, or VFPv1) and double
(variant D, VFPv1D) precision floating point arithmetic, as defined by
ANSI/IEEE 754-1985 IEEE Standard for Binary Floating Point Arithmetic.
Short vectors of up to 8 single or 4 double precision numbers can be handled
by the VFP architecture. Most arithmetic instructions can be used on these
vectors, allowing 'single-instruction, multiple-data' (SIMD) parallelism.
Furthermore, the floating-point load and store instructions have 'multiple
register' forms, allowing vectors to be transferred to and from memory
efficiently.
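As a rough illustration of short-vector operation, the C sketch below applies a single-precision add element by element across the register file. It assumes the 32 registers are treated as four banks of eight with register indices wrapping inside a bank, and that the vector length is held as (length - 1) in FPSCR bits [18:16]. The function names are hypothetical and are not taken from this project's sources, and the rules that make some operands scalar are left out.

```c
#include <stdint.h>

#define VFP_BANK_SIZE 8u

/* Vector length (1..8) from the FPSCR LEN field, bits [18:16],
 * which is assumed to hold (length - 1). */
unsigned vfp_vector_length(uint32_t fpscr)
{
    return ((fpscr >> 16) & 0x7u) + 1u;
}

/* Step to the next register of a vector, wrapping inside the bank of 8. */
unsigned vfp_next_reg(unsigned reg, unsigned stride)
{
    unsigned bank_base = reg & ~(VFP_BANK_SIZE - 1u);
    return bank_base | ((reg + stride) & (VFP_BANK_SIZE - 1u));
}

/* Element-wise s[d] = s[n] + s[m] over len elements (simplified:
 * every operand is treated as a vector, none as a scalar). */
void vfp_vector_add(float s[32], unsigned d, unsigned n, unsigned m,
                    unsigned len, unsigned stride)
{
    for (unsigned i = 0; i < len; i++) {
        s[d] = s[n] + s[m];
        d = vfp_next_reg(d, stride);
        n = vfp_next_reg(n, stride);
        m = vfp_next_reg(m, stride);
    }
}
```

With a length of 4 and a stride of 1, a call such as vfp_vector_add(s, 8, 16, 24, 4, 1) would compute s8..s11 from s16..s19 and s24..s27 in one logical instruction.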
Registers
The ARM VFP architecture includes:
32 General Purpose Registers. Each GPR is capable of holding a single
precision floating point number or a 32 bit integer (unsigned long int or
2's complement signed long int). In the D variant these registers can be
used as pairs to hold up to 16 double precision floating point numbers.
Floating Point System ID register (FPSID). The FPSID register can be read
to determine which implementation of the VFP architecture is being used.
Floating Point Status and Control Register (FPSCR). The FPSCR register
supplies all user-level status and control. Status bits hold comparison
results and cumulative flags for floating point exceptions. Control bits
are provided to select rounding options and vector length/stride, and to
enable floating point exception traps.
Floating Point Exception register (FPEXC). The FPEXC register contains a
few bits for system-level status and control.
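One way to picture the register set described above is the following C sketch. The struct and function names are hypothetical and are not taken from armvfp.h. It assumes the host uses IEEE 754 single and double formats for float and double, and it assumes that the low word of a double lives in the even-numbered register of a pair, which is an illustrative choice only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register-file layout (illustrative, not from armvfp.h). */
typedef struct {
    uint32_t s[32];  /* 32 general purpose registers, 32 bits each */
    uint32_t fpsid;  /* Floating Point System ID register */
    uint32_t fpscr;  /* Floating Point Status and Control Register */
    uint32_t fpexc;  /* Floating Point Exception register */
} vfp_regs;

/* Read register n as a single precision value. */
float vfp_read_single(const vfp_regs *r, unsigned n)
{
    float f;
    assert(n < 32);
    memcpy(&f, &r->s[n], sizeof f);
    return f;
}

/* Write a single precision value to register n. */
void vfp_write_single(vfp_regs *r, unsigned n, float f)
{
    assert(n < 32);
    memcpy(&r->s[n], &f, sizeof f);
}

/* Read double register n from the pair s[2n], s[2n+1].
 * Assumption: the low word is held in the even register. */
double vfp_read_double(const vfp_regs *r, unsigned n)
{
    uint64_t bits;
    double d;
    assert(n < 16);
    bits = ((uint64_t)r->s[2 * n + 1] << 32) | r->s[2 * n];
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Write double register n to the pair s[2n], s[2n+1]. */
void vfp_write_double(vfp_regs *r, unsigned n, double d)
{
    uint64_t bits;
    assert(n < 16);
    memcpy(&bits, &d, sizeof bits);
    r->s[2 * n]     = (uint32_t)(bits & 0xFFFFFFFFu);
    r->s[2 * n + 1] = (uint32_t)(bits >> 32);
}
```

Storing raw 32-bit words rather than host float values keeps integer and floating point contents interchangeable, which matches the description of each register holding either kind of value.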
Instructions
ARM VFP instructions are provided for data processing, load and store, and
register transfer. These instructions are based on the generic ARM
coprocessor instructions: CDP, LDC, MCR, MRC and STC. They include
instructions that:
Load floating point values into registers from memory, and store floating
point values in registers to memory.
Transfer 32-bit values directly between VFP and ARM general-purpose
registers.
Transfer 32-bit values directly between VFP system registers and ARM
general-purpose registers.
Add, subtract, multiply, divide and take the square root of floating
point register values.
Invert, clear or leave unchanged the sign bit of floating point values for
negation, absolute value and copying of floating point values between registers.
Perform combined multiply-accumulate operations on floating point values,
providing space-efficient equivalents for common sequences of multiply,
negate, add, and subtract.
Perform conversions between single precision values, double precision
values, unsigned 32-bit integers and 2's complement signed 32-bit integers.
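The sign-bit instructions in the list above are simple enough to sketch directly on the raw 32-bit register contents. The helper names below are hypothetical. The point of the sketch is that negation, absolute value and copy never interpret the exponent or fraction, so they behave the same for zeros, infinities and NaNs.

```c
#include <stdint.h>

#define VFP_SIGN_BIT_32 0x80000000u

/* Copy: leave the sign bit unchanged. */
uint32_t vfp_copy_bits(uint32_t x)
{
    return x;
}

/* Absolute value: clear the sign bit. */
uint32_t vfp_abs_bits(uint32_t x)
{
    return x & ~VFP_SIGN_BIT_32;
}

/* Negation: invert the sign bit. */
uint32_t vfp_neg_bits(uint32_t x)
{
    return x ^ VFP_SIGN_BIT_32;
}
```

For example, the single precision encoding of 1.0 is 0x3F800000, so vfp_neg_bits(0x3F800000u) gives 0xBF800000, the encoding of -1.0.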
Exceptions
The VFP architecture supports all five of the floating point exceptions
defined in the IEEE 754 standard:
invalid operation
division by zero
overflow
underflow
inexact
An untrapped exception causes the appropriate cumulative flag in the FPSCR to be
set to 1, and any result registers of the exception-generating instruction
are set to the default result values specified by the standard. Execution
of the program containing the exception-generating instruction then
continues. A trapped exception invokes an implementation-defined trap
handler software routine.
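The untrapped and trapped paths can be sketched in C for the division by zero case. This is a minimal sketch with hypothetical function names, not the code in exceptions.c. It assumes the cumulative flags occupy FPSCR bits [4:0] and the matching trap enable bits sit eight positions higher, in bits [12:8]. Only division by zero is modelled; every other case falls through to the host's own division.

```c
#include <math.h>
#include <stdint.h>

/* Cumulative exception flags (assumed FPSCR bits [4:0]). */
#define FPSCR_IOC (1u << 0)  /* invalid operation */
#define FPSCR_DZC (1u << 1)  /* division by zero */
#define FPSCR_OFC (1u << 2)  /* overflow */
#define FPSCR_UFC (1u << 3)  /* underflow */
#define FPSCR_IXC (1u << 4)  /* inexact */

/* Trap enable bits (assumed FPSCR bits [12:8]). */
#define FPSCR_DZE (FPSCR_DZC << 8)

/* Returns 1 if the exception is trapped. Otherwise sets the
 * cumulative flag in the FPSCR and returns 0. */
int vfp_signal_exception(uint32_t *fpscr, uint32_t cumulative_flag)
{
    if (*fpscr & (cumulative_flag << 8))
        return 1;
    *fpscr |= cumulative_flag;
    return 0;
}

/* Divide a by b. On an untrapped division by zero the default result
 * is an infinity whose sign is the XOR of the operand signs. */
float vfp_divide(uint32_t *fpscr, float a, float b, int *trapped)
{
    *trapped = 0;
    if (b == 0.0f && a != 0.0f && !isnan(a) && !isinf(a)) {
        int negative = (signbit(a) != 0) != (signbit(b) != 0);
        if (vfp_signal_exception(fpscr, FPSCR_DZC)) {
            *trapped = 1;   /* a real emulator would call its trap handler */
            return 0.0f;    /* placeholder, no result is written */
        }
        return negative ? -INFINITY : INFINITY;
    }
    return a / b;  /* other exceptions are not modelled in this sketch */
}
```

Calling vfp_divide with a = -1.0f, b = 0.0f and a cleared FPSCR would return negative infinity and leave the division by zero cumulative flag set, after which the emulated program simply continues.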
Implementation
The VFP architecture can be implemented in software only, or with a
combination of software and hardware components. The software support code (i.e. the ARM VFP
emulator) is installed on the ARM undefined instruction vector. The ARM
undefined instruction exception can catch a VFP instruction and pass it
into a trap handler call that causes the software support code to execute.
This software implementation of the ARM VFP emulator is based on the
specification described in Chapter C of the ARM Architecture Reference
Manual.
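A first step for support code entered from the undefined instruction vector is to decide whether the faulting word is a VFP instruction at all, and which of the three instruction groups it belongs to. The C sketch below shows one possible classifier. The enum and function names are hypothetical and are not taken from decode.c. It assumes VFP instructions use coprocessor numbers 10 and 11, and it relies on the generic ARM coprocessor encodings: bits [27:25] = 110 for LDC/STC, and bits [27:24] = 1110 for CDP (bit 4 clear) and MCR/MRC (bit 4 set). The condition field and two-register transfer encodings are ignored.

```c
#include <stdint.h>

typedef enum {
    VFP_NOT_VFP,            /* not a VFP coprocessor instruction */
    VFP_DATA_PROCESSING,    /* CDP form */
    VFP_LOAD_STORE,         /* LDC/STC form */
    VFP_REGISTER_TRANSFER   /* MCR/MRC form */
} vfp_class;

vfp_class vfp_classify(uint32_t insn)
{
    uint32_t cp_num = (insn >> 8) & 0xFu;
    int is_ldc_stc = ((insn >> 25) & 0x7u) == 0x6u;
    int is_cdp_mcr_mrc = ((insn >> 24) & 0xFu) == 0xEu;

    /* The coprocessor number field only has meaning in a
     * coprocessor instruction, so check the format first. */
    if (!is_ldc_stc && !is_cdp_mcr_mrc)
        return VFP_NOT_VFP;

    /* Assumption: VFP occupies coprocessor numbers 10 and 11. */
    if (cp_num != 10u && cp_num != 11u)
        return VFP_NOT_VFP;

    if (is_ldc_stc)
        return VFP_LOAD_STORE;

    return (insn & (1u << 4)) ? VFP_REGISTER_TRANSFER
                              : VFP_DATA_PROCESSING;
}
```

An instruction that classifies as VFP_NOT_VFP would be handed on to whatever undefined instruction handling the system otherwise provides, while the other three results would select the matching decode and execute path.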
The function and variable declarations and definitions for this implementation
of the ARM Vector Floating Point Co-processor Emulator are distributed
across 1 header file, armvfp.h, and 6 source files, including
armvfp.c, inout.c,
decode.c, execute.c, arithmetic.c,
and exceptions.c.
References
References are listed in alphabetical order by author:
ANSI/IEEE Std 754-1985. IEEE standard for binary floating-point arithmetic.
Standards Committee of the IEEE Computer Society, USA.
Ercegovac, MD. and Lang, T. (2003) Digital Arithmetic.
Morgan Kaufmann, USA.
Furber, S. (2000) ARM System-On-Chip Architecture. 2nd Edition.
Addison-Wesley, Great Britain.
Overton, ML. (2001) Numerical Computing with IEEE Floating Point Arithmetic.
Society for Industrial and Applied Mathematics, Philadelphia.
Seal, D. (2000) ARM Architecture Reference Manual. 2nd Edition.
Pearson Educational, UK.