|
|
|
|
FPGAs are Field programmable Gate Arrays. They have a far higher density of
programmable gates than more conventional computing systems, but a lower clock
speed. They offer enormous potential for fine-grain parallel computing,
where the results of individual computations are almost immediately available
to other computing elements. This is quite unlike the coarse-grain parallel
computing used in modern supercomputers, where there are significant delays in
transferring data between different computing elements.
Quad-core PC chips are now widely available. A new generation of parallel
chips is under development where there may be as many as 64 parallel cores in a
closely coupled system. These devices are far less parallel than FPGAs but are
faster and may offer similar scope for fine-grain parallelism.
|
|
Algorithms for fine-grain parallel computing were developed by a member of
SimCon in 1983 for the
Applied Dynamics International AD10 simulation computer.
This machine was by far the fastest real-time simulation system available in the mid
1980s, and was used in the design of Space Shuttle, comunications satellites, nuclear
powerplant, helicopters and a range of military hardware. The algorithms were implemented
as a stand-alone optimiser and instruction scheduler. They are applicable
to Fortran or C programming for FPGAs and multicore systems. Research is in progress in
collaboration with Liverpool Hope University and The University of Cape Town to apply
the algorithms to these devices.
|
|
"Hydra", the first implementation of the parallel algorithmes for an FPGA is a pipeline of
processes made up of:
WinFPT: WinFPT replaces separate variable lives with different variables
(For a description of variables lives, please see the description of this issue in the
research on physical units and dimensions. Essentially,
variables which are assigned more than once are replaced by separate variables, each with only
a single assignment). WinFPT expands sub-programs in-line and unrolls DO loops. It cuts
long expressons into "triples" - expressions of the form a = b + c, a = b * c or a = sin(b);
WinFPT API: Code written in the WinFPT API outputs the triples in a format
appropriate for the parallel optimiser;
Hydra Optimiser: The Hydra optimiser, derived from the original system for the AD10,
schedules the instructions in an optimised parallel sequence;
Hydra Image Builder: The image builder constructs the load image for the FPGA.
The FPGA program consists of an interpreter which reads the load image and executes
groups of operations in parallel. The FPGA program is data driven. It is a single generic
code which will execute any program. Compilation of a Fortran code through the pipeline does
not require any further coding within the FPGA.
The Hydra pipeline is described in detail in the paper
"APPRASE: Automatic parallelisation of Fortran to run on an FPGA" presented at the SCSC 2008
conference in Edinburgh.
Hydra forms part of the APPRASE project at Liverpool Hope University, partly supported
by the HEIF UK funding.
|
|
FPGAs are most efficiently programmed in fixed point arithmetic. Real numbers
are represented by scaled fractions. Each real value, x, is assigned a maximum absolute value,
scale(x), and is represented internally by an integer which is calculated as
X = 2**(n-1) * x / scale(x) where n is the number of bits used in the scaled fraction.
Usually, n would be 32 bits, but in an FPGA this is not necessarily the case.
The AD10 was also a fixed-point machine. A system was developed to maintain
the coefficients and data structures required to support the scaled-fraction
arithmetic. For example, where
a = b + c
and a, b and c have different scales,
a set of coefficients must be generated to correct the values so that the effective
calculation would be:
A = ( B + C * ( scale(c) / scale(b) ) ) * ( scale(b) / scale(a) )
where A, B and C are the scaled fractions
representing a, b and c.
This system will be applied in the FPGA programs.
|
 |
|
|
|
Copyright ©1995 to 2010 Software Validation Ltd. All rights reserved. |
|
|