Chair for System Simulation (Department of Computer Science 10)
Performance Optimization for Future Hardware: FPGA

The FPGA branch of this project investigates hardware structures that optimally fit the needs of scientific codes.
FPGAs are an interesting candidate for high-performance computing because the user can freely configure their computational resources and the interconnections between them. Even manufacturers of supercomputers and general-purpose CPUs are about to introduce FPGAs or reconfigurable parts into their products. Consequently, many contributions on the use of FPGAs for high-performance computing have been published recently. Almost without exception, they conclude that FPGAs are not well suited for this community. The reasons given are manifold, but most of them stem from too superficial an examination:
  • "FPGAs are of no use for floating-point operations"
    The common opinion is that only integer or, at most, fixed-point arithmetic is really efficient on FPGAs. However, FPGA development has also advanced rapidly in recent years. There are devices with thousands of basic floating-point multiplier units that can be combined to build even double-precision operators. The manufacturers often offer libraries that make their use very easy while remaining efficient, running at 75% of the device's clock speed or higher.
    Moreover, very few scientists have examined how much accuracy their methods actually need. Mostly they know that double precision is sufficiently accurate, and since it is not expensive on a general-purpose CPU, they use double-precision floating-point operations throughout. When developing a special hardware device for a specific method (i.e. using an FPGA), one is free to choose the accuracy of each operation. By investing effort in an accuracy study of the method, one can optimize every single operator for the trade-off between width (i.e. space) and speed.
  • "FPGAs cannot evade the memory bottleneck"
    Most scientists who try to port their application to an FPGA buy a standard evaluation board from the FPGA manufacturer and hope to execute one iteration of their problem per cycle. They then realize that the data cannot be transferred to and from memory fast enough. What they forget is that they would be able to evade the bottleneck: once again, they have used a preconfigured architecture and tried to adapt their problem to its structure.
    Free from any such constraints, a specific architecture could be assembled for every scientific problem. This requires not only the integrated circuit implemented in the FPGA but also an elaborate board design that exactly fits the needs of the algorithm. For example, to deal with the memory bottleneck, several memory controllers or a memory bus width of hundreds of bits could be used. Of course, designing an individual board demands even more knowledge of electrical engineering, so that evaluating FPGAs as an alternative architecture would take several years.
  • "Programming FPGAs isn't like programming C++"
    Since FPGAs emulate hardware circuits, the configuration is normally compiled from a description written in a hardware description language (HDL) at the register-transfer level (RTL). Although learning a new language is no obstacle for programmers, HDLs are very different because they describe not only the processes (as a normal programming language does) but also the structure. Most scientists conclude that the average scientist cannot be expected to learn to write HDL code.
    However, there are tools that try to smooth the way by translating code written in a programming language like C++ into a hardware description. All current tools have drawbacks and limitations, but with enough effort and scientific motivation better tools could be made available.
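To make the accuracy study mentioned under the first point more concrete, the following minimal Python sketch emulates fixed-point arithmetic at several operator widths and compares the result of a dot product against a double-precision reference. It is only an illustration under stated assumptions: the dot-product kernel, the test data and the function names (`quantize`, `fixed_point_dot`, `accuracy_study`) are hypothetical and not taken from the project, and rounding in software only approximates what a hardware operator of that width would do.

```python
import math


def quantize(x: float, frac_bits: int) -> float:
    """Round x to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def fixed_point_dot(a, b, frac_bits: int) -> float:
    """Dot product in which inputs, products and the running sum are rounded
    to the same fixed-point grid (hypothetical operator model)."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = quantize(quantize(x, frac_bits) * quantize(y, frac_bits), frac_bits)
        acc = quantize(acc + prod, frac_bits)
    return acc


def accuracy_study(a, b, widths):
    """Absolute error of the fixed-point result for each number of fractional bits."""
    reference = math.fsum(x * y for x, y in zip(a, b))  # double-precision reference
    return {w: abs(fixed_point_dot(a, b, w) - reference) for w in widths}


if __name__ == "__main__":
    n = 64  # assumed problem size, for illustration only
    a = [math.sin(0.1 * i) for i in range(n)]
    b = [math.cos(0.1 * i) for i in range(n)]
    for width, err in accuracy_study(a, b, [8, 12, 16, 24, 32]).items():
        print(f"{width:2d} fractional bits: absolute error = {err:.3e}")
```

A table of this kind lets one pick, per operator, the narrowest width whose error is still acceptable for the method; a narrower operator occupies less space on the device.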
In conclusion, FPGAs enable the creation of a perfectly fitting architecture, but the costs (in terms of both effort and money) are too high. However, even large general-purpose CPU manufacturers are considering partially reconfigurable designs, so investigating the principles will be worthwhile.
Therefore, this part of the project will address the limitations and possibilities of reconfigurable hardware in general. Small studies on different evaluation boards will be used to derive principles that allow conclusions on whether a scientific method would benefit from reconfigurable parts in a computer (or CPU) and in which way current architectures have to be improved to be better suited to modern scientific problems.
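The memory argument made in the second point can be turned into a quick back-of-the-envelope check. The Python sketch below estimates how wide a memory interface would have to be to sustain one iteration per cycle, and how many iterations per cycle a given bus width allows. All names and numbers are assumptions for illustration: the `Kernel` class, the example of six 64-bit operands per iteration (for instance a stencil update that reads five values and writes one, with no on-chip reuse), the 200 MHz clock, and the simplification that the memory interface transfers one bus word per design clock cycle.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    """Memory traffic of one iteration (hypothetical description of a method)."""
    operands_per_iteration: int  # values read plus values written
    bits_per_operand: int


def required_bus_width_bits(kernel: Kernel, iterations_per_cycle: float = 1.0) -> float:
    """Bits that must cross the memory interface in every clock cycle."""
    return kernel.operands_per_iteration * kernel.bits_per_operand * iterations_per_cycle


def sustainable_iterations_per_cycle(kernel: Kernel, bus_width_bits: int) -> float:
    """Iterations per cycle a given bus width can feed, capped at one."""
    bits_per_iteration = kernel.operands_per_iteration * kernel.bits_per_operand
    return min(1.0, bus_width_bits / bits_per_iteration)


def required_bandwidth_gbytes(kernel: Kernel, clock_hz: float) -> float:
    """Bandwidth in GB/s (10**9 bytes) needed for one iteration per cycle."""
    return required_bus_width_bits(kernel) * clock_hz / 8 / 1e9


if __name__ == "__main__":
    stencil = Kernel(operands_per_iteration=6, bits_per_operand=64)  # assumed example
    print("bus width needed:", required_bus_width_bits(stencil), "bits")
    print("iterations/cycle on a 64-bit bus:",
          sustainable_iterations_per_cycle(stencil, 64))
    print("bandwidth at 200 MHz:", required_bandwidth_gbytes(stencil, 200e6), "GB/s")
```

Under these assumptions the example needs a 384-bit interface, which is in line with the bus width of hundreds of bits mentioned above, while a single 64-bit bus sustains only one sixth of an iteration per cycle. Narrowing the operands after an accuracy study reduces the required width proportionally.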
Last modified: 2011-11-10 09:42