Performance optimization of numerical methods on graphics boards using the NVIDIA CUDA framework
Supervision:
Background:
Graphics accelerator boards are becoming increasingly popular, including in the high-performance computing community. Recent graphics cards
featuring multicore processors and large memories reinforce this trend, making them more and more attractive for numerical
computations as well.
Moreover, for graphics cards from NVIDIA, there is the Compute Unified Device Architecture (CUDA), which promises easy and fast porting
of existing C code through the use of special library commands.
Tasks:
Through long-standing experience in the performance optimization of numerical codes, a deep understanding of the internal structures and
limitations of IA32- and IA64-based architectures has been developed at our chair. To gain similar insight into graphics
processors, an in-depth study of the internal architecture of graphics boards is to be performed in this thesis. This will be achieved
by first porting simple kernels (such as the vector triad and other benchmarks) and later a more complex algorithm (i.e. the lattice Boltzmann
method) to CUDA, evaluating the cost-benefit ratio of coding effort to sustained performance speedup. If possible, this ratio is
to be evaluated for both the simple library commands and the hardware-specific instructions of CUDA.
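As a rough illustration of the first step, a vector triad a[i] = b[i] + c[i] * d[i] ported to CUDA might look like the sketch below. It is not part of the thesis material: the names triad_kernel, triad_gpu and CUDA_CHECK, the single-precision data type, the block size of 256 threads and the problem size are all assumptions chosen for the example. The bandwidth estimate counts three loads and one store per element.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",     \
                         cudaGetErrorString(err_), __LINE__);      \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Vector triad: each thread computes one element a[i] = b[i] + c[i] * d[i].
__global__ void triad_kernel(float* a, const float* b, const float* c,
                             const float* d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = b[i] + c[i] * d[i];
}

// Copies the inputs to the device, runs the triad once, copies the result
// back, and returns the kernel time in milliseconds (transfers excluded).
float triad_gpu(float* h_a, const float* h_b, const float* h_c,
                const float* h_d, int n)
{
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *a, *b, *c, *d;
    CUDA_CHECK(cudaMalloc(&a, bytes));
    CUDA_CHECK(cudaMalloc(&b, bytes));
    CUDA_CHECK(cudaMalloc(&c, bytes));
    CUDA_CHECK(cudaMalloc(&d, bytes));
    CUDA_CHECK(cudaMemcpy(b, h_b, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(c, h_c, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d, h_d, bytes, cudaMemcpyHostToDevice));

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    const int threads = 256;                        // assumed block size
    const int blocks = (n + threads - 1) / threads; // round up to cover n

    CUDA_CHECK(cudaEventRecord(start));
    triad_kernel<<<blocks, threads>>>(a, b, c, d, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    CUDA_CHECK(cudaMemcpy(h_a, a, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(a));
    CUDA_CHECK(cudaFree(b));
    CUDA_CHECK(cudaFree(c));
    CUDA_CHECK(cudaFree(d));
    return ms;
}

int main()
{
    const int n = 1 << 22;  // assumed problem size (about 4 million elements)
    std::vector<float> a(n, 0.0f), b(n, 1.0f), c(n, 2.0f), d(n, 3.0f);

    const float ms = triad_gpu(a.data(), b.data(), c.data(), d.data(), n);

    // 1 + 2 * 3 = 7 is exactly representable, so an exact comparison is safe.
    for (int i = 0; i < n; ++i) assert(a[i] == 7.0f);

    // Three loads and one store per element.
    const double gbytes = 4.0 * n * sizeof(float) / 1e9;
    std::printf("time %.3f ms, bandwidth %.2f GB/s\n", ms, gbytes / (ms / 1e3));
    return 0;
}
```

A single timed launch is only a crude measurement. For a meaningful sustained-performance figure, one would presumably add a warm-up launch, average over many repetitions, and scan a range of vector lengths, as is usual for triad benchmarks on CPUs.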
Recommended knowledge:
- Programming experience
- Lectures on performance optimization
- Basic experience with CUDA is preferable
Literature:
- NVIDIA CUDA Programming Guide
Type:
Master's Thesis or Diploma Thesis
Status:
Free