Seminar in Numerical Analysis: Holger Fröning (Universität Heidelberg)
We are observing a continuous increase in concurrency and heterogeneity in computing systems of every scale, from small mobile devices to huge data centers, driven by a steady demand for more computing power. One of the prime examples of an application with virtually unlimited computational requirements is machine learning, in particular deep neural networks (DNNs). At the data-center level, DNN training has already led to the ubiquitous use of graphics processing units (GPUs), a prime example of specialization for computational improvement. Still, this application is strongly hindered by insufficient compute power and by scalability limitations. In contrast, mobile architectures for DNN inference are still nascent, and a large number of proposals have been published in recent years. Both applications, training and inference, can furthermore benefit substantially from algorithmic optimizations that reduce the computational requirements. This talk presents a short introduction to the application, a summary of our observations, and our own research on reduced precision through extreme forms of quantization. Finally, the talk offers some opinions on anticipated research problems.
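As a toy illustration of what "extreme forms of quantization" can mean (a minimal sketch in the style of binary-weight networks, not the speaker's actual method), each weight can be reduced to a single sign bit plus one shared scaling factor:

```python
def binarize(weights):
    # Shared scale alpha: mean absolute value of the weights
    # (a common choice in binary-weight quantization schemes).
    alpha = sum(abs(w) for w in weights) / len(weights)
    # Extreme quantization: keep only the sign of each weight,
    # so every value becomes +alpha or -alpha (1 bit + one scalar).
    return [alpha if w >= 0 else -alpha for w in weights]

# Hypothetical weight vector for illustration
w = [0.4, -0.2, 0.1, -0.7]
print(binarize(w))  # four values of magnitude 0.35, signs preserved
```

Storing only signs and one float per tensor shrinks memory traffic by roughly 32x relative to full precision, which is why such schemes are attractive for mobile inference.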