Processing Cyclic Multimedia Workloads on Modern Architectures
Supervisor(s) and Committee member(s): Pål Halvorsen (main advisor), Carsten Griwodz (advisor), Christian Plessl (first opponent), Mei Wen (second opponent), Tor Skeie (third opponent)
Working with modern architectures for high-performance applications is increasingly difficult for programmers as the complexity of both system architectures and software continues to increase. The level of hand tuning and native adaptation required to achieve high performance comes at the cost of limiting the portability of the software. For instance, we show that a compute-intensive DCT algorithm performs better on graphics processors than the best algorithm for x86. Limited portability is a particular concern for cyclic multimedia workloads, a set of programs that run continuously with strict requirements for high performance and low latency. An example of a typical multimedia workload is a pipeline of many small image processing algorithms working in tandem to complete a particular task. The input can be video from one or more live cameras, and the output is a set of video frames with elements from several of the source videos, for example stitched panorama frames or 3D warped video. Such a setup runs continuously and may need to adapt to changes in the setup, large or small, without interruptions or downtime.
To reach the performance goals required by multimedia pipelines, we consider modern heterogeneous architectures instead of traditional symmetric multiprocessing architectures. We also investigate variations between recent microarchitectures of symmetric processors to identify differences that a low-level scheduler must take into account. Further, since multimedia workloads often need to adapt to external conditions, e.g., adding another participant to a video conference, we also investigate elastic and portable processing of multimedia workloads. To do this, we propose a framework design and language, which we call P2G. In the age of Big Data, this idea differs from the typical frameworks used for distributed processing, such as MapReduce and Dryad, in that it is designed for continuous operation instead of batch processing of large workloads. We emphasize support for heterogeneous architectures and expose opportunities for parallelism in workloads in a way that is easy to target, since it is similar to sequential execution with multidimensional arrays. The framework ideas are implemented as a prototype and released as an open source platform for further experimentation and evaluation.
Media Performance Group, Simula Research Laboratory
The Media Performance Group (MPG) addresses resource utilization and performance challenges in supporting a wide range of interactive multimedia services for large numbers of Internet users. The goals are to reduce costs, increase the number of users, and optimize the perceived service quality. MPG’s activities branch into several areas of multimedia systems to maintain and improve our ability to evaluate the performance of complete multimedia systems. This goal ties together research branches as diverse as multicore programming and user perception. Any level of a system may constitute a performance bottleneck, and the critical bottlenecks are known to move from component to component as the state of the art develops. Therefore, MPG’s research keeps a global scope, while its individual research activities target the critical performance questions.