Author Archives: Mauricio Alvarez-Mesa

Spin Digital presents an H.265/HEVC media player with HDR at IBC 2016

Spin Digital Video Technologies GmbH demonstrated an end-to-end HDR HEVC/H.265 software solution for ultra-high-definition video (4K, 8K, and beyond) at IBC 2016. The demonstration took place at the Amsterdam RAI in Hall 1, Booth 1.F11. With the HDR-enabled video codec, high-quality HDR video can be encoded, decoded, and displayed entirely in software running on PC platforms.

Spin Digital releases the world’s first 8K HEVC/H.265 media player

Berlin, June 2nd 2016: Spin Digital has released a complete software media player supporting 8K HEVC video and 22.2 audio. The player was recently demonstrated at several international events, including NAB in Las Vegas (April 18-21), the “After NAB” show in Tokyo (May 19-20), and the NHK Open House in Tokyo (May 26-29).

Spin Digital joins LPGPU2

Spin Digital Video Technologies GmbH (Spin Digital), a German company specializing in high-performance video codecs for the next generation of high-quality video applications, has joined LPGPU2, an EU consortium of technology companies and universities collaborating on tools for low-power parallel computing using GPUs.

A Technology Transfer Project has been awarded to AES – TU Berlin and Think Silicon

A technology transfer project called “eGPU accelerated HEVC/H.265 video decoder” has been awarded to AES TU Berlin and Think Silicon. The project is financed by TETRACOM (Technology Transfer in Computing Systems), a coordination action funded by the European Commission under the FP7 program.

TU Berlin paper to appear at SAMOS XIV

The paper titled “GPGPU Workload Characteristics and Performance Analysis” by Sohan Lal, Jan Lucas, Michael Andersch, Mauricio Alvarez-Mesa, Ahmed Elhossini and Ben Juurlink has been accepted at SAMOS 2014.

Abstract: GPUs are much more power-efficient devices compared to CPUs, but due to several performance bottlenecks, the performance per watt of GPUs is often much lower than what could be achieved theoretically. To sustain and continue high performance computing growth, new architectural and application techniques are required to create power-efficient computing systems. To find such techniques, however, it is necessary to study the power consumption at a detailed level and understand the bottlenecks which cause low performance. Therefore, in this paper, we study GPU power consumption at component level and investigate the bottlenecks that cause low performance and low energy efficiency.

TU Berlin paper to appear at MTAGS13 workshop, co-located with SC 2013

The paper “FPGA-Based Prototype of Nexus++ Task Manager”, by Tamer Dallou, Ahmed Elhossini and Ben Juurlink, has been accepted to appear at the 6th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers, which is co-located with Supercomputing/SC 2013 and takes place on November 17th, 2013, in Denver, Colorado, USA.

The Nexus++ task manager is designed for task-based programming models. Furthermore, it will be ported to GPGPUSim as an extension to add dependency-awareness to GPUs at block-level granularity.

Abstract: StarSs is one of several programming models that try to ease parallel programming. In StarSs, the programmer has to identify pieces of code that can be executed as tasks, as well as their inputs and outputs. Thereafter, the runtime system (RTS) determines the dependencies between tasks and schedules ready tasks onto worker cores. Previous work has shown, however, that the StarSs RTS may constitute a bottleneck that limits the scalability of the system, and proposed a hardware task management system called Nexus++ to eliminate this bottleneck. The first prototype of Nexus++ was implemented in SystemC. Its architecture also had a nondeterministic multi-cycle search algorithm in its critical path, potentially limiting its scalability. In this paper, we improved the architecture of Nexus++ and employed multi-way set-associative cache-like data structures to optimize its search algorithm and increase task throughput. We also modeled the new architecture in VHDL and targeted a Virtex-5 FPGA from Xilinx. Experimental results show that the new architecture is very resource-efficient, utilizing only 19% of the target FPGA. They also show that Nexus++ achieves a speedup of up to 81x using some synthetic benchmarks modeled after H.264 decoding. Hence, Nexus++ significantly enhances the scalability of applications parallelized using StarSs.
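
To make the dependency tracking described in the abstract more concrete, here is a minimal Python sketch of how a runtime could derive task dependencies from declared inputs and outputs and then release ready tasks. It is an illustration only, not the StarSs API and not the Nexus++ architecture: the task format, the function names `build_dependencies` and `run_order`, and the plain dictionaries (standing in for the set-associative tables mentioned above) are all assumptions made for this example.

```python
from collections import deque

# Hypothetical task format: (name, input addresses, output addresses),
# listed in program order. Not taken from StarSs or Nexus++.

def build_dependencies(tasks):
    """Return, for each task, the set of earlier task indices it must wait for."""
    last_writer = {}   # address -> index of the most recent task writing it
    readers = {}       # address -> tasks that read it since the last write
    deps = [set() for _ in tasks]
    for i, (_name, inputs, outputs) in enumerate(tasks):
        for addr in inputs:                      # read-after-write
            if addr in last_writer:
                deps[i].add(last_writer[addr])
        for addr in outputs:
            if addr in last_writer:              # write-after-write
                deps[i].add(last_writer[addr])
            deps[i].update(readers.get(addr, []))  # write-after-read
        # Update the tables only after this task's dependencies are known.
        for addr in inputs:
            readers.setdefault(addr, []).append(i)
        for addr in outputs:
            last_writer[addr] = i
            readers[addr] = []
    return deps

def run_order(tasks):
    """Release tasks in FIFO order as soon as all their dependencies finished."""
    deps = build_dependencies(tasks)
    pending = [len(d) for d in deps]
    dependents = [[] for _ in tasks]
    for i, d in enumerate(deps):
        for j in d:
            dependents[j].append(i)
    ready = deque(i for i, n in enumerate(pending) if n == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(tasks[i][0])        # "execute" the task
        for k in dependents[i]:          # wake up tasks waiting on it
            pending[k] -= 1
            if pending[k] == 0:
                ready.append(k)
    return order

example = [
    ("A", [], ["x"]),
    ("B", ["x"], ["y"]),
    ("C", ["x"], ["z"]),
    ("D", ["y", "z"], ["x"]),
]
print(build_dependencies(example))
print(run_order(example))
```

In this example, B and C both depend only on A and could run in parallel on different worker cores, while D has to wait for all three because it overwrites `x` after B and C have read it.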


GPUSimPow power simulator released!

The LPGPU Consortium announces the release of the GPUSimPow power simulator.

GPUSimPow is a flexible architectural power and performance simulator. It can be used to simulate the power consumption of GPGPU applications on current and future GPUs. The simulator uses a combination of empirical and analytical models to provide both high accuracy and high flexibility. It has been validated against NVIDIA GT240 and GTX580 GPUs using the power measurement testbed developed within LPGPU.

More information will be available in the talk “A Framework for Modeling GPUs Power Consumption” by Sohan Lal at the PEGPUM Workshop. GPUSimPow can now be downloaded from TU Berlin.

TU-Berlin has set up a testbed to accurately measure GPU power consumption

TU-Berlin has set up a testbed to accurately measure GPU power consumption. This testbed is being used to evaluate power reduction techniques on available GPUs. It will also be used to validate the power modeling of GPUSimPow, the GPU power simulator developed within the LPGPU project. Its high bandwidth and high sampling rate enable it to accurately measure short, sub-millisecond power events.
The measurement software developed at TU-Berlin allows developers to pinpoint power consumption down to the individual kernel.
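
As a rough illustration of what attributing power to an individual kernel can involve, the sketch below integrates sampled power over a kernel's execution interval to obtain its energy. This is not the TU-Berlin measurement software: the sample format (timestamp in seconds, power in watts), the function name `kernel_energy`, and the use of trapezoidal integration with linear interpolation at the interval edges are all assumptions made for this example.

```python
def kernel_energy(samples, start, end):
    """Energy in joules consumed between `start` and `end` (seconds).

    `samples` is a list of (timestamp_s, power_w) pairs with strictly
    increasing timestamps (hypothetical format). Power is assumed to vary
    linearly between neighbouring samples.
    """
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        # Clip the sample interval to the kernel's execution window.
        lo, hi = max(t0, start), min(t1, end)
        if hi <= lo:
            continue
        # Interpolate power at the clipped endpoints.
        slope = (p1 - p0) / (t1 - t0)
        p_lo = p0 + slope * (lo - t0)
        p_hi = p0 + slope * (hi - t0)
        # Trapezoidal rule: mean power times duration.
        energy += 0.5 * (p_lo + p_hi) * (hi - lo)
    return energy

# Four samples taken 1 ms apart; the kernel runs from 0.5 ms to 2.5 ms.
samples = [(0.000, 10.0), (0.001, 20.0), (0.002, 20.0), (0.003, 10.0)]
print(kernel_energy(samples, 0.0005, 0.0025))
```

For these made-up samples the kernel's energy comes out at roughly 37.5 mJ. The shorter the kernel, the more the result depends on the sampling rate, which is why sub-millisecond events call for a high sampling rate.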

Testbed for power measurement
Power measurements

TU-Berlin will present a paper at HeteroPar 2012

IDCT mapping

The paper “An Optimized Parallel IDCT on Graphics Processing Units” by Biao Wang, Mauricio Alvarez-Mesa, Chi Ching Chi, and Ben Juurlink has been accepted at the 2012 International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar’2012), which will be held in Rhodes Island, Greece, on August 27, 2012. The paper presents work on optimizing the H.264 inverse transform on GPUs, conducted at TU Berlin as part of the LPGPU project. More information on HeteroPar can be found at

