Analysis and Parallelization of JPEG-2000 Reference Software for General-Purpose Processors
Like many other multimedia applications, image compression involves a significant amount of data processing for coding images. Sophisticated general-purpose processors with parallel architectures and advanced cache systems can improve the performance of serial multimedia applications through parallelization. This thesis describes the parallelization of the JasPer reference software for the JPEG-2000 image compression standard and presents results from simulation and from hardware execution on a multicore processor, where speedups of more than 2 are obtained with 4 processors. Results from execution and cache behavior analysis are presented to establish the expected speedup and to further characterize JasPer execution. The JasPer encoding process has been analyzed on a single processor, in both simulated and hardware execution, to gain more insight into application behavior. On recent hardware platforms, the significant contributors to the total execution time have been identified through profiling. The granularity of parallelism for parallelizable loops has been analyzed for execution on real hardware. Cache behavior and memory access patterns have been studied closely for the simulated execution. To facilitate parallelization, selected parallelizable loops have been transformed to assist the partitioning of loop iterations for parallel execution, to increase workload granularity, and to reduce synchronization overhead. These modifications include loop index and body transformations and loop fusion. A memory access pattern tracking feature has also been introduced for serial and parallel execution of a program in simulation. This feature tracks the number of memory accesses to a particular data region during a particular interval of time in order to gain additional insight into execution behavior.
The multithreaded execution of the parallelized JasPer encoder presents a relatively balanced workload, which indicates reasonable efficiency for parallel execution. The generated images have been compared against their original images using analytical tools to confirm image quality and to verify correctness.