Two novel cache management mechanisms on CPU-GPU heterogeneous processors

Huijing Yang
Tingwen Yu

Keywords

heterogeneous, multicore, CPU-GPU, cache partitioning

Abstract

Heterogeneous multicore processors that integrate CPUs and GPUs on the same
chip pose an emerging challenge in sharing on-chip resources, particularly the Last-Level
Cache (LLC). Because GPU cores exhibit high parallelism and strong tolerance of memory latency,
GPU applications tend to occupy the majority of LLC space. Under current cache management
policies, the LLC share of CPU applications can be markedly reduced by co-running
GPU workloads, severely degrading overall performance. To alleviate the unfair contention
between CPUs and GPUs for cache capacity, we propose two novel cache management mechanisms: a
static cache partitioning scheme based on an adaptive replacement policy (SARP) and a dynamic
cache partitioning scheme based on GPU miss awareness (DGMA). The SARP scheme first uses cache
partitioning to split the cache ways between CPUs and GPUs, and then applies an adaptive cache replacement
policy depending on the type of the incoming request. The DGMA scheme monitors the GPU's cache
performance metrics at run time and sets an appropriate threshold to dynamically adjust the partition ratio
of the shared LLC across different kernels. Experimental results show that the SARP mechanism
improves CPU performance by up to 32.6%, with an average improvement of 8.4%. The DGMA
scheme improves CPU performance while ensuring that GPU performance is not degraded,
achieving a maximum improvement of 18.1% and an average improvement of 7.7%.
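The DGMA idea described above can be sketched as a simple feedback loop: grow the GPU's share of LLC ways when its miss rate exceeds a threshold, and reclaim ways for the CPU when the GPU is miss-tolerant. This is a minimal illustrative sketch, not the paper's implementation: the 16-way LLC, the minimum GPU allocation, and the threshold values are all assumptions.

```python
# Hypothetical sketch of DGMA-style dynamic way partitioning.
# TOTAL_WAYS, MIN_GPU_WAYS, and the thresholds are illustrative values,
# not figures from the paper.
TOTAL_WAYS = 16    # assumed LLC associativity
MIN_GPU_WAYS = 2   # assumed floor so the GPU is never starved

def adjust_partition(gpu_ways, gpu_miss_rate,
                     high_thresh=0.30, low_thresh=0.10):
    """Return the new number of LLC ways allocated to the GPU.

    If the GPU's miss rate is high, it is cache-sensitive, so it gains a
    way; if the miss rate is low, the GPU tolerates misses and a way is
    returned to the CPU partition.
    """
    if gpu_miss_rate > high_thresh and gpu_ways < TOTAL_WAYS - 1:
        gpu_ways += 1   # GPU is cache-sensitive: grow its share
    elif gpu_miss_rate < low_thresh and gpu_ways > MIN_GPU_WAYS:
        gpu_ways -= 1   # GPU tolerates misses: give a way back to the CPU
    return gpu_ways
```

In a real design the miss-rate sample would come from per-core LLC performance counters each monitoring interval, and the way split would be enforced by a way-masking mechanism in the replacement logic.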