Laboratory for Internet and Innovative Technologies

This paper explains why the matrix multiplication algorithm suffers performance drawbacks when it uses associative cache memory. We give an overview of cache memory organization and a theoretical analysis of why the drawback appears in matrix multiplication. We also provide a method for avoiding the situations in which the drawback is significant and for achieving maximum performance. The problem arises from the way matrix columns are stored: the matrix always maps onto the same small group of cache sets, which causes a significant number of cache misses. In this case the processor effectively uses only a small group of cache sets instead of all the sets in the associative memory, and maximum performance is reached only when all of them are used.
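The mapping effect described above can be illustrated with a minimal sketch. It is not the paper's analysis or method. It assumes a row-major n x n matrix of 8-byte elements that starts at byte offset 0, and a hypothetical cache with 64-byte lines and 64 sets. The function name `distinct_sets_for_column` and all parameter values are chosen here for illustration only.

```python
def distinct_sets_for_column(n, elem_size=8, line_size=64, num_sets=64, col=0):
    """Count the distinct cache sets touched while walking one column
    of an n x n row-major matrix (assumed layout, base address 0)."""
    touched = set()
    for row in range(n):
        # Byte offset of element (row, col) in row-major storage.
        addr = (row * n + col) * elem_size
        # Set index: cache line number modulo the number of sets.
        touched.add((addr // line_size) % num_sets)
    return len(touched)


if __name__ == "__main__":
    # Assumed sizes: 512 gives a row stride of exactly 64 lines,
    # 520 gives a row stride of 65 lines.
    for n in (512, 520):
        print(n, distinct_sets_for_column(n))
```

Under these assumed parameters, a column of the 512 x 512 matrix falls entirely into a single cache set, while a column of the 520 x 520 matrix is spread over all 64 sets. This shows how a small change in matrix size could plausibly move the access pattern from the worst case toward full use of the cache.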


Marjan Gusev and Sasko Ristov


High performance computing, performance drawback, processor speed, shared memory multiprocessor, superlinear speedup

Full Paper

The paper is published in the 8th Int. Conf. Computing and Information Management ICCM, IEEE Conference proceedings, ICNIT 2012, Proceedings of the 3rd International Conference, Next Generation Information Technology, Seoul, Korea, 2012.