Matrix Multiplication Optimization

As the final lab assignment for our Concurrent Programming module, we were asked to optimize matrix multiplication. C++ was set as the language and Linux was set as the environment for this task. The tasks for this lab assignment were as follows:

  • Write a naive sequential program to perform matrix multiplication on a n x n matrix containing double precision values. The matrices were to be filled with random double values. Only the time taken for the actual multiplication part was to be recorded.
  • Identify a C++ library for Linux environment which supports parallel for loops and write a parallelized version of the above program.
  • Running the above 2 versions of matrix multiplication for matrix sizes of n = 100 to n = 1000 in increments of 100 and record the time taken for each of these executions.
  • We were then supposed to look in to ways to optimize matrix multiplication taking the processor architecture into account as well and come up with a 3rd version of the matrix multiplication making use of both parallel for loops and various optimization techniques.

The full source code of this project can be found here:

Naive Sequential Version

For this, the standard matrix multiplication algorithm was used. If the matrices are called A, B and C,

for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
for (int k = 0; k < n; k++)
C[i][j] += A[i][k] * B[k][j];

Continue reading