As the final lab assignment for our Concurrent Programming module, we were asked to optimize matrix multiplication. C++ was set as the language and Linux was set as the environment for this task. The tasks for this lab assignment were as follows:
- Write a naive sequential program to perform matrix multiplication on a n x n matrix containing double precision values. The matrices were to be filled with random double values. Only the time taken for the actual multiplication part was to be recorded.
- Identify a C++ library for Linux environment which supports parallel for loops and write a parallelized version of the above program.
- Running the above 2 versions of matrix multiplication for matrix sizes of n = 100 to n = 1000 in increments of 100 and record the time taken for each of these executions.
- We were then supposed to look in to ways to optimize matrix multiplication taking the processor architecture into account as well and come up with a 3rd version of the matrix multiplication making use of both parallel for loops and various optimization techniques.
The full source code of this project can be found here: https://github.com/pubudu91/mm-optimization
Naive Sequential Version
For this, the standard matrix multiplication algorithm was used. If the matrices are called A, B and C,
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
|for (int i = 0; i < n; i++)|
|for (int j = 0; j < n; j++)|
|for (int k = 0; k < n; k++)|
|C[i][j] += A[i][k] * B[k][j];|