- May 15, 2020

CITS3402 Assignment 1 2019: Sparse MatricesNicholas PritchardAugust 20191 Details• Due Date: 25th September 2019• Worth: 25%This assignment is designed to provide students with experience implementingsome fundamental mathematical codes. You are encouraged to read furtherthan the provided notes on sparse matrices in order to complete the assignment.Please set aside enough time to complete the assignment as all functionality willrequire time to test, analyse and report upon.Your ability to investigate code-performance is of more interest to us than yourcode’s performance. (We care more about what you do, not your machine).2 DescriptionMatrix algebra underpins many compute intense applications in many fields(most fields of Engineering, Machine Learning, Scientific Modelling etc.); com-puters were originally used to crunch numbers so numbers we shall crunch.Your goal, broadly speaking, is to:• Create a piece of software implementing a variety of sparse matrix opera-tions• Ensure this software conforms to our input/output specification• Performance test your software and comment on any speedup you observe(or lack thereof) in a brief report.13 Sparse Matrix Introduction3.1 What is a MatrixA mathematical matrix is a rectangular array of numbers arranged in rows andcolumns. For simplicity we use the convention of (rows, columns) when givingdimensions. An example of a 3× 2 matrix (read ’three by two’) isA =2 62 −30 1 (1)In many practical cases involving matrix algebra, many elements will containzero as their value. Storing all elements at all times wastes a potentially vastamount of memory in these cases. For a given machine this heavily restricts thesize of problem able to be computed and so sparse matrices were developed tosave space at the cost of more time. Loosely speaking, a matrix is considered’sparse’ if it contains ≈ 10% non-zero elements.The general idea is to save space by storing fewer zero elements (since theygenerally do not matter in any calculation). There are three broadly acceptedsparse matrix representations used:3.2 Coordinate Format (COO)The most intuitive sparse matrix format specifies each non-zero element as atripleAi,j = (i, j, val) (2)For example, the matrix,X =0 0 13 0 20 0 0 (3)becomesX = [(0, 2, 1), (1, 0, 3), (1, 2, 2)] (4)This representation while simple can be cumbersome when wanting to look forall values in a particular row or column.3.3 Compressed Sparse Row Format (CSR)The compressed sparse row format stores a matrix with three arrays:• NNZ: The non-zero values stored in row-major order (left to right, top tobottom)• IA: The number of elements in each row. An extra element IA[0] = 0 isused by convention. This array can be used to index into the NNZ arrayfor each i-th row2• JA: Stores the column index of each non-zero element.For example, the matrix,X =0 0 13 0 20 0 0 (5)becomesNNZ = [1, 3, 2] (6)IA = [0, 1, 3, 3] (7)JA = [2, 0, 2] (8)3.4 Compressed Sparse Column Format (CSC)The compressed sparse column format is very similar to CSR format exceptthe non-zero values are stored in a column-wise ordering, the IA array corre-sponds to the extent of each column and the JA array provides row indices. Forexample, the matrix,X =0 0 13 0 20 0 0 (9)becomesNNZ = [3, 1, 2] (10)IA = [0, 1, 1, 3] (11)JA = [1, 0, 1] (12)34 TasksYour code will be a simple command-line application that will:• Read in up to two dense matrix files• Convert this matrix (these matrices) to a suitable sparse format.• Perform a single matrix algebra routine specified by command-line• Time the whole process and log the results to a file• Cleanup all memory usage and terminateWe elaborate on the tasks required of you (and your code) below:4.1 Reading in FilesDepending on the algebra operation requested one or two matrices will be re-quired. These should be provided by command-line as .in files (see SectionB).4.2 Sparse Matrix RepresentationYou have a choice of using any of the three sparse matrix representations pre-sented in Section 3. There will be pros and cons to each representation depend-ing on what operation is requested; you will need to comment on this in yourreport.4.3 Matrix Algebra RoutinesWe request five different operations to be implemented. They range in difficulty(and are presented in roughly increasing difficulty). Marks are distributionaccordingly.4.3.1 Scalar MultiplicationType: Single matrixCommand-line argument: –sm %f——dFor a matrix A and scalar (float) α compute:Y = α ·A (13)i.e. multiply each element of A by α.The command line argument expects the scalar to be supplied immediately after(either an int or float). You only need to consider a floating point scalar whenusing floating-point matrices.44.3.2 TraceType: Single matrixCommand-line argument: –trThe trace(t) of a matrix A is given as:t =∑i=1,n(Ai,i) (14)i.e. the sum of each diagonal element. Note: the Trace is only well defined forsquare matrices.4.3.3 Matrix AdditionType: Double matrixCommand-line argument: –adFor two matrices A and B (of identical dimensions (n,m) their pair-wise sumis:Yi,j = Ai,j +Bi,j∀i, j|i ≤ n, j ≤ m (15)i.e. add each element of both matrices together.4.3.4 TransposeType: Single matrixCommand-line argument: –tsThe transpose of matrix A (denoted A′) is given by:A′j,i = Ai,j∀i ≤ n, j ≤ m (16)i.e. The rows of A become the columns of A′ and vice-versa for the columns.4.3.5 Matrix MultiplicationType: Double matrixCommand-line argument: –mmNote: Sparse matrix multiplication is a notorious bottle-neck in many codes;achieving speedup will be difficult and that is okay. Focus on implementing asimple, correct solution and working from there.4.4 Timing and LoggingYour code is required to time the following events (in seconds)• The time to load in and convert any matrix files.• Time to execute the requested operation5This information must be logged to a .out file with the a header containing thefollowing information:• Operation requested• File1• File2 (if needed)• Number of threads usedFollowed by the result of your computation (single value or dense-matrix de-pending on the operation).4.5 ReportYour report should be fairly brief but covers the following:• The sparse matrix representation(s) used• For each matrix operation implemented:– Description of the parallelism implemented– Some informal reasoning about expected run-time and scalability– Testing results– Comment on the performance observedWe expect the report to be around six pages (including figures).4.6 RestrictionsWe place a number of restrictions on the type of matrix your solution willencounter and the design of your solution. They are briefly summarised belowas a reminder• Input File format – See App. B for a detailed description of the matrixfile-type we use.• Log-File format – See App. D for an example. Some further clarifications:– Operation requested – Provide the command line argument fed to theprogram as the label– Input Filenames – Provide the command line argument fed to theprogram as the label– Number of threads – Given as an integer– Output filename – Provide the date and time of execution plus yourstudent number(s) .out as the filename6– Computation result – Finally, output the result of the requested op-eration on the provided matrix (matrices). Floating point results willbe tested up to 1e− 6 difference– We provide example output files for all types of operations for refer-ence• Matrix Datatype– Integers– Floats– You will never be required to perform operations involving matricesof different data-types• Command-Line Arguments – See App. C for a detailed description. Weguarantee that we will test your solution with arguments presented in theorder described and will only provide valid command line arguments.• Input matrices – We provide a number of guarantees in our test matrices– All command line arguments provided will be valid– All input files will be well-formed, valid and of a single data-type– The dimensions of tested matrices will range in powers of two in therange [2, 16, 384]– Input matrices are not guaranteed to be square– Input matrices are not guaranteed to be the correct dimensions forthe requested operation (e.g. we may ask for the Trace of a non-square matrix or addition of two matrices differing in dimension(s))7Marking RubricCritera Highly SatisfactorySatisfactory UnsatisfactoryFile Reading (5) Successfully readsin matrix files in the format specified. Implements arguments as specified.Reads in matrix files correctly.Attempts to read matrix files and attempts to implement command line arguments.Fails to read matrix files and does not implement correct command line arguments.Sparse matrix representation (10)Implements suitable sparse matrix formats and coverts from dense to sparse formats correctly.Uses appropriate sparse matrix formats and converts from dense to sparse formats correctly.Attempts to convert from dense to sparse formats but is not always correct.Does not convert from dense to sparse matrix formats correctly.Scalar multiplication (5)Able to multiply both integer and floating point sparse matrices bya floating point scalar.Is able to multiply both integer and floating point maitrices by a floating point scalar correctly.Is able to multiply one type of sparse matrix by one typeof scalar correctly.Is unable to multipy a sparse matrix by a scalar.Trace (5) Computes the trace of both integer and floating point sparse matrices.Can compute the trace of integer and floating point matrices correctly.Computes the trace of one type of sparse matrix correctly. Is unable to compute the trace of a sparse matrix.Matrix addition (10)Correctly adds two sparse matrices together. Exits if not possible.Adds two integer or two floating point matrices together correctly. Exits gracefully if addition is not possible.Correctly adds sparse matrices together. Does not exit gracefully if not possible.Is unable to add two sparse matrices together correctly.Transpose (10) Computes the transpose of integer and floating point sparse matricesComputes the transpose of integer and floating point matrices correctly.Computes the transpose of one type of sparse matrix correctly.In unable to compute the transpose of a sparse matrix correcly. Matrix multiplication (15)Correctly multiplies two matrices together. Exits gracefully if not possibleCorrectly multiplies two integer or two floating point matrices. Exits gracefully if multiplication is not possible.Attempts to multiply two sparse matrices together. Does not exit gracefully if not possible.Is unable to multiply two sparse matrices together.Log file format (10)Output files conform to the assignment specificationOutput files contain the correctheader and contentin all cases.Produces output files containing most information. Some minor formatting issues present.Does not write correct content to output files.Report (30) Addresses all required points and is presented clearly. Testing is thorough and clear.Report addresses all points effectively and is presented clearly. Performance testing is thoroughand meaningful.Report does not address all points required or some minor readiblity issues.Report addresses few points or is very difficult to read.A Some words of adviceWriting, testing and reporting on a piece of threaded software will feel slightlydifferent to your previous projects. Here are some humbly offered tips to getstarted:• Start early – Said with every project but is especially true here. You willneed time to test your final solution• Practice good code management – Write your code in multiple files, com-ment as you go, use a good text editor and a debugger etc. Make life easyfor yourself• On sparse matrices – Start by thinking carefully about how you will man-age your data separate to the matrix functions• On matrix functions – Focus on correctness to begin with. A ’sub-optimal’solution that exists will be more useful to you than a fantastic solutionthat you have not finished. Get something simple working, then make itfast.• On command-line-arguments and log files – Read the specification carefullyand clear up any possible confusions early.• On the report – We are most concerned with your testing method andresults. Again, start early while writing your code.Finally, feel free to ask the lab demonstrators and lecturer for advice if you getstuck.B Dense Matrix File FormatThere exist many sophisticated methods to store matrices in industry. For thesake of simplicity however we will use a plain-text representation of the data.Our format is quite simple• Datatype: ”int” or ”float”• Number of rows: An integer n > 0• Number of columns: An integer m > 0• n×m space-separated integers/floats representing each valueFor example int1.inint441 0 0 0 0 1 0 0 0 0 1 0 0 0 0 110This file represents the identity matrix1 0 0 00 1 0 00 0 1 00 0 0 1 (17)For an example of a float matrix float1.infloat441.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.5 0.0 0.0 0.0 0.0 0.75This file represents the following matrix1 0 0 00 0 0 00 1.0 0.5 00 0 0 0.75 (18)C Command Line Arguments GlossaryTo give direction in how to design your solution we require that your solutionimplements specific command line arguments (CLAs). You are allowed to im-plement more than these but these are a required minimum.• Several operators, determines what matrix operation will be performed(have been discussed previously)– –sc – Scalar multiplication– –tr – Trace– –ad – Addition– –ts – Transpose– –mm – Matrix multiplication• -f %s (%s) depending on the operation requested, one or more matrixfiles will need to be passed– Example: ./mysolution.exe –sc -f matrix1.in– Example: ./mysolution.exe –mm -f matrix1.in matrix2.in• -l Log. If present, results will be logged to fileAgain, you are allowed to add extra command line arguments but for full marks,our specified options must be implemented.11D Log-file FormatIn addition to the specified command line arguments we also have specific re-quirements for the .out files your solution must produce. The following filewould be the result of calling./mysolution.exe –tr -f int1.in -t 4at 11 : 59pm on the 22nd of August.Filename: 21726929_22082019_2359_tr.outtrint1.in44The following file would be the result of calling./mysolution.exe –mm -f float1.in float2.in -t 4at 12 : 00am on the 23rd of August.Filename: 21726929_23082019_0000_mm.outmmfloat1.infloat2.in4float440.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.25 0.0 0.0 0.0 0.0 0.375We have included a number of other example output files for your interest.12