- June 11, 2020

COMPSCI 369 THE UNIVERSITY OF AUCKLAND FIRST SEMESTER, 2020 —MID-SEMESTER TEST Version 1 Campus: VIRTUAL COMPUTER SCIENCE Computational Science (Time allowed: ONE hour) NOTE: The test has 15 questions, and it is worth 100 marks in total. This test contributes 10% to your final grade Answer all questions in a single document and upload it as a pdf file to Canvas, like you do for the assignments. This test has been designed as an open book one hour test. However you have a full 24 hours to work on this test. You are allowed to use the full 24 hours and you are allowed to use the provided materials. When a question requests to explain your answer, that means you need to justify how you came up with the solution or why you made a certain choice. Be concise and as clear as possible. The work must be your own. You are not allowed to copy text or code or solutions. We will not accept quotes to be valid answers. So, use your own words when answering the questions. We will run the test through Turnitin. You will be asked to include a sentence which states that the test is your own work. CONTINUED QUESTION SHEET 2 COMPSCI 369 Section A: Integrity Statement Copy down and sign the following statement. For the 24 hour duration of this test, I, (insert your name here), confirm that I will not discuss the content of the test with anyone else. I will not give any assistance to another student taking this test. I will not receive any assistance from any person or tutoring service. I will not post any information about this test on the Internet. [2 marks] Section B: Computational Biology and Numerical Integration The Runge-Kutta (RK4) Algorithm 1. Does RK4 take greater or fewer computations per iteration than the Euler method? [2 marks] 2. Explain why RK4 is preferable to the Euler method. [5 marks] 3. The RK4 algorithm calculates four ‘k-terms’ before combining them together in a weighted aver- age. In your own words, once they are all combined together, what is this value an estimation of? [5 marks] 4. Pretend that you are teaching a friend how RK4 works. The friend has never studied the algorithm before. Write down your detailed explanation of the k2 term. What does it estimate and how is it calculated? [7 marks] 5. Explain why finding a numerical artifact in a simulation is not a good thing. [3 marks] Building a dynamical model Pretend you have recently been hired by a farmer to build a computational model to help her understand her dairy farm. She wants the model to simulate her grassy grazing fields, which are shared by many cows and many sheep. Both cows and sheep eat the grass in these fields, and the number of cows grows proportionally to how much grass they eat. Similarly, the sheep grow proportionally to how much grass they eat. The grass grows at a constant rate. 6. Write down a set of differential equations for modelling this system. [12 marks] 7. List each VARIABLE in your model, and explain what it represents. [5 marks] 8. List each PARAMETER in your model and explain what it represents. [4 marks] 9. Identify two assumptions that are implicit in your model. [6 marks] 10. Draw a sketch of what you imagine a timeseries of this system might look like. Explain the main features of the plot and why [10 marks] CONTINUED QUESTION SHEET 3 COMPSCI 369 Section C: Sequence Alignments 11. You are given the following two alignments generated for the same pair of sequences. Given that the match score is sm, the mismatch score is sn, and the linear gap score is −g, which one aligns better the two sequences and why? Consider also the biological relevance as a criterion in the decision. Is there a better alignment than the two presented? Could you say that the two sequences are homologous? Explain your answers. Alignment 1 Alignment 2 ATG-CCAGG- ATGCCAGG- AT-CC-A-GG AT-CCA-GG [7 marks] 12. What is the minimum length and the maximum possible length of an optimal sequence alignment of two sequences A and B, where A can be any sequence of length 5 and B can be any sequence of length 7, in the case of (a) global alignment (b) local alignment (c) overlap alignment You can assume that the mismatch and gap scores can take only non-positive values, and the match score is always positive. Explain your answer. You can use examples to support your reasoning. [8 marks] 13. The following alignment score matrices were produced with one of the algorithms we discussed in the lectures. F1 : B1 B2 B3 B4 0 −1 −2 −3 −4 A1 −1 1 0 −1 −2 A2 −2 0 0 1 0 A3 −3 −1 −1 1 0 A4 −4 −2 −2 0 0 F2 : B1 B2 B3 B4 B5 B6 0 0 0 0 0 0 0 A1 0 0 0 0 0 0 0 A2 0 1 1 1 1 1 1 A3 0 1 2 2 2 2 2 A4 0 1 2 2 3 3 3 A5 0 1 2 2 3 4 4 A6 0 1 2 2 3 4 5 A7 0 1 2 2 3 4 5 Given that the mismatch and gap scores can take only non-positive values, and the match score is always positive, answer the following questions and explain your answers. CONTINUED QUESTION SHEET 4 COMPSCI 369 (a) What alignment algorithm could have produced the matrix F1? What is the optimal score? How will the optimal alignment look like? (b) What alignment algorithm could have produced the matrix F2? The elements in red colour and bold face trace the optimal path. Write down the alignment for the traced path. Is this the only solution? [10 marks] 14. You work with a biologist who just sequenced the complete genome of a new bacterial organism. She does not know if the sequence is new or already exists, but found an online database of all bacterial genomes sequenced so far. She does not know how to solve the problem computationally. So, she comes to you with the database containing M sequences of length N1 . . . Nm and the new sequence X of length Nx, and asks you to help her find out if the sequence is known. If the sequence is unknown, she wants to know what is the closest bacterial genome to X. (a) Explain how you will solve this problem, be explicit about the type of alignment algorithm you will use and justify your choices. (b) What will be the time complexity of the whole data analysis? Explain your result. [8 marks] 15. Can we solve the multiple alignment problem exactly and efficiently? Explain your answer. [6 marks]