

May 15, 2020

CSCM77 Computer Vision and Deep Learning Coursework

Released: 9 a.m., 13th March, 2020.
Due: 11 a.m., 27th April, 2020.

This coursework specification changed on 17th March, 2020. Please ensure you have fully read the updated sections and understand the new requirements.

1 Coursework Assignment: Pac-Man

1. To be completed by students working individually.
2. Feedback: You will receive feedback after your four-page report has been marked.
3. Learning outcome: The tasks in this assignment are based on your practical work in the lab sessions and your understanding of the theories and methods. Through this coursework, you are expected to demonstrate both practical skills and theoretical knowledge of several computer vision techniques.
4. Unfair practice: This work is to be attempted individually. You may request help from your lecturer, academic tutor and lab tutor, but you may not collaborate with your peers.
5. Submission deadline: The code and your four-page report must be submitted on Blackboard by 11 a.m. on Monday, 27th April, 2020. Late submissions without extenuating circumstances will receive zero marks.
6. You will be assessed on your four-page report. Your code will stand as supporting material to your report.
7. This coursework constitutes 20% of your final mark for this module.

2 Task Description

In this coursework, you are given a set of 3D point-clouds with appearance features (i.e. RGB values). These point-clouds were collected using a Kinect system in our old PhD lab. Several virtual objects are also positioned among those point clouds. Your task is to write a Python programme that can automatically detect those objects from an image and use them as anchors to navigate through the 3D scene. A set of example images that contain those virtual objects is provided.
These example images are used to train a random-forest classifier (or a classifier of another kind explored in the labs) in order to detect the objects. Some Python code is provided to help you complete the task. Code demonstrating how to obtain a 2D image by projecting 3D point-clouds onto the camera image-plane, and how to re-position and rotate the camera, is provided as well.

You will also write a four-page report on your work, which you will submit to Blackboard alongside your code. As of 17th March, 2020, you will be assessed on this report, with your code as supporting material. The report must be structured as an academic paper, with the following sections:

Introduction (10%). Contextualise the machine-learning problem and introduce the task and the hypothesis. Make sure to include a few references to previous work. You should demonstrate an awareness of the research area.

Methodology (50%). Describe the model(s) you trained to undertake the task. Any decisions on hyperparameters must be stated here, including the motivation for your choices where applicable. If a decision is based on experimentation with a number of parameters, then state this.

Results (30%). Describe, compare and contrast the results you obtained with your model(s). Any relationships in the data should be pointed out here. Only the most important conclusions should be mentioned in the text. By using tables and confusion matrices to support the section, you can avoid describing the results in full.

Discussion and Conclusion (10%). Restate the task and the hypothesis or hypotheses concisely. Reiterate the methods used. Describe the outcome of the experiment and the conclusion that you can draw from these results in respect of the hypothesis or hypotheses.

The following materials from lectures are relevant to this task:

1. Camera translation and orientation.
2. Feature descriptors, e.g. histograms of pixel intensity, histograms of oriented gradients, etc.
3. Supervised learning using random forests, convolutional neural networks, region-proposal networks.

All the software and data are available on Blackboard.

2.1 Demo Code

Demo code is provided for orienting the camera view and obtaining an image from the current camera view. The software to generate and visualise the point clouds is also provided. The Python functions that are required to train and test random forests are explained below.

The scikit-learn library for Python provides the sklearn.ensemble.RandomForestClassifier class, which trains an ensemble for classification given some training observations and their labels. Key uses of sklearn.ensemble.RandomForestClassifier include:

• To instantiate a random-forest object:
    classifier = sklearn.ensemble.RandomForestClassifier(n_estimators=n_trees)
• Fit the random forest to the data and labels:
    classifier.fit(data, labels)
• Get predicted class probabilities given data:
    prediction_probability = classifier.predict_proba(data)

The use of these functions will be further explained in the lab sessions.

Random-forest classification is required to detect those artificial objects among the point clouds. Training should be carried out on the provided example images. Detection requires a sliding-window-based evaluation, the same as that used in human detection (HoG, covered in lectures). The sliding window should be the same size as the training images. You can use any features to train the classifier, for example a histogram of pixel values or a histogram of oriented gradients.

Once a virtual object is detected, you need to move your camera to where the virtual object is located in space and start your search for the next one, until all virtual objects are found. In the event that multiple virtual objects are detected in a single view, the nearest virtual object should be selected.
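The classifier calls and the sliding-window evaluation described in Section 2.1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the provided demo code: the helper names (colour_histogram, sliding_windows, detect), the synthetic red "object" patches, the 16x16 window, the stride and the 0.5 threshold are all hypothetical choices made only so that the example runs on its own. In the coursework, the training patches would come from the provided example images and the test image from the camera projection code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def colour_histogram(patch, bins=8):
    """Concatenate one intensity histogram per colour channel of a uint8 patch."""
    features = [
        np.histogram(patch[..., channel], bins=bins, range=(0, 256))[0]
        for channel in range(patch.shape[-1])
    ]
    hist = np.concatenate(features).astype(np.float64)
    return hist / hist.sum()  # normalise the counts to sum to one


def sliding_windows(image, window, stride):
    """Yield (row, col, patch) for every window position inside the image."""
    height, width = image.shape[:2]
    win_h, win_w = window
    for row in range(0, height - win_h + 1, stride):
        for col in range(0, width - win_w + 1, stride):
            yield row, col, image[row:row + win_h, col:col + win_w]


def detect(classifier, image, window, stride=4, threshold=0.5):
    """Return (row, col, probability) for windows scored at or above threshold."""
    positions, features = [], []
    for row, col, patch in sliding_windows(image, window, stride):
        positions.append((row, col))
        features.append(colour_histogram(patch))
    # classes_ is sorted, so column 1 holds the probability of label 1 (object).
    scores = classifier.predict_proba(np.array(features))[:, 1]
    return [(r, c, s) for (r, c), s in zip(positions, scores) if s >= threshold]


# --- Synthetic stand-in data (assumption: real patches come from Blackboard) ---
rng = np.random.default_rng(0)
window = (16, 16)  # the sliding window matches the training-patch size


def make_object_patch():
    """Hypothetical object: strong red channel, dark green and blue."""
    patch = rng.integers(0, 32, size=(*window, 3), dtype=np.uint8)
    patch[..., 0] = rng.integers(200, 256, size=window, dtype=np.uint8)
    return patch


def make_background_patch():
    """Hypothetical background: dim noise in all channels."""
    return rng.integers(0, 120, size=(*window, 3), dtype=np.uint8)


patches = [make_object_patch() for _ in range(50)]
patches += [make_background_patch() for _ in range(50)]
data = np.array([colour_histogram(p) for p in patches])
labels = np.array([1] * 50 + [0] * 50)

n_trees = 20
classifier = RandomForestClassifier(n_estimators=n_trees, random_state=0)
classifier.fit(data, labels)

# Build a test image with one object pasted at row 24, column 32.
image = rng.integers(0, 120, size=(64, 64, 3), dtype=np.uint8)
image[24:40, 32:48] = make_object_patch()

detections = detect(classifier, image, window, stride=8)
for row, col, score in detections:
    print(f"window at ({row}, {col}) scored {score:.2f}")
```

Selecting the nearest object when several windows fire is not shown here, since it depends on the depth information available from the point cloud and the camera code. A histogram of oriented gradients could replace colour_histogram without changing the rest of the sketch.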
2.2 Viva Voce

Following the termination of all face-to-face lectures at Swansea University on Tuesday, 17th March, 2020, there will be no viva voces. You will be assessed remotely based on your four-page report. You must, however, submit your code alongside your report.

2.3 Marking Criteria

Your report will be assessed on its structure, content and presentation. We expect it to read as an academic paper, with the explanation appropriately divided as per the structure described in the Task Description above. You should demonstrate your knowledge of the field, along with any conclusions you can draw from your results. We expect at least the same standard of work as in CSC345, the prerequisite module for CSCM77.

2.4 Assessment

You will be assessed on the four-page report that you submit to Blackboard. You must submit your code alongside your report, as this will act as supporting material. This assignment is worth 20% of the total credit.


Author admin
