
COMP7220 – Data Science and Machine Learning

2023 – Session 1, In person-scheduled-weekday, North Ryde

General Information

Unit convenor and teaching staff
Convenor, Lecturer
Rolf Schwitter
Contact via email
4RPD, room 359
by appointment
Lecturer
Mark Dras
Contact via email
4RPD, room 208
by appointment
Tutor
Matineh Pooshideh
Contact via email
4RPD
by appointment
Credit points
10
Prerequisites
Admission to MRes
Corequisites
Co-badged status
COMP8220
Unit description

This unit begins with conventional machine learning techniques for constructing classifiers and regression models, including widely applicable standard techniques such as Naive Bayes, decision trees, logistic regression and support vector machines (SVMs); in this part, given required prior knowledge of machine learning, we focus on more advanced aspects. We then look in detail at deep learning and other state-of-the-art approaches. We discuss in detail the advantages and disadvantages of each method, in terms of computational requirements, ease of use, and performance, and we study the practical application of these methods in a number of use cases.

Important Academic Dates

Information about important academic dates, including deadlines for withdrawing from units, is available at https://www.mq.edu.au/study/calendar-of-dates

Learning Outcomes

On successful completion of this unit, you will be able to:

  • ULO1: Derive algorithms to solve machine learning problems based on an understanding of how machine learning and data science problems are mathematically formulated and analysed.
  • ULO2: Create machine learning solutions to data science problems by identifying and applying appropriate algorithms and implementations.
  • ULO3: Analyse real-world data science problems, identify which methods are appropriate, organise the data appropriately, apply one or more methods, and evaluate the quality of the solution.
  • ULO4: Evaluate one or more approaches to advanced topics in machine learning and data science and report the findings in oral and written form.

General Assessment Information

Requirement to Pass this Unit

To pass this unit, you must achieve a total mark equal to or greater than 50%.

Late Assessment Submission Penalty 

Unless a Special Consideration request has been submitted and approved, a 5% penalty (of the total possible mark of the task) will be applied for each day a written report or presentation assessment is not submitted, up until the 7th day (including weekends). After the 7th day, a grade of ‘0’ will be awarded even if the assessment is submitted. The submission time for all uploaded assessments is 11:55 pm. A 1-hour grace period will be provided to students who experience a technical concern.
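
As a rough illustration of the penalty arithmetic described above, the following Python sketch computes a mark after the late penalty. The function name `late_mark` and its parameters are hypothetical. The sketch assumes that `days_late` counts whole days late (the guide does not say how part-days are counted), ignores the 1-hour grace period, and assumes a mark cannot fall below zero.

```python
def late_mark(raw_mark, max_mark, days_late):
    """Return the mark after the late penalty.

    Assumptions (not stated in the unit guide): days_late is a whole
    number of days, and the penalised mark is clamped at zero.
    """
    if days_late <= 0:
        # Submitted on time: no penalty.
        return raw_mark
    if days_late > 7:
        # After the 7th day a mark of 0 is awarded.
        return 0.0
    # 5% of the total possible mark for each day late.
    penalty = max_mark * 5 * days_late / 100
    return max(raw_mark - penalty, 0.0)


if __name__ == "__main__":
    # A task marked out of 100, awarded 80, submitted 3 days late.
    print(late_mark(80, 100, 3))
```

Under these assumptions, a task marked 80 out of 100 and submitted three days late would lose 15 marks.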

For any late submission of time-sensitive tasks, such as scheduled tests/exams, performance assessments/presentations, and/or scheduled practical assessments/labs, please apply for Special Consideration.

Assessments where Late Submissions will be accepted/not accepted:

  • Assessed Task #1 (Multiple Choice Test): No, unless Special Consideration is granted.
  • Assessed Task #2: Yes, Standard Late Penalty applies.
  • Assessed Task #3: Yes, Standard Late Penalty applies.
  • Major Project: No, unless Special Consideration is granted.
  • Individual Project: Yes, Standard Late Penalty applies.

Special Consideration

The Special Consideration Policy aims to support students who have been impacted by short-term circumstances or events that are serious, unavoidable and significantly disruptive, and which may affect their performance in assessment. If you experience circumstances or events that affect your ability to complete the assessments in this unit on time, please inform the convenor and submit a Special Consideration request through ask.mq.edu.au.

 

Assessment Tasks

Name                 Weighting  Hurdle  Due
Individual Project   30%        No      Initial: week 6; final: week 13
Practical Exercises  30%        No      Throughout semester (see iLearn)
Major Project        40%        No      Initial: end first week of break; final: week 13
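
To show how these weightings relate to the 50% pass requirement, here is a minimal Python sketch. The names `WEIGHTS`, `total_mark` and `passes` are hypothetical, and the sketch assumes that each task is marked as a percentage and that the marks are combined linearly by weighting. The guide itself states only the weightings and the pass threshold.

```python
# Weightings (in percent) taken from the assessment table.
WEIGHTS = {
    "Individual Project": 30,
    "Practical Exercises": 30,
    "Major Project": 40,
}


def total_mark(scores):
    """Weighted total (0-100), given a percentage score per task.

    Assumes a simple weighted sum; this is an illustration only.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def passes(scores):
    """True if the total mark is equal to or greater than 50%."""
    return total_mark(scores) >= 50


if __name__ == "__main__":
    example = {
        "Individual Project": 60,
        "Practical Exercises": 40,
        "Major Project": 50,
    }
    print(total_mark(example), passes(example))
```

In this example a weak result in one task is offset by the others, since no task is a hurdle.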

Individual Project

Assessment Type [1]: Project
Indicative Time on Task [2]: 25 hours
Due: Initial: week 6; final: week 13
Weighting: 30%

Unlike the Major Project, which uses a defined dataset, in the Individual Project the student will select a dataset from an appropriate domain, and then design and implement a solution to a task on this chosen dataset. The deliverables will be the implementation and a report describing this implementation.


On successful completion you will be able to:
  • Derive algorithms to solve machine learning problems based on an understanding of how machine learning and data science problems are mathematically formulated and analysed.
  • Create machine learning solutions to data science problems by identifying and applying appropriate algorithms and implementations.
  • Analyse real-world data science problems, identify which methods are appropriate, organise the data appropriately, apply one or more methods, and evaluate the quality of the solution.
  • Evaluate one or more approaches to advanced topics in machine learning and data science and report the findings in oral and written form.

Practical Exercises

Assessment Type [1]: Problem set
Indicative Time on Task [2]: 30 hours
Due: Throughout semester (see iLearn)
Weighting: 30%

These will consist of practical exercises set throughout the semester.


On successful completion you will be able to:
  • Derive algorithms to solve machine learning problems based on an understanding of how machine learning and data science problems are mathematically formulated and analysed.
  • Create machine learning solutions to data science problems by identifying and applying appropriate algorithms and implementations.

Major Project

Assessment Type [1]: Project
Indicative Time on Task [2]: 30 hours
Due: Initial: end first week of break; final: week 13
Weighting: 40%

The student will apply knowledge of conventional machine learning and deep learning to design and implement a solution to a (classification or other) task on a defined dataset. The deliverables will be the implementation and a report describing this implementation.


On successful completion you will be able to:
  • Derive algorithms to solve machine learning problems based on an understanding of how machine learning and data science problems are mathematically formulated and analysed.
  • Create machine learning solutions to data science problems by identifying and applying appropriate algorithms and implementations.
  • Analyse real-world data science problems, identify which methods are appropriate, organise the data appropriately, apply one or more methods, and evaluate the quality of the solution.
  • Evaluate one or more approaches to advanced topics in machine learning and data science and report the findings in oral and written form.

[1] If you need help with your assignment, please contact:

  • the academic teaching staff in your unit for guidance in understanding or completing this type of assessment
  • the Writing Centre for academic skills support.

[2] Indicative time-on-task is an estimate of the time required for completion of the assessment task and is subject to individual variation.

Delivery and Resources

Classes

  • Classes: There will be a two-hour lecture each week, plus a small practical class that will focus on working through practical tasks.
  • Textbook: The main textbook for the unit is Aurélien Géron, "Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow" (2nd edition, September 2019). It is available through the MQ library (MQ has an arrangement with the publisher O'Reilly: you can register at O'Reilly using your MQ email and get access to the book there). The book's source code is available from https://github.com/ageron/handson-ml2. A supplementary source for a deeper understanding of the theoretical material is Trevor Hastie, Robert Tibshirani and Jerome Friedman, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" (2009; corrected 12th printing, January 2017). A freely downloadable PDF is available at the first author's webpage.
  • Practical classes start in Week 1.

Background Material

  • The unit requires a sound background in programming, particularly in Python. If you feel you need a refresher on Python (or an introduction from scratch, as long as you're a quick and independent learner), there's a popular tutorial at http://learnpython.org/. This goes all the way from basic programming to the mathematical and data science libraries used with Python, like numpy and pandas. There are also resources on the Python website at python.org, like the Beginner's Guide.
  • For a refresher on linear algebra as it is relevant to machine learning, Jason Brownlee (2018) "Basics of Linear Algebra for Machine Learning" has useful material that's linked to Python data structures.  (The book used to have a freely available pdf, but this seems to have disappeared.  It is published by Machine Learning Mastery.)

Unit Webpage and Technology Used and Required

  • iLearn will be used as the main website for the unit.
  • The programming language for the unit will be Python.  The "conventional" machine learning section will use Python's scikit-learn, and the deep learning section will use TensorFlow and Keras.
  • We'll typically be running these notebooks on Google Colab.

Methods of Communication

  • We will communicate with you via your university email or through announcements on iLearn. Queries to academics can either be placed on the iLearn discussion board or sent to them from your university email address.

COVID Information

  • For the latest information on the University’s response to COVID-19, please refer to the Coronavirus infection page on the Macquarie website: https://www.mq.edu.au/about/coronavirus-faqs. Remember to check this page regularly in case the information and requirements change during semester. If there are any changes to this unit in relation to COVID, these will be communicated via iLearn.

Unit Schedule

Week   Topic                                                            Readings (from Géron)
1      What is Machine Learning?                                        Ch 1
2      Workflow of a Machine Learning Project                           Ch 2
3      Support Vector Machines and Decision Trees                       Ch 3-6
4      Ensemble Learning, Random Forests, and Dimensionality Reduction  Ch 7-8
5      Handling Text Data                                               supplementary notes
6-7    Introduction to Artificial Neural Networks:                      Ch 10-11
         • ANN basics
         • Multi-Layer Perceptrons
         • The TensorFlow and Keras frameworks
8-9    Deep Neural Networks:                                            Ch 11-14, supplementary notes
         • The structure of deep NNs
         • Convolutional NNs
         • Practical issues in training NNs
10     NNs for sequences, and advanced topics:                          Ch 15 and onwards, supplementary notes
         • Recurrent NNs
         • Autoencoders
11-12  Reinforcement Learning                                           supplementary notes
13     Unit review

 

Policies and Procedures

Macquarie University policies and procedures are accessible from Policy Central (https://policies.mq.edu.au). Students should be aware of the following policies in particular with regard to Learning and Teaching:

Students seeking more policy resources can visit Student Policies (https://students.mq.edu.au/support/study/policies). It is your one-stop-shop for the key policies you need to know about throughout your undergraduate student journey.

To find other policies relating to Teaching and Learning, visit Policy Central (https://policies.mq.edu.au) and use the search tool.

Student Code of Conduct

Macquarie University students have a responsibility to be familiar with the Student Code of Conduct: https://students.mq.edu.au/admin/other-resources/student-conduct

Results

Results published on platforms other than eStudent (e.g. iLearn, Coursera) or released directly by your Unit Convenor are not confirmed, as they are subject to final approval by the University. Once approved, final results will be sent to your student email address and will be made available in eStudent. For more information visit ask.mq.edu.au or, if you are a Global MBA student, contact globalmba.support@mq.edu.au.

Academic Integrity

At Macquarie, we believe academic integrity – honesty, respect, trust, responsibility, fairness and courage – is at the core of learning, teaching and research. We recognise that meeting the expectations required to complete your assessments can be challenging. So, we offer you a range of resources and services to help you reach your potential, including free online writing and maths support, academic skills development and wellbeing consultations.

Student Support

Macquarie University provides a range of support services for students. For details, visit http://students.mq.edu.au/support/

The Writing Centre

The Writing Centre provides resources to develop your English language proficiency, academic writing, and communication skills.

The Library provides online and face-to-face support to help you find and use relevant information resources.

Student Services and Support

Macquarie University offers a range of Student Support Services including:

Student Enquiries

Got a question? Ask us via AskMQ, or contact Service Connect.

IT Help

For help with University computer systems and technology, visit http://www.mq.edu.au/about_us/offices_and_units/information_technology/help/

When using the University's IT, you must adhere to the Acceptable Use of IT Resources Policy. The policy applies to all who connect to the MQ network including students.

Changes from Previous Offering

The late submission rule was changed to align with the new Faculty policy.

Changes since First Published

Date Description
10/03/2023 The due dates for two assessments were swapped in the first published version. The correct dates are: Individual Project: Initial: week 6; final: week 13. Practical Exercises: Throughout semester (see iLearn).

Unit information based on version 2023.01R of the Handbook