STAT8107 – Statistical Learning

2025 – Session 2, Online-scheduled-In person assessment, North Ryde

General Information

Unit convenor and teaching staff
Convenor: Iris Jiang
610, 12 Wally's Walk
By appointment only. See iLearn for details.
Credit points: 10
Prerequisites: STAT6110 or STAT6191 or STAT8310 or (Admission to GradCertResFSE or GradDiptResFSE)
Corequisites: (none listed)
Co-badged status: COMP8107
Unit description

This unit provides a comprehensive exploration of statistical learning, equipping students with both theoretical foundations and practical skills for analysing complex data. With a strong emphasis on the entire predictive modelling workflow, students learn to build, evaluate, and refine models using R, the tidyverse, and the tidy modelling framework. Key statistical learning concepts, such as the bias-variance trade-off and empirical risk minimisation, are thoroughly examined to develop a critical understanding of model performance and generalisation. Students gain hands-on experience in model assessment using resampling techniques and implementing diverse models for regression and classification. The unit also covers advanced non-linear methods, including decision trees, random forests, and boosting, alongside unsupervised learning techniques like principal component analysis, k-means, and hierarchical clustering. Through theoretical insights and real-world case studies, students are equipped to tackle complex data-driven challenges with confidence and rigour.

Learning in this unit enhances student understanding of global challenges identified by the United Nations Sustainable Development Goals (UNSDGs): Industry, Innovation and Infrastructure.

Important Academic Dates

Information about important academic dates, including deadlines for withdrawing from units, is available at https://www.mq.edu.au/study/calendar-of-dates

Learning Outcomes

On successful completion of this unit, you will be able to:

  • ULO1: Demonstrate an understanding of empirical risk minimisation and its applications in regression and classification.
  • ULO2: Apply effective resampling techniques for model development.
  • ULO3: Understand the impact of underfitting and overfitting on model generalisability to unseen data, and analyse strategies for mitigating overfitting through regularised regression and effective hyperparameter tuning.
  • ULO4: Implement advanced ensemble methods, including random forests and boosting, to enhance predictive performance in complex scenarios.
  • ULO5: Apply unsupervised learning techniques such as clustering and dimension reduction to analyse complex datasets and develop insights into underlying patterns and relationships.
  • ULO6: Utilise statistical learning techniques and state-of-the-art software tools to solve real-world problems.

General Assessment Information

Requirements to Pass this Unit

To pass this unit you must:

  • Achieve a total mark equal to or greater than 50%.

Hurdle Assessments

There is no Hurdle Assessment.

Attendance and participation

We strongly encourage all students to actively participate in all learning activities. Regular engagement is crucial for your success in this unit: these activities provide opportunities to deepen your understanding of the material, collaborate with peers, and receive valuable feedback from instructors that will assist you in completing the unit assessments. Your active participation not only enhances your own learning experience but also contributes to a vibrant and dynamic learning environment for everyone.

Late Assessment Submission Penalty

Unless a Special Consideration request has been submitted and approved, a 5% penalty (of the total possible mark) will be applied each day a written assessment is not submitted, up until the 7th day (including weekends). After the 7th day, a grade of 0 will be awarded even if the assessment is submitted. Submission time for all written assessments is set at 11:55 pm. A 1-hour grace period is provided to students who experience a technical concern.
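As a rough illustration of the arithmetic in the penalty rule above, the following Python sketch computes a penalised mark. The function name, the counting of lateness in whole days, and the floor at zero are assumptions made for illustration; the 1-hour grace period and approved Special Consideration are not modelled, and the University's policy remains the authoritative statement of the rule.

```python
def late_penalised_mark(raw_mark, max_mark, days_late):
    """Return the mark after the standard late penalty.

    Assumptions (not stated in the unit guide): days_late is a whole
    number of days (0 means on time), and a mark cannot go below zero.
    """
    if days_late <= 0:
        return float(raw_mark)  # submitted on time: no penalty
    if days_late > 7:
        return 0.0  # after the 7th day a grade of 0 is awarded
    # 5% of the total possible mark for each day late
    penalty = max_mark * 5 * days_late / 100
    return max(0.0, raw_mark - penalty)
```

For example, under these assumptions a submission worth 70 out of 100 that is two days late would lose 10 marks and receive 60.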

For any late submission of time-sensitive tasks, such as scheduled tests/exams, performance assessments/presentations, and/or scheduled practical assessments/labs, students need to submit an application for Special Consideration.

Assessments where Late Submissions will be accepted

In this unit, late submissions will be accepted as follows:

  • Assignment – YES, Standard Late Penalty applies.
  • Case study/analysis – YES, Standard Late Penalty applies.
  • Final Exam – NO, unless Special Consideration is granted.

Special Consideration

The Special Consideration Policy aims to support students who have been impacted by short-term circumstances or events that are serious, unavoidable and significantly disruptive, and which may affect their performance in assessment.

Written Assessments/Quizzes/Tests: If you experience circumstances or events that affect your ability to complete the written assessments in this unit on time, please inform the convenor and submit a Special Consideration request through https://connect.mq.edu.au.

Assessment Tasks

Name | Weighting | Hurdle | Due
Assignment | 25% | No | 05/09/2025
Case study | 35% | No | 17/10/2025
Final Exam | 40% | No | Exam Period
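The weightings above combine into a total mark, which the pass requirement compares against 50%. The Python sketch below shows that arithmetic under the assumption that each task is marked as a percentage; the names WEIGHTS, PASS_MARK, total_mark and has_passed are hypothetical and chosen only for this example, and any rounding of final marks by the University is not modelled.

```python
# Assessment weightings (in percent) taken from the table above.
WEIGHTS = {"Assignment": 25, "Case study": 35, "Final Exam": 40}
PASS_MARK = 50  # total mark required to pass, in percent


def total_mark(percent_scores):
    """Weighted total (0-100) from per-task percentage scores."""
    return sum(WEIGHTS[name] * percent_scores[name] for name in WEIGHTS) / 100


def has_passed(percent_scores):
    """True when the weighted total is at least the pass mark."""
    return total_mark(percent_scores) >= PASS_MARK
```

For instance, scores of 60% on the Assignment, 50% on the Case study and 45% on the Final Exam give 15 + 17.5 + 18 = 50.5, which meets the 50% requirement.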

Assignment

Assessment Type [1]: Quantitative analysis task
Indicative Time on Task [2]: 20 hours
Due: 05/09/2025
Weighting: 25%

Written Report

On successful completion you will be able to:
  • Demonstrate an understanding of empirical risk minimisation and its applications in regression and classification.
  • Apply effective resampling techniques for model development.
  • Understand the impact of underfitting and overfitting on model generalisability to unseen data, and analyse strategies for mitigating overfitting through regularised regression and effective hyperparameter tuning.
  • Utilise statistical learning techniques and state-of-the-art software tools to solve real-world problems.

Case study

Assessment Type [1]: Case study/analysis
Indicative Time on Task [2]: 20 hours
Due: 17/10/2025
Weighting: 35%

An authentic case study where students turn real data into a magazine-style article, focusing on clear communication and storytelling for a non-technical audience.

On successful completion you will be able to:
  • Demonstrate an understanding of empirical risk minimisation and its applications in regression and classification.
  • Apply effective resampling techniques for model development.
  • Understand the impact of underfitting and overfitting on model generalisability to unseen data, and analyse strategies for mitigating overfitting through regularised regression and effective hyperparameter tuning.
  • Implement advanced ensemble methods, including random forests and boosting, to enhance predictive performance in complex scenarios.
  • Apply unsupervised learning techniques such as clustering and dimension reduction to analyse complex datasets and develop insights into underlying patterns and relationships.
  • Utilise statistical learning techniques and state-of-the-art software tools to solve real-world problems.

Final Exam

Assessment Type [1]: Examination
Indicative Time on Task [2]: 2 hours
Due: Exam Period
Weighting: 40%

An invigilated exam will be scheduled in the University exam period.

On successful completion you will be able to:
  • Demonstrate an understanding of empirical risk minimisation and its applications in regression and classification.
  • Apply effective resampling techniques for model development.
  • Understand the impact of underfitting and overfitting on model generalisability to unseen data, and analyse strategies for mitigating overfitting through regularised regression and effective hyperparameter tuning.
  • Implement advanced ensemble methods, including random forests and boosting, to enhance predictive performance in complex scenarios.
  • Apply unsupervised learning techniques such as clustering and dimension reduction to analyse complex datasets and develop insights into underlying patterns and relationships.

[1] If you need help with your assignment, please contact:

  • the academic teaching staff in your unit for guidance in understanding or completing this type of assessment
  • the Writing Centre for academic skills support.

[2] Indicative time-on-task is an estimate of the time required for completion of the assessment task and is subject to individual variation.

Delivery and Resources

Classes

Lectures (beginning in Week 1): There is one two-hour lecture each week.

SGTA classes (beginning in Week 2): Students must register for one one-hour class per week.

The timetable for classes can be found on the University website at: https://publish.mq.edu.au

Enrolment can be managed using eStudent at: https://students.mq.edu.au/support/technology/systems/estudent

Suggested textbooks

The following books are useful as supplementary resources for additional questions and explanations.

  1. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning: With applications in R (2nd ed.). Springer.
  2. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.
  3. Kuhn, M., & Silge, J. (2022). Tidy modeling with R. O’Reilly Media, Inc.

Technology Used and Required

This subject requires the use of the following computer software:

  • R: R is a free statistical software package. Access and installation instructions may be found at: https://www.r-project.org/
  • RStudio: RStudio is an open source tool that is used to manage and present work performed using R. Access and installation instructions may be found at https://rstudio.com/products/rstudio/download/
  • LaTeX: LaTeX is a free mathematical typesetting program. You should use this to help typeset your assignment. Access and installation instructions may be found at: https://www.latex-project.org/get/
  • Quarto: Quarto is an open-source scientific and technical publishing system, installed by default with the latest release of RStudio.

Communication

We will communicate with you via your university email or through announcements on iLearn. Queries to convenors can either be placed on the iLearn discussion forum or sent to your lecturers from your university email address.

COVID Information

For the latest information on the University’s response to COVID-19, please refer to the Coronavirus infection page on the Macquarie website: https://www.mq.edu.au/about/coronavirus-faqs. Remember to check this page regularly in case the information and requirements change during the session. If there are any changes to this unit in relation to COVID, these will be communicated via iLearn.

Unit Schedule

This is a draft schedule and is subject to change.

Week | Topics | Notes
1 | Introduction to Statistical Learning |
2 | Linear methods for Regression |
3 | Resampling and Model Selection |
4 | Variable Selection and Regularisation |
5 | Exploratory Data Analysis & Assignment Q&A |
6 | From Probabilities to Decisions: Logistic Regression & Classification Metrics | Assignment 1 due
7 | LDA: From Linear Decision Rules to Lower Dimensions |
8 | Beyond Linear Boundaries: QDA & Naïve Bayes |
Session 2 Break | |
9 | Beyond Linear Boundaries: QDA & Naïve Bayes (Continued) & Case Study Q&A |
10 | Trees and Tree-based Ensemble Methods | Case Study due
11 | Trees and Tree-based Ensemble Methods (Continued) |
12 | Exploring Unlabeled Data: Principal Component Analysis and Clustering |

Policies and Procedures

Macquarie University policies and procedures are accessible from Policy Central (https://policies.mq.edu.au). Students should be aware of the following policies in particular with regard to Learning and Teaching:

Students seeking more policy resources can visit Student Policies (https://students.mq.edu.au/support/study/policies). It is your one-stop-shop for the key policies you need to know about throughout your undergraduate student journey.

To find other policies relating to Teaching and Learning, visit Policy Central (https://policies.mq.edu.au) and use the search tool.

Student Code of Conduct

Macquarie University students have a responsibility to be familiar with the Student Code of Conduct: https://students.mq.edu.au/admin/other-resources/student-conduct

Results

Results published on platforms other than eStudent (e.g. iLearn, Coursera), or released directly by your Unit Convenor, are not confirmed, as they are subject to final approval by the University. Once approved, final results will be sent to your student email address and will be made available in eStudent. For more information visit connect.mq.edu.au or, if you are a Global MBA student, contact globalmba.support@mq.edu.au

Academic Integrity

At Macquarie, we believe academic integrity – honesty, respect, trust, responsibility, fairness and courage – is at the core of learning, teaching and research. We recognise that meeting the expectations required to complete your assessments can be challenging. So, we offer you a range of resources and services to help you reach your potential, including free online writing and maths support, academic skills development and wellbeing consultations.

Student Support

Macquarie University provides a range of support services for students. For details, visit http://students.mq.edu.au/support/

Academic Success

Academic Success provides resources to develop your English language proficiency, academic writing, and communication skills.

The Library provides online and face to face support to help you find and use relevant information resources. 

Student Services and Support

Macquarie University offers a range of Student Support Services including:

Student Enquiries

Got a question? Ask us via the Service Connect Portal, or contact Service Connect.

IT Help

For help with University computer systems and technology, visit http://www.mq.edu.au/about_us/offices_and_units/information_technology/help/

When using the University's IT, you must adhere to the Acceptable Use of IT Resources Policy. The policy applies to all who connect to the MQ network including students.

Changes from Previous Offering

Student feedback from the previous offering of this unit was very positive, with students appreciating the clarity of assessment requirements, the availability of support, and the overall structure and delivery of the unit. Many also greatly valued the open-ended case study, noting it provided a meaningful opportunity to practise independent analysis and apply statistical reasoning to real-world problems. Although some students initially found the case study challenging, they acknowledged its value in developing skills for stakeholder communication and in navigating uncertainty or ambiguity in data analysis.

Given the strong feedback, no changes have been made to the delivery of the unit for this offering. However, we remain committed to continuously improving the level of support and student engagement, and we welcome further feedback as the session progresses.

 

 


Unit information based on version 2025.07 of the Handbook