STAT1379 – Statistical Technologies for Data Science

2025 – Session 2, In person-scheduled-weekday, North Ryde

General Information

Unit convenor and teaching staff
Unit Convenor & Lecturer
Thomas Fung
12WW 626
By appointment. See iLearn for details
Unit Convenor & Lecturer
Iris Jiang
12WW 610
By appointment. See iLearn for details
Credit points
10
Prerequisites
STAT1170 or STAT1371 or STAT1250 or STAT1103 or FOSE1015
Corequisites
Co-badged status
STAT6179 and COMP6179
Unit description

Professional statistical work often involves a blend of statistical modelling, coding, and effective communication of results. This unit introduces R, an industry-standard coding language for statistical computing and graphics, to address a wide spectrum of data science challenges, encompassing tasks like data importation, data wrangling, and visualisation. Through hands-on projects and real-world case studies, students will gain expertise in the end-to-end data science workflow. Students will also acquire proficiency in utilising relational databases and executing SQL queries to leverage the full potential of data for informed decision-making. The unit places a strong emphasis on practicality, with most classes and assessments conducted in a computer lab. Students who have completed this unit will emerge with a versatile skill set that bridges the gap between statistical analysis and data technology, positioning them for success in a data-driven world.

Learning in this unit enhances student understanding of global challenges identified by the United Nations Sustainable Development Goals (UNSDGs): Quality Education; Decent Work and Economic Growth; and Industry, Innovation and Infrastructure.

Important Academic Dates

Information about important academic dates, including deadlines for withdrawing from units, is available at https://www.mq.edu.au/study/calendar-of-dates

Learning Outcomes

On successful completion of this unit, you will be able to:

  • ULO1: Translate statistical thinking into coding syntax and appropriate data structures
  • ULO2: Compose SQL queries to manage and retrieve data from relational databases
  • ULO3: Import, clean, and transform real-world datasets using R
  • ULO4: Develop and implement computational strategies to tackle data challenges
  • ULO5: Generate meaningful visualisations using R and communicate results to diverse audiences
  • ULO6: Apply good practice of statistical project workflow

General Assessment Information

Attendance and participation

We strongly encourage all students to participate actively in all learning activities. Regular engagement is crucial for your success in this unit, as these activities provide opportunities to deepen your understanding of the material, collaborate with peers, and receive valuable feedback from instructors, to assist in completing the unit assessments. Your active participation not only enhances your own learning experience but also contributes to a vibrant and dynamic learning environment for everyone.

Requirements to Pass this Unit

To pass this unit you must:

  • Achieve a total mark equal to or greater than 50%.

Hurdle Assessments

There is no Hurdle Assessment in this unit.

Late Assessment Submission Penalty

Unless a Special Consideration request has been submitted and approved, a 5% penalty (of the total possible mark) will be applied each day a written assessment is not submitted, up until the 7th day (including weekends). After the 7th day, a grade of 0 will be awarded even if the assessment is submitted. Submission time for all written assessments is set at 11:55 pm. A 1-hour grace period is provided to students who experience a technical concern.
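The penalty rule above can be sketched in code. This is an illustrative reading only, not an official calculator: the function name `penalised_mark` is hypothetical, marks are assumed to be percentages of the total possible mark, any part of a day late is assumed to count as a full day, and the 1-hour grace period is assumed to mean no penalty within the first hour after 11:55 pm (the policy itself limits it to students with a technical concern).

```python
import math

DAILY_PENALTY = 5.0  # percentage points of the total possible mark, per day late
MAX_LATE_DAYS = 7    # submissions after the 7th day receive 0
GRACE_HOURS = 1.0    # grace period (assumed here to waive the penalty entirely)


def penalised_mark(raw_mark: float, hours_late: float) -> float:
    """Return the mark (as a percentage) after the late penalty is applied.

    Assumption: part of a day counts as a whole day late.
    """
    if hours_late <= GRACE_HOURS:
        return raw_mark
    days_late = math.ceil(hours_late / 24)
    if days_late > MAX_LATE_DAYS:
        return 0.0
    # The penalty cannot push the mark below zero.
    return max(0.0, raw_mark - DAILY_PENALTY * days_late)


print(penalised_mark(80, 30))  # 30 hours late counts as 2 days: 80 - 10
```

Under these assumptions, a submission that earns 80% but arrives 30 hours late would be recorded as 70%, and the same submission arriving more than seven days late would be recorded as 0.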

For any late submission of time-sensitive tasks, such as scheduled tests/exams, performance assessments/presentations, and/or scheduled practical assessments/labs, students need to submit an application for Special Consideration.

Assessments where late submissions will be accepted:

  • Problem Set 1 – YES, Standard Late Penalty applies;
  • Problem Set 2 – YES, Standard Late Penalty applies;
  • Individual Project – YES, Standard Late Penalty applies.

Special Consideration

The Special Consideration Policy aims to support students who have been impacted by short-term circumstances or events that are serious, unavoidable and significantly disruptive, and which may affect their performance in assessment. If you experience circumstances or events that affect your ability to complete the assessments in this unit on time, please inform the convenor and submit a Special Consideration request through https://connect.mq.edu.au.

Written Assessments/Quizzes/Tests: If you experience circumstances or events that affect your ability to complete the written assessments in this unit on time, please inform the convenor and submit a Special Consideration request through https://connect.mq.edu.au.

Assessment Tasks

Name                Weighting  Hurdle  Due
Problem Set 1       20%        No      29/08/2025
Problem Set 2       30%        No      17/10/2025
Individual Project  50%        No      07/11/2025
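As a rough illustration of how these weightings combine into the total mark needed to pass (50% or greater), the sketch below assumes each task is marked out of 100 and that no rounding or scaling is applied. The names `WEIGHTS`, `total_mark` and `has_passed` are hypothetical, and the marks in the example are invented for demonstration.

```python
# Weightings in percent, taken from the assessment table above.
WEIGHTS = {
    "Problem Set 1": 20,
    "Problem Set 2": 30,
    "Individual Project": 50,
}
PASS_MARK = 50.0  # total mark required to pass the unit


def total_mark(marks: dict) -> float:
    """Weighted total, assuming each task mark is a percentage (0-100)."""
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100


def has_passed(marks: dict) -> bool:
    return total_mark(marks) >= PASS_MARK


# Invented example marks, for illustration only.
example = {"Problem Set 1": 60, "Problem Set 2": 40, "Individual Project": 55}
print(total_mark(example), has_passed(example))
```

Because the Individual Project carries half of the total weight, a weak result on one problem set can be offset by a solid project mark, as in the invented example.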

Problem Set 1

Assessment Type 1: Quantitative analysis task
Indicative Time on Task 2: 10 hours
Due: 29/08/2025
Weighting: 20%

This task will test the ability of students to use data technologies to analyse provided problems.

On successful completion you will be able to:
  • Translate statistical thinking into coding syntax and appropriate data structures
  • Import, clean, and transform real-world datasets using R
  • Develop and implement computational strategies to tackle data challenges
  • Generate meaningful visualisations using R and communicate results to diverse audiences

Problem Set 2

Assessment Type 1: Quantitative analysis task
Indicative Time on Task 2: 10 hours
Due: 17/10/2025
Weighting: 30%

This task will test the ability of students to use data technologies to analyse provided problems.

On successful completion you will be able to:
  • Translate statistical thinking into coding syntax and appropriate data structures
  • Compose SQL queries to manage and retrieve data from relational databases
  • Import, clean, and transform real-world datasets using R
  • Develop and implement computational strategies to tackle data challenges
  • Generate meaningful visualisations using R and communicate results to diverse audiences
  • Apply good practice of statistical project workflow

Individual Project

Assessment Type 1: Project
Indicative Time on Task 2: 40 hours
Due: 07/11/2025
Weighting: 50%

Students will be assigned a few statistical challenges. They will be required to study these problems using appropriate statistical and computational techniques implemented with statistical software and other data technologies.

On successful completion you will be able to:
  • Translate statistical thinking into coding syntax and appropriate data structures
  • Compose SQL queries to manage and retrieve data from relational databases
  • Import, clean, and transform real-world datasets using R
  • Develop and implement computational strategies to tackle data challenges
  • Generate meaningful visualisations using R and communicate results to diverse audiences
  • Apply good practice of statistical project workflow

1 If you need help with your assignment, please contact:

  • the academic teaching staff in your unit for guidance in understanding or completing this type of assessment
  • the Writing Centre for academic skills support.

2 Indicative time-on-task is an estimate of the time required for completion of the assessment task and is subject to individual variation

Delivery and Resources

Classes

Workshops (beginning in Week 1): There is one 1-hour workshop each week. During this hour, we will provide a brief overview of the week's content, run a Q&A session, and engage in some interesting activities together.

Pre-recorded Lectures (released weekly from Week 1): There are no formal live lectures scheduled for this unit. Each week we will have some video recordings covering the unit materials.

SGTA classes (beginning in Week 2): Students must register for and attend one 2-hour class per week.

The timetable for classes can be found on the University website at: https://publish.mq.edu.au

Enrolment can be managed using eStudent at: https://students.mq.edu.au/support/technology/systems/estudent

Suggested textbooks

The following textbooks are useful as supplementary resources, for additional questions and explanations. They are available from the Macquarie University library:

  • Wickham, H., Çetinkaya-Rundel, M. & Grolemund, G. (2023) R for Data Science. 2nd ed. Sebastopol, CA, O’Reilly Media. https://r4ds.hadley.nz
  • Grolemund, G. (2014) Hands-on programming with R. 1st ed. Sebastopol, CA, O’Reilly Media. https://rstudio-education.github.io/hopr/
  • Wickham, H., Navarro, D. & Pedersen, T.L. (2025) ggplot2: Elegant Graphics for Data Analysis. 3rd ed. (work in progress). Springer. https://ggplot2-book.org
  • Wickham, H. & Bryan, J. (2023) R packages. 2nd ed. Sebastopol, CA, O’Reilly Media. https://r-pkgs.org

Technology Used and Required

This subject requires the use of the following computer software:

  • R: R is a free statistical software package. Access and installation instructions may be found at: https://www.r-project.org/
  • RStudio: RStudio is an open-source tool that is used to manage and present work performed using R. Access and installation instructions may be found at https://rstudio.com/products/rstudio/download/
  • LaTeX: LaTeX is a free mathematical typesetting program. Access and installation instructions may be found at: https://www.latex-project.org/get/
  • Git: Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
  • Quarto: An open-source scientific and technical publishing system, installed by default with the latest release of RStudio.

Students are invited to bring their own devices (BYOD) and a laptop is recommended. Acceptable platforms are Windows, Linux and Mac. 

Communication

We will communicate with you via your university email or through announcements on iLearn. Queries to convenors can either be placed on the iLearn discussion forum or sent to your lecturers from your university email address.

COVID Information

For the latest information on the University’s response to COVID-19, please refer to the Coronavirus infection page on the Macquarie website: https://www.mq.edu.au/about/coronavirus-faqs. Remember to check this page regularly in case the information and requirements change during semester. If there are any changes to this unit in relation to COVID, these will be communicated via iLearn.

Unit Schedule

This is the first delivery of this unit, so the following schedule is subject to change.

Week             Topics                                                  Assessment
1                Meet your toolkits: R, RStudio, Git, Quarto
2                The Basics + Flow Control
3                Summarise, tidy and transform data with tidyverse
4                Create elegant data visualisation with ggplot2
5                Iteration with purrr, a functional programming toolkit  Problem Set 1 due
6                Retrieve data from a database with DBI and dbplyr
7                Writing R packages
8                Introduction to Markdown
Session 2 Break
9                Reproducible report with Quarto
10               Quarto Presentation in beamer and revealjs              Problem Set 2 due
11               Storytelling with data
12               Web scraping with rvest
13               Looking beyond ST4DS                                    Project due

Policies and Procedures

Macquarie University policies and procedures are accessible from Policy Central (https://policies.mq.edu.au). Students should be aware of the following policies in particular with regard to Learning and Teaching:

Students seeking more policy resources can visit Student Policies (https://students.mq.edu.au/support/study/policies). It is your one-stop-shop for the key policies you need to know about throughout your undergraduate student journey.

To find other policies relating to Teaching and Learning, visit Policy Central (https://policies.mq.edu.au) and use the search tool.

Student Code of Conduct

Macquarie University students have a responsibility to be familiar with the Student Code of Conduct: https://students.mq.edu.au/admin/other-resources/student-conduct

Results

Results published on platforms other than eStudent (e.g. iLearn, Coursera) or released directly by your Unit Convenor are not confirmed, as they are subject to final approval by the University. Once approved, final results will be sent to your student email address and will be made available in eStudent. For more information visit connect.mq.edu.au or, if you are a Global MBA student, contact globalmba.support@mq.edu.au

Academic Integrity

At Macquarie, we believe academic integrity – honesty, respect, trust, responsibility, fairness and courage – is at the core of learning, teaching and research. We recognise that meeting the expectations required to complete your assessments can be challenging. So, we offer you a range of resources and services to help you reach your potential, including free online writing and maths support, academic skills development and wellbeing consultations.

Student Support

Macquarie University provides a range of support services for students. For details, visit http://students.mq.edu.au/support/

Academic Success

Academic Success provides resources to develop your English language proficiency, academic writing, and communication skills.

The Library provides online and face to face support to help you find and use relevant information resources. 

Student Services and Support

Macquarie University offers a range of Student Support Services including:

Student Enquiries

Got a question? Ask us via the Service Connect Portal, or contact Service Connect.

IT Help

For help with University computer systems and technology, visit http://www.mq.edu.au/about_us/offices_and_units/information_technology/help/

When using the University's IT, you must adhere to the Acceptable Use of IT Resources Policy. The policy applies to all who connect to the MQ network including students.

Changes from Previous Offering

STAT1379 supersedes our previous offering of STAT1378. The main changes are:

  • The unit schedule has been redesigned to allow more time to cover the basics and to reduce the time spent on various typesetting systems. This has created space to include some new topics such as databases, web scraping and storytelling with data.
  • Quarto is now the primary typesetting system we use in this unit. Together with version control using Git & GitHub, these tools are integrated into our weekly activities, rather than being used just once or twice for assessment purposes.
  • An extra hour of workshop time has been added so we can do some engaging activities together as a community.

We highly appreciate student feedback as it helps us enhance our unit offerings continually. Therefore, we encourage students to provide constructive feedback through various channels, such as student surveys, direct communication with teaching staff, or by utilising the FSE Student Experience & Feedback link available on the iLearn page.

Unit information based on version 2025.05 of the Handbook