COMP8240 – Applications of Data Science

2025 – Session 2, In person-scheduled-weekday, North Ryde

General Information

Unit convenor and teaching staff
Convenor, Lecturer
Dr Rolf Schwitter
4 RPD, office 359
By appointment.
Lecturer
Prof. Mark Dras
4 RPD, office 208
By appointment.
Credit points
10
Prerequisites
COMP6200 or COMP6210 or Admission to the GradDipRes or GradCertRes
Corequisites
Co-badged status
Unit description

This unit deals with the application of Data Science techniques to the analysis of data in a research context. Topics covered include the management of data and keeping track of intermediate results, small and large scale data processing techniques, scripting experiments, version control for source code and data, the problem of replication of research results, data publication and presentation of results in various forms. Students will complete a significant data analysis project that will use available data sets to address a research question and present results to a well defined target audience.

Learning in this unit enhances student understanding of global challenges identified by the United Nations Sustainable Development Goals (UNSDGs): Industry, Innovation and Infrastructure.

Important Academic Dates

Information about important academic dates, including deadlines for withdrawing from units, is available at https://www.mq.edu.au/study/calendar-of-dates

Learning Outcomes

On successful completion of this unit, you will be able to:

  • ULO1: Define and manage a project involving empirical research.
  • ULO2: Apply a knowledge of programming and/or use of appropriate applications (e.g. for data gathering, curation, cleaning or analysis) in the context of practical work relevant to an empirical research project.
  • ULO3: Articulate clearly a coherent argument in written and oral form to a variety of audiences.
  • ULO4: Apply a knowledge of the principles of ethical conduct of research, including an examination of the role of open access to data and publications.
  • ULO5: Demonstrate best practice in document preparation and management in research.

General Assessment Information

Requirement to Pass this unit

To pass this unit, you must achieve a total mark equal to or greater than 50%.

Assessment Availability Dates

  • Assessment 1: Friday, Week 2
  • Assessment 2: Friday, Week 6
  • Assessment 3: Friday, Week 9

Late Assessment Submission Penalty 

Unless a Special Consideration request has been submitted and approved, a 5% penalty (of the total possible mark of the task) will be applied for each day a written report or presentation assessment is not submitted, up until the 7th day (including weekends). After the 7th day, a grade of ‘0’ will be awarded even if the assessment is submitted. The submission time for all uploaded assessments is 11:55 pm. A 1-hour grace period will be provided to students who experience a technical concern.

For any late submission of time-sensitive tasks, such as scheduled assessments, please apply for Special Consideration.

Assignments where Late Submissions will be accepted:

  • Assignment #1: Yes, Standard Late Penalty applies.
  • Assignment #2: Yes, Standard Late Penalty applies.
  • Assignment #3: Yes, Standard Late Penalty applies.
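
The late-penalty arithmetic described above can be sketched in Python. This is only an illustration, not an official calculator: the function names are hypothetical, and it assumes that any started 24-hour period after the deadline counts as one full late day, which the policy text does not spell out. The optional grace period models the 1-hour allowance for technical concerns.

```python
import math
from datetime import datetime, timedelta

PENALTY_PER_DAY = 5.0  # percent of the total possible mark, per late day
MAX_LATE_DAYS = 7      # after the 7th day, the mark is 0


def late_days(due: datetime, submitted: datetime,
              grace: timedelta = timedelta(0)) -> int:
    """Count late days; any started 24-hour period counts in full (assumption)."""
    if submitted <= due + grace:
        return 0
    return math.ceil((submitted - due) / timedelta(days=1))


def apply_late_penalty(raw_mark: float, max_mark: float, days_late: int) -> float:
    """Deduct 5% of max_mark per late day; award 0 after the 7th day."""
    if days_late <= 0:
        return raw_mark
    if days_late > MAX_LATE_DAYS:
        return 0.0
    penalty = PENALTY_PER_DAY / 100 * max_mark * days_late
    # Flooring at zero is an assumption; the policy does not mention negative marks.
    return max(0.0, raw_mark - penalty)
```

For example, under these assumptions a task marked 16 out of 20 and submitted three days late loses 3 marks (3 × 5% of 20) and is recorded as 13 out of 20.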

Special Consideration

The Special Consideration Policy aims to support students who have been impacted by short-term circumstances or events that are serious, unavoidable and significantly disruptive, and which may affect their performance in assessment. If you experience circumstances or events that affect your ability to complete the assessments in this unit on time, please inform the convenor and submit a Special Consideration request through Service Connect.

Assessment Tasks

Name | Weighting | Hurdle | Due
Project Proposal | 20% | No | Week 5, Friday, 23:55
Project Update Presentation | 40% | No | Week 9, Friday, 23:55
Final Report | 40% | No | Week 13, Friday, 23:55
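
As a rough illustration of how the weightings above combine into the total mark used for the 50% pass requirement, here is a minimal Python sketch. It assumes each task is marked as a percentage (0 to 100) and that the total is the plain weighted sum. The function names are hypothetical, and treating a missing task as 0 is also an assumption.

```python
from typing import Mapping

# Weightings in percent, taken from the assessment table.
WEIGHTS = {
    "Project Proposal": 20,
    "Project Update Presentation": 40,
    "Final Report": 40,
}
PASS_MARK = 50.0  # minimum total mark (percent) required to pass


def total_mark(percent_scores: Mapping[str, float]) -> float:
    """Weighted total in percent; a task with no score counts as 0 (assumption)."""
    return sum(weight * percent_scores.get(name, 0.0)
               for name, weight in WEIGHTS.items()) / 100


def has_passed(percent_scores: Mapping[str, float]) -> bool:
    """True if the weighted total reaches the pass mark."""
    return total_mark(percent_scores) >= PASS_MARK
```

For instance, scores of 70, 60 and 55 on the three tasks give 14 + 24 + 22 = 60, which is above the pass mark.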

Project Proposal

Assessment Type [1]: Plan
Indicative Time on Task [2]: 20 hours
Due: Week 5, Friday, 23:55
Weighting: 20%

 

You will develop a project proposal that introduces the selected work you intend to reproduce. The proposal should explain the data you will use, provide the necessary background, and outline a plan for carrying out the project.

 


On successful completion you will be able to:
  • Define and manage a project involving empirical research.
  • Articulate clearly a coherent argument in written and oral form to a variety of audiences.

Project Update Presentation

Assessment Type [1]: Media presentation
Indicative Time on Task [2]: 40 hours
Due: Week 9, Friday, 23:55
Weighting: 40%

 

This video presentation will provide an update on the project's progress and go into detail on the work achieved (e.g. by including a detailed code walkthrough). The workload for this task encompasses the time spent on data and code preparation, as well as the time spent creating the presentation.

 


On successful completion you will be able to:
  • Define and manage a project involving empirical research.
  • Apply a knowledge of programming and/or use of appropriate applications (e.g. for data gathering, curation, cleaning or analysis) in the context of practical work relevant to an empirical research project.
  • Articulate clearly a coherent argument in written and oral form to a variety of audiences.

Final Report

Assessment Type [1]: Report
Indicative Time on Task [2]: 33 hours
Due: Week 13, Friday, 23:55
Weighting: 40%

 

This report will describe the completed project as a whole: what the goals were, what data was used, how it was processed, and what the results were relative to the goals. It may also include any related programs written as part of the project, etc.

 


On successful completion you will be able to:
  • Define and manage a project involving empirical research.
  • Apply a knowledge of programming and/or use of appropriate applications (e.g. for data gathering, curation, cleaning or analysis) in the context of practical work relevant to an empirical research project.
  • Articulate clearly a coherent argument in written and oral form to a variety of audiences.
  • Apply a knowledge of the principles of ethical conduct of research, including an examination of the role of open access to data and publications.
  • Demonstrate best practice in document preparation and management in research.

[1] If you need help with your assignment, please contact:

  • the academic teaching staff in your unit for guidance in understanding or completing this type of assessment
  • the Writing Centre for academic skills support.

[2] Indicative time-on-task is an estimate of the time required for completion of the assessment task and is subject to individual variation.

Delivery and Resources

CLASSES

Each week consists of a formally designated two-hour lecture and a one-hour practical session, although the lecture session may involve some practical aspects as well. For details of days, times and rooms, consult the University timetables webpage. Lectures start in Week 1 and practicals in Week 2.

REQUIRED AND RECOMMENDED TEXTS AND/OR MATERIALS

There is no set text for the unit.  We will be providing pointers to reading material over the course of the unit. The unit has some parallels with the freely available Software Carpentry course. We'll be using those resources as supplementary ones for the unit.

UNIT WEBPAGE AND TECHNOLOGY USED AND REQUIRED

Web Home Page

The unit will make extensive use of the iLearn course management system, including for delivery of class materials, discussion boards, submission of work and access to marks and feedback. Students should check the iLearn site (https://ilearn.mq.edu.au) regularly for unit updates.

Questions and general queries regarding the content of this unit, its lectures, or its assignments should be posted to the discussion boards on the iLearn site. In particular, any questions which are of interest to all students in this unit should be posted to one of these discussion boards, so that everyone can benefit from the answers. Questions of a private nature should be directed to the unit teaching staff.

Technology Used and Required

The practical work in this unit involves programming in the Python language (http://www.python.org/), which is widely used for the sorts of scripting purposes covered in this unit. Python can be downloaded free of charge for a range of operating systems from the Python website.

Note that as this is a master's unit, there will be some self-directed learning. We do not expect you to know Python before the unit starts, but you will pick up the necessary elements in the first few weeks of the unit; we will give pointers to resources for learning Python, and will include snippets of Python in lecture notes where relevant to computational experiments. We will generally (but not always) use Jupyter Notebooks for Python examples, and will use Google Colab as the environment for running them. (Google Colab is a free environment that can be used for some sorts of data analysis relevant to practical tasks and the major project.) We will also use GitHub and GitHub Codespaces: https://github.com/codespaces.

The unit will also use various other tools, e.g. for data gathering and annotation.

Unit Schedule

The focus of this unit is understanding the notions of open science and reproducible research.  Much work in both academia and industry is driven by the free availability of papers, code and data that allow the replication and extension of existing work. In this unit, your major project will involve getting access to some of these resources, reproducing some existing work with the original data, and then investigating whether e.g. the replication works with new data. To engage fully with these freely available resources, competence with a range of techniques and tools is necessary. 

Below is a tentative schedule. The weekly topics are intended to cover useful techniques and tools for carrying out your data-oriented project, and may change depending upon chosen student projects, etc.  

 

Week 1

Empirical research

Discussion of data-based projects

Week 2

Introduction to cloud computing, virtual machines and Google Colab

Discussion of data-based projects

Week 3

LaTeX and document typesetting

Discussion of data-based projects

Week 4

Version control and the Linux shell

Week 5

Introduction to data gathering and curation

Week 6

Project proposal feedback

Week 7

Data analysis in Python

Week 8

Microblogging and handling messy data

 

RECESS

Week 9

Data annotation and surveys

Week 10

Project update feedback

Week 11

Databases and information extraction

Week 12

Additional topics

Week 13

Final report preparation

 

Policies and Procedures

Macquarie University policies and procedures are accessible from Policy Central (https://policies.mq.edu.au). Students should be aware of the following policies in particular with regard to Learning and Teaching:

Students seeking more policy resources can visit Student Policies (https://students.mq.edu.au/support/study/policies). It is your one-stop-shop for the key policies you need to know about throughout your undergraduate student journey.

To find other policies relating to Teaching and Learning, visit Policy Central (https://policies.mq.edu.au) and use the search tool.

Student Code of Conduct

Macquarie University students have a responsibility to be familiar with the Student Code of Conduct: https://students.mq.edu.au/admin/other-resources/student-conduct

Results

Results published on platforms other than eStudent (e.g. iLearn, Coursera) or released directly by your Unit Convenor are not confirmed, as they are subject to final approval by the University. Once approved, final results will be sent to your student email address and will be made available in eStudent. For more information, visit connect.mq.edu.au or, if you are a Global MBA student, contact globalmba.support@mq.edu.au

Academic Integrity

At Macquarie, we believe academic integrity – honesty, respect, trust, responsibility, fairness and courage – is at the core of learning, teaching and research. We recognise that meeting the expectations required to complete your assessments can be challenging. So, we offer you a range of resources and services to help you reach your potential, including free online writing and maths support, academic skills development and wellbeing consultations.

Student Support

Macquarie University provides a range of support services for students. For details, visit http://students.mq.edu.au/support/

Academic Success

Academic Success provides resources to develop your English language proficiency, academic writing, and communication skills.

The Library provides online and face to face support to help you find and use relevant information resources. 

Student Services and Support

Macquarie University offers a range of Student Support Services including:

Student Enquiries

Got a question? Ask us via the Service Connect Portal, or contact Service Connect.

IT Help

For help with University computer systems and technology, visit http://www.mq.edu.au/about_us/offices_and_units/information_technology/help/

When using the University's IT, you must adhere to the Acceptable Use of IT Resources Policy. The policy applies to all who connect to the MQ network including students.

Changes from Previous Offering

Since 2025, this unit has adopted a three-assessment model. All reproducibility projects are now carried out individually.

Assessment Standards

 

The unit will be graded according to the following general descriptions of the letter grades as specified by Macquarie University.  In the course of the unit, these grade descriptions will be discussed with respect to example projects.

High Distinction (HD, 85-100): provides consistent evidence of deep and critical understanding in relation to the learning outcomes. There is substantial originality and insight in identifying, generating and communicating competing arguments, perspectives or problem solving approaches; critical evaluation of problems, their solutions and their implications; creativity in application as appropriate to the discipline.

In the context of this unit, the project has a good design, and has used some data that is interesting or non-obvious, or has required some effort to obtain or use.  It involves a good analysis of the data, and fairly extensively draws on the techniques and tools presented in the unit and possibly on others discovered independently by the student.  The project consists of a plan, a media presentation and a final report that are unequivocal, persuasive, and essentially free from errors. The final report would be of a standard that could be presented at a conference with little or no polishing.

Distinction (D, 75-84): provides evidence of integration and evaluation of critical ideas, principles and theories, distinctive insight and ability in applying relevant skills and concepts in relation to learning outcomes. There is demonstration of frequent originality in defining and analysing issues or problems and providing solutions; and the use of means of communication appropriate to the discipline and the audience.

In the context of this unit, the project has a good design, and has used some data that is interesting or non-obvious, or has required some effort to obtain or use.  It involves a good analysis of the data, and fairly extensively draws on the techniques and tools presented in the unit. The project consists of a plan, a media presentation and a final report that are explicit, very informative, and mostly free from errors. The final report would be of a standard that could be presented at a conference with some polishing.

Credit (Cr, 65-74): provides evidence of learning that goes beyond replication of content knowledge or skills relevant to the learning outcomes. There is demonstration of substantial understanding of fundamental concepts in the field of study and the ability to apply these concepts in a variety of contexts; convincing argumentation with appropriate coherent justification; communication of ideas fluently and clearly in terms of the conventions of the discipline.

In the context of this unit, the project has a sound design, and demonstrates some thought in the choice of data.  It involves a good analysis of the data, and uses a reasonable number of the techniques and tools presented in the unit. The project consists of a plan, a media presentation and a final report that are clear, informative and mostly free from errors.

Pass (P, 50-64): provides sufficient evidence of the achievement of learning outcomes. There is demonstration of understanding and application of fundamental concepts of the field of study; routine argumentation with acceptable justification; communication of information and ideas adequately in terms of the conventions of the discipline. The learning attainment is considered satisfactory or adequate or competent or capable in relation to the specified outcomes.

In the context of this unit, the project has a satisfactory design and uses some easily accessible data.  It involves a successful, or nearly successful, analysis of data, and shows some familiarity with tools or techniques presented in the unit.  The project consists of a plan, a media presentation and a final report, which must be satisfactory overall. 

Fail (F, 0-49): does not provide evidence of attainment of learning outcomes. There is missing or partial or superficial or faulty understanding and application of the fundamental concepts in the field of study; missing, undeveloped, inappropriate or confusing argumentation; incomplete, confusing or lacking communication of ideas in ways that give little attention to the conventions of the discipline.
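
The numeric bands attached to the letter grades above can be summarised in a short Python sketch. The function name is hypothetical. Because the bands are stated as whole-number ranges, how a fractional total (e.g. 84.5) is rounded is not specified here; this sketch simply compares against the lower bound of each band, which is an assumption.

```python
def letter_grade(total: float) -> str:
    """Map a total mark (0 to 100) to the letter grade bands listed above."""
    if not 0 <= total <= 100:
        raise ValueError("total mark must be between 0 and 100")
    if total >= 85:
        return "HD"  # High Distinction, 85-100
    if total >= 75:
        return "D"   # Distinction, 75-84
    if total >= 65:
        return "Cr"  # Credit, 65-74
    if total >= 50:
        return "P"   # Pass, 50-64
    return "F"       # Fail, 0-49
```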

 

 


Unit information based on version 2025.06 of the Handbook