Unit convenor and teaching staff | Rolf Schwitter: Contact via Email; 4 Research Park Drive, Office 359; By appointment. Diego Molla-Aliod: Contact via Email; 4 Research Park Drive, Office 358; By appointment. |
---|---|
Credit points | 10 |
Prerequisites | 130cp at 1000 level or above including COMP2110 or COMP249 or COMP2200 or COMP257 |
Corequisites | |
Co-badged status | |
Unit description | This unit explores the issues involved in building natural language processing (NLP) applications that operate on large bodies of real text such as are found on the World Wide Web. In this unit we discuss some core methods and tools for dealing with data on the web; in particular, machine learning platforms widely used in industry. The unit also explores some recent developments of the web, such as emerging semantic web technologies and the corresponding standards promoted by the World Wide Web Consortium (W3C). Application areas covered include web search, sentiment analysis, and information extraction. |
Information about important academic dates, including deadlines for withdrawing from units, is available at https://www.mq.edu.au/study/calendar-of-dates
On successful completion of this unit, you will be able to:
The assessment of this unit consists of three assignments and a final exam. You will submit your solutions to the three assignments via iLearn by the due date. The final examination is a closed-book examination and will be taken in person during the exam period.
Late Submission
Late submissions will not be accepted without an approved Special Consideration request. Assessments submitted after the due date will receive a mark of zero.
Supplementary Exam
If you receive Special Consideration for the final exam, a supplementary exam will be scheduled after the normal exam period, following the release of marks. By making a Special Consideration application for the final exam, you are declaring yourself available for a resit during the supplementary examination period and will not be eligible for a second Special Consideration approval based on pre-existing commitments. Please ensure you are familiar with the policy prior to submitting an application. Approved applicants will receive an individual notification one week prior to the exam with the exact date and time of their supplementary examination.
Name | Weighting | Hurdle | Due |
---|---|---|---|
Assignment 1 | 10% | No | Week 3 |
Assignment 2 | 20% | No | 2nd Week of Recess |
Assignment 3 | 20% | No | Week 12 |
Final Exam | 50% | No | Examination Period |
Assignment 1
Assessment Type [1]: Programming Task
Indicative Time on Task [2]: 10 hours
Due: Week 3
Weighting: 10%
In this assignment you will implement a simple document processing application that uses pre-packaged tools.
Assignment 2
Assessment Type [1]: Programming Task
Indicative Time on Task [2]: 20 hours
Due: 2nd Week of Recess
Weighting: 20%
This assignment will use more powerful techniques such as those used in commercial and research applications. You will experience the processing of real text data, which can be messy and unpredictable at times. At the end of the assignment you will submit a report describing the system, its implementation, and its evaluation.
Assignment 3
Assessment Type [1]: Programming Task
Indicative Time on Task [2]: 20 hours
Due: Week 12
Weighting: 20%
In this assignment you will experiment with the integration of Semantic Web technology into document processing. You will be asked to study a particular domain and report on the integration of Semantic Web technologies suitable for the domain, including what sort of SPARQL queries would be applicable to solve specific user needs.
Final Exam
Assessment Type [1]: Examination
Indicative Time on Task [2]: 2 hours
Due: Examination Period
Weighting: 50%
The final exam will focus on the theoretical aspects of the unit. There will be few questions about implementation issues.
[1] If you need help with your assignment, please contact:
[2] Indicative time-on-task is an estimate of the time required for completion of the assessment task and is subject to individual variation.
Most of the contents of the unit will be based on the following two books:
Dan Jurafsky and James H. Martin (2021), Speech and Language Processing (3rd ed. draft), Dec 29, 2021. Available online.
Additional material will be made available during the semester, in conjunction with the lecture notes. See the unit schedule for a listing of the most relevant reading for each week.
The following software is used in COMP3220:
This software is installed in the labs; you should also ensure that you have working copies of all the above on your own machine. Note that many packages come in various versions; to avoid potential incompatibilities, you should install versions as close as possible to those used in the labs.
Note that most of the unit materials are publicly available, while some require you to log in to iLearn to access them.
The unit will make extensive use of discussion boards hosted within iLearn. Please post your questions there; the boards are monitored by the unit staff.
Week | Topic | Reading |
---|---|---|
1 | Python for Text Processing | |
2 | Information Retrieval | |
3 | Text Classification | |
4 | Deep Learning for Text | Chollet, Ch. 2 & 3 |
5 | Processing Text Sequences | Chollet, Ch. 6 |
6 | Advanced Use of Deep Learning for Text | See lecture notes |
7 | Semantic Technologies | |
Recess | | |
8 | RDF, RDF Schema and SPARQL | |
9 | DBpedia and Wikidata | |
10 | | |
11 | Rule Languages | |
12 | Recent Trends in Semantic Technologies | See lecture notes |
13 | Revision | |
Macquarie University policies and procedures are accessible from Policy Central (https://policies.mq.edu.au). Students should be aware of the following policies in particular with regard to Learning and Teaching:
Students seeking more policy resources can visit Student Policies (https://students.mq.edu.au/support/study/policies). It is your one-stop-shop for the key policies you need to know about throughout your undergraduate student journey.
To find other policies relating to Teaching and Learning, visit Policy Central (https://policies.mq.edu.au) and use the search tool.
Macquarie University students have a responsibility to be familiar with the Student Code of Conduct: https://students.mq.edu.au/admin/other-resources/student-conduct
Results published on platforms other than eStudent (e.g. iLearn, Coursera), or released directly by your Unit Convenor, are not confirmed, as they are subject to final approval by the University. Once approved, final results will be sent to your student email address and will be made available in eStudent. For more information visit ask.mq.edu.au or, if you are a Global MBA student, contact globalmba.support@mq.edu.au
At Macquarie, we believe academic integrity – honesty, respect, trust, responsibility, fairness and courage – is at the core of learning, teaching and research. We recognise that meeting the expectations required to complete your assessments can be challenging. So, we offer you a range of resources and services to help you reach your potential, including free online writing and maths support, academic skills development and wellbeing consultations.
Macquarie University provides a range of support services for students. For details, visit http://students.mq.edu.au/support/
The Writing Centre provides resources to develop your English language proficiency, academic writing, and communication skills.
The Library provides online and face-to-face support to help you find and use relevant information resources.
Macquarie University offers a range of Student Support Services including:
Got a question? Ask us via AskMQ, or contact Service Connect.
For help with University computer systems and technology, visit http://www.mq.edu.au/about_us/offices_and_units/information_technology/help/.
When using the University's IT, you must adhere to the Acceptable Use of IT Resources Policy. The policy applies to all who connect to the MQ network including students.
COMP3220 will be assessed and graded according to the University assessment and grading policies.
The following general standards of achievement will be used to assess each of the assessment tasks with respect to the letter grades.
Grade | Range | Description |
---|---|---|
HD | 85-100 | Provides consistent evidence of deep and critical understanding in relation to the learning outcomes. There is substantial originality, insight or creativity in identifying, generating and communicating competing arguments, perspectives or problem solving approaches; critical evaluation of problems, their solutions and their implications; creativity in application as appropriate to the course/program. |
D | 75-84 | Provides evidence of integration and evaluation of critical ideas, principles and theories, distinctive insight and ability in applying relevant skills and concepts in relation to learning outcomes. There is demonstration of frequent originality or creativity in defining and analysing issues or problems and providing solutions; and the use of means of communication appropriate to the course/program and the audience. |
CR | 65-74 | Provides evidence of learning that goes beyond replication of content knowledge or skills relevant to the learning outcomes. There is demonstration of substantial understanding of fundamental concepts in the field of study and the ability to apply these concepts in a variety of contexts; convincing argumentation with appropriate coherent justification; communication of ideas fluently and clearly in terms of the conventions of the course/program. |
P | 50-64 | Provides sufficient evidence of the achievement of learning outcomes. There is demonstration of understanding and application of fundamental concepts of the course/program; routine argumentation with acceptable justification; communication of information and ideas adequately in terms of the conventions of the course/program. The learning attainment is considered satisfactory or adequate or competent or capable in relation to the specified outcomes. |
F | 0-49 | Does not provide evidence of attainment of learning outcomes. There is missing or partial or superficial or faulty understanding and application of the fundamental concepts in the field of study; missing, undeveloped, inappropriate or confusing argumentation; incomplete, confusing or lacking communication of ideas in ways that give little attention to the conventions of the course/program. |
Assessment Process
These assessment standards will be used to give a numeric mark to each assessment submission during marking. The mark will correspond to an appropriate letter grade when relevantly weighted. The final mark for the unit will be calculated by combining the marks for all assessment tasks according to the percentage weightings shown in the assessment summary.
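As an illustration of the calculation described above, the following minimal sketch combines per-task marks using the weightings from the assessment summary and looks up the matching band in the grade table. The marks in `example_marks` are hypothetical, the function names are invented for this example, and the sketch assumes each task is marked out of 100 with no rounding or moderation applied; the University's actual process may differ.

```python
# Weightings (in percent) taken from the assessment summary table.
WEIGHTS = {
    "Assignment 1": 10,
    "Assignment 2": 20,
    "Assignment 3": 20,
    "Final Exam": 50,
}

# Lower bound of each grade band, taken from the grade table.
GRADE_BANDS = [(85, "HD"), (75, "D"), (65, "CR"), (50, "P"), (0, "F")]


def final_mark(marks):
    """Combine per-task marks (assumed to be out of 100) by their weightings."""
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100


def letter_grade(mark):
    """Return the letter grade whose band contains the given mark."""
    for lower_bound, grade in GRADE_BANDS:
        if mark >= lower_bound:
            return grade
    raise ValueError("mark must not be negative")


# Hypothetical marks, for illustration only.
example_marks = {
    "Assignment 1": 80,
    "Assignment 2": 70,
    "Assignment 3": 90,
    "Final Exam": 60,
}

overall = final_mark(example_marks)
print(overall, letter_grade(overall))
```

With these hypothetical marks the weighted contributions are 8 + 14 + 18 + 30, giving a final mark of 70, which falls in the CR band.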
Unit information based on version 2022.02 of the Handbook