There will be one two-hour online lecture each week, and one two-hour workshop held either in the computing laboratory or online. The online lecture will be delivered as a live stream or as pre-recorded lecture videos. You are expected to attend both classes, as they provide complementary learning activities each week. In practical classes you will write code and experiment with various data sets; in lectures we will discuss the methods you are learning and how the results of your analysis can be interpreted.
We will refer to the following texts during the semester:
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications, by Laura Igual and Santi Seguí (electronic edition available via MQ Library)
Computational and Inferential Thinking: The Foundations of Data Science, by Ani Adhikari and John DeNero (available on GitBooks)
You will be given readings from these and other sources each week.
Technology Used and Required
We will use Python 3 for data analysis, together with a range of modules such as scikit-learn, pandas and numpy that provide additional features. These can all be installed via the Anaconda Python distribution. We will discuss this environment and the installation process in the first week of classes.
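Once the environment is installed, a quick check along the following lines can confirm that the modules named above are available. This is only a minimal sketch and not an official part of the unit materials: the function name check_environment and the module list are illustrative assumptions, and note that scikit-learn is imported under the name sklearn.

```python
import importlib.util
import sys

# Import names for the modules mentioned above (assumed list for illustration).
# scikit-learn is installed as "scikit-learn" but imported as "sklearn".
REQUIRED_MODULES = ["sklearn", "pandas", "numpy"]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name}: {'installed' if found else 'MISSING'}")
```

If any module is reported as missing, it can usually be added to an Anaconda environment with the conda package manager (for example, conda install scikit-learn).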
We will use Jupyter Notebook to develop and present analysis results. It is included in the full Anaconda distribution.
A major part of the assessment in this unit is based on a project that you will complete in groups. This will allow you to explore the techniques you are learning in class in a real-world data analysis exercise.