Math3346 -- 2009: Course Schedule
Tools and Issues for Data Miners --
Classification, Visualization, and Generalization
Lectures are Thurs 14:00 - 16:00 in Copland G30, and Fri 9:00 - 10:00 in JD G30
Laboratories are Tuesdays 15:00 - 16:00 in JD LG104
Codes: f=Felix; j=John; g=Graham; m=Mayukh; a=Alan; s=Stephen; tba=to be announced
Week 01 20-24 July 2009
Introduction
: Lect 01f - Course Overview
: Lect 02f - Introduction to R
: Lect 03f - Introduction to R
Laboratory 01j - R Basics (Lab exercises 1)
Week 02 27–31 July 2009
Statistical Basics for Data Mining
: Lect 03f - Introduction to R
: Lect 04f - Introduction to R
: Lect 05f - Distributions and Sampling Distributions
Laboratory 02f - R Basics (Lab exercises 1)
Assignment 1j 20 marks
Week 03 03-07 August 2009
Classification and other models - Models and Model Accuracy assessment
: Lect 06j: Aug6 - Population & sample; source & target population, etc.
: Lect 07j: Aug6 - Linear and Other Models; model formulae;
: Lect 08j: Aug7 - Classification models - multi-way tables;
Laboratory 03j: Aug4 - Practice with R (Lab exercises 2)
Week 04 10–14 August 2009
Statistics and Data Mining
: Lect 09j: Aug13 - Training/test, cross-validation, bootstrap I
: Lect 10j: Aug13 - Training/test, cross-validation, bootstrap II
: Lect 11j: Aug14 - Generalizing from models; measurement of accuracy
Laboratory 04j: Aug11 - Informal & Formal Data Exploration
(Lab exercises 3)
Week 05 17-21 August 2009
: Lect 12j: Aug20 - Source/target differences; reject inference, etc.
: Lect 13j: Aug20 - Linear versus non-linear models
: Lect 14j: Aug21 - Variable selection effects
Laboratory 05j: Aug18 – Populations & Samples (Lab exercises 6; cf. also 7 & 8)
Week 06 24-28 August 2009
: Lect 15m: Aug27 - Use and Interpretation of regression coefficients
: Lect 16m: Aug27 – Errors in variables
: Lect 17m: Aug28 – Discriminant Methods & Associated Ordinations
Laboratory 06j: Aug25 - Linear Discriminant Analysis vs Random Forests (Lab exs 12)
Assignment 2j 20 marks
Week 07 31Aug - 4Sept 2009
: Lect 18j: Sept3 – Ordination methods – non-parametric
: Lect 19j: Sept3 – Review of Lectures to date
Data Mining Techniques
: Lect 20g: Sept4 - Data mining issues + tools
Laboratory 07j: Sept1 – Discriminant Methods & Associated Ordinations (Lab exs 13)
Week 08 7-11 September 2009
: Lect 21g: Sept10 - Clustering
: Lect 22g: Sept10 - Association Rules
: Lect 23g: Sept11 – Decision Trees + Deployment
Laboratory 08j: Sept8 – Trees, SVM & random forest discriminants (Lab exs 14)
Week 09 14-18 September 2009
: Lect 24g: Sept17 – Boosting and Random Forests
: Lect 25g: Sept17 - Neural Nets and Support Vector Machines
Special Topics
: Lect 26j: Sept18 – Worked example
Laboratory 09g: Sept15 – Rattle
Assignment 3g 15 marks
Week 10 21-25 September 2009
Practical data analysis
: Lect 27tba: Sept24 - Commentary on Hastie, Tibshirani & Friedman's
: Lect 28tba: Sept24 - "Elements of Statistical Learning"
: Lect 29tba: Sept25 – Support vector machines
Laboratory 10j: Sept22 - Data summary - traps for the unwary
(Lab exercises 5)
TERM BREAK
Week 11 12-16 October 2009
Practical data analysis
: Lect 27j: Oct15 – Overview
: Lect 28j: Oct15 - Worked example?
: Lect 29j: Oct16 - Worked example?
Laboratory 11j: Oct13 - Data analysis - a 'large' data set (Lab exs 16)
Week 12 19-23 October 2009
: Lect 30j: Oct22 – Course review
: Lect 31j: Oct22 – Wrap up and Survey and Feedback
: Lect 32j: Oct23 – To Be Announced
Week 13 26-30 October 2009
Student Presentations: Oct29 & 30 - 30 marks
Commentary on Presentations: 10 marks