Outline for Today
1. Welcome
Course Outline
Some Decisions - Two Open Sessions
One will be another lecture - What would people like to cover?
The last will be a larger project - Papers? Class hackathon - Kaggle competition?
2. Background on Ensemble Methods
Combine many "weak learners"
Widely Used
Available in R, Mahout, EigenDog, Salford Systems.
Private versions at Google and Yahoo
3. Binary Trees for Classification and Regression
Strengths_and_Weaknesses_of_Trees.pdf
In-class problems
abaloneTable <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data", header=FALSE, sep=",")
http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data
http://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data
Classification Problems
http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat
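Sketch: how a binary regression tree picks one split. This is a minimal base-R illustration, not the course's own code. The function name best_split and the synthetic data are assumptions made for the example; it assumes one numeric predictor with at least two distinct values. A full tree would repeat the same search recursively inside each resulting group.

```r
# Find the single binary split on one numeric predictor that minimises
# the total squared error of the two resulting groups.
best_split <- function(x, y) {
  xs <- sort(unique(x))
  # Candidate thresholds: midpoints between consecutive distinct values.
  thresholds <- (head(xs, -1) + tail(xs, -1)) / 2
  sse <- sapply(thresholds, function(thr) {
    left <- y[x <= thr]
    right <- y[x > thr]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  i <- which.min(sse)
  list(threshold = thresholds[i],
       sse = sse[i],
       left_mean = mean(y[x <= thresholds[i]]),
       right_mean = mean(y[x > thresholds[i]]))
}

# Synthetic data (hypothetical, not one of the course datasets):
# a step function at x = 0.4 plus a little noise.
set.seed(1)
x <- runif(200)
y <- ifelse(x < 0.4, 1, 3) + rnorm(200, sd = 0.2)

fit <- best_split(x, y)
print(fit$threshold)
```

The same function could be applied to a numeric column of the abalone table once it has been read in, with the target column passed as y.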
4. Strengths and Weaknesses of Trees
5. AdaBoost - Great Leap Forward
In-class problems
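Sketch: discrete AdaBoost with decision stumps as the weak learners. This is a minimal base-R illustration under stated assumptions, not the course's own code: labels are coded as -1 and +1, there is a single numeric feature, and the function names (fit_stump, predict_stump, adaboost, predict_adaboost) and the synthetic data are invented for the example.

```r
# Weighted decision stump: choose the threshold and direction that
# minimise the weighted misclassification error.
fit_stump <- function(x, y, w) {
  xs <- sort(unique(x))
  # A threshold below the minimum gives a constant classifier.
  thresholds <- c(xs[1] - 1, (head(xs, -1) + tail(xs, -1)) / 2)
  best <- list(err = Inf)
  for (thr in thresholds) {
    for (s in c(1, -1)) {
      pred <- ifelse(x > thr, s, -s)
      err <- sum(w[pred != y])
      if (err < best$err) best <- list(threshold = thr, sign = s, err = err)
    }
  }
  best
}

predict_stump <- function(stump, x) {
  ifelse(x > stump$threshold, stump$sign, -stump$sign)
}

adaboost <- function(x, y, rounds = 20) {
  n <- length(y)
  w <- rep(1 / n, n)                      # start with uniform weights
  stumps <- list()
  alphas <- numeric(0)
  for (m in seq_len(rounds)) {
    stump <- fit_stump(x, y, w)
    err <- max(stump$err, 1e-10)          # guard against log(1/0)
    if (err >= 0.5) break                 # no better than chance: stop
    alpha <- 0.5 * log((1 - err) / err)   # vote weight for this stump
    pred <- predict_stump(stump, x)
    w <- w * exp(-alpha * y * pred)       # up-weight the mistakes
    w <- w / sum(w)
    stumps[[length(stumps) + 1]] <- stump
    alphas <- c(alphas, alpha)
  }
  list(stumps = stumps, alphas = alphas)
}

predict_adaboost <- function(model, x) {
  score <- rep(0, length(x))
  for (m in seq_along(model$stumps)) {
    score <- score + model$alphas[m] * predict_stump(model$stumps[[m]], x)
  }
  ifelse(score >= 0, 1, -1)
}

# Synthetic data (hypothetical): positive inside an interval, negative
# outside, which no single stump can represent.
set.seed(1)
x <- runif(200)
y <- ifelse(x > 0.3 & x < 0.7, 1, -1)

model <- adaboost(x, y, rounds = 20)
single_acc <- mean(predict_stump(model$stumps[[1]], x) == y)
boosted_acc <- mean(predict_adaboost(model, x) == y)
print(c(single = single_acc, boosted = boosted_acc))
```

Comparing the two training accuracies shows the point of combining weak learners: each stump alone misses part of the interval, while the weighted vote can cover it.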
Recorded class
Part 1. https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=115988617&rKey=9eab3f0b71473bbd
Part 2. https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=115988627&rKey=dede3186b2774fa9