Trees


Outline for Today

1.  Welcome

     Course Outline

     Some Decisions - Two Open Sessions

          One will be another lecture - What would people like to cover?

          The last will be a larger project - Papers?  Class hackathon - Kaggle competition?

2.  Background on Ensemble Methods

     Combine many "weak learners"

     Widely Used

     Available in R, Mahout, EigenDog, Salford Systems.

     Private versions at Google and Yahoo    
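
As a rough illustration of why combining many weak learners can help, the sketch below simulates a majority vote. It is not taken from the course materials. It assumes an idealized setting in which every learner is correct with probability 0.6 and the learners err independently of one another; real learners are usually correlated, so the gain in practice is smaller. All names (n_cases, n_learners, p_correct) are hypothetical.

```r
set.seed(1)

n_cases    <- 10000  # number of simulated test cases (assumed)
n_learners <- 101    # odd, so a majority vote cannot tie (assumed)
p_correct  <- 0.6    # accuracy of each individual weak learner (assumed)

# correct[i, j] is TRUE when learner j classifies case i correctly.
# Independence between learners is an idealizing assumption.
correct <- matrix(runif(n_cases * n_learners) < p_correct,
                  nrow = n_cases, ncol = n_learners)

# Accuracy of one weak learner on its own.
single_accuracy <- mean(correct[, 1])

# The majority vote is right whenever more than half the learners are right.
vote_accuracy <- mean(rowSums(correct) > n_learners / 2)

cat("single learner:", single_accuracy, "\n")
cat("majority vote: ", vote_accuracy, "\n")
```

Under these assumptions the vote is far more accurate than any single learner. Bagging and boosting can be read as different ways of producing learners whose errors are not too strongly correlated.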

3.  Binary Trees for Classification and Regression

     trees.pdf

     breimanTh.pdf

     Strengths_and_Weaknesses_of_Trees.pdf

     In-class problems


abaloneTable <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data", header = FALSE, sep = ",")

http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data

http://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data

Concrete_Data-1.csv

 

Classification Problems

http://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data

 

http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat
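
A minimal sketch of fitting binary trees for regression and for classification, assuming the rpart package (a recommended package shipped with standard R distributions) is installed. To stay self-contained it uses small synthetic data rather than the files listed above; the variable names and the step-shaped target are hypothetical choices made for illustration only.

```r
library(rpart)

set.seed(42)
n <- 200
x <- runif(n, 0, 10)

# Regression: a hypothetical step-shaped target plus noise.
# A single split near x = 5 captures most of the structure.
y <- ifelse(x < 5, 1, 3) + rnorm(n, sd = 0.3)
train <- data.frame(x = x, y = y)
reg_tree <- rpart(y ~ x, data = train, method = "anova")
reg_mse  <- mean((predict(reg_tree, train) - train$y)^2)

# Classification: a hypothetical two-class label derived from the same x.
cls_train <- data.frame(x = x, label = factor(ifelse(x < 5, "low", "high")))
cls_tree  <- rpart(label ~ x, data = cls_train, method = "class")
pred_class     <- predict(cls_tree, cls_train, type = "class")
train_accuracy <- mean(pred_class == cls_train$label)

print(reg_tree)  # shows the chosen split points and the leaf means
cat("regression training MSE:", reg_mse, "\n")
cat("classification training accuracy:", train_accuracy, "\n")
```

The same two calls should carry over to the data sets above once they are read into a data frame, with the formula adjusted to the relevant columns. Training error alone says little about generalization, so a held-out set or cross-validation would be the natural next step.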


4.  Strengths and Weaknesses of Trees

5.  AdaBoost - Great Leap Forward

     AdaBoost

     In-class problems

     FriedmanStatViewBoosting.pdf
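
The following is a minimal sketch of discrete AdaBoost with decision stumps as the weak learner, written in base R. It is an illustration under stated assumptions and not the version used in class: the synthetic data (points labeled +1 inside a circle, -1 outside), the number of rounds, and all function names (fit_stump, predict_stump, adaboost, predict_adaboost) are hypothetical.

```r
set.seed(7)
n <- 300
X <- matrix(runif(n * 2, -1, 1), ncol = 2)
y <- ifelse(X[, 1]^2 + X[, 2]^2 < 0.5, 1, -1)  # labels in {-1, +1}

# Exhaustively search for the stump (feature, threshold, sign)
# with the smallest weighted misclassification error.
fit_stump <- function(X, y, w) {
  best <- list(err = Inf)
  for (j in seq_len(ncol(X))) {
    for (thr in unique(X[, j])) {
      for (s in c(1, -1)) {
        pred <- ifelse(X[, j] < thr, s, -s)
        err  <- sum(w[pred != y])
        if (err < best$err) {
          best <- list(err = err, feature = j, thr = thr, sign = s)
        }
      }
    }
  }
  best
}

predict_stump <- function(stump, X) {
  ifelse(X[, stump$feature] < stump$thr, stump$sign, -stump$sign)
}

adaboost <- function(X, y, n_rounds) {
  w      <- rep(1 / nrow(X), nrow(X))  # start with uniform case weights
  stumps <- vector("list", n_rounds)
  alphas <- numeric(n_rounds)
  for (m in seq_len(n_rounds)) {
    stump <- fit_stump(X, y, w)
    err   <- max(stump$err, 1e-10)        # guard against log(1/0)
    alpha <- 0.5 * log((1 - err) / err)   # vote weight of this stump
    pred  <- predict_stump(stump, X)
    w <- w * exp(-alpha * y * pred)       # up-weight misclassified cases
    w <- w / sum(w)
    stumps[[m]] <- stump
    alphas[m]   <- alpha
  }
  list(stumps = stumps, alphas = alphas)
}

# The final classifier is the sign of the alpha-weighted vote of the stumps.
predict_adaboost <- function(model, X) {
  score <- rep(0, nrow(X))
  for (m in seq_along(model$stumps)) {
    score <- score + model$alphas[m] * predict_stump(model$stumps[[m]], X)
  }
  ifelse(score >= 0, 1, -1)
}

model       <- adaboost(X, y, n_rounds = 40)
stump_acc   <- mean(predict_stump(model$stumps[[1]], X) == y)
boosted_acc <- mean(predict_adaboost(model, X) == y)
cat("first stump training accuracy:", stump_acc, "\n")
cat("boosted training accuracy:    ", boosted_acc, "\n")
```

A single axis-aligned stump cannot describe a circular boundary, while the weighted vote of many stumps can approximate it. The accuracies printed here are training accuracies only; a held-out set would be needed to judge generalization.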

 

Recorded class

Part 1. https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=115988617&rKey=9eab3f0b71473bbd

 

Part 2. https://datamining.webex.com/datamining/ldr.php?AT=pb&SP=MC&rID=115988627&rKey=dede3186b2774fa9