Machine Learning 202

Organizer: Doug Chang

Instructors: Dr. Michael Bowles & Dr. Patricia Hoffman


We will cover several machine learning topics in detail. Among those are Bayesian Belief Networks, Prof. Jerome Friedman's gradient boosting papers, singular value decomposition (SVD), and recommender systems.


For Machine Learning 202, we assume you are familiar with basic statistical concepts and are able to write programs that run different algorithms on public data sets. We assume knowledge at the level of "Introduction to Data Mining" by Tan, Steinbach and Kumar. If you have taken our Machine Learning 101 and 102 classes and Machine Learning 201, you are well prepared for this course.


We expect you to have previously used R, which is available from http://cran.r-project.org/. We will use R for discussing homework problems and comparing different solution approaches. For your review, R references are here: References for R, Reference for R Comments, More R references. To integrate R with Eclipse click here.


Machine Learning 202 Syllabus:  


Week          Topics                                     Homework     Links
1st Week      Collaborative Filtering                    HW01
  8/24/2011   Singular Value Decomposition
  8/25/2011   Recommendation Engines
2nd Week      Gradient Boosting
  8/31/2011   Analysis of AdaBoost                       HW #1 Due
  9/1/2011    Friedman's Stochastic Gradient Boosting
3rd Week      Active Learning & AdaBoost
  9/7/2011    Active Learning                            HW #2 Due    ActiveLearningLinks
  9/8/2011    Learning Theory & AdaBoost
4th Week      EM Method
  9/14/2011   Basic Background                           HW #3 Due    EMLinks
  9/15/2011   EM for Gaussian Mixture Models
5th Week      More EM Applications
  9/21/2011   LDA I
  9/22/2011   Hidden Markov Models


General Sequence of Classes:


Machine Learning 101:   Learn about ML algorithms and implement them in R

     Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Machine Learning 102:  Enable you to read and implement algorithms from current papers

     Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar


Machine Learning 201:    Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models    

     Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

Machine Learning 202:   Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees

     Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman


Machine Learning Big Data:  Adaptation and execution of machine learning algorithms in the MapReduce framework.


Machine Learning Text Processing:  Machine learning applied to natural language text documents using statistical algorithms, including indexing, automatic classification (e.g. spam filtering), part-of-speech identification, topic modeling, and sentiment extraction.


General Calendar for the Year:

Fall 2010:  Basic Machine Learning (Machine Learning 101 &  Machine Learning 102)

Winter  2011:  Machine Learning 101 &  Machine Learning 201

Early Spring 2011:  Machine Learning 102 &  Machine Learning 202


We will be using the following text as a reference for 201 and 202:


"The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.
This book can be read online for free at http://www-stat.stanford.edu/~tibs/ElemStatLearn


There are more Machine Learning References on Patricia's web site http://patriciahoffmanphd.com/


