
week4Notes

Page history last edited by hoffman.tricia@gmail.com 15 years, 2 months ago

We had a request for coverage of several topics, including support vector machines (SVMs) and fraud detection. We'll run through several flavors of SVM and how to compute them, and in doing so we'll cover their use in fraud (or outlier) detection. Here are the references we'll use along the way.

 

Notes from Stanford class CS229, covering the basic SVM:  13-svm.pdf

 

Microsoft tech report on using an SVM for single-class classification (essentially, estimating the support of an unknown density function from samples).  You can think of this as finding the region of feature space where all the normal cases (non-fraud, non-outlier) live.  The strength of this approach is that it can be trained on the huge volumes of normal data that are typically available, whereas fraud and outlier detection usually offer relatively few examples of the cases we want to identify.  OneClassSVM-tr-99-87.pdf
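
To make the one-class idea concrete, here is a minimal sketch in plain Python. It is not the class's oneClassSVM.R and not code from the tech report. It assumes a Gaussian (RBF) kernel and solves the usual nu-parameterized one-class dual with simple pairwise (SMO-style) updates. The names fit_one_class and decision, the parameters nu, gamma and tol, and the synthetic "normal" data are all assumptions made for illustration.

```python
import math
import random


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_one_class(points, nu=0.1, gamma=0.5, tol=1e-5, max_iter=100_000):
    """Illustrative one-class SVM trainer (hypothetical helper, not a library API).

    Solves:  minimize 0.5 * alpha^T K alpha
             subject to 0 <= alpha_i <= 1/(nu*n) and sum(alpha) = 1
    Returns (alpha, rho), where rho is the decision threshold.
    """
    if not 0.0 < nu < 1.0:
        raise ValueError("nu must lie strictly between 0 and 1")
    n = len(points)
    cap = 1.0 / (nu * n)          # upper bound on each alpha_i
    eps = 1e-12                   # slack for floating-point bound checks
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    alpha = [1.0 / n] * n         # feasible start: sums to 1, below the cap
    grad = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]

    def most_violating_pair():
        # i can still grow and has the smallest gradient;
        # j can still shrink and has the largest gradient.
        i = min((k for k in range(n) if alpha[k] < cap - eps), key=grad.__getitem__)
        j = max((k for k in range(n) if alpha[k] > eps), key=grad.__getitem__)
        return i, j

    for _ in range(max_iter):
        i, j = most_violating_pair()
        gap = grad[j] - grad[i]
        if gap < tol:             # optimality conditions hold to within tol
            break
        eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
        if eta <= eps:            # duplicate points; no useful step available
            break
        # Move weight from j to i, clipped so both stay inside [0, cap].
        step = min(gap / eta, cap - alpha[i], alpha[j])
        alpha[i] += step
        alpha[j] -= step
        for k in range(n):
            grad[k] += step * (K[k][i] - K[k][j])

    i, j = most_violating_pair()
    rho = 0.5 * (grad[i] + grad[j])
    return alpha, rho


def decision(x, points, alpha, rho, gamma=0.5):
    """Signed score: non-negative means x falls inside the estimated support."""
    return sum(a * rbf(p, x, gamma) for a, p in zip(alpha, points) if a > 0.0) - rho


# Synthetic "normal" cases: one 2-D Gaussian cloud (made-up data).
random.seed(0)
normal = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(60)]
alpha, rho = fit_one_class(normal, nu=0.1, gamma=0.5)

flagged = sum(1 for p in normal if decision(p, normal, alpha, rho) < 0)
print("training points flagged as outliers:", flagged, "of", len(normal))
print("score of a far-away point:", decision((10.0, 10.0), normal, alpha, rho))
```

Only normal cases are used for training; a new case is flagged when its score is negative. The parameter nu roughly caps the fraction of training points that end up outside the estimated support, so it plays the role of an expected outlier rate.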

 

Here are some more references on fast algorithms for SVM training:

fastKernelClassifiers-bordes-1.pdf

fastScalableLocalKernelMachine-segata.pdf

multiclassSVM-LaRank-bordes.pdf

p1737-bordes.pdf

stochasticSubGradient-shalev-shwartz.pdf

 

Here are a couple of papers evaluating the relative performance of different supervised learning algorithms (including SVM).

Modest-sized data sets:  performanceCompSupervisedLearning-caruana.pdf

High-dimension data sets:  perfEvalSupervisedLearningHighDim-caruana.pdf

 

Here is the R code for the examples from class:

svmExamp.R

oneClassSVM.R

 

A simple paper on SVM: http://patriciahoffmanphd.com/resources/papers/SVM/SupportVectorJune3.pdf
