Review of Lecture 17

• Sampling bias
  [Figure: training and testing distributions P(x) plotted against x]

• Occam's Razor
  The simplest model that fits the data is also the most plausible.
  complexity of h ←→ complexity of H
  unlikely event ←→ significant if it happens

• Data snooping
  [Figure: Cumulative Profit % versus Day (0 to 500), comparing snooping with no snooping]

Learning From Data
Yaser S. Abu-Mostafa
California Institute of Technology

Lecture 18: Epilogue

Sponsored by Caltech's Provost Office, E&AS Division, and IST • Thursday, May 31, 2012

Outline

• The map of machine learning
• Bayesian learning
• Aggregation methods
• Acknowledgments

AML Creator: Yaser Abu-Mostafa - LFD Lecture 18

It's a jungle out there

semi-supervised learning, stochastic gradient descent, overfitting, Gaussian processes, distribution-free, collaborative filtering, deterministic noise, linear regression, VC dimension, nonlinear transformation, decision trees, data snooping, sampling bias, Q learning, SVM, learning curves, mixture of experts, neural networks, no free lunch, training versus testing, RBF, noisy targets, Bayesian prior, active learning, linear models, bias-variance tradeoff, weak learners, ordinal regression, logistic regression, data contamination, cross validation, ensemble learning, types of learning, exploration versus exploitation, error measures, is learning feasible?,
clustering, regularization, kernel methods, hidden Markov models, perceptrons, graphical models, soft-order constraint, weight decay, Occam's razor, Boltzmann machines

The map

THEORY: VC, bias-variance, complexity, bayesian

TECHNIQUES
  models: linear, neural networks, SVM, nearest neighbors, RBF, gaussian processes, SVD, graphical models
  methods: regularization, validation, aggregation, input processing

PARADIGMS: supervised, unsupervised, reinforcement, active, online

Bayesian learning

Probabilistic approach

Extend probabilistic role to all components

• $P(\mathcal{D} \mid h = f)$ decides which $h$ (likelihood)
• How about $P(h = f \mid \mathcal{D})$?

[Figure: the learning diagram, with an unknown target distribution $P(y \mid \mathbf{x})$ (target function $f: \mathcal{X} \to \mathcal{Y}$ plus noise), an unknown input distribution $P(\mathbf{x})$, the data set $\mathcal{D} = (\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_N, y_N)$, a learning algorithm $\mathcal{A}$, a hypothesis set $\mathcal{H}$, and the final hypothesis $g: \mathcal{X} \to \mathcal{Y}$ with $g(\mathbf{x}) \approx f(\mathbf{x})$]

The prior

$P(h = f \mid \mathcal{D})$ requires an additional probability distribution:

$$P(h = f \mid \mathcal{D}) = \frac{P(\mathcal{D} \mid h = f)\, P(h = f)}{P(\mathcal{D})} \propto P(\mathcal{D} \mid h = f)\, P(h = f)$$

$P(h = f)$ is the prior
$P(h = f \mid \mathcal{D})$ is the posterior

Given the prior, we have the full distribution

Example of a prior

Consider a perceptron: $h$ is determined by $\mathbf{w} = w_0, w_1, \cdots, w_d$

A possible prior on $\mathbf{w}$: each $w_i$ is independent, uniform over $[-1, 1]$

This determines the prior over $h$: $P(h = f)$

Given $\mathcal{D}$, we can compute $P(\mathcal{D} \mid h = f)$

Putting them together, we get $P(h = f \mid \mathcal{D}) \propto P(h = f)\, P(\mathcal{D} \mid h = f)$

A prior is an assumption

Even the most neutral prior:
  "$x$ is unknown" is replaced by "$x$ is random", with $P(x)$ uniform over $[-1, 1]$

The true equivalent would be:
  "$x$ is unknown" is replaced by "$x$ is random", with $P(x) = \delta(x - a)$ for some $a$ in $[-1, 1]$
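The perceptron example above (an independent uniform prior on each $w_i$, multiplied by the likelihood of the data) can be sketched numerically. This is a minimal illustration under stated assumptions, not the lecture's own procedure: it approximates the posterior by sampling weight vectors from the prior, and it assumes a simple label-noise model in which each label disagrees with $h$ with probability `flip_prob`. The function names, the noise rate, and the toy target are all hypothetical.

```python
import math
import random


def sign(v):
    """Perceptron output in {-1, +1}."""
    return 1 if v > 0 else -1


def perceptron_posterior(data, num_samples=5000, flip_prob=0.1, seed=0):
    """Approximate P(h = f | D) over perceptrons by sampling from the prior.

    data: list of (x, y) pairs, x a tuple of d inputs, y in {-1, +1}.
    flip_prob: ASSUMED label-noise rate (an illustrative choice).
    Returns (weights, posterior): the sampled weight vectors and their
    normalized posterior probabilities.
    """
    rng = random.Random(seed)
    d = len(data[0][0])
    # Prior: each w_i independent, uniform over [-1, 1]; w[0] is the threshold.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(d + 1)]
               for _ in range(num_samples)]
    log_liks = []
    for w in weights:
        agree = sum(
            sign(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) == y
            for x, y in data
        )
        # log P(D | h = f): a label matches h with probability 1 - flip_prob.
        log_liks.append(agree * math.log(1 - flip_prob)
                        + (len(data) - agree) * math.log(flip_prob))
    # The samples come from the prior, so posterior weight is proportional
    # to the likelihood alone.
    top = max(log_liks)
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return weights, [u / total for u in unnorm]


def toy_target(x):
    """Hypothetical target f, used only to generate example data."""
    return sign(0.1 + 0.8 * x[0] - 0.5 * x[1])


rng = random.Random(1)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(30)]
data = [(x, toy_target(x)) for x in points]

weights, posterior = perceptron_posterior(data)
best = max(range(len(weights)), key=posterior.__getitem__)
print("most probable sampled w:", weights[best])
```

Sampling from the prior keeps the sketch short, but it scales poorly with the dimension $d$. It is used here only to make the prior-times-likelihood computation concrete.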
If we knew the prior ...

... we could compute $P(h = f \mid \mathcal{D})$ for every $h \in \mathcal{H}$

$\Longrightarrow$

• we can find the most probable $h$ given the data
• we can derive $\mathbb{E}(h(\mathbf{x}))$ for every $\mathbf{x}$
• we can derive the error bar for every $\mathbf{x}$
• we can derive everything in a principled way

When is Bayesian learning justified?

1. The prior is valid
   trumps all other methods

2. The prior is irrelevant
   just a computational catalyst

Aggregation methods

What is aggregation?

Combining different solutions $h_1, h_2, \cdots, h_T$ that were trained on $\mathcal{D}$:

[Figure: training data fed to several learning algorithms whose outputs are combined]

Regression: take an average
Classification: take a vote

a.k.a. ensemble learning and boosting

Different from 2-layer learning

In a 2-layer model, all units learn jointly:
[Figure: training data feeding a single learning algorithm]

In aggregation, they learn independently then get combined:
[Figure: training data feeding separate learning algorithms]

Two types of aggregation

1. After the fact: combines existing solutions
   Example. Netflix teams merging ("blending")

2. Before the fact: creates solutions to be combined
   Example. Bagging - resampling $\mathcal{D}$
   [Figure: training data resampled and fed to the learning algorithm]

Decorrelation - boosting

Create $h_1, \cdots, h_t, \cdots$ sequentially: make $h_t$ decorrelated with previous $h$'s:

[Figure: training data fed to the learning algorithm, one hypothesis at a time]

Emphasize points in $\mathcal{D}$ that were misclassified

Choose weight of $h_t$ based on $E_{\text{in}}(h_t)$

Blending - after the fact

For regression,

$$h_1, h_2, \cdots, h_T \;\longrightarrow\; g(\mathbf{x}) = \sum_{t=1}^{T} \alpha_t\, h_t(\mathbf{x})$$

Principled choice of $\alpha_t$'s: minimize the error on an "aggregation data set" (pseudo-inverse)

Some $\alpha_t$'s can come out negative

Most valuable $h_t$ in the blend?
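The principled choice of the $\alpha_t$'s can be sketched with a pseudo-inverse on a held-out aggregation data set. The code below is a minimal illustration, not the lecture's implementation. The function names and the synthetic hypotheses are hypothetical. Scoring each $h_t$ by how much the blend error grows when it is left out is one assumed way to probe the "most valuable" question, and for brevity it is measured on the same aggregation set used to fit the weights.

```python
import numpy as np


def blend_weights(H, y):
    """Least-squares alphas for g(x) = sum_t alpha_t * h_t(x).

    H: (N, T) array with H[n, t] = h_t(x_n) on the aggregation data set.
    y: (N,) targets on the same set.
    """
    return np.linalg.pinv(H) @ y


def blend_error(H, y, alpha):
    """Mean squared error of the blend on the aggregation data set."""
    return float(np.mean((H @ alpha - y) ** 2))


def value_in_blend(H, y):
    """Increase in blend error when each h_t is left out (ASSUMED measure)."""
    full = blend_error(H, y, blend_weights(H, y))
    values = []
    for t in range(H.shape[1]):
        H_rest = np.delete(H, t, axis=1)  # blend without h_t
        values.append(blend_error(H_rest, y, blend_weights(H_rest, y)) - full)
    return np.array(values)


# Synthetic example: h_1 and h_2 share most of their error, h_3 does not.
rng = np.random.default_rng(0)
y = rng.normal(size=500)
shared = rng.normal(size=500)
H = np.column_stack([
    y + 0.5 * shared,
    y + 0.5 * shared + 0.05 * rng.normal(size=500),
    y + 0.8 * rng.normal(size=500),
])

alpha = blend_weights(H, y)
print("alphas:", alpha)
print("value of each h_t in the blend:", value_in_blend(H, y))
```

Because the weights are unconstrained least-squares solutions, nothing forces them to be positive or to sum to one.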
Uncorrelated $h_t$'s help the blend

Acknowledgments

Course content

Professor Malik Magdon-Ismail, RPI
Professor Hsuan-Tien Lin, NTU

Course staff

Carlos Gonzalez (Head TA)
Ron Appel
Costis Sideris
Doris Xin

Filming, production, and infrastructure

Leslie Maxfield and the AMT staff
Rich Fagen and the IMSS staff

Caltech support

IST - Mathieu Desbrun
E&AS Division - Ares Rosakis and Mani Chandy
Provost's Office - Ed Stolper and Melany Hunt

Many others

Caltech TA's and staff members
Caltech alumni and Alumni Association
Colleagues all over the world

To the fond memory of Faiza A. Ibrahim
