How does the optimal choice of classification method relate to the properties of the data? A comparative study of logistic regression, elastic net and boosting applied to classification trees.

Blaise Ngendangenzwa
Jonathan Sundin

Student, Spring 2015
Bachelor's thesis, 15 credits
Statistics C, 30 credits
Logistic regression models the conditional probability P(y | x_1, x_2, ..., x_p). For a binary response the model is

$$p(y_i = 1 \mid x_i) = \frac{e^{\eta_i}}{1 + e^{\eta_i}}, \qquad \eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}.$$

An observation is classified to class 1 when $P(y_i = 1 \mid x_i) > \tfrac{1}{2}$. The coefficients are estimated by maximizing the (scaled) log-likelihood $l(\beta_0, \beta)$,

$$l(\beta_0, \beta) = \frac{1}{N} \sum_{i=1}^{N} \left[ y_i \left( \beta_0 + x_i^{T} \beta \right) - \log\left( 1 + e^{\beta_0 + x_i^{T} \beta} \right) \right].$$
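To make the formulas concrete, the following is a minimal numerical sketch (not taken from the thesis) of the logistic model, the classification rule and the average log-likelihood; the data and coefficient values are simulated purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: N observations, p predictors (illustration only).
N, p = 200, 3
X = rng.normal(size=(N, p))
beta0, beta = -0.5, np.array([1.0, -2.0, 0.5])

# Linear predictor eta_i = beta_0 + sum_j beta_j * x_ij and logistic probability.
eta = beta0 + X @ beta
prob = np.exp(eta) / (1 + np.exp(eta))      # p(y_i = 1 | x_i)
y = rng.binomial(1, prob)                   # simulated 0/1 responses

# Classification rule: predict class 1 when P(y_i = 1 | x_i) > 1/2.
y_hat = (prob > 0.5).astype(int)

# Average log-likelihood l(beta_0, beta) as in the formula above.
loglik = np.mean(y * (beta0 + X @ beta) - np.log(1 + np.exp(beta0 + X @ beta)))
print("share classified as 1:", y_hat.mean(), "average log-likelihood:", round(loglik, 3))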
The elastic net estimates the coefficients by solving

$$\max_{(\beta_0, \beta) \in \mathbb{R}^{p+1}} \left\{ l(\beta_0, \beta) - \lambda P_\alpha(\beta) \right\},$$

where the penalty term is

$$P_\alpha(\beta) = \sum_{j=1}^{p} \left[ \tfrac{1}{2}(1 - \alpha)\beta_j^2 + \alpha |\beta_j| \right].$$
The parameter λ ≥ 0 controls the overall amount of shrinkage: with λ = 0 no penalty is applied and the solution coincides with ordinary logistic regression, while larger values of λ shrink the coefficients more. The parameter α determines the mix between the $l_1$ and $l_2$ penalties: α = 1 gives the pure $l_1$ (lasso) penalty $\lambda \sum_{j=1}^{p} |\beta_j|$, and α = 0 gives the pure $l_2$ (ridge) penalty $\tfrac{1}{2}\lambda \sum_{j=1}^{p} \beta_j^2$. The penalized formulation also makes the method usable when the number of predictors exceeds the number of observations (p > n). Both λ and α are tuning parameters: λ is chosen from a sequence of candidate values, whereas α is fixed in advance. A value of α in (0, 1) combines the two penalties; here α is set to 0.7.
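As a hedged illustration of how such an elastic net penalized logistic regression can be fitted, the sketch below uses scikit-learn; the thesis does not specify any software, and scikit-learn parameterizes the penalty with l1_ratio (corresponding to α) and C (an inverse penalty strength, so a larger λ corresponds to a smaller C). The simulated data are purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical p > n situation (illustration only): 50 observations, 100 predictors.
n, p = 50, 100
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - X[:, 1]))))

# Elastic net penalized logistic regression with alpha = 0.7 (l1_ratio=0.7).
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.7, C=1.0, max_iter=5000)
model.fit(X, y)
print("non-zero coefficients:", np.sum(model.coef_ != 0))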
A classification tree partitions the predictor space through successive binary splits, for example first on X_1 (X_1 ≤ a) and then on X_2 (X_2 ≤ b). With K classes, let $\hat{p}_{sk}$ denote the proportion of observations of class k in node s of the tree T, so that $0 \le \hat{p}_{sk} \le 1$; for a two-class problem k = 1, 2 (K = 2). The impurity of a node is measured by the Gini index

$$G(T) = \sum_{k=1}^{K} \hat{p}_{sk} \left( 1 - \hat{p}_{sk} \right),$$

where

$$\hat{p}_{sk} = \frac{1}{N_s} \sum_{x_i \in R_s} I(y_i = k)$$

and N_s is the number of observations in node s. A candidate split of T into two child nodes T_1 and T_2, containing N_1 and N_2 observations respectively, is evaluated with the weighted Gini index

$$G_{split}(T) = \frac{N_1}{N} G(T_1) + \frac{N_2}{N} G(T_2).$$
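The following small sketch (illustrative labels only, not from the thesis) computes the node Gini index and the weighted Gini index of a single binary split.

import numpy as np

def gini(labels):
    # G(T) = sum_k p_sk * (1 - p_sk), with p_sk the class proportions in the node.
    _, counts = np.unique(labels, return_counts=True)
    prop = counts / counts.sum()
    return float(np.sum(prop * (1 - prop)))

# Hypothetical node with two classes, split into two child nodes (e.g. on X_1 <= a).
y_parent = np.array([0, 0, 0, 1, 1, 1, 1, 1])
left  = y_parent[:5]    # observations sent to child node T_1
right = y_parent[5:]    # observations sent to child node T_2

N, N1, N2 = len(y_parent), len(left), len(right)
g_split = N1 / N * gini(left) + N2 / N * gini(right)
print("G(T):", round(gini(y_parent), 3), "G_split(T):", round(g_split, 3))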
The boosting algorithm used here (with the response coded as $y_i \in \{-1, +1\}$) proceeds as follows:

1. Initialize the observation weights $w_i = 1/N$, $i = 1, 2, \ldots, N$.
2. For m = 1 to M:
   (a) Fit a classifier to the weighted data and obtain the class probability estimate $p_m(x) = \hat{P}_m(y_i = 1 \mid x_i) \in [0, 1]$.
   (b) Set $f_m(x) \leftarrow \tfrac{1}{2} \log \frac{p_m(x)}{1 - p_m(x)} \in \mathbb{R}$.
   (c) Update the weights $w_i \leftarrow w_i \, e^{-y_i f_m(x_i)}$, $i = 1, 2, \ldots, N$, and renormalize so that $\sum_i w_i = 1$.
3. Output the classifier $\mathrm{sign}\left[ \sum_{m=1}^{M} f_m(x) \right]$.

At each iteration m the probability estimate $p_m(x)$, and hence $f_m(x)$, is computed on the reweighted data, so $f_{m+1}(x)$ concentrates on the observations that the previous iterations handled poorly. After M iterations the class probability $P(y = 1 \mid x)$ can be recovered from the sum of the $f_m$:

$$p(x) = P(y_i = 1 \mid x_i) = \frac{e^{\sum_{m=1}^{M} f_m(x)}}{1 + e^{\sum_{m=1}^{M} f_m(x)}}.$$
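A minimal sketch of the boosting loop above, assuming depth-one classification trees (stumps) as the weak learner, the response coded as -1/+1, and simulated data; the variable names and the choice of weak learner are illustrative, not taken from the thesis.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
N, M = 300, 50
X = rng.normal(size=(N, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)       # response coded as -1/+1

w = np.full(N, 1.0 / N)                          # step 1: w_i = 1/N
F = np.zeros(N)                                  # running sum of f_m(x_i)
eps = 1e-6                                       # keeps the log-odds finite

for m in range(M):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)             # fit to the weighted data
    p_m = np.clip(stump.predict_proba(X)[:, 1], eps, 1 - eps)   # P_m(y = 1 | x)
    f_m = 0.5 * np.log(p_m / (1 - p_m))          # f_m(x) = 1/2 * log-odds
    w = w * np.exp(-y * f_m)                     # weight update
    w = w / w.sum()                              # renormalize so sum(w_i) = 1
    F = F + f_m

y_hat = np.sign(F)                               # sign of sum_m f_m(x)
p_hat = np.exp(F) / (1 + np.exp(F))              # probability as in the formula above
print("training accuracy:", np.mean(y_hat == y))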
The number of boosting iterations M, the shrinkage parameter λ and the depth d of the individual trees are the tuning parameters of the method. The shrinkage is usually set to a small value such as 0.01 or 0.001; the smaller λ is, the larger M must be to reach a comparable fit. With d = 1 the trees are stumps and the model is additive with no interactions between predictors, whereas d > 1 allows interactions. In this study λ = 0.01 is used together with tree depths d = 1 and d = 2, with the number of trees M set to 10,000 or 100,000.
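These tuning parameters map directly onto the arguments of common boosting implementations. As an illustration only (the thesis does not state which software was used, and scikit-learn's GradientBoostingClassifier is a gradient-boosting variant rather than exactly the algorithm above), the shrinkage, tree depth and number of trees appear as learning_rate, max_depth and n_estimators.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical data set, for illustration only.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Shrinkage lambda = 0.01, stumps (d = 1); M kept small here so the example runs quickly.
gbm = GradientBoostingClassifier(learning_rate=0.01, max_depth=1,
                                 n_estimators=500, random_state=0)
gbm.fit(X, y)
print("training accuracy:", gbm.score(X, y))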
Which method is preferable is expected to depend on the properties of the data, in particular the relationship between the number of observations n and the number of predictors p; both the case n > p and the case p > n are considered.