Doyun-lab

[ML] Naive Bayes



Doyun+ 2021. 6. 21. 21:58

 

Bayes Rule

e = event or evidence / H = Hypothesis

- Likelihood = probability that the evidence occurs, given that the hypothesis is true

- Posterior = probability of the hypothesis after the evidence has been observed

- Prior = probability of the hypothesis before the evidence is observed

- Marginal = probability of the new evidence under all possible hypotheses

 

* Key formula
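In terms of the four quantities defined above, Bayes' rule is usually written as follows. This is the standard textbook form, given here on the assumption that it is the formula this note refers to; the symbol H' (a running hypothesis in the sum) is introduced only for illustration.

```latex
\underbrace{P(H \mid e)}_{\text{posterior}}
  = \frac{\overbrace{P(e \mid H)}^{\text{likelihood}} \;
          \overbrace{P(H)}^{\text{prior}}}
         {\underbrace{P(e)}_{\text{marginal}}},
\qquad
P(e) = \sum_{H'} P(e \mid H')\, P(H')
```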

  • Advantages

- Rules are derived through statistical inference and can be updated flexibly

- Prior knowledge can be integrated

ex) Knowledge about basket size is reflected in the probability of choosing each basket
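A minimal sketch of the basket example, assuming two baskets whose sizes set the prior and a single observation (drawing an apple). All numbers and names (`big_basket`, `small_basket`, `posterior`) are hypothetical and chosen only to show how the prior enters the calculation.

```python
# Hypothetical numbers, chosen only for illustration.
prior = {"big_basket": 0.7, "small_basket": 0.3}       # P(H): the bigger basket is picked more often
likelihood = {"big_basket": 0.2, "small_basket": 0.6}  # P(e | H): chance of drawing an apple


def posterior(prior, likelihood):
    """Return P(H | e) for every hypothesis H via Bayes' rule."""
    joint = {h: prior[h] * likelihood[h] for h in prior}  # P(e | H) * P(H)
    marginal = sum(joint.values())                        # P(e)
    return {h: joint[h] / marginal for h in joint}


post = posterior(prior, likelihood)
for h, p in post.items():
    print(f"P({h} | apple) = {p:.4f}")
```

With these assumed numbers the small basket is still the more likely source of the apple, but the size-based prior pulls the posterior toward the big basket compared with using the likelihood alone.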


  • Disadvantages

- Does not compute an exact value; the result is an inference (an estimate)

- Cannot be computed directly once the number of features grows to two or more

- A probability of 0 can occur

 

<Pop quiz>

Naive Bayes Classifier

- Assumption : the features are independent of each other (given the class)

- Risk that a predicted probability becomes 0 (addressed by smoothing the counts, e.g. Laplace / add-one smoothing)
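The zero-probability risk above is commonly handled with additive (Laplace) smoothing of the frequency counts. The sketch below assumes a single categorical feature within one class; the function name `smoothed_likelihood`, the parameter `alpha`, and the weather values are hypothetical.

```python
from collections import Counter


def smoothed_likelihood(values, vocabulary, alpha=1.0):
    """Estimate P(x | class) from the feature values seen in one class.

    alpha = 0 gives plain frequency counts; alpha = 1 is Laplace (add-one) smoothing.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(vocabulary)
    return {v: (counts[v] + alpha) / total for v in vocabulary}


# Hypothetical training values of one categorical feature within one class.
seen_in_class = ["sunny", "sunny", "rainy"]
vocabulary = ["sunny", "rainy", "cloudy"]  # "cloudy" never occurs in this class

raw = smoothed_likelihood(seen_in_class, vocabulary, alpha=0.0)
smooth = smoothed_likelihood(seen_in_class, vocabulary, alpha=1.0)
print(raw["cloudy"])     # 0.0, which would zero out the whole product of likelihoods
print(smooth["cloudy"])  # (0 + 1) / (3 + 3), small but non-zero
```

Because every value in the vocabulary receives the same pseudo-count, the smoothed estimates still sum to 1.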


Gaussian Naive Bayes Classifier

= So far P(x|class) was computed purely from frequency counts, but we can instead assume that it follows a specific distribution

 

· Gaussian Naive Bayes (GNB) Classifier

 

  • Applying the GNB Classifier

> Top : NB

- Multiply the likelihoods over all features i

- Probability of label k * probability of each feature value given label k

- arg max yk = the yk for which the expression takes its maximum value

> Bottom : GNB

* θijk = N(X_i^new; μik, σik)

- N() = Normal (Gaussian) distribution, not normalization

- Probability of the k-th label * the normal-distribution density of every X under the mean (M) and standard deviation (sd) of the k-th label

- P(Y = yk) = Prior (probability of the label we want to predict)

- The remaining part (the product) = Likelihood
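The two decision rules described above (top: NB, bottom: GNB) can be written as follows. This is the standard form of the two classifiers, reconstructed from the bullet descriptions because the original figure is not available; the notation follows the θ, μik, σik symbols used in this note.

```latex
\begin{align*}
\text{NB:}\quad  Y^{new} &\leftarrow \arg\max_{y_k}\; P(Y = y_k) \prod_i P(X_i^{new} \mid Y = y_k) \\
\text{GNB:}\quad Y^{new} &\leftarrow \arg\max_{y_k}\; P(Y = y_k) \prod_i \mathcal{N}(X_i^{new};\, \mu_{ik}, \sigma_{ik})
\end{align*}
```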

When applying this, if y has N labels, how many values of P(y | Xnew) actually have to be estimated?

- At least N-1 (the probabilities sum to 1, so the last one follows without being computed)

 

During training, how are the mean/variance values estimated?

MLE (maximum likelihood estimation)

- Indicator function δ : returns 1 if the label Y of the j-th example equals the k-th label, otherwise 0

- j : the j-th example of the training data (runs over all examples, 0 to N-1)

- μik : the mean at the intersection of feature index i and label index k, i.e. the mean of feature i among examples with label k

> Top : mean

- Numerator : sum of the feature i values over all examples j whose label is k

- Denominator : sum of the indicator values (1 if the label of the j-th example equals the k-th label, 0 otherwise), i.e. the number of examples with label k

> Bottom : variance

- Numerator : sum of squared deviations, (feature i value of example j with label k) minus (mean of feature i for label k)

- Denominator : sum of the indicator values (1 if the label of the j-th example equals the k-th label, 0 otherwise), i.e. the number of examples with label k
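The mean and variance estimates above, together with the arg max rule, can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names (`fit_gnb`, `log_normal_pdf`, `predict_gnb`) and the toy data are hypothetical, log-probabilities are used to avoid underflow, and zero variances are not handled.

```python
import math


def fit_gnb(X, y):
    """MLE estimates: prior, per-feature mean and variance for each label k."""
    params = {}
    for k in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == k]  # examples where delta(Y^j = y_k) = 1
        n_k = len(rows)                                     # denominator: sum of delta over j
        cols = list(zip(*rows))                             # one tuple of values per feature i
        means = [sum(col) / n_k for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n_k   # MLE divides by n_k, not n_k - 1
                     for col, m in zip(cols, means)]
        params[k] = (n_k / len(y), means, variances)
    return params


def log_normal_pdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_gnb(params, x_new):
    """arg max over y_k of log P(Y = y_k) + sum_i log N(x_i; mu_ik, sigma_ik^2)."""
    def score(k):
        prior, means, variances = params[k]
        return math.log(prior) + sum(
            log_normal_pdf(x, m, v) for x, m, v in zip(x_new, means, variances))
    return max(params, key=score)


# Hypothetical toy data: two features, two labels.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [3.0, 0.5], [3.2, 0.7], [2.8, 0.3]]
y = [0, 0, 0, 1, 1, 1]

params = fit_gnb(X, y)
pred_a = predict_gnb(params, [1.1, 2.1])
pred_b = predict_gnb(params, [3.1, 0.4])
print(pred_a, pred_b)
```

Summing log densities instead of multiplying densities gives the same arg max and is the usual choice when many features are involved.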


 

  • Decision Boundary

- The shape of the boundary changes with the shape of the probability distribution

- Problems with nonlinear patterns such as XOR cannot be solved
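One way to see why XOR defeats Gaussian Naive Bayes is to compute the per-class statistics the model would learn. In the sketch below (helper name `class_stats` is hypothetical) both classes end up with identical per-feature means and variances, and the class priors are equal, so the model assigns the same score to both labels at every point.

```python
from statistics import fmean, pvariance

# The four XOR points: label = x1 XOR x2.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]


def class_stats(X, y, k):
    """Per-feature (mean, MLE variance) of the examples with label k."""
    rows = [x for x, label in zip(X, y) if label == k]
    return [(fmean(col), pvariance(col)) for col in zip(*rows)]


stats0 = class_stats(X, y, 0)
stats1 = class_stats(X, y, 1)
print(stats0)
print(stats1)
print(stats0 == stats1)  # identical statistics: the two classes are indistinguishable to GNB
```

The information that separates the classes lives in the interaction between the two features, which the independence assumption discards.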


 

Bayesian Network

= A probabilistic graphical model that represents a set of random variables and their conditional independencies through a directed acyclic graph (DAG)

- More intuitive than a complex joint distribution for understanding direct dependencies and local distributions

Feature C is influenced by Feature A and Feature B

- Directed edge : Direct Dependence

- No edge : Conditional Independence
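For the three-node example above (A and B both pointing to C), the graph structure corresponds to the factorization below. The second line is the general rule for any Bayesian network, where Pa(X_i) denotes the parents of node X_i in the graph; that notation is introduced here for illustration.

```latex
\begin{align*}
P(A, B, C) &= P(A)\, P(B)\, P(C \mid A, B) \\
P(X_1, \dots, X_n) &= \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Pa}(X_i)\big)
\end{align*}
```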


 

  • The 6 basic rules of Bayesian Networks

 

  • Conditional Independence

- N is independent of G (given R)

- N and G are not independent (given R and D: once D takes a fixed value, N and G influence each other; see Explaining Away)

- Given R, G, and D, N and S are determined, so they are independent

 

* Explaining Away

- S and R, which are otherwise unrelated, compete to explain W

- When W = 1 and R = 1, the probability of S = 0 increases
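The effect can be checked numerically on a small network S -> W <- R. The sketch below assumes the common reading S = sprinkler, R = rain, W = wet grass, and all probability tables are hypothetical values chosen for illustration; the helper names `joint` and `prob_s1` are also made up.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_S = {1: 0.3, 0: 0.7}  # sprinkler on / off
P_R = {1: 0.2, 0: 0.8}  # rain / no rain (independent of S a priori)
P_W1 = {(0, 0): 0.01, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}  # P(W=1 | S, R)


def joint(s, r, w):
    """P(S=s, R=r, W=w) = P(S) P(R) P(W | S, R) for the network S -> W <- R."""
    p_w1 = P_W1[(s, r)]
    return P_S[s] * P_R[r] * (p_w1 if w == 1 else 1 - p_w1)


def prob_s1(**evidence):
    """P(S=1 | evidence), summing the joint over all assignments consistent with it."""
    num = den = 0.0
    for s, r, w in product((0, 1), repeat=3):
        world = {"s": s, "r": r, "w": w}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(s, r, w)
        den += p
        if s == 1:
            num += p
    return num / den


p_prior = prob_s1()
p_wet = prob_s1(w=1)
p_wet_rain = prob_s1(w=1, r=1)
print(f"P(S=1)             = {p_prior:.3f}")
print(f"P(S=1 | W=1)       = {p_wet:.3f}")
print(f"P(S=1 | W=1, R=1)  = {p_wet_rain:.3f}")
```

Observing W = 1 raises the probability of S = 1, but additionally observing R = 1 lowers it again, because rain already accounts for the wet grass; equivalently, the probability of S = 0 goes up.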