
[Deep Learning from Scratch 1] Chapter 7 (Convolutional Neural Networks)

by ์ œ๋ฃฝ 2023. 7. 8.

  • CNN (convolutional neural network)
  • Used in many areas such as image recognition and speech recognition.
7-1) Overall Structure
  • Before: fully-connected layers (= Affine layers)
  • CNN: Conv and Pooling layers are added, so 'Affine - ReLU' changes to 'Conv - ReLU - (Pooling)'.
7-2) Convolutional Layer
  • ์ž…์ฒด์ ์ธ ๋ฐ์ดํ„ฐ๊ฐ€ ํ๋ฅธ๋‹ค๋Š” ์ฐจ์ด์ ์ด ์žˆ์Œ
7-2-1) Problems with Fully-Connected Layers
  • ๋ฐ์ดํ„ฐ์˜ ํ˜•์ƒ์ด ๋ฌด์‹œ๋จ
  • ex) ๊ธฐ์กด: 3์ฐจ์› ๋ฐ์ดํ„ฐ๋ฅผ 1์ฐจ์› ๋ฐ์ดํ„ฐ๋กœ ๋ฐ”๊ฟ”์„œ ๊ณ„์‚ฐํ–‡์—ˆ์Œ
  • ์ด๋ฏธ์ง€์˜ ๊ฒฝ์šฐ, 3์ฐจ์›์ด๊ธฐ์— 1์ฐจ์›์œผ๋กœ ๋ฐ”๊ฟ”๋ฒ„๋ฆฌ๋ฉด ๊ทธ์— ๋‹ด๊ธด ์ •๋ณด๋“ค์ด ์‚ฌ๋ผ์ ธ๋ฒ„๋ฆผ
  • CNN์˜ ์ž…์ถœ๋ ฅ ๋ฐ์ดํ„ฐ: Feature Map(ํŠน์ง•๋งต)์ด๋ผ๊ณ  ํ•จ - ์ž…๋ ฅ ํŠน์ง• ๋งต/ ์ถœ๋ ฅ ํŠน์ง• ๋งต
7-2-2) The Convolution Operation
  • ์ž…๋ ฅ ๋ฐ์ดํ„ฐ - ํ•„ํ„ฐ(์ปค๋„) - ์ถœ๋ ฅ
  • ์œˆ๋„์šฐ: ์ผ์ • ๊ฐ„๊ฒฉ์œผ๋กœ ์ด๋™ํ•ด๊ฐ€๋ฉฐ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์— ์ ์šฉ (์—ฌ๊ธฐ์„œ๋Š” ์œˆ๋„์šฐ๊ฐ€ 3x3)
  • ์ž…๋ ฅ๊ณผ ํ•„ํ„ฐ์—์„œ ๋Œ€์‘ํ•˜๋Š” ์›์†Œ๋“ค๋ผ๋ฆฌ ๊ณฑํ•ด์„œ ์ดํ•ฉ ๊ตฌํ•จ ⇒ ๋‹จ์ผ ๊ณฑ์…ˆ ๋ˆ„์‚ฐ
  • ํŽธํ–ฅ : ํ•„ํ„ฐ ์ ์šฉํ•œ ํ›„ ์ถœ๋ ฅ๊ฐ’(2x2)์— ํŽธํ–ฅ์„ ๊ฐ๊ฐ ๋”ํ•ด์คŒ ex) ํŽธํ–ฅ:3, [18, 19, 9, 18]
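A minimal sketch of this operation, using a hypothetical helper conv2d_single (illustrative only, not the book's code) and assuming a single channel, stride 1, and no padding:

```python
import numpy as np

def conv2d_single(x, w, b):
    """Naive single-channel convolution (stride 1, no padding) plus bias."""
    H, W = x.shape
    FH, FW = w.shape
    out = np.zeros((H - FH + 1, W - FW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # multiply corresponding elements and sum: multiply-accumulate
            out[i, j] = np.sum(x[i:i+FH, j:j+FW] * w)
    return out + b  # the bias is added to every output element

x = np.arange(16).reshape(4, 4).astype(float)
w = np.ones((3, 3))
print(conv2d_single(x, w, b=3))  # (2, 2) output, every value shifted by +3
```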
7-2-3) Padding
  • Padding: used to adjust the output size.
  • Why use it?: applying a (3,3) filter to a (4,4) input shrinks the output to (2,2) ⇒ repeating convolutions this way keeps shrinking the output until its size reaches 1 → no further convolution is possible.
  • Therefore, padding lets us keep the input data's size so it can be passed on to the next layer unchanged. (Quick check below.)
7-2-4) Stride

์ŠคํŠธ๋ผ์ด๋“œ: ํ•„ํ„ฐ๋ฅผ ์ ์šฉํ•˜๋Š” ์œ„์น˜์˜ ๊ฐ„๊ฒฉ

  • ์ŠคํŠธ๋ผ์ด๋“œ๊ฐ€ ์ปค์ง€๋ฉด ์ถœ๋ ฅ ํฌ๊ธฐ๋Š” ์ž‘์•„์ง ( ๊ทธ๋งŒํผ ์ด๋™ํ•˜๋Š”๊ฒŒ ์ปค์ง€๋ฏ€๋กœ ์นธ ์ˆ˜๋„ ์ž‘์•„์ง
  • ํŒจ๋”ฉ์„ ํฌ๊ฒŒ ํ•˜๋ฉด ์ถœ๋ ฅ ํฌ๊ธฐ๊ฐ€ ์ปค์ง
  • ์ž…๋ ฅ ํฌ๊ธฐ(H,W), ํ•„ํ„ฐ ํฌ๊ธฐ(FH,FW), ์ถœ๋ ฅ ํฌ๊ธฐ(OH, OW), ํŒจ๋”ฉ(P), ์ŠคํŠธ๋ผ์ด๋“œ(S)
7-2-5) Convolution over 3D Data
  • ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์™€ ํ•„ํ„ฐ์˜ ํ•ฉ์„ฑ๊ณฑ ์—ฐ์‚ฐ์€ ์ฑ„๋„๋งˆ๋‹ค ์ˆ˜ํ–‰ → ๊ฒฐ๊ณผ๋ฅผ ๋”ํ•ด ํ•˜๋‚˜์˜ ์ถœ๋ ฅ์„ ์–ป์Œ
  • ์ฑ„๋„: ์—ฌ๊ธฐ์„  3๊ฐœ ( ํ•„ํ„ฐ์˜ ์ฑ„๋„ ์ˆ˜์™€ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ์ฑ„๋„ ์ˆ˜๋Š” ๊ฐ™๊ฒŒ ์„ค์ •)
7-2-6) Thinking in Blocks
  • ์ง์œก๋ฉด์ฒด๋กœ ์ƒ๊ฐํ•˜๋ฉด ๋จ
  • 3์ฐจ์› ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐฐ์—ด๋กœ ๋‚˜ํƒ€๋‚ผ ๋•Œ๋Š” ์ฑ„๋„, ๋†’์ด, ๋„ˆ๋น„ ์ˆœ์„œ (C, H, W) / ํ•„ํ„ฐ (C, FH, FW)
  • ํ•„ํ„ฐ๊ฐ€ ์—ฌ๋Ÿฌ๊ฐœ → ์ถœ๋ ฅ๋„ ์—ฌ๋Ÿฌ๊ฐœ
  • ๋”ฐ๋ผ์„œ ํ•„ํ„ฐ์˜ ์ˆ˜ ๋„ ๊ณ ๋ คํ•ด์•ผ ํ•จ. ⇒ ํ•„ํ„ฐ์˜ ๊ฐ€์ค‘์น˜ ๋ฐ์ดํ„ฐ๋Š” 4์ฐจ์› ๋ฐ์ดํ„ฐ (์ถœ๋ ฅ ์ฑ„๋„์˜ ์ˆ˜, ์ž…๋ ฅ ์ฑ„๋„์˜ ์ˆ˜, ๋†’์ด , ๋„ˆ๋น„)๋กœ ๊ตฌ์„ฑ๋จ (C: ๋’ค๋กœ ๋ช‡๊ฐœ, ์ถœ๋ ฅ: FN๊ฐœ)
7-2-7) Batch Processing
  • ์‹ ๊ฒฝ๋ง์—์„œ ๋ฐฐ์น˜ ์ฒ˜๋ฆฌ ํ•ด์คฌ์Œ ⇒ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ CNN์—์„œ๋„ ๋ฏธ๋‹ˆ๋ฐฐ์น˜ ํ•™์Šต ์ง€์›ํ•ด์คŒ
  • ๊ธฐ์กด 3์ฐจ์›์—์„œ 4์ฐจ์›์œผ๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ฒŒ ๋จ.
  • (๋ฐ์ดํ„ฐ ์ˆ˜, ์ฑ„๋„ ์ˆ˜, ๋†’์ด, ๋„ˆ๋น„)
  • ** 4์ฐจ์› ๋ฐ์ดํ„ฐ๊ฐ€ ํ•˜๋‚˜ ํ๋ฅผ ๋•Œ๋งˆ๋‹ค ๋ฐ์ดํ„ฐ N๊ฐœ์— ๋Œ€ํ•œ ํ•ฉ์„ฑ๊ณฑ ์—ฐ์‚ฐ์ด ์ด๋ค„์ง.
  • ์ฆ‰, ๊ธฐ์กด ์‹ ๊ฒฝ๋ง ๋ฏธ๋‹ˆ๋ฐฐ์น˜์™€ ๊ฐ™์€ ๋ง ⇒ NํšŒ ๋ถ„์— ๋Œ€ํ•œ ์ฒ˜๋ฆฌ๋ฅผ ํ•œ ๋ฒˆ์— ์ˆ˜ํ–‰ํ•ด์คŒ. ( ๋ฏธ๋‹ˆ๋ฐฐ์น˜ 10, ์ „์ฒด 100⇒ 10ํšŒ ๋ถ„์— ๋Œ€ํ•œ ์ฒ˜๋ฆฌ๋ฅผ ํ•œ๋ฒˆ์— ๋‹ค ๋Œ๋ ค๋ฒ„๋ฆผ)
7-3) Pooling Layer
  • ํ’€๋ง: ๊ฐ€๋กœ ์„ธ๋กœ ๋ฐฉํ–ฅ์˜ ๊ณต๊ฐ„์„ ์ค„์ด๋Š” ์—ฐ์‚ฐ

→ In this case, the maximum value in each window is extracted (max pooling).

  • ํ’€๋ง์˜ ์œˆ๋„์šฐ ํฌ๊ธฐ์™€ ์ŠคํŠธ๋ผ์ด๋“œ๋Š” ๊ฐ™์€ ๊ฐ’์œผ๋กœ ์„ค์ •ํ•˜๋Š” ๊ฒƒ์ด ๋ณดํ†ต (์œˆ๋„์šฐ: 2x2, ์ŠคํŠธ๋ผ์ด๋“œ(๋ณดํญ: 2))
ํ’€๋ง ๊ณ„์ธต์˜ ํŠน์ง•
  1. There are no parameters to learn (e.g., it only needs to take the maximum).
  2. The number of channels does not change
    • because each channel is computed independently.
  3. It is robust to small changes in the input
    • even if the data shifts slightly to the right, the output may not change.
7-4) Implementing the Convolution/Pooling Layers
7-4-1) 4D Arrays
  • ๋ฐ์ดํ„ฐ์˜ ํ˜•์ƒ์ด (10,1,28,28) ์ด๋ผ๋ฉด ⇒ ๋ฐ์ดํ„ฐ๊ฐ€ 10๊ฐœ, ์ฑ„๋„ 1๊ฐœ, 28*28
  • x[0].shape ⇒ (1,28,28) ⇒ ์ฒซ ๋ฒˆ์งธ ๋ฐ์ดํ„ฐ์— ์ ‘๊ทผ
  • ⇒ im2col์ด๋ผ๋Š” ํŠธ๋ฆญ์ด ๋ฌธ์ œ๋ฅผ ๋‹จ์ˆœํ•˜๊ฒŒ ๋งŒ๋“ค์–ด์คŒ.
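A quick numpy sketch of the 4D layout and indexing just described:

```python
import numpy as np

x = np.random.rand(10, 1, 28, 28)  # (number of data, channels, height, width)
print(x.shape)        # (10, 1, 28, 28)
print(x[0].shape)     # (1, 28, 28): the first piece of data
print(x[0, 0].shape)  # (28, 28): first piece of data, first channel
```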
7-4-2) im2col (image to column): Unrolling the Data
  • ํ•ฉ์„ฑ๊ณฑ ์—ฐ์‚ฐ์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด์„  for๋ฌธ ๊ฒน๊ฒน์ด ์จ์•ผํ•จ ⇒ ๋‹จ์ .
  • ⇒ im2col( ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์ค‘์น˜ ๊ณ„์‚ฐํ•˜๊ธฐ ์ข‹๊ฒŒ ํŽผ์น˜๋Š” ํ•จ์ˆ˜
  • ex) 3์ฐจ์› ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์— im2col์„ ์ ์šฉํ•˜๋ฉด 2์ฐจ์› ํ–‰๋ ฌ๋กœ.
  • ์‹ค์ œ ์ƒํ™ฉ์—์„œ๋Š” ํ•„ํ„ฐ ์ ์šฉ ์˜์—ญ์ด ๊ฒน์น˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋Œ€๋ถ€๋ถ„
  • ํ•„ํ„ฐ ์ ์šฉ ์˜์—ญ์ด ๊ฒน์น˜๊ฒŒ ๋˜๋ฉด im2col๋กœ ์ „๊ฐœํ•œ ํ›„์˜ ์›์†Œ ์ˆ˜๊ฐ€ ์›๋ž˜๋ณด๋‹ค ๋งŽ์•„์ง (๋ฉ”๋ชจ๋ฆฌ ๋” ๋งŽ์ด ์†Œ๋น„)
  • ์ปดํ“จํ„ฐ๋Š” ํฐ ํ–‰๋ ฌ์„ ๋งŒ๋“ค์–ด ๊ณ„์‚ฐํ•˜๋Š” ๋ฐ ํƒ์›”ํ•ด ํšจ์œจ ๋†’์ผ ์ˆ˜ ์žˆ์Œ
  • im2col ์ž…๋ ฅ ๋ฐ์ดํ„ฐ ์ „๊ฐœ ํ›„ ํ•ฉ์„ฑ๊ณฑ๊ณ„์ธต ํ•„ํ„ฐ 1์—ด์ „๊ฐœํ•˜๊ณ  ํ–‰๋ ฌ๊ณฑ ๊ณ„์‚ฐ

 

  • ์ด๋ฏธ์ง€๋ฅผ ์—ด๋กœ ๋ถ™์ž„
7-4-3) Implementing the Convolution Layer
```python
# Implementation using im2col
import sys, os
import numpy as np
sys.path.append('/deep-learning-from-scratch')
from common.util import im2col

x1 = np.random.rand(1, 3, 7, 7)  # (number of data, channels, height, width)
col1 = im2col(x1, 5, 5, stride=1, pad=0)
print(col1.shape)  # (9, 75)

x2 = np.random.rand(10, 3, 7, 7)  # 10 pieces of data
col2 = im2col(x2, 5, 5, stride=1, pad=0)
print(col2.shape)  # (90, 75)
```
  • In both cases, the second dimension has 75 elements
    • equal to the number of elements in one filter (3 channels, 5*5 data).
  • With batch size 1 the shape is (9, 75); with batch size 10 it is 10 times that, (90, 75).
```python
# Convolution layer implementation: the Convolution class
class Convolution:
    def __init__(self, W, b, stride=1, pad=0):
        self.W = W
        self.b = b
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        FN, C, FH, FW = self.W.shape
        N, C, H, W = x.shape
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)

        col = im2col(x, FH, FW, self.stride, self.pad)  # unroll the input data
        col_W = self.W.reshape(FN, -1).T                # unroll the filters
        out = np.dot(col, col_W) + self.b

        out = out.reshape(N, out_h, out_w, -1).transpose(0, 3, 1, 2)

        return out
```
  • Specifying -1 as reshape's second argument groups the elements so that the total element count stays the same after the conversion.
  • The transpose function converts the output data into the appropriate shape
    • by specifying indices to change the order of the axes (short demo below).
  • In the backward pass, the col2im function is used instead of im2col.
7-4-4) Implementing the Pooling Layer
  • ํ’€๋ง์˜ ๊ฒฝ์šฐ์—๋Š” ์ฑ„๋„์ด ๋…๋ฆฝ์ ์ด๋ผ๋Š” ์ ์ด ํ•ฉ์„ฑ๊ณฑ๊ณ„์ธต๊ณผ ๋‹ค๋ฅธ ์ 
  • ํ•ฉ์„ฑ๊ณฑ ๊ณ„์ธต์˜ ๊ฒฝ์šฐ, ๊ฐ ํ•„ํ„ฐ์™€ ๋ฐ์ดํ„ฐ๋ฅผ ๊ณฑํ•ด์„œ ํ•œ ์นธ์— ๋‹ค ์ž‘์„ฑ but, ํ’€๋ง์˜ ๊ฒฝ์šฐ ์ด์–ด๋ถ™์ž„?!
  • ์ „๊ฐœ ํ›„ ์ตœ๋Œ“๊ฐ’ ๊ตฌํ•˜๊ณ  ์ ์ ˆํ•œ ํ˜•์ƒ์œผ๋กœ ๋ฐ”๊พธ์–ด์คŒ
```python
class Pooling:
    def __init__(self, pool_h, pool_w, stride=1, pad=0):
        self.pool_h = pool_h
        self.pool_w = pool_w
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        n, c, h, w = x.shape
        out_h = int(1 + (h - self.pool_h) / self.stride)
        out_w = int(1 + (w - self.pool_w) / self.stride)

        # (1) unroll the input data
        col = im2col(x, self.pool_h, self.pool_w, self.stride, self.pad)
        col = col.reshape(-1, self.pool_h * self.pool_w)

        # (2) take the maximum of each row (max along axis 1)
        out = np.max(col, axis=1)

        # (3) reshape into the appropriate output form
        out = out.reshape(n, out_h, out_w, c).transpose(0, 3, 1, 2)

        return out
```

7-5) Implementing a CNN
  • CNN ๋„คํŠธ์›Œํฌ๋Š” Convolution-ReLU-Pooling-Affine-ReLU-Affine-Softmax  ์ˆœ์œผ๋กœ  ํ๋ฆ„
  • ํ•˜์ดํผ ํŒŒ๋ฆฌ๋ฏธํ„ฐ ์„ค์ •
```python
class SimpleConvNet:

    def __init__(self, input_dim=(1, 28, 28),
                 conv_param={'filter_num':30, 'filter_size':5, 'pad':0, 'stride':1},
                 hidden_size=100, output_size=10, weight_init_std=0.01):
        filter_num = conv_param['filter_num']
        filter_size = conv_param['filter_size']
        filter_pad = conv_param['pad']
        filter_stride = conv_param['stride']
        input_size = input_dim[1]
        conv_output_size = (input_size - filter_size + 2*filter_pad) / filter_stride + 1
        pool_output_size = int(filter_num * (conv_output_size/2) * (conv_output_size/2))
```
  • Parameters

    input_size: input size (784 for MNIST)
    hidden_size_list: list of the number of neurons in each hidden layer (e.g. [100, 100, 100])
    output_size: output size (10 for MNIST)
    activation: activation function, 'relu' or 'sigmoid'
    weight_init_std: standard deviation of the weights (e.g. 0.01);
        specifying 'relu' or 'he' selects the He initial values,
        'sigmoid' or 'xavier' selects the Xavier initial values

  • pool_output_size = int(filter_num * (conv_output_size/2) * (conv_output_size/2)): the 2x2, stride-2 pooling halves the height and width, so the flattened size is filter_num maps of (conv_output_size/2) × (conv_output_size/2) each; for a 28x28 input and 5x5 filter, conv_output_size = 24 and pool_output_size = 30 * 12 * 12 = 4320.
  • Weight initialization
```python
        # initialize the weights
        self.params = {}
        self.params['W1'] = weight_init_std * \
                            np.random.randn(filter_num, input_dim[0], filter_size, filter_size)
        self.params['b1'] = np.zeros(filter_num)
        self.params['W2'] = weight_init_std * \
                            np.random.randn(pool_output_size, hidden_size)
        self.params['b2'] = np.zeros(hidden_size)
        self.params['W3'] = weight_init_std * \
                            np.random.randn(hidden_size, output_size)
        self.params['b3'] = np.zeros(output_size)
```
  • Creating the layers
# ๊ณ„์ธต ์ƒ์„ฑ         self.layers = OrderedDict()         self.layers['Conv1'] = Convolution(self.params['W1'], self.params['b1'],                                            conv_param['stride'], conv_param['pad'])         self.layers['Relu1'] = Relu()         self.layers['Pool1'] = Pooling(pool_h=2, pool_w=2, stride=2)         self.layers['Affine1'] = Affine(self.params['W2'], self.params['b2'])         self.layers['Relu2'] = Relu()         self.layers['Affine2'] = Affine(self.params['W3'], self.params['b3'])          self.last_layer = SoftmaxWithLoss()
  • After initialization, we can implement the predict method, which performs inference, and the loss method, which computes the value of the loss function.
```python
    def predict(self, x):
        for layer in self.layers.values():
            x = layer.forward(x)
        return x

    def loss(self, x, t):
        """Compute the loss function.
        Parameters
        ----------
        x : input data
        t : true labels
        """
        y = self.predict(x)
        return self.last_layer.forward(y, t)
```
  • Up to here is the forward-propagation code.
  • Implementation that computes the gradients by backpropagation:
```python
    def gradient(self, x, t):
        """Compute the gradients (by backpropagation).
        Parameters
        ----------
        x : input data
        t : true labels
        Returns
        -------
        A dictionary holding the gradient of each layer:
            grads['W1'], grads['W2'], ... weights of each layer
            grads['b1'], grads['b2'], ... biases of each layer
        """
        # forward
        self.loss(x, t)

        # backward
        dout = 1
        dout = self.last_layer.backward(dout)

        layers = list(self.layers.values())
        layers.reverse()
        for layer in layers:
            dout = layer.backward(dout)

        # store the results
        grads = {}
        grads['W1'], grads['b1'] = self.layers['Conv1'].dW, self.layers['Conv1'].db
        grads['W2'], grads['b2'] = self.layers['Affine1'].dW, self.layers['Affine1'].db
        grads['W3'], grads['b3'] = self.layers['Affine2'].dW, self.layers['Affine2'].db

        return grads
```
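A minimal smoke test of the pieces above, assuming the full SimpleConvNet and the layers it uses (from the book's repository) are importable; the data here is random and only shows the call pattern:

```python
import numpy as np

network = SimpleConvNet(input_dim=(1, 28, 28))
x_batch = np.random.rand(10, 1, 28, 28)      # 10 fake 28x28 one-channel images
t_batch = np.random.randint(0, 10, size=10)  # 10 fake labels

grads = network.gradient(x_batch, t_batch)   # backprop gradients
for key in grads:
    network.params[key] -= 0.1 * grads[key]  # one plain SGD update in place
print(network.loss(x_batch, t_batch))
```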
7-6) Visualizing a CNN
  • ํ•™์Šต ์ „์˜ ํ•„ํ„ฐ๋Š” ๋ฌด์ž‘์œ„๋กœ ์ดˆ๊ธฐํ™” ๋จ ⇒ ๊ทœ์น™์„ฑ์ด ์—†์Œ
  • ํ•™์Šต ํ›„์˜ ํ•„ํ„ฐ๋Š” ๊ทœ์น™์žˆ๋Š” ์ด๋ฏธ์ง€๊ฐ€ ๋จ
  • The filters divide sharply into white and black regions → e.g., if the left side is white and the right side is black → it is a filter that responds to vertical edges.

→ As layers are stacked deeper, more complex and abstract information is extracted → the network comes to grasp more of the meaning of objects, e.g., the first layers respond only to edges, while deeper layers respond to increasingly concrete patterns.

7-7) Representative CNNs
1) LeNet
  • Repeats convolution layers and pooling layers, then passes through fully-connected layers.
  • A network for recognizing handwritten digits.
  • Differences between LeNet and today's CNNs:
    1. sigmoid vs ReLU activation
    2. subsampling (which shrinks the intermediate data) vs max pooling
2) AlexNet
  • LeNet๊ณผ์˜ ์ฐจ์ด์ 
    • ํ™œ์„ฑํ™” ํ•จ์ˆ˜๋กœ ReLU ์‚ฌ์šฉ
    • LRN์ด๋ผ๋Š” ๊ตญ์†Œ์  ์ •๊ทœํ™” ์‹ค์‹œํ•˜๋Š” ๊ณ„์ธต ์ด์šฉ
    • ๋“œ๋กญ ์•„์›ƒ ์‚ฌ์šฉ