Inception V2/3 1. Intro As CNNs advanced, they ran into limits on model size and computational efficiency. To address this, the paper introduces the methods below. With them, it reaches 17.2% top-1 error and 3.58% top-5 error on the ILSVRC 2012 dataset. VGGNet performs well, but its large parameter count makes it expensive. Inception was found to cut the parameter count while still performing well. However, Inception's complex structure turned out to get in the way of optimization, which can actually lower efficiency ⇒ the structure is complex and hard to modify, and naively scaling it up makes computation take longer ⇒ it is not clear exactly why it is efficient, so it is hard to apply to new settings. The paper was written to address these drawbacks... 2023. 7. 6.
ELMO 1. Intro Even the same word "read" can be present or past tense -> predicting only from the preceding words cannot tell which, so ELMo's role is to also predict from the words that follow and signal that "read" is being used in the past tense. 2. Overall architecture Extract the representation for "read". The forward and backward parts are trained together. At each level (word embedding, LSTM layer 1, LSTM layer 2), the forward and backward outputs are concatenated. Each level is then multiplied by a suitable weight (lower layers are said to carry more syntactic information, and higher layers more context-dependent meaning). Taking the weighted sum then yields a single vector → at the embedding layer for "read", the ELMo value.. 2023. 7. 6.
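The layer-combination step in the ELMO excerpt above can be sketched in a few lines. This is a minimal illustration, not the post's or the paper's code: the function names `softmax` and `elmo_vector`, the toy vectors, and the equal layer scores are all assumptions made here. The softmax normalization of the layer weights and the scalar `gamma` follow the formulation in the ELMo paper and do not appear in the excerpt itself.

```python
import math


def softmax(scores):
    """Turn raw layer scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def elmo_vector(forward_layers, backward_layers, layer_scores, gamma=1.0):
    """Combine per-layer representations of one token into a single vector.

    forward_layers / backward_layers: one vector per level
    (embedding, LSTM layer 1, LSTM layer 2), given as lists of floats.
    layer_scores: one raw score per level (hypothetical values here).
    """
    # Concatenate forward and backward outputs level by level.
    layers = [f + b for f, b in zip(forward_layers, backward_layers)]
    weights = softmax(layer_scores)
    dim = len(layers[0])
    # Weighted sum over levels, then scale by gamma.
    return [
        gamma * sum(w * layer[i] for w, layer in zip(weights, layers))
        for i in range(dim)
    ]


# Toy 2-dimensional vectors for the token "read" (made-up numbers).
forward = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
backward = [[1.0, 0.0], [0.2, 0.8], [0.0, 2.0]]
vec = elmo_vector(forward, backward, layer_scores=[0.0, 0.0, 0.0])
```

With equal scores every level gets weight 1/3, so `vec` is simply the average of the three concatenated vectors; in a trained model the scores would be learned per downstream task.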
SegNet Intro Autonomous driving - the goal was to solve the road scene segmentation task: distinguishing road from sidewalk, cars from pedestrians, and so on. Repeated max pooling and subsampling produce increasingly abstract feature maps (that is, as the image size shrinks, the original information is lost → the result becomes abstract). Such feature maps cannot support precise pixel-level segmentation. In addition, autonomous driving needs fast, real-time segmentation, which is not possible with a large number of parameters. SegNet was proposed to address these problems. Network Architecture The SegNet encoder and decoder each have 13 convolution layer.. 2023. 7. 6.
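To make the information loss from max pooling in the SegNet excerpt concrete, here is a small sketch in plain Python. The helper `max_pool_2x2` and the toy 4x4 input are assumptions for illustration only and are not taken from the post or from SegNet itself.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D list (even height and width assumed)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c],
                feature_map[r][c + 1],
                feature_map[r + 1][c],
                feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Toy 4x4 feature map (made-up values).
image = [
    [1, 2, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 7, 0],
    [6, 0, 0, 0],
]
pooled = max_pool_2x2(image)  # [[4, 5], [6, 7]]
```

The 4x4 map shrinks to 2x2, and the output no longer records where inside each 2x2 window the maximum sat. That lost position is the kind of detail that makes precise pixel-level segmentation difficult from heavily pooled feature maps.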
CycleGAN 0. Abstract Figure 1: Given any two unordered image collections X and Y, our algorithm learns to automatically “translate” an image from one into the other and vice versa: (left) Monet paintings and landscape photos from Flickr; (center) zebras and horses from ImageNet; (right) summer and winter Yosemite photos from Flickr. Example application (bottom): using a collection of paintings of famous.. 2023. 7. 5.
XLNet: Generalized Autoregressive Pretraining for Language Understanding 💡 [Paper Review] XLNet: Generalized Autoregressive Pretraining for Language Understanding XLNet: Generalized Autoregressive Pretraining for Language Understanding Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le https://arxiv.org/abs/1906.08237 1. Introduction Unsupervised Representation Learning, through large-scale corpora, Pre… https://jeonsworld.github.io/NLP/xlnet.. 2023. 7. 5.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning 💡 Inception-v1 Inception-v2 Inception-v3 Starting from the Inception-v2 architecture, the techniques described above are added one at a time while measuring performance; the model that applies all of them and achieves the best performance is Inception-v3. Inception-v3 is Inception-v2 with BN-auxiliary + RMSProp + Label Smoothing + Factorized 7x7 all applied. [Paper Reading] Inception-v3 (2015) review, Rethinking the Inception Architecture for Computer Vision The paper for this post is Rethinking the Inception Architecture for Computer Vision. This paper.. 2023. 7. 5.
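Of the Inception-v3 ingredients listed in the excerpt above, label smoothing is the easiest to show in isolation. The sketch below assumes the usual formulation with a uniform prior over classes, q'(k) = (1 - ε)·[k = y] + ε / K. The function name `smooth_labels` and the 4-class toy example are made up for illustration, and ε = 0.1 is used here only as a default.

```python
def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Return a smoothed target distribution instead of a one-hot vector.

    Each class receives epsilon / num_classes, and the true class
    additionally keeps (1 - epsilon) of the probability mass.
    """
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


# Toy example: 4 classes, the correct class is index 2.
smoothed = smooth_labels(true_class=2, num_classes=4, epsilon=0.1)
```

Here the true class gets 0.925 and each other class 0.025, so the targets still sum to 1 but the model is no longer pushed toward fully confident predictions.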