
Deep Learning/[Paper] Paper Review (38)

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
What is CAM (Class Activation Maps)? Global Max Pooling (GMP) vs Global Average Pooling (GAP)
: Global Max Pooling (GMP) takes the largest value over the entire region of a feature map.
: Global Average Pooling (GAP), in contrast, takes all values into account and uses their mean.
: A conventional CNN flattens the last feature map into a 1-D vector, passes it through a Fully Connected Network, and classifies with softmax.
: This FC layer makes the number of parameters very large, which can increase the risk of overfitting, and F..
2023. 8. 13.
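To make the GMP/GAP distinction in the Grad-CAM entry concrete, here is a minimal pure-Python sketch. The toy feature maps and the layer sizes (512 channels, 7x7 maps, 1000 classes) are assumptions chosen for illustration only; they do not come from the post.

```python
def global_max_pool(feature_maps):
    """One value per channel: the maximum over all spatial positions."""
    return [max(max(row) for row in channel) for channel in feature_maps]


def global_average_pool(feature_maps):
    """One value per channel: the mean over all spatial positions."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fc_weight_count(channels, height, width, num_classes, use_gap):
    """Weights (biases excluded) of a classifier fed by flatten or by GAP."""
    in_features = channels if use_gap else channels * height * width
    return in_features * num_classes


# Hypothetical toy input: 2 channels, each a 2x2 feature map.
toy = [
    [[1.0, 2.0], [3.0, 6.0]],
    [[0.0, 0.0], [4.0, 0.0]],
]
gmp = global_max_pool(toy)      # keeps only the strongest activation per channel
gap = global_average_pool(toy)  # averages every activation per channel

# Assumed sizes, for illustration: flatten feeds 512*7*7 inputs, GAP feeds 512.
flatten_weights = fc_weight_count(512, 7, 7, 1000, use_gap=False)
gap_weights = fc_weight_count(512, 7, 7, 1000, use_gap=True)
```

With these assumed sizes the flattened classifier needs 49 times as many weights as the GAP one, which is the parameter growth the excerpt points to.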
STAR: Sparse Trained Articulated Human Body Regressor (2020)
0. ABSTRACT
1. A much more compact model (fewer parameters)
: Uses 80% fewer parameters than SMPL.
2. Learns shape and pose deformations that vary with body type (using body shape and BMI)
: SMPL does not account for how deformations differ across body types (it represents the same muscle shape regardless of body type).
ex) When someone bends an arm, the skin and muscle around the elbow deform differently for each body type (a person with a larger frame, a person with more or less muscle..)
3. More training data
: Improves generalization.
Conclusion: a small model that generalizes better to new body shapes and can be used as a replacement for SMPL.
1 INTRODUCTION
Trains only the region around the given joint (for the knee, only up to the knee)
: Existing SMPL-like..
2023. 8. 10.
DINO: Emerging Properties in Self-Supervised Vision Transformers (2021)
Self Supervised learning https://brunch.co.kr/@b047a588c11b462/45
: A form of unsupervised learning in which the model uses an unlabeled dataset to carry out classification on its own.
: It differs from conventional unsupervised learning in that the model is trained on tasks it sets for itself, and it can also use many kinds of data (text, images, video) collected by crawling the internet.
: Scaling a model up requires large amounts of data, but continually obtaining labeled data is costly.
: With self-supervised learning, the model can be scaled up using only unlabeled training data, and accuracy can accordingly also be improved..
2023. 8. 10.
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
What is SMPL-X?
: A model that, from a single image, jointly represents not only the body but also the hands and face as a 3D human body.
Left: SMPL, middle: SMPL+H, right: SMPL-X
0. ABSTRACT
: Trains SMPL-X, a new unified 3D model of the human body, using 3D scans.
: Extends SMPL to also cover the hands and facial expression.
: SMPL-X has many parameters covering the shape and pose of the face, hands, neck, body, and so on; estimating them accurately requires an optimization that combines image information and joint information for each parameter, which is a difficulty.
: Therefore, a method that builds on the existing SMPLify algorithm (which extracts joint information from 2D and trains on it) to optimize and fit the SMPL-X model is ..
2023. 8. 4.
BodyNet: Volumetric Inference of 3D Human Body Shapes
What is BodyNet?
: A network that extracts 2D pose and segmentation from a single image, uses those two to learn 3D pose, and then uses all three together with the RGB input to build a volumetric 3D body shape.
: End-to-end.
1. The input RGB image first passes through subnetworks for 2D pose estimation and 2D body-part segmentation.
2. The 2D pose and segmentation networks are trained.
3. With the learned 2D pose and segmentation weights frozen, the 3D pose network is trained.
4. Then, with all preceding network weights frozen, the 3D shape network is trained.
5. The shape network is further trained with additional reprojection losses to fine-tune volumetric shape estimation.
6. Combined ..
2023. 8. 3.
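The staged training in the BodyNet entry can be pictured as a schedule of which subnetworks are updated at each step. The sketch below is only an illustration: the subnetwork names and the dictionary layout are hypothetical labels, not identifiers from the BodyNet code, and it covers steps 2 to 5 only because step 6 is cut off in the excerpt.

```python
# Hypothetical labels for the four subnetworks named in the excerpt.
SUBNETS = ("pose2d", "segmentation", "pose3d", "shape3d")

# One entry per training step: which subnetworks are updated, and whether
# the additional reprojection loss is switched on.
STAGES = [
    {"step": 2, "train": {"pose2d", "segmentation"}, "reprojection": False},
    {"step": 3, "train": {"pose3d"}, "reprojection": False},
    {"step": 4, "train": {"shape3d"}, "reprojection": False},
    {"step": 5, "train": {"shape3d"}, "reprojection": True},
]


def trainable_flags(stage):
    """Map each subnetwork to True if it is updated in this stage, else False."""
    return {name: name in stage["train"] for name in SUBNETS}


for stage in STAGES:
    flags = trainable_flags(stage)
    updated = [name for name, on in flags.items() if on]
    print(f"step {stage['step']}: update {updated}, "
          f"reprojection loss={stage['reprojection']}")
```

In a real framework each flag would translate into enabling or disabling gradient updates for that subnetwork's weights.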
mixup: Beyond Empirical Risk Minimization
What is mixup?
: "Beyond Empirical Risk Minimization"? What on earth is that supposed to mean?
: mixup ⇒ a data augmentation technique
: It linearly combines two data points to create a new sample.
: Put very simply, the usual train-then-predict approach tends to overfit.
: Why? Training looks only at the training data, so the learned model is naturally biased toward the training dataset.
: In other words, it overfits. As a result, the model is bound to be fragile as soon as it is applied to a dataset with even a slightly different distribution (Out of Distribution).
: So rather than training on the training dataset alone, the distribution in the vicinity of the training dataset is learned as well, for a more ..
2023. 8. 3.
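The linear combination described in the mixup entry can be sketched in a few lines of standard-library Python. Drawing the mixing weight from a Beta(alpha, alpha) distribution follows the mixup paper; the value alpha=0.2, the function name `mixup_pair`, and the toy inputs are assumptions made for this illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two (input, one-hot label) pairs with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Hypothetical toy samples: 2-feature inputs with 2-class one-hot labels.
rng = random.Random(0)  # seeded only so the sketch is repeatable
x_mix, y_mix, lam = mixup_pair([1.0, 0.0], [1.0, 0.0],
                               [0.0, 1.0], [0.0, 1.0], rng=rng)
```

Because the inputs and the labels are mixed with the same weight, the blended label stays a valid distribution over classes, and the new sample lies on the line segment between the two originals.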