dhwiii's notepad | Deep Learning, Codec Diary

[Paper Review] SRGAN : Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network



dhwiii 2021. 4. 5. 16:03

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Paper link : https://arxiv.org/abs/1609.04802

This is a Super-Resolution method built on a GAN (Generative Adversarial Network) architecture.

(Work in progress.)


Key Points

: It is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors.

: To achieve this, the authors propose a perceptual loss function that consists of an adversarial loss and a content loss.


Abstract

Despite breakthroughs in the accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved
: how do we recover the finer texture details when we super-resolve at large upscaling factors?


Existing super-resolution methods are mostly optimized by minimizing the mean squared error (MSE)
->
PSNR is high, but the results often lack high-frequency details.
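
To make the MSE/PSNR link concrete, here is a minimal sketch in plain Python. The helper names `mse` and `psnr` and the toy 2x2 images are my own illustration, not from the paper; an 8-bit peak value of 255 is assumed. PSNR is a monotone function of MSE, so minimizing MSE directly maximizes PSNR while saying nothing about texture.

```python
import math

def mse(hr, sr):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    diffs = [(a - b) ** 2
             for row_hr, row_sr in zip(hr, sr)
             for a, b in zip(row_hr, row_sr)]
    return sum(diffs) / len(diffs)

def psnr(hr, sr, max_value=255.0):
    """PSNR in dB: 10 * log10(MAX^2 / MSE). Infinite for identical images."""
    error = mse(hr, sr)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)

# Toy 2x2 "images" (assumed values): a sharp checkerboard and a smoothed guess.
hr = [[0, 255], [255, 0]]
blurry = [[64, 191], [191, 64]]
print(mse(hr, blurry), psnr(hr, blurry))
```

A lower MSE always means a higher PSNR, which is why an MSE-trained network can score well on PSNR and still look overly smooth.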

So the authors propose SRGAN for image SR, trained with an adversarial loss and a content loss.

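Adversarial Loss : pushes the output toward the natural image manifold, using a discriminator network trained to distinguish super-resolved images from original photo-realistic images.
Content Loss : motivated by perceptual similarity rather than similarity in pixel space.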
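
For reference, the two terms above combine as follows. This is written from my reading of the paper's notation, so treat the symbols as a sketch: \phi_{i,j} is a VGG feature map, G is the generator, D is the discriminator, and the 10^{-3} weight on the adversarial term is the value the paper reports.

```latex
% Perceptual loss = content loss + weighted adversarial loss
l^{SR} = l^{SR}_{X} + 10^{-3}\, l^{SR}_{Gen}

% Content loss on VGG feature maps of size W_{i,j} \times H_{i,j}
l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}}
  \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}}
  \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}\big(G_{\theta_G}(I^{LR})\big)_{x,y} \right)^2

% Adversarial (generator) loss over N training samples
l^{SR}_{Gen} = \sum_{n=1}^{N} -\log D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)
```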


Introduction

The task of estimating a high-resolution (HR) image from its low-resolution (LR) counterpart is referred to as super-resolution (SR).

The optimization target of SR algorithms is commonly the minimization of the MSE between the recovered HR image and the ground truth
-> But the ability of MSE and PSNR to capture perceptually relevant differences, such as high texture detail, is very limited because they are defined on pixel-wise image differences.

In this work, they propose SRGAN, which employs a deep ResNet with skip-connections and diverges from MSE as the sole optimization target.
They also define a novel perceptual loss using high-level feature maps of the VGG network.
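
A minimal sketch of how such a perceptual loss could be assembled, in plain Python. Everything here is illustrative: the feature maps are assumed to be already extracted (for example by a pretrained VGG) and flattened into lists, the function names are hypothetical, and the default weight of 1e-3 follows the value reported in the paper.

```python
import math

def content_loss(features_hr, features_sr):
    """Mean squared difference between two flattened feature maps."""
    assert len(features_hr) == len(features_sr)
    total = sum((a - b) ** 2 for a, b in zip(features_hr, features_sr))
    return total / len(features_hr)

def adversarial_loss(disc_probs_on_sr):
    """Sum of -log D(G(LR)) over a batch; small when the discriminator is fooled."""
    return sum(-math.log(p) for p in disc_probs_on_sr)

def perceptual_loss(features_hr, features_sr, disc_probs_on_sr, adv_weight=1e-3):
    """Content loss plus a lightly weighted adversarial term."""
    return (content_loss(features_hr, features_sr)
            + adv_weight * adversarial_loss(disc_probs_on_sr))

# Hypothetical numbers: two feature values per image, one discriminator output.
print(perceptual_loss([0.0, 0.0], [1.0, 3.0], [0.5]))
```

Comparing feature maps instead of raw pixels is what lets the loss tolerate small pixel shifts while still penalizing missing texture.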

 


Methods

In SISR (Single Image Super-Resolution), the aim is to estimate a high-resolution, super-resolved image
from a low-resolution input image.

The adversarial objective is almost the same as the usual GAN loss proposed by Goodfellow et al.
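
Written out, the min-max problem looks like this. I am reproducing it from my reading of the paper, so the notation is a sketch: G_{\theta_G} is the generator, D_{\theta_D} the discriminator, and the only change from the standard GAN objective is that the generator takes an LR image instead of random noise.

```latex
\min_{\theta_G} \max_{\theta_D} \;
  \mathbb{E}_{I^{HR} \sim p_{\mathrm{train}}(I^{HR})}
    \left[ \log D_{\theta_D}(I^{HR}) \right]
  + \mathbb{E}_{I^{LR} \sim p_{G}(I^{LR})}
    \left[ \log\left( 1 - D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big) \right) \right]
```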

The loss is computed from the fake high-resolution images that the generator
creates from the LR images, together with the real HR images (ground truth).
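
As a rough illustration of the discriminator side, here is a plain-Python sketch. The function name and the probability values are hypothetical; the inputs are assumed to be discriminator outputs in (0, 1) for real HR images and for generated (fake) HR images.

```python
import math

def discriminator_loss(probs_real, probs_fake):
    """Negative of [log D(HR) + log(1 - D(G(LR)))], each averaged over its batch."""
    real_term = sum(math.log(p) for p in probs_real) / len(probs_real)
    fake_term = sum(math.log(1.0 - p) for p in probs_fake) / len(probs_fake)
    return -(real_term + fake_term)

# Assumed outputs: an undecided discriminator versus a confident, correct one.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])
print(undecided, confident)
```

The loss drops as the discriminator assigns high probability to real HR images and low probability to generated ones, which is the pressure the generator then has to work against.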

Generator & Discriminator Network Structure


Results


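Personal Thoughts
SRGAN scored lower on PSNR and SSIM, yet it received high scores in the MOS (mean opinion score) evaluation.
I think it may be an unusual case that the paper was accepted even though it did not produce numerically better results.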

 

 

 
