Deep Neural Networks for No-Reference Video Quality Assessment

Junyong You*, Jari Korhonen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

66 Citations (Scopus)

Abstract

Video quality assessment (VQA) is a challenging task due to the complexity of modeling perceived quality characteristics in both the spatial and temporal domains. A novel no-reference (NR) video quality metric (VQM) is proposed in this paper based on two deep neural networks: a 3D convolutional neural network (3D-CNN) and a recurrent neural network composed of long short-term memory (LSTM) units. The 3D-CNN is used to extract local spatiotemporal features from small cubic clips of a video, and these features are then fed into the LSTM network to predict the perceived video quality. This design mitigates the issue of insufficient training data while efficiently capturing perceptual quality features in both the spatial and temporal domains. Experimental results on two publicly available video quality datasets demonstrate that the proposed quality metric outperforms the other compared NR quality metrics.
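The abstract describes a two-stage architecture: a 3D-CNN encodes small spatiotemporal clips into feature vectors, and an LSTM aggregates the per-clip features into a single quality prediction. The PyTorch sketch below illustrates that data flow only; all layer widths, clip dimensions, and the pooling scheme are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of a 3D-CNN + LSTM pipeline as outlined in the abstract.
# All layer sizes, clip dimensions, and hyperparameters are assumptions
# for illustration; the paper's actual architecture may differ.
import torch
import torch.nn as nn

class Clip3DCNN(nn.Module):
    """Extracts a feature vector from one small spatiotemporal clip."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),  # (B, 3, T, H, W) -> 32 channels
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),                     # global pooling over T, H, W
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, clip):                             # clip: (B, 3, T, H, W)
        x = self.conv(clip).flatten(1)                   # (B, 64)
        return self.fc(x)                                # (B, feat_dim)

class NRVQAModel(nn.Module):
    """3D-CNN features per clip, LSTM over the clip sequence, scalar quality score."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.encoder = Clip3DCNN(feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, clips):                            # clips: (B, N, 3, T, H, W)
        b, n = clips.shape[:2]
        feats = self.encoder(clips.flatten(0, 1))        # encode all clips: (B*N, feat_dim)
        feats = feats.view(b, n, -1)                     # restore sequence: (B, N, feat_dim)
        out, _ = self.lstm(feats)                        # temporal aggregation
        return self.head(out[:, -1]).squeeze(-1)        # one quality score per video

# Example: 2 videos, 8 clips each, clips of 8 frames at 32x32 pixels
scores = NRVQAModel()(torch.randn(2, 8, 3, 8, 32, 32))
print(scores.shape)  # torch.Size([2])
```

Encoding each small clip independently and learning the temporal relationship separately is what lets this kind of design train on limited data: the 3D-CNN sees many clips per video, so the effective number of training samples for the feature extractor is much larger than the number of rated videos.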

Original language: English
Title of host publication: 2019 IEEE International Conference on Image Processing (ICIP)
Publisher: IEEE
Pages: 2349-2353
Number of pages: 5
ISBN (Electronic): 978-1-5386-6249-6
ISBN (Print): 978-1-5386-6250-2
DOIs
Publication status: Published - 2019
Event: 26th IEEE International Conference on Image Processing (ICIP) - Taipei, Taiwan
Duration: 22 Sept 2019 – 25 Sept 2019

Conference

Conference: 26th IEEE International Conference on Image Processing (ICIP)
Country/Territory: Taiwan
City: Taipei
Period: 22/09/19 – 25/09/19

Keywords

  • 3D-CNN
  • deep learning
  • LSTM
  • video quality assessment
  • prediction
