Special issue: 3D Video Technologies and Services

Vol. 68, n° 11-12, November-December 2013
Content available on Springerlink

Guest editors
Béatrice Pesquet-Popescu, Télécom ParisTech-LTCI/CNRS, France
Frédéric Dufaux, Télécom ParisTech-LTCI/CNRS, France
Touradj Ebrahimi, EPFL, Switzerland
Shipeng Li, Microsoft, China

Foreword

Béatrice Pesquet-Popescu, Frédéric Dufaux, Touradj Ebrahimi, Shipeng Li

Camera array image rectification and calibration for stereoscopic and autostereoscopic displays

Vincent Nozick
Gaspard Monge Institute, Université Paris-Est Marne-la-Vallée, France

Abstract This paper presents an image rectification method for an arbitrary number of views with aligned camera centers. It also describes how to extend this method to perform a robust camera calibration. These two techniques can be used in stereoscopic rendering to improve viewing comfort, or for depth from stereo. We first explain why epipolar geometry is not suited to this problem. Second, we propose a nonlinear method that includes all the images in the rectification process. Then, we detail how to extract the rectification parameters to provide a quasi-Euclidean camera calibration. Our method only requires point correspondences between the views and can handle images with different resolutions. The tests show that it is robust to noise and to sparse point correspondences among the views.

Keywords Image rectification – Stereoscopic displays – Camera array – Camera array calibration

Edge-preserving interpolation of depth data exploiting color information

Valeria Garro1, Carlo Dal Mutto2, Pietro Zanuttigh2, and Guido M. Cortelazzo2
(1) University of Verona, Verona, Italy
(2) University of Padova, Padova, Italy

Abstract The extraction of depth information associated with dynamic scenes is an intriguing topic because of its prospective role in many applications, including free viewpoint and 3D video systems. Time-of-flight (ToF) range cameras allow for the acquisition of depth maps at video rate, but they are characterized by a limited resolution, especially when compared with standard color cameras. This paper presents a super-resolution method for depth maps that exploits the side information from a standard color camera: the proposed method uses a segmented version of the high-resolution color image acquired by the color camera to identify the main objects in the scene, and a novel surface prediction scheme to interpolate the depth samples provided by the ToF camera. Effective solutions are provided for critical issues such as the joint calibration between the two devices and the unreliability of the acquired data. Experimental results on both synthetic and real-world scenes show that the proposed method yields a more accurate interpolation than standard interpolation approaches and state-of-the-art joint depth and color interpolation schemes.

Keywords Depth map – Interpolation – Super resolution – Calibration – Time of flight

A study of depth/texture bit-rate allocation in multi-view video plus depth compression

Emilie Bosc1, Fabien Racapé1, Vincent Jantet1,2, Paul Riou1, Muriel Pressigout1, and Luce Morin1
(1) Université Européenne de Bretagne, INSA de Rennes, France
(2) INRIA Rennes, Bretagne Atlantique, France

Abstract Multi-view video plus depth (MVD) data offer a reliable representation of three-dimensional (3D) scenes for 3D video applications. They represent a huge amount of data, whose compression is currently an important research challenge. Because MVD data consist of texture and depth video sequences, the question of how to allocate bit-rate between these two types of data often arises. This paper examines the required ratio between texture and depth when encoding MVD data. In particular, it investigates the elements that affect the best bit-rate ratio between depth and color: total bit-rate budget, input data features, encoding strategy, and assessed view.

Keywords Multi-view video coding – Bit-rate – HEVC – H.264 – PSNR – View synthesis

Rate-distortion analysis of multiview coding in a DIBR framework

Boshra Rajaei1,2, Thomas Maugey3, Hamid-Reza Pourreza1, and Pascal Frossard3
(1) Ferdowsi University of Mashhad, Mashhad, Iran
(2) Sadjad Institute of Higher Education, Mashhad, Iran
(3) École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

Abstract Depth image-based rendering techniques for multiview applications have been recently introduced for efficient view generation at arbitrary camera positions. The rate control in an encoder has thus to consider both texture and depth data. However, due to different structures of depth and texture data and their different roles on the rendered views, the allocation of the available bit budget between them requires a careful analysis. Information loss due to texture coding affects the value of pixels in synthesized views, while errors in depth information lead to a shift in objects or to unexpected patterns at their boundaries. In this paper, we address the problem of efficient bit allocation between texture and depth data of multiview sequences. We adopt a rate-distortion framework based on a simplified model of depth and texture images, which preserves the main features of depth and texture images. Unlike most recent solutions, our method avoids rendering at encoding time for distortion estimation so that the encoding complexity stays low. In addition to this, our model is independent of the underlying inpainting method that is used at the decoder for filling holes in the synthetic views. Extensive experiments validate our theoretical results and confirm the efficiency of our rate allocation strategy.

Keywords Depth image-based rendering – Multiview video coding – Rate allocation – Rate-distortion analysis

How visual fatigue and discomfort impact 3D-TV quality of experience: a comprehensive review of technological, psychophysical, and psychological factors

Matthieu Urvoy, Marcus Barkowsky, and Patrick Le Callet
LUNAM Université, Université de Nantes, IRCCyN, CNRS, Polytech Nantes, France

Abstract The quality of experience (QoE) of 3D contents is usually considered to be the combination of the perceived visual quality, the perceived depth quality, and lastly the visual fatigue and comfort. When either fatigue or discomfort is induced, studies tend to show that observers prefer to experience a 2D version of the contents. For this reason, providing a comfortable experience is a prerequisite for observers to actually consider the depth effect as a visualization improvement. In this paper, we propose a comprehensive review of visual fatigue and discomfort induced by the visualization of 3D stereoscopic contents, in the light of the physiological and psychological processes enabling depth perception. First, we review the multitude of manifestations of visual fatigue and discomfort (near triad disorders, symptoms of discomfort), as well as means for detection and evaluation. We then discuss how, in 3D displays, ocular and cognitive conflicts with real world experience may cause fatigue and discomfort; these include the accommodation–vergence conflict, the mismatch between presented stimuli and observers' depth of focus, and the cognitive integration of conflicting depth cues. We also discuss some limits of stereopsis that constrain our ability to perceive depth, in particular the perception of planar and in-depth motion, the limited fusion range, and various stereopsis disorders. Finally, this paper discusses how the different aspects of fatigue and discomfort apply to 3D technologies and contents. We notably highlight the need for respecting a comfort zone and avoiding camera and rendering artifacts. We also discuss the influence of visual attention, exposure duration, and training. Conclusions provide guidance for best practices and future research.

Keywords Visual fatigue – Visual discomfort – 3D-TV – Quality of experience – 3D technologies – Stereopsis

Enhancing the audience experience during sport events: real-time processing of multiple stereoscopic cameras

Julien Maillard1, Marc Leny2, and Hélène Diakhaté3
(1) Vitec, Chatillon, France
(2) Ektacom, Les Ulis, France
(3) Thales Communications and Security, Gennevilliers, France

Abstract From video acquisition to 3D rendering, most of the hardware and software modules required for stereoscopy are currently available in academic or industrial R&D laboratories. Some are even features of open-source libraries. However, designing a stereoscopic architecture able to perform this acquisition, followed by geometrical calibration and colour correction, disparity map computation, multi-view coding, and transmission, for several cameras on one dedicated server remains a challenge. This was achieved in the SkyMedia project, which aimed at providing an enhanced experience for the audience, organising staff, and performers of an event. Compromises were required, from lower-resolution depth estimation to limited MultiView Coding predictions, but in the end the project system was fit for the task and delivered content to the various people involved in the 2012 Turin Marathon.

Keywords 3D Processing – Real time – Stereoscopy – Multiple streams acquisition and processing – Geometric and colour calibration – Disparity maps – MVC + D – Metadata aggregation

Stereoscopic video watermarking: a comparative study

Afef Chammem1, Mihai Mitrea1, and Françoise Prêteux2
(1) Télécom SudParis, Institut Mines-Télécom, France
(2) MINES ParisTech, Institut Mines-Télécom, France

Abstract Despite the sound theoretical, methodological, and experimental background inherited from 2D video, stereoscopic video watermarking remains an open research topic. Paving the way towards practical deployment of such copyright protection mechanisms, the present paper is structured as a comparative study of the main classes of 2D watermarking methods (spread spectrum, side information, hybrid) and of their related optimal stereoscopic insertion domains (view or disparity based). The performances are evaluated in terms of transparency, robustness, and computational cost. First, the watermarked content transparency is assessed by both subjective protocols (according to ITU-R BT 500-12 and BT 1438 recommendations) and objective quality measures (five metrics based on differences between pixels and on correlation). Secondly, the robustness is objectively expressed by means of the watermark detection bit error rate against several classes of attacks, such as linear and nonlinear filtering, compression, and geometric transformations. Thirdly, the computational cost is estimated for each processing step involved in the watermarking chain. All the quantitative results are obtained by processing two corpora of stereoscopic visual content: (1) the 3DLive corpus, totalling about 2 h of 3D TV content captured by French professionals, and (2) the MPEG 3D video reference corpus, composed of 17 min of content provided by both academia and industry. It was thus established that, for a fixed size of the mark, a hybrid watermark insertion performed into a new disparity map representation is the only solution jointly featuring imperceptibility (according to the subjective tests), robustness against the three classes of attacks, and nonprohibitive computational cost.

Keywords Robust stereoscopic watermarking – Spread spectrum – Side information – Hybrid watermarking – Stereoscopic disparity map – HD 3D TV

Open topics

Deployment of wireless regional area network and its impact on DTV service coverage

Yee-Loo Foo 
Multimedia University, Malaysia

Abstract Prediction of digital TV (DTV) coverage has not considered the potential interference originating in the IEEE 802.22 Wireless Regional Area Network (WRAN), which operates in the same TV bands. WRAN interference could affect DTV reception, resulting in DTV service outage in some areas. Spectrum sensing is a means of minimizing the interference by not operating WRAN in a TV band where a DTV signal is detected. However, limited sensing accuracy could lead to erroneous decisions. This paper investigates the extent to which DTV service quality is affected by the operation of WRAN with limited sensing accuracy. One of the main factors that limit sensing accuracy is the variability of the radio propagation channel. Depending on the characteristics of the statistical variation, radio channels are modeled as Gaussian, Rayleigh, Nakagami, and Rician channels in this paper. The probability of DTV service outage is analyzed and expressed as a function of sensing accuracy. The theoretical results presented here have been validated by Monte Carlo simulations.

Keywords Digital TV – Wireless regional area networks – Gaussian channels – Rayleigh channels – Nakagami channels – Rician channels

Multi-user receiver scheme and uplink performance of space–time-coded CDMA system in Rician fading channels

Xiangbin Yu, Xiaoshuai Liu, Xiaodan Yu, Wei Tan, and Xiaomin Chen
Nanjing University of Aeronautics and Astronautics, China

Abstract The uplink performance of a multi-user space–time-coded code-division multiple access (STC-CDMA) system in Rician fading channels is presented. A simple and effective multi-user receiver scheme is developed for the STC-CDMA system. The scheme has linear decoding complexity, whereas the existing scheme has exponential decoding complexity, and it thus enables low-complexity decoding. Based on the bit error rate (BER) analysis and the moment generation function, theoretical BER expressions are derived for STC-CDMA with orthogonal and quasi-orthogonal spreading codes, respectively. These expressions are shown to offer improved accuracy. Using these expressions and an approximation of the error function, closed-form approximate BER expressions are obtained, which simplify the calculation of the derived theoretical BER. Simulation results show that the developed low-complexity decoding scheme achieves almost the same performance as the existing scheme. The theoretical BER values are in good agreement with the corresponding simulated values. Moreover, the presented approximate expressions are also close to the simulated values, owing to the better approximation. Under the same system throughput and concatenation of channel code, the presented full-rate STC-CDMA system has a lower BER than the corresponding full-diversity STC-CDMA systems.

Keywords Multi-user receiver – Space–time coding – Rician fading – Code-division multiple access (CDMA) – Moment generation function – Low complexity