Estimating depth from images nowadays yields outstanding results, both in
terms of in-domain accuracy and generalization. However, we identify two main
challenges that remain open in this field: dealing with non-Lambertian
materials and effectively processing high-resolution images. To this end, we
propose a novel dataset that includes accurate and dense ground-truth labels at
high resolution, featuring scenes containing several specular and transparent
surfaces. Our acquisition pipeline leverages a novel deep space-time stereo
framework, enabling easy and accurate labeling with sub-pixel precision. The
dataset is composed of 606 samples collected in 85 different scenes; each
sample includes both a high-resolution pair (12 Mpx) and an unbalanced
stereo pair (left: 12 Mpx, right: 1.1 Mpx). Additionally, we provide manually
annotated material segmentation masks and 15K unlabeled samples. We divide the
dataset into a training set and two testing sets, the latter devoted to the
evaluation of stereo and monocular depth estimation networks, respectively, to
highlight the open challenges and future research directions in this field.