New State-of-the-Art

DVD: Deterministic Video Depth Estimation
with Generative Priors

Hongfei Zhang1* Harold H. Chen1,2* Chenfei Liao1* Jing He1* Zixin Zhang1 Haodong Li3 Yihao Liang4 Kanghao Chen1 Bin Ren5 Xu Zheng1 Shuai Yang1 Kun Zhou6 Yinchuan Li7 Nicu Sebe8 Ying-Cong Chen1,2†
1HKUST(GZ)   2HKUST   3UCSD   4Princeton University   5MBZUAI   6SZU   7Knowin   8UniTrento
*Equal Contribution    †Corresponding Author
DVD Teaser Figure

TL;DR: DVD resolves the geometric hallucinations of generative models and the semantic ambiguities of discriminative baselines, delivering consistent, high-fidelity geometry.

Long Video Results

Short Video Results

Citation

If you find our work useful in your research, please consider citing:

@article{zhang2026dvd,
  title={DVD: Deterministic Video Depth Estimation with Generative Priors},
  author={Zhang, Hongfei and Chen, Harold Haodong and Liao, Chenfei and He, Jing and Zhang, Zixin and Li, Haodong and Liang, Yihao and Chen, Kanghao and Ren, Bin and Zheng, Xu and Yang, Shuai and Zhou, Kun and Li, Yinchuan and Sebe, Nicu and Chen, Ying-Cong},
  journal={arXiv preprint arXiv:2603.12250},
  year={2026}
}