Enhancing Monocular Depth Estimation with an Advanced Encoder-Decoder Architecture
- Authority: Neural Computing and Applications
- Category: Journal Publication
Significant technological advancements have been made in autonomous navigation, impacting fields such as robotics, autonomous vehicles, and unmanned aerial vehicles. These systems typically rely on the distances of surrounding objects as input. Monocular depth estimation, which infers depth from a single RGB image, therefore plays a crucial role in this context. In this study, we proposed an encoder-decoder model for monocular depth estimation. Additionally, we introduced a weighted loss function designed to minimize depth-image reconstruction error and to penalize distortions in the scene structure of the depth map. The proposed model was evaluated on the NYU Depth V2 dataset, where it surpassed state-of-the-art models. Specifically, it achieved a validation accuracy of 0.9823, an average relative error (rel) of 0.04713, and a root mean square error (RMSE) of 0.2372, representing reductions of 60% in rel and 49% in RMSE compared to contemporary techniques, even with a small training dataset.
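To make the abstract's ideas concrete, the sketch below illustrates, under assumptions, what a weighted loss combining a reconstruction term with a structure-distortion penalty might look like, along with the two reported evaluation metrics (rel and RMSE). The exact loss terms and weights (`lambda_grad`) are hypothetical and not taken from the paper; the metric definitions are the standard ones used for NYU Depth V2 benchmarks.

```python
import numpy as np

def weighted_depth_loss(pred, gt, lambda_grad=1.0):
    """Hypothetical weighted loss (not the paper's exact formulation):
    a pixel-wise reconstruction term plus a gradient-difference term
    that penalizes distortions of the depth map's scene structure."""
    # Reconstruction term: mean absolute depth error.
    l_depth = np.mean(np.abs(pred - gt))
    # Distortion penalty: mismatch between the spatial gradients of the
    # predicted and ground-truth depth maps (edges/structure).
    gy_p, gx_p = np.gradient(pred)
    gy_g, gx_g = np.gradient(gt)
    l_grad = np.mean(np.abs(gy_p - gy_g)) + np.mean(np.abs(gx_p - gx_g))
    return l_depth + lambda_grad * l_grad

def rel_and_rmse(pred, gt):
    """Standard depth-estimation metrics: average relative error (rel)
    and root mean square error (RMSE)."""
    rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    return rel, rmse
```

A perfect prediction yields zero for both the loss and the metrics; a uniform 10% over-prediction yields rel = 0.1, matching the metric's intent of measuring error relative to true depth.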