Knowledge Distillation with Predicted Depth for Robust and Lightweight Face Presentation Attack Detection
- Authority: Knowledge-Based Systems
- Category: Journal Publication
Face Presentation Attack Detection (FacePAD) is critical for safeguarding face recognition systems against spoofing attempts, including printed photos, video replays, and 3D masks. However, many existing approaches struggle with generalization across diverse attack types and real-world conditions. In this study, we propose a dual-branch deep learning framework that leverages both RGB images and synthetically predicted depth maps to improve anti-spoofing robustness and accuracy. A monocular depth estimation network is used to generate depth cues from a single RGB image, which are then processed in parallel with the original image through two distinct branches of a convolutional neural network. The extracted features-texture-based from RGB and structure-aware from depth-are fused via concatenation to facilitate more discriminative spoof detection. Extensive experiments on four benchmark datasets demonstrate that our method achieves state-of-the-art performance, reducing HTER to 0 % on Replay-Attack and Replay-Mobile, and 1.023 % on ROSE-Youtu. Similarly, an ACER of 0.56 % is achieved on OULU-NPU, while maintaining computational efficiency. Furthermore, we introduce a knowledge distillation scheme to compress the dual-branch model into a lightweight single-branch variant suitable for real-time deployment in mobile authentication, surveillance, and biometric access control scenarios.