Publications

UAV-based Real-Time Face Detection using YOLOv7

  • Authority: The 1st International Conference on Smart Mobility and Logistics Ecosystems (SMiLE)
  • Category: Conference Proceeding

YOLOv7 is a powerful deep learning-based object detection model with a novel architecture that balances model complexity against inference time. Compared with other YOLO models, YOLOv7 has a lightweight backbone network called E-ELAN that allows it to learn more efficiently without affecting the gradient path. However, the use of YOLOv7 for face detection in UAV-captured images has not been investigated. UAV-based images present challenges due to variations in view and distance, especially when taken outdoors. A total of 266 images collected by a UAV-based camera were used in this study to evaluate YOLOv7’s performance on this problem. Six YOLOv7-based models were investigated: YOLOv7, YOLOv7-X, YOLOv7-W6, YOLOv7-E6, YOLOv7-D6, and YOLOv7-E6E. In the experiments, 100 images from the WIDER FACE dataset were used for training, while the 266 collected UAV-based images were used for testing. According to the reported results, YOLOv7 produced the best detection accuracy, with an F1 measure of 95%. Furthermore, when tested on a single-GPU machine, YOLOv7 required a short inference time of 3.7 milliseconds per image. The analysis revealed that YOLOv7 outperformed RetinaFace and MTCNN, two of the most popular pre-trained deep face detection models. Nonetheless, YOLOv7 fails to localize faces in low-resolution images, indicating that there is still room to improve recall.
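
For readers unfamiliar with the F1 measure and recall cited in the abstract, the following is a minimal sketch of how these scores are commonly computed from detection counts. The function name `detection_scores` and the counts passed to it are hypothetical and chosen for illustration only; they are not taken from the study, and the matching rule that decides what counts as a true positive is assumed to have been applied beforehand.

```python
def detection_scores(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Return precision, recall, and F1 computed from detection counts."""
    detected = true_positives + false_positives   # all boxes the model predicted
    actual = true_positives + false_negatives     # all annotated faces
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only (not results from the study).
scores = detection_scores(true_positives=90, false_positives=5, false_negatives=10)
```

Missed faces, such as those in low-resolution images, increase the false-negative count, which lowers recall and in turn the F1 measure.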