Unmanned aerial vehicles (UAVs), equipped with advanced sensor technology and cognitive abilities, are effective for object detection and tracking in diverse operations such as search and rescue, surveillance, urban planning, agriculture, and traffic monitoring. Despite their ability to capture a wide range of objects, UAV-captured images present challenges in terms of resolution, object scale variance, highly complex distributions, occlusion, density, and noise. In this work, we propose MVDNet, a novel deep learning-based one-stage detector for robust and fast multi-vehicle detection from aerial images in noisy scenarios. We generated synthetic multinoise datasets by injecting three types of noise (Gaussian noise, salt-and-pepper noise, and speckle noise) into existing datasets and real-time data. The proposed framework incorporates two main components. The first is a noise-resilient feature extraction module based on a multilevel attention mechanism (MLAM), which extracts features from aerial images while coping with various types of noise. The second is an anchor-free detection head that aggregates semantic features for classification and recognition without generating region proposals for objects in the image. In this detection pipeline, the UAV is used only for image capture, while computationally intensive tasks are offloaded to a GPU-based cloud server. We performed comprehensive experiments on two of the most prominent aerial image benchmark datasets, DOTAVehicle (single-modality) and VEDAI (multi-modality), as well as a real-time dataset collected by a UAV using a single camera. Our results demonstrate that MVDNet achieves a mean average precision (mAP) of 81.4% and 58.2% on the respective datasets in noisy environments, with an inference speed of 14.4 milliseconds (ms).
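The synthetic multinoise generation described above can be sketched with NumPy. This is an illustrative assumption, not the authors' actual pipeline: the noise parameters (`gauss_sigma`, `sp_amount`, `speckle_sigma`) and the order of application are hypothetical defaults chosen for demonstration.

```python
import numpy as np

def add_multinoise(image, gauss_sigma=0.05, sp_amount=0.02,
                   speckle_sigma=0.05, seed=None):
    """Corrupt a float image in [0, 1] with the three noise types named in
    the abstract: additive Gaussian, salt-and-pepper, and multiplicative
    speckle. Parameter values are illustrative, not from the paper."""
    rng = np.random.default_rng(seed)
    img = image.astype(np.float64)

    # Additive Gaussian noise: x + N(0, sigma^2)
    img = img + rng.normal(0.0, gauss_sigma, img.shape)

    # Salt-and-pepper noise: force a random fraction of pixels to 0 or 1
    mask = rng.random(img.shape[:2])
    img[mask < sp_amount / 2] = 0.0        # pepper
    img[mask > 1 - sp_amount / 2] = 1.0    # salt

    # Multiplicative speckle noise: x * (1 + N(0, sigma^2))
    img = img * (1.0 + rng.normal(0.0, speckle_sigma, img.shape))

    return np.clip(img, 0.0, 1.0)
```

Applying the function to each image of a clean dataset (with varying parameters per sample) would yield the kind of multinoise training and evaluation data the abstract describes.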
Published in: IEEE Transactions on Vehicular Technology
Volume 74, Issue 11, pp. 16850-16863