Machine learning techniques have become fundamental to modern image classification and retrieval systems, enabling the extraction of meaningful and discriminative visual features from large image collections. In this research, we propose an efficient framework for generating image descriptors using deep Convolutional Neural Network (CNN) architectures, specifically GoogLeNet, Inception V3, and DenseNet-201. These pre-trained models capture multi-level visual characteristics, including fine texture patterns, shape cues, and high-level object semantics. To further strengthen the descriptive power of the extracted features, information from the three color channels is encoded and integrated, improving retrieval accuracy while preserving computational efficiency and response time. As images propagate through the hierarchical layers of the CNNs, increasingly abstract feature maps are produced, forming distinctive feature signatures that represent the visual content. These signatures are then reorganized into a newly constructed feature matrix designed to encode spatial relationships, chromatic properties, and latent structural patterns within the image. This enriched representation provides a more holistic description of image content, making it well suited to content-based image retrieval (CBIR) and visual similarity analysis. To validate the effectiveness and generalization capability of the proposed method, experiments were conducted on four widely recognized benchmark datasets: Corel-1K, CIFAR-10, 17-Flowers, and ZuBuD. The evaluation considered both retrieval precision and computational performance across datasets with varying resolutions, object categories, and scene complexities. Experimental results indicate that all three CNN architectures produce discriminative descriptors; however, DenseNet-201 consistently delivered superior performance on the CIFAR-10 dataset, which includes diverse object classes and varying image scales. Its densely connected architecture promotes feature reuse and improved gradient flow, yielding higher classification accuracy and more robust retrieval than GoogLeNet and Inception V3. Overall, the proposed CNN-based descriptor generation framework demonstrates strong potential for scalable and accurate image retrieval in multimedia databases and intelligent visual search systems.
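The retrieval stage sketched in the abstract — encoding per-channel information into a descriptor and ranking database images by similarity to a query — can be illustrated with a minimal example. This is a conceptual sketch only, not the paper's implementation: the deep CNN feature signatures are replaced by simple per-channel intensity histograms as a hypothetical stand-in, and cosine similarity serves as the ranking measure.

```python
import numpy as np

def channel_descriptor(image, bins=16):
    """Concatenate per-channel histograms (a stand-in for CNN feature signatures).

    Encodes each of the three color channels separately, then integrates
    them into a single descriptor vector, mirroring the idea of combining
    chromatic information from all channels.
    """
    feats = []
    for c in range(3):
        hist, _ = np.histogram(image[..., c], bins=bins,
                               range=(0, 256), density=True)
        feats.append(hist)
    return np.concatenate(feats)

def retrieve(query, database, top_k=3):
    """Rank database images by cosine similarity to the query descriptor."""
    q = channel_descriptor(query)
    descs = np.stack([channel_descriptor(img) for img in database])
    sims = descs @ q / (np.linalg.norm(descs, axis=1)
                        * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:top_k]  # indices of the best matches

# Tiny synthetic "database" of random RGB images for demonstration.
rng = np.random.default_rng(0)
db = [rng.integers(0, 256, (32, 32, 3)) for _ in range(10)]
ranking = retrieve(db[4], db)  # querying with a known database image
print(ranking[0])              # the exact match ranks first
```

In the actual framework, `channel_descriptor` would be replaced by feature maps drawn from the hierarchical layers of GoogLeNet, Inception V3, or DenseNet-201, but the similarity-ranking step has the same shape.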
Published in: Kashf Journal of Multidisciplinary Research
Volume 3, Issue 02, pp. 109-139
DOI: 10.71146/kjmr840