Machine learning techniques have become fundamental to modern image classification and retrieval systems, enabling the extraction of meaningful and discriminative visual features from large image collections. In this research, we propose an efficient framework for generating image descriptors using deep Convolutional Neural Network (CNN) architectures, specifically GoogLeNet, Inception V3, and DenseNet-201. These pre-trained models capture multi-level visual characteristics, including fine texture patterns, shape cues, and high-level object semantics. To further strengthen the descriptive power of the extracted features, information from the three color channels is encoded and integrated, improving retrieval accuracy while preserving computational efficiency and response time. As images propagate through the hierarchical layers of the CNNs, increasingly abstract feature maps are produced, forming distinctive feature signatures that represent the visual content. These signatures are then reorganized into a newly constructed feature matrix designed to encode spatial relationships, chromatic properties, and latent structural patterns within the image. This enriched representation provides a more holistic description of image content, making it well suited to content-based image retrieval (CBIR) and visual similarity analysis. To validate the effectiveness and generalization capability of the proposed method, experiments were conducted on four widely recognized benchmark datasets: Corel-1K, CIFAR-10, 17-Flowers, and ZuBuD. The evaluation considered both retrieval precision and computational performance across datasets with varying resolutions, object categories, and scene complexities. Experimental results indicate that all three CNN architectures produce discriminative descriptors; however, DenseNet-201 consistently delivered superior performance on the CIFAR-10 dataset, which includes diverse object classes and varying image scales. Its densely connected architecture promotes feature reuse and improved gradient flow, yielding higher classification accuracy and more robust retrieval than GoogLeNet and Inception V3. Overall, the proposed CNN-based descriptor generation framework demonstrates strong potential for scalable and accurate image retrieval in multimedia databases and intelligent visual search systems.
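The retrieval stage sketched in the abstract — encoding per-channel information into a descriptor and ranking database images by similarity to a query — can be illustrated with a minimal example. This is a conceptual sketch only, not the paper's implementation: the deep CNN feature signatures are replaced by simple per-channel intensity histograms as a hypothetical stand-in, and cosine similarity serves as the ranking measure.

```python
import numpy as np

def channel_descriptor(image, bins=16):
    """Concatenate per-channel histograms (a stand-in for CNN feature signatures).

    Encodes each of the three color channels separately, then integrates
    them into a single descriptor vector, mirroring the idea of combining
    chromatic information from all channels.
    """
    feats = []
    for c in range(3):
        hist, _ = np.histogram(image[..., c], bins=bins,
                               range=(0, 256), density=True)
        feats.append(hist)
    return np.concatenate(feats)

def retrieve(query, database, top_k=3):
    """Rank database images by cosine similarity to the query descriptor."""
    q = channel_descriptor(query)
    descs = np.stack([channel_descriptor(img) for img in database])
    sims = descs @ q / (np.linalg.norm(descs, axis=1)
                        * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:top_k]  # indices of the best matches

# Tiny synthetic "database" of random RGB images for demonstration.
rng = np.random.default_rng(0)
db = [rng.integers(0, 256, (32, 32, 3)) for _ in range(10)]
ranking = retrieve(db[4], db)  # querying with a known database image
print(ranking[0])              # the exact match ranks first
```

In the actual framework, `channel_descriptor` would be replaced by feature maps drawn from the hierarchical layers of GoogLeNet, Inception V3, or DenseNet-201, but the similarity-ranking step has the same shape.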
Published in: Kashf Journal of Multidisciplinary Research
Volume 3, Issue 02, pp. 109-139
DOI: 10.71146/kjmr840