AndroMetric is a large-scale dataset designed to support empirical research at the intersection of software metrics and Android application security. The dataset contains 39,243 Android applications, including 29,208 malware samples and 10,035 benign apps, collected from publicly available repositories and analyzed using a static analysis pipeline.

For each APK, 31 static code metrics are extracted to capture multi-dimensional structural and behavioral characteristics. These metrics span six categories: (i) size and dimensional structure, (ii) object-oriented design properties, (iii) control-flow complexity, (iv) sensitive operation usage, (v) Android Intent patterns, and (vi) exception handling. In addition to the metrics, the dataset includes relevant metadata to facilitate large-scale analysis.

AndroMetric enables research on Android malware detection, comparative analysis of benign and malicious codebases, and security-aware software quality assessment. The dataset is intended for researchers and practitioners in mobile security, software engineering, and applied machine learning.

If you use this dataset, please cite the associated paper describing the AndroMetric dataset:

@inproceedings{AndroMetric,
  author={Sebastian Siedler and Karim Elish},
  booktitle={23rd IEEE/ACM International Conference on Mining Software Repositories (MSR)},
  title={AndroMetric: Bridging Multi-Dimensional Software Metrics and Mobile Application Security},
  year={2026}
}

App and Malware Access Information:

The apps and malware samples referenced in this dataset are real-world Android applications. To adhere to research ethics guidelines and comply with data sharing and redistribution policies, we do not directly distribute APK files. Instead, researchers are required to obtain the applications through the official AndroZoo repository and/or Drebin.
AndroZoo: https://androzoo.uni.lu/
Drebin: https://drebin.mlsec.org/

Usage Instructions:

1. Request access to the repository by following the instructions provided on the official website.
2. Use the cryptographic hash values (e.g., SHA-256) included in our dataset to retrieve the corresponding application samples from the repository.
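
The retrieval step can be sketched in Python as follows. This is a minimal illustration, not part of the dataset's tooling. It assumes the AndroZoo download endpoint (https://androzoo.uni.lu/api/download with `apikey` and `sha256` query parameters) as described in AndroZoo's public documentation; check the current documentation before relying on it. The function names are hypothetical, and the API key is a placeholder you receive after your access request is approved. Drebin samples are obtained separately through its own request process.

```python
import hashlib
import re
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlencode

# Assumption: AndroZoo's download endpoint; verify against its current docs.
ANDROZOO_DOWNLOAD = "https://androzoo.uni.lu/api/download"
SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")


def normalize_sha256(value: str) -> str:
    """Strip whitespace and check that the value is a 64-digit hex string."""
    digest = value.strip()
    if not SHA256_RE.fullmatch(digest):
        raise ValueError(f"not a SHA-256 hex digest: {value!r}")
    return digest


def build_download_url(sha256: str, api_key: str) -> str:
    """Build the request URL for one sample (hypothetical helper)."""
    query = urlencode({"apikey": api_key, "sha256": normalize_sha256(sha256)})
    return f"{ANDROZOO_DOWNLOAD}?{query}"


def sha256_of_file(path) -> str:
    """Hash a file in 1 MiB chunks so large APKs are not loaded into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def download_apk(sha256: str, api_key: str, dest_dir) -> Path:
    """Download one APK and confirm its hash matches the requested one."""
    digest = normalize_sha256(sha256)
    dest = Path(dest_dir) / f"{digest}.apk"
    url = build_download_url(digest, api_key)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    if sha256_of_file(dest).lower() != digest.lower():
        dest.unlink()  # discard a corrupted or mismatched download
        raise ValueError(f"hash mismatch for {digest}")
    return dest
```

Calling `download_apk(h, "YOUR_API_KEY", "apks/")` for each hash `h` listed in the dataset would fetch the matching samples into an existing `apks/` directory. Verifying the hash after download guards against truncated transfers and confirms that the file on disk is the sample the metrics were computed from.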