Understanding how regulatory elements and genes define cellular identity requires integrative analysis of single-cell transcriptomic and epigenomic landscapes. While recent single-cell technologies enable high-resolution profiling of gene expression and chromatin accessibility, a systematic framework to quantify cell-type specificity across modalities has been lacking. To address this, we developed SPICEY (SPecificity Index for Coding and Epigenetic activitY), an R package that quantifies cell-type specificity from single-cell RNA-seq and ATAC-seq data. SPICEY computes two complementary indices: RETSI (Regulatory Element cell Type Specificity Index) for chromatin accessibility and GETSI (Gene Expression cell Type Specificity Index) for gene expression. By integrating differential and entropy-based measures, SPICEY captures both the magnitude and exclusivity of regulatory and transcriptional activity, enabling quantitative assessment of cell-type-restricted programs.

Applied to human pancreatic islet data, SPICEY identified cell-type-specific enhancers, genes, and regulatory networks enriched in endocrine cell populations, including β cells, revealing the regulatory logic that underlies tissue specialization. When an annotation linking accessible sites to genes is provided, such as co-accessibility links, SPICEY provides a unified framework to dissect cell-type-specific regulatory mechanisms from multi-omic data. Our work delivers a comprehensive pipeline to compute and visualize transcriptional and regulatory specificity, providing a resource to interpret noncoding and coding variation in a cell-type-resolved context.

This entry contains all the code, datasets, and intermediate files used to perform the analyses and generate the figures from the publication.

Datasets

1. HPAP

Single-cell data from HPAP control donors (joint snATAC-seq and scRNA-seq) used to compute SPICEY scores (scHPAP.rds). This dataset represents islet of Langerhans tissue from human samples contributed by HPAP.
It contains 9,224 cells profiled using 10x Genomics single-cell sequencing.

Differential analysis results (DA_RNA_HPAP.rds, DA_ATAC_HPAP.rds) identifying cell-type-specific gene expression and chromatin accessibility using the Wilcoxon test and Seurat’s FindMarkers function.

Co-accessibility links (HPAP_CICERO_LINKS.rds) computed from snATAC-seq data using Cicero, to link regulatory elements to target genes.

SPICEY output object (HPAP_SPICEY_COACC.rds) containing integrated RETSI and GETSI results with co-accessibility annotations.

2. PBMCs

Single-cell multiome data from human PBMCs (snATAC-seq and snRNA-seq from the same cell) used to compute SPICEY scores (mo_PBMCs.rds). This dataset represents peripheral blood mononuclear cells from human samples, distributed via the pbmcMultiome.SeuratData repository and originally generated by 10x Genomics. It contains 10,970 cells profiled using 10x Genomics single-cell sequencing.

Differential analysis results (DA_RNA_PBMC.rds, DA_ATAC_PBMC.rds) identifying cell-type-specific gene expression and chromatin accessibility using the Wilcoxon test and Seurat’s FindMarkers function.

Co-accessibility links (PBMC_CICERO_LINKS.rds) computed from snATAC-seq data using Cicero, to link regulatory elements to target genes.

SPICEY output object (PBMC_SPICEY_COACC.rds) containing integrated RETSI and GETSI results with co-accessibility annotations.

3. Reference gene sets

Reference gene sets, including tissue-specific and ubiquitous hallmark gene lists (HPAP_TISSUE_SPZ_GENESET.csv, PBMC_TISSUE_SPZ_GENESET.csv, UBIQUITOUS_GENESET.csv) derived from the literature.

Source code

The repository also provides R scripts to reproduce all key computational steps:

RUN_COACC.R – computes co-accessibility networks.
RUN_DIFF_ANALYSIS.R – performs differential expression and accessibility analyses.
RUN_SPICEY.R – computes SPICEY (RETSI/GETSI) metrics and integrates multi-omic data.
SPICEY_PAPER.Rmd – generates all analyses and figures presented in the paper.

Citations and acknowledgments

This repository used data acquired from the database (https://hpap.pmacs.upenn.edu/) of the Human Pancreas Analysis Program (HPAP; RRID:SCR_016202; PMID: 31127054; PMID: 36206763). HPAP is part of a Human Islet Research Network (RRID:SCR_014393) consortium (UC4-DK112217, U01-DK123594, UC4-DK112232, and U01-DK123716).
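
Illustrative sketch

The description above states that SPICEY integrates differential and entropy-based measures. To make only the entropy-based idea concrete, the following minimal Python sketch computes a generic specificity score: one minus the normalized Shannon entropy of a feature's activity across cell types. This is not SPICEY's implementation and not the RETSI or GETSI formula. The function name `entropy_specificity`, the cell-type labels, and the numeric values are hypothetical and chosen purely for illustration.

```python
import math


def entropy_specificity(activity):
    """Return 1 - normalized Shannon entropy of activity across cell types.

    `activity` maps a cell-type label to a non-negative mean activity
    (e.g. expression or accessibility). The score is 1.0 when activity is
    confined to a single cell type and 0.0 when it is perfectly uniform.
    NOTE: a generic textbook-style measure, assumed here for illustration;
    it is not the formula used by SPICEY.
    """
    values = list(activity.values())
    if any(v < 0 for v in values):
        raise ValueError("activity values must be non-negative")
    total = sum(values)
    n = len(values)
    if total == 0 or n < 2:
        # Undefined distribution or a single cell type: report no specificity.
        return 0.0
    probs = [v / total for v in values]
    # Terms with p == 0 contribute nothing to the entropy.
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log2(n)


# Hypothetical toy values, not taken from the HPAP or PBMC datasets.
restricted = {"alpha": 0, "beta": 10, "delta": 0, "acinar": 0}
uniform = {"alpha": 5, "beta": 5, "delta": 5, "acinar": 5}
skewed = {"alpha": 1, "beta": 8, "delta": 1, "acinar": 0}

for name, profile in [("restricted", restricted),
                      ("uniform", uniform),
                      ("skewed", skewed)]:
    print(name, round(entropy_specificity(profile), 3))
```

In this sketch the restricted profile scores 1.0 and the uniform profile scores 0.0, while the skewed profile falls in between. An entropy term of this kind captures exclusivity only. It carries no information about effect size, which is presumably why the description pairs it with a differential measure.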