We performed single-cell weighted gene co-expression network analysis (scWGCNA) using the hdWGCNA package (v0.2.5)20. The merged single-cell RNA-seq object was initialized with genes expressed in at least 5% of cells. Metacells were constructed by grouping cells within each annotated subcluster using k-nearest neighbors (k = 10) with a maximum shared-cell limit of 10. Following metacell normalization, we generated the expression matrix for network construction.
The soft power-threshold was determined through scale-free topology analysis (signed network type), selecting the lowest power that achieved a scale-free fit index > 0.8. The gene co-expression network was constructed using this optimized power value. Module eigengenes were computed with batch correction for sample origin. We assessed module connectivity by calculating kME values within each subcluster and assigned modules with systematic identifiers.
Hub genes were defined as the top 25 genes ranked by kME within each module. Module expression scores were computed using both Seurat and UCell algorithms, based on the top 25 genes per module21. Finally, we visualized module eigengene patterns using feature plotting and exported all results for downstream analysis.
Technical Validation
Based on data obtained from the GEO platform, we collected transcriptomic data from adult mouse cortical tissues and a mixed dataset from human, chimpanzee, bonobo, and macaque cortical tissues. Additionally, we analyzed publicly available scATAC-seq data from the cortex. The reliability of the transcriptomic data was supported by standard quality metrics including nFeature_RNA, nCount_RNA, mitochondrial gene percentage (percent.mt), and hemoglobin gene percentage (HB_percent), while the scATAC-seq data quality was confirmed by high fragment counts, transcription start site (TSS) enrichment, and the fraction of reads in peaks (FRiP) (Fig. 1A-C). According to the different sequencing platforms and experimental methods, a total of 11 libraries were analyzed and annotated 22,23,23–31. Detailed information on the cell type marker genes used for all transcriptomic analyses in this study is provided in Fig. 1D-E.
A total of 173,081 high-quality RNA-seq cells were obtained from adult mouse brain tissue, and a total of 8 cell types were annotated. In addition, cell cycle scoring analysis indicated that cell clustering was not driven by cell cycle effects32, and cell density visualization further confirmed that no single cluster was disproportionately influenced by variations in cell number (Fig. 2A). We applied Harmony to remove batch effects arising from differences in sequencing depth. Through cross-dataset non-negative least squares regression analysis, we demonstrated that all annotated cell types exhibit highly consistent similarity across different technological platforms, confirming the robustness of our cell type annotations (Fig. 2B).
To further explore the function of these cells under different sequencing methods, we used WGCNA to identify the functional hub genes of 8 cells. A total of 33 co-expression modules were identified: 7 in drop-sn, 8 in drop-sc, 6 in 10x-sc, and 12 in 10x-sn. We subsequently performed functional enrichment analysis on each gene co-expression module to systematically identify the dominant functional characteristics of different cell types (Fig. 3A). Modules that did not yield any meaningful functional enrichment results were excluded from subsequent analyses. For each module, we calculated the intramodular connectivity of every gene, selected the top 25 genes with the highest connectivity as hub genes, and subsequently determined the correlation between module gene expression levels and cell types (Fig. 3B). Specifically, the results revealed that both glial cells and neuronal populations play significant roles in synaptic-related functions and metabolic support pathways, while exhibiting distinct functional specializations: neuronal functions primarily focus on synaptic signaling and neuromodulation, whereas glial cells mainly govern the assembly, maintenance, and fine-tuned regulation of synaptic structures. Additionally, to investigate the core functional genes within co-expression modules, we extracted the top 40 genes from each module and further partitioned them using the Louvain community detection algorithm. The results indicated that core genes identified across four datasets predominantly involved protein synthesis (Rpl23a, Rpl35, Rps23), energy metabolism (Aldoa, Slc25a4, Ftl1)33, synaptic signaling (Gria4, Dlg2)34, and glial support functions (Slc1a3, Apoe, Mbp) (Fig. 3C-F). Overall, these findings collectively reveal the key regulatory mechanisms of neurons and glial cells.
To gain deeper insights into the cellular interaction network within the cortex, we performed cell-cell communication analysis on the datasets using CellChat. The results revealed significant and strong interactions among excitatory neurons, astrocytes, and oligodendrocyte precursor cells (OPCs), except in the dataset in which OPCs were not detected (Fig. 4A). Further pattern recognition analysis identified that the core signaling pathways shared across the four datasets were primarily enriched in key biological processes, including cell adhesion, neuron-glia interactions, synaptogenesis, and axon guidance (Fig. 4B-C). Notably, we found that the communication among these three cell types is predominantly mediated by the NRXN and PTN signaling pathways. Neurexin (NRXN), primarily localized at the neuronal presynaptic membrane, functions by trans-synaptically organizing and stabilizing both excitatory and inhibitory synapses through interactions with postsynaptic ligands such as Neuroligin, serving as a central molecule in synapse formation and specificity regulation35. Pleiotrophin (PTN), a signaling molecule secreted by astrocytes, binds to its receptors (e.g., Syndecan-3, RPTPβ/Z) and concurrently promotes neuronal survival, dendritic spine formation, and synapse maturation, while also regulating the migration and differentiation of OPCs36. Therefore, investigating the NRXN and PTN signaling pathways is likely to provide valuable insights into cortical cellular interactions. Furthermore, we discovered that signaling pathways related to cell adhesion molecules—including NEGR, CADM, and CDH—serve as essential communication mediators between astrocytes and excitatory neurons37. These molecules form specific trans-synaptic adhesion complexes that directly facilitate the physical envelopment and spatial positioning of synaptic structures by astrocytes. This finding highlights the critical role of intercellular physical contact in neuron-glia interactions and provides a new molecular theoretical foundation for understanding nervous system development and synaptic plasticity (Fig. 4D).
To systematically evaluate the performance of transcriptomic datasets in scATAC-seq data integration, we collected 10,055 high-quality cortical scATAC-seq cells for cross-modal integration with RNA-seq data (Fig. 5A)38. Our integration consistency assessment revealed that 4,139 cells (41.1%) obtained consistent cell-type annotations across at least three datasets, indicating that most cells achieve stable annotations across different reference datasets. Further analysis demonstrated that excitatory and inhibitory neurons were prone to mutual misclassification during cross-dataset integration. Surprisingly, despite their substantial heterogeneity, astrocytes exhibited remarkable stability, suggesting that astrocytes may maintain more conserved regulatory patterns at the chromatin accessibility level (Fig. 5B)39.
To delineate the regulatory architecture underlying cell type-specific chromatin landscapes in cortex, we extracted 4139 cells with consistent cell type annotations in at least three datasets for subsequent analysis(Fig. 5C).We identified 6 major cell types, namely, Astrocytes, OPCs, Excitatory neurons, Inhibitory neurons, Oligodendrocytes, and Microglial cells, each exhibiting distinct chromatin accessibility patterns at established marker loci (Fig. 5D). Subsequently, we systematically constructed chromatin accessibility networks using Cicero co-accessibility analysis. Network topology analysis based on betweenness centrality (threshold: >80th percentile) identified 18 genes as topologically central hubs (Fig. 5E–F). Among them, Cacna1b encodes the pore-forming subunit of neuronal N-type voltage-gated calcium channels, directly mediating presynaptic calcium influx and triggering neurotransmitter release40. The axonal guidance molecule Ntng2 (Netrin-G2) orchestrates excitatory synaptogenesis via mechanisms involving specific cell adhesion41. Notably, we discovered the central hub positions of metabolism-related genes (Rapgef1, Ass1), which suggests the intrinsic metabolic state of a cell may serves as a key driver in reshaping the chromatin accessibility landscape42.
To investigate the performance of the four datasets in cross-species analysis, we collected a publicly available single-nucleus transcriptomic dataset of mixed cortical tissues from humans, chimpanzees, bonobos, and macaques (n = 29,353). Due to variations in sequencing depth, only four major cortical cell types were reliably detected (astrocytes, oligodendrocytes, inhibitory neurons, and excitatory neurons) (Fig. 6A–B). We then performed a correlation analysis based on the average expression of conserved highly variable genes across species within each cluster. The results demonstrated highly consistent patterns among the four datasets across all cell types (similarity > 0.7). Furthermore, we analyzed the evolutionary rate of gene expression across cell types, which suggested that astrocytes exhibit lower interspecies divergence compared with other cell types, potentially reflecting their highly conserved functional architecture (Fig. 6C)43.
To capture subtle species-specific differences in gene expression, we further analyzed significantly upregulated differentially expressed genes (DEGs) within each cluster across species. Surprisingly, astrocytes showed fewer overlapping DEGs (Fig. 6D). Further principal component analysis of shared gene expression patterns revealed that oligodendrocytes, inhibitory neurons, and excitatory neurons displayed similar cross-species expression profiles, whereas astrocytes exhibited highly species-specific expression patterns (Fig. 6E). Specifically, in addition to canonical astrocyte markers such as AQP4, SLC1A3, and GFAP, astrocytes were enriched for genes involved in core functional pathways44. These include genes regulating immune response and signaling (e.g., PON2, ID3)45, lipid metabolism and transport (e.g., ETNPPL, CLU, FADS2, APOE)46, as well as specialized molecular transport (e.g., SLC39A12, SLCO1C1)47. In contrast, although oligodendrocytes are also glial cells, their core myelin structural components (MOG, PLP1, MAG, MBP)48, signaling-related factors (LPAR1, EFNB3, S1PR5)49, and cytoskeletal regulatory proteins (ERMN, CRYAB, GSN) exhibit conserved expression patterns across species that are highly consistent with neurons. This finding suggests that oligodendrocytes, as glial cells closely coupled with neuronal function, may have undergone evolutionary conservation pressure similar to those of neurons for their core functional modules, thereby forming a coordinated and unified cross-species expression pattern.
This study constructed multiple transcriptomic atlases of mouse cortical cells across different sequencing platforms through systematic technical validation. We rigorously evaluated data reliability and uncovered core gene modules, key signaling pathways, and chromatin regulatory networks that coordinate between neurons and glial cells to maintain cortical function. In addition, through cross-omics, and cross-species analyses, we confirmed the robustness of data integration across technologies. In summary, we provide a high-quality data resource and analytical framework for advancing the exploration of cellular heterogeneity, molecular regulatory mechanisms, and evolutionary conservation in the cerebral cortex.