Genome-resolved metagenomics and culturomics reveal a fiber-degrading gut microbiome in Dahuabai pigs with culture-validated cellulase activity
Boxuan Yang1, †, Yanhua Han1, †, Jianbo Yang1, Zhijian Xu1, Bo Song1, Xiaofan Wang2, Ning Liu3, Hui Jiang1, Jianmin Chai1, Feilong Deng1,*, Ying Li1,*, and Jiangchao Zhao2,*
1Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Animal Science and Technology, Foshan University, Foshan, 528225, China
2College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
3Biology major, Kenneth P. Dietrich School of Arts and Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
*To whom correspondence should be addressed:
Prof. Jiangchao Zhao (jzhao77@scau.edu.cn), Prof. Ying Li (yingli@fosu.edu.cn), and Dr. Feilong Deng (fdeng@fosu.edu.cn)
Abstract
Background
Local Chinese pig breeds such as Dahuabai (DHB) are noted for fiber tolerance. We compared the gut microbiota and fiber-degrading functions of DHB and Duroc pigs and built a genome-resolved resource to characterize cellulose/hemicellulose degradation.
Results
We first profiled fecal metagenomes from Dahuabai (DHB) pigs (n = 30) and Duroc pigs (n = 30). DHB harbored higher relative abundances of Methanobrevibacter (14.31% vs. 3.03%, Wilcoxon p = 0.032) and Lactobacillus (17.43% vs. 1.73%, p = 0.007); at the species level, Methanobrevibacter smithii and Lactobacillus amylovorus were dominant in DHB. Functionally, DHB microbiomes contained more CAZymes overall (147,829 vs. 63,825) and were enriched for cellulose-degrading (GH1, GH5, GH6, GH7, GH9, GH12, GH45) and hemicellulose-degrading (GH10, GH26, GH39, GH42, GH43, GH51) families (p < 0.05). Guided by these inter-breed differences, we constructed a genome-resolved resource focused on DHB by integrating Illumina and Oxford Nanopore Technologies (ONT) metagenomes with whole-genome sequencing (WGS) of cultured isolates. This dataset comprised 888 genomes (382 Illumina Metagenome-Assembled Genomes (MAGs), 489 ONT MAGs, and 17 isolate genomes), of which 449 met high-quality criteria. We then predicted cellulose-degrading capacity using the presence of endoglucanases, exoglucanases, and β-glucosidases as criteria. In total, 258 genomes showed potential for cellulose degradation, and 129 were classified as high potential. Finally, we evaluated cultured representatives in vitro. Primary Congo red screening identified 14 fiber-degrading isolates, and liquid assays detected carboxymethylcellulase (CMCase) activity in four strains: Bacillus velezensis D7-1 (136.82 U/mL), Bacillus subtilis D6-1 (24.88 U/mL), Bacillus safensis X6-1 (7.83 U/mL), and Bacillus_A cereus Y9-1 (2.76 U/mL), highlighting Bacillus spp. as cultured hosts with measurable cellulolytic activity.
Conclusions
The gut microbiota of DHB and Duroc pigs differ in composition and function. DHB pigs have developed a distinctive intestinal microbial community over long-term domestication, and their tolerance to coarse feed is higher than that of commercial pig breeds. However, more extensive research is needed on the application potential of fiber-degrading bacteria in actual production.
Introduction
The Dahuabai (DHB) pig, a resilient indigenous breed from southern China, is notably adapted to high-fiber, coarse diets and challenging rearing conditions, traits refined through natural selection1. This tolerance, which commercial lean breeds lack, highlights DHB's value as both a crucial genetic resource and a compelling model for studying microbial contributions to nutrient extraction from roughage2,3. Because pigs, as non-ruminants, lack endogenous cellulolytic enzymes, they depend largely on their hindgut microbiota to ferment dietary fibers (e.g., cellulose, hemicellulose) into absorbable metabolites such as short-chain fatty acids (SCFAs)4,5. Fiber fermentation not only provides SCFAs (such as acetate, propionate, and butyrate) that contribute 5–30% of a pig’s maintenance energy requirements6,7, but also increases hindgut mass and modulates the gut environment (e.g., pH) to favor fiber-adapted microbes8. Notably, the interactions between dietary fiber and gut microbiota are bidirectional: the composition of the microbiota determines the host’s capacity to utilize fiber, and conversely, fiber fermentation end-products can stimulate beneficial cellulolytic bacteria while suppressing pathogenic taxa9,10. Consequently, unraveling the DHB pig's unique gut ecosystem holds substantial promise for revealing microbial strategies to enhance fiber degradation efficiency and overall gut health in swine production.
Conventional Illumina-based metagenomics faces critical limitations in resolving complex gut microbiomes. Short reads inherently constrain taxonomic resolution and genome assembly continuity, yielding fragmented MAGs that often miss repetitive regions and polysaccharide utilization loci essential for fiber degradation11,12. Functional predictions from incomplete MAGs or 16S-derived inferences (e.g., PICRUSt) risk inaccuracies in pathway reconstruction and gene attribution13, hindering precise identification of fiber-degrading taxa and enzymes. Long-read metagenomics (e.g., ONT) overcomes these barriers by generating contiguous reads spanning thousands of bases, enabling near-complete MAG recovery even in complex communities14. ONT assemblies dramatically reduce fragmentation (~50-fold N50 improvement) and accurately reconstruct polysaccharide-degrading gene clusters and mobile genetic elements missed by short reads15. When integrated with culturomics, which expands recoverable microbial diversity through advanced cultivation16, ONT provides high-resolution genomic blueprints while isolates facilitate functional validation. This combined approach makes it possible to characterize uncultured fiber-degrading "dark matter" in the porcine gut ecosystem.
Here, we compared the gut microbiomes of DHB and Duroc pigs using shotgun metagenomics, revealing breed-specific differences in community composition, diversity, and CAZymes. We then constructed a genome-resolved resource for DHB by integrating Illumina- and Oxford Nanopore–derived MAGs with whole-genome sequences of cultured isolates. Using the presence of endoglucanases, exoglucanases, and β-glucosidases as criteria, we predicted candidate cellulolytic taxa and prioritized them for testing. Finally, we validated cellulase activity in cultured representatives using Congo red screening and carboxymethylcellulase (CMCase) assays.
This integrated methodology advances mechanistic understanding of the microbial basis of roughage resilience in DHB pigs and guides strategy development to optimize fiber-use efficiency in sustainable swine systems.
Methods
Sample Collection
A total of 60 fecal samples were collected in this study, 30 from DHB pigs and 30 from Duroc pigs. The pigs were selected from a commercial pig farm in Shaoguan, Guangdong, China. The fecal samples were rapidly frozen in liquid nitrogen and then transferred to a -80°C freezer for long-term storage.
Isolation, Cultivation, and Identification of Bacteria
Fecal samples from DHB were cultured under both anaerobic and aerobic conditions. First, the fecal samples were thoroughly mixed with PBS buffer to prepare a uniform bacterial suspension. Subsequently, the supernatant of the bacterial suspension was serially diluted to 10⁻³, 10⁻⁴, and 10⁻⁵. A 100 µL aliquot of each dilution was then evenly spread onto 15 different culture media (Supplementary Table S1).
For anaerobic conditions, the inoculated culture media were placed in an anaerobic chamber with a gas mixture of 85% N₂, 5% CO₂, and 10% H₂, and incubated at 37°C. After culturing for 24 to 48 hours, individual colonies that appeared on the agar plates were selected. These single colonies were then inoculated onto corresponding agar plates for further cultivation until the colonies fully developed. Finally, the mature colonies were identified using Matrix-assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) and full-length 16S rRNA gene Sanger sequencing.
The purified strains were inoculated into the corresponding liquid culture medium and cultured at 37°C. After 12–24 hours, the cultures were removed and centrifuged at 4°C and 12,000 r/min for 5 minutes. The supernatant was discarded, and the bacterial pellet was retained for high-quality DNA extraction. Whole-genome sequencing was performed by Novogene Bioinformatics Technology Co., Ltd. (Beijing, China).
Library Construction, Quality Control and Sequencing
A total of 0.2 µg of DNA per sample was used as input material for DNA library preparation. The sequencing library was generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA, Catalog #: E7370L) according to the manufacturer's instructions, with index codes added to each sample. In brief, the genomic DNA samples were fragmented to a size of 350 bp by sonication. The DNA fragments were then end-polished, A-tailed, and ligated to full-length adapters for Illumina sequencing, followed by PCR amplification. The PCR products were purified using the AMPure XP system (Beverly, USA). Subsequently, library quality was assessed on the Agilent 5400 system (Agilent, USA), and the library was quantified using qPCR (1.5 nM). Based on the effective library concentration and the required data volume, the qualified libraries were pooled and sequenced on the Illumina platform at Beijing Biomarker Technologies Co., Ltd. using the PE150 strategy.
Metagenomic Assembly and Binning
We conducted metagenomic sequencing on 60 fecal samples from two pig breeds. Three low-quality samples from DHB were excluded, leaving 57 samples for subsequent analysis. Raw reads were quality-controlled with Trimmomatic (v0.39)17 using the following parameters: -threads 30 LEADING:3 TRAILING:3 SLIDINGWINDOW:5:20 MINLEN:60. Bowtie2 (v2.5.1)18 was then used to map all trimmed reads to the pig reference genome (Sscrofa11.1, GCF_000003025.6_Sscrofa11.1) to remove reads that may contain host sequences. Seqkit (v2.6.1)18 was used to generate quality reports for the metagenomic reads of each sample. After these quality control steps, the clean paired-end reads were used for subsequent analyses. Kraken2 (v2.1.3)19 was used to classify the clean paired-end reads against the standard database, and Bracken (v2.9)20 was used to estimate the relative abundance of microbial taxa at different taxonomic levels using a Bayesian model.
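The read-processing steps described above can be sketched as a small command builder. This is only an illustration under stated assumptions: the `trimmomatic` launcher name, all file names, the index and database paths, and the Bracken read length and taxonomic level (`-r 150 -l G`) are hypothetical and not taken from the study; only the Trimmomatic trimming steps and thread count come from the text. The function returns argument lists and runs nothing.

```python
from pathlib import Path


def build_qc_commands(sample, r1, r2, host_index, kraken_db, outdir, threads=30):
    """Build (but do not run) read-QC and profiling commands for one sample.

    File names, paths, and the Bracken settings are illustrative assumptions.
    """
    out = Path(outdir)
    p1, u1 = out / f"{sample}_1P.fq.gz", out / f"{sample}_1U.fq.gz"
    p2, u2 = out / f"{sample}_2P.fq.gz", out / f"{sample}_2U.fq.gz"
    report = out / f"{sample}.kreport"

    # Trimming steps as reported in the Methods.
    trim = ["trimmomatic", "PE", "-threads", str(threads),
            str(r1), str(r2), str(p1), str(u1), str(p2), str(u2),
            "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:5:20", "MINLEN:60"]

    # Pairs that do not align concordantly to the host genome are written out
    # as non-host reads; bowtie2 replaces '%' with the mate number (1 or 2).
    host = ["bowtie2", "-p", str(threads), "-x", str(host_index),
            "-1", str(p1), "-2", str(p2),
            "--un-conc-gz", str(out / f"{sample}_clean_%.fq.gz"),
            "-S", "/dev/null"]

    clean1 = out / f"{sample}_clean_1.fq.gz"
    clean2 = out / f"{sample}_clean_2.fq.gz"
    kraken = ["kraken2", "--db", str(kraken_db), "--threads", str(threads),
              "--paired", "--gzip-compressed",
              "--report", str(report),
              "--output", str(out / f"{sample}.kraken"),
              str(clean1), str(clean2)]

    # Assumed settings: 150 bp reads (PE150) and genus-level re-estimation.
    bracken = ["bracken", "-d", str(kraken_db), "-i", str(report),
               "-o", str(out / f"{sample}.bracken"), "-r", "150", "-l", "G"]

    return {"trim": trim, "host": host, "kraken": kraken, "bracken": bracken}
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths are adapted to a real installation.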
The clean reads after quality control were assembled using the SPAdes (v3.13.0)21 tool. ONT third-generation clean reads were assembled using Flye (v2.9.2)22 in "--nano-raw" mode. The MetaBAT2 (v2.12.1)23 tool was then used to bin the contigs of each sample. Subsequently, the completeness and contamination of the MAGs were assessed using the CheckM (v1.2.2)24 tool. The MAGs with completeness ≥ 50% and contamination ≤ 10% were retained. Finally, dRep (v3.4.5)25 was used to dereplicate the retained MAGs with the following parameters: -comp 50 -con 10 -sa 0.99 -p 24, resulting in the final set of MAGs available for subsequent analyses.
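As a minimal sketch of the quality thresholds used in this paper (retention at completeness ≥ 50% and contamination ≤ 10%; high quality at completeness ≥ 90% and contamination ≤ 5%), the helper below assigns a tier to one genome. The function name and tier labels are hypothetical; the inputs are assumed to be CheckM-style percentages.

```python
def quality_tier(completeness, contamination):
    """Classify a genome by completeness and contamination (both in percent).

    Tier labels are illustrative; thresholds follow those stated in the text.
    """
    if completeness >= 90 and contamination <= 5:
        return "high"
    if completeness >= 50 and contamination <= 10:
        return "retained"
    return "discarded"
```

For example, a bin with 72% completeness and 3% contamination would be retained but not counted as high quality.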
Identification, Classification, and Screening of Cellulose- and Hemicellulose-Degrading Microorganisms
GTDB-Tk (v2.4.1)26 was used to classify MAGs and isolate genomes against the latest GTDB release (R226). In addition, all dereplicated genomes were merged, and Prodigal (v2.6.3)27 was used to predict ORFs and construct a complete gene set. CD-HIT (v4.8.1)28 was then used to dereplicate genes from contigs with the following parameters: -c 0.95 -G 0 -aS 0.9 -g 1. Diamond (v2.0.4)29 was used to align contigs against the latest CAZyme database (CAZyDB.07142024) to determine the carbohydrate-active enzyme composition of these genes. From the glycoside hydrolase (GH) families, we selected those targeting cellulose degradation (GH1, GH5, GH6, GH7, GH9, GH12, GH45, GH48, and GH74) and those targeting hemicellulose degradation (GH10, GH11, GH26, GH39, GH42, GH43, and GH51).
Genomes with fiber-degrading potential were classified according to the typical steps of cellulose degradation. For cellulose degradation30, candidate genomes were required to encode: (1) ≥ 1 endoglucanase (GH5/GH9), (2) ≥ 1 anaerobic exoglucanase (GH48) or ≥ 1 aerobic exoglucanase (GH6/GH7), and (3) ≥ 1 β-glucosidase (GH1/GH3).
A scoring system was established to assess the functional potential of candidate genomes in fiber degradation. Genomes that met the above base criteria were given one point. For cellulose-degrading genomes, three additional criteria were used: (1) ≥ 2 endoglucanases, (2) ≥ 2 exoglucanases, and (3) presence of at least one lytic polysaccharide monooxygenase (LPMO; AA9/AA10). Each fulfilled criterion contributed one point (maximum of three).
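The base and additional criteria above can be sketched as follows. This is an illustration rather than the authors' script: the input is assumed to be a per-genome dictionary of CAZyme family counts, the function and key names are hypothetical, and the sketch reports the base check and the number of additional criteria met separately instead of assuming how they are combined into a final score.

```python
ENDOGLUCANASES = ("GH5", "GH9")
EXOGLUCANASES = ("GH48", "GH6", "GH7")   # anaerobic (GH48) or aerobic (GH6/GH7)
BETA_GLUCOSIDASES = ("GH1", "GH3")
LPMOS = ("AA9", "AA10")


def _total(counts, families):
    """Sum gene counts over a group of CAZyme families."""
    return sum(counts.get(family, 0) for family in families)


def assess_cellulolytic_potential(counts):
    """Evaluate one genome against the base and additional criteria.

    `counts` maps a family name (e.g. "GH5") to its gene count in the genome.
    """
    endo = _total(counts, ENDOGLUCANASES)
    exo = _total(counts, EXOGLUCANASES)
    bgl = _total(counts, BETA_GLUCOSIDASES)
    lpmo = _total(counts, LPMOS)

    meets_base = endo >= 1 and exo >= 1 and bgl >= 1
    additional = [endo >= 2, exo >= 2, lpmo >= 1]
    return {
        "meets_base": meets_base,
        # Additional criteria are only counted for genomes passing the base check.
        "additional_criteria_met": sum(additional) if meets_base else 0,
    }
```

A genome with two GH5 genes, one GH48 gene, and one GH3 gene passes the base check and meets one additional criterion (≥ 2 endoglucanases).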
Screening of Intestinal Cellulose-Degrading Bacteria
For the preliminary screening, the strains were revived, and 2–5 µL of the bacterial suspension was inoculated onto CMC-Na agar plates containing an inorganic nitrogen source (with three replicates per group). The plates were incubated at 37°C for 48 hours. Subsequently, 1 mg/mL Congo red staining solution was added and allowed to stain for 30 minutes. The staining solution was then removed, and 1 mol/L NaCl solution was used to decolorize for 30 minutes. The ability to degrade cellulose was determined by observing the formation of a hydrolysis zone, and the degradation intensity was preliminarily assessed by the ratio of the hydrolysis zone diameter (D) to the colony diameter (d) (D/d).
Before enzyme activity determination, a standard curve was prepared from a 1 mg/mL glucose standard solution. Aliquots of 0–1.2 mL were diluted to 2 mL, and 1.5 mL of DNS reagent was added. The mixture was boiled for 10 minutes and then diluted to 25 mL. The absorbance at 540 nm was measured. The standard curve equation was established as y = 0.2962x (R² = 0.991).
The target strains after preliminary screening were inoculated into LB liquid medium and cultured at 37°C with a shaking speed of 160 r/min for 12 hours. Then, 10 mL of the culture was transferred to 200 mL of enzyme production fermentation medium and fermented under the same conditions for 48 hours. The supernatant was obtained by centrifugation at 8,000 r/min for 20 minutes and used as the crude enzyme solution.
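A standard curve of the form y = kx can be obtained by least squares with the intercept fixed at zero. The sketch below shows that calculation; it assumes a through-origin fit because the reported equation has no intercept, and it uses the uncentered R² that is common for no-intercept models, which may differ from how the reported R² was computed. The function name is hypothetical and no measured data from the study are used.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = k*x, with an uncentered R^2.

    xs: glucose amounts or concentrations; ys: absorbance at 540 nm.
    """
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("need at least one non-zero x and one non-zero y")
    slope = sum(x * y for x, y in zip(xs, ys)) / sxx
    ss_res = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / syy
    return slope, r_squared
```

The slope returned by this function plays the role of the coefficient 0.2962 in the reported curve.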
CMCase activity detection: The reaction system contained 1 mL of crude enzyme solution and 1 mL of 1% CMC-Na solution (with inorganic nitrogen source). The mixture was hydrolyzed at 50°C for 40 minutes, followed by the addition of 2 mL of DNS reagent. The mixture was boiled for 15 minutes and then diluted to 10 mL to measure the absorbance (A₁). The control group used inactivated crude enzyme solution (boiled for 2 hours) and was measured in the same manner (A₂). The difference in absorbance was calculated as ∆A = A₁ − A₂, and the concentration of reducing sugar x (mg/mL) was determined by substituting ∆A into the standard curve. The CMCase activity (U/mL) was calculated as CL (U/mL) = 1000 × x × V₁ / (V₂ × T), where V₁ is the total volume of the reaction system (10 mL), V₂ is the volume of crude enzyme solution added (1 mL), and T is the enzyme hydrolysis time (40 minutes). One unit of CMCase activity is defined as the amount of enzyme that produces 1 µg of glucose per minute.
Results
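The activity calculation above can be written out as a short function. This is a sketch of the stated formula only; the function and parameter names are hypothetical, and the defaults are the slope of the reported standard curve (0.2962) and the volumes and reaction time given in the text.

```python
def cmcase_activity(a_sample, a_control, slope=0.2962,
                    v_total_ml=10.0, v_enzyme_ml=1.0, minutes=40.0):
    """CMCase activity (U/mL) from sample and control absorbances at 540 nm.

    delta_A = a_sample - a_control is converted to reducing sugar x (mg/mL)
    with the standard curve y = slope * x; the factor 1000 converts mg to ug.
    """
    delta_a = a_sample - a_control
    x_mg_per_ml = delta_a / slope
    return 1000.0 * x_mg_per_ml * v_total_ml / (v_enzyme_ml * minutes)
```

With the default settings, an absorbance difference equal to the slope (x = 1 mg/mL) corresponds to 1000 × 1 × 10 / (1 × 40) = 250 U/mL.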
Comparative Metagenomic Profiling of Gut Microbiota and Fiber-Degrading Functions
Fecal samples from DHB (n = 30) and Duroc (n = 30) pigs underwent metagenomic sequencing. After quality control, 1,106 million high-quality reads were taxonomically profiled using Kraken2 with Bracken abundance estimation. In DHB, the dominant genera (mean relative abundance > 10%) were Clostridium (28%), Blautia (19.15%), Lactobacillus (17.43%), Methanobrevibacter (14.31%), and Faecalibacterium (10.48%). In Duroc, the dominant genera were Clostridium (26.49%) and Blautia (14.75%) (Fig. 1A). Significant inter-breed differences were observed: Methanobrevibacter abundance was elevated in DHB (14.31% vs. 3.03%, Wilcoxon rank-sum test, p = 0.032), and Lactobacillus was also enriched in DHB (17.43% vs. 1.73%, p = 0.007; Fig. 1B).
At the species level, dominant taxa in DHB included Methanobrevibacter smithii (11.04%) and Lactobacillus amylovorus (10.15%) (Fig. 1C). Alpha diversity analysis revealed higher Simpson indices in DHB (DHB vs. Duroc, p = 0.041), whereas beta diversity (Bray-Curtis) showed no significant difference in community structure between breeds (PERMANOVA, p = 0.12; Fig. 1D). LEfSe analysis (LDA score > 3.0) confirmed significant enrichment of M. smithii, L. amylovorus, and L. delbrueckii in DHB (Figure S1).
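For readers unfamiliar with the two diversity measures reported above, a minimal sketch follows. It assumes the Gini-Simpson form (1 − Σp²) for the Simpson index; the text does not state which variant was used. Inputs are per-taxon abundances for one sample (Simpson) or for two samples over the same ordered taxa (Bray-Curtis). Function names are illustrative.

```python
def simpson_diversity(abundances):
    """Gini-Simpson index, 1 - sum(p_i^2), from non-negative abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return 1.0 - sum((a / total) ** 2 for a in abundances)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa in the same order")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("at least one sample must have non-zero abundance")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total
```

A Bray-Curtis value of 0 means identical profiles and 1 means no shared taxa; group-level testing (e.g., PERMANOVA) operates on the matrix of such pairwise values.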
Fig. S1
LEfSe analysis comparing DHB with Duroc.
Functional annotation identified 147,829 CAZymes in DHB and 63,825 in Duroc microbiomes (Figure S2). Comparative analysis showed significantly higher abundances in DHB for key glycoside hydrolase families implicated in (1) cellulose degradation (GH1, GH5, GH6, GH7, GH9, GH12, GH45) and (2) hemicellulose degradation (GH10, GH26, GH39, GH42, GH43, GH51) (p < 0.05, Wilcoxon test; Figure S3). MAGs from DHB consistently contained more CAZyme genes per genome, particularly for cellulose-degrading enzymes (Fig. 2C).
Fig. 1
Intestinal microbial composition and differences. (a) Bar chart of the abundance of dominant genera. Colors represent genera, and each bar corresponds to an individual sample. The color bands on the X-axis indicate the breed grouping, and the Y-axis shows relative abundance. (b) Differences in dominant genera between breeds. Colors represent different taxa, and each point represents a sample. (c) Bar chart of the abundance of dominant species. (d) Simpson diversity and Bray-Curtis distance. (e) Differences between DHB and Duroc in the dominant GH families related to cellulose and hemicellulose degradation, with each point representing the number of CAZymes in one MAG.
Fig. S2
CAZyme families in DHB and Duroc.
Fig. S3
The number of major fiber-degrading enzyme families encoded by each MAG in DHB and Duroc. Panels A–I show cellulose-degrading enzyme families; panels J–P show hemicellulose-degrading enzyme families.
Integrated Genomic Resource of Cultured Isolates and MAGs Reveals Fiber-Degrading Potential in DHB Gut Microbiome
We established a specialized microbial genomic resource for DHB pigs through integrated metagenomic and culturomic approaches. MAGs were reconstructed from Illumina and ONT sequencing data using MetaBAT2 binning. Concurrently, dominant bacterial strains were isolated from DHB fecal samples using selective culture media, with WGS performed on these isolates. The genomic collection integrated MAGs and isolate genomes to characterize fiber-degrading potential.
Quality filtering (completeness ≥ 50%, contamination ≤ 10%), genome dereplication with dRep, and taxonomic classification with GTDB-Tk yielded 888 genomes. This collection comprised 382 Illumina-derived MAGs, 489 ONT-derived MAGs, and 17 isolate genomes. Within this dataset, 449 genomes met high-quality standards (completeness ≥ 90%, contamination ≤ 5%), including 200 Illumina MAGs, 232 ONT MAGs, and 17 isolate genomes. Taxonomic annotation spanned 10 phyla, 16 classes, 36 orders, 73 families, 345 genera, and 562 species, with all genomes annotated at least to the family level (Fig. 2).
Based on the mechanisms of cellulase action, we classified genomes with potential for fiber degradation. The typical process of cellulose degradation requires a genome to encode endoglucanases, exoglucanases, and β-glucosidases. A total of 258 genomes met this criterion, including 85 Illumina MAGs, 156 ONT MAGs, and 16 isolate genomes. At the genus level, these genomes were mainly classified into Gemmiger (13 genomes), Blautia_A (8 genomes), and Bacteroides (7 genomes). At the species level, the most frequent was Oliverpabstia sp004556655 (4 genomes), followed by Bacillus velezensis, Gemmiger sp004561545, Gemmiger variabilis_B, Lactobacillus sp910589675, and Megasphaera elsdenii with 3 genomes each.
By classifying genomes that conform to the basic requirements for cellulose degradation, we integrated GH families across all genome types to generate comprehensive CAZyme annotation profiles. These functional annotations were combined with GTDB taxonomic classifications to construct a species-resolved catalog of CAZyme functions within the DHB gut microbiota, as documented in Supplementary Table S2. This reveals the extensive potential for polysaccharide degradation in the DHB microbiome.
Fig. 2
Taxonomic annotation and phylogenetic tree of the 888 genomes. The outer bar chart represents the size of each genome; the second ring represents completeness; the third ring uses icons to indicate data quality, with a solid star representing high quality (completeness ≥ 90%; contamination ≤ 5%); the fourth ring represents contamination; the fifth ring indicates the genome source (isolate whole genome, ONT MAG, or Illumina MAG); the innermost colors represent phyla, and the branches of the phylogenetic tree represent species-level evolutionary relationships.
Experimental Validation of Cellulose-Degrading Bacteria from Dahuabai Pigs
We established a quantitative scoring system to further quantify the functional potential of DHB microbiome genomes in polysaccharide degradation. A total of 129 genomes scored 3 points, indicating a high potential for cellulose degradation. At the genus level, these genomes were primarily classified as Bacillus (5 genomes). At the species level, Bacillus velezensis and Lactobacillus sp910589675 each had 3 genomes with the highest score.
To validate the cellulose-degrading ability of these microorganisms, 16 genomes with high cellulose-degrading potential (i.e., scored 3 points) were selected from the isolate genomes (Fig. 3a). Cellulose-degrading capacity was evaluated by a two-stage screening. Primary screening of the 16 isolates on CMC-Na agar with Congo red staining revealed hydrolysis zones in 14 strains: Stenotrophomonas acidaminiphila_A (Y81-1, Y81-2, Y82-2; D/d = 2), Phytobacter ursingii (Y55-1, Y55-2; D/d = 2), Enterococcus faecalis D8-1 (D/d = 2), Citrobacter_A sp013836145 X7 (D/d = 2), Bacillus_A cereus Y9-1 (D/d = 4.67), Bacillus_A cereus Y8-1 (D/d = 2.2), Bacillus velezensis D11-1 (D/d = 3), Bacillus velezensis DHB-1 (D/d = 2.83), Bacillus velezensis D7-1 (D/d = 2.5), Bacillus subtilis D6-1 (D/d = 1.6), and Bacillus safensis X6-1 (D/d = 2.25).
Five isolates with larger hydrolysis zones and higher fiber-degrading GH family abundance underwent secondary screening using Congo red staining in inorganic nitrogen source medium. Four strains produced hydrolysis zones: Bacillus velezensis D7-1 (D/d = 4.0), Bacillus subtilis D6-1 (D/d = 3.0), Bacillus safensis X6-1 (D/d = 3.0), and Bacillus_A cereus Y9-1 (D/d = 1.3). Citrobacter_A sp013836145 X7 showed no hydrolysis zone (Fig. 3b).
CMCase activity was measured using the DNS method with a glucose standard curve (y = 0.2962x, R² = 0.991). Crude enzyme extracts from liquid fermentation showed the following activities: Bacillus velezensis D7-1, 136.82 U/mL; Bacillus subtilis D6-1, 24.88 U/mL; Bacillus safensis X6-1, 7.83 U/mL; and Bacillus_A cereus Y9-1, 2.76 U/mL. The relative hydrolysis zone sizes corresponded to CMCase activity levels (Fig. 3c).
Fig. 3
Distribution and functional verification of cellulose-degrading enzymes in DHB isolate genomes. (a) Heat map of cellulose-degrading enzyme families across isolate genomes; color depth indicates the number of CAZymes, and the Y-axis lists the isolate genomes. (b) Hydrolysis zones of Bacillus velezensis D7-1, Bacillus subtilis D6-1, Bacillus safensis X6-1, and Bacillus_A cereus Y9-1. (c) CMCase activities (U/mL) of the four Bacillus strains.
Discussion
Previous studies have relied primarily on correlation analyses to identify functional microorganisms with specific capabilities31,32. MAGs combined with functional genomic analyses enable more accurate identification of gut microbial species with targeted metabolic functions. Prior research has conducted metagenomic assembly on the intestinal microbiota of multiple species, including chicken33, duck34, pig35, cattle36, sheep37, mice38, and giant panda39. This study focuses on the indigenous Dahuabai pig breed, employing a hybrid assembly strategy that integrates ONT long-read sequencing with Illumina short-read sequencing to reconstruct MAGs from the gut microbiota. Furthermore, we combined culturomics with WGS of isolates to obtain higher-quality genomes. The resulting specialized microbial genomic resource for Dahuabai pigs (888 genomes spanning 562 species) provides comprehensive insights into fiber degradation mechanisms.
The significant enrichment of Lactobacillus (p = 0.007) and Methanobrevibacter (p = 0.032) in DHB pigs (Fig. 1B) suggests breed-specific adaptations to high-fiber diets. Lactobacillus spp. are classic gut-fermenting lactic acid bacteria, and numerous studies have established their association with fiber degradation40,41. Deng et al. found that Methanobrevibacter was significantly more abundant in native pig breeds than in intensively bred breeds, which may facilitate intestinal fermentation through hydrogen consumption. Our findings are consistent with these previous results42,43. Species-level profiling confirmed enrichment of L. amylovorus and L. delbrueckii (Supplementary Fig. S1), consistent with their documented capacity for carbohydrate metabolism44. Although community structure did not differ significantly between breeds (Bray-Curtis p = 0.12), the higher alpha diversity in DHB (Simpson p = 0.041) may reflect broader niche adaptation in this indigenous breed.
Functional annotation demonstrated substantially higher CAZyme abundance in DHB microbiomes (147,829 vs. 63,825 in Duroc), with significant enrichment of cellulose-degrading (GH1, GH5, GH6, GH9, GH12, GH45) and hemicellulose-degrading (GH10, GH26, GH39, GH42, GH43, GH51) families (all p < 0.05; Fig. S2). This aligns with reports of enhanced fiber utilization in Chinese indigenous pig breeds45,46. The GH5 family initiates cellulose degradation through β-1,4-glycosidic bond cleavage47, while GH1 and GH43 facilitate oligosaccharide hydrolysis48, collectively indicating enhanced fiber-degrading potential in DHB.
Based on functional genomic analysis, we identified a set of potential fiber-degrading microorganisms and then classified and scored them according to their predicted fiber-degrading capacity, thereby constructing a species-resolved catalog of CAZyme functions within the DHB gut microbiota. Among the microorganisms with high fiber-degradation potential (score = 3), Bacillus velezensis49 has previously been confirmed to possess fiber-degrading ability in ruminants. Gemmiger45 is relatively abundant in the intestines of the native Guangdong Lantang pig, which is also tolerant of coarse feed.
We conducted functional verification on the isolates. Hydrolysis zones were found in 14 of 16 strains, demonstrating fiber degradation ability and reflecting the potential of the intestinal microbiota in DHB pigs. Although Sarcina perfringens D21-1 and Ligilactobacillus salivarius D15-1 scored 3 points, neither produced a hydrolysis zone, suggesting that effective cellulose degradation may require more GH1, GH3, and GH5 genes; these GH families are indispensable throughout cellulose degradation, from its initiation to its completion50–52. Bacillus is a genus commonly reported to degrade fiber53–55. We also identified several strains that have rarely been associated with fiber degradation in pigs, including Enterococcus faecalis, Stenotrophomonas acidaminiphila_A, Phytobacter ursingii, Enterococcus_A avium, and Citrobacter_A sp013836145. These findings expand our understanding of fiber-degrading microbial diversity.
Conclusion
Comparative metagenomics of DHB and Duroc pigs revealed breed-specific taxonomic and functional profiles: DHB showed higher alpha diversity and enrichment of Methanobrevibacter and Lactobacillus, alongside larger CAZyme repertoires with key glycoside hydrolase families for cellulose and hemicellulose breakdown.
Guided by these differences, we assembled a DHB-focused, genome-resolved resource (888 genomes, 449 high quality) and a species-resolved CAZyme catalog. Genome-based screening identified 258 candidate cellulolytic genomes (129 high potential), and culture-based assays confirmed cellulase activity in four isolates. These data delineate a DHB-linked fiber-degrading microbiome and provide genomes and cultured representatives for mechanistic interrogation of polysaccharide utilization in the porcine gut. This resource offers a basis to test microbiome- and strain-informed strategies to improve fiber use in swine.
Supplementary Information
Author Contribution
YBX, HYH, YJB, XZJ, SB, WXF, LN, JH, CJM, DFL, LY, and ZJC wrote the manuscript. YBX, YJB, LN, DFL, LY, and ZJC provided critical reviews for the content. HYH, XZJ, WXF, SB, JH and CJM provided intellectual oversight, suggestions, multiple critiques and editing. All authors read and approved the final manuscript.
Funding:
This research was funded by the National Key Research and Development Program of China (2023YFE0124400), the Specific University Discipline Construction Project (2023B10564001), Youth Project of Guangdong Foshan joint fund of the Guangdong Natural Science Foundation (2022A1515110819), and the National Natural Science Foundation of China (No. 32202715).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability
The dataset supporting the findings of this study is available in the NCBI repository under BioProject PRJNA1320959 (https://dataview.ncbi.nlm.nih.gov/object/PRJNA1320959?reviewer=e1jc2fciekgojrakb7erev73o0). All genomes are available at https://doi.org/10.6084/m9.figshare.30153811.v1.
References
1. Wang, Y. et al. Whole-genome analysis reveals the hybrid formation of Chinese indigenous DHB pig following human migration. Evolutionary Applications 15, 501–514 (2022).
2. Xue, P. et al. Colonic Microbiota Improves Fiber Digestion Ability and Enhances Absorption of Short-Chain Fatty Acids in Local Pigs of Hainan. Microorganisms 12, 1033 (2024).
3. Zhang, W. et al. Meat cut evaluation of Dahuabai pig. (2015).
4. Varel, V. H. & Yen, J. T. Microbial perspective on fiber utilization by swine. Journal of Animal Science 75, 2715–2722 (1997).
5. Varel, V. H., Tanner, R. S. & Woese, C. R. Clostridium herbivorans sp. nov., a cellulolytic anaerobe from the pig intestine. International Journal of Systematic and Evolutionary Microbiology 45, 490–494 (1995).
6. Bai, Y. et al. Sources of dietary fiber affect the SCFA production and absorption in the hindgut of growing pigs. Frontiers in Nutrition 8, 719935 (2022).
7. Ma, L. et al. Clostridium butyricum and carbohydrate active enzymes contribute to the reduced fat deposition in pigs. Imeta 3, e160 (2024).
8. Xue, P. et al. Colonic Microbiota Improves Fiber Digestion Ability and Enhances Absorption of Short-Chain Fatty Acids in Local Pigs of Hainan. Microorganisms 12, 1033 (2024).
9. Murga-Garrido, S. M. et al. Gut microbiome variation modulates the effects of dietary fiber on host metabolism. Microbiome 9, 117 (2021).
10. Wang, X. et al. Longitudinal investigation of the swine gut microbiome from birth to market reveals stage and growth performance associated bacteria. Microbiome 7, 109 (2019).
11. Han, Y. et al. Unlocking the Potential of Metagenomics with the PacBio High-Fidelity Sequencing Technology. Microorganisms 12, 2482 (2024).
12. Deng, F. et al. HiFi based metagenomic assembly strategy provides accuracy near isolated genome resolution in MAG assembly. iMetaOmics e70041 (2025).
13. Matchado, M. S. et al. On the limits of 16S rRNA gene-based metagenome prediction and functional profiling. Microbial Genomics 10, 001203 (2024).
14. Deng, F. et al. The unique gut microbiome of giant pandas involved in protein metabolism contributes to the host’s dietary adaption to bamboo. Microbiome 11, 180 (2023).
15. Liu, L., Yang, Y., Deng, Y. & Zhang, T. Nanopore long-read-only metagenomics enables complete and high-quality genome reconstruction from mock and complex metagenomes. Microbiome 10, 209 (2022).
16. Wang, X. et al. Comprehensive cultivation of the swine gut microbiome reveals high bacterial diversity and guides bacterial isolation in pigs. Msystems 6, 10.1128/msystems. 00477 − 21 (2021).
17. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
18. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature methods 9, 357–359 (2012).
19. Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome biology 20, 1–13 (2019).
20. Lu, J., Breitwieser, F. P., Thielen, P. & Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Computer Science 3, e104 (2017).
21. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. Journal of computational biology 19, 455–477 (2012).
22. Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nature biotechnology 37, 540–546 (2019).
23. Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
24. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome research 25, 1043–1055 (2015).
25. Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. The ISME journal 11, 2864–2868 (2017).
26. Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: A Toolkit to Classify Genomes with the Genome Taxonomy Database. (Oxford University Press, 2020).
27. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC bioinformatics 11, 1–11 (2010).
28. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
29. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nature methods 12, 59–60 (2015).
30. Lynd, L. R., Weimer, P. J., Van Zyl, W. H. & Pretorius, I. S. Microbial cellulose utilization: fundamentals and biotechnology. Microbiology and molecular biology reviews 66, 506–577 (2002).
31. Muller, E., Algavi, Y. M. & Borenstein, E. A meta-analysis study of the robustness and universality of gut microbiome-metabolome associations. Microbiome 9, 203 (2021).
32. Shi, H. & Li, J. MAGs-based genomic comparison of gut significantly enriched microbes in obese individuals pre-and post-bariatric surgery across diverse locations. Frontiers in Cellular and Infection Microbiology 15, 1485048 (2025).
33. Shen, H. et al. Metagenome-assembled genome reveals species and functional composition of Jianghan chicken gut microbiota and isolation of Pediococcus acidilactic with probiotic properties. Microbiome 12, 25 (2024).
34. Ma, L. et al. Duck gut metagenome reveals the microbiome signatures linked to intestinal regional, temporal development, and rearing condition. Imeta 3, e198 (2024).
35. Chen, C. et al. Expanded catalog of microbial genes and metagenome-assembled genomes from the pig gut microbiome. Nature communications 12, 1106 (2021).
36. Lin, L., Lai, Z., Zhang, J., Zhu, W. & Mao, S. The gastrointestinal microbiome in dairy cattle is constrained by the deterministic driver of the region and the modified effect of diet. Microbiome 11, 10 (2023).
37. Zhang, K. et al. Compendium of 5810 genomes of sheep and goat gut microbiomes provides new insights into the glycan and mucin utilization. Microbiome 12, 104 (2024).
38. Kieser, S., Zdobnov, E. M. & Trajkovski, M. Comprehensive mouse microbiota genome catalog reveals major difference to its human counterpart. PLOS Computational Biology 18, e1009947 (2022).
39. Deng, F. et al. A comprehensive analysis of antibiotic resistance genes in the giant panda gut. Imeta 3, e171 (2024).
40. Li, X. et al. Superior ability of dietary fiber utilization in obese breed pigs linked to gut microbial hydrogenotrophy. ISME communications 5, ycaf043 (2025).
41. Wang, W., Hu, H., Zijlstra, R. T., Zheng, J. & Gänzle, M. G. Metagenomic reconstructions of gut microbial metabolism in weanling pigs. Microbiome 7, 48 (2019).
42. Deng, F. et al. The diversity, composition, and metabolic pathways of archaea in pigs. Animals 11, 2139 (2021).
43. Yang, J. et al. The role of gut archaea in the pig gut microbiome: a mini-review. Frontiers in Microbiology 14, 1284603 (2023).
44. Kavanova, K., Kostovova, I., Moravkova, M., Kubasova, T. & Crhanova, M. In vitro characterization of lactic acid bacteria and bifidobacteria from wild and domestic pigs: probiotic potential for post-weaning piglets. BMC microbiology 25, 8 (2025).
45. Yang, J. et al. Exploring the intestinal microbial community of lantang pigs through metagenome-assembled genomes and carbohydrate degradation Genes. Fermentation 10, 207 (2024).
46. Cheng, P. H. et al. In vitro fermentative capacity of swine large intestine: comparison between native Lantang and commercial Duroc breeds. Animal Science Journal 88, 1141–1148 (2017).
47. Wang, Y. et al. Metagenomic insight into lignocellulose degradation of the thermophilic microbial consortium TMC7. Journal of Microbiology and Biotechnology 31, 1123 (2021).
48. Kumar, J. et al. Metagenomic insights into the taxonomic and functional features of kinema, a traditional fermented soybean product of Sikkim Himalaya. Frontiers in Microbiology 10, 1744 (2019).
49. Chen, B. et al. Complete genome analysis of Bacillus velezensis TS5 and its potential as a probiotic strain in mice. Frontiers in Microbiology 14, 1322910 (2023).
50. Henrissat, B. & Davies, G. Structural and sequence-based classification of glycoside hydrolases. Current opinion in structural biology 7, 637–644 (1997).
51. Aspeborg, H., Coutinho, P. M., Wang, Y., Brumer III, H. & Henrissat, B. Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5). BMC evolutionary biology 12, 186 (2012).
52. Wojtaczka, P., Ciarkowska, A., Starzynska, E. & Ostrowski, M. The GH3 amidosynthetases family and their role in metabolic crosstalk modulation of plant signaling compounds. Phytochemistry 194, 113039 (2022).
53. Li, F. et al. Screening of cellulose degradation bacteria from Min pigs and optimization of its cellulase production. Electronic Journal of Biotechnology 48, 29–35 (2020).
54. Shang, Z. et al. Complete genome sequencing and investigation on the fiber-degrading potential of Bacillus amyloliquefaciens strain TL106 from the tibetan pig. BMC microbiology 22, 186 (2022).
55. Rajaei-Sharifabadi, H. et al. Growth performance and nutrient digestibility of grower–finisher pigs fed corn DDGS-soybean meal-based diets supplemented with a combination of protease and multi-strain Bacillus-based direct-fed microbial. Frontiers in Animal Science 6, 1562308 (2025).