Abstract
This collection of Supplementary Tables provides the datasets and analyses supporting the systematic identification and characterization of CRISPR-Cas systems and anti-CRISPR (Acr) proteins in the human gut microbiome. Tables S1–S7 and S10 detail the detection, classification, and phylogenetic analysis of Class 1 and Class 2 CRISPR-Cas systems, including Cas9 orthologs, within the UHGG database. Tables S9 and S11 document spacer–virus connections between UHGG and GVD, enabling the prediction of phage-encoded Acrs. Tables S12–S14 summarize Acr candidate selection, codon optimization, and library construction. Functional validation data for Acrs targeting six Type II Cas9 systems are presented in Tables S16–S21, with the non-redundant Acr set provided in Table S22. Finally, structural analyses, including fold similarity and the GutAcraca family, are summarized in Tables S24–S26.

Table S1. Class 1 CRISPR-Cas systems detected in UHGG, related to Figure 1A
Table S2. Class 2 CRISPR-Cas systems detected in UHGG, related to Figure 1A, B
Table S3. Type I CRISPR-Cas systems with Cas3 detected in UHGG, related to Figure 1A
Table S4. Type III CRISPR-Cas systems with Cas10 detected in UHGG, related to Figure 1A
Table S5. Distribution of Class 2 type II, V, and VI CRISPR-Cas systems across microbial classes, related to Figure 1B
Table S6. Cas9 CDSs detected in UHGG, related to Figure 1C. Cas9_subfamily values were obtained from UniProt according to the UniProtKB_Entry annotated by UHGG.
Table S7. Non-redundant Cas9 CDSs used to construct the phylogenetic tree, related to Figure 1C
Table S9. Connections between CRISPR spacers from UHGG and viral contigs from GVD through CRISPR-spacer blastn matches, related to Figure 2A and Figure 1D
Table S10. Non-redundant Cas9 CDSs used to construct the phylogenetic tree, related to Figure 1D
Table S11. Viral contigs in GVD with CRISPR spacer matches to microbial genomes in UHGG carrying Cas9, related to Figure 2A
Table S12. Acr candidates with amino acid sequences, related to Figure 2A
Table S13. Selection of Acr candidates for DNA sequence codon optimization, related to Figure 2B
Table S14. Oligo design for the Acr candidate library, related to Figure 2B
Table S16. Positive Acr candidates for SpyCas9, related to Figure S2A, B
Table S17. Positive Acr candidates for SaCas9, related to Figure S2A, B
Table S18. Positive Acr candidates for St1Cas9, related to Figure S2A, B
Table S19. Positive Acr candidates for St3Cas9, related to Figure S2A, B
Table S20. Positive Acr candidates for FnCas9, related to Figure S2A, B
Table S21. Positive Acr candidates for NmCas9, related to Figure S2A, B
Table S22. 651 non-redundant positive Acr candidates in total, related to Figure 4A, B
Table S24. Structural similarity matrix of positive Acr candidates, related to Figure 4A, B and S5A
Table S25. Members of GutAcraca, related to Figure 4B
Table S26. Structural similarity analysis of GutAcraca in the AlphaFold database, related to Figure 5N