Identification and evolutionary analysis of oyster-associated Microviridae genomes
Abstract: Filter-feeding oysters concentrate large numbers of waterborne pathogens, including viruses, which makes them a valuable virus reservoir. To study oyster-associated viruses in depth, we previously performed virome sequencing of Hong Kong oysters (Crassostrea hongkongensis) collected at multiple sites along the South China coast. After quality control, assembly and taxonomic annotation of the sequencing data, we selected five genome sequences identified as Microviridae for multi-dimensional analysis: host prediction, open reading frame and gene function prediction, phylogeny and three-dimensional structure prediction of the major capsid proteins, the evolutionary association between major capsid proteins and external scaffolding proteins, and virus abundance analysis. The predicted hosts of all five viruses belong to the genus Escherichia. One genome sequence, HSd1-5344568, clustered within the Bullavirinae branch, indicating that it is a member of that subfamily. The other four sequences did not cluster with any known subfamily and likely belong to a separate, unclassified subfamily. Comparison of the phylogenetic trees of the major capsid proteins and the external scaffolding proteins indicates that the two proteins have followed different evolutionary patterns.
Figure 1. Similarity network of major capsid proteins
Note: Each dot represents a major capsid protein sequence. Green dots (n=15) are Microviridae classified by the ICTV; orange dots (n=220) are sequences from the oyster samples identified as Microviridae; blue dots (n=741) are the sequences in the nr database most similar to sequences from these two sources. Each gray line is an edge connecting two dots and represents the blastp score between the two sequences.
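The network construction described in the Figure 1 note can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the blastp hits are available in the standard 12-column tabular format (-outfmt 6), and the sequence names, the bit-score threshold and the helper functions build_edges and connected_components are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical blastp hits in tabular format (-outfmt 6). Columns: qseqid, sseqid,
# pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
EXAMPLE_HITS = (
    "mcpA\tmcpB\t62.1\t410\t150\t3\t1\t408\t2\t411\t1e-150\t520\n"
    "mcpB\tmcpA\t62.1\t410\t150\t3\t2\t411\t1\t408\t1e-150\t518\n"
    "mcpA\tmcpA\t100.0\t420\t0\t0\t1\t420\t1\t420\t0.0\t870\n"
    "mcpC\tmcpD\t45.0\t390\t200\t8\t5\t390\t3\t392\t1e-80\t260\n"
)


def build_edges(handle, min_bitscore=50.0):
    """Return {(seq_a, seq_b): bitscore}, one undirected edge per sequence pair."""
    edges = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        # Skip self-hits and weak hits (the threshold is an illustrative assumption).
        if query == subject or bitscore < min_bitscore:
            continue
        key = tuple(sorted((query, subject)))
        # Keep the best score when both A->B and B->A hits are reported.
        edges[key] = max(bitscore, edges.get(key, 0.0))
    return edges


def connected_components(edges):
    """Group sequences into clusters that are linked by at least one edge."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, clusters = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbours[node] - cluster)
        seen |= cluster
        clusters.append(sorted(cluster))
    return clusters


edges = build_edges(io.StringIO(EXAMPLE_HITS))
clusters = connected_components(edges)
print(edges)
print(clusters)
```

The resulting edge dictionary can be written out as a weighted edge list for a network viewer; the connected components give a first, coarse view of which sequences group together.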
Figure 2. Phylogenetic tree of major capsid proteins and the corresponding genome structure maps
Note: The amino acid sequences of the major capsid proteins were aligned with MAFFT and trimmed with trimAl; the maximum-likelihood tree was built with IQ-TREE and visualized with iTOL (bootstrap replicates: 1000; best-fit substitution model selected automatically; site coverage cutoff: 70%). The Microviridae sequences from the oyster samples are marked in red on the tree; their IDs are GX1-198598, T4S1-854210, T4S1-22425, ML1-11067 and HSd1-5344568. The background color of each ID indicates the host type predicted by CHERRY. The column to the right of the IDs shows the ICTV subfamily and genus, or the sample source for unclassified viruses. The schematic genome maps are shown on the far right, with different colors indicating different marker genes.
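The align, trim and tree-building steps named in the note can be sketched as a list of command lines. This is an illustration under stated assumptions, not the authors' exact invocation: the file names are hypothetical, the trimAl option -gt 0.7 is only one possible reading of the 70% site coverage cutoff, and -b 1000 assumes standard nonparametric bootstrapping (the source does not say whether standard or ultrafast bootstrap was used). The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def pipeline_commands(mcp_fasta, iqtree_bin="iqtree", bootstraps=1000):
    """Return (argv, stdout_path) pairs for the alignment, trimming and tree steps.

    stdout_path is set only for MAFFT, which writes its alignment to stdout.
    All file names are derived from the (hypothetical) input FASTA path.
    """
    aligned = mcp_fasta + ".aln"
    trimmed = mcp_fasta + ".trim"
    return [
        # MAFFT: let the program pick an alignment strategy automatically.
        (["mafft", "--auto", mcp_fasta], aligned),
        # trimAl: -gt 0.7 keeps columns where at least 70% of sequences have no gap
        # (assumed mapping of the "70% cutoff"; not confirmed by the source).
        (["trimal", "-in", aligned, "-out", trimmed, "-gt", "0.7"], None),
        # IQ-TREE: -m MFP selects the best-fit model, -b runs standard bootstraps
        # (assumption; ultrafast bootstrap would use a different flag).
        ([iqtree_bin, "-s", trimmed, "-m", "MFP", "-b", str(bootstraps)], None),
    ]


for argv, stdout_path in pipeline_commands("mcp.faa"):
    line = shlex.join(argv)
    if stdout_path:
        line += " > " + shlex.quote(stdout_path)
    print(line)
```

Keeping the commands as argument lists makes it straightforward to pass them to subprocess.run later, once the flags have been checked against the installed tool versions.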
Figure 5. Linkage between the phylogenetic trees of the major capsid protein and the external scaffolding protein of Microviridae phages
Note: The amino acid sequences were aligned with MAFFT and trimmed with trimAl; the maximum-likelihood trees were built with IQ-TREE and visualized with iTOL (bootstrap replicates: 1000; best-fit substitution model selected automatically; site coverage cutoff: 70%). Purple IDs are Gokushovirinae sequences, red IDs are oyster-associated Microviridae sequences, and pink IDs are Bullavirinae sequences. The tree of the major capsid protein is shown on the left and the tree of the external scaffolding protein on the right; the two proteins from the same genome are connected by a straight line.
Table 1. TPM abundance of oyster-associated microviruses in each sequencing library
Rows: sample ID; columns: virus genome ID.
Sample ID    HSd1-5344568  GX1-198598  T4S1-22425  T4S1-854210  ML1-11067
GX170519     0             0.0012      0           0            0
S1-DR        0.0006        0           0           0            0
S2-D-2       0.0010        0           0           0            0
S2-DR        0.0005        0           0           0            0
S3-D-2       0.0002        0           0           0            0
S3-D         0.0055        0           0           0            0
S3-DR        0.0023        0           0           0            0
S4-D-2       0.0004        0           0           0            0
S5-D-2       0.0017        0           0           0            0
S6-D-2       0.0006        0           0           0            0
S7-D-2       0.0003        0           0           0            0
T4S170523    0             0           0.0008      0.0007       0
T5S170523    0             0           0.0013      0            0
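For readers unfamiliar with the unit, TPM (transcripts per million) normalizes read counts first by sequence length and then by library total, so that the values of all sequences in one library sum to one million. The sketch below shows the standard formula, TPM_i = 10^6 x (c_i / L_i) / sum_j (c_j / L_j); the contig names, read counts and lengths are hypothetical and are not taken from the study.

```python
def tpm(read_counts, lengths):
    """Compute TPM per sequence from read counts and sequence lengths (in bases)."""
    # Length-normalized read rate for each sequence.
    rates = {name: read_counts[name] / lengths[name] for name in read_counts}
    total = sum(rates.values())
    if total == 0:
        # No reads mapped anywhere in this library.
        return {name: 0.0 for name in rates}
    # Scale so that all values in the library sum to one million.
    return {name: rate / total * 1e6 for name, rate in rates.items()}


# Hypothetical library with three contigs (illustrative numbers only).
counts = {"contig_a": 10, "contig_b": 30, "contig_c": 0}
lengths = {"contig_a": 1000, "contig_b": 3000, "contig_c": 5000}

values = tpm(counts, lengths)
for name, value in values.items():
    print(f"{name}\t{value:.1f}")
```

Because the denominator covers every sequence quantified in a library, a single viral genome among many assembled contigs receives only a small share of the total.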