Characterization of microsatellites and polymorphic marker development in ragworm (Tylorrhynchus heterochaetus) based on genome survey data
Keywords: Tylorrhynchus heterochaetus; whole-genome survey; simple sequence repeats; polymorphic loci
Abstract: To characterize the genome of Tylorrhynchus heterochaetus and develop microsatellite markers efficiently, in support of germplasm conservation and the genetic improvement of new varieties, we conducted a whole-genome survey using low-depth high-throughput sequencing. Quality control of the raw data yielded 57.48 Gb of clean data. K-mer analysis estimated a genome size of 759.53 Mb, a heterozygosity rate of 1.41%, and a repetitive-sequence proportion of 45.92%. Preliminary assembly produced 2 181 621 scaffolds with a total length of 840 375 821 bp. A total of 130 216 microsatellite loci were detected, at a density of 154.9 loci per Mb. Repeat numbers were concentrated between 4 and 18 copies. Mononucleotide repeats were the most frequent (35.00%), followed by dinucleotide (32.48%) and trinucleotide (14.42%) repeats. AT/AT and AAT/ATT were the dominant dinucleotide and trinucleotide motifs, respectively, indicating an A/T bias. Of 50 randomly selected primer pairs, 15 amplified polymorphic loci, and 87 alleles were detected in a T. heterochaetus population of 30 individuals. The number of alleles per locus (Na) ranged from 2.000 to 12.000, with a mean of 5.800. The effective number of alleles (Ne) ranged from 1.164 to 6.713 (mean 3.328), and the expected heterozygosity (He) from 0.141 to 0.789 (mean 0.561). The polymorphic information content (PIC) ranged from 0.136 to 0.776, with a mean of 0.511. Thirteen loci were highly or moderately polymorphic and therefore of high practical value in genetic analysis. In conclusion, the T. heterochaetus genome is a complex genome, and its microsatellites are diverse in type and show good polymorphic potential. These loci provide effective marker resources for germplasm resource evaluation, population genetics, and molecular breeding research.
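As an illustration of how perfect microsatellites can be located in assembled sequence and how the reported density follows from the locus count and assembly length, the following is a minimal Python sketch. The minimum repeat counts in `MIN_REPEATS`, the helper names, and the toy sequence are assumptions made for this example; the abstract does not state which tool or thresholds were used.

```python
import re

# Minimum repeat count per motif length (1-6 bp). These thresholds are
# illustrative assumptions, not the settings used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as 'AA'; the shorter-unit pass reports them.
            if is_primitive(motif):
                hits.append(
                    (match.start(), motif, len(match.group(0)) // unit_len)
                )
    return sorted(hits)


def density_per_mb(n_loci, assembly_bp):
    """Loci per megabase of assembled sequence."""
    return n_loci / (assembly_bp / 1_000_000)


# Hypothetical toy sequence containing (AT)8, (A)12 and (AAT)6.
toy = "GGC" + "AT" * 8 + "CCG" + "A" * 12 + "GC" + "AAT" * 6 + "C"
ssrs = find_ssrs(toy)

# Locus count and assembly length as reported in the abstract.
density = density_per_mb(130_216, 840_375_821)
```

On the toy sequence the scan reports three loci, (AT)8, (A)12 and (AAT)6, and the density computed from the abstract's figures is consistent with the reported 154.9 loci per Mb. A regular expression with a backreference only finds perfect repeats; compound or interrupted microsatellites would need additional logic.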
Figure 5. Length distribution of microsatellite loci in the genome of T. heterochaetus
Note: a. Number and percentage of microsatellite loci in different length intervals; b. Length distribution of the six motif types of microsatellite loci.
Table 1. Statistics of genome survey sequencing data of T. heterochaetus

| Sequencing library | Raw base/Gb | Effective rate/% | Clean base/Gb | Error rate/% | Q20/% | Q30/% | GC content/% |
|---|---|---|---|---|---|---|---|
| L1 | 33.12 | 99.68 | 33.01 | 0.04 | 96.23 | 90.77 | 39.13 |
| L2 | 24.53 | 99.75 | 24.47 | 0.05 | 94.88 | 88.63 | 39.00 |
| Total | 57.65 | - | 57.48 | - | - | - | - |
| Mean | - | 99.72 | - | 0.05 | 95.56 | 89.70 | 39.07 |
Table 2. Information on 15 polymorphic microsatellite loci in the genome of T. heterochaetus

| Locus | Forward primer (F, 5'-3') | Reverse primer (R, 5'-3') | Repeat unit | Size/bp | Annealing temperature/°C |
|---|---|---|---|---|---|
| ThGM004 | TGCTGCTACTGCTACAGCTACTATG | CTGACAAAGTTTGGTGGCTG | (TAC)18 | 289 | 60.0 |
| ThGM006 | TGAAAATTAGTGTGATTTTGTCCC | AGCCAACCAGAACATGAACA | (CA)11 | 260 | 59.0 |
| ThGM011 | AACTTGGACTAAGGCTATCAAAAA | CTTGGGGTTCATGCATCATT | (AG)17 | 220 | 59.0 |
| ThGM015 | TTGGTTGTTATCCATGCACC | AGACAGCAGTGAAATAGCACCA | (TAT)12 | 279 | 59.5 |
| ThGM017 | ATTCGATAAGCATTCCACCG | CTTGGTAGCTGGCCTGTCTC | (ATGG)8 | 215 | 60.0 |
| ThGM021 | TGCGAAATGAGAAGTGAGCA | TGCCTGTGTGGAATACCAAG | (TA)10 | 277 | 60.0 |
| ThGM024 | ACCTGTCCACCCGTCATTTA | CCTTTAGGGGATGGCTACAA | (TAT)14 | 294 | 59.5 |
| ThGM029 | GAGCAAAATATTCAAGTTGGCA | TTGTTTGTCATATCTTCTAAAGAGCA | (ATT)12 | 243 | 59.0 |
| ThGM033 | GGAGTGGGGAGGATTTTAGC | CCATGTACAGCATTCAGCCA | (TG)18 | 277 | 60.0 |
| ThGM035 | GTAAGGGCAAGGGTTGTGAA | ACCGTTACCCTAACCCCAAC | (AG)13 | 226 | 60.0 |
| ThGM038 | TTACCCTGCCATCCTACCAG | CTATTCTGCCAGTGGTCGCT | (TG)20 | 157 | 60.0 |
| ThGM040 | GGATCCAGAAGGGGTAAAGC | GTTGGTCATGTTCCTGTTGC | (TTA)11 | 239 | 59.5 |
| ThGM041 | ACCAGCTGCTAGAGGCAGAC | TTAGGTCCTCACCCAGGGAT | (ATG)7 | 260 | 60.0 |
| ThGM043 | AAAAGCAAGTGGTAACACAAAATG | CATTGGGCTCTGGGAATAAA | (TCAT)11 | 272 | 59.5 |
| ThGM047 | CGACCTGCGGATTTAATTTG | ATATCTTGGCGGCGGATAG | (TGG)12 | 148 | 60.0 |
Table 3. Genetic characteristics of 15 polymorphic microsatellite loci in a T. heterochaetus population

| Locus | Number of alleles (Na) | Effective number of alleles (Ne) | Observed heterozygosity (Ho) | Expected heterozygosity (He) | Polymorphic information content (PIC) | P value for Hardy-Weinberg equilibrium (PHWE) |
|---|---|---|---|---|---|---|
| ThGM004 | 12 | 6.672 | 0.697 | 0.775 | 0.726 | 0.275 |
| ThGM006 | 4 | 1.642 | 0.367 | 0.388 | 0.372 | 0.001* |
| ThGM011 | 5 | 1.608 | 0.257 | 0.349 | 0.377 | 0.026* |
| ThGM015 | 8 | 4.933 | 0.438 | 0.789 | 0.776 | 0.225 |
| ThGM017 | 3 | 1.521 | 0.066 | 0.271 | 0.245 | 0.148 |
| ThGM021 | 2 | 1.593 | 0.367 | 0.508 | 0.375 | 1.000 |
| ThGM024 | 11 | 6.713 | 0.879 | 0.742 | 0.682 | 0.541 |
| ThGM029 | 10 | 5.647 | 0.697 | 0.658 | 0.599 | 0.140 |
| ThGM033 | 5 | 3.102 | 0.167 | 0.772 | 0.720 | 0.069 |
| ThGM035 | 3 | 2.164 | 0.050 | 0.141 | 0.136 | 0.086 |
| ThGM038 | 8 | 4.878 | 0.576 | 0.545 | 0.489 | 0.221 |
| ThGM040 | 4 | 2.441 | 0.733 | 0.718 | 0.652 | 0.008* |
| ThGM041 | 3 | 2.155 | 0.417 | 0.431 | 0.336 | 0.503 |
| ThGM043 | 5 | 2.727 | 0.724 | 0.682 | 0.617 | 0.148 |
| ThGM047 | 4 | 2.224 | 0.867 | 0.642 | 0.569 | 0.267 |
| Mean | 5.800 | 3.328 | 0.487 | 0.561 | 0.511 | - |

Note: *. Significant departure from Hardy-Weinberg equilibrium after Bonferroni's correction (P<0.05); n=30.
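The per-locus statistics in Table 3 are standard functions of allele frequencies, and the count of "highly or moderately polymorphic" loci follows from the PIC column. The sketch below shows one way to compute them in Python. It assumes the conventional PIC classes (high above 0.5, moderate from 0.25 to 0.5, low below 0.25) and uses He = 1 - Σp² without a small-sample correction; the function names are hypothetical, and the study's software may apply different estimators, so the sketch is not expected to reproduce every Table 3 value.

```python
from collections import Counter


def diversity_stats(allele_counts):
    """Return (Na, Ne, He, PIC) from a mapping of allele -> count at one locus."""
    total = sum(allele_counts.values())
    freqs = [c / total for c in allele_counts.values() if c > 0]
    hom = sum(p * p for p in freqs)      # expected homozygosity, sum of p_i^2
    na = len(freqs)                      # number of alleles
    ne = 1.0 / hom                       # effective number of alleles
    he = 1.0 - hom                       # expected heterozygosity (uncorrected)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na)
        for j in range(i + 1, na)
    )
    pic = 1.0 - hom - cross              # polymorphic information content
    return na, ne, he, pic


def pic_class(pic):
    """Classify a locus by PIC using the conventional 0.25 / 0.5 cut-offs."""
    if pic > 0.5:
        return "high"
    if pic >= 0.25:
        return "moderate"
    return "low"


# PIC values copied from Table 3.
PIC_TABLE3 = {
    "ThGM004": 0.726, "ThGM006": 0.372, "ThGM011": 0.377, "ThGM015": 0.776,
    "ThGM017": 0.245, "ThGM021": 0.375, "ThGM024": 0.682, "ThGM029": 0.599,
    "ThGM033": 0.720, "ThGM035": 0.136, "ThGM038": 0.489, "ThGM040": 0.652,
    "ThGM041": 0.336, "ThGM043": 0.617, "ThGM047": 0.569,
}

classes = Counter(pic_class(v) for v in PIC_TABLE3.values())
mean_pic = sum(PIC_TABLE3.values()) / len(PIC_TABLE3)

# Hypothetical locus with two equally frequent alleles.
na, ne, he, pic = diversity_stats({"A": 30, "B": 30})
```

Under these cut-offs the Table 3 values give 8 highly polymorphic, 5 moderately polymorphic, and 2 weakly polymorphic loci (ThGM017 and ThGM035), which agrees with the 13 loci reported as highly or moderately polymorphic, and the mean PIC rounds to the reported 0.511.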