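A single-nucleotide polymorphism, usually abbreviated SNP (/snɪp/; plural /snɪps/), is a substitution of a single nucleotide at a specific position in the genome, where each variant is present in the population to some appreciable degree (e.g. > 1%).[1]
For example, at a specific base position in the human genome, the C nucleotide may appear in most individuals, while in a minority of individuals the position is occupied by an A. This means that there is an SNP at this specific position, and the two possible nucleotide variations, C or A, are said to be the alleles for this position.
SNPs underlie differences in our susceptibility to a wide range of diseases; sickle-cell anemia, β-thalassemia and cystic fibrosis, for example, can result from SNPs.[2][3][4] The severity of illness and the way the body responds to treatment are also manifestations of genetic variation. For example, a single-base mutation in the APOE (apolipoprotein E) gene is associated with a lower risk of Alzheimer's disease.[5]
A single-nucleotide variant (SNV) is a variation in a single nucleotide without any limitation on frequency, and it may arise in somatic cells. A somatic single-nucleotide variation (e.g. one caused by cancer) may also be called a single-nucleotide alteration.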
Types of SNPs
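SNPs may fall within the coding sequences of genes, the non-coding regions of genes, or the intergenic regions (the regions between genes). Because of the degeneracy of the genetic code, an SNP within a coding sequence does not necessarily change the amino acid sequence of the protein that is produced.
SNPs in the coding region are of two types: synonymous and nonsynonymous. Synonymous SNPs do not affect the protein sequence, while nonsynonymous SNPs change the amino acid sequence of the protein. Nonsynonymous SNPs are in turn of two types: missense and nonsense.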
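The classification above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and a single-base change inside one in-frame codon; the function name `classify_coding_snp` and its arguments are hypothetical, not taken from any particular tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base substitution within one sense codon.

    codon:    reference codon on the coding strand, e.g. "GAG"
    position: 0-based index of the substituted base within the codon
    alt_base: the alternative base at that position
    """
    codon = codon.upper()
    ref_aa = CODON_TABLE[codon]
    if ref_aa == "*":
        # Changes that start from a stop codon fall outside the
        # synonymous / missense / nonsense scheme described in the text.
        raise ValueError("reference codon is a stop codon")
    alt_codon = codon[:position] + alt_base.upper() + codon[position + 1:]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"   # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon
    return "missense"         # different amino acid
```

For instance, `classify_coding_snp("GAG", 1, "T")` compares GAG (glutamate) with GTG (valine) and reports a missense change, whereas a third-position change from CTT to CTC leaves leucine unchanged and is reported as synonymous.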
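SNPs that are not in protein-coding regions may still affect gene splicing, transcription factor binding, messenger RNA degradation, or the sequence of non-coding RNA. An SNP of this type that affects gene expression is referred to as an eSNP (expression SNP) and may lie upstream or downstream of the gene.
More than 84 million SNPs have been found across different human populations. A typical genome differs from the reference human genome at 4 to 5 million sites, most of which (more than 99.9%) consist of SNPs and short indels.[8]
The genomic distribution of SNPs is not homogeneous. SNPs occur more frequently in non-coding regions than in coding regions or, more generally, than in regions where natural selection is acting and "fixing" the SNP allele that constitutes the most favorable genetic adaptation (eliminating the other variants).[9] Other factors, such as genetic recombination and mutation rate, can also determine SNP density.[10]
SNP density can be predicted by the presence of microsatellites: AT microsatellites in particular are potent predictors of SNP density, with long (AT)n repeat tracts tending to be found in regions of significantly reduced SNP density and low GC content.[11]
There are variations between human populations, so an SNP allele that is common in one geographical or ethnic group may be much rarer in another. Within a population, an SNP can be assigned a minor allele frequency: the lowest allele frequency at a locus that is observed in that population. For an SNP with two alleles, this is simply the lesser of the two allele frequencies.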
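As a small worked example of the minor allele frequency, the sketch below counts alleles in a list of diploid genotypes and returns the lesser of the two frequencies. It assumes a biallelic SNP and genotypes written as two-letter strings; the function names are illustrative only.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes such as "CC" or "CA"."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: count / total for allele, count in counts.items()}


def minor_allele_frequency(genotypes):
    """Lesser of the two allele frequencies of a biallelic SNP."""
    freqs = allele_frequencies(genotypes)
    if len(freqs) > 2:
        # Conventions differ for sites with three or more alleles,
        # so this sketch refuses to guess.
        raise ValueError("more than two alleles observed")
    if len(freqs) == 1:
        return 0.0  # monomorphic in this sample
    return min(freqs.values())


# Ten individuals: 7 x CC, 2 x CA, 1 x AA  ->  16 C alleles and 4 A alleles.
sample = ["CC"] * 7 + ["CA"] * 2 + ["AA"]
maf = minor_allele_frequency(sample)  # 4 / 20 for allele A
```

With these counts the A allele has a frequency of 4/20 = 0.2, which is above the 1% level mentioned earlier as a typical example of a polymorphism threshold.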
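Variations in human DNA sequences can affect how humans develop diseases and how they respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also critical for personalized medicine.[12] Their applications include biomedical research, forensics, pharmacogenetics, and disease causation, as outlined below.
In clinical research, SNPs are most important for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease) in genome-wide association studies (GWAS). SNPs have been used in GWAS as high-resolution markers for mapping genes related to diseases or normal traits. SNPs without an observable impact on the phenotype (so-called silent mutations) are still useful as genetic markers in GWAS because of their abundance and their stable inheritance over generations.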
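The cohort comparison described here can be illustrated, for a single SNP, with an allelic chi-square test on a 2×2 table of allele counts. This is only a sketch under simplifying assumptions (one SNP, no covariates, no correction for the very large number of SNPs tested); actual studies typically use more elaborate models, and the function name and example counts are invented for illustration.

```python
from math import erfc, sqrt


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    case_counts, control_counts: (count of allele 1, count of allele 2)
    Returns (chi-square statistic, p-value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("each row and column needs at least one observed allele")
    chi2 = n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])
    # For 1 degree of freedom the upper-tail probability of a chi-square
    # variable equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 50 cases and 50 controls (100 alleles in each group).
chi2, p = allelic_chi_square(case_counts=(60, 40), control_counts=(40, 60))
```

With these made-up counts the statistic is 8.0. In a genome-wide setting the same test would be repeated for every SNP, so a far stricter significance threshold than the usual 0.05 is needed.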
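SNPs were initially used to match forensic DNA samples to suspects, but this use was phased out with the development of DNA fingerprinting techniques based on short tandem repeats (STRs). Current next-generation sequencing (NGS) techniques may allow better use of SNP genotyping in forensic applications, as long as problematic loci are avoided.[13] In the future, SNPs may be used in forensics for phenotypic clues such as eye color, hair color, and ethnicity. Kidd et al. have demonstrated that a panel of 19 SNPs can identify the ethnic group with a good probability of match (Pm = 10^-7) in the 40 population groups studied.[14] One example of how this might be useful is the artistic reconstruction of the possible premortem appearance of the skeletonized remains of unknown individuals. Although a facial reconstruction based strictly on anthropological features can be fairly accurate, additional data such as eye color, skin color, and hair color could make it more accurate.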
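To give a feel for how a small SNP panel can reach very low match probabilities, the sketch below computes a random match probability under two textbook assumptions: Hardy-Weinberg proportions at each SNP and independence between SNPs. It is not a reconstruction of the calculation by Kidd et al., whose figure depends on the allele frequencies actually observed in each population; the function names and the example frequencies are hypothetical.

```python
from math import prod


def locus_match_probability(p):
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    p is the frequency of one allele; Hardy-Weinberg proportions are assumed,
    so the genotype frequencies are p^2, 2pq and q^2.
    """
    q = 1.0 - p
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def panel_match_probability(allele_freqs):
    """Product of per-locus match probabilities (loci assumed independent)."""
    return prod(locus_match_probability(p) for p in allele_freqs)


# Idealized panel: 19 SNPs, each with both alleles at frequency 0.5.
pm = panel_match_probability([0.5] * 19)
```

At an allele frequency of 0.5 each SNP contributes a per-locus match probability of 0.375, the lowest value a biallelic marker can give, so 19 such SNPs yield 0.375^19, roughly 8 × 10^-9. Less balanced allele frequencies give higher per-locus values and therefore a higher panel match probability.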
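Where the forensic sample is small or degraded, SNP methods can be a good alternative to STR methods because of the abundance of potential markers, their amenability to automation, and the possible reduction of the required fragment length to only 60–80 bp. When there is no STR match in a DNA profile database, different SNPs can be used to obtain clues about ethnicity, phenotype, lineage, and even identity.
Some SNPs are associated with the metabolism of different drugs.[15][16][17] The association of a wide range of human diseases, such as cancer, infectious diseases (AIDS, leprosy, hepatitis, etc.), autoimmune, neuropsychiatric and many other diseases, with different SNPs means that these SNPs can serve as relevant pharmacogenomic targets for drug therapy.[18]
A single SNP may cause a Mendelian disease, though for complex diseases SNPs do not usually function individually; rather, they work in coordination with other SNPs to manifest a disease condition, as has been seen in osteoporosis.[19] One of the earliest successes in this field was the discovery of a single-base mutation in the non-coding region of APOC3 (the apolipoprotein C3 gene) that is associated with higher risks of hypertriglyceridemia and atherosclerosis.[20]
SNPs of all types can have an observable phenotype or result in disease.
As for genes, bioinformatics databases exist for SNPs.
The International SNP Map Working Group mapped the sequence flanking each SNP by alignment to the genomic sequence of large-insert clones in GenBank. These alignments were converted to chromosomal coordinates, which are shown in Table 1. The list has grown greatly since then; for instance, the Kaviar database now lists 162 million single-nucleotide variants (SNVs).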
Chromosome | Length (bp) | All SNPs: total | All SNPs: kb per SNP | TSC SNPs: total | TSC SNPs: kb per SNP |
---|---|---|---|---|---|
1 | 214,066,000 | 129,931 | 1.65 | 75,166 | 2.85 |
2 | 222,889,000 | 103,664 | 2.15 | 76,985 | 2.90 |
3 | 186,938,000 | 93,140 | 2.01 | 63,669 | 2.94 |
4 | 169,035,000 | 84,426 | 2.00 | 65,719 | 2.57 |
5 | 170,954,000 | 117,882 | 1.45 | 63,545 | 2.69 |
6 | 165,022,000 | 96,317 | 1.71 | 53,797 | 3.07 |
7 | 149,414,000 | 71,752 | 2.08 | 42,327 | 3.53 |
8 | 125,148,000 | 57,834 | 2.16 | 42,653 | 2.93 |
9 | 107,440,000 | 62,013 | 1.73 | 43,020 | 2.50 |
10 | 127,894,000 | 61,298 | 2.09 | 42,466 | 3.01 |
11 | 129,193,000 | 84,663 | 1.53 | 47,621 | 2.71 |
12 | 125,198,000 | 59,245 | 2.11 | 38,136 | 3.28 |
13 | 93,711,000 | 53,093 | 1.77 | 35,745 | 2.62 |
14 | 89,344,000 | 44,112 | 2.03 | 29,746 | 3.00 |
15 | 73,467,000 | 37,814 | 1.94 | 26,524 | 2.77 |
16 | 74,037,000 | 38,735 | 1.91 | 23,328 | 3.17 |
17 | 73,367,000 | 34,621 | 2.12 | 19,396 | 3.78 |
18 | 73,078,000 | 45,135 | 1.62 | 27,028 | 2.70 |
19 | 56,044,000 | 25,676 | 2.18 | 11,185 | 5.01 |
20 | 63,317,000 | 29,478 | 2.15 | 17,051 | 3.71 |
21 | 33,824,000 | 20,916 | 1.62 | 9,103 | 3.72 |
22 | 33,786,000 | 28,410 | 1.19 | 11,056 | 3.06 |
X | 131,245,000 | 34,842 | 3.77 | 20,400 | 6.43 |
Y | 21,753,000 | 4,193 | 5.19 | 1,784 | 12.19 |
RefSeq | 15,696,674 | 14,534 | 1.08 | ||
Totals | 2,710,164,000 | 1,419,190 | 1.91 | 887,450 | 3.05 |
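The "kb per SNP" columns in the table are the sequence length divided by the number of SNPs, expressed in kilobases. The short sketch below recomputes a few rows from the table as a consistency check; the helper name `kb_per_snp` and the dictionary layout are illustrative choices.

```python
def kb_per_snp(length_bp, snp_count):
    """Average spacing between SNPs in kilobases (1 kb = 1,000 bp)."""
    return length_bp / snp_count / 1000


# (length in bp, all SNPs, TSC SNPs), copied from the table above.
rows = {
    "1": (214_066_000, 129_931, 75_166),
    "Y": (21_753_000, 4_193, 1_784),
    "Totals": (2_710_164_000, 1_419_190, 887_450),
}

spacing = {
    name: (round(kb_per_snp(length, all_snps), 2),
           round(kb_per_snp(length, tsc_snps), 2))
    for name, (length, all_snps, tsc_snps) in rows.items()
}
```

For chromosome 1 this gives 214,066,000 / 129,931 / 1,000 ≈ 1.65 kb per SNP over all SNPs, matching the tabulated value; the Y chromosome, with far fewer mapped SNPs relative to its length, comes out at about 5.19 kb per SNP.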
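An important group of SNPs are those corresponding to missense mutations, which cause an amino acid change at the protein level. A point mutation at a particular residue can affect protein function to varying degrees (from no effect to complete disruption of function). Usually, a change between amino acids of similar size and physicochemical properties (e.g. substitution of leucine by valine) has a mild effect, whereas a change between dissimilar amino acids tends to have a stronger one. Similarly, if an SNP disrupts a secondary structure element (e.g. substitution to proline in an α-helix region), the mutation can usually affect the structure and function of the whole protein. Using these simple rules and many others derived by machine learning, a group of programs for predicting SNP effects has been developed.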
[1] "single-nucleotide polymorphism / SNP | Learn Science at Scitable". www.nature.com. Archived from the original on 2015-11-10. Retrieved 2015-11-13.
[2] Ingram VM (October 1956). "A specific chemical difference between the globins of normal human and sickle-cell anaemia haemoglobin". Nature. 178 (4537): 792–4. doi:10.1038/178792a0. PMID 13369537.
[3] Chang JC, Kan YW (June 1979). "beta 0 thalassemia, a nonsense mutation in man". Proceedings of the National Academy of Sciences of the United States of America. 76 (6): 2886–9. doi:10.1073/pnas.76.6.2886. PMC 383714. PMID 88735.
[4] Hamosh A, King TM, Rosenstein BJ, Corey M, Levison H, Durie P, Tsui LC, McIntosh I, Keston M, Brock DJ (August 1992). "Cystic fibrosis patients bearing both the common missense mutation Gly----Asp at codon 551 and the delta F508 mutation are clinically indistinguishable from delta F508 homozygotes, except for decreased risk of meconium ileus". American Journal of Human Genetics. 51 (2): 245–50. PMC 1682672. PMID 1379413.
[5] Wolf AB, Caselli RJ, Reiman EM, Valla J (April 2013). "APOE and neuroenergetics: an emerging paradigm in Alzheimer's disease". Neurobiology of Aging. 34 (4): 1007–17. doi:10.1016/j.neurobiolaging.2012.10.011. PMC 3545040. PMID 23159550.
[6] Zhang K, Qin ZS, Liu JS, Chen T, Waterman MS, Sun F (May 2004). "Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies". Genome Research. 14 (5): 908–16. doi:10.1101/gr.1837404. PMC 479119. PMID 15078859. Archived from the original on 8 May 2018.
[7] Gupta PK, Roy JK, Prasad M (25 February 2001). "Single nucleotide polymorphisms: a new paradigm for molecular marker technology and DNA polymorphism detection with emphasis on their use in plants". Current Science. 80 (4): 524–535. Archived from the original on 13 February 2017.
[8] Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR (October 2015). "A global reference for human genetic variation". Nature. 526 (7571): 68–74. doi:10.1038/nature15393. PMC 4750478. PMID 26432245.
[9] Barreiro LB, Laval G, Quach H, Patin E, Quintana-Murci L (March 2008). "Natural selection has driven population differentiation in modern humans". Nature Genetics. 40 (3): 340–5. doi:10.1038/ng.78. PMID 18246066.
[10] Nachman MW (September 2001). "Single nucleotide polymorphisms and recombination rate in humans". Trends in Genetics. 17 (9): 481–5. doi:10.1016/S0168-9525(01)02409-X. PMID 11525814.
[11] Varela MA, Amos W (March 2010). "Heterogeneous distribution of SNPs in the human genome: microsatellites as predictors of nucleotide diversity and divergence". Genomics. 95 (3): 151–9. doi:10.1016/j.ygeno.2009.12.003. PMID 20026267.
[12] Carlson, Bruce (15 June 2008). "SNPs — A Shortcut to Personalized Medicine". Genetic Engineering & Biotechnology News. Mary Ann Liebert, Inc. 28 (12). Archived from the original on 26 December 2010. Retrieved 2008-07-06. (subtitle) Medical applications are where the market's growth is expected.
[13] Cornelis S, Gansemans Y, Deleye L, Deforce D, Van Nieuwerburgh F (February 2017). "Forensic SNP Genotyping using Nanopore MinION Sequencing". Scientific Reports. 7: 41759. doi:10.1038/srep41759. PMC 5290523. PMID 28155888.
[14] Kidd KK, Pakstis AJ, Speed WC, Grigorenko EL, Kajuna SL, Karoma NJ, Kungulilo S, Kim JJ, Lu RB, Odunsi A, Okonofua F, Parnas J, Schulz LO, Zhukova OV, Kidd JR (December 2006). "Developing a SNP panel for forensic identification of individuals". Forensic Science International. 164 (1): 20–32. doi:10.1016/j.forsciint.2005.11.017. PMID 16360294.
[15] Goldstein JA (October 2001). "Clinical relevance of genetic polymorphisms in the human CYP2C subfamily". British Journal of Clinical Pharmacology. 52 (4): 349–55. doi:10.1046/j.0306-5251.2001.01499.x. PMC 2014584. PMID 11678778.
[16] Lee CR (July–August 2004). "CYP2C9 genotype as a predictor of drug disposition in humans". Methods and Findings in Experimental and Clinical Pharmacology. 26 (6): 463–72. PMID 15349140.
[17] Yanase K, Tsukahara S, Mitsuhashi J, Sugimoto Y (March 2006). "Functional SNPs of the breast cancer resistance protein-therapeutic effects and inhibitor development". Cancer Letters. 234 (1): 73–80. doi:10.1016/j.canlet.2005.04.039. PMID 16303243.
[18] Fareed M, Afzal M (April 2013). "Single-nucleotide polymorphism in genome-wide association of human population: A tool for broad spectrum service". Egyptian Journal of Medical Human Genetics. 14 (2): 123–134. doi:10.1016/j.ejmhg.2012.08.001.
[19] Singh M, Singh P, Juneja PK, Singh S, Kaur T (March 2011). "SNP-SNP interactions within APOE gene influence plasma lipids in postmenopausal osteoporosis". Rheumatology International. 31 (3): 421–3. doi:10.1007/s00296-010-1449-7. PMID 20340021.
[20] Rees A, Shoulders CC, Stocks J, Galton DJ, Baralle FE (February 1983). "DNA polymorphism adjacent to human apoprotein A-1 gene: relation to hypertriglyceridaemia". Lancet. 1 (8322): 444–6. doi:10.1016/S0140-6736(83)91440-X. PMID 6131168.
[21] Li G, Pan T, Guo D, Li LC (2014). "Regulatory Variants and Disease: The E-Cadherin -160C/A SNP as an Example". Molecular Biology International. 2014: 967565. doi:10.1155/2014/967565. PMC 4167656. PMID 25276428.
[22] Lu YF, Mauger DM, Goldstein DB, Urban TJ, Weeks KM, Bradrick SS (November 2015). "IFNL3 mRNA structure is remodeled by a functional non-coding polymorphism associated with hepatitis C virus clearance". Scientific Reports. 5: 16037. doi:10.1038/srep16037. PMC 4631997. PMID 26531896.
[23] Kimchi-Sarfaty C, Oh JM, Kim IW, Sauna ZE, Calcagno AM, Ambudkar SV, Gottesman MM (January 2007). "A "silent" polymorphism in the MDR1 gene changes substrate specificity". Science. 315 (5811): 525–8. doi:10.1126/science.1135308. PMID 17185560.
[24] Al-Haggar M, Madej-Pilarczyk A, Kozlowski L, Bujnicki JM, Yahia S, Abdel-Hadi D, Shams A, Ahmad N, Hamed S, Puzianowska-Kuznicka M (November 2012). "A novel homozygous p.Arg527Leu LMNA mutation in two unrelated Egyptian families causes overlapping mandibuloacral dysplasia and progeria syndrome". European Journal of Human Genetics. 20 (11): 1134–40. doi:10.1038/ejhg.2012.77. PMC 3476705. PMID 22549407. Archived from the original on 2015-11-16.
[25] Cordovado SK, Hendrix M, Greene CN, Mochal S, Earley MC, Farrell PM, Kharrazi M, Hannon WH, Mueller PW (February 2012). "CFTR mutation analysis and haplotype associations in CF patients". Molecular Genetics and Metabolism. 105 (2): 249–54. doi:10.1016/j.ymgme.2011.10.013. PMC 3551260. PMID 22137130.
[26] Giegling I, Hartmann AM, Möller HJ, Rujescu D (November 2006). "Anger- and aggression-related traits are associated with polymorphisms in the 5-HT-2A gene". Journal of Affective Disorders. 96 (1–2): 75–81. doi:10.1016/j.jad.2006.05.016. PMID 16814396.
[27] Kujovich JL (January 2011). "Factor V Leiden thrombophilia". Genetics in Medicine. 13 (1): 1–16. doi:10.1097/GIM.0b013e3181faa0f2. PMID 21116184.
[28] Morita A, Nakayama T, Doba N, Hinohara S, Mizutani T, Soma M (June 2007). "Genotyping of triallelic SNPs using TaqMan PCR". Molecular and Cellular Probes. 21 (3): 171–6. doi:10.1016/j.mcp.2006.10.005. PMID 17161935.
[29] Prodi DA, Drayna D, Forabosco P, Palmas MA, Maestrale GB, Piras D, Pirastu M, Angius A (October 2004). "Bitter taste study in a sardinian genetic isolate supports the association of phenylthiocarbamide sensitivity to the TAS2R38 bitter receptor gene". Chemical Senses. 29 (8): 697–702. doi:10.1093/chemse/bjh074. PMID 15466815.
[30] Ammitzbøll CG, Kjær TR, Steffensen R, Stengaard-Pedersen K, Nielsen HJ, Thiel S, Bøgsted M, Jensenius JC (28 November 2012). "Non-synonymous polymorphisms in the FCN1 gene determine ligand-binding ability and serum levels of M-ficolin". PLOS ONE. 7 (11): e50585. doi:10.1371/journal.pone.0050585. PMC 3509001. PMID 23209787. Archived from the original on 7 June 2015.
[31] Ji G, Long Y, Zhou Y, Huang C, Gu A, Wang X (May 2012). "Common variants in mismatch repair genes associated with increased risk of sperm DNA damage and male infertility". BMC Medicine. 10: 49. doi:10.1186/1741-7015-10-49. PMC 3378460. PMID 22594646.
[32] National Center for Biotechnology Information, United States National Library of Medicine. 2014. NCBI dbSNP build 142 for human. Archived from the original on 2017-09-10. Retrieved 2017-09-11.
[33] National Center for Biotechnology Information, United States National Library of Medicine. 2015. NCBI dbSNP build 144 for human. Summary Page. Archived from the original on 2017-09-10. Retrieved 2017-09-11.
[34] Glusman G, Caballero J, Mauldin DE, Hood L, Roach JC (November 2011). "Kaviar: an accessible system for testing SNV novelty". Bioinformatics. 27 (22): 3216–7. doi:10.1093/bioinformatics/btr540. PMC 3208392. PMID 21965822.
[35] Cao R, Shi Y, Chen S, Ma Y, Chen J, Yang J, Chen G, Shi T (January 2017). "dbSAP: single amino-acid polymorphism database for protein variation detection". Nucleic Acids Research. 45 (D1): D827–D832. doi:10.1093/nar/gkw1096. PMC 5210569. PMID 27903894.
[36] J.T. Den Dunnen (2008-02-20). "Recommendations for the description of sequence variants". Human Genome Variation Society. Archived from the original on 2008-09-14. Retrieved 2008-09-05.
[37] den Dunnen JT, Antonarakis SE (2000). "Mutation nomenclature extensions and suggestions to describe complex mutations: a discussion". Human Mutation. 15 (1): 7–12. doi:10.1002/(SICI)1098-1004(200001)15:1<7::AID-HUMU4>3.0.CO;2-N. PMID 10612815.
[38] Ogino S, Gulley ML, den Dunnen JT, Wilson RB (February 2007). "Standard mutation nomenclature in molecular diagnostics: practical and educational challenges". The Journal of Molecular Diagnostics. 9 (1): 1–6. doi:10.2353/jmoldx.2007.060081. PMC 1867422. PMID 17251329.
[39] Sachidanandam R, Weissman D, Schmidt SC, Kakol JM, Stein LD, Marth G, Sherry S, Mullikin JC, Mortimore BJ, Willey DL, Hunt SE, Cole CG, Coggill PC, Rice CM, Ning Z, Rogers J, Bentley DR, Kwok PY, Mardis ER, Yeh RT, Schultz B, Cook L, Davenport R, Dante M, Fulton L, Hillier L, Waterston RH, McPherson JD, Gilman B, Schaffner S, Van Etten WJ, Reich D, Higgins J, Daly MJ, Blumenstiel B, Baldwin J, Stange-Thomann N, Zody MC, Linton L, Lander ES, Altshuler D (February 2001). "A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms". Nature. 409 (6822): 928–33. doi:10.1038/35057149. PMID 11237013.
[40] Altshuler D, Pollara VJ, Cowles CR, Van Etten WJ, Baldwin J, Linton L, Lander ES (September 2000). "An SNP map of the human genome generated by reduced representation shotgun sequencing". Nature. 407 (6803): 513–6. doi:10.1038/35035083. PMID 11029002.
[41] Drabovich AP, Krylov SN (March 2006). "Identification of base pairs in single-nucleotide polymorphisms by MutS protein-mediated capillary electrophoresis". Analytical Chemistry. 78 (6): 2035–8. doi:10.1021/ac0520386. PMID 16536443.
[42] Griffin TJ, Smith LM (July 2000). "Genetic identification by mass spectrometric analysis of single-nucleotide polymorphisms: ternary encoding of genotypes". Analytical Chemistry. 72 (14): 3298–302. doi:10.1021/ac991390e. PMID 10939403.
[43] Tahira T, Kukita Y, Higasa K, Okazaki Y, Yoshinaga A, Hayashi K (2009). "Estimation of SNP allele frequencies by SSCP analysis of pooled DNA". Methods in Molecular Biology. 578: 193–207. doi:10.1007/978-1-60327-411-1_12. ISBN 978-1-60327-410-4. PMID 19768595.
[44] Malhis N, Jones SJ, Gsponer J (April 2019). "Improved measures for evolutionary conservation that exploit taxonomy distances". Nature Communications. 10 (1): 1556. doi:10.1038/s41467-019-09583-2. PMID 30952844.
[45] "View of SNPViz - Visualization of SNPs in proteins". genomicscomputbiol.org (in English). Retrieved 2018-10-20.