Title: Alignment and Database Search of RNA Tertiary Structures
Authors: Yang, Chung-Hang (杨忠翰)
Lu, Chin-Lung (卢锦隆)
Lin, Tiao-Yin (林苕吟)
Institute of Bioinformatics and Systems Biology
Keywords: RNA; 3D structure; Structural alignment; Structural alphabet
Issue Date: 2016
Abstract: In recent years, interest in RNAs has grown rapidly, because they not only transfer genetic information from DNA to proteins but also play essential roles in many cellular processes, such as gene regulation, RNA modification and chromosome replication. The functions of most available RNAs, however, are still unknown. As with proteins, a more reliable way to determine the functions of RNAs is to analyze their tertiary structures, because the structures of molecules are typically more evolutionarily conserved than their primary sequences. Detecting structural similarities between two RNA molecules at the tertiary structure level is nevertheless difficult, because the problem has been shown to be NP-hard. Previously, our laboratory used a heuristic approach to develop a tool called iPARTS, which allows biologists to compare two RNA tertiary structures quickly and accurately. iPARTS takes a structural alphabet (SA)-based approach: it uses an SA of 23 letters to reduce RNA 3D structures to 1D sequences of SA letters, and then applies traditional sequence alignment to these SA-encoded sequences to determine their global or local similarity.
In this study, we first developed a BLAST-like search tool, called R3D-BLAST, based on the same SA-based approach. R3D-BLAST lets the user search the PDB for RNA structures that share similar substructures with a specified query RNA structure. The basic idea is that all RNA 3D structures deposited in the PDB are first encoded as 1D structural sequences using a structural alphabet of 23 distinct nucleotide conformations; BLAST is then applied to these 1D structural sequences to find RNA substructures whose 1D structural sequences are similar to that of the query RNA substructure. Our experimental results show that R3D-BLAST can quickly and accurately search the PDB for RNAs that share similar 3D substructures with a query RNA.
Second, we re-implemented iPARTS as a new web server, iPARTS2, by constructing an entirely new SA of 92 elements, each carrying both base and backbone geometry information for a representative nucleotide. This SA differs significantly from the one used in iPARTS, which consists of only 23 elements, each carrying only the backbone geometry information of a representative nucleotide. Our experimental results show that iPARTS2 outperforms its predecessor iPARTS and also achieves better accuracy than other popular tools, such as SARA, SETTER and RASS, in RNA alignment quality and function prediction. R3D-BLAST and iPARTS2 are available online at http://genome.cs.nthu.edu.tw/R3D-BLAST/ and http://genome.cs.nthu.edu.tw/iPARTS2/, respectively.
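The structural alphabet idea described in the abstract (encode each nucleotide as a letter, then align the letter sequences) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the four-letter alphabet `CENTROIDS`, the use of two angles per nucleotide as the conformation descriptor, the nearest-centroid assignment, and the match/mismatch/gap scores. The actual iPARTS and iPARTS2 alphabets have 23 and 92 elements, and their construction and scoring are not given in this record.

```python
import math

# Hypothetical 4-letter structural alphabet: each letter is a representative
# conformation described by two angles in degrees (an assumed descriptor).
CENTROIDS = {
    "A": (170.0, 210.0),
    "B": (60.0, 180.0),
    "C": (300.0, 90.0),
    "D": (200.0, 330.0),
}


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def encode(angles):
    """Map each (angle1, angle2) pair to the letter of its nearest centroid."""
    letters = []
    for a1, a2 in angles:
        nearest = min(
            CENTROIDS,
            key=lambda k: math.hypot(
                angular_diff(a1, CENTROIDS[k][0]),
                angular_diff(a2, CENTROIDS[k][1]),
            ),
        )
        letters.append(nearest)
    return "".join(letters)


def local_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    query = encode([(172, 208), (58, 181), (301, 92)])
    target = "DD" + query + "AA"  # a longer, already encoded sequence
    print(query, local_alignment_score(query, target))
```

Once structures are reduced to strings in this way, any standard sequence tool can be applied to them, which is the property that makes a BLAST-style database search over encoded structures possible. A realistic scoring scheme would likely use a letter-to-letter substitution matrix rather than the flat match/mismatch scores assumed here.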
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT079851807
http://hdl.handle.net/11536/139042
Appears in Collections: Thesis