Paper title
Process of Efficiently Parallelizing a Protein Structure Determination Algorithm
Paper authors
Abstract
Computational protein structure determination involves optimization over a problem space far too large to search exhaustively. Existing approaches include optimization algorithms such as gradient descent and simulated annealing, but these typically find only local minima. A novel approach, implemented in REDcRAFT, is to fold a protein residue by residue instead of all at once. This simulates a protein folding as each residue exits the ribosome that generates it. While REDcRAFT exponentially reduces the problem space so that it can be explored in polynomial time, it is still extremely computationally demanding. The algorithm does have the advantage that most of its execution time is spent in inherently parallelizable code. However, preliminary results from parallel execution indicate that approximately two-thirds of execution time is consumed by system overhead. By carefully analyzing the structure of the program and timing its components, the major bottlenecks can be identified. After these issues are addressed, REDcRAFT becomes a scalable parallel application with an improvement of nearly two orders of magnitude.
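The claim that residue-by-residue folding shrinks an exponential search into a polynomial one can be illustrated with a small counting sketch. This is not REDcRAFT's implementation: the function names, the number of conformational choices per residue, and the cap on retained partial structures (`keep`) are all hypothetical values chosen only to show how the two search sizes grow.

```python
def exhaustive_count(n_residues: int, choices_per_residue: int) -> int:
    """Conformations examined if all residues are folded at the same time."""
    return choices_per_residue ** n_residues


def incremental_count(n_residues: int, choices_per_residue: int, keep: int) -> int:
    """Candidates examined if residues are added one at a time and at most
    `keep` partial structures are retained after each step (an assumption
    made for this illustration)."""
    total = 0
    retained = 1  # start from a single empty structure
    for _ in range(n_residues):
        # Every retained partial structure is extended by each possible choice.
        candidates = retained * choices_per_residue
        total += candidates
        # Only the best `keep` candidates survive to the next residue.
        retained = min(candidates, keep)
    return total


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    n, k, keep = 10, 4, 8
    print("exhaustive :", exhaustive_count(n, k))
    print("incremental:", incremental_count(n, k, keep))
```

Under these assumptions the exhaustive count grows as `choices_per_residue ** n_residues`, while the incremental count is bounded by `n_residues * keep * choices_per_residue`, which is linear in the number of residues. Each step's candidate evaluations are independent of one another, which is consistent with the abstract's remark that most of the work is inherently parallelizable.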