Paper title

Machine learning can guide experimental approaches for protein digestibility estimations

Authors

Sara Malvar, Anvita Bhagavathula, Maria Angels de Luis Balaguer, Swati Sharma, Ranveer Chandra

Abstract

Food protein digestibility and bioavailability are critical aspects in addressing human nutritional demands, particularly when seeking sustainable alternatives to animal-based proteins. In this study, we propose a machine learning approach to predict the true ileal digestibility coefficient of food items. The model makes use of a unique curated dataset that combines nutritional information from different foods with FASTA sequences of some of their protein families. We extracted the biochemical properties of the proteins and combined these properties with embeddings from a Transformer-based protein Language Model (pLM). In addition, we used SHAP to identify features that contribute most to the model prediction and provide interpretability. This first AI-based model for predicting food protein digestibility has an accuracy of 90% compared to existing experimental techniques. With this accuracy, our model can eliminate the need for lengthy in-vivo or in-vitro experiments, making the process of creating new foods faster, cheaper, and more ethical.
