Machine Learning Models on Screening Ready Biodegradability of Chemicals

Citation: Xu Jiaxi, Wang Haobo, Xiao Zijun, Liu Wenjia, He Jiale, Chen Jingwen. Machine Learning Models on Screening Ready Biodegradability of Chemicals[J]. Asian Journal of Ecotoxicology, 2024, 19(4): 43-52. doi: 10.7524/AJE.1673-5897.20240322001

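    About the first author: Xu Jiaxi (born 1999), female, master's student; research area: computational toxicology. E-mail: 978932988@qq.com
    Corresponding author: Chen Jingwen (born 1969), male, Ph.D., professor; main research areas: control technologies for emerging pollutants, environmental computational toxicology, and chemical risk prediction technologies. E-mail: jwchen@dlut.edu.cn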
  • Funding:

    National Key Research and Development Program of China (2022YFC3902100); National Natural Science Foundation of China (22136001)

  • CLC number: X171.5

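  • Abstract: Determining whether a chemical can be readily biodegraded supports the environmental risk assessment of chemicals. Previous screening models for the ready biodegradability (RB) of chemicals were trained on datasets covering a small chemical space, had low prediction accuracy, and lacked an effective characterization of their applicability domain. This study collected RB data for 5 606 chemicals and built machine learning screening models. The model built with extreme gradient boosting trees and Mordred molecular descriptors performed best, reaching a prediction accuracy of 0.86 and an area under the receiver operating characteristic curve of 0.92 on the external validation set. Two metrics, weighted molecular similarity density and weighted ruggedness, effectively characterized the applicability domain of the model. Mechanistic analysis of the model showed that carboxyl or hydroxyl groups significantly increase the RB of chemical substances. Screening of the Inventory of Existing Chemical Substances in China indicated that more than 60% of the substances are not readily biodegradable, with benzene and its derivatives accounting for the largest share. The RB screening model and its applicability domain can provide technical support for the environmental management of chemicals.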
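The two performance figures quoted in the abstract, prediction accuracy and area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. The function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration only; they are not the study's data or code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted class labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0).
    Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readily biodegradable, 0 = not readily biodegradable.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical model scores (predicted probability of class 1).
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes how well the scores rank the two classes across all thresholds, which is why a classifier is often reported with both.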
Publication history
  • Received: 2024-03-22
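  • Author affiliation: Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China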