Application of Data Mining and Machine Learning in Toxicology

Citation: Teng Yuefa, Wang Xiaoqing, Li Fei, Wu Huifeng, Ji Chenglong, Yu Jinfu. Application of Data Mining and Machine Learning in Toxicology[J]. Asian Journal of Ecotoxicology, 2022, 17(1): 93-101. doi: 10.7524/AJE.1673-5897.20210915001

    About the author: Teng Yuefa (b. 1998), male, master's student; research area: ecotoxicology. E-mail: yfteng@yic.ac.cn
    Corresponding authors: Li Fei, E-mail: fli@yic.ac.cn; Yu Jinfu, E-mail: yujinfu@ytvc.edu.cn
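  • Funding:

    Yantai Science and Technology Innovation Development Plan Project (2020MSGY060)

    National Natural Science Foundation of China (21677173, 41530642)

    Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017255)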

  • Chinese Library Classification (CLC) number: X171.5

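  • Abstract: With the rapid development of high-throughput screening technologies, the volume of toxicity-related information on chemicals is growing steadily. Rapidly advancing computational methods such as data mining and machine learning offer new approaches to predicting chemical toxicity and to preventing and controlling risk. The adverse outcome pathway (AOP) framework links the structure of a compound, the molecular initiating event, and the adverse outcome in the organism. It provides a new paradigm for the toxicity testing, prediction, and assessment of pollutants, with the ultimate aim of supporting risk assessment and management decisions. Quantitative structure-activity relationship (QSAR) modeling, molecular simulation, and multi-omics technologies play important roles in every aspect of AOPs. On this basis, this review introduces how data mining and machine learning are applied in toxicology, covering QSAR modeling, molecular simulation, and omics, and uses case studies to describe systematically the current research priorities and directions, so as to better fit the research context of the big-data era.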
Publication history
  • Received: 2021-09-15

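Author affiliations:
  • 1. CAS Key Laboratory of Coastal Environmental Processes and Ecological Remediation (Yantai Institute of Coastal Zone Research); Shandong Key Laboratory of Coastal Environmental Processes; Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003;
  • 2. Network Center, Yantai Vocational College, Yantai 264670;
  • 3. University of Chinese Academy of Sciences, Beijing 100049;
  • 4. Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao 266071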
