
AI for Science in the Era of Large Language Models

Information Technology 2025-03-05 - VT 等待花开

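Overview

This report examines applications of large language models (LLMs) in science, focusing on three areas: scientific text, brain signals, and biological sequences. It first introduces the basic concepts of LLMs and their use in machine translation, dialog systems, and related tasks, then observes that scientific data resembles language, which gives LLMs a foothold in scientific domains. It goes on to lay out the challenges and opportunities for scientific LLMs: complex reasoning and planning, multi-modal learning, and the trustworthiness of LLMs.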

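Scientific Text

The report reviews recent progress in scientific LLMs, including the results of Med-Palm-2 and OpenAI o1 on scientific question answering. It also notes the limitations of LLMs in clinical decision-making: they diagnose less accurately than clinicians and are sensitive to both the quantity and the order of the information provided. To address these problems, the report presents TriageAgent, which uses multi-agent collaboration to improve LLM performance in clinical triage.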
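The order sensitivity mentioned above can be probed with a simple harness: present the same facts in several orders and measure how often the answer stays the same. The following is a minimal sketch, not the evaluation protocol used in the cited study. The function `order_sensitivity`, the stand-in `toy_model`, and the example facts are all hypothetical and exist only for illustration; a real experiment would call an actual LLM instead.

```python
import itertools
from collections import Counter


def order_sensitivity(model, facts, max_orders=24):
    """Query `model` with `facts` in different orders.

    Returns (majority_answer, agreement), where agreement is the fraction
    of tested orderings that produced the majority answer. A value of 1.0
    means the answer did not depend on the order of the facts.
    """
    answers = []
    for perm in itertools.islice(itertools.permutations(facts), max_orders):
        prompt = "\n".join(perm)
        answers.append(model(prompt))
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_answer, top_count / len(answers)


def toy_model(prompt):
    """Hypothetical stand-in for an LLM that only 'reads' the first line.

    This exaggerates order sensitivity on purpose so the effect is visible.
    """
    first_line = prompt.split("\n")[0]
    return "appendicitis" if "right lower quadrant" in first_line else "unclear"


# Illustrative, made-up case description split into three facts.
facts = [
    "History: pain in the right lower quadrant for 12 hours",
    "Lab: elevated white blood cell count",
    "Imaging: enlarged appendix on ultrasound",
]

answer, agreement = order_sensitivity(toy_model, facts)
print(answer, round(agreement, 2))
```

With three facts there are six orderings, so the toy model sees the history line first in only two of them. An order-robust model would score an agreement of 1.0 on the same harness.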

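Brain Signals

The report highlights brain LLMs built on electroencephalogram (EEG) data, including BrainBERT, MMM, and LaBraM, and reports notable results from these models on brain-to-text translation. It also analyzes the characteristics of EEG data: it forms a spatio-temporal sequence, has high temporal but low spatial resolution, and has a low signal-to-noise ratio. Proposed research directions include better spatio-temporal signal encoding, personalization of EEG models, and the construction of multi-modal brain LLMs.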
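One way to picture EEG as a spatio-temporal sequence is to cut each channel into fixed-length time windows and treat every (channel, window) pair as one token. The sketch below shows only that idea under simplifying assumptions. The function `patchify` and the toy recording are hypothetical, and the report does not state that any of the named models tokenizes EEG exactly this way.

```python
def patchify(eeg, patch_len, stride=None):
    """Split a (channels x samples) recording into fixed-length patches.

    Returns a list of (channel_index, start_sample, values) tuples ordered
    by time window first and channel second, so a sequence model would see
    one token per channel per time window.
    """
    stride = stride or patch_len
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same number of samples")

    patches = []
    for start in range(0, n_samples - patch_len + 1, stride):
        for ch_idx, channel in enumerate(eeg):
            patches.append((ch_idx, start, channel[start:start + patch_len]))
    return patches


# Toy recording: 2 channels, 8 samples each (real EEG has far more of both).
eeg = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [10, 11, 12, 13, 14, 15, 16, 17],
]

tokens = patchify(eeg, patch_len=4)
print(len(tokens))  # 2 time windows x 2 channels
print(tokens[1])    # second channel, first window
```

Keeping the channel index and start sample alongside each patch preserves the spatial and temporal position, which a model could later turn into positional embeddings.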

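Biological Sequences

The report discusses LLMs for biological sequences, covering DNA, RNA, and protein LLMs. It introduces models such as DNABERT, GenSLMs, RNA-FM, and ProteinBERT and their use in predicting regulatory elements, RNA structure and function, and protein structure. It also highlights single-cell LLMs such as scBERT, scGPT, and scMVP, which perform well in single-cell analysis and can be used for cell type annotation, trajectory inference, and gene regulatory network inference.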
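To treat a biological sequence like a sentence, it first has to be split into tokens. Overlapping k-mers are one common choice for DNA. The sketch below illustrates that step only. The helper names `kmer_tokenize` and `build_vocab` and the example sequence are hypothetical, and the report does not specify which tokenizer each of the models above uses.

```python
def kmer_tokenize(sequence, k=3, stride=1):
    """Split a nucleotide string into overlapping k-mers ("words")."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


# Made-up 7-base sequence, used only to show the mechanics.
tokens = kmer_tokenize("ATGCGAT", k=3)
vocab = build_vocab(tokens)
ids = [vocab[token] for token in tokens]

print(tokens)  # overlapping 3-mers
print(ids)     # integer ids a language model would consume
```

With stride 1, adjacent tokens share k-1 bases, so a sequence of length n yields n-k+1 tokens. A larger stride gives fewer, non-overlapping tokens at the cost of losing alignment-independent context.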

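Future Directions

The report closes by summarizing future directions for LLMs in science:

Complex reasoning and planning: use multi-agent LLMs to improve performance on complex tasks.
Multi-modal learning: systematically integrate multi-modal data such as networks, text, and images to build multi-modal biological sequence LLMs.
Trustworthiness: systematically evaluate and mitigate trustworthiness problems of LLMs in scientific applications.

The report stresses the potential of LLMs to advance scientific discovery and looks ahead to future applications such as AI scientists and agent laboratories.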

Report Cover

Xuan Wang
Assistant Professor, Department of Computer Science
Sanghani Center for AI and Data Analytics, Virginia Tech

About the Tutor

Xuan Wang, Assistant Professor, Department of Computer Science, Virginia Tech
Research Interests: natural language processing, data mining, AI for sciences, and AI for healthcare.
Website: https://xuanwang91.github.io/

Tutorial Outline

• 8:30 am–9:00 am: Introduction
• 9:00 am–10:00 am: Part I: Scientific Text
• 10:00 am–10:30 am: Part II: Brain Signals
• 10:30 am–11:00 am: Break
• 11:00 am–12:00 pm: Part III: Biological Sequences
• 12:00 pm–12:30 pm: Summary and Q&A

AI for Sciences

Zhang et al., "Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems", arXiv, 2023

Large Language Models (LLMs)

• BERT (Devlin et al., 2019)
• GPT (Brown et al., 2020)
• T5 (Raffel et al., 2020)

Applications: Machine Translation; Dialog Systems, Chatbots, Digital Assistants; Natural Language Generation

Resemblance Between Scientific Data and Language

• Sequences!
• Scientific Textual Data: Scientific Literature, Electronic Health Record
• Sensor Data: Brain Electroencephalogram (EEG) Signals
• Biological Sequences: DNA, RNA, protein

This Tutorial: Can we harness the power of these recent LLMs to drive scientific progress?

Scientific Large Language Models: Challenges and Opportunities

Outline

• Scientific Large Language Models
• Future Directions:
• Complex Reasoning and Planning
• Multi-modal Learning
• Trustworthiness of LLMs

Large Language Models (LLMs)

Yang, J., Jin, H., Tang, R., Han, X., Feng, Q., Jiang, H., ... & Hu, X. (2024). Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. ACM Transactions on Knowledge Discovery from Data, 18(6), 1-32.

Jian Ma, "Large Language Models in Computational Biology – A Primer (2024 Update)", 2024

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (Zhang et al., EMNLP 2024)

• Survey over 260 scientific LLMs
• Across fields: 1) general science, 2) mathematics, 3) physics, 4) chemistry and material science, 5) biology and medicine, and 6) geography, geology, and environmental science
• Across modalities: 1) text, 2) graph, 3) vision, and 4) time series
• Website: https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models

Towards Expert-Level Medical Question Answering with Large Language Models (Med-Palm-2, Google, 2024)

Singhal, K., Tu, T., Gottweis, J., Sayres, R., Wulczyn, E., Hou, L., ... & Natarajan, V. (2023). Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617.

OpenAI o1 Surpasses Human Performance on PhD-Level Science Questions (OpenAI, 2024)

Evaluation and Mitigation of the Limitations of Large Language Models in Clinical Decision-Making (Hager et al., Nature Medicine 2024)

• LLMs Diagnose Significantly Worse than Clinicians
• Diagnostic Accuracy of LLMs Decreased in an Autonomous Clinical Decision-Making Scenario
• LLMs Do Not Consistently Recommend Essential and Patient-Specific Treatment
• LLMs Are Sensitive to the Quantity of Information Provided
• LLMs Are Sensitive to the Order of Information

TriageAgent: Towards Better Multi-Agent Collaboration for Large Language Model-Based Clinical Triage (Lu et al., EMNLP 2024)

Vision–Language Foundation Model for Echocardiogram Interpretation (Christensen et al., Nature Medicine 2024)

• EchoCLIP Workflow

Transparent Medical Image AI via an Image–Text Foundation Model Grounded in Medical Literature (Kim et al., Nature Medicine 2024)
