Zhenyu Bi¹, Minghao Xu², Jian Tang², Xuan Wang¹
¹Department of Computer Science, Virginia Tech, USA
²Mila - Quebec AI Institute, Canada

About the Tutors
•Zhenyu Bi, PhD Student, Department of Computer Science, Virginia Tech
•Minghao Xu, PhD Student, Mila - Quebec AI Institute
•Xuan Wang, Assistant Professor, Department of Computer Science, Virginia Tech
•Jian Tang, Associate Professor, Mila - Quebec AI Institute

Tutorial Outline
•9:00 am–9:10 am: Introduction
•9:10 am–10:00 am: Part I: Scientific Text
•10:00 am–10:10 am: Break
•10:10 am–11:10 am: Part II: Brain Signals
•11:10 am–11:20 am: Break
•11:20 am–12:20 pm: Part III: Biological Sequences
•12:20 pm–12:30 pm: Summary and Q&A

AI for Sciences
Large Language Models (LLMs):
•BERT (Devlin et al., 2019)
•GPT-3 (Brown et al., 2020)
•T5 (Raffel et al., 2020)
Applications: Machine Translation; Dialog Systems, Chatbots, Digital Assistants; Natural Language Generation

Resemblance Between Scientific Data and Language
•Sequences!
•Scientific Textual Data: Scientific Literature, Electronic Health Records
•Sensor Data: Brain Electroencephalogram (EEG) Signals
•Biological Sequences: DNA, RNA, protein

This Tutorial: Can we harness the potential of these recent LLMs to drive scientific progress?

Scientific Large Language Models: Challenges and Opportunities
Xuan Wang, Assistant Professor, Department of Computer Science, Sanghani Center for AI and Data Analytics, Virginia Tech

Outline
•Scientific Large Language Models
•Future Directions:
  •Complex Reasoning and Planning
  •Multi-modal Learning
  •Trustworthiness of LLMs

Large Language Models (LLMs)
Yang, J., Jin, H., Tang, R., Han, X., Feng, Q., Jiang, H., ... & Hu, X. (2024). Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. ACM Transactions on Knowledge Discovery from Data, 18(6), 1-32.
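The "sequences" analogy above can be made concrete: biological sequences can be tokenized and mapped to integer ids much like natural-language text, which is why LLM architectures transfer to DNA, RNA, and protein data. The sketch below is illustrative only (not from the tutorial) and uses overlapping k-mer tokenization, a common choice in DNA language models; the function names are our own.

```python
# Illustrative sketch: tokenizing a DNA sequence the way an LLM
# tokenizer handles text, via overlapping k-mers (stride 1).

def kmer_tokenize(sequence: str, k: int = 3) -> list[str]:
    """Split a sequence into overlapping k-mers of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Map each distinct token to an integer id, as a tokenizer vocabulary would."""
    vocab: dict[str, int] = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

dna = "ATGCGTAC"
tokens = kmer_tokenize(dna, k=3)
ids = [build_vocab(tokens)[t] for t in tokens]
print(tokens)  # ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
print(ids)     # [0, 1, 2, 3, 4, 5]
```

The resulting id sequence is exactly the kind of input a Transformer language model consumes, whether the underlying symbols are words or nucleotides.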
Non-comprehensive evolutionary tree for protein/RNA/DNA language models
Jian Ma, "Large Language Models in Computational Biology – A Primer (2024 Update)", 2024.

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (Zhang et al., EMNLP 2024)
•Surveys over 260 scientific LLMs
•Across fields: 1) general science, 2) mathematics, 3) physics, 4) chemistry and materials science, 5) biology and medicine, and 6) geography, geology, and environmental science
•Across modalities: 1) text, 2) graph, 3) vision, and 4) time series
•Website: https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models

Towards Expert-Level Medical Question Answering with Large Language Models (Med-PaLM 2, Google, 2023)
Figure 1 | Med-PaLM 2 performance on MultiMedQA. Left: Med-PaLM 2 achieved an accuracy of 86.5% on USMLE-style questions in the MedQA dataset. Right: In a pairwise ranking study on 1066 consumer medical questions, Med-PaLM 2 answers were preferred over physician answers by a panel of physicians across eight of nine axes in our evaluation framework.
Singhal, K., Tu, T., Gottweis, J., Sayres, R., Wulczyn, E., Hou, L., ... & Natarajan, V. (2023). Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617.
OpenAI o1 Surpasses Human Performance on PhD-Level Science Questions (OpenAI, 2024)

Outline
•Scientific Large Language Models
•Future Directions:
  •Complex Reasoning and Planning
  •Multi-modal Learning
  •Trustworthiness of LLMs

Evaluation and Mitigation of the Limitations of Large Language Models in Clinical Decision-Making (Hager et al., Nature Medicine 2024)
https://doi.org/10.1038/s41591-024-03097-1
[Figure: the evaluation covers 2,400 cases across four pathologies (appendicitis, cholecystitis, diverticulitis, pancreatitis); input tools include the history of present illness, physical examination, laboratory results, and radiologist reports.]

LLMs Diagnose Significantly Worse than Clinicians
Due to the data usage agreement of MIMIC-IV, only open-access models that can be downloaded can be used with the data; thus, only LLMs based on Llama 2 were used in this evaluation.

Diagnostic Accuracy of LLMs Decreased in an Autonomous Clinical Decision-Making Scenario

LLMs Do Not Consistently Recommend Essential and Patient-Specific Treatment

LLMs Are Sensitive to the Quantity of Information Provided

LLMs Are Sensitive to the Order of Information
[Truncated figure caption: "…on the MIMIC-CDM-FI dataset. This suggests that LLMs cannot … key facts and degrade in performance when too much information…"]