[Center for Security and Emerging Technology]: The Use of Open Models in Research - 发现报告

The Use of Open Models in Research


Executive Summary

There is widespread consensus that open and freely available AI models benefit research. Yet there is a lack of empirical evidence detailing how this relationship manifests. This report aims to fill this gap by investigating the use of open large language models (LLMs) in published research, overviewing what organizations and countries use them most frequently, and considering their wider impact on research. To this end, we identify and analyze more than 250 publications that use open models in ways that require access to model weights, and derive a taxonomy of use cases that openly available model weights exclusively or predominantly enable. We then review more than 130 publications that use closed models to compare use cases when model weights are and are not openly available.

Our analysis finds that open models enable a more diverse range of use cases than closed models. Of the eight high-level use cases for AI models we identified, five are exclusively enabled by access to model weights, two predominantly require weights, and one does not require weights. Those requiring weights include continuously pretraining models to expand their general knowledge, compressing models to improve their efficiency, combining different models or synchronizing their modalities (e.g., text and imagery), and measuring the functionality of models on hardware or the performance of hardware when running models.

Two use cases predominantly require access to weights: fine-tuning models for particular tasks or domains, and examining model internals to interpret their functionality.
While some closed model application programming interfaces (APIs) allow for these use cases, the access offered is generally very limited and does not, for example, allow for customized fine-tuning or granular examination of model internals. These APIs are therefore generally less useful to researchers for these use cases, and most studies assessed in this report that conducted model fine-tuning or examination required access to model weights.

The final use case is prompting, which we define as any form of input-output probing. Prompting allows for the evaluation of model performance, capabilities, alignment, and safety, among other things, and requires only minimal access to a model through a web or programming interface, so it can be conducted on both open and closed models. In our sample of papers that used closed models, researchers engaged almost exclusively in model prompting.

These open model use cases allow researchers to investigate a wider range of questions, explore more avenues of experimentation, and implement and demonstrate a wider range of techniques than if they only had access to closed models. For example, researchers can custom fine-tune or continuously pretrain open models to study how a model’s performance or behavior changes with the introduction of new datasets and techniques, or examine open models to assess how their internal parameters and processes contribute to and influence model behaviors, which is an important enabler of AI interpretability and auditing. We note that some researchers may prefer to use closed models, especially for prompting, as state-of-the-art models tend to be closed, often come with convenient user interfaces and APIs, and do not require the user to download and run the model on custom computing infrastructure. Notwithstanding such factors, we find that access to open models can support advances in important areas of research beyond what is possible with closed models.

When it comes to the types of authors and organizations conducting research that uses open models, we find that nearly 90% and 50% of the papers in our sample were produced by researchers at academic institutions and companies, respectively, with about 35% being written in collaboration by authors at these types of organizations. While open models can be beneficial to lower-resource academic organizations, the prevalence of academia in our sample probably reflects the fact that academic institutions are more likely to publish their research. We also find that the majority of papers that use open models in our sample are produced by researchers at U.S. organizations (64%), followed by Chinese organizations (38%), which reflects broader trends in AI research output, as well as the predominance of English language research in our sample.

Table of Contents

Introduction
Context: AI Access, Openness, and Weights
Methodology
  Assessing Open Model Use Cases
  Assessing Closed Model Use Cases