Monitoring Global Aid Flows: A Novel Approach Using Large Language Models

Information Technology 2025-11-05 World Bank SaintL

Monitoring Global Aid Flows: A Novel Approach Using Large Language Models

Xubei Luo, Arvind Balaji Rajasekaran, Andrew Conner Scruggs

Policy Research Working Paper 11248

Abstract

Effective monitoring of development aid is the foundation for assessing the alignment of flows with their intended development objectives. Existing reporting systems, such as the Organisation for Economic Co-operation and Development’s Creditor Reporting System, provide standardized classification of aid activities but have limitations when it comes to capturing new areas like climate change, digitalization, and other cross-cutting themes. This paper proposes a bottom-up, unsupervised machine learning framework that leverages textual descriptions of aid projects to generate highly granular activity clusters. Using the 2021 Creditor Reporting System data set of nearly 400,000 records, the model produces 841 clusters, which are then grouped into 80 subsectors. These clusters reveal 36 emerging aid areas not tracked in the current Creditor Reporting System taxonomy, allow unpacking of “multi-sectoral” and “sector not specified” classifications, and enable estimation of flows to new themes, including World Bank Global Challenge Programs, International Development Association–20 Special Themes, and Cross-Cutting Issues. Validation against both Creditor Reporting System benchmarks and International Development Association commitment data demonstrates robustness. This approach illustrates how machine learning and the new advances in large language models can enhance the monitoring of global aid flows and inform future improvements in aid classification and reporting. It offers a useful tool that can support more responsive and evidence-based decision-making, helping to better align resources with evolving development priorities.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues.
An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Monitoring Global Aid Flows: A Novel Approach Using Large Language Models

Xubei Luo, Arvind Balaji Rajasekaran, Andrew Conner Scruggs¹

JEL codes: F35, C38, C55
Keywords: Foreign Aid, Classification Methods: Cluster Analysis, Modeling and Analysis

I. Introduction

Monitoring development aid contributes to improving the alignment of financial flows with their intended development objectives. As global development priorities evolve, policy makers require monitoring systems that can track commitments not only to traditional sectors such as health, education, and agriculture, but also to cross-cutting and newly emerged thematic domains like climate change, digitalization, and global pandemics. Without the ability to capture financial support to these dynamic areas, aid data risks underrepresenting the full scope of development efforts, potentially contributing to a misalignment between global goals, such as the Sustainable Development Goals (SDGs), and the operational priorities of development cooperation.

The Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System (CRS)² is the most widely used and comprehensive database of official aid flows, with over 4 million activity-level records reported since 2000.
Its standardized “purpose codes,” designed by the OECD Development Assistance Committee (DAC), enable comparability across financial flows from different donors and over time, making the CRS an indispensable tool for researchers, policy makers, and development practitioners. This categorization follows a top-down approach: the purpose codes, the official labels that identify the sector an aid activity is intended to support (for example, health or agriculture), are predetermined by the DAC before the reporting period begins.³ Because of this structured and consistent coding system, the CRS is highly reliable for tracking flows to traditional sector categories. At the same time, it faces inherent challenges when applied to new or cross-cutting priorities. Activities related to new priorities are often dispersed across multiple sectors or grouped into residual categories like multi-sectoral/cross-cutting or sector not specified. As the Stockholm Environment Institute (2025) describes, “it can be very difficult to compile a picture of financial support for some issues that are of high importance on both national and international agendas, for example sustainable oceans, because they are not in the coding system of