Key Points for Building Large Language Models from Scratch

Table of Contents

Introduction
Build vs. Buy Pre-trained LLM Models
The Scaling Laws
Hardware
Dataset Collection
Dataset Pre-processing
Pre-training Steps
Model Evaluation
Bias and Toxicity
Instruction Tuning
Reinforcement Learning through Human Feedback (RLHF)