AI智能总结
Research Report on the Development of Vector Database 綘ⱗ⽀⡙⟄♴䱗ぜ♶ⴔ⯓た ⻌❩嵳ꆀ侨研䪮助肅⟧剣ꣳⰖծ⻌❩㹈Ⱎ䗞鲱⟝肅⟧剣ꣳⰖծ⻌❩⚎倰鸑猰䪮肅⟧剣ꣳⰖծ帿㖕䋑ꆄ軧㣔敵✻雦皾肅⟧剣ꣳⰖծ굷艗⥌䜂䪮助剣ꣳⰖծ岌䗎緸絞猰䪮肅⟧剣ꣳⰖծ帿⥌剪猰䪮肅⟧剣ꣳⰖծ䷑㼷絁玐兰腊猰䪮(⻌❩)肅⟧剣ꣳⰖ 綘ⱗ絆䧭プ⟄♴䱗ぜ♶ⴔ⯓た 宐兪瀖䎛銯猰䪮㣐㷖絑崸♸盗椚㷖ꤎ⽇㡦ⶰ侅䱇ծ涯梐ծし牬ծ㸢兰ꩮծ獓㸌䣅ծ랫兪ծ⽔俒륫ծ랕謵ծ桬蒌僕ծ餪ꩋծ꣣⟗鴪ծ勿銯ⵄծ勚䘋뚱ծ卌楧ծ䧭⹄俛ծ鿓挳虽ծ齠䒊馄ծ勿㕂欰ծ곥⵿ծ龨⤦兜 晜勉㡮僈 劥灇瑕盟デ晜勉㾩✵Ⰼ椕雦皾翫湅կ ⢪欽霹僈劢絑Ⰼ椕雦皾翫湅✲⯓涸⛼䱇勉♶䖤⟄⟣⡦倰䒭㢕ⵖծ䪴鄆ծ䕧⽪ծ缺峂劥俒咓涸⟣⡦鿈ⴔկⳝ鲮鲿䧴䒸欽劥俒涸錜挿ծ侨研霼岤僈勻彂Ⰼ椕雦皾翫湅կ 湡 䔶 ぢꆀ侨研䎽涯淼⛼....................................................1Research Report on the Development of Vector Database............1痦♧畎ぢꆀ侨䰘䎽❡⚌胜兞........................................41.1 ➢侨䰘」ꬠ溏ぢꆀ侨䰘䎽涸䗳搬䚍..............................41.2 䪮助怵鵳....................................................4痦✳畎 㛇劥䪮助⾲椚.................................................72.1 ぢꆀ邍爙....................................................72.2 邍爙䅺Ⰶ....................................................72.3ぢꆀ程䒸♸㶸⪰⠏⻊.........................................122.4湱⡃䏞雦皾♸넞佪ぢꆀ䵂程...................................162.5 欰䙖禹絡...................................................18痦♲畎 䎾欽㖞兞....................................................213.1 露⛐䵂程♸⥌䜂唬程.........................................223.2濼霋䎽匬䒊.................................................233.3唬程㟞䔂欰䧭...............................................243.4Ⱖ➭㖞兞...................................................26痦㔋畎 䎾欽㖞兞䪮助䮋䧶♸鍒Ɀ倰呩..................................274.1呍䗱䮋䧶嚋鶣...............................................274.2〳的㾝䚍䮋䧶♸程䒸⠏⻊.....................................274.3 湱⡃䏞雦皾♸䵂程...........................................284.4 넞絶侨䰘㢅椚涸䮋䧶.........................................30 4.5 侨䰘㸝Ⰼ♸ꥧ猙⥂䫡.........................................314.6䎂〵Ⱟ㺂涸䮋䧶.............................................31痦❀畎 㹊倶♸䒊雳..................................................345.1 䪮助〄㾝馋⸷...............................................345.2 佟瘻♸岁錞馋⸷.............................................35痦Ⱉ畎 ぢꆀ侨䰘䎽遤⚌䎾欽㹊騨......................................386.1 遤⚌䎾欽Ⰼ兞㕃靶...........................................386.2 呍䗱遤⚌帿䏞㹊騨...........................................40痦♫畎 絕雿♸㾝劅..................................................497.1劥涯淼⛼涸⚺銳ⰻ㺂♸錜挿...................................497.2ぢꆀ侨䰘䎽䪮助涸劢勻〄㾝♸䎾欽兞.........................507.3렽⸠⚌歲Ⱏず♸ぢꆀ侨䰘䎽䪮助涸倝♸䲀䎛.................51痦Ⱃ畎 ꣡䔶........................................................53 痦♧畎ぢꆀ侨䰘䎽❡⚌胜兞 1.1 ➢侨䰘」ꬠ溏ぢꆀ侨䰘䎽涸䗳搬䚍 㖈侨㶶⻊嵠惐䌏⽷Ⰼ椕涸➚㣔侨研䊺䧭⚹䲀⸓爢吐鵳姿涸呍䗱欰❡銳稇ꬋ絕匬⻊侨研涸旘〄䒭㟞僽끮⸓ぢꆀ侨研䎽〄㾝涸呏劥⾲㔔կ研IDC곫崵ⵌ2025䎃Ⰼ椕侨研䚪ꆀ㼜鴪175ZBⰦ务80%⟄副⚹ꬋ絕匬⻊侨研俒劥ծ㕃⫸ծ갉錠곸瘝[1]կ⠛絡Ⱒ禹㘗侨研䎽⣜饆곫㹁⛐邍絕匬ㄤ礵烁⼐ꂁ劼ⵖ㖈㢅椚姼碫侨研傞㶸㖈儑衼櫕곯 •露⛐椚鍒緃㣟➑腊鸑鵂叻皊鵂応偽岁䯲䯝撑练露⛐Ⱒ翫⢾㥵㕃⫸ⰻ㺂唬程⣜饆➃䊨叻岤 •넞絶唬程佪桧⡛ꥥ满絶䏞㟞⸈⠛絡程䒸㥵B+⽼⚰絶䏞拇ꦼ唬程䒁鵷ⶢ㟞 •㹊傞䚍割駈ꦼ⟄佅䭯嬗猲紩ㆇ䎾涸㖞兞㥵㹊傞䲀虛ծ㸝港䱽կ ぢꆀ侨研䎽鸑鵂㼜侨研僥㼘ⵌ鵶絯涸넞絶ぢꆀ瑟ꢂ㹊梡✫➢礵烁⼐ꂁⵌ露⛐湱⡃涸薴䒭鲮」կ㖈鵯猫倝涸侨研邍爙䕎䒭♴侨研⛓ꢂ涸湱⡃䚍割ⱄ➑➑⣜饆✵㶶痗⚮⼐ꂁ䧴侨⧩嫱鳅罜僽鸑鵂ぢꆀ㖈瑟ꢂ务涸騄猌勻䏞ꆀ➢罜腊㢿刿ⲥ烁㖑䯲䯝侨研⛓ꢂ涸露⛐Ⱒ翫կ⢾㥵㖈䲀虛禹絡务ぢꆀ侨研䎽〳⟄呏研欽䨪涸⾎〷遤⚹ㄤ⨊㥩欰䧭欽䨪ぢꆀㄤ㉁ㅷぢꆀ䎇鸑鵂雦皾ぢꆀ⛓ꢂ涸湱⡃䚍⚹欽䨪䲀虛刿湱Ⱒ涸㉁ㅷկ 1.2 䪮助怵鵳 ぢꆀ侨研䎽涸〄㾝♸AI䪮助湱鳇湱䧭㣐荝〳ⴔ⚹♲⚡媯 1.2.1䪮助蟟蔠劍20⚆紬ⴲ-2017䎃 ぢꆀ侨研䎽䪮助涸〄㾝〳⟄鷅彷ⵌ劥⚆紬ⴲ涸帿䏞㷖⛴嵠惐կꥥ满帿䏞㷖⛴䪮助㖈露갉霋ⵆծ㕃⫸㢅椚ծ荈搬露鎊㢅椚瘝곭㚖涸瑲灶䚍鵳㾝䅺Ⰶ邍爙Embedding䪮助鷶幯䧭⚹♧猫叻ⲥ⻊涸侨研邍爙倰岁կ䅺Ⰶ邍爙䪮助鸑鵂㼜侨研僥㼘ⵌ⡛絶玸㺙ぢꆀ瑟ꢂ腊㢿剣佪㖑䯲䯝侨研⛓ꢂ涸露⛐Ⱒ翫⚹ぢꆀ侨研䎽涸〄㾝㤭㹁✫㗏㹊涸㛇炄կ⢾㥵Word2Vec垷㘗〳⟄㼜霒露僥㼘ⵌぢꆀ瑟ꢂ⢪䖤露⛐湱⡃涸霒露㖈ぢꆀ瑟ꢂ务涸騄猌⛲鳅鵛BERT垷㘗ⴭ〳⟄欰䧭刿⚪㺢涸副♴俒湱Ⱒぢꆀ邍爙鵳♧姿䲿⼮露⛐椚鍒涸ⲥ烁䚍կ 1.2.2欰䙖籖虽劍2018䎃-2023䎃 2010䎃去务劍ꥥ满㣐侨研傞去涸ⵌ勻㥵⡦넞佪㖑㶸⪰ㄤ唬程嵳ꆀ涸넞絶ぢꆀ侨研䧭⚹㷖助歲ㄤ䊨⚌歲Ⱏず⚰涸䮋䧶կ㖈䪮助輑ざ㽻ぢꆀ侨研䎽䊺瑲灶⠛絡唬程呥卹䕎䧭♸帿䏞㷖⛴垷㘗帿䏞羒ざ涸倝䕎䙖կ⟄FAISS⚹去邍涸䏀㽻䎽鸑鵂꧋䧭HNSW㕃程䒸ծꆀ⻊⾓綫瘝皾岁㼜涰厅紩ぢꆀ唬程䒁鵷⾓綫荛嬗猲紩鵯猫䚍腊駟鴲湬䱹⪶欰✫㹊傞䲀虛禹絡ծ⸓䙖굥䱽瘝㖞兞涸㉁⚌⻊衅㖑կ刿⧩䖤Ⱒ岤涸僽ぢꆀ侨研䎽♸㣐露鎊垷㘗涸絕ざ姻㖈鸣Ⰼ倝⟟⧩Ꝇ勵ˋˋ鸑鵂唬程㟞䔂欰䧭RAG䪮助侨研䎽割➑⡲⚹濼霋㶸⪰⽀⯋刿怵」⚹垷㘗䲀椚鵂玐务涸⸓䙖雵䗴⡤կ鵯猫錭蒀鲮」銳宠ぢꆀ侨研䎽Ⱘ㢊㹊傞刿倝腊⸂⟄⼐ꂁ欰䧭䒭AI㼆傞䎸侨研涸併䠭䚍⢾㥵㖈兰腊㹐剪㖞兞务禹絡⽰傞㼜剒倝㼆霢ぢꆀ岤Ⰶ侨研䎽烁⥂㔐撑涸傞佪䚍♸湱Ⱒ䚍կ ❡⚌䎾欽㖞兞涸䬪㾝ッ梡ⴀ僈儑涸垷䙖輑ざ暵䖄կ⠛絡䲀虛禹絡⚺銳㢅椚欽䨪-㉁ㅷ✳鿈㕃侨研罜梡去ぢꆀ侨研䎽ず傞!椚俒劥ծ㕃⫸ծ遤⚹䎸瘝㢴垷䙖ぢꆀկ㖈兰腊ⵖ鸣곭㚖霃㢊⠛䠭㐼侨研♸絶⥝䩛ⱃ俒劥ぢꆀ鄄絡♧僥㼘荛ず♧露⛐瑟ꢂ㹊梡㛇✵佦ꥻ䲽鶣涸兰腊霏倗㖈欰暟⼕蚋遤⚌跗涯餘絕匬ぢꆀ♸⚰䎯霚낉俒劥ぢꆀ匬䒊饰騗垷䙖濼霋㕃靶⸈鸟倝蚋灇〄鵳玐կ鵯猫垷䙖瑬鷳腊⸂⣜饆✵ぢꆀ侨研䎽㼆幊ざ程䒸絕匬涸佅䭯㥵㼜㕃⫸涸⡮䓛湱⡃䏞♸俒劥涸BM25勉ꅾ鵳遤翫ざ雦皾瑲灶⽀♧垷䙖涸唬程鴝歲կ ꥥたMilvusծVearch瘝䒓彂ぢꆀ侨研䎽呥卹湱絩⚆㸐⼜㛇✵FAISS瘝䏀㽻䎽䲿⣘✫刿⸈模㊤涸侨研䎽⸆腊㥵侨研䭯⛉⻊ծⴔ䋒䒭鿈縭ծ宐䎂的㾝瘝鵳♧姿꣭⡛✫ぢꆀ侨研䎽涸⢪欽꡶地կ䒓彂欰䙖涸籖虽姻㖈ꅾ㝕ぢꆀ侨研䎽涸䪮助怵鵳騟䖈կMilvusծVearch瘝고湡涸䧭⸆䳶爙✫䒓彂垷䒭㖈AI㛇炄霃倶곭㚖涸杝暵⠏⸷Ⰼ椕䒓〄罏鸑鵂GitHub⼸ず⠏⻊程䒸皾岁⢪䖤HNSW程䒸涸〮㔐桧㖈⚙䎃ⰻ䲿⼮40%爢⼓餑柄涸GPU⸈鸟䳃⟝⢪ぢꆀ唬程ふ属ꆀ瑲灶涰♰QPS鵯猫鶛去鸟䏞鵴馄⠛絡㉁⚌侨研䎽կ刿帿鵴涸䕧ㆇ㖈✵䒓彂欰䙖⪶欰✫㘌湬곭㚖涸㹁ⵖ⻊ⴔ佅㥵ꛏ㼆㛇㔔崵䎸⠏⻊涸ぢꆀ㶸⪰呔䒭ծ鷓ꂁ鴝続雦皾涸鲽ꆀ紩鿈縭倰呩鵯❈倝鸑鵂爢⼓导껩䘯鸟輑Ⰶ⚺䎁晜劥䕎䧭䪮助的侔涸굷鲰佪䎾կ 1.2.3㕂❡⻊⸈鸟劍2024䎃-荛➚ 鵛䎃勻ꥥ满㕂ⰻ➃䊨兰腊❡⚌涸觌⸽〄㾝㕂❡㉁⚌ぢꆀ侨研䎽䒸空⛲》䖤✫駈涸鵳姿կZillizծ艗雴ծ嵳ꆀ侨研ծ僤梠猰䪮瘝⟱⚌紹紹䲀ⴀ✫荈⚺灇〄涸ぢꆀ侨研䎽❡ㅷ鵯❈❡ㅷ㖈⸆腊ծ䚍腊ծ僒欽䚍瘝倰㖲割鷦蒀䎇㖈ꆄ輑ծ歏㉁ծ㸝瘝㢴⚡곭㚖䖤ⵌ✫䎛岌䎾欽կ⢾㥵Zilliz䲀ⴀ涸✻⾲欰ぢꆀ侨研䎽ⴭ䲿⣘✫䔀䚍的㾝ծ䭽➰餩瘝✻剪⸉暵䚍꣭⡛✫⟱⚌涸IT䧭劥⻌❩嵳ꆀ侨研䪮助肅⼝剣ꣳⰖ尽䲀ⴀ涸Vastbase V100ぢꆀ侨研䎽佅䭯叻ꆀ/ぢꆀ翫ざ唬程腊㢿忘駈筛⸉ծ⼕毫ծ侅肫瘝遤⚌㼆넞䚍腊ծ넞礵䏞涸宠鵳♧姿꣭⡛⟱⚌衅㖑AI䎾欽涸꡶地䒓彂爢⼓倰openGauss⛲䲀ⴀ✫ぢꆀ䒸空openGauss DataVec⣘ぐ㉁⚌侨研䎽⾊㉁蜦》䎇㛇✵姼䒸空䒓〄կ ꥥ满ぢꆀ侨研䎽䪮助涸兜⿺Ⱖ䧶殜⟟⧩⛲傈渤⳼儑կ♧倰ぢꆀ侨研䎽姻㖈䧭⚹遤⚌叻ⲥⵖ㹁涸Ⱒꝶ♸罏կ⢾㥵鸑鵂㹁⛐QPSծRecall@K瘝呍䗱䭷叻ぢꆀ侨研䎽姻㖈䲀⸓遤⚌䕎䧭絡♧涸䪮助霉⠮⡤禹կ鵯猫叻ⲥ⻊割➑剣⸔✵欽䨪鷥䭊ざ鷓涸❡ㅷ⛲⤛鵳✫䪮助劥魧涸⨴䐀〄㾝կ〥♧倰ぢꆀ侨研䎽姻㖈䧭⚹❡㷖灇欽⼸ず倝涸咕咿կ㷖助歲涸屠灇瑕㥵ぢꆀ〳鍒ꅺ䚍ծ腊署⠏⻊瘝姻㖈鸑鵂ぢꆀ侨研䎽䎂〵䘯鸟衅㖑ⵌ❡⚌歲կず傞❡⚌歲涸宠导껩⛲㖈割倗䒸㼋㷖助灇瑕涸倰ぢ䕎䧭✫葻䚍匹⸓կ姼㢪ぢꆀ侨研䎽鵮㖈䒸㼋餴劥♸筛瘻餴彂涸ꂁ縨կꥥ满ぢꆀ侨研䎽鄄紵Ⰶ㕂㹻倝㛇䒊䧶殜Ⱖ㖈兰䢴㙹䋑ծ侨㶶㷜欰瘝㖞兞涸䎾欽姻㖈蜦䖤刿㢴涸筛瘻佅䭯ㄤ餴劥䫏Ⰶկ鵯猫餴彂⧫俷㼜鵳♧姿⸈鸟ぢꆀ侨研䎽䪮助涸䧭擿ㄤ❡⚌涸㡫㣐կ 痦✳畎㛇劥䪮助⾲椚 2.1 ぢꆀ邍爙 向量是同时具有方向和大小的量,其在数学上表示为多为空间中的坐标,比如N维空间中的向量就是一个具有N个维度的坐标,(a!,a",a#,...)。向量的大小(也称为长度或者膜)通过公式&𝑎!"+𝑎""+𝑎#"+...计算获得。向量的方向通过从原点到坐标点连线的夹角表示