AI helps databases go "faster and farther"

AI use will become increasingly widespread
$2.9 trillion in business value created by AI; 6.2 billion hours of human work spent on AI (Gartner)
83% of CEOs believe AI is a strategic priority (MIT Sloan Management Review)

The AI dilemma
1. Features and models are hard to manage.
AI? DB?
Simplified workflows, less code, and lower development and operations costs.

Why we chose to extend DataOps to ModelOps
Extending DataOps to ModelOps keeps data fresh and preserves its usability and availability. It avoids a separate data management system for models, data latency, and complex hard-coded data pipelines, and it makes online AI decision-making easier.

DataOps + ModelOps: core capabilities
Data management + feature management + model management
SQL + SQL for MLOps

    CREATE MODEL airlines_gbm_copy1 WITH (
        model_class='lightgbm',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        model_parameter=(boosting_type='gbdt', n_estimators=100, max_depth=8, num_leaves=256)
    ) AS (SELECT * FROM airlines_train)

    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm_copy1,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID',

PolarDB for AI

Model upload:
    UPLOAD MODEL model_name WITH (model_location = '', req_location = '')

Model creation:
    CREATE MODEL airlines_gbm WITH (
        model_class='lightgbm',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        model_parameter=(boosting_type='gbdt', n_estimators=100, max_depth=8, num_leaves=256)
    ) AS (SELECT * FROM airlines_train)

Model deployment:
    DEPLOY MODEL model_name

Model evaluation:
    SELECT Delay FROM EVALUATE (MODEL airlines_gbm, SELECT * FROM airlines_test) WITH (
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        metrics='acc'
    );

UDF creation:
    DEPLOY MODEL my_lr_model WITH (mode = 'in_db');
    CREATE FUNCTION my_lr_model RETURNS REAL SONAME "#ailib#_my_lr_model.so";

Model inference (offline):
    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm_copy1,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID',
        mode='async'
    ) INTO lightgbm_v2_predict82201;

Model inference (online):
    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm_copy1,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID'
    );

Model description:
    DESCRIBE MODEL model_name

Model deletion:
    DROP MODEL model_name

Feature creation:
    CREATE FEATURE feature_name WITH (feature_class = '', parameters=())
    AS (SELECT select_expr [, select_expr] ... FROM table_reference)

Feature update:
    UPDATE FEATURE feature_name WITH (feature_class = '', parameters=())
    AS (SELECT select_expr [, select_expr] ... FROM table_reference)

Feature deletion:
    DROP FEATURE feature_name

and other AI SQL statements.

PolarDB for AI: DB for AI in PolarDB MySQL
SQL: Feature Creation, Model Creation, Model Evaluation, Model Inference, etc.
One system: PolarDB

    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm_copy1,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID',
        mode='async'
    ) INTO lightgbm_v2_predict82201;

PolarDB for AI: scenarios

Scenario 1: from data to model to application

Model development

Model creation:
    CREATE MODEL airlines_gbm WITH (
        model_class='lightgbm',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        model_parameter=(boosting_type='gbdt', n_estimators=100, max_depth=8, num_leaves=256)
    ) AS (SELECT * FROM db4ai.airlines_train)

Model evaluation:
    SELECT Delay FROM EVALUATE (MODEL airlines_gbm, SELECT * FROM airlines_test) WITH (
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        metrics='acc'
    );

Model application

Model description:
    DESCRIBE MODEL airlines_gbm;

Model list:
    SHOW MODELS

Online model inference:
    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID'
    )

Offline model inference:
    SELECT TripID, Delay FROM PREDICT (
        MODEL airlines_gbm_copy1,
        SELECT * FROM airlines_train_1000_copy1
    ) WITH (
        s_cols='TripID,Delay',
        x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
        y_cols='Delay',
        primary_key='TripID'
    ) INTO lightgbm_v2_predict1030;

Viewing results:
    SHOW TASK `df05244e-21f7-11ed-be66-xxxxxxxxxxxx`;

https://help.aliyun.com/zh/polardb/polardb-for-mysql/user-guide/polardb-for-ai

Scenario 2: pretrained models

Model upload:
    UPLOAD MODEL my_model WITH (
        model_location='https://xxxx/model.pkl?Expires=xxxx&OSSAccessKeyId=xxxx&Signature=xxxx',
        req_location='https://xxxx/requirements.txt?Expires=xxxx&OSSAccessKeyId=xxxx&Signature=xxxx'
    )

Model deployment:
    DEPLOY MODEL my_model;

Online model inference:
    SELECT Y FROM PREDICT (MODEL my_model, SELECT * FROM db4ai.regression_test LIMIT 10) WITH (
        x_cols='x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28',
        y_cols=''
    );

or

UDF creation:
    DEPLOY MODEL model_name WITH (mode = 'in_db');
    CREATE FUNCTION function_name RETURNS return_value SONAME "soname";

UDF usage:
    SELECT function_name("content");

Scenario 3: out-of-the-box solutions

Sentiment analysis: reviews
    SELECT * FROM PREDICT (MODEL _polar4ai_tongyi_sa, SELECT product_review FROM reviews WHERE id=1) WITH ();

Summarization: result
    SELECT * FROM PREDICT (MODEL _polar4ai_tongyi_sum