Zhou Min
Principal Engineer, Huawei Cloud Algorithm Innovation Lab

About the Speaker
◆ B.S. from the University of Science and Technology of China (USTC); Ph.D. from the National University of Singapore (NUS)
◆ Research interests: pattern mining and learning on graph data and sequence data

CONTENT
01 Background
02 Semantic-aware Active Learning on Graphs
03 Unlabeled Nodes Labeling for Imbalanced Graphs
04 Conclusion

01 Background

Graphs
• Data available in the form of graphs is ubiquitous.
• Example domain: financial networks
• Tasks: fraud detection, link prediction, community detection, node classification, etc.

Challenges in Fraud Detection
• Graph Neural Networks are promising tools for fraud detection
– Label scarcity
– Class imbalance

02 Active Learning on Graphs

Label Scarcity
• Labels are hard and expensive to collect.

Active Learning in Machine Learning
• Prioritize the data to be labelled so that labelling has the highest impact on training a model.
• Valuable samples: the most informative examples are the ones the classifier is least certain about.

Active Learning in Graph Machine Learning
• Select the most informative nodes as the labelled training nodes, based on graph information.
• Design different graph-based criteria for node selection on graphs
– AGE: uncertainty (entropy) & representativeness (density & centrality)
– GRAIN: influence maximization & diversity
https://arxiv.org/pdf/1705.05085.pdf
https://arxiv.org/pdf/2108.00219.pdf

Semantic-aware Graph Active Learning
• Mitigating semantic confusion from hostile neighborhoods
• Semantic-aware influence correction
– Node influence
– Semantic-aware influence
• Prototype-based diversity
• Score unifying
• Experiments

03 Class Imbalance on Graphs

Imbalance Problem in Machine Learning
• Data imbalance leads to decision boundary shift.
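The decision boundary shift can be illustrated with a toy calculation. This is a minimal sketch, not taken from the talk: it assumes two one-dimensional Gaussian classes with equal variance, for which the Bayes-optimal threshold is x* = (mu0 + mu1)/2 + sigma^2 * ln(p0/p1) / (mu1 - mu0). The function name `bayes_boundary` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def bayes_boundary(mu_major, mu_minor, sigma, prior_major):
    """Bayes-optimal threshold between two 1-D Gaussian classes.

    Assumes both classes share the same standard deviation `sigma`
    (an illustrative assumption, not something stated in the slides).
    """
    prior_minor = 1.0 - prior_major
    midpoint = 0.5 * (mu_major + mu_minor)
    # The log prior ratio pushes the threshold toward the minority class mean.
    shift = sigma ** 2 * math.log(prior_major / prior_minor) / (mu_minor - mu_major)
    return midpoint + shift


# Majority class centred at 0, minority class centred at 2 (hypothetical values).
balanced = bayes_boundary(0.0, 2.0, 1.0, prior_major=0.5)
imbalanced = bayes_boundary(0.0, 2.0, 1.0, prior_major=0.95)

print(f"balanced threshold:   {balanced:.3f}")
print(f"imbalanced threshold: {imbalanced:.3f}")
```

With equal priors the threshold sits at the midpoint between the two class means. At a 95/5 split it moves past the minority mean, so in this toy setting more than half of the minority samples would fall on the majority side.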
Solutions for Learning from Imbalanced Data
• Re-sampling / re-weighting / cost-sensitive learning / hybrid methods

Solutions for Learning from Imbalanced Graph Data
– Feature extractor
– Synthetic node generation
– Edge generator
– GNN classifier

Solutions for Learning from Imbalanced Graph Data
• ReNode: Topology-Imbalance Learning for Semi-Supervised Node Classification
– Re-weights the samples according to their distance to the classification boundary
• GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification
– Synthesizes the whole ego network for the minor class
• TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification
– Modifies the loss based on statistics of the true label distributions of target nodes and classes

Unlabeled Nodes Retrieval and Labeling
• Oversampling without synthesizing virtual nodes.
• Take advantage of unlabeled information on graphs.
• Traditional self-training (ST) runs into the problem of pseudo-label misjudgement augmentation in imbalanced learning.

Unlabeled Nodes Retrieval and Labeling
• Dual pseudo-tag alignment mechanism for node filtering
• Node reordering
– Geometric ranking
– Confidence ranking
• Geometrically certain node selection
– Select the most certain nodes

Unlabeled Nodes Retrieval and Labeling
• Experiments

Acknowledgement
Huawei Cloud Algorithm Innovation Lab
Covering all domains, focused on the top challenges in cloud intelligence

Q&A
The lab currently has openings for interns, postdocs, and researchers in several areas. For details, please see the department homepage.
➢ Cloud intelligent operations / cloud hardware reliability management
zhoumin27@huawei.com

Thank you very much for watching