
A Survey of Knowledge Graphs

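CCF ADL No. 65, "Frontiers of Knowledge Graphs", December 26, 2015
Li Juanzi (lijuanzi@mail.tsinghua.edu.cn), Department of Computer Science and Technology, Tsinghua University
Zhao Jun (jzhao@nlpr.ia.ac.cn), Institute of Automation, Chinese Academy of Sciences

Outline
- Overview of knowledge graphs
- Knowledge representation in artificial intelligence
- The view of semantics in the Semantic Web
- Knowledge graphs and their applications
- Types of knowledge graphs
- Knowledge graph construction techniques
- Summary

Intelligence, artificial intelligence, and intelligent systems
- Intelligence: the human ability to apply knowledge to solve problems, covering both deductive and inductive ability.
- Artificial intelligence: the study of how to make computers imitate the reasoning, learning, thinking, planning, and other mental activities of the human brain, so that they can solve complex problems that otherwise require human experts. It mechanizes part of human thinking and intelligent activity.
- Intelligent systems: systems that use AI techniques to solve practical problems.

Three research approaches in artificial intelligence

Symbolism
- Hypothesis: intelligent activity is grounded in physical symbol systems, and thinking is the processing of symbol patterns (Newell and Simon, 1976).
- The core is knowledge representation.
- Assumption: humans and computers are both physical symbol systems ("Symbols and search").

Knowledge representation in symbolic systems
- By the nature of the knowledge described: declarative knowledge and procedural knowledge.
- By the scope the knowledge covers: commonsense (general) knowledge, e.g. Cyc and HowNet, and domain knowledge.

Knowledge representation in symbolic systems
- Logic: propositional logic, predicate logic, description logic.
- Production rules: state space, rules.
- Semantic networks: concepts and their relations.
- Frames (Marvin Minsky, 1974): classes, subclasses, inheritance, slots; the KL-ONE language.

Knowledge representation in symbolic systems (cont.)
- Scripts (Roger Schank, Robert P. Abelson): a structured representation of a specific sequence of events. "A script is a structured representation describing a stereotyped sequence of events in a particular context."
- Ontologies (Gruber, 1993): concepts and their relations. "An ontology is a formal explicit specification of a shared conceptualization of a domain of interest."

Representation learning in connectionism
- Connectionism: intelligence arises from the overall activity of the brain or nervous system; a data-driven approach.
- Timeline: perceptron (1950s), neurons (1950s-1960s), neural networks (1980s onward), deep learning (2000s onward).
- The result of representation learning: the weights of the parameters.

Behaviorist approaches to artificial intelligence
- Behaviorism: perception-action.
- Simulates the intelligent behavior and role of humans in control processes: self-organization, self-adaptation, and so on.
- Intelligent control, intelligent robots.

An engineer changed the world (Web Science @ Tsinghua, Nov 2010)
- http://info.cern.ch: the world's first-ever web server, 1990.
- Tim Berners-Lee.
- Infrastructure: Internet, hypertext (HTML), web browser, HTTP.

Tim Berners-Lee's proposal, 1989
- Linked Data was there.

The Semantic Web proposal
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation." (Tim Berners-Lee, James Hendler, Ora Lassila, "The Semantic Web", Scientific American, May 2001)
[Figure: annotated web pages, an ontology, and agents.]

The concept of ontology in philosophy
[Figure: the semantic triangle (Ogden and Richards, 1923) connecting a form (the word "Tank"), a concept, and a referent; edge labels: activates, relates to, stands for.]
"Ontology is the philosophical study of the nature of being, becoming, existence, or reality, as well as the basic categories of being and their relations."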
---Wikipedia 本体An ontologyis a formal, explicit specification of a shared conceptualization –Gruber 1993Conceptualization: an abstract model of phenomena in the world by having identified the relevant concepts of those phenomena.Explicit: the type of concepts used, and the constraints on their use are explicitly defined. Formal: the fact that the ontology should be machine readable. Shared: ontology should capture consensual knowledge accepted by the communities 万维网信息描述语言塔 RDF stands for作用知识共享知识重用Web语义互操作含义Resource(资源): pages, dogs, ideas... everything that can have a URI Description(描述): attributes, features, and relations of the resourcesFramework(框架): model, languages and syntaxes for these descriptions 17特征Web拥有唯一的URI事物事物之间由链接关联(如人物、地点、事件、建筑物)事物之间链接显式存在并拥有类型Web上数据的结构显式存在数据万维网yingfoaf:Personrdf:typeYing Dingfoaf:nameStefanfoaf:knowsdb:Galway72Kdp:populationdp:Cities_in_Irelandskos:subjectdp:Dublinfoaf:based_nearskos:subjectdblp:publicationsfoaf:publication 数据万维网 FreebaseInvited talk at ISWC2008Freebase: a open, writable database of world’s informationmetaWeb funded in 20052010年被google收购 What’s in freebase? 
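- Light type system
  Domain: a collection of types that share a namespace.
  Schema: each type has a collection of zero or more properties, known as the schema of that type.
  Topic: one concept or one entity with a globally unique ID.
  Type: properties are grouped into types; a type is an object used to semantically group topics.
  Property: an attribute of a topic.
  Literal: a string, numeric value, Boolean, or timestamp.
- Now: Wikidata (https://www.wikidata.org)

Industry efforts to make semantic content
- Yahoo!: SearchMonkey: February 2008
- Bing acquires Powerset: July 2008
- Google rich snippets for reviews: May 2009
- Google Images licenses using RDFa: August 2009
- Google RDFa support for videos: September 2009
- Google encourages using rich snippets: October 2009
- Google: structured data of an organization: March 2010
- Facebook: Open Graph protocol based on RDFa: April 2010
- Google acquires Metaweb: July 2010
- Google Refine: November 2010
- Google rich snippets for shopping sites: November 2010
- Google, Yahoo, and Bing: Schema.org: June 2011
- Google: Knowledge Graph, embedded in the search engine: May 2012
Schema.org: Google, Yahoo, Microsoft.

What is a knowledge graph?
"The Knowledge Graph is a system that understands facts about people, places and things and how these entities are all connected."
A knowledge graph is essentially a semantic network. Its nodes represent entities or concepts, and its edges represent the various semantic relations between those entities and concepts.

Knowledge graphs revisited
- A knowledge graph is not a new knowledge representation method.
- A knowledge graph is a knowledge base that represents real-world entities and the relations between them.
- The nodes of a knowledge graph correspond to real-world objects that can be identified on the Internet.
- Knowledge graphs are the large-scale industrial application of knowledge representation.
- The data model of a knowledge graph is a graph model.

Semantic data integration
Integrate the knowledge graph with data sources outside the graph on the basis of semantics. Example: Sogou Search.

Semantic search on the Internet
The semantic links in a knowledge graph let a search engine replace string-based search with entity-based search, which resolves ambiguity at query time.
- Entity and relation summarization
- Entity search and ranking

Question answering over knowledge bases

Knowledge-based big data analytics for industry
Film and television big data analysis:
- Core audience with the most influence and market value: middle-aged male professionals
- Popular TV series genre: political thriller
- Popular director: David Fincher
- Popular actor: Kevin Spacey
- Viewing preference: watching several episodes in one sitting
Knowledge-graph-based mining of the relations among film and television elements predicted a TV series that combines three elements: Kevin Spacey, David Fincher, and "produced by the BBC". Compared with traditional text-based approaches, this greatly improves the precision and feasibility of film and television data analysis.

The Web 1.0
Connects information
Web of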