知识图谱多步推理

如果无法正常显示,请先停止浏览器的去广告插件。
分享至:
1. Stanford 知识图谱多步推理 任泓宇 斯坦福大学四年级博士研究生
2. 自我介绍 科研兴趣:图机器学习,逻辑推断, 常识推理,表征学习 任泓宇 斯坦福大学计算机系四年级博士生 导师:Jure Leskovec教授 Hongyu Ren, Stanford University 2
3. Knowledge Graphs § Knowledge Graphs are heterogenous graphs § Multiple types of entities and relations exist § Facts are represented as triples ℎ, 𝑟, 𝑡 § (‘Mona Lisa’, ‘created_by’, ‘Leonardo da Vinci’) § (‘Paris’, ‘is_a’, ‘City’) § … § Examples: Hongyu Ren, Stanford University 3
4. Traditional Tasks Knowledge Graph Completion Predict the missing head or tail for a given triple (ℎ, 𝑟, 𝑡) Hongyu Ren, Stanford Jure Leskovec, Stanford University University 4
5. Logical Reasoning Goal: Complex logical reasoning on KG § For example: § Where did Canadian Turing Award winners graduate? § Who are current presidents of European countries which never held a (soccer) world cup? § Predict drugs that might target proteins that are associated with SARS-CoV2? Hongyu Ren, Stanford University 5
6. Examples: Applications in biomedicine: Bio KG Hongyu Ren, Stanford Jure Leskovec, Stanford University University 6
7. Examples: Applications in recommender systems: Query formula Query plan (a DAG) Example subgraphs that satisfy the query Hongyu Ren, Stanford University 7
8. Traditional Approach § Translate the question into a structured query § Execute the query on the knowledge graph § Match grounded entities § “Traverse” the knowledge graph along the relations Hongyu Ren, Stanford University 8
9. Our Task: Complex Link Prediction Given arbitrary logical query, directly predict the answer entities: “Where did all Canadian citizens with Turing Award graduate?” Logical query: 𝑞 = 𝑉 ? . ∃ 𝑉 ∶ 𝑊𝑖𝑛(𝑇𝑢𝑟𝑖𝑛𝑔𝐴𝑤𝑎𝑟𝑑, 𝑉) ∧ 𝐶𝑖𝑡𝑖𝑧𝑒𝑛 𝐶𝑎𝑛𝑎𝑑𝑎, 𝑉 ∧ 𝐺𝑟𝑎𝑑𝑢𝑎𝑡𝑒 𝑉, 𝑉 ? Pearl Win Query plan Turing Award Win Hinton Turing Award NYU Graduate Bengio Intersection Edinburgh Cambridge Canada Citizen Graduate Intersection Bieber Canada Citizen Jure Leskovec, Stanford University University Hongyu Ren, Stanford Trudeau McGill Answer entities 9
10. Challenges KGs are incomplete and noisy Traversing them requires time exponential in the query depth Completing the KG creates additional links, which further slows down query answering § How to capture uncertainty § How to impute missing relations § How to efficiently answer queries on large graphs Hongyu Ren, Stanford University 10
11. Our General Idea Logical query Reason in the embedding space Knowledge graph Embedding space § Map queries into embedding space. § Learn to reason in that space [Embedding Logical Queries on Knowledge Graphs. Hamilton, et al., NeurIPS 2018] Hongyu Ren, Stanford Jure Leskovec, Stanford University University 11
12. Our Framework Goal: Answer logical queries E.g.: “Predict drugs C likely target proteins X associated with diseases d 1 and d 2 ” Idea: Logical operators become spatial operators Hongyu Ren, Stanford University 12
13. Reasoning over KG § Query2box: box embeddings for EPFO queries (with ∃, ∧, ∨) Hongyu Ren, Stanford University 13
14. Our Idea: Query2Box Idea: § 1) Embed entities/relations of the KG § 2) For every logical operator learn a spatial operator So that: § 1) Take an arbitrary logical query. Decompose it into a set of logical operators (∃,∧,∨) § 2) Apply a sequence of spatial operators to embed the query § 3) Answers to the query are entities close to the embedding of the query Hongyu Ren, Stanford University 14
15. Our Idea: Query2Box Idea: § 1) Embed nodes of the graph § 2) For every logical operator learn a spatial operator Key insight: So that: Represent query as a box. § 1) Take an arbitrary logical query. Decompose Operations (intersection etc.) it into a set of logical operators (∃,∧,∨) are well defined over boxes. § 2) Apply a sequence of spatial operators to embed the query § 3) Answers to the query are entities close to the embedding of the query Hongyu Ren, Stanford University 15
16. Embedding Queries Query2Box embedding: Embed queries with hyper-rectangles (boxes): 𝐪 = (𝐶𝑒𝑛 𝑞 , 𝑂𝑓𝑓(𝑞)). Cambridge 𝐶𝑒𝑛(𝑞) Edinburgh McGill Stanford Embedding Space [Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures. Vilnis, et al., ACL 2018] Hongyu Ren, Stanford University 16
17. Embedding Queries Query Plan Turing Award Canada Projection Intersection Projection Projection Intersection § Geometric Projection Operator § Geometric Intersection Operator Hongyu Ren, Stanford University 17
18. Projection Operator Geometric Projection Operator 𝓟 § 𝒫 : Box × Relation → Box 𝑞′ 𝑟 𝑞 Hongyu Ren, Stanford University 18
19. Projection Operator: Example Pearl Win Turing Award Hinton Win Pearl Turing Award Bengio Hinton Bengio Bengio Hinton Edinburgh McGill Graduate Edinburgh Cambridge McGill Cambridge Graduate Bengio Hongyu Ren, Stanford University Hinton 19
20. Intersection Operator Geometric Intersection Operator 𝓘 § ℐ : Box × ⋯× Box → Box § The new center is a weighted average § The new offset shrinks Hongyu Ren, Stanford University 20
21. Intersection Operator: Example Pearl Win Hinton Turing Award Turing Award Win Pearl Bengio Bengio Bieber Canada Citizen Canadian Trudeau Citizen Hinton Bieber Trudeau Hongyu Ren, Stanford University 21
22. Example “Where did Canadian citizens with Turing Award graduate?” Hongyu Ren, Stanford University 22
23. Example “Where did Canadian citizens with Turing Award graduate?” Query Plan Embedding Process Turing Award McGill Pearl Bengio Canada Cambridge Edinburgh Hinton Bieber Trudeau Hongyu Ren, Stanford University 23
24. Example “Where did Canadian citizens with Turing Award graduate?” Query Plan Turing Award Canada Embedding Process Turing Award Projection Win McGill Win Pearl Projection Citizen Bengio Canada Each point corresponds to a set of entities Hongyu Ren, Stanford University Citizen Cambridge Edinburgh Hinton Bieber Trudeau 24
25. Example “Where did Canadian citizens with Turing Award graduate?” Query Plan Turing Award Canada Embedding Process Turing Award Projection Win Projection Citizen Intersection McGill Win Pearl Bengio Intersection Canada Each point corresponds to a set of entities Hongyu Ren, Stanford University Citizen Cambridge Edinburgh Hinton Bieber Trudeau 25
26. Example “Where did Canadian citizens with Turing Award graduate?” Query Plan Turing Award Canada Embedding Process Turing Award Projection Win Intersection Projection Citizen Projection Intersection Graduate McGill Win Pearl Bengio Canada Each point corresponds to a set of entities Hongyu Ren, Stanford University Citizen Cambridge Edinburgh Graduate Hinton Bieber Trudeau 26
27. How to Handle Disjunction So far we can handle Conjunctive queries Can we learn a geometric disjunction operator? § Theorem (paraphrased): For a KG with 𝑀 nodes, we need embedding dimension of 𝑀 to handle disjunction Hongyu Ren, Stanford University 27
28. Disjunctive Normal Form § Any query with AND and OR can be transformed into equivalent Disjunctive Normal Form (disjunction of conjunctive queries). § 𝑎∨𝑏 ∧𝑐 = 𝑎∧𝑐 ∨ 𝑏∧𝑐 Converted Query Plan Original Query Plan 𝑣 ! 𝑣 ! Projection 𝑣 " Projection Projection Projection Union Projection Union 𝑣 # Projection Intersection 𝑣 # 𝑣 " Projection Projection Projection Intersection Union Intersection Intersection Union Intersection Hongyu Ren, Stanford University 𝑣 # Projection Intersection 28
29. Disjunctions: Solution Given an arbitrary AND-OR query § 1) Transform it into an DNF § 2) Answer each conjunctive query § 3) Overall answer is the union of conjunctive query answers Hongyu Ren, Stanford University 29
30. Benefits of Query2Box Scalability and efficiency: § Any query can be reduced to a couple of matrix operations and a single k-nearest neighbor search Generality: § We can answer any query (even those we have never seen before) Robustness to noise: § Graph can contain missing and noisy relationships Hongyu Ren, Stanford University 30
31. Self-Supervised Training Training examples: A batch of queries on the graph For each query, § One Positive: Path with a known answer § 𝒌 (~200) Negatives: Non-answer nodes of the correct answer type § Contrastive Objective: Minimize the distance between embedding of query and positive answer and maximize for negatives. Hongyu Ren, Stanford University 31
32. Limitations § Query2box (Q2B) can only handle a subset of first-order logic queries with only ∃, ∨, ∧. § How to handle negation ¬? What are drugs that do not cause Short of Breath and treat diseases associated with protein ESR2? 𝑞 = 𝑉 ? . ∃ 𝑉 ∶ 𝐴𝑠𝑠𝑜𝑐(𝐸𝑆𝑅2, 𝑉) ∧ 𝑇𝑟𝑒𝑎𝑡𝑒𝑑𝐵𝑦 𝑉, 𝑉 ? ∧ ¬ 𝐶𝑎𝑢𝑠𝑒𝑑𝐵𝑦 𝑆ℎ𝑜𝑟𝑡𝑂𝑓𝐵𝑟𝑒𝑎𝑡ℎ, 𝑉 ? ESR2 Intersection Assoc TreatedBy Short of Breath CausedBy Negation Hongyu Ren, Stanford University Intersection 32
33. Embedding Queries with Negation § Why modeling negation is interesting? § Reasoning in real-world requires negation § Can handle full set of FOL queries § Provides an alternative way to model disjunction De Morgan’s laws: 𝑎 ∨ 𝑏 = ¬ ¬𝑎 ∧ ¬𝑏 § Can Q2B handle negation? No, the complement of a box is not a box 𝑞 Embedding Space Hongyu Ren, Stanford University 33
34. Our Idea: BetaE Idea: § Embed entities and queries as Beta distributions § Design probabilistic logical operators for ∃, ∧, ¬. So that: § Logical operators are closed (especially for ¬) § Can naturally represent ∨ with ∧ and ¬: § P ∨ Q = ¬ ((¬P) ∧ (¬Q)) § Handle the uncertainty of queries: Entropy of Beta embedding ↔ number of answers § Embed any FOL query and find answers by measuring the “distance” between query embedding and an entity in the embedding space Hongyu Ren, Stanford University 34
35. BetaE Example Hongyu Ren, Stanford University 35
36. Experimental Setup Questions: § Does our method generalize to new unseen queries? § Does our method generalize to new query structures? § Can method model the uncertainty? Hongyu Ren, Stanford University 36
37. Experimental Setup Training on an incomplete graph: Test queries are not answerable in the training graph § Every test query has at least an edge missing § Method has to (implicitly) impute missing edges § Note: Query template matching / Traversing would have accuracy of random guessing Jure Leskovec (@jure), Stanford University 37
38. KG and Query Statistics § Datasets FB15K, FB15K-237, NELL995 NELL: Never-Ending Language Learning § Queries: Training + Evaluation Queries Further Evaluation Queries u 1p n 2in 2p n 3in 3p n inp 2i n pni 3i n pin Hongyu Ren, Stanford University ip pi u 2u u u up 38
39. AND-OR Queries: Results Training Queries Unseen Queries u u u u H@1 results of BetaE, Q2B, GQE on FB15k, FB15k-237, NELL995 Observations: § GQE maps query into point vectors instead of box/beta § On “training” queries: +18.4% H@1 § On new conjunctive query structures: +13.5% H@1 Hongyu Ren, Stanford University 39
40. Negation Results Queries with negation n n n n n § BetaE is the first method to effectively handle queries with negation! Hongyu Ren, Stanford University 40
41. Embedding for KG Reasoning § Box/Beta embeddings for answering logical queries on knowledge graphs § Handle any FOL query with ∃, ∨, ∧, ¬ § Generalize well to unseen, extrapolated queries Hongyu Ren, Stanford Jure Leskovec, Stanford University University 41
42. Embedding for KG Reasoning § Box/Beta embeddings for answering logical queries on knowledge graphs § Handle any FOL query with ∃, ∨, ∧, ¬ § Generalize well to unseen, extrapolated queries § How to scale up these algorithms on extremely large graphs? Hongyu Ren, Stanford Jure Leskovec, Stanford University University 42
43. Reasoning over KG § Query2box: box embeddings for EPFO queries § BetaE: beta embeddings for FOL queries § SMORE: scale Q2B/BetaE to extremely large graphs Hongyu Ren, Stanford University 43
44. Scalable Reasoning KG embedding Scalable Multi-hOp REasoning framework (SMORE) § Asynchronous Design with Optimized Pipeline § Multi-thread bidirectional sampler with pre-fetching mechanism § Sparse embedding R/W: CPUóPinned Memory óGPU (including 1 st /2 nd order moments for optimizer) Hongyu Ren, Stanford University 44
45. Large-Scale Benchmarks § We propose three new large KGs to benchmark KG reasoning algorithms. (1365x larger) # Entities Ours # Edges 1E+09 100000000 10000000 Ours 100000000 Existing 1000000 10000000 Existing 100000 1000000 10000 100000 1000 FB15k FB15k-237 NELL995 FB15k FB15k-237 NELL995 FB400k OGB-Wikikg Freebase FB400k OGB-Wikikg Freebase Hongyu Ren, Stanford University 45
46. Small KG Performance Compared with existing multi-hop implementation § Small graphs: § Training speed: 2.2x § GPU memory: -30.6% Hongyu Ren, Stanford University 46
47. Large KG Performance § § § SMORE supports 6 different methods Prior implementation runs out of GPU memory and time limit on the three large KGs SMORE enables (almost) graph-size agnostic speed and GPU memory usage Iterations / Sec 40 35 # Entities 100000000 10000000 1000000 30 25 20 GPU Mem (MB) 2100 1600 100000 1100 600 100 Hongyu Ren, Stanford University 47
48. Link Prediction Runtime of ComplEx on Freebase § SMORE also supports link prediction KG embeddings: TransE, DistMult, RotatE, ComplEx. § SMORE achieves comparable/superior performance with existing systems Hongyu Ren, Stanford University 48
49. Multi-GPU Speedup § SMORE achieves almost linear speedup across models and datasets § https://github.com/google-research/smore Hongyu Ren, Stanford University 49
50. Conclusion Logical query Reason in the embedding space Knowledge graph Embedding space § Reasoning in the embedding space § Box/Beta Embeddings for answering complex queries on KGs § SMORE: Large-scale system for training on massive knowledge graphs § LEGO: apply embedding models to answer natural language questions. Hongyu Ren, Stanford University 50
51. References § § § § § Query2box: Reasoning Over Knowledge Graphs In Vector Space Using Box Embeddings. H. Ren, W. Hu, J. Leskovec. ICLR, 2020. Scaling up Logical Query Embeddings on Knowledge Graphs. H. Ren, H. Dai, B. Dai, X. Chen, D. Zhou, J. Leskovec, D. Schuurmans. ICML SSL workshop 2021. Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs. H. Ren, J. Leskovec. NeurIPS 2020. LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs. H. Ren, H. Dai, B. Dai, X. Chen, M. Yasunaga, H. Sun, D. Schuurmans, J. Leskovec, D. Zhou. ICML 2021. Code: § https://github.com/snap-stanford/KGReasoning § https://github.com/google-research/smore Hongyu Ren, Stanford University 51
52. Collaborators Hongyu Ren, Stanford University 52

首页 - Wiki
Copyright © 2011-2024 iteam. Current version is 2.125.0. UTC+08:00, 2024-05-09 03:56
浙ICP备14020137号-1 $访客地图$