
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en-US">
<head>
<title>Designing and Building Enterprise Knowledge Graphs</title>
<link href="../styles/stylesheet.css" rel="stylesheet" type="text/css"/>
<link href="../styles/page-template.xpgt" rel="stylesheet" type="application/vnd.adobe-page-template+xml"/>
<meta content="urn:uuid:81982e4f-53b2-476f-ab11-79954b0aab3c" name="Adept.expected.resource"/>
</head>
<body epub:type="bodymatter">
<section epub:type="chapter">
<h1 class="chno" epub:type="title"><span epub:type="pagebreak" id="page_129" title="129"/>CHAPTER 5</h1>
<h1 class="chtitle" epub:type="title">What’s Next?</h1>
<p class="noindent">You may now be asking how knowledge graphs relate to established practices and newer trends:</p>
<section>
<h2 class="head2" id="ch5_1">5.1<span class="space3"/><span epub:type="title">COULDN’T I HAVE DONE THIS WITH A RELATIONAL DATABASE?</span></h2>
<p class="noindent">Yes. Technically, you can do whatever you want with a <i>Turing-complete language.</i> It’s all just software. The question is whether a relational database is the right technology for managing enterprise data and metadata. Recall, “The limits of my language mean the limits of my world.” We have been using relational database technology for almost half a century, and we continue to run into the same conundrum. That is why we need to evolve enterprise data management.</p>
</section>
<section>
<h2 class="head2" id="ch5_2">5.2<span class="space3"/><span epub:type="title">ISN’T THIS JUST MASTER DATA MANAGEMENT?</span></h2>
<p class="noindent">Master Data Management (MDM) entails the integration of master data (customers, products, etc.). However, MDM is not a technology. It is considered a business discipline in which data consumers and data producers work together to ensure that an enterprise’s master data assets are accurate and consistent. MDM is therefore one application of knowledge graphs, because a knowledge graph is a way to connect all your data together.</p>
</section>
<section>
<h2 class="head2" id="ch5_3">5.3<span class="space3"/><span epub:type="title">KNOWLEDGE GRAPHS AND AI</span></h2>
<p class="noindent">The goal of Artificial Intelligence, by some definitions, is to build software agents that can display human intelligence. Inference, also called “reasoning,” is traditionally a mechanism by which the application of <i>rules</i> (typically rules expressed in mathematical logic) makes implicit graph data explicit or detects inconsistencies in your graph. For example, consider a graph containing information about people and their parents. Implicitly, the graph reveals who people’s grandparents are, but this information is not explicit. By defining that a “grandparent” is the “parent” of one’s “parent,” one can add “grandparent” edges to the graph and thus make this information explicit. This is a trivial example, but it illustrates the difference between implicit and explicit knowledge. This type of reasoning is called “symbolic,” since it is based on the manipulation of symbols; in this example, the symbols include “parent” and “grandparent.” “Non-symbolic” methods typically apply statistical calculations to reveal implicit information or patterns in your data. Both types are discussed below.</p>
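<p class="indent">The grandparent rule can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how an actual inference engine is implemented: the edge labels “has-parent” and “has-grandparent,” the sample people, and the function name <code>infer_grandparents</code> are all hypothetical.</p>
<pre>
```python
# Hypothetical toy graph: each edge is a (subject, predicate, object) triple.
# ("Carol", "has-parent", "Bob") reads as "Carol has the parent Bob".
edges = {
    ("Carol", "has-parent", "Bob"),
    ("Bob", "has-parent", "Alice"),
}


def infer_grandparents(edges):
    """Return the has-grandparent edges implied by the has-parent edges.

    Rule: if X has-parent Y and Y has-parent Z, then X has-grandparent Z.
    """
    parents = [(s, o) for (s, p, o) in edges if p == "has-parent"]
    return {
        (child, "has-grandparent", grandparent)
        for (child, parent) in parents
        for (other, grandparent) in parents
        if other == parent
    }


# The implicit knowledge, made explicit by adding the inferred edges.
explicit_graph = edges | infer_grandparents(edges)
```
</pre>
<p class="indent">Here the only inferred edge is that Carol has the grandparent Alice; the original edges are left unchanged.</p>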
<figure>
<div class="image" id="fig5_1"><img alt="Image" src="../images/fig5_1.jpg"/></div>
<figcaption>
<p class="figcaption"><span class="blue">Figure 5.1:</span> Simple example of entailment.</p>
</figcaption>
</figure>
<section>
<h3 class="head3" id="ch5_3_1"><span epub:type="pagebreak" id="page_130" title="130"/>5.3.1<span class="space3"/><span epub:type="title">SYMBOLIC REASONING</span></h3>
<p class="noindent">For RDF graphs, the mechanisms for symbolic reasoning follow from the way the semantics of RDF Schema (RDF(S) for short) and OWL are defined.</p>
<p class="indent">RDF(S) is a simple ontology language which, in terms of logic, does not contain the notion of <i>negation.</i> This means that RDF(S) reasoning cannot create or reveal inconsistencies in your graph; it can merely add data. The added data (new edges in your graph) are called “entailments.” RDF(S) establishes a simple object-oriented modeling system, with classes and subclasses, and can also organize property types (i.e., the kinds of edges you can have) hierarchically, as properties and “sub-properties.” Reasoning mostly exploits the transitive nature of the subclass and sub-property relations: for example, if A is a subclass of B, and B is a subclass of C, we can infer that A is also a subclass of C, and so on. All of this reasoning is “hard-wired” in RDF(S), so you cannot, for example, define new transitive relations of your own. See <a href="#fig5_1">Figure <span class="blue">5.1</span></a> for a simple example of entailment in RDF.</p>
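<p class="indent">As a rough sketch of the transitive subclass reasoning described above, the following plain Python computes the entailed subclass edges by repeating the rule until nothing new is derived. The class names and the function name <code>subclass_closure</code> are hypothetical, and the sketch covers only this one rule, not the full RDF(S) semantics.</p>
<pre>
```python
def subclass_closure(subclass_of):
    """Return the transitive closure of a set of (subclass, superclass) pairs."""
    closure = set(subclass_of)
    while True:
        # Apply the rule: A subclass-of B and B subclass-of C gives A subclass-of C.
        derived = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if derived.issubset(closure):
            return closure  # fixpoint reached: no new entailments
        closure |= derived


# Hypothetical class hierarchy, asserted edges only.
asserted = {("Poodle", "Dog"), ("Dog", "Mammal"), ("Mammal", "Animal")}

# The entailments are the edges that reasoning adds to the asserted ones.
entailments = subclass_closure(asserted) - asserted
```
</pre>
<p class="indent">Because the rule only ever adds edges, the loop is guaranteed to stop once the closure is complete.</p>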
<p class="indent">OWL adds to the expressive power of RDF(S). For example, you can define new transitive relations, you can define a relation and its “inverse relation,” and you can state that two nodes are, in fact, the same. As an example, if we define that “has-parent” is the inverse of, say, “has-child,” then asserting that Bob has a parent called Alice allows the inference engine to conclude that Alice has a child called Bob.</p>
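<p class="indent">The inverse-relation example can be sketched as follows. This is a simplified illustration in plain Python rather than the behavior of any particular OWL reasoner; the mapping <code>INVERSE_OF</code> and the function <code>add_inverse_edges</code> are hypothetical names introduced here.</p>
<pre>
```python
# Hypothetical declaration that the two relations are inverses of each other.
INVERSE_OF = {"has-parent": "has-child", "has-child": "has-parent"}


def add_inverse_edges(edges, inverse_of):
    """Return the edges plus the inverse edge for every relation with a declared inverse."""
    inferred = {
        (obj, inverse_of[pred], subj)
        for (subj, pred, obj) in edges
        if pred in inverse_of
    }
    return edges | inferred


# Asserted: Bob has a parent called Alice.
asserted = {("Bob", "has-parent", "Alice")}

# Inferred in addition: Alice has a child called Bob.
graph = add_inverse_edges(asserted, INVERSE_OF)
```
</pre>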
<p class="indent">The convenient characteristic of inference without negation is that you can establish a bigger, “virtual” graph that contains not only the original data but also all the entailments. Assuming you are running an inference engine either in your graph database or in front of it, your application code does not, in fact, need to know that inference is being performed. You merely query against the larger graph, whether it is virtual or the entailments have been physically materialized and added to the original graph.</p>
<p class="indent">Adding negation to the mix can complicate things from the application development standpoint. No longer can your application be ignorant about the existence of an inference engine, since inference may uncover conflicts or violations of constraints expressed in your model.</p>
<p class="indent"><span epub:type="pagebreak" id="page_131" title="131"/>In OWL, negation comes in several forms. <i>Restrictions</i> are definitions of logical constraints that must hold, typically for the properties of a class in one way or another (e.g., the value of a property must always be of a certain type, or there can be at most one of a particular property). Other constraints include <i>disjointness</i> (e.g., the class of Cats and the class of Dogs are defined as disjoint, so the discovery of an instance that is both a Cat and a Dog is a violation of this constraint).</p>
<p class="indent">The OWL specifications define several “profiles” for the language, with slightly different semantics as well as different computational requirements and benefits. It is also very typical to employ something often referred to as “RDF+,” an ad hoc mix of RDF(S) and some OWL features. Finally, some reasoning engines allow you to define “custom rules” to add to the reasoning capabilities of the system.</p>
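<p class="indent">A disjointness check of the kind just described might look like the sketch below. It is a deliberately simplified stand-in for what an OWL reasoner does: the instance names, the <code>types</code> dictionary, and the function <code>disjointness_violations</code> are hypothetical, and only explicitly asserted class memberships are examined.</p>
<pre>
```python
def disjointness_violations(types, disjoint_pairs):
    """Return (instance, class_a, class_b) for each instance in two disjoint classes."""
    violations = []
    for instance, classes in sorted(types.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                violations.append((instance, class_a, class_b))
    return violations


# Hypothetical data: the classes each instance is asserted to belong to.
types = {
    "Rex": {"Dog"},
    "Tom": {"Cat"},
    "Chimera": {"Cat", "Dog"},
}

# Hypothetical model: Cat and Dog are declared disjoint.
disjoint_pairs = [("Cat", "Dog")]

violations = disjointness_violations(types, disjoint_pairs)
```
</pre>
<p class="indent">Unlike the earlier entailments, this result is not something the application can silently query over; it has to decide what to do with the reported conflict.</p>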
</section>
<section>
<h3 class="head3" id="ch5_3_2">5.3.2<span class="space3"/><span epub:type="title">NON-SYMBOLIC REASONING</span></h3>
<p class="noindent">Reasoning does not have to be limited to classical, symbolic methods. If we consider the term more broadly, it can mean any method by which we, say, uncover implicit information from the graph, or identify conflicts, constraint violations, or errors in the graph. Reasoning is a way to enrich a knowledge graph, and this can be done by way of symbolic or non-symbolic inference (or even by purely procedural calculation and processing: sometimes this means the use of graph algorithms that consider the entire graph to calculate new, useful information about the graph).</p>
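<p class="indent">As one small illustration of the procedural case, the sketch below walks the entire graph to compute each node’s degree (the number of edges touching it) and writes the result back as new edges. Degree is chosen here only because it is easy to follow; the edge label “degree,” the sample data, and the function <code>degree_annotations</code> are hypothetical.</p>
<pre>
```python
from collections import Counter


def degree_annotations(edges):
    """Return (node, "degree", count) edges, counting every edge that touches a node."""
    degree = Counter()
    for subj, _pred, obj in edges:
        degree[subj] += 1
        degree[obj] += 1
    return {(node, "degree", count) for node, count in degree.items()}


# Hypothetical graph of (subject, predicate, object) edges.
edges = {
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "Carol"),
    ("Alice", "works-for", "Acme"),
}

# Enrich the graph by inserting the calculated information back into it.
enriched = edges | degree_annotations(edges)
```
</pre>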
<p class="indent">Graph models and graph data are usually easy to understand, easier than corresponding relational models. Because of this, graphs can serve as input for machine learning models and make the work of data scientists easier. Results of machine learning, as suggested above, can be inserted back into the graph, creating a “virtuous cycle” of graph enrichment.</p>
<p class="indent">There are several books and articles that survey the landscape of ML and graphs, e.g., <span class="blue">Hamilton</span> [<span class="blue">2020</span>]<sup><a epub:type="noteref" href="#pgfn5_1" id="rpgfn5_1">1</a></sup> and <span class="blue">Nickel et al.</span> [<span class="blue">2016</span>].</p>
</section>
</section>
<section epub:type="footnotes">
<div epub:type="footnote" id="pgfn5_1"><p class="pgnote"><sup><a href="#rpgfn5_1">1</a></sup> The book [<span class="blue">Hamilton, 2020</span>] is available in pre-print form here: <span class="blue"><a href="https://www.cs.mcgill.ca/~wlh/grl_book/">https://www.cs.mcgill.ca/~wlh/grl_book/</a>.</span></p></div>
</section>
</section>
</body>
</html>