From OntologPSMW

Revision as of 00:59, 26 June 2020 by JanetSinger (Talk | contribs)



Ontology Summit 2020 Whence Working Group     (1)

  • What brought graphs into use as persistence mechanisms. Will also need to address other non-relational persistence mechanisms and the historical background; this can include parts of 'why'.     (1A)

First Draft     (1B)

As a possible starting point for our overview, here are some excerpts from Gutierrez and Sequeda's A Brief History of Knowledge Graph's Main Ideas. They give accomplishments and foci for each era, followed by 'realizations' and 'limitations'. A related report by Juan Sequeda is also available, and Sequeda and others presented Whence-related material in the Stanford course lectures (CS520).     (1B1)

  • Introduction “Those who cannot remember the past are condemned to repeat it”- George Santayana     (1B2)
    • Knowledge Graphs can be considered to be fulfilling an early vision in Computer Science of creating intelligent systems that integrate knowledge and data at large scale. The term “Knowledge Graph” was introduced by researchers at the turn of this century and has rapidly gained popularity in academia and industry since Google popularized it in 2012. It is paramount to note that, regardless of the discussions on, and definitions of the term “Knowledge Graph”, it stems from scientific advancements in diverse research areas such as Semantic Web, Databases, Knowledge Representation and Reasoning, NLP, Machine Learning, among others. … The integration of ideas and techniques from such disparate disciplines give the richness to the notion of Knowledge Graph, but at the same time presents a challenge to practitioners and researchers to know how current advances develop from, and are rooted in, early techniques.     (1B2A)
  • How is this paper written?     (1B3)
    • The essential elements involved in the notion of knowledge graphs can be traced to ancient history. If one would like to dig into their origins, several disciplines should be considered, among them mathematics, philosophy, linguistics, and psychology.[2] However, we do not have the time to go back to ancient times[3] and revisit broad areas of science. Thus, from a temporal point of view, we will concentrate on the evolution after the advent of computing in its modern sense (1950s). … We periodized by decades, but are conscious that the boundaries are much more blurry.[4]     (1B3A)
  • Advent of the digital age (1950s and 1960s)     (1B4)
    • Realizations during the decades of the 50s and 60s:     (1B4A)
      • Importance and possibility of automated reasoning.     (1B4A1)
      • The problem of dealing with large search spaces.     (1B4A2)
      • Need to understand natural language and other human representations of knowledge.     (1B4A3)
      • Potential of semantic nets (and graphical representations in general) as abstraction layers.     (1B4A4)
      • Relevance of systems and high level languages to manage data.     (1B4A5)
    • Limitations of contemporary (50s and 60s) techniques:     (1B4B)
      • Physical, technical and cost limitations of hardware     (1B4B1)
      • Gap between graphical representation and linear implementation     (1B4B2)
      • Gap between the logic of human language and data as handled by computer systems     (1B4B3)
  • Foundations Data and Knowledge (1970s)     (1B5)
    • Realizations:     (1B5A)
      • The need for representational independence, with the relational model as the first example. This approach could also be implemented in practical systems.     (1B5A1)
      • The need to formalize semantic networks using the tools of formal logic.     (1B5A2)
      • The possibilities of combining logic and data by means of networks.     (1B5A3)
    • Contemporary Limitations:     (1B5B)
      • On the DATA side, more flexible data structures were needed to represent new forms of data giving rise to Object Oriented and Graph data structures.     (1B5B1)
      • On the KNOWLEDGE side, more understanding was needed on the formalization of knowledge in logic giving rise to Description Logics.     (1B5B2)
  • Managing Data and Knowledge (1980s)     (1B6)
    • Realizations:     (1B6A)
      • Combining logic and data needs tight coupling (not just layering Prolog or an expert system on top of a database)     (1B6A1)
      • Tradeoff between expressive power of logical languages and computational complexity of reasoning tasks     (1B6A2)
    • Contemporary Limitations:     (1B6B)
      • Negation was a killer. It was not well understood at this time.     (1B6B1)
      • Reasoning at large scale was still hard. Hardware was not going to be up to the task.     (1B6B2)
      • Realization of what would be known as the knowledge acquisition bottleneck     (1B6B3)
  • Data, Knowledge and the Web (1990s)     (1B7)
  • Data and Knowledge at Large Scale (2000s)     (1B8)
    • Realizations     (1B8A)
      • We learned to think about data and knowledge in a much bigger way (at Web scale)     (1B8A1)
      • Entering the era of Neural Networks due to new hardware and clever learning techniques     (1B8A2)
    • Limitations     (1B8B)
      • Do not know how to integrate logical and statistical views     (1B8B1)
      • Statistical methods (particularly in neural networks) do not provide information about the process of “reasoning” or “deduction”, which generates problems in areas where explanation is needed     (1B8B2)
  • Where are we now?     (1B9)
    • Throughout this history, we observed two important threads:     (1B9A)
      • Represent and manage data and knowledge at large scale     (1B9A1)
      • Integrate the most diverse, disparate and almost unlimited amount of sources of data and knowledge (structured data, text, rules, images, voice, videos, etc.).     (1B9A2)
      • Furthermore, all of this must be available and accessible for “normal” users.     (1B9A3)
    • In 2012, Google announced a product called the Knowledge Graph, which is based on representing data in the form of a graph connected with knowledge. … Later on, a myriad of companies (e.g. Microsoft, Facebook, IBM) and organizations started to use the Knowledge Graph keyword to refer to the integration of data giving rise to entities and relations forming graphs.[37] Academia began to use this keyword to loosely designate systems that integrate data with some structure of graphs, a reincarnation of the Semantic Web and Linked Data.     (1B9B)
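The "entities and relations forming graphs" described in the excerpt above can be made concrete with a minimal sketch: a knowledge graph reduced to a set of (subject, predicate, object) triples with pattern-based lookup. This is an illustrative toy, not a method from the cited paper; all entity and relation names below are invented for the example.

```python
# A knowledge graph in its simplest form: a set of
# (subject, predicate, object) triples.
triples = {
    ("Google", "announced", "Knowledge Graph"),
    ("Knowledge Graph", "announced_in", "2012"),
    ("Knowledge Graph", "builds_on", "Semantic Web"),
    ("Knowledge Graph", "builds_on", "Linked Data"),
}

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern; None is a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Everything the (hypothetical) "Knowledge Graph" entity builds on:
roots = sorted(o for _, _, o in query(s="Knowledge Graph", p="builds_on"))
print(roots)  # ['Linked Data', 'Semantic Web']
```

Real systems layer schemas, identifiers (e.g. IRIs in RDF), and reasoning on top of this triple structure, but the graph-of-entities-and-relations core is the same.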

Discussion on 20200624: John Sowa, Neuro-symbolic     (1B10)

Good material on KG history to digest and incorporate    (1C)

  1. It turns out Google was *not* the first to introduce even the KG label with their blog post announcement in 2012. This promotional post included no specifics, but has still been widely cited as authoritative and original; it is cited in the Wikipedia article on KGs.     (1C1)
  2. According to the abstract below, knowledge graph theory was initiated [in 1982] by C. Hoede, a discrete mathematician at the University of Twente, and F.N. Stokman, a mathematical sociologist at the University of Groningen, both in the Netherlands. Abstract: “The project on knowledge graph theory was begun in 1982. At the initial stage, the goal was to use graphs to represent knowledge in the form of an expert system. By the end of the 80's expert systems in medical and social science were developed successfully using knowledge graph theory. In the following stage, the goal of the project was broadened to represent natural language by knowledge graphs. Since then, this theory can be considered as one of the methods to deal with natural language processing. At the present time knowledge graph representation has been proven to be a method that is language independent. The theory can be applied to represent almost any characteristic feature in various languages. The objective of the paper is to summarize the results of 25 years of development of knowledge graph theory and to point out some challenges to be dealt with in the next stage of the development of the theory. The paper will give some highlight on the difference between this theory and other theories like that of conceptual graphs which has been developed and presented by Sowa in 1984 and other theories like that of formal concept analysis by Wille or semantic networks.”     (1C2)
  3. Below is an excellent article that does cite the work by Hoede et al., and discusses the problem for the community of not having a clear definition of KGs. Abstract: “Recently, the term knowledge graph has been used frequently in research and business, usually in close association with Semantic Web technologies, linked data, large-scale data analytics and cloud computing. Its popularity is clearly influenced by the introduction of Google’s Knowledge Graph in 2012, and since then the term has been widely used without a definition. A large variety of interpretations has hampered the evolution of a common understanding of knowledge graphs. Numerous research papers refer to Google’s Knowledge Graph, although no official documentation about the used methods exists. The prerequisite for widespread academic and commercial adoption of a concept or technology is a common understanding, based ideally on a definition that is free from ambiguity. We tackle this issue by discussing and defining the term knowledge graph, considering its history and diversity in interpretations and use. Our goal is to propose a definition of knowledge graphs that serves as basis for discussions on this topic and contributes to a common vision.”     (1C3)

Whence Chat notes 20200624    (1D)

We had a productive conversation but did not take notes: we were relying on a transcription of the recording but unfortunately lost it due to internet connection problems. The topics covered were:     (1D1)

  1. Follow-on discussion to the presentation by George Hurlburt, including parallels between the history of general systems research and the history of semantic technologies, where pieces of earlier more comprehensive work are forgotten, to be rediscovered/rebranded often in simpler form. John related the story of how DL focus superseded the original call for a more sophisticated unifying logic layer in the semantic web vision.     (1D2)
  2. Discussion of “Good material” insights above underscored the need to develop a more coherent story about what led to the KG efforts of today so the communiqué doesn’t end up adding to the false historical narrative crediting Google with some landmark innovation in 2012.     (1D3)
  3. The two “whence” presentations we have are John’s and Chaitanya Barr’s, with the latter focused on the very recent era of the NSF OKN initiative. In order to present a clear summary story for the communiqué, we can draw on those, but we need agreement among ourselves on what KGs are in relation to database architectures, knowledge bases, conceptual schemas, ontologies, reasoners, and technologies for natural language processing, entity extraction, machine learning, etc. In John’s view KGs are identified with RDF, but he agreed it’s best not to identify a general definition with a specific technology. However, the level of expressiveness of DL as opposed to other logics should be brought out.     (1D4)
  4. We agreed with Gary that it would be good to have a standard architecture diagram to refer to throughout the communiqué. Before next meeting Monday, Janet will collect candidate diagrams including Gary’s and others from John, Ravi and George, plus characterizations of the related concepts, above.     (1D5)