ConferenceCall 2023 10 04 and ConferenceCall 2023 10 18

Ontolog Forum

{|
|+ ConferenceCall 2023 10 04
! scope="row" | Session
| [[session::Overview]]
|-
! scope="row" | Duration
| 1 hour
|-
! scope="row" rowspan="3" | Date/Time
| [[has date::4 Oct 2023 16:00 GMT]]
|-
| 9:00am PDT/12:00pm EDT
|-
| 4:00pm GMT/5:00pm CST
|-
! scope="row" | Conveners
| [[convener::AndreaWesterinen|Andrea Westerinen]] and [[convener::MikeBennett|Mike Bennett]]
|}

{|
|+ ConferenceCall 2023 10 18
! scope="row" | Session
| [[session::A look across the industry, Part 1]]
|-
! scope="row" | Duration
| 1 hour
|-
! scope="row" rowspan="3" | Date/Time
| [[has date::18 Oct 2023 16:00 GMT]]
|-
| 9:00am PDT/12:00pm EDT
|-
| 4:00pm GMT/5:00pm CST
|-
! scope="row" | Conveners
| [[convener::AndreaWesterinen|Andrea Westerinen]] and [[convener::MikeBennett|Mike Bennett]]
|}


== Agenda ==

=== 4 October 2023: Overview ===
* '''[[AndreaWesterinen|Andrea Westerinen]]''' and '''[[MikeBennett|Mike Bennett]]'''
** '''Title:''' ''Fall Series Kickoff and Overview''
** '''Abstract:''' The opening session of the Ontology Summit 2024 Fall Series surveys the LLM, ontology and knowledge graph landscapes and introduces the participating speakers. The goal of the series is to understand, discuss and debate the similarities, differences and overlaps across these landscapes. These sessions will also be used to help formulate the full 2024 Summit.
** [https://bit.ly/3Q28U00 Slides]
** [https://bit.ly/3rCnyC0 Video Recording]

=== 18 October 2023: A look across the industry, Part 1 ===
* '''Kurt Cagle''', Author of [https://thecaglereport.com/ The Cagle Report]
** '''Title:''' Complementary Thinking: Language Models, Ontologies and Knowledge Graphs
** '''Abstract:''' With the advent of Retrieval Augmented Generation (RAG), a more or less standardized workflow has become available for integrating large language models such as ChatGPT with knowledge graphs. This in turn has raised questions about the nature of the ontologies associated with LLMs, and about how knowledge graphs can be structured and queried to make integrated data access possible between the two types of systems. In this talk, editor and AI explorer Kurt Cagle of The Cagle Report examines this process and discusses how it affects both knowledge portals and ontology design.
** [https://bit.ly/3S040lR Slides]
* '''Tony Seale''', Knowledge graph architect and thought leader ([https://www.linkedin.com/in/tonyseale/ LinkedIn])
** '''Title:''' How Ontologies Can Unlock the Potential of Large Language Models for Business
** '''Abstract:''' LLMs have remarkable capabilities; they can craft letters, analyze data, orchestrate workflows, generate code, and much more. Companies such as Google, Apple, Amazon, Meta, and Microsoft are all investing heavily in this technology. Everything indicates that LLMs have enormous disruptive potential. However, there is a problem: they can hallucinate, and for any serious business, that is a deal-breaker. This is where ontologies come in. In combination with knowledge graphs, they can place guardrails around LLMs, allowing organizations to harness the capabilities of LLMs within the framework of a safely controlled ontological structure.
** [https://bit.ly/46A3EH2 Slides]
** [https://bit.ly/46DJyvo Video Recording]


== Conference Call Information ==
* Dates: '''Wednesday, 4 October 2023''' and '''Wednesday, 18 October 2023'''
* Start Time: 9:00am PDT / 12:00pm EDT / 6:00pm CEST / 5:00pm BST / 1600 UTC
** ref: [http://www.timeanddate.com/worldclock/fixedtime.html?month=10&day=04&year=2023&hour=12&min=00&sec=0&p1=179 World Clock (4 October)], [http://www.timeanddate.com/worldclock/fixedtime.html?month=10&day=18&year=2023&hour=12&min=00&sec=0&p1=179 World Clock (18 October)]
* Expected Call Duration: 1 hour
{{:OntologySummit2024/ConferenceCallInformation}}


== Participants ==
* [[AlexShkotin|Alex Shkotin]]
* [[AndreaWesterinen|Andrea Westerinen]]
* [[ToddSchneider|Todd Schneider]]
* [[MikeBennett|Mike Bennett]]
* Ayya Niyyanika Bhikkhuni
* Bill McCarthy
* Zefi Kavvadia
* [[RamSriram|Ram D Sriram]]
* Andrew McCaffrey
* Steve Wartik
* [[BartGajderowicz|Bart Gajderowicz]]
* [[MarkFox|Mark Fox]]
* Seungmin Seo
* JL Valente
* [[MichaelDeBellis|Michael DeBellis]]
* [[DouglasMiles|Douglas Miles]]
* [[GaryBergCross|Gary Berg-Cross]]
* Sima Yazdani
* [[JohnSowa|John Sowa]]
* [[KenBaclawski|Ken Baclawski]]
* [[RaviSharma|Ravi Sharma]]
* Sergey Rodionov
* Taj Uddin
* [[MarkRessler|Mark Ressler]]
* Asiyah Yu Lin
* Hayden Spence
* Michael Singer
* Roberta Ferrario
* Chris Novell
* Emanuele Bottazzi
* Marco Monti
* [[JanetSinger|Janet Singer]]


== Discussion ==
* [[MikeBennett|Mike Bennett]]: Andrea's quote: "Ontologies are the backing definitions behind knowledge graphs" is a great way of describing the distinction between them.
** Emanuele Bottazzi: Or justifications
* Anthony's OtterPilot: Hi, I'm an AI assistant helping Anthony Alcaraz take notes for this meeting. Follow along the transcript here: https://otter.ai/u/WLIaj2w-OmOEoVVCP5gURhhZHaY?utm_source=va_chat_link_2 You'll also be able to see screenshots of key moments, add highlights, comments, or action items to anything being said, and get an automatic summary after the meeting.
** [[MikeBennett|Mike Bennett]]: Dang! Put on a session on AI and the AIs start showing up!
   
* [[AndreaWesterinen|Andrea Westerinen]]: I like the quote, “Ontologies are the shapes of information and knowledge”
   
* [[ToddSchneider|Todd Schneider]]: What is a ‘shape of information’?
** [[AndreaWesterinen|Andrea Westerinen]]: Roles, constraints, relationships for a domain; a local representation
   
* [[RaviSharma|Ravi Sharma]]: Can LLMs and/or ontologies be used to detect AI artifacts?
** [[AndreaWesterinen|Andrea Westerinen]]: @Ravi Sharma There are some different articles on this, but basically no.
** [[MikeBennett|Mike Bennett]]: Not unless you could write an ontology that defines what it is to be truly human
** Michael Robbins: Or rebuild the web from the bottom up to embed new frameworks for digital identity and content provenance/authenticity (which is what we need to commit to)
** [[AndreaWesterinen|Andrea Westerinen]]: Provenance also aids with detecting bots
 
* [[MichaelDeBellis|Michael DeBellis]]: Someone asked about agents in relation to Data Mesh (can't find the orig comment) IMO Data Mesh could be implemented in terms of Agents but typically isn't.
   
* [[RaviSharma|Ravi Sharma]]: Is there one kind or multiple kinds of connectivity in KG as well as in Ontologies?
   
* [[RaviSharma|Ravi Sharma]]: Container in the sense of domain overlaps especially overlapping vocabs as Venn diagrams?
   
* Anh: Why is it hard to build LLMs for other languages?  Why can't it be replicated easily when translation work (e.g. Facebook, Google Translate, LinkedIn) has already done a somewhat good job?
** [[AndreaWesterinen|Andrea Westerinen]]: It is about the training data.
** [[BartGajderowicz|Bart Gajderowicz]]: [Anh], on longer text, translation loses many of the cultural nuances of a language, and loses context. Also, most training data is in English, so most models are trained on English then translated.
** Anh: Thank you, @Andrea Westerinen & @Bart Gajderowicz. Does it mean it'd cost the same to build a new LLM for a new language?
** [[AndreaWesterinen|Andrea Westerinen]]: I would believe so.
** [[BartGajderowicz|Bart Gajderowicz]]: The volume of English text for training is cheaper, it’s just the web. So I’d imagine finding enough text in your target language would be the biggest cost. Librarians are our friends here 🙂
** [[PennyAnderson|Penny Anderson]]: this context free is difficult when you start thinking of utility of UI
** Anh: Could you please elaborate more on utility of UI? @Penny Anderson
** [[PennyAnderson|Penny Anderson]]: I mean UIs tend to be process driven; data-centric is not tightly bound to a particular process, that is, context free
** [[PennyAnderson|Penny Anderson]]: other than search ?
** Anh: I see. Thanks.
** [[PennyAnderson|Penny Anderson]]: Data provenance, Ethical AI?
** [[AndreaWesterinen|Andrea Westerinen]]: @Penny Anderson Much better declared and reasoned against in KGs.
** [[PennyAnderson|Penny Anderson]]: Zero trust in networks?
   
* [[RaviSharma|Ravi Sharma]]: Kurt, when you talk of the box, are you essentially stating in- and out-of-scope items, or is there a way of capturing the info outside the box and bringing it in?
   
* [[MichaelDeBellis|Michael DeBellis]]: I don't agree that knowledge graphs are a form of data mesh. They are two related but different concepts. Data mesh is essentially an architecture for enterprise data definition and management. Knowledge graphs are a tool that can be used to implement a data mesh. Data mesh IMO brings the philosophy of microservices to data.
** Alan Morrison: Michael, re: microservices to data, should we be thinking of agents as messengers and KGs as the data resource?
   
* [[RaviSharma|Ravi Sharma]]: What kind of aggregates are these data shapes? Are these only valid for a class of data or can you mix data types in a shape aggregate?
** [[MichaelDeBellis|Michael DeBellis]]: Ravi: in SHACL you define similar things as you do with OWL. E.g., does a property have to have exactly one value, the datatypes of a property, other constraints. The difference is OWL is used for reasoning over large knowledge graphs and uses the Open World Assumption. SHACL is for constraining data so it uses the Closed World Assumption. E.g., you can define ss_number as a property that must have exactly one value in either OWL or SHACL but in OWL you will almost never trigger an error if the restriction isn't satisfied due to OWA. With SHACL you will get error messages due to CWA
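
A minimal sketch of the contrast Michael describes, in Turtle with a hypothetical <code>ex:</code> namespace: the same "exactly one ss_number" rule written first as an OWL restriction and then as a SHACL shape. Only the SHACL form reports a violation when the value is missing.

<syntaxhighlight lang="turtle">
@prefix ex:   <http://example.org/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# OWL: every Person has exactly one ss_number. Under the Open World
# Assumption, a Person with no stated ss_number is not an error; a
# reasoner assumes the value exists somewhere, unstated.
ex:Person a owl:Class ;
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty ex:ss_number ;
        owl:cardinality "1"^^xsd:nonNegativeInteger
    ] .

# SHACL: the same rule as a shape. Under the Closed World Assumption,
# validation reports an error for any Person with zero or multiple values.
ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:ss_number ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1
    ] .
</syntaxhighlight>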
   
* [[ToddSchneider|Todd Schneider]]: LLM reasoning is probabilistic.
   
* Ayya Niyyanika Bhikkhuni: When weighting a concept, is that some sort of credibility score?
   
* Michael Robbins: LangChains - I’m thinking about what role ConLangs could serve as intermediaries in revolutionizing language models and NLP. Happy to have a follow-up discussion with anyone who is interested. https://en.wikipedia.org/wiki/Constructed_language
 
* [[RaviSharma|Ravi Sharma]]: What is the equivalent of Objects in an LLM? What are these entities called, and can the same onto-entity be different in an LLM context?
** [[MikeBennett|Mike Bennett]]: Words, I presume?
   
* [[RaviSharma|Ravi Sharma]]: Kurt, thanks for including wonderful, valuable background cultural images; these are inspiring. Are you also conveying that there is external (databased) and internal knowledge such as contemplation?
   
* [[RaviSharma|Ravi Sharma]]: Why are you limiting your examples to RDF? Why not MOF also?
   
* [[ToddSchneider|Todd Schneider]]: The important question is not mapping(s). It’s how well-constructed ontologies can be used in the ingestion/training of LLMs.
   
* [[SusanneVejdemo|Sus Vejdemo]]: Question: what are good case studies of KGE and LLM integrations?
** [[AndreaWesterinen|Andrea Westerinen]]: I addressed some of this in the opening session. “Hybrid systems” include both where LLMs help ontologies (actually the Oct 25th session) and where ontologies help LLMs (Oct 4 and Nov 1 sessions).
** [[AndreaWesterinen|Andrea Westerinen]]: Also, Tony is highlighting a GREAT integration.
 
* [[MichaelDeBellis|Michael DeBellis]]: One question I have is what happens once you load a knowledge graph into an LLM? I know it can be done but once you load say a Turtle file into the LLM then what?
** [[BartGajderowicz|Bart Gajderowicz]]: Generally the LLM builds a small knowledge graph instance inside its memory and you can query it. Ask it to write SPARQL to get some instances, etc. I have not seen it used for large KGs, just small ones.
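
To make Bart's answer concrete, here is a sketch of the kind of toy Turtle file one might paste into a chat (hypothetical <code>ex:</code> namespace); the comment at the end shows the sort of SPARQL the LLM could then be asked to write, and against which its answer can be checked.

<syntaxhighlight lang="turtle">
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# A small knowledge graph pasted into the prompt.
ex:alice a foaf:Person ;
    foaf:name "Alice" ;
    ex:worksOn ex:ProjectX .

ex:bob a foaf:Person ;
    foaf:name "Bob" ;
    ex:worksOn ex:ProjectX .

ex:ProjectX a ex:Project ;
    ex:title "Knowledge Graph Pilot" .

# Prompt: "Write a SPARQL query listing each person and their project."
# One correct answer to check the LLM against (with PREFIX lines for
# ex: and foaf: declared as above):
#   SELECT ?name ?project
#   WHERE { ?p a foaf:Person ; foaf:name ?name ; ex:worksOn ?project . }
</syntaxhighlight>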
   
* Amit Jain: Will the recording be shared with attendees after the summit?
** [[MikeBennett|Mike Bennett]]: Yes, the recording will be uploaded to the session page when it is ready.
   
* Cedric Berger: Are there any metrics to measure that KGs combined with LLMs indeed hallucinate less?
** [[AndreaWesterinen|Andrea Westerinen]]: I will try to provide these in the summary. I have read papers on this as well as blog posts.
** [[BartGajderowicz|Bart Gajderowicz]]: All the work I’ve come across relies on the knowledge graph to provide explicit knowledge. So if you can ground the LLM with a graph, you can verify if the answers the LLM provides are “facts” in the graph
   
* Kurt Cagle: Once you load the ontology into a conversation, it will create (to some extent) an LLM conceptual space for that data. Also keep in mind that getting ALL of a knowledge graph via RAG is usually not feasible.
 
* [[RaviSharma|Ravi Sharma]]: Tony, great diagram for the LLM/ontology loop to improve each other. Why is ontology weak in capturing concepts vs LLM?
 
* [[MichaelDeBellis|Michael DeBellis]]: One of the most interesting papers I've read is from Lawrence Berkeley Labs on using LLMs to extend an ontology.


* [[ToddSchneider|Todd Schneider]]: Many Knowledge Graphs are not based on an ontology.
* [[RaviSharma|Ravi Sharma]]: How would models such as those in physics work with the LLM and ontology loop or cycle that you show? Actually, aren't ontologies conceptually richer than KGs alone?
** [[AlexShkotin|Alex Shkotin]]: but keep it inside
** [[AndreaWesterinen|Andrea Westerinen]]: KGs are the data, ontologies are the concepts … So, it does not seem right to ask about one being richer.
** [[MikeBennett|Mike Bennett]]: I would not characterize such a thing as a knowledge graph, even if it re-uses that label for itself. Whence the claim of 'Knowledge' in KG if not semantics? Might not be an OWL-ology of course.
** Michael Robbins: https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
** [[BartGajderowicz|Bart Gajderowicz]]: An ontology is the “schema” for a knowledge graph, so it may not be designed well but there is a “schema” that defines nodes and edges in some way.
   
* [[RaviSharma|Ravi Sharma]]: Tony, the built-in uncertainty in LLMs gives it extra power to apply to a real-life probabilistic world?
* Steven Wartik: I like to distinguish between a KG and a knowledge base. A KG is a graph. It doesn't necessarily have a schema. A KB is a KG whose schema is an ontology. This is just terminology, but I find it helps my sponsors understand.
* [[AlexShkotin|Alex Shkotin]]: Give me a KG and I extract its ontology (see the sketch after this list).
* [[KenBaclawski|Ken Baclawski]]: KGs were covered in Ontology Summit 2020.  The communique has precise definitions: https://ontologforum.s3.amazonaws.com/OntologySummit2020/Communique/OntologySummit2020Communique.pdf
* [[ToddSchneider|Todd Schneider]]: Knowledge =def. “facts, information, and skills acquired by a person through experience or education; the theoretical or practical understanding of a subject” (from New Oxford American Dictionary)
* [[ToddSchneider|Todd Schneider]]: ‘Meaning’ is an ambiguous term.
* Andrew McCaffrey: To "table" a motion means completely the opposite things in the US and the UK. :D
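
Alex's "give me a KG and I extract its ontology" can be illustrated in Turtle (hypothetical <code>ex:</code> namespace): even schema-less instance data implies a rudimentary ontology, namely the classes in use plus a domain/range guess for each property.

<syntaxhighlight lang="turtle">
@prefix ex:   <http://example.org/> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Instance data as it might arrive in a "schema-less" knowledge graph ...
ex:acme a ex:Company ;
    ex:employs ex:bob .
ex:bob a ex:Person .

# ... and the minimal ontology extracted from it: declare each class that
# appears, and infer a domain and range for each property from its usage.
ex:Company a rdfs:Class .
ex:Person  a rdfs:Class .
ex:employs a rdf:Property ;
    rdfs:domain ex:Company ;
    rdfs:range  ex:Person .
</syntaxhighlight>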


* Ayya Niyyanika Bhikkhuni: As generative AI hallucinations become an issue, there seems a need for credibility scoring.  I am about a decade out-of-the-loop, but know we were talking about this many summits ago.  This is in regards to trust.
* [[RaviSharma|Ravi Sharma]]: Tony, LLMs as analog and ontology as quantum? Great way. Thanks.
** [[BartGajderowicz|Bart Gajderowicz]]: Explanations WITH hallucinations are a huge problem for LLMs. They sound credible, and may be logically sound, but are completely wrong.
** Emanuele Bottazzi: Perhaps all the probabilistic approaches cannot be explanatory, since they “happen” to be wrong or right
** [[BartGajderowicz|Bart Gajderowicz]]: Ideally the explanation would come from explicit knowledge. Most LLMs just don’t have that. Ensemble ML architectures may include explicit knowledge somewhere, but if the underlying processes and representations are probabilistic we reach a hard limit on explainability. Of course you can have an explanation that provides “certainty” about the answer and explanation, which is often sufficient.
** Emanuele Bottazzi: I would add that ideally the explanation would come from the  explicit _use_ of knowledge and principles
* [[BartGajderowicz|Bart Gajderowicz]]: Do LLMs perform natural language understanding (NLU), or just processing (NLP)?
** [[BartGajderowicz|Bart Gajderowicz]]: Given my definition of knowledge I’d say NLP only. Even a simple Word2vec embedding is able to identify similarity between complex objects, but I would not consider it understanding (or knowledge)
* Ayya Niyyanika Bhikkhuni: “What is really true” is the underlying question when translating ancient text.  The project I am working on is taking translations from humans and Generative AI and it is hoped then that people practicing according to their interpretation of the texts would tune the translations based on ‘tacit knowledge.’
* Marco Monti: QUESTION: if neither LLMs nor knowledge graphs allow for compositionality and high contextualization of answers from a chatbot, what are the mechanisms behind the scenes of GPT X that answer so punctually and contextually?
* [[JanetSinger|Janet Singer]]: Yes, mimicry is the key characterization of what LLMs do. This parallels the 1950s, when it was thought that mimicry of biological behavior would inevitably lead to a structural model of living systems, and then to artificially generated life itself. See critiques by Robert Rosen.
* [[GaryBergCross|Gary Berg-Cross]]: LLM based systems can learn on the job although you wouldn't call it based on experience.  This has been said about the learning: "When a user interacts with an LLM-based system, the system is able to observe the user's responses and learn from them. This allows the system to improve its ability to generate responses that are relevant to the user's needs.


* [[GaryBergCross|Gary Berg-Cross]]: There are a number of ways that LLM-based systems can be trained using chat responses. One common approach is to use reinforcement learning. In reinforcement learning, the system is rewarded for generating responses that are positive and helpful. This encourages the system to learn what kinds of responses are most likely to be well-received by users."
* [[MikeBennett|Mike Bennett]]: Left brain v right brain is a great way of talking about ontology v LLM. Now you have to create a good corpus callosum.
** [[GaryBergCross|Gary Berg-Cross]]: A better model than left right hemispheres is by layers - old, mid brain (associative) and neo-cortex. They interconnect in many ways and some by the limbic system.
** [[BartGajderowicz|Bart Gajderowicz]]: I like the System 1 / System 2 analogy


* [[ToddSchneider|Todd Schneider]]: Could you explain “links in OWL are not first class objects”?
* Cedric Berger: Aren’t LLMs also kind of discrete as relying on vectors (arrays of numbers) of limited dimensions?
**  Steven Wartik: Todd, a first-class object is uniquely identifiable. A reified triple is a 1st-class object.
** [[BartGajderowicz|Bart Gajderowicz]]: LLMs are fundamentally probabilistic, not discrete. Some models are trained to provide discrete classifications, but that’s just at the output level. Internally they are probabilistic.
** Asiyah Yu Lin: I think the knowledge graph users who don't care too much about OWL are thinking at the data level or instance level. The ontology is really about classes. There is a blurred line between what is data and what is a class.
** [[AndreaWesterinen|Andrea Westerinen]]: The discreteness pertains to the encoding. But, does the probabilistic nature of LLMs make it more continuous? I am not sure.
** [[MichaelDeBellis|Michael DeBellis]]: @Todd Schneider Suppose you have a model of a highway as a graph where nodes are cities and links are roads. You want to model the time it takes to get between two nodes as information directly on the link. You can do that with Neo4J but not with OWL. With OWL you need to use the design pattern where you reify the relation with a new class (see the sketch after this list).
** [[MikeBennett|Mike Bennett]]: The weightings are a number rather than a logical truth value as in KGs
** [[ToddSchneider|Todd Schneider]]: Michael, thank you for the explanation. Per your example, it could be the case that the representation (of the entities and their relations) was inadequate to support the query (i.e. without reification).
** Kurt Cagle: Even with KGs, you can set up reifications that also set up Bayesians that are again more fuzzy (or at least more stochastic).
** [[MichaelDeBellis|Michael DeBellis]]: @Todd Schneider Yes. My question is how easy is it to take an OWL ontology where you have reified the relations and use graph theoretic algorithms? I don't know because I haven't used these algorithms in a long time. One thing I'm thinking about is creating an extension to OWL (I mean things like new classes and Python or SPARQL) where when you assert a new property value you have the option to create an instance of a Relation class and store data directly on that instance. That way you could treat the OWL ontology as a true graph.
** [[MikeBennett|Mike Bennett]]: If you had an ontology with weightings instead of truth values and can train those values, you have a semantic network like a brain/mind.
** [[MichaelDeBellis|Michael DeBellis]]: Often you can even ask GPT-4 to create the KIF or CycL or CLIF .. and it will
** Kurt Cagle: It's where I think we're heading. People in the semantic space have known for years that knowledge is fuzzy / fractal, but getting there has always been the rub.
** [[MikeBennett|Mike Bennett]]: I roughed out an idea for this kind of semantic network application back in the 90s.
* Hayden Spence: RE: Generating ontologies with LLMs: https://github.com/monarch-initiative/ontogpt
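
A sketch of the reification pattern Michael describes, in Turtle with a hypothetical <code>ex:</code> namespace: the road between two cities becomes an instance of a relation class, so the travel time can sit directly on it, just as it would sit on a Neo4j edge.

<syntaxhighlight lang="turtle">
@prefix ex:  <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Instead of a bare triple  ex:Springfield ex:roadTo ex:Shelbyville ,
# the link is reified as an instance of a Road class, which can then
# carry its own data, such as travel time.
ex:road42 a ex:Road ;
    ex:fromCity ex:Springfield ;
    ex:toCity   ex:Shelbyville ;
    ex:travelTimeMinutes "35"^^xsd:integer .
</syntaxhighlight>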


* [[JanetSinger|Janet Singer]]: Here, mimicry of knowledge-driven behavior is being promoted as inevitably leading to structural models of knowledge and then to ‘emergent consciousness’. Ontologies are structural (good for modeling within their scope); LLMs are behavioral
* Michael Robbins: A great article on vector embeddings: https://kdb.ai/learning-hub/fundamentals/vector-embeddings/ How can we use this for transparency and explainability? Give users confidence intervals (and other potential response options) along with responses?
** [[JanetSinger|Janet Singer]]: Here as in the hype cycle, not by Andrea 🙂


* [[Douglas Miles|Douglas Miles]]: GPT-3 btw seems useless compared to GPT-4 on this front
* [[RaviSharma|Ravi Sharma]]: What are vectors equivalent to in LLM context?
   
* Hayden Spence: From my understanding, GPT-4 is multimodal and multi-model in the sense that its training uses more parameters, it incorporates more than just text data, and the actual interface is the interaction of multiple GPT models working together.
* Harvey King: Are the embeddings stored across 3 layers?
** [[AndreaWesterinen|Andrea Westerinen]]: The embeddings are there, but simplified/reduced.
* [[ToddSchneider|Todd Schneider]]: What is ‘semantic understanding’?
   
* Anh: Does KG consider the time stamp of the assertions/objects? Context of my question: could we use it to mark the originality of posts of similar contents to alert plagiarism.
** [[AndreaWesterinen|Andrea Westerinen]]: The KG CAN do this, if it is encoded.
* Hayden Spence: Is the use of established controlled vocabularies that are under license, like SNOMED CT, MedDRA, ICD10/0, or standards like FHIR, and the mappings between them -- once embedded -- still restricted? At what point does a transformation of an information collection become its own thing, separate from the digested information?
   
* [[Douglas Miles|Douglas Miles]]: i don't have a question at this point.. but love this talk!
* [[RaviSharma|Ravi Sharma]]: Has anyone done a roundtrip quality check: learn from an LLM and put it in ontologies, and the other way around?
** Benoit Claise: In which context/use case?
** [[MichaelDeBellis|Michael DeBellis]]: That paper I posted earlier from Lawrence Berkeley Labs used an LLM to extend an ontology but just went in one direction, expanding the ontology, not changing the LLM.
* [[JanetSinger|Janet Singer]]: Symbolic and connectionist theories of cognition are both computationalist. This leaves out the 4-E embodied cognition perspective.
   
* [[RaviSharma|Ravi Sharma]]: Can you do reasoning on the same concept in both to differentiate their respective strengths and weaknesses?
   
* [[RaviSharma|Ravi Sharma]]: Tony, your tree or chain of thought are great ways
   
* [[RaviSharma|Ravi Sharma]]: Tony and Kurt, can you address feature space vs training set learning approaches?
   
* Liju Fan: Why are the relations in the ontology explicit?
** [[AndreaWesterinen|Andrea Westerinen]]: Relations are defined, and they can be inferred, but this is the essence of ontology. Ontologies are “open world” but do need relationships.
** [[MichaelDeBellis|Michael DeBellis]]: Because in the ontology you (usually manually) create the relations. In an LLM the relations are inferred by the ML algorithm and usually can't be manually changed
** Liju Fan: It seems there is a need to be able to rename LLM inferred relations for them to be human-understandable and practically useful.
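
What "explicit" means here can be shown in Turtle (hypothetical <code>ex:</code> namespace): an ontology relation is a declared, named artifact to which human-readable documentation, a domain, and a range are attached, and renaming it is just an edit to a label, unlike the implicit associations inside an LLM.

<syntaxhighlight lang="turtle">
@prefix ex:   <http://example.org/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# The relation is created by hand and documented; renaming it for human
# readability is a matter of changing the rdfs:label.
ex:treats a owl:ObjectProperty ;
    rdfs:label "treats" ;
    rdfs:comment "Relates a drug to a condition it is prescribed for." ;
    rdfs:domain ex:Drug ;
    rdfs:range  ex:Condition .
</syntaxhighlight>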
   
* Kurt Cagle: Please note the similarity of Tony's slide with biological cells. Hmmm ...
   
* Cedric Berger: Has anyone asked AI to generate an image of a factual made of network base units?
   
* Michael Robbins: Agreed, Tony. And we’ve talked about this on LinkedIn. A constellation of domain-specific and ecosystem-based Community Knowledge Graphs and Language Models. #CKGs and #CLMs
** [[MikeBennett|Mike Bennett]]: TLO as mitochondria?
** Michael Robbins: Language is inseparable from culture and context
   
* Harvey King: Do KGs take a different nature when dealing with math?
** [[AndreaWesterinen|Andrea Westerinen]]: Not different, but with more rules?
** [[MichaelDeBellis|Michael DeBellis]]: Yes. An ontology uses explicit models like linear algebra. LLMs use linear algebra but compute answers based on examples; they don't have a theoretical model of math (or other domains) as an ontology does.
   
* [[SusanneVejdemo|Sus Vejdemo]]: Thanks for a great talk! I love the conceptualization of embeddings as ontologies.
   
* [[MikeBennett|Mike Bennett]]: If you reify every edge you can give each one an analog value
   
* [[MarkUnderwood|Mark Underwood]]: For those interested in practical cybersec use cases (lots of chatty network data, some NLP), there are lots of narrow domain-specific emergent ontologies; e.g., Lambda / microservice mesh etc. mark.underwood@syf.com
   
* [[GaryBergCross|Gary Berg-Cross]]: I think the working memory graph is a useful early concept for bringing LLMs and ontologies together, but it is not so easy to capture what the context for knowledge is in this representation. I would guess this is a subset of the fluid knowledge that human cognition employs. Much remains unconscious. But with research, AI systems may make more of this explicit.
   
* [[RaviSharma|Ravi Sharma]]: My query is: what is the relationship among reification, provenance, and context history?
   
* [[AndreaWesterinen|Andrea Westerinen]]: Quote from Tony, “LLMs for compute” and “As much data into graph, then translating the graph paths to NL and adding to the LLM”
   
* [[GaryBergCross|Gary Berg-Cross]]: Reality may be atomistically discrete but at such a nano-level that continuous models make better predictions than discrete models that are orders of magnitude too gross rather than fine grained.
   
* [[AndreaWesterinen|Andrea Westerinen]]: Quote from Kurt: “Community Language Models - decentralized, federated, ad hoc network of information”
   
* [[GaryBergCross|Gary Berg-Cross]]: Models need to be more than federated. Because we center on semantic accuracy and relevance, they need to be semantically harmonized.


== Resources ==
* [https://bit.ly/3rCnyC0 Video Recording (4 October)]
* [https://bit.ly/46DJyvo Video Recording (18 October)]


== Previous Meetings ==
{{#ask: [[Category:OntologySummit2024]] [[Category:Icom_conf_Conference]] [[<<ConferenceCall_2023_10_18]]
        |?|?Session|mainlabel=-|order=desc|limit=3}}
== Next Meetings ==
{{#ask: [[Category:OntologySummit2024]] [[Category:Icom_conf_Conference]] [[>>ConferenceCall_2023_10_18]]
         |?|?Session|mainlabel=-|order=asc|limit=3}}


