
Session: Synthesis Session 1
Duration: 1 hour
Date/Time: 27 February 2019 17:00 GMT
9:00am PST / 12:00pm EST
5:00pm GMT / 6:00pm CET
Convener: Ken Baclawski


Ontology Summit 2019 Synthesis Session 1     (2)

Abstract     (2A)

The aim of this week's session is to synthesize the lessons learned so far on the tracks that are under way. Each track has met once, and so we will have gained insights from a combination of invited speakers, chat log comments and blog page discussions.     (2A1)

A second synthesis session will take place after the tracks have met again, and that, along with today's session outcome, will form the basis of this year's Ontology Summit Communiqué.     (2A2)

Agenda     (2B)

Summary of Ontology Summit 2019 Sessions     (2C)

There have been 9 sessions so far. Each session has proceedings (from the chat room) and a recording (one audio recording; the rest are video recordings). The following are the speakers, with links to their presentation slides (when they were provided) and the recordings.     (2C1)


    (2C2)
Date | Speaker | Topic | Presentation | Recording
11/14 | John Sowa | Explanations and help facilities designed for people | Slides | Video
11/28 | Ram D. Sriram and Ravi Sharma | Introductory Remarks on XAI | Slides | Video
      | Derek Doran | Okay but Really... What is Explainable AI? Notions and Conceptualizations of the Field | Slides |
12/05 | Gary Berg-Cross and Torsten Hahmann | Introduction to Commonsense Knowledge and Reasoning | Slides | Video
1/16  | Ken Baclawski | Introductory Remarks | Slides | Video
      | Gary Berg-Cross and Torsten Hahmann | Commonsense | Slides |
      | Donna Fritzsche and Mark Underwood | Narrative | Slides |
      | Mark Underwood and Mike Bennett | Financial Explanation | |
      | Ram D. Sriram and David Whitten | Medical Explanation | |
      | Ram D. Sriram and Ravi Sharma | Explainable AI | Slides |
1/23  | Michael Grüninger | Ontologies for the Physical Turing Test | Slides | Video
      | Benjamin Grosof | An Overview of Explanation: Concepts, Uses, and Issues | Slides |
1/30  | Donna Fritzsche | Introduction to Narrative | | Audio only
      | Ken Baclawski | Proof as Explanation and Narrative | Slides |
      | Mark Underwood | Bag of Verses: Frameworks for Narration from Cognitive Psychology | Slides |
2/6   | Mike Bennett | Financial Explanations Introduction | Slides | Video
      | Mark Underwood | Explanation Use Cases from Regulatory and Service Quality Drivers in Retail Credit Card Finance | Slides |
      | Mike Bennett | Financial Industry Explanations | Slides |
2/13  | David Whitten | Introduction to Medical Explanation Systems | Slides |
      | Augie Turano | Review and Recommendations from past Experience with Medical Explanation Systems | Slides |
      | Ram D. Sriram | XAI for Biomedicine | Slides |
2/20  | William Clancey | Explainable AI Past, Present, and Future – A Scientific Modeling Approach | Slides | Video

Conference Call Information     (2D)

Attendees     (2E)

Proceedings     (2F)

The following is a draft synthesis for CSK and Explanations by Gary Berg-Cross 4/23/2019     (2F1)

The following is a preliminary summary of material on commonsense knowledge and explanation which may also be useful for the Summit Communiqué. The material is roughly organized as follows: 1. The history of commonsense knowledge and explanations in AI; 2. Additional views of explanation; 3. An overview of building applications; 4. Challenges and issues to investigate; 5. Preliminary findings.     (2F2)

Commonsense and Explanation (Preliminary Synthesis)

Commonsense reasoning and knowledge (CSK) was prominently featured as an early part of how Artificial Intelligence (AI) was conceptualized. It was assumed to be important in the development and enhancement of human-like, intelligent systems, including the ability of a system to explain its reasoning and what it knows. Commonsense research, since its inception, has focused on studying the consensus reality, knowledge, causality, and rationales that are ubiquitous in everyday thinking, speaking, and perceiving as part of ordinary human interaction with the world. As the name suggests, this knowledge and reasoning is common and available to the overwhelming majority of people, and is manifest in human children's behavior by the age of 3 to 5. Commonsense and the ability to get along in an ordinary world was assumed in the original Turing Test, for example, as discussed in Session 1 by Michael Gruninger. Some examples of CSK include the following types (a minimal sketch of such assertions as typed triples follows the list):
• Taxonomic: Cats are mammals
• Causality: Eating a pear makes you less hungry
• Goals: I don't want to get hot, so let's find shade
• Spatial: You often find a toaster in the kitchen
• Functional: You can sit on park benches if tired
• Planning: To check today's weather, look in a paper
• Linguistic: "can't" is the same as "cannot"
• Semantic: Dog and canine have a similar meaning

More recently, with ML capable of describing images, a visual Turing Test might involve question answering based on real-world images, such as detecting and localizing instances of objects and relationships among the objects in a scene:
• What is Person1 carrying?
• What is Person2 placing on the table?
• Is Person3 standing behind a desk?
• Is Vehicle1 moving?     (2F3)
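To make the list above concrete, here is a minimal Python sketch of such commonsense assertions stored as typed triples that a system could query. The schema, relation names, and assertions are purely illustrative assumptions, not drawn from any particular CSK resource.

```python
# A minimal, hypothetical schema: commonsense assertions of the types listed
# above, stored as typed triples that a system could query or explain from.
from collections import namedtuple

Assertion = namedtuple("Assertion", ["kind", "subject", "relation", "object"])

CSK = [
    Assertion("taxonomic", "cat", "is_a", "mammal"),
    Assertion("causal", "eating a pear", "causes", "being less hungry"),
    Assertion("spatial", "toaster", "typically_located_in", "kitchen"),
    Assertion("functional", "park bench", "used_for", "sitting when tired"),
    Assertion("linguistic", "can't", "same_as", "cannot"),
]

def lookup(relation, subject):
    """Return the objects asserted for a subject under a given relation."""
    return [a.object for a in CSK if a.relation == relation and a.subject == subject]

if __name__ == "__main__":
    print(lookup("typically_located_in", "toaster"))  # ['kitchen']
```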

This paper summarizes some of the research ideas discussed in our commonsense reasoning and knowledge sessions as part of the 2019 Ontology Summit on Explanations. CSK and explanation are both current topics of interest in AI and Machine Learning (ML), and they are related. The current emphasis on explanation, for example, grows in part out of the opacity of Deep Learning (DL) solutions for such tasks as labeling images or translating text (discussed in more detail in other sessions of this Summit). These efforts motivate some opportunities for related work on commonsense that may be supportive. Of note, this session on CSK examined issues around both commonsense and explanations, particularly as they have been developing under the influence of modern ML and DL models. In part, the excitement around ML grows out of the impact of big data and the recognition that, to move forward, we must automate activities that currently require much manual intervention. Compounding the challenge of explainability is the fact that current, rapid advances in AI include these neural net and DL approaches. Among these is the interestingly named Long Short-Term Memory (LSTM), a recurrent neural network (RNN) architecture (in which output from a previous step is fed as input to the current step) used in deep learning. LSTMs were developed to deal with the exploding and vanishing gradient problems that can be encountered when training traditional RNNs, and they have proven a useful architecture for processing sequential data such as speech/discourse or a series of visual images. Such applications, however, raise questions of what such systems know and what they can say about their knowledge and judgments.     (2F4)
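Since LSTMs come up repeatedly in this discussion, a minimal numpy sketch of a single LSTM cell step may help make the gating idea concrete. This is a generic textbook formulation with random placeholder weights, not a model from any system discussed in the sessions.

```python
# One LSTM cell step: gating lets the cell state carry information across long
# sequences, mitigating the vanishing/exploding gradients of plain RNNs.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """x: current input; (h_prev, c_prev): previous state.
    W, U, b stack the input, forget, output, and candidate parameters."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g          # cell state: gated memory
    h = o * np.tanh(c)              # hidden state: gated output
    return h, c

d_in, d_hid = 8, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * d_hid, d_in))
U = rng.normal(size=(4 * d_hid, d_hid))
b = np.zeros(4 * d_hid)

h, c = np.zeros(d_hid), np.zeros(d_hid)
for x in rng.normal(size=(5, d_in)):    # a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
```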

1. Some History and Background to Commonsense and Explanation in AI

There is a long history showing the relevance of commonsense knowledge and reasoning to explanation. AI founders such as John McCarthy certainly believed so, and argued that a major long-term goal of AI is to endow computers with standard commonsense reasoning capabilities. In "Programs with Common Sense," McCarthy (1959) described three ways for AI to proceed: 1. imitate the human CNS, 2. study human cognition, or 3. "understand the common sense world in which people achieve their goals." Another, related goal has been to endow AI systems with natural language (NL) understanding and production. It is easy to see that a system with both CSK and an NL facility would be able to provide smart advice as well as an explanation of that advice. We see this relation in McCarthy's early conceptualization of a smart advice-taker system, which would have causal knowledge available to it for: • "a fairly wide class of immediate logical consequences of anything it is told and its previous knowledge."     (2F5)

McCarthy further noted that this useful property, if designed well, would be expected to have much in common with what makes us describe certain humans as "having common sense." He went on to use this idea to define CSK: "We shall therefore say that a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows."

In practice, back in the 70s and 80s, AI systems were not what the founders envisioned. They were brittle, with handcrafted production rules that encoded useful information about diseases, for example. In practice this rule knowledge was fragmented and opaque, and would break down in the face of obvious errors, due in part to a lack of common sense. It also came with a very simple, technical, but not commonsense idea of what was called an "explanation facility": a technical trace of rule firings, which was thought to provide an explanation. Proofs found by automated theorem provers can provide a map from inputs to outputs. Indeed, in a narrow, logical sense the "gold standard" concept of explanation is a deductive proof done using a formal knowledge representation (KR), and there are multiple formalizations of deductive models involving some form of knowledge and reasoning. But as made clear in our sessions, there are other forms, styles, or meanings of explanation. One concerns the provenance or source of some fact or statement. For example, "fact sentence F41 was drawn from document D73, at URL U84, in section 22, line 18." This makes clear the documented source of data; that is important too, and allows follow-up (a minimal sketch of such a provenance record appears at the end of this subsection). Another type of explanation is the transparency of the algorithmic inference process: is it using sub-classing or something more complex like abduction? In some cases it is very hard for a human to understand how the explanation was arrived at, since the underlying structure is hidden. Other structurings of inference in a presentation are possible, using a clausal resolution graph or a Bayes net graph. But an important question to ask is "do these make something clear?" They may provide the "how" of an answer, arrived at in steps with particular rules involved, but not the justifying "why" of a satisfactory explanation. If a tree structure is involved in an explanation process, we might get more of a "why" understanding, with the possibility of drilling down and browsing the tree, having a focal point of attention on critical information, or the option of displaying a graphic representation that a human can understand. An example provided by Niket Tandon concerns a vehicle-controller AI system explaining its driving based on visual sensing. The system describes itself as "moving forward" as an activity, while a human description is the more functional and common "driving down the street." As to explanations, the system says "because there are no other cars in my lane," while the human explanation is "because the lane is clear." These are similar, but "clear" is a more comprehensive idea of a situation, which might include construction, trees, etc. A more elaborate example offered was how a smart system explains judgments of "what is a healthy meal." As shown in the Figure below, justifying explanations may point to specific items or qualify the overall variety of items in the meal in a multi-modal way. Commonsense assumptions and presumptions in a knowledge representation may be an important aspect of explanations and serve as a focus point. The ability to focus on relevant points may be part of the way a system is judged competent, as is its perceived correctness: that it provides a good answer and a good explanation. Involved in such a judgment may also be an evaluation of ethicality, fairness, and, where relevant, legality, as well as the various roles involved, such as:
• Relational role
• Processual role
• Social role

An example of this is that the role of legal advice is different in the context of a banking activity compared to that of lying under oath. Part of the reason for limited explanations, if not brittleness, was discovered by Clancey (1983), who found that Mycin's individual rules play different roles, have different kinds of justifications, and are constructed using different rationales for the ordering and choice of premise clauses. Since this knowledge is not made explicit, it cannot be used as part of explanations. And there are structural and strategic concepts that lie outside early AI system rule representations; these can only be supported by appealing to some deeper level of (background) knowledge. One solution approach was to use ontologies to capture and formalize this knowledge. The argument was that ontologies are needed to make explicit the structural, strategic, and support knowledge that enhances the ability to understand and modify the system (e.g., knowledge debugging as part of KB development) as well as to support suitable explanations. To some extent, efforts like CYC, which started up in the 80s, were an attempt to avoid these problems by providing a degree of commonsense and modular knowledge. CYC can provide a partial inference chain constructed in response to queries such as "Can the Earth run a marathon?" In terms of a commonsense explanation we get a "no," because the Earth is not animate, and the role capability of running a marathon is detailed by the knowledge in a sports module (called a micro-theory or MT). Around this time, the issue for the development and application of ontologies was that the commonsense context was seldom explicitly stated and is difficult to state. But the need for a formal mechanism for specifying a commonsense context had become recognized, and approaches to it, such as CYC's microtheories, arose. These descended from J. McCarthy's tradition of treating contexts as formal objects over which one can quantify and express first-order properties. In the 80s, CYC-type knowledge was also seen as important to associate systems, the argument being that "systems should not only handle tasks automatically, but also actively anticipate the need to perform them.... agents are actively trying to classify the user's activities, predict useful sub-tasks and expected future tasks, and, proactively, perform those tasks or at least the sub-tasks that can be performed automatically" (Wahlster and Kobsa, 1989). Around this time another influential CSK development was Pat Hayes' "Naive Physics Manifesto" (1978), which proposed to develop a formal theory encompassing the large body of knowledge of the physical world. The vision was of an axiomatically dense KB using a unified conceptualization whose axioms were represented in symbolic logic. Emphasis was on formalizing foundational, common "everyday" physical concepts, including: measurements and scales; spatial concepts (shape, orientation, direction, containment); substances and states (solids, liquids); physical forces, energy, and movement; and manufactured objects and assemblies. Much ontological work has followed the spirit of this idea, if not the exact program outlined. More recent work, reflecting the ability of ML systems to learn about visual information and even text, has led to more distinctions being made about CS knowledge.
An example (see Figure) provided by Tandon et al. (2018) distinguished the visual modality expressing a type of knowledge (is it a simply observed property like color, or some simple relation like part-of?) from implications (shiny things imply smoothness and so less friction). As shown in the Figure below, properties of agents, such as emotions in difficult circumstances, while commonly known, are more implicit, as are the actions involved in fixing a tire or avoiding a traffic jam.     (2F6)
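As a small illustration of the provenance style of explanation mentioned above ("fact sentence F41 was drawn from document D73, at URL U84, in section 22, line 18"), the following sketch shows one possible record structure. The class, field names, and values are hypothetical, not from any system discussed here.

```python
# A minimal sketch of a provenance-style explanation record for a single fact.
from dataclasses import dataclass

@dataclass
class ProvenancedFact:
    fact_id: str
    statement: str
    source_doc: str
    url: str
    section: int
    line: int

    def explain_source(self) -> str:
        return (f"{self.fact_id} ('{self.statement}') was drawn from "
                f"{self.source_doc} ({self.url}), section {self.section}, line {self.line}.")

# Illustrative values only.
f41 = ProvenancedFact("F41", "The counterparty is an affiliate of the bank.",
                      "D73", "http://example.org/U84", 22, 18)
print(f41.explain_source())
```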




The KB would be supported by commonsense reasoners, which might include answering "why" questions in an understandable way.

2. Additional Concepts of Explanations and Commonsense Understandings

It is worth noting in passing that CSK has been discussed in prior Ontology Summits. In particular, the 2017 Summit on AI, Learning, Reasoning, and Ontologies and its track on "Using Automation and Machine Learning to Extract Knowledge and Improve Ontologies" is relevant, though not all of it will be discussed here. One prominent example from 2017 was NELL (Never-Ending Language Learner). Central to the NELL effort is the idea that we will never truly understand machine or human learning until we can build computer programs that share some similarity with the way humans learn. In particular, such systems, like people, learn:
• many different types of everyday knowledge or functions, and thus many contexts,
• from years of diverse, mostly self-supervised experience,
• in a staged, curricular fashion, where previously learned knowledge in one context enables learning further types of knowledge,
• using self-reflection and the ability to formulate new representations and new learning tasks, enabling the learner to avoid stagnation and performance plateaus.

As reported at the 2017 Summit, NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)). NELL has also learned millions of features and parameters that enable it to read these beliefs from the web. Additionally, it has learned to reason over these beliefs to infer (we might say using commonsense reasoning) new beliefs, and is able to extend its ontology by synthesizing new relational predicates. NELL learns to acquire two types of knowledge in a variety of ways. It learns free-form text patterns for extracting this knowledge from sentences in a large-scale corpus of web sites. NELL also exploits a coupled process that learns text patterns corresponding to type and relation assertions, and then applies them to extract new entities and relations. In practice it learns to extract this knowledge from semi-structured web data such as tables and lists. In the process it learns morphological regularities of instances of categories, and it learns probabilistic Horn clause rules that enable it to infer new instances of relations from other relation instances that it has already learned. Reasoning is also applied for consistency checking and removing inconsistent axioms, as in other KG generation efforts. NELL might learn a number of facts from a sentence defining "icefield," for example:     (2F7)

           "a mass of glacier ice; similar to an ice cap, and usually smaller and lacking a dome-like shape;  	somewhat controlled by terrain." 

In the context of this sentence and this newly extracted "background knowledge," it might then extract supporting facts/particulars from the following sentences: "Kalstenius Icefield, located on Ellesmere Island, Canada, shows vast stretches of ice. The icefield produces multiple outlet glaciers that flow into a larger valley glacier." It might also note not only the textual situation relating extracted facts but also the physical location (e.g., Ellesmere Island) and any temporal situations expressed in these statements. An important context is that AI systems increasingly use advanced techniques such as deep learning, which may in turn require additional techniques to make them more understandable to humans and system designers, as well as trusted.     (2F8)
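The following is a rough, hypothetical sketch of the kind of learned text-pattern extraction described for NELL above. The regular-expression pattern and the relation name locatedOn are illustrative stand-ins, not NELL's actual patterns or predicates.

```python
# A learned lexical pattern proposes a typed relation instance from a sentence.
import re

# pattern: "<X>, located on <Y>," -> locatedOn(X, Y)
PATTERN = re.compile(r"(?P<x>[A-Z][\w ]+?), located on (?P<y>[A-Z][\w ]+?),")

def extract_locatedOn(sentence):
    m = PATTERN.search(sentence)
    if m:
        return ("locatedOn", m.group("x").strip(), m.group("y").strip())
    return None

sent = ("Kalstenius Icefield, located on Ellesmere Island, Canada, "
        "shows vast stretches of ice.")
print(extract_locatedOn(sent))
# ('locatedOn', 'Kalstenius Icefield', 'Ellesmere Island')
```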

Some factors making CSK (and explanation) more understandable (and hence trusted) were mentioned by Benjamin Grosof as part of his session presentation. These include:     (2F9)

• Influentiality: heavily weighted hidden nodes and edges can affect discrimination of output in a deep learning neural network (NN), and some constructed approximation of an output, even if partial, might provide a more understandable explanation. Such approximations may represent the top-3 weighted sub-models within some overall ML ensemble model. There is also the concept of "model the model," wherein the learned NN model is re-learned as a secondary, simpler decision-tree model (a small code sketch appears after the industry examples below).
• Lateral relevance: interactivity for exploration of neighborhood, centrality, or top-k within categories in graph structures.

3. Contemporary Applications and Benefits of Explanation (That Make Sense)

There is an obvious benefit if semi- or fully-automatic explanations can be provided as part of decision support systems. This seems like a natural extension of some long-used and well-understood techniques such as logical proofs. Benefits can easily be seen if rich and deep deductions could be supported in areas regarding policies and legal issues, but also as part of automated education and training, such as e-learning. An example of this is the Digital Socrates application developed by Janine Bloomfield of Coherent Knowledge. The interactive tutor system will provide an answer to a particular topical question but also provides the logical chain of reasoning needed to arrive at the correct solution. Knowledge reuse and transfer is an important issue in making such systems scalable.

Some Examples from Industry Using Explanation Technology

Among the examples offered by Benjamin Grosof were:     (2F10)

Coherent Knowledge – previously mentioned as providing eLearning systems with semantic, logical, deductive reasoning, so that proofs provide a natural deduction in programs using declarative, extended logic. But to make explanations understandable, the systems employ NL generation along with a drill-down capability and interactive navigation of the knowledge space. Provenance of the knowledge is also provided.

Tableau Software – provides a different ability, with specialized presentation of information via bar charts and similar visualizations.

Kyndi – provides a more cognitive search in NL knowledge graphs (KGs). Here the capabilities include a focus on relevant knowledge, including lateral relevance and provenance, within an extended KG constructed using a combination of NLP + ML + knowledge representation and reasoning (KRR).

A best-practice architecture (see Figure) and functional example offered by Grosof et al. (2014) concerns Automated Decision Support for Financial Regulatory/Policy Compliance, using Textual Rulelog software technology implemented with the ErgoAI Suite. The system encodes regulations and related information as semantic rules and ontologies, which support fully and robustly automated run-time decisions and related querying.     (2F11)
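Before turning to the details of the compliance example, here is a small sketch of the "model the model" idea listed among Grosof's factors above: an opaque model is trained, and a shallow decision tree is then fit to that model's predictions to serve as a more inspectable surrogate. The toy data and scikit-learn components are illustrative assumptions, not anything used by the systems named here.

```python
# "Model the model": fit a shallow, readable surrogate to an opaque model's
# predictions so the surrogate can serve as an approximate explanation.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=6, random_state=0)

black_box = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500,
                          random_state=0).fit(X, y)
y_bb = black_box.predict(X)                      # the opaque model's decisions

surrogate = DecisionTreeClassifier(max_depth=3,  # shallow, human-readable
                                   random_state=0).fit(X, y_bb)

print("surrogate fidelity:", surrogate.score(X, y_bb))
print(export_text(surrogate))                    # readable rule-like structure
```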












Understandable full explanations are provided in English, reflecting a digital audit trail with provenance information. Rulelog helps handle the increasing complexity of real-world challenges, such as those found in data and system integration involving conflicting policies, special cases, and legal/business exceptions. For example, it understands regulated prohibitions of banking transactions where a counterparty is an "affiliate" of the bank.     (2F12)

Notably, Textual Rulelog (TR) extends Rulelog with natural language processing and uses logic to help do both text interpretation and text generation. With a proper use of rules, mapping is much simpler and closer than with other KRs, and Rulelog's high expressiveness is much closer to NL's conceptual abstraction level. As an added feature, the system allowed what-if scenarios to analyze the impact of new regulations and policy changes.     (2F13)
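To illustrate the general pattern described in this example (regulations encoded as declarative rules, a run-time decision, and an English audit trail with provenance), here is a heavily simplified Python sketch. It is not Rulelog or ErgoAI; the predicates, facts, and citation text are hypothetical.

```python
# A toy rule base: each rule carries a conclusion, its premises, and a citation
# used to build a human-readable audit trail alongside the decision.
facts = {("affiliate", "BankCo", "FundCo")}

rules = [
    (("prohibited_transaction", "BankCo", "FundCo"),
     [("affiliate", "BankCo", "FundCo")],
     "Illustrative citation: transactions with affiliates are restricted."),
]

def decide(query):
    trail = []
    for conclusion, premises, citation in rules:
        if conclusion == query and all(p in facts for p in premises):
            trail.append(f"Because {premises} hold, {conclusion} follows. [{citation}]")
            return True, trail
    return False, ["No applicable rule fired."]

decision, explanation = decide(("prohibited_transaction", "BankCo", "FundCo"))
print(decision)
for step in explanation:
    print(step)
```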


4. Issues and Challenges Today in the Field of CSK (and Explanation)

The Issue of Making the Case for CSK

Modern ML and Deep Learning (DL) techniques have proven powerful for some problems, such as those in computer vision employed for navigation or image identification. Research now reliably shows the value of transfer training/learning as part of this: in pre-training, a neural network model works on a known task, using stored images from a general source like ImageNet, and the resulting trained neural network (with an implied "model") is then used for a new, but related, purpose-specific model. Of course it can be difficult to find training data for all types of scenarios and specific situations of interest. There are problems of representativeness and the seduction of the typical in some generalizations, such as "shiny surfaces are typically hard" (some are not). There is the problem of perspective: the moon in the sky and a squirrel under a tree, which may be in the same image, may seem the same size, but we know from experience that they are at different distances. Many cognitive abilities developed in the first years of life provide the commonsense knowledge to handle these and problems like conservation of objects (if I put my toys in the drawer, they will still be there tomorrow). One may imagine handling such problems by building in commonsense knowledge or letting a system have training experience with conservation over time or place. These types of problems also arise with situational understanding: some important things are unseen but implied in a picture as part of the larger or implied situation, such as an environmental or ecological one with many dependencies. An example offered by Niket Tandon was the implication of an arrow in a food web diagram, which communicates "consumes" to a human (a frog consumes bugs). The problem for a deep net is that it is unlikely to have seen arrows used visually this way often enough to generalize a "consumes" meaning. More recently, related research has demonstrated that a similar "training" technique can be useful for many natural language tasks; an example is Bidirectional Encoder Representations from Transformers (BERT). So there is some general usefulness here, but the black-box nature and brittleness/distractibility of the NN knowledge calls for additional work. For example, as speaker Niket Tandon suggested, adversarial training (see Li, Yitong et al., 2018) during learning could help with mis-identification, and commonsense considerations are one way to generate adversarial data. There remain many problems with ordinary text understanding, such as the implications and scope of negations and what is entailed. For example, Kang et al. (2018) showed the problems of textual entailment with sentences from the Stanford Natural Language Inference set, and how guided examples of an adversarial, commonsense nature could help reduce errors on sentences like "The red box is in the blue box." And we can borrow from the previous ecological example as a language-understanding example: a trained NN would not have seen insects, frogs, and raccoons in one sentence frequently, while to a human the use of an arrow as indicating consumption may be communicated in a single situational learning sentence ("this is what we mean by an arrow"). So again, the higher level of situations and situational questions may require commonsense to understand the circumstances. Handling focus and scale is another problem in visual identification.
In a lake scene with a duck, an ML vision system may see water features like dark spots as objects. Here, as Niket Tandon argued, deep neural nets can easily be fooled (see: evolvingai.org/fooling), so there may be a need for a model of the situation and of what the focus of attention is (a duck object). Some use of commonsense as part of model-based explanations might help during model debugging and decision making, to correct apparently unreasonable predictions. In summary, Niket Tandon suggests that, for the above reasons, commonsense-aware models and representations that are DL-friendly may provide several benefits, especially for the NN and DL efforts underway. They may:
• help to create adversarial training data (for a DL)
• help to generalize to other novel situations, alternate scenarios, and domains
• compensate for limited training data
• be amenable to and facilitate explanation, e.g., with intermediate structures     (2F14)

In the sub-sections below, these points are further illustrated by some of the reasons DL is challenged.     (2F15)

To understand situations (what exactly is happening?), a naive computational system has to track everything involved in a situation/event. This may involve a long series of events with many objects and agents. The ecological example provided before is illustrative, as is visualizing a play in basketball, even one as simple as a made or missed dunk. Images of the activity can be described by some NL sentences (1-3): "He charges forward. And a great leap. He made a basket."     (2F16)

But this may be understood in terms of some underlying state-action changes. There is a sequence of actions, such as jumping, and there are associated but also implied states (1-3): 1. The ball is in his hands (not actually said, but seen and important for the play). 2. The player is in the air (implied by the leap). 3. The ball is in the hoop (technically how a basket is made).     (2F17)

We can represent the location of things (1-3) simply as:     (2F18)

Location(ball) = player's hand
Location(player) = air
Location(ball) = hoop     (2F19)

The point is that these all fit into a coherent action within the context of basketball, and we know and can focus on the fact that the location of the ball at the end of the jump is a key result. On the other hand, as shown by Dalvi, Tandon, and Clark, with a naive training approach it is expensive to develop a large training set for such activities, and the resulting state-action-change models have so many possible inferred candidate structures (is the ball still in his hand? maybe it was destroyed?) that common events can evoke an NP-complete problem. And without sufficient data (remember, it is costly to construct), the model can produce what we would consider absurd, unrealistic choices based on commonsense experience, such as the player being in the hoop. But it is possible, as shown in the images below.     (2F20)


A solution is to have a commonsense-aware system that constrains the search for plausible event sequences. This is possible with the design and application of a handful of universally applicable rules. For example, these constraints seem reasonable based on commonsense: • An entity must exist before it can be moved or destroyed (destruction certainly not being likely in basketball). • An entity cannot be created if it already exists. In the work discussed by Niket Tandon, these constraints were directly derivable from SUMO rules such as MakingFn, DestructionFn, and MotionFn. This provides preliminary evidence that ontologies, even early ones such as SUMO, could be good guides for producing a handful of generic hard constraints in new domains (a minimal sketch of such constraint checking follows below). So how much help do these constraints provide? Commonsense-biased search improves precision by nearly 30% over state-of-the-art DL efforts: Recurrent Entity Networks (EntNet) (Henaff et al., 2017), Query Reduction Networks (QRN) (Seo et al., 2017), and ProGlobal (Dalvi et al., 2018).     (2F21)
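A minimal sketch of such hard-constraint filtering is shown below; the event vocabulary (create/move/destroy) and the candidate sequences are illustrative, not the actual ProPara/SUMO encoding.

```python
# Filter candidate state-change sequences for one entity with two generic
# commonsense rules: an entity must exist before it can be moved or destroyed,
# and an entity cannot be created if it already exists.
def violates_commonsense(events):
    """events: ordered list of 'create' | 'move' | 'destroy' actions."""
    exists = False
    for e in events:
        if e == "create":
            if exists:
                return True          # cannot be created if it already exists
            exists = True
        elif e in ("move", "destroy"):
            if not exists:
                return True          # must exist before being moved or destroyed
            if e == "destroy":
                exists = False
    return False

candidates = [
    ["create", "move", "move"],      # plausible
    ["move", "create"],              # moved before it exists -> rejected
    ["create", "destroy", "move"],   # moved after destruction -> rejected
]
plausible = [c for c in candidates if not violates_commonsense(c)]
print(plausible)   # [['create', 'move', 'move']]
```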

Deep learning can provide some explanations of what it identifies in simple visual DBs such as VQA and CLEVR. Such systems can answer questions like "What is the man riding on?" in response to being shown the figure below:


Commonsense knowledge is more important when the visual compositions are more dynamic and involve multiple objects and agents, such as in VCR. Several issues and challenges were noted as part of the talks, for example by Tandon and Grosof, starting with:

4.1. How to Acquire CSK     (2F22)

As noted earlier, acquiring CSK can be costly, though some ruling constraints can be derived from existing ontologies. An extraction process from text using AI, ML, or NLP tools may be less costly, but it is much noisier and has the problem of placing knowledge in situational context. In some advanced machine learning cases, prior (background) knowledge may be used to judge the relevant facts of an extract, which makes this a bit of a bootstrapping situation. However, much of what is needed may be implicit and inferred, and is currently only available in unstructured and un-annotated forms such as free text. NELL is an example of how NLP and ML approaches can be used to build CSK and domain knowledge, but source context as well as ontology context needs to be taken into account to move forward. It seems reasonable that the role of existing and emerging Semantic Web technologies and associated ontologies is central to making CSK population viable, and that an extraction process using a core of CSK may be a useful way of proceeding. Another problem is that there are known biases, such as reporting bias, involved in common human understanding. Adding to this, the knowledge exists in multimodal and related forms, which makes coverage challenging across different contextual situations. Some of these contextual issues were discussed as part of the 2018 Summit.     (2F23)


4.2. Confusion about Explanation and Commonsense Concepts

It is perhaps not surprising that this confusion is notable in non-research industry and the mainstream, as well as in the technical media covering computation and systems but not AI. Like many specialties, this needs to be addressed first in the research community, where a useful consensus can be reached and a common way of discussing these topics can be developed and then scaled down to be understood by others.

4.3. The Range of Ontologies Needed for an Adequate CSK

Michael Gruninger's work suggests some significant types of ontologies that might be needed to support something as reasonable as a Physical Embodied Turing Test. The resulting suite is called PRAxIS (Perception, Reasoning, and Action across Intelligent Systems), with the following components:
• SoPhOs (Solid Physical Objects)
• Occupy (location: occupation is a relation between a physical body and a spatial region; there is no mereotopological relationship between spatial regions and physical bodies)     (2F24)
• PSL (Process Specification Language)
• ProSPerO (Processes for Solid Physical Objects)
• OVid (Ontologies for Video)
• FOUnt (Foundational Ontologies for Units of measure)     (2F25)

4.4. Mission Creep, i.e., Expansivity of Task/Aspect

Again, this is a common phenomenon in AI among theorists and in hot new areas like Deep Learning. One sees the expansion of the topic, for example, at the IJCAI-18 workshop on explainable AI, with a range of topical questions drawn from several disciplines, including cognitive science, human factors, and psycholinguistics, such as: • How should explainable models be designed? • How should user interfaces communicate decision making? • What types of user interactions should be supported? • How should explanation quality be measured? A perennial problem is ignorance of past research and of what is already practical. For example, in the area of deep policy/legal deduction for decisions, is it difficult to provide full explanations of extended logic programs, along with NL generation and interactive drill-down navigation? We seem to have reached a point where we can have a good version of cognitive search that takes into account provenance along with focus and lateral relevance, using extended knowledge graphs (see Kyndi research).

4.5. How to Evaluate Explanations or CSK: What Is a Quality Explanation (or a Good Caption for an Image)?

Evaluation remains a research issue in both CSK and explanation. As we have seen, there are many criteria for judging the goodness of automated processes, from simple labeling to full explanations. How do we validate the knowledge? It is not as simple as saying that a system provides an exact match of words to what a human might produce, given the many ways that meaning may be expressed. And it is costly to test system-generated explanations, or even captions, against human ones, due to the human cost. One interesting research approach is to train a system to distinguish human- and ML/DL-generated captions (for images etc.); after training, one can use the resulting criteria to critique the quality of the ML/DL-generated labels (Cui et al., 2018). A toy sketch of this idea follows at the end of this subsection. A particular task is evaluating the quality of knowledge (both CSK and non-CSK) extracted from text. Provenance information or source context is one quality needed. In some cases, and increasingly so, a variety of extracted CSK/information is aligned (e.g., some information converges from different sources) by means of an extant (hopefully quality) ontology, and perhaps several. This means that some aspect of the knowledge in the ontologies provides an interpretive or validating activity when building artifacts like KGs. Knowledge graphs can also be filled in by internal processes looking for such things as consistency with common ideas, as well as by external processes that add information from human and/or automated sources. An example currently used is to employ something like Freebase's data as a "gold standard" to evaluate data in DBpedia, which in turn is used to populate a KG. We can again note that a key requirement for validatable quality of knowledge involves the ability to trace back from a KB to the original documents (such as Linked Data) and, if filled in, from other sources such as humans, to make it understandable or trustworthy. It is useful to note that this process of building such popular artifacts as KGs clearly shows that they are not equivalent in quality to supporting ontologies. In general there is some confusion in equating the quality of extracted information from text, KGs, KBs, the inherent knowledge in DL systems, and ontologies.
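Here is the toy sketch of the caption-critic idea referred to above: train a discriminator to separate human-written from machine-generated captions and reuse its score as a quality signal. The captions, features, and scikit-learn components are placeholders; Cui et al. (2018) use far richer models and data.

```python
# A learned "critic": higher predicted probability of being human-like is read
# as a better caption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

human = ["a brown dog leaps to catch a frisbee in the park",
         "two children share an umbrella on a rainy street"]
machine = ["a dog is in the grass", "a person with an umbrella"]

X = human + machine
y = [1] * len(human) + [0] * len(machine)   # 1 = human-written

critic = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
critic.fit(X, y)

print(critic.predict_proba(["a dog catches a frisbee in the park"])[:, 1])
```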
4.6. Alternative Values and Disconnects between Users, Funders, and Investors

Users often perceive critical benefits/requirements for things like commonsense explanation. Their context is that their current AI systems may produce confusing results, with explanations that do not make sense and/or are complex and unwieldy, while often being misleading or inaccurate. Funders certainly consider big-picture benefits, such as more overall effectiveness of systems with less exposure to the risk of non-compliance. But they should also look at research issues, such as the agile development and knowledge maintenance involved, since current methods for producing explanations and/or commonsense reasoning are expensive, perhaps not scalable, or take too much time to update. Investors (both venture and enterprise-internal) may look for a shorter and more dramatic feature impact and business benefit compared to currently deployed methods. Among those benefits are accurate decisions, but a case should be made for better explanations that can be cost-effective by requiring less labor or less of the precious time of subject matter experts, who can now participate in a closer knowledge-refinement loop. Investors may fail to perceive the value of what they consider add-ons like explanation, which seem costly, and especially commonsense reasoning, which is seen as simple.

4.7. How to Represent Knowledge for Explanations

Both formal/logical and informal or commonsense knowledge of assertions, queries, decisions, answers, and conclusions, along with their explanations, has to be represented. As noted in the work of Gruninger, single ontologies are not likely to be suitable as work expands and more contexts are encountered; this will require multiple ontologies. Big knowledge, with its heterogeneity and depth complexity, may be as much of a problem as Big Data, especially if we are leveraging heterogeneous, noisy, and conflicting data to create CSK and explanations. It is hard to imagine that this problem can be avoided. The ontology experience is that, as a model of the real world, we need to select some part of it based on interest and a conceptualization of that interest. Differences of selection and interpretation are impossible to avoid, and it can be expected that different external factors will generate different contexts for any intelligent agent doing the selections and interpretations needed as part of a domain explanation. The Figure below from Tandon et al. (2018) provides one view of some of the CSK, organized by visual modalities, available in a variety of forms; as can be seen, CYC is well represented. In the context of applying highly structured knowledge such as ontologies to DL system knowledge, there is an issue of how to first connect the noisy space of data to the perfect world of ontologies/KBs and their modules, such as CYC microtheories. Such connections may not be scalable, and the effort is not easily made crowdsourcable (Tandon et al., 2018). Various approaches exist for different forms of CSK, and the integration of these is challenging. Linked Data may view some formal knowledge as a set of linked assertions, but for integration these may be linked to regular sentences expressing those assertions, from which sentences may be generated. While commonsense is an important asset for DL models, its logical representation, such as microtheories, has not been successfully employed. Instead, tuples or knowledge graphs comprising natural language nodes have shown some promise, but these face the problem of string matching, i.e., linguistic variations.
More recent work on supplying commonsense in the form of adversarial examples, or in the form of unstructured paragraphs or sentences, has been gaining attention. But experience in this area suggests that construction is noisy/approximate when scaling up to significant amounts of diverse data, and that task- or application-specific KBs may be more effective when scale is involved.     (2F26)

In the previously cited example (Financial Regulatory/Policy Compliance), an NL-syntax sentence may have one or more logic-syntax sentences associated with it that formally encode the understood assertions and allow some reasoning (unlike free text, which can express CSK but does not directly support such reasoning). For explanations, these may also assert provenance, or even represent the text interpretation in some other expression ("this means that..."). Conversely, logic-syntax sentences may have one or more NL-syntax sentences expressively associated with them, which can be output by a text-generation process. There may also be a round-trip view: source sentences are used in text interpretation, which produces a logical representation, from which other expressions of the knowledge may be produced (a minimal sketch of such an association appears at the end of this section). Work at this level is still in its early stages.

4.8. Robustness and Understandability

How robust is commonsense reasoning and decision making? As noted, there are now many ML applications that are increasingly looked on as mature enough to use for some ordinary tasks. Visual recognition is one of these, but Shah et al. (2019) suggest that some such applications are not robust: simple alternative NL syntactic formulations lead to different answers. For example, "What is in the basket?" and "What is contained in the basket?" (or "What can be seen inside the basket?") evoke different answers. Humans understand these as similar, commonsense meanings, but ML systems may have learned something different. Another problem is that DLs, by themselves, are black-box in nature. So while these approaches allow powerful predictions, their outputs cannot by themselves be directly explained. Additional functionality is needed, including a degree of commonsense reasoning. Such focused, good, fair explanations may use natural language understanding and even be part of a conversational human-computer interaction (HCI) dialogue, in which the system uses previous knowledge of the user's (audience's) knowledge and goals to discuss output explanations. Such "Associate Systems" may get at satisfactory answers because they include a capability to adaptively learn user knowledge and goals, and are accountable for doing so over time, as is commonly true for human associates.

4.9. Enhancing Ontology Engineering Practices

We will need to arrive at a focused understanding of CSK that can be incorporated into ontological engineering practices. For efforts like CSK base building, this should include guidance and best practices for the extraction of rules from extant, quality ontologies. If knowledge is extracted from text and online information, building CSK will require methods to clean, refine, and organize it, probably with the assistance of tools. In light of this, future work will need to refine a suite of tools and technologies to make the lifecycle of CSKBs easier and faster to build.     (2F27)
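A minimal sketch of the NL-to-logic association described in 4.7 follows. The class names, the example rule syntax, and the stand-in text-generation step are all hypothetical illustrations, not the Rulelog/ErgoAI representation.

```python
# One NL-syntax sentence linked to one or more logic-syntax encodings, with a
# placeholder for regenerating NL from the logic for explanation output.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogicForm:
    formula: str                 # illustrative logic-syntax string
    source_sentence: str         # the NL sentence it was interpreted from

@dataclass
class KnowledgeUnit:
    nl_sentence: str
    logic_forms: List[LogicForm] = field(default_factory=list)

    def generate_nl(self) -> List[str]:
        # stand-in for a real NL generation step over the logic forms
        return [f"It follows that: {lf.formula}" for lf in self.logic_forms]

unit = KnowledgeUnit(
    nl_sentence="A bank may not transact with an affiliate.",
    logic_forms=[LogicForm(
        "prohibited(T) :- transaction(T), affiliate(bank, counterparty(T))",
        "A bank may not transact with an affiliate.")],
)
print(unit.generate_nl())
```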

5. Conclusions     (2F28)

It seems clear that both CSK and explanation remain important topics in AI research and its surging branch of ML. Further, they are mutually supportive, although explanation may be the more active area of diverse work just now. We seem close to AI systems that will do common tasks such as driving a car or giving advice on everyday activities like eating. It seems clear that such everyday tasks need to exhibit robust commonsense knowledge and reasoning to be trusted. Thus, as intelligent agents become more autonomous, sophisticated, and prevalent, it becomes increasingly important that humans interact with them effectively, to answer questions such as "why did my self-driving vehicle take an unfamiliar turn?" Current AI systems are good at recognizing objects, but cannot explain what they see in ways understandable to laymen. Nor can they read a textbook and understand the questions in the back of the book, which leads researchers to conclude that they are devoid of common sense. We agree, as DARPA's Machine Common Sense (MCS) proposal put it, that the lack of common sense is "perhaps the most significant barrier" between the focus of AI applications today (such as those previously discussed) and the human-like systems we dream of. And at least one of the areas where such an ability would play a role is in useful explanations. It may also be true, as NELL researchers argue, that we will never produce true NL understanding systems until we have systems that react to arbitrary sentences with "I knew that," or "I didn't know that and accept it, or disagree because X."     (2F29)

Some general recurring questions we suggest are worth considering include:
1. How can we leverage the best of the two most common approaches to achieving commonsense?
• formal representations of commonsense knowledge (e.g., encoded in an ontology's content, as in Cyc or Pat Hayes' ontology of liquids) vs.
• strategies for commonsense reasoning (e.g., default reasoning, prototypes, uncertainty quantification, etc.)
2. How best to inject commonsense knowledge into machine learning approaches? There is some progress on learning using taxonomic labels, but it just scratches the surface.
3. How to bridge formal knowledge representations (formal concepts and relations as axiomatized in logic) and representations of language use (e.g., WordNet)?     (2F30)

Commonsense knowledge and reasoning could assist in addressing some challenges in DL as well as in explanation; in turn, work on DL and explanation can look to CSK and reasoning for assistance.     (2F31)

Many challenges remain, including those of adequately representative knowledge, how to acquire the proper knowledge, and the number of different but consistent and supportive ontologies that may be needed even for simple tasks. How to use these to mitigate the brittleness of DL models in light of various adversarial/confusing alternative inputs is a problem, as is the brittleness of explanations. Among the problems that DL systems face as they address increasingly complex situations is producing the generalization that humans seem to manage for unseen situations when faced with limited training data/experience. Another problem is that DL systems are not aware of an overall context when processing subtle patterns such as social interactions. A recent commonsense-aware DL model makes more sensible predictions despite limited training data. Here, commonsense is the prior knowledge of state changes, e.g., it is unlikely that a ball gets destroyed in a basketball game scenario. Commonsense knowledge and reasoning can compensate for limited training data and make it easier to generate explanations, given that the commonsense is available in an easily consumable representation.     (2F32)


The model injects commonsense at the decoding phase by re-scoring the search space so that probability mass is driven away from unlikely situations. This results in much better performance.     (2F33)
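A small sketch of this decoding-time re-scoring idea: candidate continuations are penalized when they violate a commonsense constraint, shifting probability mass toward plausible state sequences. The scores, the penalty, and the state vocabulary are illustrative only.

```python
# Re-score beam candidates with a commonsense penalty before ranking them.
def commonsense_penalty(candidate_states):
    """Penalize implausible state sequences, e.g. a ball being 'destroyed'
    mid-game (a stand-in for real constraint checking)."""
    return 5.0 if "ball destroyed" in candidate_states else 0.0

def rescore(beam):
    """beam: list of (candidate_states, model_log_score)."""
    return sorted(((states, score - commonsense_penalty(states))
                   for states, score in beam),
                  key=lambda pair: pair[1], reverse=True)

beam = [
    (["ball in hands", "player in air", "ball destroyed"], -1.0),  # model's top guess
    (["ball in hands", "player in air", "ball in hoop"],   -1.3),
]
for states, score in rescore(beam):
    print(score, states)
```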

References:
1. Cui et al. (2018), Learning to Evaluate Image Captioning
2. Dalvi, Tandon, & Clark: ProPara task: data.allenai.org/propara
3. Deep neural networks are easily fooled: evolvingai.org/fooling
4. Grosof, Benjamin, et al. (2014), "Automated decision support for financial regulatory/policy compliance, using textual Rulelog," Financial Times
5. Kang et al. (2018), AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples
6. Li, Yitong, Timothy Baldwin, and Trevor Cohn (2018), "What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training," arXiv preprint arXiv:1805.06088
7. Shah et al. (2019), Cycle-Consistency for Robust Visual Question Answering
8. Tandon et al.: Commonsense tutorial at CIKM 2017     (2F34)

Resources     (2G)

Here are some ideas for a working synthesis outline for Explanations (Gary Berg-Cross)     (2G1)

1. Meaning of Explanation [An explanation is the answer to the question "Why?" as well as the answers to follow-up questions such as "Where do I go from here?"] – there is a range of these     (2G2)

  • Grosof: deductive proof, with a formal knowledge representation (KR), is the gold standard, but there are many types with different representations     (2G3)
    • E.g., natural deduction, as in high-school geometry; there are also probabilistic forms     (2G3A)
  • Causal model Explanations     (2G4)

There is a range of concepts related to explanation     (2G5)

Trending-Up concepts of explanation     (2G11)

  • Influentiality – the effect of heavily weighted hidden nodes and edges     (2G12)
  • Reconstruction – simpler / easier-to-comprehend model     (2G13)
  • Lateral relevance – interactivity for exploration     (2G14)
  • Affordance of Conversational human-computer interaction (HCI)     (2G15)
  • Good explanations quickly get into the issue of understanding     (2G16)
    • What does it mean to understand, follow, and explain a set of instructions?     (2G16A)

2. Problems and issues     (2G17)

  • From GOFAI     (2G18)
    • An early goal of AI was to teach/program computers with enough factual knowledge about the world so that they could reason about it in the way people do.     (2G18A)
  • Early AI demonstrated that the nature and scale of the problem made it difficult.     (2G19)
  • People seemed to need a vast store of everyday knowledge for common tasks. A variety of background knowledge was needed to understand and explain.     (2G20)
  • Is there a small, common ontology that we mostly all share for representing and reasoning about the physical world?     (2G21)
  • Additional Aspects/Modifiers of explanation:     (2G22)
  1. It remains challenging to design and evaluate a software system that represents commonsense knowledge and that can support reasoning (such as deduction and explanation) in everyday tasks (evidence from modified Physical Turing Tests).     (2G23)

3. From XAI     (2G25)

  1. Bridging from sub-symbolic to symbolic - ontologies help constrain options     (2G28)

4. Application areas     (2G29)

  1. Examples of successes? Rulelog’s Core includes Restraint bounded rationality     (2G32)

5. Relevance and relation to context     (2G33)

6. Synergies with commonsense reasoning     (2G35)

7. Success stories/systems     (2G37)

  1. Issues Today in the Field of Explanation /Questions     (2G39)
  • How do we evaluate these ontologies supporting explanations and commonsense understanding?     (2G40)
  • How are these explanations ontologies related to existing upper ontologies?     (2G41)

8. Conclusions     (2G42)

  • Benefits of Explanation (Grosof)     (2G43)
    • Semi-automatic decision support     (2G43A)
    • Might lead to fully-automatic decision making – e.g., in deep deduction about policies and legal issues, especially in business and medicine.     (2G43B)
    • Useful for Education and training, i.e., e-learning – E.g., Digital Socrates concept by Janine Bloomfield of Coherent Knowledge     (2G43C)
    • Accountability • Knowledge debugging in KB development     (2G43D)
    • Trust in systems – Competence and correctness – Ethicality, fairness, and legality     (2G43E)
    • Supports Human-machine interaction and User engagement (see Sowa also)     (2G43F)
    • Supports Reuse / and guide choice for transfer of knowledge     (2G43G)

9. Contemporary Issues     (2G44)

  • Confusion about concepts – Esp. among non-research industry and media – But needs to be addressed first in the research community     (2G45)
  • Mission creep, i.e., expansivity of task/aspect – Esp. among researchers. E.g., IJCAI-18 workshop on explainable AI.     (2G46)
  • Ignorance of what’s already practical – E.g., in deep policy/legal deduction for decisions: full explanation of extended logic programs, with NL generation and interactive drill-down navigation – E.g., in cognitive search: provenance and focus and lateral relevance, in extended knowledge graphs     (2G47)
  • Disconnect between users and investors     (2G48)
  • (Ignorance of past relevant work)     (2G49)

Previous Meetings     (2H)


Next Meetings     (2I)