Improving Machine Learning using Background Knowledge

Purpose

Use of Ontologies to Improve Machine Learning Techniques and Results

Organized by Mike Bennett and Andrea Westerinen

Machine Learning (ML) is based on defining and using mathematical models to perform tasks, predict outcomes, make recommendations, etc. Initial models can be specified by a data scientist, and/or constructed through combinations of supervised and unsupervised learning and pattern analysis. However, it has been noted that if no background knowledge is employed, the ML results may not be understandable [1, 2]. Also, there is a bewildering array of model choices and combinations. Background knowledge could improve the quality of ML results by using reasoning techniques to select learning models and prepare the training and examined data [3] (reducing large, noisy data sets to manageable, focused ones).
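As a rough illustration of the last point, the sketch below uses a tiny, hypothetical subclass hierarchy to cut a mixed data set down to the records that fall under one concept of interest. The taxonomy, the concept names, and the `is_a` and `focus` helpers are all assumptions made for this example; they are not taken from any of the referenced papers or from a real ontology.

```python
# Hypothetical toy taxonomy: child concept -> parent concept.
# A real ontology would be loaded from a file and queried with a reasoner.
SUBCLASS_OF = {
    "CorporateBond": "Bond",
    "Bond": "DebtInstrument",
    "Loan": "DebtInstrument",
    "DebtInstrument": "FinancialInstrument",
    "CommonStock": "Equity",
    "Equity": "FinancialInstrument",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is a transitive subclass of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)  # walk one step up the hierarchy
    return False


def focus(records, ancestor):
    """Keep only the records whose 'concept' tag falls under `ancestor`."""
    return [r for r in records if is_a(r["concept"], ancestor)]


# Made-up records standing in for a large, noisy training set.
records = [
    {"id": 1, "concept": "CorporateBond"},
    {"id": 2, "concept": "CommonStock"},
    {"id": 3, "concept": "Loan"},
    {"id": 4, "concept": "Weather"},
]

debt_only = focus(records, "DebtInstrument")
print([r["id"] for r in debt_only])  # [1, 3]
```

The point of the sketch is only that the subclass reasoning, not a hand-written list of labels, decides which records reach the learner.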

The objective of this Ontology Summit 2017 track is to understand:

Challenges in using different kinds of background knowledge in machine learning
Role of ontologies, vocabularies and other resources to improve machine learning results
Design/construction/content/evolution/... requirements for an ontology to support machine learning

We explore the problem space in the first Track B session on March 15, by way of two presentations: one on learning for decision support and one on using domain ontologies to improve document analysis for situational awareness and forensic investigation. These presentations discuss the challenges in their respective areas and how the use of ontologies improves the results.

In the case of decision support, the benefits of combining ontologies with ML include improving the quality of decisions, making decisions understandable, and adapting the decision-making processes in response to changing conditions. In the case of digital forensics and situational awareness, concept extraction from natural language text is improved by using an ontology to isolate the meanings/semantics of the concepts and provide “artificial intuition” into the text. For example, a financial ontology (e.g., FIBO and its extensions) could be used for financial regulatory compliance processing, or a legal ontology for the analysis of lawsuits.
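To make the concept-extraction idea concrete, here is a minimal sketch in which an ambiguous term is mapped to a concept according to the context words around it. The lexicon, the `fin:` and `law:` identifiers, and the overlap-counting rule are hypothetical placeholders chosen for this example; they are not FIBO identifiers and not the method used by either presenter.

```python
import re

# Hypothetical mini-lexicon: surface term -> candidate concepts, each with
# a set of context cues. The "fin:" and "law:" names are made up for
# illustration and do not come from FIBO or any real legal ontology.
LEXICON = {
    "bond": {
        "fin:Bond": {"issuer", "coupon", "maturity", "yield"},
        "law:BailBond": {"court", "defendant", "bail", "custody"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_concepts(text):
    """Return (term, concept) pairs, choosing the concept whose cues best match the context."""
    tokens = tokenize(text)
    context = set(tokens)
    found = []
    for tok in tokens:
        candidates = LEXICON.get(tok)
        if not candidates:
            continue
        # Pick the candidate sharing the most cue words with the sentence.
        best = max(candidates, key=lambda c: len(candidates[c] & context))
        if candidates[best] & context:  # skip terms with no supporting context
            found.append((tok, best))
    return found


print(extract_concepts("The issuer repaid the bond at maturity."))
print(extract_concepts("The defendant posted bond after the court hearing."))
```

In this toy version the first sentence resolves "bond" to the financial concept and the second to the legal one, while a sentence with no supporting cues yields no concept at all.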

The goal is to start a conversation on the kinds of ontologies needed for ML and NLP, and on their "ground rules" and requirements. A recurring question in this and comparable problem spaces is where the human fits in the loop and what the human does.

In the second Track B session on April 12, we continue our earlier exploration of using ontologies to improve machine learning understandability and natural language processing. We have presentations on Meaning-Based ML, discussing how to get meaningful output from existing machine learning techniques, and on the use of FIBO and corporate taxonomies to extract and integrate information in data warehouses, operational stores and natural language communications.

References

[1] Guo, Yunsong, and Selman, Bart. "ExOpaque: A Framework to Explain Opaque Machine Learning Models Using Inductive Logic Programming". Retrieved from http://www.cs.cornell.edu/~guoys/publications/ExOpaqueICTAI07.pdf.

[2] Falk, Courtney, and Stuart, Lauren. "Meaning-based machine learning for information assurance". Retrieved from http://www.sciencedirect.com/science/article/pii/S2352664516300207.

[3] Domingos, Pedro. "A Few Useful Things to Know about Machine Learning". Retrieved from https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf.
