OntologySummitProcess and Talk:ConferenceCall 2016 03 31: Difference between pages

Ontolog Forum

(Difference between pages)
imported>KennethBaclawski
(Created page with "== Ontology Summit Process == The Ontology Summit is an organized thinking machine that works from January to April every year to brain-storm a topic of interest for the onto...")
 
imported>Garybc
 
== Ontology Summit Process ==
Both syntactic and data structure-based interoperability among systems and applications have been worked on in recent decades. There is a growing general belief that we are converging on "standards" for interoperability components, including catalogs, vocabularies, services and information models. Along with this movement comes the recognition that all such efforts require some degree of semantic agreement and supporting techniques. Increasingly these, in turn, draw some degree of support from formal ontologies and related activities.


The Ontology Summit is an organized thinking machine that works from January to April every year to brain-storm a topic of interest for the ontology engineering community. Every Ontology Summit produces a Communiqué with the results of the brain-storming and discussion. Another important outcome is the synchronization of understanding of the topic of interest by the ontology engineering community.
= Some Items and Synthesis from our two sessions on Semantic Interoperability (SI) in the Earth Sciences =
Big Science, Big Data and Big Industry provide many motivating challenges to achieve better system and data interoperability. The range of systems, data & semantic content is now broad, but it increasingly has to be integrated to be of use to Science & Society. Improved & integrated semantics is part of the resulting effort in the Geo and Earth Sciences, supported by programs such as NSF's EarthCube.
== The Status of Ontologies in the Geo-Earth Sciences==


The Summit Communiqué is endorsed by Summit participants as well as the broader community and then published in the Applied Ontology journal, the premier journal in the field of Formal Ontology for information systems. The Communiqués are openly available on this wiki as well as in the public-facing website for each Ontology Summit. See the list of past Communiqués at [[OntologySummit]]. Each Communiqué has the co-chairs and track champions as editors. Producing the Communiqué is the final stage of each year's Summit.
* There are quite a few "ontologies" developed along the spectrum of semantic formality, but also comprehensiveness and completeness.
** See [https://docs.google.com/presentation/d/136Nu5rYogEakVVD7tvGdx7Z_u_FpTsdooNIp9uuInEo/edit?usp=sharing]
A classic and somewhat legacy effort is NASA's [https://sweet.jpl.nasa.gov/ SWEET (Semantic Web for Earth and Environmental Terminology)] ontology, with about 6000 concepts in over 200 separate, modular ontologies. SWEET can provide some basis for semantic tagging; however, it has few axioms to support reasoning and needs to be supplemented whenever it is used for advanced purposes. Such ontologies build on community efforts to develop standard vocabularies within domains to support data and system interoperability. It is generally recognized that these efforts lack formal semantics and that ontologies capturing domain understanding can help to address this limitation.


Most of the Ontology Summit is virtual, consisting of both synchronous and asynchronous processes.  The synchronous process consists of weekly virtual meetings.  The asynchronous process consists of a collection of mailing lists.
==Some General Problems ==
Despite the increasing number and quality of ontologies, there is still what has been graphically described as a "semantic mess."
Domain information is heterogeneous and described in:
# multiple schemas,
# different vocabularies & markup languages and
# ontologies with different levels of granularity in the data & different conceptualizations.


The Ontology Summit concludes with a 2-day face-to-face Workshop and Symposium. The Ontology Summit as a whole has general co-chairs, and the Symposium has its own co-chairs.
In the Earth Sciences we probably don't have the "right" mix of ontologies needed to routinely solve these challenges and systematically achieve interoperability across domains, although we have made some progress with some modules & a reference ontology in a small fraction of the domain space.
Within this imperfect but growing collection of ontologies and controlled-vocabulary efforts, we often use them sloppily and informally, and we don't have adequate mappings between concepts.
Part of this is because there is no full agreement on well-founded integration techniques. One cannot just glue and stitch together a very large, all-encompassing master ontology.
In practice interoperability is difficult to achieve, even with the help of ontologies, since applications across domains utilize information in different ways, and the knowledge/ontology conceptualizations and representation formalisms inherent in or used explicitly by these applications can also differ.
There is general agreement that we lack small, ontological building blocks, and there does seem to be growing interest in exploring modular, incremental approaches, early agreement and conceptualizations, and well-crafted reference ontologies.


Each year's topic is decomposed into 3-7 facets/aspects that are discussed intensively and separately. Each such facet is called a "track". Summit track discussions are facilitated by Track Champions and have a wiki page "Track Synthesis" as the main outcome artifact. The Track Synthesis page is edited by track champions on the basis of:
== '''Technical Challenges'''==
* The threaded discussion of the Community of Summit participants.
* Existing GeoScience standards, ontologies, models and associated terminologies such as shown in the earlier linked Figure (above) were typically developed in isolation.
* Postings by the community of Summit participants to the Track Inputs wiki page.
* Vocabulary harmonization is a bottleneck and is impeded by lack of reference ontologies which might resolve heterogeneity.
* Results of the weekly track sessions.
===Standards===
* Results of online polls, Delphi studies, surveys, questionnaires and other thought-provoking techniques.
To a large degree we have been converging on the lower and middle levels of what is needed for interoperability: common protocols and data formats to ensure the proper exchange of data.
There is some belief that we can ensure perfect syntactic interoperability, e.g., via rigid standardization. But when we consider the higher, semantic level & semantic interoperability, we rely on a common understanding of the messaging and exchanged data, i.e., meaning remains invariant during the exchange between multiple systems.
This requires common reference systems, and the problem is that:
** Various types of standards that do exist are, for the most part, heterogeneous, meaning they:
*** are mostly fragmented and disconnected, describing potentially relatable concepts
*** lack a grounding in foundational semantics.
*** may use the same or similar terms but with differences in semantics.
*** are described using different formal (or non-formal) languages.  


The Track champions also serve as co-editors for the Communiqué that captures the major ideas of the Track Synthesis pages of all Tracks.
As a result, major problems arise when standards-driven efforts and products are combined, reflecting differences in conceptualization and semantic drift when new problems are formulated.


The General Summit co-chairs are ultimately responsible for the thematic integrity of all Tracks and the Workshop and Symposium.
* Some glue, such as a reference ontology, is needed to integrate and harmonize these.


Every track has one or more online 2-hour sessions that comprise 3-6 short slide presentations on track topics. These sessions also have a group chat and a Q&A session with the panel of presenters. Audio, slides, chat transcripts and survey results will be published and used by track champions to prepare the Track Synthesis. Each session will usually have two co-chairs. The session co-chairs are responsible for inviting leading experts in areas relevant to the session topic as speakers. The session co-chairs are also responsible for facilitating the session chat and moderating the discussion among the panelists and other session participants.
*Upper-level and many domain ontologies are important to such SI but there are challenges:
**There seems to be no one taxonomic hierarchy we can agree on.
** Many of the upper and domain ontologies are hard to understand: they have too many terms, are too abstract, or carry axioms that are too complicated, and yet they remain too far from real data.


The Ontology Summit Organizing Committee consists of track champions, workshop and symposium co-chairs, lead co-editors, and general Summit co-chairs. The Organizing Committee also has representatives of sponsoring organizations, such as Ontolog, NIST, NCOR, NCBO, IAOA, NCO_NITRD. The Ontology Summit should also have a public relations organizer.
*** Or they impose ontological commitments that may not be acceptable to all interested parties who have 'local' vocabularies and meanings.


All the logistics and overall organizing work in the virtual part of Summit (i.e. the tracks and virtual part of Workshop and Symposium) is coordinated by a designate from the Ontolog Forum that acts on behalf of and for the Organizing Committee. The same organization/logistics function for the real part of Workshop and Symposium is similarly provided by a coordinator designated by the Organizing Committee.
** Like AI systems in the past they may also be too brittle - small changes are not easily incorporated or compromise the semantics.


Thus ideas on the current Ontology Summit topic flow from:
== There are many '''Social/non-technical issues''' including: ==
* Ontology Summit discussion list
* no agreement or controlling body or process to coordinate efforts and/or to validate ontologies and their axioms
* Track Session presentations, panel discussions and chats; surveys and polls, Delphi studies
* How do we verify and validate these structures (ontology efficacy)? (i.e., if an ontology is created to do some thing, x, who verifies that it actually does x?)
* Track Input pages
* Who owns the ontology once it is published?
* Track Synthesis pages
* Who maintains the ontology once it's released into the wild, i.e., published or... portaled?
* Workshop and Symposia final discussion
* Where do we put Earth science ontologies (or semantic models; the word ontology has kind of lost its meaning) once they have been created?
* Published Communiqué
**  e.g. LOV, ontology repositories
** ESIP, BioPortal and OntoHub/OOR
* How do we handle:
** inability/unwillingness to participate,
** fear of unanticipated cost, worry with major changes in their local system,
** skepticism about scalability


== Infrastructure ==
== Tool Issues ==


The Ontology Summit is a complex process that requires a supporting infrastructure.  This infrastructure continually evolves as new technology is made available.  The main current technologies are:
* There is a need for concept search supported by conceptual similarity like SEM+
* File-Sharing Workspace / Document Archives are on Amazon S3.  This can be accessed using an S3 client.  For example, the S3 Browser is available for free from [http://s3browser.com/ s3browser.com].  Access requires credentials.  These can be obtained on request from [[KenBaclawski|Ken Baclawski]].  This workspace includes:
* Agent Brokering employs central mechanisms to help resolve such things as disparate vocabularies, support data distribution requests, enforce translatable standards and to enable uniformity of search and access in heterogeneous operating environments.
** The slides of the track session and symposium presenters
** When searching for data, current semantic brokers still yield:
** The audio recordings of the track sessions and symposia
#Invalid content or responses
** Archives of the mail lists
# Unidentifiable document types
** Backups of this wiki
# Empty metadata elements, especially required elements
* The Ontolog Community wiki
* We need integrated conceptual modeling and KE tools to build and bridge ontologies.
** Hosted on Digital Ocean by [[KenBaclawski|Ken Baclawski]]
** Openly accessible to anyone in the community
* The Ontolog Community mailing lists
** Hosted on googlegroups.com


== Acknowledgments ==
=Solutions=
===Ontology Design Patterns ===
One solution approach is the Ontology Design Pattern (ODP), a potentially reusable solution to a frequently occurring modeling problem in the domain. The idea is to use a simple pattern, leveraging concrete domain notions for which there are data, as a building block of a more complex ontology that adds commitments as needed.
 
To handle different interpretations we make an appropriate mapping/alignment between the "local" vocabulary and the pattern. (Local ontology development is the glue.)
A (local) view of the pattern allows us to connect a data source and the patterns via a specific and explicit mapping.
 
This "local view" employs a very minimalistic schema (class names, property names, simple domain and range axioms). To facilitate this we
separate "core conceptualization" from "nomenclature" issues. That is,
vocabulary terms in a local view may be data repository-specific and need not be the same as the terms used in a pattern.
Mappings from data to the pattern can be expressed as rules that help populate the patterns.
Data providers can populate the global schema (pattern collection) by
simply populating a local view.
 
===  Reference Ontologies ===
Supplementing this is the use of a Reference Ontology.
 
This supports semantic integration between existing domain ontologies and schemas but requires:
* Translation between ontology languages.
* More rigorous specification of the semantics in each ontology.
* And perhaps deeper semantics
 
Such integration can currently be done only by manually integrating the ontologies, but use of a suitable reference ontology may automate this.
 
=== Common conceptual models are needed, and organizing ontologies like VIVO can also be useful. This allows: ===
*  some chance of  leveraging of  existing ontologies  to reduce modeling effort
* constraints on ontology needs & possibilities by using information about
** Particular entity types & relationships
** Significant legacy dependencies


This page represents an attempt to capture the Ontology Summit process.  The original document was produced by
[[AnatolyLevenchuk]] for the [[OntologySummit2012|Ontology Summit 2012]].  The current version was produced by
[[KenBaclawski|Ken Baclawski]].  Additional input is always welcome.


[[Category:OntologySummit]]
[[Category:OntologySummit2016]]

Latest revision as of 14:15, 26 April 2016

Both syntactic and data structure-based interoperability among systems and applications have been worked on in recent decades. There is a growing general belief that we are converging on "standards" for interoperability components, including catalogs, vocabularies, services and information models. Along with this movement comes the recognition that all such efforts require some degree of semantic agreement and supporting techniques. Increasingly these, in turn, draw some degree of support from formal ontologies and related activities.

Some Items and Synthesis from our two sessions on Semantic Interoperability (SI) in the Earth Sciences

Big Science, Big Data and Big Industry provide many motivating challenges to achieve better system and data interoperability. The range of systems, data & semantic content is now broad, but it increasingly has to be integrated to be of use to Science & Society. Improved & integrated semantics is part of the resulting effort in the Geo and Earth Sciences, supported by programs such as NSF's EarthCube.

The Status of Ontologies in the Geo-Earth Sciences

  • There are quite a few "ontologies" developed along the spectrum of semantic formality, but also comprehensiveness and completeness.

A classic and somewhat legacy effort is NASA's [https://sweet.jpl.nasa.gov/ SWEET (Semantic Web for Earth and Environmental Terminology)] ontology, with about 6000 concepts in over 200 separate, modular ontologies. SWEET can provide some basis for semantic tagging; however, it has few axioms to support reasoning and needs to be supplemented whenever it is used for advanced purposes. Such ontologies build on community efforts to develop standard vocabularies within domains to support data and system interoperability. It is generally recognized that these efforts lack formal semantics and that ontologies capturing domain understanding can help to address this limitation.

Some General Problems

Despite the increasing number and quality of ontologies, there is still what has been graphically described as a "semantic mess." Domain information is heterogeneous and described in:

  1. multiple schemas,
  2. different vocabularies & markup languages and
  3. ontologies with different levels of granularity in the data & different conceptualizations.

In the Earth Sciences we probably don't have the "right" mix of ontologies needed to routinely solve these challenges and systematically achieve interoperability across domains, although we have made some progress with some modules & a reference ontology in a small fraction of the domain space. Within this imperfect but growing collection of ontologies and controlled-vocabulary efforts, we often use them sloppily and informally, and we don't have adequate mappings between concepts. Part of this is because there is no full agreement on well-founded integration techniques. One cannot just glue and stitch together a very large, all-encompassing master ontology. In practice interoperability is difficult to achieve, even with the help of ontologies, since applications across domains utilize information in different ways, and the knowledge/ontology conceptualizations and representation formalisms inherent in or used explicitly by these applications can also differ. There is general agreement that we lack small, ontological building blocks, and there does seem to be growing interest in exploring modular, incremental approaches, early agreement and conceptualizations, and well-crafted reference ontologies.

Technical Challenges

  • Existing GeoScience standards, ontologies, models and associated terminologies such as shown in the earlier linked Figure (above) were typically developed in isolation.
  • Vocabulary harmonization is a bottleneck and is impeded by lack of reference ontologies which might resolve heterogeneity.

Standards

To a large degree we have been converging on the lower and middle levels of what is needed for interoperability: common protocols and data formats to ensure the proper exchange of data. There is some belief that we can ensure perfect syntactic interoperability, e.g., via rigid standardization. But when we consider the higher, semantic level & semantic interoperability, we rely on a common understanding of the messaging and exchanged data, i.e., meaning remains invariant during the exchange between multiple systems. This requires common reference systems, and the problem is that:

    • Various types of standards that do exist are, for the most part, heterogeneous, meaning they:
      • are mostly fragmented and disconnected, describing potentially relatable concepts
      • lack a grounding in foundational semantics.
      • may use the same or similar terms but with differences in semantics.
      • are described using different formal (or non-formal) languages.

As a result, major problems arise when standards-driven efforts and products are combined, reflecting differences in conceptualization and semantic drift when new problems are formulated.

  • Some glue, such as a reference ontology, is needed to integrate and harmonize these.
  • Upper-level and many domain ontologies are important to such SI but there are challenges:
    • There seems to be no one taxonomic hierarchy we can agree on.
    • Many of the upper and domain ontologies are hard to understand: they have too many terms, are too abstract, or carry axioms that are too complicated, and yet they remain too far from real data.
      • Or they impose ontological commitments that may not be acceptable to all interested parties who have 'local' vocabularies and meanings.
    • Like AI systems in the past they may also be too brittle - small changes are not easily incorporated or compromise the semantics.

There are many Social/non-technical issues including:

  • no agreement or controlling body or process to coordinate efforts and/or to validate ontologies and their axioms
  • How do we verify and validate these structures (ontology efficacy)? (i.e., if an ontology is created to do some thing, x, who verifies that it actually does x?)
  • Who owns the ontology once it is published?

  • Who maintains the ontology once it's released into the wild, i.e., published or... portaled?

  • Where do we put Earth science ontologies (or semantic models; the word ontology has kind of lost its meaning) once they have been created?
    • e.g. LOV, ontology repositories
    • ESIP, BioPortal and OntoHub/OOR
  • How do we handle:
    • inability/unwillingness to participate,
    • fear of unanticipated cost, worry with major changes in their local system,
    • skepticism about scalability

Tool Issues

  • There is a need for concept search supported by conceptual similarity like SEM+
  • Agent Brokering employs central mechanisms to help resolve such things as disparate vocabularies, support data distribution requests, enforce translatable standards and to enable uniformity of search and access in heterogeneous operating environments.
    • When searching for data, current semantic brokers still yield:
  1. Invalid content or responses
  2. Unidentifiable document types
  3. Empty metadata elements, especially required elements
  • We need integrated conceptual modeling and KE tools to build and bridge ontologies.
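The three broker failure modes listed above can be screened for mechanically before results reach a user. The following is only a minimal sketch: the required element names and the set of recognized formats are assumptions made up for illustration, not taken from any particular broker or metadata standard.

```python
# Assumed, illustrative configuration: not drawn from any real broker or standard.
REQUIRED_ELEMENTS = ("title", "identifier", "format")
KNOWN_FORMATS = {"text/xml", "application/json", "text/csv"}


def check_record(record):
    """Return a list of problems found in one broker result record."""
    # 1. Invalid content or responses: anything that is not a metadata mapping.
    if not isinstance(record, dict):
        return ["invalid content or response"]

    problems = []

    # 2. Unidentifiable document types: a format is given but not recognized.
    fmt = record.get("format")
    if fmt and fmt not in KNOWN_FORMATS:
        problems.append("unidentifiable document type")

    # 3. Empty metadata elements, checked here only for required elements.
    for element in REQUIRED_ELEMENTS:
        value = record.get(element)
        if value is None or str(value).strip() == "":
            problems.append(f"empty required element: {element}")

    return problems


good = {"title": "Stream gauge data", "identifier": "x-1", "format": "text/csv"}
bad = {"title": " ", "identifier": "x-2", "format": "image/unknown"}
report = {"good": check_record(good), "bad": check_record(bad), "none": check_record(None)}
```

A check like this only flags symptoms; it does not resolve the vocabulary differences behind them.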

Solutions

Ontology Design Patterns

One solution approach is the Ontology Design Pattern (ODP), a potentially reusable solution to a frequently occurring modeling problem in the domain. The idea is to use a simple pattern, leveraging concrete domain notions for which there are data, as a building block of a more complex ontology that adds commitments as needed.

To handle different interpretations we make an appropriate mapping/alignment between the "local" vocabulary and the pattern. (Local ontology development is the glue.) A (local) view of the pattern allows us to connect a data source and the patterns via a specific and explicit mapping.

This "local view" employs a very minimalistic schema (class names, property names, simple domain and range axioms). To facilitate this we separate "core conceptualization" from "nomenclature" issues. That is, vocabulary terms in a local view may be data repository-specific and need not be the same as the terms used in a pattern. Mappings from data to the pattern can be expressed as rules that help populate the patterns. Data providers can populate the global schema (pattern collection) by simply populating a local view.
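The local-view idea can be illustrated with a small sketch. Everything named here is hypothetical: the `pattern:` terms stand in for a shared ontology design pattern, and the field names stand in for one repository's local vocabulary. The explicit mapping between the two is the only thing a data provider would need to supply.

```python
# Hypothetical pattern term, invented for this illustration.
PATTERN_CLASS = "pattern:Sample"

# Explicit mapping from one repository's local field names to pattern terms.
# Both sides are assumptions; a real mapping would come from the data provider.
LOCAL_TO_PATTERN = {
    "site_name": "pattern:collectedAt",
    "depth_m": "pattern:hasDepth",
}


def populate_pattern(records, mapping, id_field="id"):
    """Translate local-view records into (subject, predicate, object) triples."""
    triples = []
    for record in records:
        subject = f"local:{record[id_field]}"
        triples.append((subject, "rdf:type", PATTERN_CLASS))
        for local_name, value in record.items():
            if local_name == id_field:
                continue
            pattern_term = mapping.get(local_name)
            if pattern_term is None:
                continue  # local fields with no mapping are left out of the pattern
            triples.append((subject, pattern_term, value))
    return triples


local_records = [
    {"id": "s1", "site_name": "Ridge A", "depth_m": 120, "curator": "n/a"},
]
triples = populate_pattern(local_records, LOCAL_TO_PATTERN)
```

The local names never have to match the pattern's names; only the mapping table changes from one repository to the next.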

Reference Ontologies

Supplementing this is the use of a Reference Ontology.

This supports semantic integration between existing domain ontologies and schemas but requires:

  • Translation between ontology languages.
  • More rigorous specification of the semantics in each ontology.
  • And perhaps deeper semantics

Such integration can currently be done only by manually integrating the ontologies, but use of a suitable reference ontology may automate this.
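One way to picture the automation a reference ontology might offer: if each local vocabulary is mapped once to shared reference concepts, candidate cross-vocabulary alignments can be derived by joining on those concepts. The sketch below assumes such mappings already exist; the vocabulary names, terms and `ref:` concepts are all invented for illustration, and it ignores the deeper semantic checks listed above.

```python
from collections import defaultdict


def align_via_reference(*vocab_mappings):
    """Group local terms from several vocabularies by shared reference concept.

    Each argument is a (vocabulary_name, {local_term: reference_concept}) pair.
    Only reference concepts reached from more than one local term are returned.
    """
    by_concept = defaultdict(set)
    for vocab_name, mapping in vocab_mappings:
        for local_term, ref_concept in mapping.items():
            by_concept[ref_concept].add((vocab_name, local_term))
    return {c: terms for c, terms in by_concept.items() if len(terms) > 1}


# Hypothetical local vocabularies mapped to hypothetical reference concepts.
hydro = ("hydro", {"stream": "ref:Watercourse", "gauge": "ref:Sensor"})
geo = ("geo", {"river": "ref:Watercourse", "borehole": "ref:Well"})

aligned = align_via_reference(hydro, geo)
```

Here "stream" and "river" are proposed as aligned because both map to the same reference concept; whether they are truly equivalent would still need review.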

Common conceptual models are needed, and organizing ontologies like VIVO can also be useful. This allows:

  • some chance of leveraging of existing ontologies to reduce modeling effort
  • constraints on ontology needs & possibilities by using information about
    • Particular entity types & relationships
    • Significant legacy dependencies