Ontolog Forum

Ontology Summit 2012: (Track-4) "Large-scale domain applications" Synthesis

Mission Statement

This track will help to ground the discussions in the other tracks and bring key challenges to light by describing current large-scale systems and systems of systems that either use, or could use, ontologies in their deployment. "Large-scale" can mean very large data sets, very complex data sets, federated systems, highly distributed systems, or real-time, continuous data systems. Examples of large data sets include scientific observations and studies; complex data sets could be technical data packages for manufactured products or electronic health records; federated systems could include information sharing to combat terrorism; highly distributed systems include the smart electrical grid (aka Smart Grid); and real-time systems include network management systems. Of course, some big systems might include all five aspects.

see also: OntologySummit2012_Applications_CommunityInput


In implemented systems, ontologies are...

  • Strong for:
    • Supporting change and aggregation
    • Enabling community aggregation and annotation
    • Automated data ingestion
    • Data validation
    • Ensuring consistency of terms across many data sets (Distributed systems)
    • Supporting reasoning
    • Self-describing systems
    • Systems with many complex constraints, rules, laws, with frequent changes (Dynamically changing systems)
    • Data mining / semantic signature extraction
    • Rapid system building
  • Weak for:
    • Being understandable by software engineers and customers
    • Query performance (compared to relational databases)
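
As an illustration of the "data validation" and "consistency of terms across many data sets" strengths listed above, here is a minimal Python sketch. It assumes a small shared vocabulary that maps field names used by different data sets onto one preferred term. The vocabulary contents and the names `VOCABULARY`, `build_lookup` and `normalize_record` are hypothetical and do not come from any system described in this track.

```python
# Hypothetical shared vocabulary: preferred term -> field names seen in the wild.
VOCABULARY = {
    "temperature": {"temp", "air_temperature"},
    "pressure": {"press", "barometric_pressure"},
}


def build_lookup(vocabulary):
    """Map every known field name (lower-cased) to its preferred term."""
    lookup = {}
    for preferred, synonyms in vocabulary.items():
        for name in synonyms | {preferred}:
            lookup[name.lower()] = preferred
    return lookup


def normalize_record(record, lookup):
    """Rename known fields to preferred terms and report the unknown ones.

    Returns a pair (normalized_record, unknown_fields).
    """
    normalized = {}
    unknown = []
    for field, value in record.items():
        preferred = lookup.get(field.lower())
        if preferred is None:
            unknown.append(field)  # validation failure: term not in vocabulary
        else:
            normalized[preferred] = value
    return normalized, unknown


lookup = build_lookup(VOCABULARY)
# Two records from different (made-up) data sets using different field names.
record_a, unknown_a = normalize_record({"Temp": 21.5, "press": 1013}, lookup)
record_b, unknown_b = normalize_record({"air_temperature": 19.0, "humidity": 40}, lookup)
```

Both records end up keyed by the same preferred terms, and the field that the vocabulary does not cover is reported instead of being silently ingested. A real deployment would typically hold the vocabulary in an ontology or terminology service and not in a Python dictionary.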

Needs

  • Need better standards for common elements:
    • Datatypes
    • Ontology patterns (e.g. whole/part patterns)
    • Collect ontological primitives from observation data
  • Need repositories
    • Repositories of ontological patterns could be more useful than repositories of ontologies
  • Need industrial-strength semantic services resident in the cloud
  • Need better visualization tools and approaches
  • Need better tools to help interpret legacy systems and transform them into semantic systems
  • Need to establish feedback mechanisms from end users to ontology designers directly from point of use.

Recommendations

  • Look for the 80-20 rule of semantic development
  • Use well-defined and narrow use cases to demonstrate benefits of semantic approaches
  • Having explicit vocabularies (classifiers) is a must in a distributed system
  • Community should be included in the development and evolution of vocabularies
  • It is critical to capture and evolve domain knowledge in a form that the community is comfortable with
  • Transition from implicit domain knowledge to explicit encoding requires community consensus - and an organization to manage the consensus
  • Some have recommended exposing users to SKOS semantics; use more complicated constructs only on back end if necessary.
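
To make the SKOS recommendation above more concrete, here is a minimal sketch of what "SKOS-level" semantics can look like to a user: concepts with a preferred label and a broader concept, and nothing more. The property names mirror `skos:prefLabel` and `skos:broader` from the SKOS vocabulary. The concept identifiers, the dictionary layout and the helper `broader_transitive` are assumptions made for illustration only.

```python
# Hypothetical concept scheme using only two SKOS-style properties.
CONCEPTS = {
    "ex:SmartMeter": {"prefLabel": "Smart meter", "broader": ["ex:Meter"]},
    "ex:Meter": {"prefLabel": "Meter", "broader": ["ex:GridDevice"]},
    "ex:GridDevice": {"prefLabel": "Grid device", "broader": []},
}


def broader_transitive(concept_id, concepts):
    """Return all ancestors of a concept by following 'broader' links."""
    seen = []
    stack = list(concepts[concept_id]["broader"])
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles and repeated ancestors
        seen.append(current)
        stack.extend(concepts[current]["broader"])
    return seen


ancestors = broader_transitive("ex:SmartMeter", CONCEPTS)
labels = [CONCEPTS[c]["prefLabel"] for c in ancestors]
```

Users only ever see labels and a broader/narrower hierarchy. Richer constructs, such as class restrictions or rules, could then be layered on in the back end without changing what users are exposed to.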

Other Observations / Lessons learned

  • UML to OWL is a common requirement for legacy systems
    • Starting from scratch is rare.
  • Ontology patterns are very helpful, and encourage model reuse
  • Semantic techniques work best when not compromised by implementation tradeoffs
  • Semantic methods are faster to implement and easier to maintain
  • Semantic approaches are particularly suited to systems with many complex constraints, rules, laws, with frequent changes
  • Incremental implementation is possible through federation of datastores
  • Ontologies are not always applied to enable reasoners - sometimes just as a more rigorous data modeling approach
  • Engineers turned ontologists often don't have the necessary background/skills
  • Existing infrastructure supports traditional software development far better than large-scale ontology development
  • There are many ontologies of dubious quality
  • Service-oriented architectures allow separation of code and ontology updates
  • Reasoner and query engine performance is highly dependent upon the exact formulation of rules and queries
  • No single technology/tool currently provides the best solution across all large system use cases
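
The observation above that query engine performance depends on the exact formulation of queries can be illustrated with a toy example. The sketch below is a deliberately naive nested-loop matcher over triples, not a model of how any particular reasoner or query engine works. The data and the names `match` and `evaluate` are hypothetical. It evaluates the same two triple patterns in two different orders and counts how many triple comparisons each order needs.

```python
SENSOR_COUNT = 200

# Made-up data: many sensors, only one of which has a known location.
TRIPLES = [(f"ex:sensor{i}", "rdf:type", "ex:Sensor") for i in range(SENSOR_COUNT)]
TRIPLES.append(("ex:sensor7", "ex:locatedIn", "ex:Substation1"))


def match(pattern, triple, binding):
    """Extend the binding so that pattern matches triple, or return None."""
    new_binding = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Variable: bind it, or check it against the existing binding.
            if new_binding.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new_binding


def evaluate(patterns, triples):
    """Join patterns in the given order; return (solutions, comparisons)."""
    bindings = [{}]
    comparisons = 0
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                comparisons += 1
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings, comparisons


broad = ("?s", "rdf:type", "ex:Sensor")                # matches every sensor
selective = ("?s", "ex:locatedIn", "ex:Substation1")   # matches one triple

slow_solutions, slow_count = evaluate([broad, selective], TRIPLES)
fast_solutions, fast_count = evaluate([selective, broad], TRIPLES)
```

Both orders return the same solution, but putting the selective pattern first needs far fewer comparisons in this toy evaluator. Production engines often reorder patterns themselves, so the size of the effect in practice depends on the engine and on how the query or rule is written.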

--

Maintained by the Track-4 champions: Steve Ray & Trish Whetzel