EarthScienceOntolog: Panel Session-02 - Thu 2012-09-06     (1)

Mini-Series Theme: An Earth Science Ontology Dialog ("EarthScienceOntolog")     (1A)

Session Topic: Ontology Development and Application across Earth Science Systems Lifecycle     (1B)

Session Co-chairs: Dr. Gary Berg-Cross (SOCoP; Knowledge Strategies) and Dr. Naicong Li (University of Redlands) - intro slides     (1C)

Panelists / Briefings:     (1D)

  • Dr. RichardHooper (CUAHSI) - "Implementing a community-governed ontology: Experiences from the Water Sciences Community" - slides     (1E)
  • Professor PeterFox (RPI/TWC) - "Life cycle semantics in Earth and space sciences - what's worked (and not) and where are we... a decade in..." - slides     (1F)
  • Mr. MattiaSantoro (CNR, Italy) - "A Semantic Broker for enhancing resources discovery," a work by StefanoNativi, Mattia Santoro, C. Fugazza, M. Craglia - slides     (1G)
  • Dr. NaicongLi (U of Redlands), Dr. PhilipMurphy (U of Redlands), Professor KrzysztofJanowicz (UC Santa Barbara) - "An ontology-driven framework and web portal for spatial decision support" - slides     (1H)

Abstract     (1J)

Ontology Development and Application across the Earth Science Systems Lifecycle - slides     (1J1)

This is the 2nd session of the Joint EarthCube-Ontolog Mini-series on "Ontology and Semantic Technology for the Earth Science Community" - a series of panel sessions dubbed: "EarthScienceOntolog" - an Earth Science Ontology Dialog.     (1J2)

This mini-series of events is co-organized and supported by members of the EarthCube, Ontolog, SOCoP, and IAOA communities.     (1J3)

This session is designed to contribute to the overall Earth Science Ontolog mini-series goal of exploring the current status and application of various semantic approaches and artifacts in order to help develop a semantically enabled cyberinfrastructure for the Earth Science Community.     (1J4)

A key focus of this session is to continue the process started in Session 1 (see kick-off session abstract) to enable meaningful dialog between members of both communities (Earth Science and ontology/semantics). To this end we seek to further an understanding of Earth Science system requirements expressed in such things as use cases, geo-science identified problems and issues, extant system architectures and their lifecycles, the status of relevant ontological engineering, architectures and approaches, and prospective tools.     (1J5)

Reflecting the understanding reached in the earlier planning session, we believe it is important to start with an Earth science perspective. We have invited speakers to discuss their needs and their work on actual problems facing that community. To the extent possible, this discussion reflects past and current efforts on such things as services, frameworks, and principles involved in accessing and discovering data; problems bridging terminology differences; insertion of semantic technology into current Earth science infrastructure; and the challenge of engaging relevant semantic technology and collaborating with the semantic community.     (1J6)

We expect that part of the process of reaching a better understanding will include a focused response from the ontological and semantic technology community to those issues. To this end we have allowed time for session participants to engage in Q&A and online chat as part of open discussion. We trust that this discussion will contribute to a foundation that can be built on in the remaining sessions.     (1J7)

More details about this mini-series at: EarthScienceOntolog (home page for this mini-series)     (1J8)

Briefings     (1J9)

  • RickHooper (CUAHSI) - "Implementing a community-governed ontology: Experiences from the Water Sciences Community" - slides     (1J9A)
    • Abstract: The Consortium for the Advancement of Hydrologic Science, Inc. (CUAHSI) has been implementing the Hydrologic Information System (HIS), a standards-based services-oriented architecture for time-series data, over the past several years. A centerpiece of this effort is a metadata catalog that is drawn from the more than 90 published services and indexes tens of millions of time series. This discovery service is mediated by a list of search terms, a controlled vocabulary of "leaf concepts" that are arranged into a taxonomy from general to specific terms.     (1J9A1)
    • This is based on the work with Michael Piasecki (CCNY), and Ilya Zaslavsky (SDSC) - see an earlier presentation on CUAHSI given by Ilya during OntologySummit2012.     (1J9A2)
    • From focus groups and from implementing this system, we have observed the following:     (1J9A3)
      • 1. Users care most about the list of leaf concepts, and less about the taxonomy. Any logical taxonomy is acceptable and there is no "right" taxonomy. For convenience, users would like to group terms together into their own taxonomy, but this is a secondary consideration.     (1J9A3A)
      • 2. Although terms are ambiguous across disciplines (for example, between hydrology and atmospheric science), our most pressing problem within water science for discovery terms is a consistent implementation of an information model of the data across all components of the services-oriented architecture (e.g., servers, clients, and transmission of metadata). There is a tension between keeping search simple (at the cost of more ambiguous initial returns that require secondary refinement and may lead to data misinterpretation) and making search more precise. Clients must deal with multidimensional searches and missing metadata. Different data providers overload different fields to transmit metadata and the central catalog must reconcile these differences.     (1J9A3B)
      • 3. As we move from aquatic chemistry to solid-phase chemistry, metadata profiles shift. Aquatic chemistry variable names adhere to a naming convention that encodes critical methodological differences, whilst such naming conventions do not exist for solid-phase chemistry. Integrating these data into one system raises a number of challenges. This metadata problem exists for data derived from samples; in situ sensor data have a simpler and more consistent metadata profile.     (1J9A3C)
    • We are seeking guidance on how to address these problems. I will review some use cases and our initial approaches to solve these issues.     (1J9A4)
  • Professor PeterFox (RPI/TWC) - "Life cycle semantics in Earth and space sciences - what's worked (and not) and where are we... a decade in..." - slides     (1J9B)
    • Abstract: In 2001, the promise of semantics was vibrant among a small group of geoscientists and a few even had started to partner with computer scientists. However, it wasn't until 2004 that languages and tools were mature enough to develop workable applications. The presentation gives a brief review of our approaches and practices since then and the domain/application settings. Along the way, we refined a development methodology that we have now used for the last 6-7 years; it has enhanced our collaborative semantic web efforts and broadened the developer base. These experiences will be discussed.     (1J9B1)
  • Mr. MattiaSantoro (CNR, Italy) - "A Semantic Broker for enhancing resources discovery," a work by StefanoNativi, Mattia Santoro, C. Fugazza, M. Craglia - slides     (1J9C)
    • Abstract: Geospatial resource discovery can be enhanced by augmenting the searchable descriptions of resources. Examples of additional descriptions (that is, something not searchable with typical geospatial discovery services) are semantic information and user-generated annotations. This is the task assigned to a middleware component called the "Semantic Broker" (a broker is a middleware service that implements intermediation services to make M clients and N servers interoperable with each other, transparently for both sides). In particular, the talk discusses the Semantic Broker developed by EuroGEOSS and currently working in the GEOSS Common Infrastructure (GCI). This broker realizes semantic-aware discovery by expanding all the textual terms contained in any "traditional" query and generating a "set of traditional queries" to be finalized. The semantic expansion is achieved by brokering a set of different controlled vocabularies, thesauri, and/or semantic engines (presently, the GCI makes use of the GEMET thesaurus, the GEOSS categories vocabulary, the GCMD vocabulary, and the INSPIRE feature registry). This solution is also adopted to implement multi-lingual discovery capabilities.     (1J9C1)
  • Dr. NaicongLi (U of Redlands), Dr. PhilipMurphy (U of Redlands), Dr. KrzysztofJanowicz (UC Santa Barbara) - "An ontology-driven framework and web portal for spatial decision support" - slides     (1J9D)
    • Abstract: The data and models produced by research in Earth Sciences will have added value if they can be discovered and accessed easily as useful resources during large-scale planning and decision-making processes. Automatic discovery of relevant and interoperable resources for the decision problem at hand requires formalization of decision problems and of the approach adopted for problem solving, as well as semantic annotation of existing resources. In the past few years, the Spatial Decision Support (SDS) Consortium has been developing a spatial decision support ontology to synthesize and formalize the body of knowledge in SDS, establish a common vocabulary for the user community, and provide flexible entry points for searching SDS resources as well as for learning about SDS. Currently the SDS ontology is driving the SDS Knowledge Portal, where the user can browse or search for SDS-related concepts and a sample collection of SDS resources including planning process workflow templates, methods and techniques, data sources, software models and tools, and case studies. Together with domain data and process ontologies being developed in Earth Science disciplines, the SDS ontology can serve as an initial step towards semantically enabled registration and discovery of resources for the purpose of planning and decision support. ...     (1J9D1)
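
The query expansion described in the Semantic Broker abstract can be illustrated with a minimal Python sketch. This is not the EuroGEOSS implementation: the function names, the vocabulary names, and the vocabulary entries below are hypothetical and hand-written for illustration, and no real GEMET, GCMD, or INSPIRE content or API is used. The sketch only shows the general idea of expanding each term of a "traditional" query against several vocabularies and producing a set of traditional keyword queries.

```python
from itertools import product

# Hypothetical stand-ins for brokered vocabularies. The entries are
# illustrative only and are NOT taken from GEMET, GCMD, or any real thesaurus.
VOCABULARIES = {
    "thesaurus_a": {"flood": ["inundation"], "rain": ["rainfall", "precipitation"]},
    "thesaurus_b": {"flood": ["alluvione"], "rain": ["precipitation"]},
}


def expand_term(term, vocabularies):
    """Return the term followed by related terms from every vocabulary, without duplicates."""
    expanded = [term]
    for entries in vocabularies.values():
        for related in entries.get(term.lower(), []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def expand_query(query, vocabularies):
    """Turn one free-text query into a list of plain keyword queries."""
    alternatives = [expand_term(t, vocabularies) for t in query.split()]
    # One output query per combination of alternatives, original terms first.
    return [" ".join(combo) for combo in product(*alternatives)]


queries = expand_query("flood rain", VOCABULARIES)
for q in queries:
    print(q)
```

Each generated query could then be sent unchanged to an ordinary keyword-based discovery service, which is why the original query syntax does not need to change. Including a term from another language (here the Italian "alluvione") hints at how the same mechanism could support multi-lingual discovery. Note that the number of generated queries grows multiplicatively with the number of terms, so a real broker would presumably need to limit or rank the expansions.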

Agenda     (1K)

  • Session Format: this is a virtual session conducted over an augmented conference call     (1K2)

Proceedings     (1L)

Please refer to the above     (1L1)

IM Chat Transcript captured during the session    (1L2)

see raw transcript here.     (1L2A)

(for better clarity, the version below is a re-organized and lightly edited chat-transcript.)     (1L2B)

Participants are welcome to make light edits to their own contributions as they see fit.     (1L2C)

-- begin in-session chat-transcript --     (1L2D)

  instructions: once you have access to the page, click on the "settings" button, and identify yourself (by modifying the Name field from "anonymous" to your real name, like "JaneDoe").

(as a Dial pad seems to be missing on Linux-based Skype v4.x for skype-calls.)     (1L2X)

Frank Chum, FrankDAgnese, Frank Olken, Francois de Beuvron, GaryBergCross, GenhanChen,

PhilipMurphy, Rich Keller, Rick Hooper, Robert Downs, Ruth Duerr, Scott Hills, ScottPeckham,

[09:47] Peter P. Yim: @Gary - in between speakers, would you please remind those on the call, who     (1L2AAT)

haven't logged into the chat-room, to join us there (we now have 48 people on the conference bridge,     (1L2AAU)

but only 37 in the chat-room)     (1L2AAV)

[09:42] Peter P. Yim: == Rick Hooper presenting his talk ...     (1L2AAW)

[09:51] SiriJodhaKhalsa: why distinguish between ET from atmos energy budget and a pan measurement?     (1L2AAX)

[09:51] SiriJodhaKhalsa: conceptually?     (1L2AAY)

[09:55] SiriJodhaKhalsa: ScottPeckham - you've given a lot of thought to what should be in a     (1L2AAZ)

variable name, in the metadata and in an associated ontology. seems like these are getting mixed up     (1L2AAAA)

[10:01] ScottPeckham: Hi SiriJodhaKhalsa. The issues with data discovery and knowledge capture are     (1L2AAAB)

different than the "semantic mediation" issues that the CSDMS Standard Names were designed to     (1L2AAAC)

[10:03] ScottPeckham: The CSDMS Standard Names are described on our wiki at:     (1L2AAAE)

[09:56] Jack Ring: "When should scientist consult original source?" Always, and first. The use of     (1L2AAAG)

surrogate terms to represent context ALWAYS yields unacceptable Type I and Type II errors. Pls be     (1L2AAAH)

aware that you are being influenced by the presumption of von Neumann machines whereas modern data     (1L2AAAI)

relationship/flow machines are now available.     (1L2AAAJ)

[09:58] SiriJodhaKhalsa: Mattia - convey across communities or simply provide brokering service?     (1L2AAAK)

[10:04] Mattia Santoro: @SiriJodhaKhalsa - I think the two things are not in "contrast", brokering is     (1L2AAAL)

a way of homogenizing metadata and data models across communities and information systems     (1L2AAAM)

[09:59] Jack Ring: TKU, Rick. Well said.     (1L2AAAN)

[10:02] Rick Hooper: There is a cost to documenting data for re-use that data publishers tend not to     (1L2AAAO)

want to incur. It doesn't help them personally. We need to reward the effort required for data     (1L2AAAP)

documentation.     (1L2AAAQ)

[10:07] Frank Olken: Rick, How are you storing (and retrieving) the various hydrological time series:     (1L2AAAR)

a relational DBMS, HDF5 files, ???     (1L2AAAS)

[10:07] Rick Hooper: relational DBMS     (1L2AAAT)

[10:48] Frank Olken: Rick, Out of curiosity, which relational DBMS are you using for the hydrological     (1L2AAAU)

time series?     (1L2AAAV)

[10:55] Rick Hooper: @FrankOlken: This is all on the MS platform, so we are using MS SQL Server     (1L2AAAW)

[10:00] PhilipMurphy: what was the URL to the HIS project? [ ... --ppy/added     (1L2AAAX)

subsequently ]     (1L2AAAY)

[10:00] Peter P. Yim: == PeterFox delivering his brief ...     (1L2AAAZ)

[10:00] anonymous morphed into Rich Keller     (1L2AAAAA)

[10:00] anonymous1 morphed into Robert Downs     (1L2AAAAB)

[10:06] anonymous morphed into Sumit Purohit     (1L2AAAAC)

are links to the two papers Peter mentioned. With some     (1L2AAAAE)

details of the ontology works     (1L2AAAAF)

[10:34] GenhanChen: Rick, on your slide 6 and 7, for the hierarchical data, do you just map "leaves"     (1L2AAAAG)

to variables for search? I am wondering how you deal with their parents' information (e.g. core     (1L2AAAAH)

concept, property or branch).     (1L2AAAAI)

[10:38] Rick Hooper: Genhan: Yes, but I should note that a leaf can be attached to more than one     (1L2AAAAJ)

branch, so it is not a strict hierarchy. There is no user input to the hierarchy--we control it.     (1L2AAAAK)

Does that answer your question?     (1L2AAAAL)

[10:41] GenhanChen: Rick, does it mean the parents' information will be lost?     (1L2AAAAM)

[10:55] Rick Hooper: @Genhan: Not sure what you mean. The parent information is not 'lost', but it is     (1L2AAAAN)

not transmitted to the user with the data or metadata return. Only the concept and the publisher's     (1L2AAAAO)

variable name is transmitted to the user.     (1L2AAAAP)

[11:03] GenhanChen: Thank you, Rick. That is what I would like to know.     (1L2AAAAQ)

[10:31] Jack Ring: What are the limitations of searching words in free text?     (1L2AAAAS)

[10:36] SiriJodhaKhalsa: automatic mapping is the crux, and most challenging aspect of 3-party     (1L2AAAAT)

[10:43] SiriJodhaKhalsa: question for Mattia: will the semantic broker be able to discover and     (1L2AAAAV)

utilize vocabularies/thesauri/ontologies associated with a data source?     (1L2AAAAW)

[10:46] SiriJodhaKhalsa: Mattia - does a new adapter have to be created for each new semantic     (1L2AAAAX)

[10:54] Mattia Santoro: @SiriJodhaKhalsa - A new adapter has to be created for each communication     (1L2AAAAZ)

protocol and/or data/metadata model     (1L2AAAAAA)

[10:46] Jack Ring: Mattia. Will you be measuring False Positives and False Negatives?     (1L2AAAAAB)

[10:56] Mattia Santoro: @JackRing - yes false positive is one thing we will measure to improve the     (1L2AAAAAC)

association strategy     (1L2AAAAAD)

[10:59] Mattia Santoro: @JackRing - the main issue with free text is in multidisciplinary contexts,     (1L2AAAAAE)

where the same word might have different meanings depending on the discipline (or sometimes even     (1L2AAAAAF)

depending on the language)     (1L2AAAAAG)

[11:05] Jack Ring: Mattia, TKU. How will you measure false negatives, i.e., things that existed but     (1L2AAAAAH)

[11:07] Jack Ring: Mattia, Cannot the ambiguity of what a word signifies be resolved by assessing     (1L2AAAAAJ)

context? Are you seeking to avoid combinatorial explosions in the computations?     (1L2AAAAAK)

[10:47] Peter P. Yim: == Naicong Li presenting ...     (1L2AAAAAL)

[11:03] Frank Chum: Great presentation, NaicongLi. Sorry I have to jump off to another meeting.     (1L2AAAAAM)

[11:09] Peter P. Yim: == open Q&A and discussion ... please click on "hand" button (lower right) to     (1L2AAAAAN)

notify the chair, and when recognized, announce yourself and make sure you can be heard before     (1L2AAAAAO)

[11:09] anonymous1 morphed into FrankDAgnese     (1L2AAAAAQ)

[11:14] Bobbin Teegarden: @all speakers What are you using for visualization, and is there any direct     (1L2AAAAAR)

interaction with the visualization itself?     (1L2AAAAAS)

[11:14] Rick Hooper: Startree viewer is what we are using, but there are problems with that.     (1L2AAAAAT)

[11:14] Krzysztof Janowicz: [ref. verbal question from SiriJodhaKhalsa] Are you asking about ontology     (1L2AAAAAU)

[11:17] GaryBergCross: >[Krzysztof] Are you asking about ontology alignment? - I don't think so.     (1L2AAAAAW)

[11:19] Krzysztof Janowicz: @SiriJodhaKhalsa: With respect to Spatial Data infrastructures there is     (1L2AAAAAY)

work on the automatic mediation of sensor metadata and observation data to ease the registration at     (1L2AAAAAZ)

Sensor Observation Services. I would consider this similar to a brokering with a reduced amount of     (1L2AAAAAAA)

human interaction.     (1L2AAAAAAB)

[11:23] anonymous morphed into Dalia Varanka     (1L2AAAAAAC)

[11:23] Jack Ring: Whereas each of us has a 25,000 to 75,000 word vocabulary many projects/companies     (1L2AAAAAAD)

run just fine with 2500 word vocabularies. Local ontologies are necessary. However Log(variety) =     (1L2AAAAAAE)

K*Log(extent) so as we seek interoperability among ever-larger communities then the ability to     (1L2AAAAAAF)

interoperate local ontologies becomes valuable.     (1L2AAAAAAG)

[11:29] Marcio Faerman: @JackRing, could you please explain what you mean by K, variety and extent in     (1L2AAAAAAH)

your previous statement? Thanks!     (1L2AAAAAAI)

[11:33] Jack Ring: @Marcio. As a system gets larger in scope (extent) it inexorably gets increasingly     (1L2AAAAAAJ)

heterogeneous (variety of elements and relationships). In complex systems research we often note that     (1L2AAAAAAK)

the relation is log(variety) = 0.7*log(extent) however many systems are messy so we even see that     (1L2AAAAAAL)

variety increases in a static system.     (1L2AAAAAAM)

[11:34] Marcio Faerman: Tks @JackRing!     (1L2AAAAAAN)

[11:25] Krzysztof Janowicz: [ref. RickHooper's remark that "some taxonomy is important, but it is not     (1L2AAAAAAO)

important to achieve a consensus on the taxonomy" (as different people use different logics to     (1L2AAAAAAP)

organize their terms, and their taxonomies will come out different)] I could not agree more.     (1L2AAAAAAQ)

[11:28] Krzysztof Janowicz: I agree that while terminologies are important, common agreement is not     (1L2AAAAAAR)

always important and is difficult to achieve.     (1L2AAAAAAS)

[11:28] Gary Berg-Cross: Krzysztof's comment above was in the context of my question on value of     (1L2AAAAAAT)

taxonomies and modular ontologies.     (1L2AAAAAAU)

[11:29] GaryBergCross: I want to mention the Dayton Vocamp this Sept 15-17 which may be of interest to     (1L2AAAAAAV)

some session attendees. Kno.e.sis Center at the Department of Computer Science and Engineering of     (1L2AAAAAAW)

Wright State University in Dayton, OH focus on the observation-driven engineering of geo-ontologies     (1L2AAAAAAX)

by developing small, self-contained, and reusable geo-ontology design patterns that can be used to     (1L2AAAAAAY)

[11:32] Peter P. Yim: == GaryBergCross wrapping up the session ...     (1L2AAAAAAAA)

[11:32] Peter P. Yim: join us again for session-3 of this mini-series on Oct-11 ... please mark your     (1L2AAAAAAAB)

[11:32] PhilipMurphy: Could we have such a Use Case <> Ontology development Ontolog session in the     (1L2AAAAAAAD)

[11:33] GaryBergCross: @PhilipMurphy - I'm all for it     (1L2AAAAAAAF)

[11:33] Marcio Faerman: Ditto, @PhilipMurphy     (1L2AAAAAAAG)

[11:34] Scott Hills: Re: Future Use Case <> Ontology session - I agree.     (1L2AAAAAAAH)

[11:33] Peter P. Yim: [added subsequently] @PhilipMurphy & All - Indeed! Also, the 5th session of this     (1L2AAAAAAAI)

mini-series slated for Dec-6 entitled: "Tutorial" to be chaired by Nancy Wiegand and Mike Dean, could     (1L2AAAAAAAJ)

well be part of what you are looking for. Come join us then, and let's try to refine on what kind of     (1L2AAAAAAAK)

session(s) would be most useful.     (1L2AAAAAAAL)

[11:33] Peter P. Yim: great session, thank you ALL, speakers and participants!     (1L2AAAAAAAM)

[11:34] Peter P. Yim: -- session ended: 11:34am PDT --     (1L2AAAAAAAN)

-- end of in-session chat-transcript --     (1L2AAAAAAAO)
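
Jack Ring's scaling remark in the transcript above (log(variety) = K*log(extent), with K = 0.7 as the value he quotes) can be unpacked with a short worked equation. This is only one reading of his stated relation; V (variety) and E (extent) are shorthand symbols introduced here, not his notation, and the base of the logarithm does not affect the result.

```latex
\begin{align*}
\log V &= K \log E \quad\Longrightarrow\quad V = E^{K} \\
\frac{V(2E)}{V(E)} &= 2^{K} = 2^{0.7} \approx 1.62 \\
\frac{V(10E)}{V(E)} &= 10^{K} = 10^{0.7} \approx 5.01
\end{align*}
```

Under that reading, doubling the scope of a system raises its variety by roughly 62%, and a tenfold increase in scope raises it about fivefold.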

Additional Resources     (1M)

For the record ...     (1M6)

Attendees     (1O)

