[auscope-geosciml] FW: GeoSciML vocabulary http URIs dereferencing old URIs and encoding errors in skos vocabularies

Simon.Cox at csiro.au
Mon Feb 27 20:01:24 EST 2012


Hi Marcus, others:

As you are possibly aware, we've been making a push on deploying RDF/SKOS vocabularies via SISSvoc.
Gillie is managing the CGI vocabs, but I'll make a few responses relating to your comments below:

0. Yes - the deployed vocabs should have
    oldURI owl:sameAs coolURI
and maybe
    oldURI skos:exactMatch coolURI
assertions included. Then a request for the oldURI will yield a 303/200 (i.e. not a 404 Not Found) and a link to the new identifier, allowing the user to request that to get all the details.
We could run a reasoner and generate all the inferred triples so the oldURIs would work perfectly, but that is probably unwise if we want to encourage use of the coolURIs.
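A minimal sketch of how the alias assertions in point 0 might be generated, assuming a table of old and new URIs is available. The function name alias_triples and the choice of N-Triples output are illustrative assumptions, not part of SISSvoc; the example pair is the lithology mapping quoted further down this thread.

```python
# Hypothetical helper: emit N-Triples lines that link an old URI to its cool URI.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def alias_triples(old_uri, cool_uri, include_exact_match=False):
    """Return N-Triples lines asserting that old_uri is an alias of cool_uri."""
    predicates = [OWL_SAME_AS]
    if include_exact_match:
        # The "and maybe" case: also state skos:exactMatch.
        predicates.append(SKOS_EXACT_MATCH)
    return [f"<{old_uri}> <{p}> <{cool_uri}> ." for p in predicates]


if __name__ == "__main__":
    base = "http://resource.geosciml.org/classifier/cgi/lithology/"
    for line in alias_triples(
        base + "0260",
        base + "tuff_breccia_agglomerate_or_pyroclastic_breccia",
        include_exact_match=True,
    ):
        print(line)
```

Only the explicit alias triples are written; nothing is inferred for the old URI, which matches the preference above for steering users to the cool URIs.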

1. Yes - that is certainly an error

Simon

-----Original Message-----
From: auscope-geosciml-bounces at lists.arcs.org.au [mailto:auscope-geosciml-bounces at lists.arcs.org.au] On Behalf Of Ebner, Marcus
Sent: Monday, 27 February 2012 6:56 PM
To: auscope-geosciml at lists.arcs.org.au
Subject: Re: [auscope-geosciml] GeoSciML vocabulary http URIs dereferencing old URIs and encoding errors in skos vocabularies

Dear GeoSciML concept definition task group,

I am glad to see that you have come to a decision regarding the new (and hopefully persistent) URI scheme for the CGI vocabulary. Will the GeoSciML Server http://auscope-services-test.arrc.csiro.au/ that hosts the vocabulary be able to dereference (or redirect) the old URIs to the new cool ones? Since the old URIs have been around for more than 2 years, this would make sense (and would save me some work). In addition, the authors of the INSPIRE data specification (Geology & Mineral Resources) make heavy use of CGI vocabularies, although they currently ignore the identifier management or at least factor it out. It would thus also be very useful in terms of INSPIRE.

Apart from that, I suggest that the development of the new version of the vocabularies be used to get rid of some encoding errors I have encountered while working with the currently available vocabularies:
1)
I have found that all ConceptSchemes have the following triples...
ConceptSchemeID  http://www.w3.org/1999/02/22-rdf-syntax-ns#type        http://www.w3.org/2004/02/skos/core#ConceptScheme
ConceptSchemeID  http://www.w3.org/1999/02/22-rdf-syntax-ns#type        http://www.w3.org/2004/02/skos/core#Concept

According to the SKOS reference (see section 4.4 in http://www.w3.org/TR/2008/WD-skos-reference-20080125/#concept-schemes ), the classes Concept and ConceptScheme are DISJOINT!
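A check for this kind of error could be sketched as follows, assuming the vocabulary has already been parsed into (subject, predicate, object) tuples of URI strings. The function name and the tuple representation are assumptions made for illustration; any RDF toolkit could supply the triples.

```python
# Sketch: find resources typed as both skos:Concept and skos:ConceptScheme.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DISJOINT_PAIR = {SKOS + "Concept", SKOS + "ConceptScheme"}


def disjointness_violations(triples):
    """Return the sorted subjects that carry both disjoint SKOS types."""
    types_by_subject = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types_by_subject.setdefault(subject, set()).add(obj)
    return sorted(
        subject
        for subject, types in types_by_subject.items()
        if DISJOINT_PAIR <= types  # both types present
    )


if __name__ == "__main__":
    sample = [
        ("ConceptSchemeID", RDF_TYPE, SKOS + "ConceptScheme"),
        ("ConceptSchemeID", RDF_TYPE, SKOS + "Concept"),
        ("SomeConcept", RDF_TYPE, SKOS + "Concept"),
    ]
    print(disjointness_violations(sample))
```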

2)
In the geologic unit morphology vocabulary,
Sigmoidal vein and vein shape are in a circular loop (i.e. they are broader and narrower concepts of each other)!
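Loops of this kind can be found mechanically. The following is a small sketch, assuming the skos:broader links have been extracted into a dictionary that maps each concept to the set of its broader concepts; the function name and data layout are illustrative assumptions. It uses a recursive depth-first search, which is adequate for vocabularies of modest depth.

```python
# Sketch: detect a cycle in the skos:broader hierarchy.
def find_broader_cycle(broader):
    """Return one cycle as a list of concepts (first == last), or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for parent in sorted(broader.get(node, ())):
            parent_state = state.get(parent, UNSEEN)
            if parent_state == IN_PROGRESS:
                # parent is already on the current path: a loop is closed.
                return path[path.index(parent):] + [parent]
            if parent_state == UNSEEN:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(broader):
        if state.get(node, UNSEEN) == UNSEEN:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    links = {
        "sigmoidal vein": {"vein shape"},
        "vein shape": {"sigmoidal vein"},
    }
    print(find_broader_cycle(links))
```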

3)
The capitalization in the labels (especially for the concept schemes) is not consistent
e.g.    CGI Simple Lithology Categories
        CGI Lineation type categories
        CGI Alteration type Categories
        CGI Mapped Feature Observation Method terms
        ...
The following concept schemes lack "CGI" in the prefLabel (and are sometimes in camel case):
        CompositionCategory
        ContactType
        EventEnvironment
        Fault Type Categories
        FoliationType
        Metamorphic facies categories
        Structure measurement convention code vocabulary



I suggest providing future vocabularies via a public SPARQL endpoint. This would be a very useful way to distribute your vocabularies to a technical audience (developers). In addition, encoding errors could be tracked down easily.


Cheers,

Marcus


--
Dr. Marcus Ebner
Fachabteilung Geoinformation - Department of Geoinformation
Geologische Bundesanstalt - Geological Survey of Austria
Neulinggasse 38, A-1030 Vienna, AUSTRIA
Phone:  +43-1-712 56 74 414  FAX: +43-1-712 56 74 56
email: marcus.ebner at geologie.ac.at
web: www.geologie.ac.at





-----Original Message-----
From: auscope-geosciml-bounces at lists.arcs.org.au [mailto:auscope-geosciml-bounces at lists.arcs.org.au] On Behalf Of auscope-geosciml-request at lists.arcs.org.au
Sent: Monday, 27 February 2012 01:58
To: auscope-geosciml at lists.arcs.org.au
Subject: auscope-geosciml Digest, Vol 35, Issue 34


Today's Topics:

   1. Re: GeoSciML vocabulary http URIs (Simon.Cox at csiro.au)


----------------------------------------------------------------------

Message: 1
Date: Mon, 27 Feb 2012 08:58:14 +0800
From: <Simon.Cox at csiro.au>
To: <auscope-geosciml at lists.arcs.org.au>,
        <alistair.bh.ritchie at gmail.com>,        <Joachim.Gersemann at bgr.de>,
        <reh at bgs.ac.uk>, <Jan.Jellema at tno.nl>,
        <luca.olivetta at isprambiente.it>, <Maija.Pennanen at gtk.fi>,
        <snt at agiweb.org>, <y.filkin at mail.ru>
Subject: Re: [auscope-geosciml] GeoSciML vocabulary http URIs
Message-ID:
        <5D27281509882544A841804E4EBA914C3AA96DA919 at EXWA-MBX01.nexus.csiro.au>
Content-Type: text/plain; charset="utf-8"

Steve -

While I'm strongly in favour of memorable URIs, I don't think it is imperative that the URI itself be searchable.

Simon

From: auscope-geosciml-bounces at lists.arcs.org.au [mailto:auscope-geosciml-bounces at lists.arcs.org.au] On Behalf Of Stephen M Richard
Sent: Saturday, 25 February 2012 2:28 AM
To: 'Alistair Ritchie'; auscope-geosciml at lists.arcs.org.au; Gersemann, Joachim; Heaven, Rachel E; Jellema, J. (Jan); Luca Olivetta; Maija.Pennanen at gtk.fi; Sharon Tahirkheli; Yuri Filkin
Subject: Re: [auscope-geosciml] GeoSciML vocabulary http URIs

We are in the process of generating a new set of SKOS vocabularies that use all text-based tokens, e.g.
http://resource.geosciml.org/classifier/cgi/lithology/0260
will map to
http://resource.geosciml.org/classifier/cgi/lithology/tuff_breccia_agglomerate_or_pyroclastic_breccia

BUT there is a question as to what would be the best form of concept-specific token string. Here are the options (using the above example):
Lower camel case: tuffBrecciaAgglomerateOrPyroclasticBreccia
Upper camel case: TuffBrecciaAgglomerateOrPyroclasticBreccia
All lower case: tuffbrecciaagglomerateorpyroclasticbreccia
Underscores: tuff_breccia_agglomerate_or_pyroclastic_breccia
Hyphens: tuff-breccia-agglomerate-or-pyroclastic-breccia

Strings containing underscores are treated as a single, continuous string, while those containing hyphens are treated as being separated into words at each hyphen. Thus a web search for tuff would likely find http://resource.geosciml.org/classifier/cgi/lithology/tuff-breccia-agglomerate-or-pyroclastic-breccia, but not any of the other schemes.
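For comparison, the five candidate token forms can be produced from one label with a few lines of Python. This is only a sketch: the function name token_forms is hypothetical, and it assumes a non-empty label whose words are separated by spaces or punctuation.

```python
import re


def token_forms(label):
    """Return the five candidate URI tokens for a non-empty label."""
    # Split on anything that is not a letter or digit, then lower-case.
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9]+", label)]
    return {
        "lower_camel": words[0] + "".join(w.capitalize() for w in words[1:]),
        "upper_camel": "".join(w.capitalize() for w in words),
        "lower": "".join(words),
        "underscores": "_".join(words),
        "hyphens": "-".join(words),
    }


if __name__ == "__main__":
    forms = token_forms("tuff breccia agglomerate or pyroclastic breccia")
    for name, token in forms.items():
        print(f"{name}: {token}")
```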

Currently I've been converting using the underscore scheme (see rdf/turtle docs at https://www.seegrid.csiro.au/subversion/CGI_CDTGVocabulary/trunk/Vocabulary201202), so the easiest thing is to stick to that. Speak up now if you have any opinions or tech reasons for favoring one of the other approaches.

Thanks
steve


Stephen M. Richard
Arizona Geological Survey
416 W. Congress St., #100
Tucson, Arizona, 85701   USA
phone: 520 209-4127
AZGS Main: (520) 770-3500.  FAX: (520) 770-3505
email: steve.richard at azgs.az.gov

From: Alistair Ritchie [mailto:alistair.bh.ritchie at gmail.com]
Sent: Thursday, February 23, 2012 6:12 PM
To: Stephen M Richard
Subject: Re: GeoSciML vocabulary http URIs

Hi Steve,

First up: glad you've made the shift to Turtle, it's an awful lot easier to cope with than RDF/XML.

I've become agnostic about underscores as separators - in large part because TopBraid Composer/EVN is bullying me into accepting them. For a while I liked the idea of pluses ('+') for spaces in the URI - after all, it is the space escape char so it means space - before realising that makes them an invalid part of a QName. Bit of a problem in the RDF space.

The next (only other?) thing I assume we need to be mindful of is whether we want a search engine to find words in a URI - a search on 'igneous rock' finds a resource containing the string 'http://resource.example.org/classifier/abhr/igneous-rock'. This seems to be an advantage of the memorable-ness of cool URIs. If we do, then the general underscore vs hyphen debate becomes important. Strings containing underscores are treated as a single, continuous string, while those containing hyphens are treated as being separated at each hyphen.

Although this isn't unambiguous:
barium-bearing carbonate chemistry -> http://resource.geosciml.org/classifier/cgi/compositioncategory/barium-bearing-carbonate-chemistry

This means I've a slight preference for hyphens, and have favoured them in my feature IDs, but by using underscores, as per the original URN convention, we can compare the tokens in the URIs with the URNs.

There is CamelCase as well, which Simon used for the age intervals. If we're going to have a single string, that seems best from a human-readable perspective.

This is a bit of a ramble ... is the ability to match URIs to phrases important? If not - I'd stick with the URNs.

Cheers,
Alistair

"XML is like violence. If it doesn't solve the problem, use more." - Unknown

On 24 February 2012 12:34, Stephen M Richard <steve.richard at azgs.az.gov> wrote:
Hi - glad someone else has some time to put into this. If you look in the CGI_CDTGVocabulary/trunk/Vocabulary201202 subversion directory (https://www.seegrid.csiro.au/subversion/CGI_CDTGVocabulary/trunk/Vocabulary201202) you'll see a number of vocabs I've converted to the word-based final token (cool???) URIs. The only difference is that mine have underscores between words ('_'). I just used the same text we had in the URNs in our first iteration.

I also switched to Turtle notation for the SKOS; it's easier to manipulate in a text editor. I attached the Python script I use to do the replacements, and a mapping file that the script uses. It replaces everything in the second column with what's in the first column.
I've converted all the files, but there's one more pass required to check the ttl, update the historyNote, and check for other assorted errors. I'll commit all the files if you might be able to do the final check. I'm still using Protégé for this kind of editing. I'm agnostic about the underscores in the concept-specific tokens. They could be eliminated with some global search and replace, or by running the python replaceIDs routine again?
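The attached script is not reproduced in this archive. As a rough illustration of the replacement step described above, the following sketch assumes a tab-separated mapping file with the new identifier in the first column and the old one in the second. The function names load_mapping and replace_ids and the file layout are assumptions, not the actual replaceIDs routine.

```python
import csv
import io


def load_mapping(tsv_text):
    """Parse tab-separated text into (new, old) pairs; short rows are skipped."""
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(row[0], row[1]) for row in rows if len(row) >= 2]


def replace_ids(text, mapping):
    """Replace every second-column string in text with its first-column string."""
    # Longest old strings first, so one old ID that is a prefix of another
    # does not clobber the longer one.
    for new, old in sorted(mapping, key=lambda pair: len(pair[1]), reverse=True):
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    base = "http://resource.geosciml.org/classifier/cgi/lithology/"
    mapping = load_mapping(
        base + "tuff_breccia_agglomerate_or_pyroclastic_breccia\t" + base + "0260\n"
    )
    print(replace_ids("<" + base + "0260> a skos:Concept .", mapping))
```

Plain sequential replacement like this can substitute twice if a new identifier happens to contain some other old identifier, so the output still needs the manual check mentioned above.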

What do you think?
steve


Stephen M. Richard
Arizona Geological Survey
416 W. Congress St., #100
Tucson, Arizona, 85701   USA
phone: 520 209-4127
AZGS Main: (520) 770-3500.  FAX: (520) 770-3505
email: steve.richard at azgs.az.gov

From: Alistair Ritchie [mailto:alistair.bh.ritchie at gmail.com]
Sent: Thursday, February 23, 2012 1:20 AM
To: Steve Richard
Cc: Simon Cox
Subject: GeoSciML vocabulary http URIs

Hi Steve

With the recent decision to make the GeoSciML vocabulary http URIs cool again, I've tried to make sure that the reference dataset is cool as well (if it is, it'll be the first time I've ever done anything cool).

Based on the format of relationship type vocabulary URIs you circulated recently, I took a stab at providing new URIs according to that template (from what I can tell: all lower case, no separators, hyphens remain - e.g. 'continental-crustal').
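The template as understood here (all lower case, no separators, hyphens kept) can be sketched in Python. The function names and the BASE constant are illustrative assumptions; the expected tokens in the example are taken from the list below.

```python
import re

# Assumed common prefix of the classifier URIs listed in this message.
BASE = "http://resource.geosciml.org/classifier/cgi/"


def compact_token(label):
    """Lower-case the label and drop everything except letters, digits, hyphens."""
    return re.sub(r"[^a-z0-9-]", "", label.lower())


def classifier_uri(vocabulary, label):
    """Build a classifier URI from a vocabulary name and a term label."""
    return f"{BASE}{vocabulary}/{compact_token(label)}"


if __name__ == "__main__":
    print(classifier_uri("eventenvironment", "continental-crustal setting"))
    print(classifier_uri("lithology", "ash tuff, lapillistone, and lapilli tuff"))
```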

Would you mind casting your eyes over the following list of URIs and telling me whether or not I got it right? I'd like to make sure that the reference DB contains valid content with identifiers that can be dereferenced.

Pretty sure I've got the ages right - straight from the new ontology.

Thanks,
Alistair

/*/gsml:samplingFrame
"http://resource.geosciml.org/feature/cgi/EarthNaturalSurface"

/gsml:MappedFeature/gsml:observationMethod
"direct observation";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/directobservation"
"inferred by indirect methods";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/inferredbyindirectmethods"
"inferred radiometric survey";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/inferredradiometricsurvey"
"observed elevation data";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/observedelevationdata"
"compilation";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/compilation"
"observed aerial imagery";"http://resource.geosciml.org/classifier/cgi/mappedfeatureobservationmethod/observedaerialimagery"

gsmler:lithology
"anthropogenic material";"http://resource.geosciml.org/classifier/cgi/lithology/anthropogenicmaterial"
"ash tuff, lapillistone, and lapilli tuff";"http://resource.geosciml.org/classifier/cgi/lithology/ashtufflapillistoneandlapillituff"
"basalt";"http://resource.geosciml.org/classifier/cgi/lithology/basalt"
"breccia";"http://resource.geosciml.org/classifier/cgi/lithology/breccia"
"clay";"http://resource.geosciml.org/classifier/cgi/lithology/clay"
"conglomerate";"http://resource.geosciml.org/classifier/cgi/lithology/conglomerate"
"diamictite";"http://resource.geosciml.org/classifier/cgi/lithology/diamictite"
"dolomitic or magnesian sedimentary rock";"http://resource.geosciml.org/classifier/cgi/lithology/dolomiticormagnesiansedimentaryrock"
"fine grained igneous rock";"http://resource.geosciml.org/classifier/cgi/lithology/finegrainedigneousrock"
"granite";"http://resource.geosciml.org/classifier/cgi/lithology/granite"
"granodiorite";"http://resource.geosciml.org/classifier/cgi/lithology/granodiorite"
"gravel";"http://resource.geosciml.org/classifier/cgi/lithology/gravel"
"hornfels";"http://resource.geosciml.org/classifier/cgi/lithology/hornfels"
"metamorphic rock";"http://resource.geosciml.org/classifier/cgi/lithology/metamorphicrock"
"mud";"http://resource.geosciml.org/classifier/cgi/lithology/mud"
"mudstone";"http://resource.geosciml.org/classifier/cgi/lithology/mudstone"
"peat";"http://resource.geosciml.org/classifier/cgi/lithology/peat"
"phaneritic igneous rock";"http://resource.geosciml.org/classifier/cgi/lithology/phaneriticigneousrock"
"pyroclastic material";"http://resource.geosciml.org/classifier/cgi/lithology/pyroclasticmaterial"
"sand";"http://resource.geosciml.org/classifier/cgi/lithology/sand"
"sandstone";"http://resource.geosciml.org/classifier/cgi/lithology/sandstone"
"sedimentary material";"http://resource.geosciml.org/classifier/cgi/lithology/sedimentarymaterial"
"shale";"http://resource.geosciml.org/classifier/cgi/lithology/shale"
"silt";"http://resource.geosciml.org/classifier/cgi/lithology/silt"
"tholeiitic basalt";"http://resource.geosciml.org/classifier/cgi/lithology/tholeiiticbasalt"

gsmler:consolidationDegree
"consolidated";"http://resource.geosciml.org/classifier/cgi/consolidationdegree/consolidated"
"indurated";"http://resource.geosciml.org/classifier/cgi/consolidationdegree/indurated"
"unconsolidated";"http://resource.geosciml.org/classifier/cgi/consolidationdegree/unconsolidated"
"unconsolidated, loose";"http://resource.geosciml.org/classifier/cgi/consolidationdegree/unconsolidatedloose"

gsmlga:eventEnvironment
"coastal plain setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/coastalplainsetting"
"contact metamorphic setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/contactmetamorphicsetting"
"continental-crustal setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/continental-crustalsetting"
"continental shelf setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/continentalshelfsetting"
"eruption centre environment";"http://www.opengis.net/def/nil/OGC/0/template"
"floodplain setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/floodplainsetting"
"glacier related setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/glacierrelatedsetting"
"lacustrine setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/lacustrinesetting"
"middle continental crust setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/middlecontinentalcrustsetting"
"playa setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/playasetting"
"river channel setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/riverchannelsetting"
"shoreline settings";"http://resource.geosciml.org/classifier/cgi/eventenvironment/shorelinesettings"
"submarine fan setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/submarinefansetting"
"swamp or marsh setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/swampormarshsetting"
"terrestrial setting";"http://resource.geosciml.org/classifier/cgi/eventenvironment/terrestrialsetting"

gsmlga:eventProcess
"contact metamorphism";"http://resource.geosciml.org/classifier/cgi/eventprocess/contactmetamorphism"
"deposition";"http://resource.geosciml.org/classifier/cgi/eventprocess/deposition"
"deposition from moving fluid";"http://resource.geosciml.org/classifier/cgi/eventprocess/depositionfrommovingfluid"
"eruption";"http://resource.geosciml.org/classifier/cgi/eventprocess/eruption"
"intrusion";"http://resource.geosciml.org/classifier/cgi/eventprocess/intrusion"
"mass wasting";"http://resource.geosciml.org/classifier/cgi/eventprocess/masswasting"
"material transport and deposition";"http://resource.geosciml.org/classifier/cgi/eventprocess/materialtransportanddeposition"
"mechanical deposition";"http://resource.geosciml.org/classifier/cgi/eventprocess/mechanicaldeposition"
"turbidity current deposition";"http://resource.geosciml.org/classifier/cgi/eventprocess/turbiditycurrentdeposition"
"wind erosion";"http://resource.geosciml.org/classifier/cgi/eventprocess/winderosion"

gsmlgs:contactType
"conformable contact";"http://resource.geosciml.org/classifier/cgi/contacttype/conformablecontact"
"contact";"http://resource.geosciml.org/classifier/cgi/contacttype/contact"
"disconformable contact";"http://resource.geosciml.org/classifier/cgi/contacttype/disconformablecontact"
"faulted contact";"http://resource.geosciml.org/classifier/cgi/contacttype/faultedcontact"
"igneous intrusive contact";"http://resource.geosciml.org/classifier/cgi/contacttype/igneousintrusivecontact"
"nonconformable contact";"http://resource.geosciml.org/classifier/cgi/contacttype/nonconformablecontact"
"unconformable contact";"http://resource.geosciml.org/classifier/cgi/contacttype/unconformablecontact"
"unknown";"http://www.opengis.net/def/nil/OGC/0/unknown"

gsmlgu:CompositionPart/gsmlgu:role
"concretions";"http://resource.geosciml.org/classifier/cgi/role/template"
"dominant component";"http://resource.geosciml.org/classifier/cgi/role/template"
"dykes";"http://resource.geosciml.org/classifier/cgi/role/template"
"interbedded component";"http://resource.geosciml.org/classifier/cgi/role/template"
"layers";"http://resource.geosciml.org/classifier/cgi/role/template"
"major component";"http://resource.geosciml.org/classifier/cgi/role/template"
"sole component";"http://resource.geosciml.org/classifier/cgi/role/template"

gsmlgu:GeologicUnit/gsmlgu:geologicUnitType
"biostratigraphic unit";"http://resource.geosciml.org/classifier/cgi/geologicunitrank/biostratigraphicunit"
"geologic unit";"http://resource.geosciml.org/classifier/cgi/geologicunitrank/geologicunit"
"lithodemic unit";"http://resource.geosciml.org/classifier/cgi/geologicunitrank/lithodemicunit"
"lithostratigraphic unit";"http://resource.geosciml.org/classifier/cgi/geologicunitrank/lithostratigraphicunit"


gsmlgu:GeologicUnit/gsmlgu:rank
"formation";"http://resource.geosciml.org/classifier/cgi/stratigraphicrank/formation"
"lithodeme";"http://resource.geosciml.org/classifier/cgi/stratigraphicrank/lithodeme"
"subgroup";"http://resource.geosciml.org/classifier/cgi/stratigraphicrank/subgroup"


"XML is like violence. If it doesn't solve the problem, use more." - Unknown



------------------------------

_______________________________________________
auscope-geosciml mailing list
auscope-geosciml at lists.arcs.org.au
http://lists.arcs.org.au/cgi-bin/mailman/listinfo/auscope-geosciml


End of auscope-geosciml Digest, Vol 35, Issue 34
************************************************