[auscope-geosciml] URI schemes

Sen, Marcus A mase at bgs.ac.uk
Sun Aug 22 06:59:57 EDT 2010

Some comments on URI schemes before I go on holiday :-)

Referring to the https://www.seegrid.csiro.au/twiki/bin/view/CGIModel/PersistentIdentifiersInGeoSciMLServices and https://www.seegrid.csiro.au/twiki/bin/view/CGIModel/PersistentIdentifiersInGeoSciMLServicesDiscussion pages:

The host part of an HTTP URI should be tightly coupled to the authority governing the resources in question. It is up to that authority to register the host domain (possibly using sub-domains) and to undertake to maintain it. It looks like http://resource.geosciml.org/ is being settled on for CGI. The CGI domain should hold the things defined by CGI, such as the CGI standard classifierSchemes.

Other things, like features or organisation-specific classifierSchemes, might be created by any geological survey organisation in the world, commercial geological companies, university researchers etc., and responsibility for defining unique identifiers for them should lie with the organisation which creates them. It is their responsibility to register a domain for defining the things for which they are the responsible authority. The current CGI URN scheme has an authority segment which is an ad-hoc organisation acronym. This has worked OK while there are just a handful of us doing test beds, but it is not scalable to all the organisations we would like to be using GeoSciML to deliver geologic features. There is also no reason why we would want to make organisations delivering features in GeoSciML format register with CGI anyway.

Individual organisations will have to consider CGI-specific and wider contexts in deciding on their URI construction policy. For example, BGS might need to decide that it should create and maintain a http://resource.bgs.ac.uk/ domain for CGI-related and other resources it is responsible for but, as a UK public sector organisation, might also need to consider whether some of its data will need URIs in a data.gov.uk domain.
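As a rough illustration of the point that each authority mints identifiers under a host it controls, here is a minimal Python sketch. The function name mint_uri and the path segments "classifierScheme" and "example" are hypothetical and only for illustration; nothing here is an agreed CGI or BGS path pattern.

```python
from urllib.parse import quote


def mint_uri(host: str, *segments: str) -> str:
    """Build an HTTP URI under a host that the minting organisation controls.

    The host is supplied by the organisation itself; no central registry
    of authority acronyms is consulted.
    """
    # Percent-encode each segment so that spaces or slashes inside a
    # segment cannot change the path structure.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"http://{host}/{path}"


# Hypothetical path segments, used only to show the shape of the result.
cgi_uri = mint_uri("resource.geosciml.org", "classifierScheme", "example")
bgs_uri = mint_uri("resource.bgs.ac.uk", "classifierScheme", "example")
```

The two results differ only in the host, and it is the host that says which organisation is responsible for the identifier.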

It is worthwhile to discuss and agree some shared best practice for naming the components of the URI path following the host part, which might be shared by people in the geoscience community (or wider if possible) for the purpose of making the URIs human-friendly. Machine processing of URIs should not rely on knowledge of the path structure. Organisations delivering geological data may also want to supply other kinds of data with URIs which may have nothing to do with CGI standards. They probably won't want to devote a special domain to CGI-only resources, and they may or may not be happy to name the path, from its root, according to a CGI pattern. Using a CGI-specific identifying top-level part like /uri-cgi would increase the chances of them adopting a shared pattern but would not guarantee it. The idea of comparing URIs for identity by removing the host part and comparing the rest of the string is completely outside all the principles of URI use and should not be pursued.
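To make the last point concrete, here is a small Python sketch contrasting the two kinds of comparison. The example URIs use a hypothetical path ("/classifierScheme/example"), and both function names are made up for this illustration. The whole-URI comparison is deliberately simple and is not a full implementation of URI normalisation.

```python
from urllib.parse import urlsplit


def same_identifier(a: str, b: str) -> bool:
    """Compare two URIs as whole identifiers, host included."""
    pa, pb = urlsplit(a), urlsplit(b)
    # Scheme and host are case-insensitive; urlsplit lowercases both
    # (via .scheme and .hostname). Path and query are compared exactly.
    return (pa.scheme, pa.hostname, pa.port, pa.path, pa.query) == (
        pb.scheme, pb.hostname, pb.port, pb.path, pb.query
    )


def same_path_only(a: str, b: str) -> bool:
    """The discouraged approach: drop the host and compare what is left."""
    return urlsplit(a).path == urlsplit(b).path


# Two authorities that happen to use the same (hypothetical) path.
cgi = "http://resource.geosciml.org/classifierScheme/example"
bgs = "http://resource.bgs.ac.uk/classifierScheme/example"

path_match = same_path_only(cgi, bgs)        # True: a false positive
identifier_match = same_identifier(cgi, bgs)  # False: different authorities
```

With the host stripped, two unrelated resources defined by different authorities look identical, which is exactly the failure the host part exists to prevent.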
