[seek-dev] Re: metacatiness
Matt Jones
jones at nceas.ucsb.edu
Mon Nov 15 10:58:16 PST 2004
OK, well, this is a rich topic...
The simplest metacat operations can be expressed as a url, like this:
http://metacat.nceas.ucsb.edu/knb/metacat?action=read&qformat=xml&docid=knb-lter-gce.109.6
or for one of the HTML versions:
http://metacat.nceas.ucsb.edu/knb/metacat?action=read&qformat=knb&docid=knb-lter-gce.109.6
Obviously, changing the docid on the end of the URL gets you a different
object from the server.
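
Since only the docid and qformat change between requests, building these URLs is easy to wrap in a helper. Here is a minimal Python sketch; the function name is hypothetical, and the base URL is just the one from the examples above (other metacat installations will use a different one):

```python
from urllib.parse import urlencode

# Base URL copied from the examples above; other deployments will differ.
METACAT_BASE = "http://metacat.nceas.ucsb.edu/knb/metacat"


def metacat_read_url(docid, qformat="xml", base=METACAT_BASE):
    """Build a read URL for one docid (hypothetical helper, not part of metacat)."""
    query = urlencode({"action": "read", "qformat": qformat, "docid": docid})
    return f"{base}?{query}"


# Raw XML version and one of the HTML versions of the same document.
print(metacat_read_url("knb-lter-gce.109.6"))
print(metacat_read_url("knb-lter-gce.109.6", qformat="knb"))
```

Passing a different docid gives the URL for a different object; nothing else in the query string needs to change.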
Metacat stores a lot of EML documents. Each one is metadata for one or
more data objects, which may or may not be stored in metacat as well; the
EML tells you where they are stored. So, in the above example, you can
see a "distribution" section that tells you to get the data object from
a web site:
<distribution scope="document">
<online>
<url
function="download">http://gce-lter.marsci.uga.edu/lter/asp/db/send_file.asp?name=metacat-user&email=none&affiliation=LNO&notify=0&accession=INV-GCEM-0305a1&filename=INV-GCEM-0305a1_1_1.TXT</url>
</online>
</distribution>
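
Pulling that URL out of the EML can be sketched with Python's standard XML parser. This is only an illustration under a few assumptions: the function name is made up, the elements are assumed not to be namespace-qualified, and the "&" characters in the sample are written as "&amp;" so that the fragment is well-formed XML:

```python
import xml.etree.ElementTree as ET

# Fragment from the example above, with "&" escaped as "&amp;".
SAMPLE = """<distribution scope="document">
<online>
<url function="download">http://gce-lter.marsci.uga.edu/lter/asp/db/send_file.asp?name=metacat-user&amp;email=none&amp;affiliation=LNO&amp;notify=0&amp;accession=INV-GCEM-0305a1&amp;filename=INV-GCEM-0305a1_1_1.TXT</url>
</online>
</distribution>"""


def download_urls(xml_text):
    """Return the text of every distribution/online/url element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    urls = []
    # iter() also visits the root element, so this works for a bare
    # <distribution> fragment as well as for a whole document.
    for dist in root.iter("distribution"):
        for url in dist.findall("./online/url"):
            if url.text:
                urls.append(url.text.strip())
    return urls


for u in download_urls(SAMPLE):
    print(u)
```

The function returns a list rather than a single string because a document can carry more than one distribution section.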
Or the EML documents may contain references to (deprecated) "ecogrid"
urls, which should be interpreted as meaning the document is locally
stored on the metacat server. For example, this EML document about
grasshoppers:
http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=xml&docid=sev.106.2
contains a reference to this data object:
ecogrid://knb/sev.13705.1
which can be accessed here:
http://metacat.nceas.ucsb.edu/knb/servlet/metacat?action=read&qformat=knb&docid=sev.13705.1
Note that any given EML document may in fact reference multiple data
objects (as does the grasshopper example above, which also contains a
reference to ecogrid://knb/sev.13703.1), so don't assume a 1:1
metadata->data correspondence when parsing EML.
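
To make the one-to-many point concrete, here is a rough Python sketch that collects every ecogrid reference in a piece of EML text and turns each into a read URL. The helper names and the regular expression are assumptions (the pattern expects docids of the form scope.number.revision with no dots in the scope), and the sample text is a stand-in, not the real contents of sev.106.2:

```python
import re

# Assumed pattern: ecogrid://knb/<scope>.<number>.<revision>
ECOGRID_RE = re.compile(r"ecogrid://knb/([A-Za-z0-9_-]+\.\d+\.\d+)")

# URL prefix taken from the grasshopper example above.
SERVLET_BASE = "http://metacat.nceas.ucsb.edu/knb/servlet/metacat"


def ecogrid_docids(eml_text):
    """Return the docids of all ecogrid references, in order, without duplicates."""
    seen = []
    for docid in ECOGRID_RE.findall(eml_text):
        if docid not in seen:
            seen.append(docid)
    return seen


def read_url(docid, qformat="knb"):
    """Build the read URL for a docid stored locally on the metacat server."""
    return f"{SERVLET_BASE}?action=read&qformat={qformat}&docid={docid}"


# Stand-in text holding the two references mentioned above.
sample = "<url>ecogrid://knb/sev.13705.1</url> <url>ecogrid://knb/sev.13703.1</url>"
for docid in ecogrid_docids(sample):
    print(docid, "->", read_url(docid))
```

A real parser would probably walk the XML instead of scanning the text, but either way the result should be treated as a list.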
As I mentioned in an earlier email, Metacat IDs should probably be
mapped to lsids like this:
sev.13703.1
urn:lsid:lsid.ecoinformatics.org:sev:13703:1
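
The proposed mapping is mechanical enough to sketch in a few lines of Python. The function names are hypothetical, and the sketch assumes the last two dot-separated fields of a docid are the object number and the revision, with everything before them becoming the LSID namespace:

```python
# Authority taken from the example mapping above.
LSID_AUTHORITY = "lsid.ecoinformatics.org"


def docid_to_lsid(docid, authority=LSID_AUTHORITY):
    """Map scope.number.revision to urn:lsid:authority:scope:number:revision."""
    # Split from the right so only the last two fields are peeled off.
    scope, number, revision = docid.rsplit(".", 2)
    return f"urn:lsid:{authority}:{scope}:{number}:{revision}"


def lsid_to_docid(lsid):
    """Inverse mapping: recover the metacat docid from an LSID of the form above."""
    parts = lsid.split(":")
    if len(parts) != 6 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID of the expected form: {lsid!r}")
    return ".".join(parts[3:])


print(docid_to_lsid("sev.13703.1"))
print(lsid_to_docid("urn:lsid:lsid.ecoinformatics.org:sev:13703:1"))
```

The inverse direction is what an LSID resolver sitting in front of metacat would need in order to look up the stored object.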
OK, so that was the short answer. In addition to this simplistic URL
interface, there are 1) a Java client API and 2) a Perl client API that
allow you to access metacat programmatically. And on top of that there
is the EcoGrid Grid Service API that can be used to retrieve all of the
same objects. These APIs need to be used when login is required, as
some of the documents in metacat are access controlled and a session
needs to be established to determine access rights. The metacat Java
client API can be seen in action in the metacat JUnit test that
demonstrates its use (check out the metacat cvs module). The EcoGrid
client API can be seen in use in the Kepler code and some sample code in
the ecogrid project directory (check out the kepler and seek modules).
There's a good developer's overview of Metacat in this set of slides:
http://knb.ecoinformatics.org/knbws/knbws-jones-metacat-20040927.ppt
Finally, I think it would be good to use LSIDs directly in metacat, so
that EML documents themselves might contain LSID identifiers. Could you
comment on what would be needed to ship an LSID server as a standard
part of a metacat installation, so that it can resolve identifiers for
the objects stored in that metacat?
Thanks,
Matt
dave thau wrote:
> Hey there!
>
> I'm going to try to tie a metacat server into the LSID thing. Is there a
> good server to tap? Is there a list of good queries anywhere? Any good
> examples of using the API to make a call and parse results?
>
> Dave
--
-------------------------------------------------------------------
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------