Lightweight Web Resource Catalogue
The web services catalogue space seems a little empty at the moment. UDDI and OGC Catalog Services are the only two standards that I’m aware of, and implementations are thin on the ground. I’ve been wondering how to fill the gap.
The interest in catalogues came from real world requirements at one of my clients. They have a few web services now, and are creating more. There is a growing need for a catalogue to help discover and manage their web services in a 24/7 operational environment. For example, four concrete requirements are:
- Publishing the location of primary and backup services. The intent is that applications that consume services can automatically “fail-over” to alternate services if the primary service is not responding.
- Pro-active monitoring of web services. An application somewhere should be able to obtain a list of services and regularly check availability and response times.
- Manual testing of services. If something goes wrong, IT staff need to be able to run simple tests to isolate the problem.
- Manual discovery of services. As the number of web services and web resources grows, a catalogue makes them easier to find.
In some ways all the above deal with managing change in the network topology. In theory, Cool URIs don’t change, but in practice that’s not easy to achieve.
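As a rough sketch of the first two requirements (fail-over and pro-active monitoring), a consuming application could walk an ordered list of endpoints obtained from the catalogue. Everything named here is an assumption for illustration: the function names, the two-second timeout, and the idea that the catalogue hands back the primary URL first followed by its backups.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional
import http.client
import time
import urllib.request


@dataclass
class CheckResult:
    url: str
    ok: bool
    seconds: float  # wall-clock time the check took


def http_check(url: str, timeout: float = 2.0) -> CheckResult:
    """Fetch the URL once and record whether it answered and how long it took."""
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            ok = 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, timeouts, HTTP error statuses and bad URLs.
        ok = False
    return CheckResult(url, ok, time.perf_counter() - started)


Checker = Callable[[str], CheckResult]


def monitor(urls: Iterable[str], check: Checker = http_check) -> List[CheckResult]:
    """Check every endpoint; a scheduler could call this at regular intervals."""
    return [check(url) for url in urls]


def select_service(urls: Iterable[str], check: Checker = http_check) -> Optional[str]:
    """Return the first endpoint that responds, in catalogue order (primary first)."""
    for url in urls:
        if check(url).ok:
            return url
    return None
```

The check is passed in as a parameter so that a SOAP service, a Web Map Service or a plain web page can each supply its own notion of "responding" without changing the fail-over logic.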
Existing Standards
Standards like UDDI and OGC Catalog Services are different beasts. They are designed by committee with vendor involvement, in the hope that someone will provide an implementation. (On a related note, have a look at The Rise and Fall of CORBA.)
Frankly, I’m much more interested in standards that codify industry practice. In this case, a lightweight web resource catalogue that helps manage The Eight Fallacies of Distributed Computing in a heterogeneous environment.
Web Resources
To avoid confusion, the term “web resource” is being abused here to refer to:
- Any resource in a REST architecture,
- Web-style services (e.g. “XML over HTTP”, Web Map Services, ArcXML, etc), and
- SOAP web services.
A catalogue that only does SOAP/WSDL is not practical in the real world. I want to be able to use the same system for managing Web Map Services, or even just a single web page.
I’m thinking: simple, practical and web architecture. URIs. Behind the firewall.
Out of Scope
Personally, I don’t find the following of much interest in this domain:
- Storing extensive service descriptions, if any.
- Harvesting extensive metadata, if any.
- Support for automated discovery (UDDI tried that).
- Distributed searches using protocols like Z39.50.
URIs can point to service descriptions and other metadata. Automated discovery relies on the first two, and is nice in theory, but in business, people are involved. Distributed search can be an overlay/aggregation layer, if required.
Prior Art
Tim Bray advises not to invent new XML languages unless you have to. After Googling around, and giving this some thought, it seems that the following could be a good place to start:
- Atom Publishing Protocol for reading and writing to the catalogue.
- Atom Syndication Format for the message format.
- OpenSearch for querying.
- GeoRSS for describing location.
- Google Data APIs for inspiration.
- Mapdex for inspiration.
Phil Windley has also mentioned LDDI – Lightweight Description, Discovery, and Integration. LDDI uses microformats, and is the subject of a PhD thesis, which should be interesting.
Web Services Inspection Language (WSIL) also lives in this space. Uptake has been slow, possibly due to it being related to WSDL and UDDI.
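To make the Atom and GeoRSS items in the list above a little more concrete, here is a minimal sketch of what a single catalogue entry might look like as an Atom entry. The two namespaces are the real Atom and GeoRSS ones, but the rest is assumption: the `build_entry` helper, the example URLs, and in particular the `http://example.org/rel/backup` link relation used to mark a backup endpoint, which is a made-up extension relation and not part of any standard.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"
# Hypothetical extension link relation for backup endpoints (not a registered value).
BACKUP_REL = "http://example.org/rel/backup"

ET.register_namespace("atom", ATOM)
ET.register_namespace("georss", GEORSS)


def build_entry(entry_id, title, updated, primary, backups=(), bbox=None):
    """Build an Atom entry describing one web resource.

    bbox, if given, is (south, west, north, east) in decimal degrees and is
    written as a GeoRSS-Simple box: lower corner, then upper corner, lat before lon.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    # The primary endpoint is the entry's ordinary "alternate" link.
    ET.SubElement(entry, f"{{{ATOM}}}link", rel="alternate", href=primary)
    for url in backups:
        ET.SubElement(entry, f"{{{ATOM}}}link", rel=BACKUP_REL, href=url)
    if bbox is not None:
        south, west, north, east = bbox
        ET.SubElement(entry, f"{{{GEORSS}}}box").text = f"{south} {west} {north} {east}"
    return entry


example = build_entry(
    entry_id="urn:example:catalogue:roads-wms",
    title="Roads Web Map Service",
    updated="2006-06-28T00:00:00Z",
    primary="http://maps.example.org/wms/roads",
    backups=["http://maps-backup.example.org/wms/roads"],
    bbox=(-44.0, 112.0, -10.0, 154.0),
)
print(ET.tostring(example, encoding="unicode"))
```

Only service-level details are stored in the entry; anything richer, such as a capabilities document, would be reached by following a link.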
Where to From Here
For me the first step is looking at the interfaces to such a catalogue, and defining the document formats. An implementation is then on the table.
If anyone is interested in collaborating on this, I’d like to hear from you. A GeoRSS-style effort (without the recent drama) would be cool.
Update, 2006-06-25: Added MapDex to the prior art list. Apologies to Jeremy for leaving out an obvious influence.
Update, 2006-06-28: Added GeoRSS to the prior art list. Thanks Mikel.
[tags]xml, catalog, rest, web services, uddi, opensearch, gdata, georss, mapdex[/tags]

I never really understood complex service discovery mechanisms- why not do something like mapdex.org?
You don't see many real world impls of that kind of directory standards stuff, because they don't add value relative to cost.
matt m — 25 June 2006, 18:32
Cheers
Mikel
Mikel — 27 June 2006, 12:10
I agree that DNS and monitoring are a good approach to managing service failure. My view is being coloured by the needs of a particular client.
In their case, the application that is consuming several web services is the core operational system in a 24/7 control centre. If there is a failure in any web service, it needs to be able to switch to an alternate service immediately. That is, there need to be several manual and automated switch-over methods.
UDDI has shown that automated discovery of business services doesn't work because business relationships involve people, except for the purchase of the most commoditised and standardised products (a box of pencils).
In the spatial world I've seen several attempts at shared catalogues. All were hindered by a) the quality and descriptive power of the metadata, b) issues of private versus public metadata, and c) politics. The metadata used to decide a map service is fit for a particular purpose is difficult to encode in any machine readable format. Therefore, in my view, a pair of eyeballs looking at a web page seems to be the most effective approach - manual discovery.
Mikel,
I hadn't got to thinking about how to describe spatial extents of a web resource or web service. Thanks for putting GeoRSS at the top of the list! (GeoRSS has been added to the list of prior art.)
The remaining major issue with spatial web services is whether to record layer level details or just service level details. My current thinking is to only record the service details and point to another document that describes layers. e.g. OGC capabilities documents.
Thanks for your feedback.
Andrew
Andrew Hallam — 28 June 2006, 19:10
Also, have a look at the registry at http://ngistc1.agr.gc.ca/registry/main.html It is a client / service based on a subset of OGC's CAT 2.0 spec. It's not fully debugged, but I think it should meet most people's needs. It uses XMLDB with an Oracle back-end.
I for one think that a simple, lightweight de facto implementation would make this whole problem go away.
Cheers, Peter
Peter Schut — 30 June 2006, 12:39
Another point to make is that some requirements above don't really 'need' a web service wrapper (i.e. health checking, etc.). At the end of the day, from a middleware perspective, I want to be able to publish, discover and bind to resources via a Catalog API.
At any rate, I agree; we need to lock down the catalog dilemma!
Tom Kralidis — 6 July 2006, 19:21