As you can tell from the clever name, this site is about an XML Resolver. (The code is over on github.) Many (Java-based) XML APIs include features for "resolvers" of various sorts. For example, many XML parsers allow you to define an "entity resolver" that can intercept attempts to load system identifiers. Schema processors provide a "URI resolver" that lets you intercept schema module URIs. Stylesheet and query processors have similar APIs for intercepting stylesheet and query modules.

The resolver APIs exist because it’s sometimes useful in applications to return a locally cached resource instead of the resource actually requested. It’s a significant feature of the web that you can dereference the URI

and find out that it’s the DTD for XHTML. It is not, however, desirable that everyone should always dereference that URI to get the XHTML DTD. It hasn’t changed in more than a decade and there’s no reason to believe it will ever change again.

I know, DTDs are unfashionable and XHTML has measles or some other disease against which the world should have been vaccinated, but I chose that example with care. The W3C web server gets so many requests for the XHTML DTD that it goes out of its way to make retrieving it painful.

Go ahead, download that DTD. You’ll find that the server introduces a significant delay before returning the data and if you get it often enough they’ll lock you out for 24 hours or something.

Point being: there are lots of URIs which you can usefully cache locally.

There are basically two approaches to local caching: you can set up a proxy server and have it cache things for you, or you can use XML Catalogs. Oh, I don’t dispute there might be other approaches, but those are the two common, obvious ones.

The advantage of the local caching proxy is that it’s automatic. It caches the resources you request according to whatever criteria you establish, and it works transparently in the background. No muss, no fuss. Well, except for the fact that you have to install and set up a local caching proxy. You have to use it everywhere. You might have to chain it together with your corporate caching proxy. You also have to configure the criteria for local caching. I find its advantages are a lot more theoretical than practical.

The XML Resolver project is about doing it with catalogs.

XML Catalogs

Catalogs are straightforward: you provide an XML document that maps identifiers that might appear in documents to the local resources that should be returned for those identifiers.

Here’s an example:

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId=""

If you load that catalog, attempts to obtain the XHTML DTD from the W3C will be satisfied by the local copy of the DTD at /share/dtds/xhtml1-strict.dtd.
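To make the idea of "intercept the request and hand back a local copy" concrete, here is a minimal sketch that uses only the JDK's org.xml.sax.EntityResolver interface. It is not how XML Resolver itself is implemented, and it does not read catalog files. The class name MapEntityResolver, the example.invalid system identifier, and the inline DTD text are all assumptions made up for illustration; a real setup would map to a file on disk.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a catalog: maps system identifiers to local content.
final class MapEntityResolver implements EntityResolver {
    private final Map<String, String> localCopies;

    MapEntityResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = localCopies.get(systemId);
        if (local == null) {
            return null; // not mapped: the parser falls back to its default behaviour
        }
        InputSource source = new InputSource(new StringReader(local));
        source.setSystemId(systemId);
        return source;
    }

    // Parses a document whose DTD is served from the map rather than the network.
    static String demo() {
        String dtdUri = "http://example.invalid/dtds/greeting.dtd"; // assumed URI
        String dtd = "<!ELEMENT greeting (#PCDATA)>\n<!ENTITY who \"world\">";
        String xml = "<!DOCTYPE greeting SYSTEM \"" + dtdUri + "\">"
                   + "<greeting>Hello, &who;!</greeting>";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(new MapEntityResolver(Map.of(dtdUri, dtd)));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

The &who; entity is only defined in the "remote" DTD, so the document can only be expanded if the resolver supplied the local copy. A catalog-driven resolver plays the same role, except that the mappings come from catalog files instead of a hard-coded map.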

How to use XML Resolver

The simplest possible thing you can do is create an instance of org.xmlresolver.Resolver and use it as the resolver for your parser. The Resolver class implements the following resolvers:

  • org.xml.sax.EntityResolver the SAX1 interface used to load XML entities

  • org.xml.sax.ext.EntityResolver2 the SAX2 interface used to load XML entities

  • javax.xml.transform.URIResolver used to load XSLT resources

  • used by the DOM to load resources

  • org.xmlresolver.NamespaceResolver an interface for loading namespace-based resources based or that never really took off, but there you go.

Another simple integration point is to instantiate as your XML parser.

Configuring XML Resolver

The Resolver classes use either Java system properties or a standard Java properties file to establish an initial environment. The property file, if it is used, must be called and must be somewhere on your CLASSPATH.[1]

The following features may be configured with properties.

The initial list of catalog files

  • System property xml.catalog.files

  • Property file property catalogs

A semicolon-delimited list of catalog files. These are the catalog files that are initially consulted for resolution.

Unless you are incorporating the resolver classes into your own applications, and subsequently establishing an initial set of catalog files through some other means, at least one file must be specified, or all resolution will fail.
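As a rough illustration of the semicolon-delimited format, the sketch below reads the xml.catalog.files system property and splits it into individual entries. The class name CatalogFileList is hypothetical, and trimming whitespace and skipping empty entries are my assumptions; the resolver's own parsing may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XML Resolver.
final class CatalogFileList {
    // Splits a semicolon-delimited list into entries.
    // Assumption: surrounding whitespace is trimmed and empty entries are ignored.
    static List<String> parse(String value) {
        List<String> files = new ArrayList<>();
        if (value == null) {
            return files; // property not set: no initial catalogs
        }
        for (String part : value.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                files.add(trimmed);
            }
        }
        return files;
    }

    // Reads the list from the system property named in the text above.
    static List<String> fromSystemProperty() {
        return parse(System.getProperty("xml.catalog.files"));
    }
}
```

With this sketch, starting the JVM with -Dxml.catalog.files="catalog.xml;/share/catalogs/xhtml.xml" would yield a two-element list (both file names here are made up).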

Preference for public or system identifiers

  • System property xml.catalog.prefer

  • Property file property prefer

The initial prefer setting, either public or system.

Support relative catalog paths

  • Property file property relative-catalogs

If relative-catalogs is true, relative catalogs in the catalogs property list will be left relative; otherwise they will be made absolute with respect to the base URI of the properties file from which they came.

This setting has no effect on catalogs loaded from the xml.catalog.files system property (which are always returned unchanged).
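The effect of relative-catalogs can be sketched with java.net.URI. The class CatalogPaths, the method locate, and the file:/etc/app/ location in the usage note are hypothetical; the sketch only shows what "made absolute with respect to the base URI of the properties file" means in practice.

```java
import java.net.URI;

// Hypothetical helper, not part of XML Resolver.
final class CatalogPaths {
    // Returns the catalog entry unchanged when relativeCatalogs is true;
    // otherwise resolves it against the URI of the properties file it came from.
    static String locate(String catalog, URI propertiesFile, boolean relativeCatalogs) {
        if (relativeCatalogs) {
            return catalog;
        }
        // URI.resolve leaves already-absolute URIs untouched.
        return propertiesFile.resolve(catalog).toString();
    }
}
```

For example, with a properties file at file:/etc/app/example.properties and relative-catalogs set to false, the entry catalogs/cat.xml becomes file:/etc/app/catalogs/cat.xml; with relative-catalogs set to true it stays catalogs/cat.xml.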

Cache documents

  • System properties xml.catalog.cache, xml.catalog.cacheUnderHome

  • Property file property cache, cacheUnderHome

The cache properties specify the directory in which the XML Resolver should attempt to cache files that fail to resolve locally. If, instead, one of the cacheUnderHome properties is set, the cache directory will default to $HOME/.xmlresolver/cache.

Schemes to cache

  • System property `xml.catalog.cache.`scheme

  • Property file property `cache-`scheme

Specifies whether or not URIs of type scheme will be cached. If not specified, the default is “true” for all schemes except file.
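The per-scheme default described above can be expressed as a small lookup. This is only a sketch of the stated rule ("true" unless configured otherwise, except for file); the class name SchemeCachePolicy is hypothetical, and lower-casing the scheme and using Boolean.parseBoolean to read the value are my assumptions rather than documented behaviour.

```java
import java.util.Locale;

// Hypothetical helper, not part of XML Resolver.
final class SchemeCachePolicy {
    // Decides whether URIs with the given scheme should be cached,
    // based on the xml.catalog.cache.<scheme> system property.
    static boolean shouldCache(String scheme) {
        String normalized = scheme.toLowerCase(Locale.ROOT); // assumption: case-insensitive
        String configured = System.getProperty("xml.catalog.cache." + normalized);
        if (configured != null) {
            return Boolean.parseBoolean(configured.trim());
        }
        // Default from the text: every scheme is cached except file.
        return !"file".equals(normalized);
    }
}
```

So with no properties set, http and https URIs would be cached and file URIs would not; running with -Dxml.catalog.cache.http=false would switch caching off for http only.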

Example catalog properties file

My file looks like this:



# Always use semicolons in this list



See also

1. For backwards compatibility, the name may also be used. Use the system property to specify a name (or, technically, a semicolon separated list of names) explicitly.