As you can tell from the clever name, this site is about an XML Resolver. (The code is over on GitHub.) Many (Java-based) XML APIs include features for “resolvers” of various sorts. For example, many XML parsers allow you to define an “entity resolver” that can intercept attempts to load system identifiers. Schema processors provide a “URI resolver” that lets you intercept schema module URIs. Stylesheet and query processors have similar APIs for intercepting stylesheet and query modules.
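To make the idea concrete, here is a minimal sketch of an entity resolver built only on the JDK's SAX and DOM APIs. It is not the XML Resolver's own code: the class name, the in-memory map, and the one-line stand-in DTD are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

final class LocalEntityResolverDemo {
    // Hypothetical stand-in for a locally stored DTD; a real setup would read a file.
    static final String LOCAL_DTD = "<!ELEMENT html ANY>";

    static final String XHTML_SYSTEM_ID =
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

    /** Returns a resolver that serves known system identifiers from memory. */
    static EntityResolver localResolver(Map<String, String> localCopies) {
        return (publicId, systemId) -> {
            String text = localCopies.get(systemId);
            if (text == null) {
                return null; // no local copy: let the parser fetch the resource itself
            }
            InputSource source = new InputSource(new StringReader(text));
            source.setSystemId(systemId);
            return source;
        };
    }

    /** Parses a document whose DOCTYPE names the W3C URI and returns the root element name. */
    static String parseWithLocalDtd() {
        String xml = "<!DOCTYPE html SYSTEM \"" + XHTML_SYSTEM_ID + "\"><html/>";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(localResolver(Map.of(XHTML_SYSTEM_ID, LOCAL_DTD)));
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWithLocalDtd());
    }
}
```

Because the resolver answers the request for the DTD itself, the parser never needs to open a network connection for it.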
The resolver APIs exist because it’s sometimes useful in applications to return a locally cached resource instead of the resource actually requested. It’s a significant feature of the web that you can dereference the URI
and find out that it’s the DTD for XHTML. It is not, however, desirable that everyone should always dereference that URI to get the XHTML DTD. It hasn’t changed in more than a decade and there’s no reason to believe it will ever change again.
I know, DTDs are unfashionable and XHTML has measles or some other disease against which the world should have been vaccinated, but I chose that example with care. The W3C web server gets so many requests for the XHTML DTD that it goes out of its way to make retrieving it painful.
Go ahead, download that DTD. You’ll find that the server introduces a significant delay before returning the data, and if you get it often enough they’ll lock you out for 24 hours or something.
Point being: there are lots of URIs which you can usefully cache locally.
There are basically two approaches to local caching: you can set up a proxy server and have it cache things for you, or you can use XML Catalogs. Oh, I don’t dispute there might be other approaches, but those are the two common, obvious ones.
The advantage of the local caching proxy is that it’s automatic. It caches the resources you request according to whatever criteria you establish, and it works transparently in the background. No muss, no fuss. Well, except for the fact that you have to install and set up a local caching proxy. You have to use it everywhere. You might have to chain it together with your corporate caching proxy. You also have to configure the criteria for local caching. I find its advantages are a lot more theoretical than practical.
The XML Resolver project is about doing it with catalogs.
Catalogs are straightforward: you provide an XML document that has mappings from identifiers that might appear in documents to local resources that should be returned for those identifiers.
Here’s an example:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="/share/dtds/xhtml1-strict.dtd"/>
</catalog>
If you load that catalog, attempts to obtain the XHTML DTD from the W3C
will be satisfied by a local copy of the DTD obtained from the
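If you want to see a catalog lookup in action without any extra libraries, the following sketch uses the JDK's built-in `javax.xml.catalog` API (Java 9 and later) rather than the XML Resolver classes. The class name and the temporary file are assumptions for illustration; the catalog text is the one shown above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class CatalogLookupDemo {
    static final String CATALOG =
        "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
      + "  <system systemId=\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"\n"
      + "          uri=\"/share/dtds/xhtml1-strict.dtd\"/>\n"
      + "</catalog>\n";

    /** Writes the catalog to a temporary file and asks where a system identifier maps. */
    static String lookup(String systemId) {
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            try {
                Files.writeString(file, CATALOG);
                Catalog catalog =
                    CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
                return catalog.matchSystem(systemId); // null when nothing matches
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the local URI that the catalog maps the W3C system identifier to.
        System.out.println(lookup("http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"));
    }
}
```

The lookup only consults the mapping; it does not check that the local copy actually exists.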
The simplest possible thing you can do is create an instance of
org.xmlresolver.Resolver and use it as the resolver for your parser.
The Resolver class implements the following resolvers:
org.xml.sax.EntityResolver: the SAX1 interface used to load XML entities
org.xml.sax.ext.EntityResolver2: the SAX2 interface used to load XML entities
javax.xml.transform.URIResolver: used to load XSLT resources
org.w3c.dom.ls.LSResourceResolver: used by the DOM to load resources
org.xmlresolver.NamespaceResolver: an interface for loading namespace-based resources based on RDDL that never really took off, but there you go.
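One class can sit behind several of these interfaces because they all boil down to the same question: given an identifier, what should be loaded instead? The sketch below shows that shape with a hypothetical `MappedResolver` class backed by a plain map. It is not org.xmlresolver.Resolver and it covers only two of the interfaces listed above.

```java
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical example class: one lookup table exposed through two JDK interfaces. */
final class MappedResolver implements EntityResolver, URIResolver {
    private final Map<String, String> mappings;

    MappedResolver(Map<String, String> mappings) {
        this.mappings = Map.copyOf(mappings);
    }

    /** Shared lookup used by both interfaces; returns null when there is no mapping. */
    String lookup(String identifier) {
        return identifier == null ? null : mappings.get(identifier);
    }

    @Override // SAX: called by XML parsers for external entities such as DTDs
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = lookup(systemId);
        return local == null ? null : new InputSource(local);
    }

    @Override // JAXP: called by XSLT processors for stylesheet modules and documents
    public Source resolve(String href, String base) {
        // This sketch ignores the base URI and only matches href values verbatim.
        String local = lookup(href);
        return local == null ? null : new StreamSource(local);
    }
}
```

Returning null from either method tells the caller to fall back to its default loading behaviour.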
Another simple integration point is to instantiate
org.xmlresolver.tools.ResolvingXMLReader as your XML parser.
The Resolver classes use either Java system properties or a standard Java properties file to establish an initial environment. The property file, if it is used, must be called xmlresolver.properties and must be somewhere on your CLASSPATH. (For backwards compatibility, the name catalogmanager.properties may also be used. Use the system property xmlresolver.properties to specify a name, or, technically, a semicolon-separated list of names, explicitly.)
The resolver searches for a property file by looking in the following places, in this order:
XMLRESOLVER_PROPERTIES environment variable (new in v0.99.0)
xmlresolver.properties on your classpath.
The following features may be configured with properties.
A semicolon-delimited list of catalog files. These are the catalog files that are initially consulted for resolution.
Unless you are incorporating the resolver classes into your own applications, and subsequently establishing an initial set of catalog files through some other means, at least one file must be specified, or all resolution will fail.
The initial prefer setting, either public or system.
This setting allows you to toggle whether or not the resolver classes
obey the <?oasis-xml-catalog?> processing instruction.
If relative-catalogs is true, relative catalogs in the
catalogs property list will be left relative; otherwise they will be made
absolute with respect to the base URI of the properties file from
which they came.
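The rule can be sketched with `java.net.URI`. This is an illustration of the behaviour described here, not the resolver's actual implementation; the method name and the location of the properties file are assumptions.

```java
import java.net.URI;

final class RelativeCatalogsDemo {
    /**
     * When relativeCatalogs is true the entry is returned untouched; otherwise it is
     * made absolute with respect to the URI of the properties file it came from.
     */
    static String catalogLocation(String entry, URI propertiesFile, boolean relativeCatalogs) {
        if (relativeCatalogs) {
            return entry;
        }
        return propertiesFile.resolve(entry).toString();
    }

    public static void main(String[] args) {
        // Hypothetical location of the properties file.
        URI props = URI.create("file:/home/ndw/config/xmlresolver.properties");
        System.out.println(catalogLocation("./catalog.xml", props, true));
        System.out.println(catalogLocation("./catalog.xml", props, false));
    }
}
```

With the setting off, `./catalog.xml` is resolved next to the properties file; an entry that is already an absolute path keeps its path and simply gains the `file:` scheme.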
This setting has no effect on catalogs loaded from the
xml.catalogs.files system property (which are always returned
The cache properties specify the directory in which the XML Resolver
should attempt to cache files that fail to resolve locally. If, instead,
one of the
cacheUnderHome properties is set, the cache directory will
Specifies whether or not URIs of type scheme will be cached. If not
specified, the default is “true” for all schemes except
A sample XMLResolver.properties file looks like this:
# XMLResolver.properties
relative-catalogs=yes
# Always use semicolons in this list
catalogs=./catalog.xml;/home/ndw/Documents/catalog.xml
prefer=public
cache=/Users/ndw/.xmlresolver/cache
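Since this is a standard Java properties file, reading it yourself takes only `java.util.Properties`. The sketch below loads the sample above from a string and splits the catalogs value on semicolons; the class and method names are hypothetical, and this is not how the resolver itself is obliged to parse the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

final class ResolverPropertiesDemo {
    // The sample properties file from above, inlined so the example is self-contained.
    static final String SAMPLE = String.join("\n",
        "# XMLResolver.properties",
        "relative-catalogs=yes",
        "# Always use semicolons in this list",
        "catalogs=./catalog.xml;/home/ndw/Documents/catalog.xml",
        "prefer=public",
        "cache=/Users/ndw/.xmlresolver/cache");

    /** Parses properties-file text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Splits the semicolon-delimited catalogs property into individual entries. */
    static List<String> catalogFiles(Properties props) {
        String value = props.getProperty("catalogs", "");
        return Arrays.stream(value.split(";"))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        System.out.println(catalogFiles(props));
        System.out.println(props.getProperty("prefer"));
    }
}
```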