Configuring an XML Resolver

The Resolver classes use either Java system properties or a standard Java properties file to establish an initial environment.

The resolver searches for a property file by looking in the following places, in this order:

The following features may be configured with properties.

The initial list of catalog files

  • Feature ResolverFeature.CATALOG_FILES (type: List<String>)

  • System property xml.catalog.files

  • Property file property catalogs

A semicolon-delimited list of catalog files. These are the catalog files that are initially consulted for resolution. If no catalog files are specified, by default the resolver will attempt to use a file named catalog.xml in the current directory as a catalog.

A list of additional catalog files

  • Feature ResolverFeature.CATALOG_ADDITIONS (type: List<String>)

  • System property xml.catalog.additions

  • Property file property catalog-additions

A semicolon-delimited list of catalog files. This list is used in addition to the initial list of catalog files.

If you attempt to use both a system property and a property from a property file to create the initial list of catalog files, you’ll only get one or the other. (See prefer-property-file.)

This property provides a way to add to the current list of files. For example, suppose you use a global properties file to initialize the resolver, but for a particular application you want to search additional catalogs. You can specify them in the xml.catalog.additions system property and they’ll be appended to the list instead of replacing the list entirely as setting xml.catalog.files would.
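As a rough illustration of how the two lists combine, the sketch below splits the semicolon-delimited values and appends the additions to the initial list. The class and method names (CatalogListSketch, split, buildCatalogList) are hypothetical, and trimming whitespace around entries is an assumption. It is not the resolver's own code, only a model of the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the XML Resolver API.
final class CatalogListSketch {
    // Split a semicolon-delimited value into non-empty entries.
    // Trimming whitespace is an assumption made for this sketch.
    static List<String> split(String value) {
        List<String> result = new ArrayList<>();
        if (value == null) {
            return result;
        }
        for (String item : value.split(";")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // The initial list comes first; the additions are appended to it.
    static List<String> buildCatalogList(String files, String additions) {
        List<String> catalogs = split(files);
        catalogs.addAll(split(additions));
        return catalogs;
    }

    public static void main(String[] args) {
        System.out.println(buildCatalogList(
                System.getProperty("xml.catalog.files"),
                System.getProperty("xml.catalog.additions")));
    }
}
```

Running this with -Dxml.catalog.files and -Dxml.catalog.additions set on the command line prints the combined list in lookup order.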

Load catalogs from the classpath

  • Feature ResolverFeature.CLASSPATH_CATALOGS (type: Boolean)

  • System property xml.catalog.classpathCatalogs

  • Property file property classpath-catalogs

Load catalog files from the classpath. If this property is true, the resolver will search for all of the files named org/xmlresolver/catalog.xml on the classpath and add each of them to the end of the catalog list.
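The classpath search described above can be approximated with the standard ClassLoader.getResources method, which returns every matching resource rather than only the first. This is a minimal sketch with a hypothetical class name (ClasspathCatalogSketch). How the resolver actually performs the search is not shown here.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not part of the XML Resolver API.
final class ClasspathCatalogSketch {
    // Collect every org/xmlresolver/catalog.xml visible to the given class loader.
    static List<URL> findClasspathCatalogs(ClassLoader loader) throws IOException {
        return Collections.list(loader.getResources("org/xmlresolver/catalog.xml"));
    }

    public static void main(String[] args) throws IOException {
        for (URL catalog : findClasspathCatalogs(ClassLoader.getSystemClassLoader())) {
            System.out.println(catalog);
        }
    }
}
```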

Preference for public or system identifiers

  • Feature ResolverFeature.PREFER_PUBLIC (type: Boolean)

  • System property xml.catalog.prefer

  • Property file property prefer

The initial prefer setting, either public or system.

Obey oasis-xml-catalog processing instruction

  • Feature ResolverFeature.ALLOW_CATALOG_PI (type: Boolean)

  • System property xml.catalog.allowPI

  • Property file property allow-oasis-xml-catalog-pi

This setting allows you to toggle whether or not the resolver classes obey the <?oasis-xml-catalog?> processing instruction.

If you never use the processing instruction, you can get a very tiny performance improvement by disabling this feature. (If this feature is enabled, the parser has to create a copy of the resolver configuration for every parse.)

Support relative catalog paths

  • Property file property relative-catalogs

If relative-catalogs is true, relative filenames in the catalogs property list will be made absolute relative to the current working directory; otherwise they will be made absolute with respect to the base URI of the properties file from which they came.

This setting has no effect on catalogs loaded from the xml.catalog.files system property, which are always made absolute with respect to the current working directory.
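The difference between the two base locations can be seen with java.net.URI. The sketch below is only a model of the rule stated above. The class name, the method name, and the example paths in main are hypothetical.

```java
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical illustration; not part of the XML Resolver API.
final class RelativeCatalogSketch {
    // Choose the base URI according to the relative-catalogs setting,
    // then resolve the catalog filename against it.
    static URI makeAbsolute(String catalog, boolean relativeCatalogs,
                            URI workingDirectory, URI propertiesFile) {
        URI base = relativeCatalogs ? workingDirectory : propertiesFile;
        return base.resolve(catalog);
    }

    public static void main(String[] args) {
        URI cwd = Paths.get("").toAbsolutePath().toUri();
        URI props = URI.create("file:/etc/xmlresolver/xmlresolver.properties"); // example path
        System.out.println(makeAbsolute("catalogs/local.xml", true, cwd, props));
        System.out.println(makeAbsolute("catalogs/local.xml", false, cwd, props));
    }
}
```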

Cache documents

  • Features ResolverFeature.CACHE (type: ResourceCache), ResolverFeature.CACHE_DIRECTORY (type: String), ResolverFeature.CACHE_UNDER_HOME (type: Boolean)

  • System properties xml.catalog.cache, xml.catalog.cacheUnderHome

  • Property file properties cache, cache-under-home

The cache properties specify the directory in which the XML Resolver should attempt to cache files that fail to resolve locally. If, instead, one of the “cache under home” properties is set, the cache directory will default to $HOME/.xmlresolver/cache.

If the xmlresolver.offline system property is set, no documents will expire from the cache, regardless of their age.

Prefer property file values

  • Feature ResolverFeature.PREFER_PROPERTY_FILE (type: Boolean)

  • System property xml.catalog.preferPropertyFile

  • Property file property prefer-property-file

Prefer properties from the properties file. If a property file is loaded to configure the resolver and one of the properties in that file is also specified as a system property, the system property takes precedence. If you’d prefer to have the property file take precedence (as was the case in some earlier versions), set the “prefer property file” property to true.
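The precedence rule can be modeled in a few lines. This is a sketch under the assumption that each setting is looked up independently. The class and method names are hypothetical and do not come from the resolver.

```java
// Hypothetical illustration; not part of the XML Resolver API.
final class PrecedenceSketch {
    // Return the value that wins when a setting may come from both sources.
    // A null argument means the setting was not given in that source.
    static String effectiveValue(String systemValue, String fileValue,
                                 boolean preferPropertyFile) {
        String first = preferPropertyFile ? fileValue : systemValue;
        String second = preferPropertyFile ? systemValue : fileValue;
        return first != null ? first : second;
    }

    public static void main(String[] args) {
        // The system property wins by default.
        System.out.println(effectiveValue("system", "public", false));
        // The property file wins when prefer-property-file is true.
        System.out.println(effectiveValue("system", "public", true));
    }
}
```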

Use URI entries for system resolution

  • Feature ResolverFeature.URI_FOR_SYSTEM (type: Boolean)

  • System property xml.catalog.uriForSystem

  • Property file property uri-for-system

Ignore the distinction between system identifiers and URIs. The distinction between external identifiers (the public and system identifiers that are used in DTDs) and general URIs (as might be used to load a RELAX NG Grammar or XML Schema, for example) is not supported uniformly by the parser APIs. The Xerces XML Schema implementation, for example, uses the resolveEntity API to load XML Schema imports.

Ordinarily, system identifier resolution interrogates the system and public entries (and their related entries), but not the uri entries. If this property is true, the resolver will attempt to resolve system identifiers with uri entries (after attempting to resolve them with the system and public entries).
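The lookup order can be pictured with two plain maps standing in for the catalog entries. This is a deliberately simplified sketch: real catalogs also have public, rewrite, and suffix entries, which are omitted here, and all names in the code are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration; not part of the XML Resolver API.
final class UriForSystemSketch {
    // Try the system entries first; fall back to the uri entries
    // only when the uri-for-system setting is enabled.
    static String resolveSystem(String systemId,
                                Map<String, String> systemEntries,
                                Map<String, String> uriEntries,
                                boolean uriForSystem) {
        String hit = systemEntries.get(systemId);
        if (hit == null && uriForSystem) {
            hit = uriEntries.get(systemId);
        }
        return hit; // null means the catalog has no match
    }
}
```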

Merge http: and https: URI schemes

  • Feature ResolverFeature.MERGE_HTTPS (type: Boolean)

  • System property xml.catalog.mergeHttps

  • Property file property merge-https

Treat http: and https: URIs as equivalent for the purpose of resolution. The web used to be served over http: and many existing catalog files contain http: system identifiers. Today, the web is largely served over https: and many documents contain https: system identifiers. If this property is true, that distinction will be ignored during catalog lookup: http://example.com/sample.dtd will match https://example.com/sample.dtd.

Note: this has *no effect* on the URIs returned by the resolver or retrieved over the web. It only affects catalog lookup for system identifiers and URIs.
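One way to picture the comparison is to normalize the scheme on both sides before matching, while leaving the original URI untouched. The sketch below assumes that approach. The class and method names are hypothetical, and the resolver may implement the comparison differently.

```java
// Hypothetical illustration; not part of the XML Resolver API.
final class MergeHttpsSketch {
    // Produce the key used for comparison only; the original URI is never modified.
    static String lookupKey(String uri, boolean mergeHttps) {
        if (mergeHttps && uri.regionMatches(true, 0, "https:", 0, 6)) {
            return "http:" + uri.substring(6);
        }
        return uri;
    }

    // Compare a catalog entry with a requested URI under the merge-https setting.
    static boolean matches(String catalogUri, String requestedUri, boolean mergeHttps) {
        return lookupKey(catalogUri, mergeHttps).equals(lookupKey(requestedUri, mergeHttps));
    }
}
```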

Mask jar URIs

  • Feature ResolverFeature.MASK_JAR_URIS (type: Boolean)

  • System property xml.catalog.maskJarUris

  • Property file property mask-jar-uris

Don’t return jar: or classpath: URIs. Most entity resolver APIs are defined such that if resolution succeeds, the base URI of the resource returned is the base URI of the actual, local resource. This can greatly simplify things because subsequent relative URIs can be resolved against the local resource directly.

However, the Java URI class does not treat jar: or classpath: URI schemes as hierarchical, so any subsequent attempts to resolve relative URIs will fail. If this property is true, the local resource will be returned but the URI will be left unchanged. That may require a more complete catalog, but it will avoid a situation which is guaranteed to fail.
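The problem is easy to reproduce with java.net.URI alone. In the sketch below (the class name and the example paths are made up for illustration), the jar: URI is opaque, so resolving a relative reference against it simply hands the relative reference back, whereas the file: URI resolves as expected.

```java
import java.net.URI;

// Hypothetical illustration; the paths are examples only.
final class JarUriSketch {
    static URI resolveRelative(String base, String relative) {
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        String jar = "jar:file:/tmp/schemas.jar!/org/example/sample.dtd";
        String file = "file:/tmp/schemas/org/example/sample.dtd";

        // true: the jar: URI has no hierarchical path
        System.out.println(URI.create(jar).isOpaque());
        // common.ent: the relative reference comes back unresolved
        System.out.println(resolveRelative(jar, "common.ent"));
        // file:/tmp/schemas/org/example/common.ent
        System.out.println(resolveRelative(file, "common.ent"));
    }
}
```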

Catalog loader class

  • Feature ResolverFeature.CATALOG_LOADER_CLASS (type: String)

  • System property xml.catalog.catalogLoaderClass

  • Property file property catalog-loader-class

Specify the catalog loader class. The default catalog loader ignores any errors encountered when loading catalogs. This is convenient for production use, but can be frustrating because it may not be obvious when resolution fails, especially if your internet connection is fast. A typo in a catalog file can easily go unnoticed.

If the value org.xmlresolver.loaders.ValidatingXmlLoader is specified for this property, catalog files will be validated when they are loaded and the resolver will throw an exception for any validity errors encountered.

The validating loader depends on having version 20181222 of Jing on your classpath. (This is an optional dependency in the Maven distribution of the resolver, so you may have to add it by hand.)

Parse RDDL documents

  • Feature ResolverFeature.PARSE_RDDL (type: Boolean)

  • System property xml.catalog.parseRddl

  • Property file property parse-rddl

Attempt to resolve RDDL resources in namespace URI lookup. If the namespace resolver is used, if a nature and purpose are specified, and if the resource returned is an HTML document, the resolver will attempt to find the RDDL resource description for the requested namespace and resolve that URI.

For example, the following API call will return the XML Schema for XML:

resolveNamespaceURI("http://www.w3.org/XML/1998/namespace",
                    "http://www.w3.org/2001/XMLSchema",
                    "http://www.rddl.org/purposes#schema-validation");

Attempting to resolve RDDL resources requires extra processing. If you know it will never succeed you can disable it by setting this property to false.

Specify an alternate class loader

  • Feature ResolverFeature.CLASSLOADER (type: ClassLoader)

If you are using the resolver in an environment where the default class loader (getClass().getClassLoader()) will not return a useful class loader, you can specify an alternate loader with this feature.

Support catalog files in ZIP archives

This feature is new in XML Resolver 3.1.0. (Only available in a snapshot release at the time of writing.)

  • Feature ResolverFeature.ARCHIVED_CATALOGS (type: Boolean)

  • System property xml.catalog.archivedCatalogs

  • Property file property archived-catalogs

If archived catalogs are allowed, then you can place ZIP files of resources directly on the catalog path. The resolver will search inside the ZIP file for a catalog (/org/xmlresolver/catalog.xml or /catalog.xml) to use.
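Locating a catalog inside an archive can be sketched with java.util.zip. The order in which the two candidate names are tried is an assumption here, as are the class and method names. ZIP entry names carry no leading slash, so the candidates are written without one.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.zip.ZipFile;

// Hypothetical illustration; not part of the XML Resolver API.
final class ArchivedCatalogSketch {
    // Candidate entry names; the order is assumed for this sketch.
    private static final String[] CANDIDATES = {
        "org/xmlresolver/catalog.xml",
        "catalog.xml"
    };

    // Return the name of the first catalog entry found, or null if there is none.
    static String findCatalogEntry(Path archive) throws IOException {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            for (String name : CANDIDATES) {
                if (zip.getEntry(name) != null) {
                    return name;
                }
            }
        }
        return null;
    }
}
```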

Access the underlying catalog manager

  • Feature ResolverFeature.CATALOG_MANAGER (type: CatalogManager)

The CatalogManager class provides some lower-level methods for mapping to URIs without returning sources of any kind.

Logging

The resolver uses SLF4J to do logging, so it should be configurable by all the usual means. But some systems appear to make configuring logging selectively more difficult, so as a convenience, the resolver categorizes log messages and allows you to change the logging level for categories selectively.

Suppose, for example, you want to know more about how the cache is being managed. At the level of logging configuration, change how org.xmlresolver.cache.ResourceCache log messages are presented. If you can’t easily do that, change how the resolver logs messages related to caching: make them into warning messages instead of info messages, for example.

The log categories aren’t as fine grained as the class hierarchy. There are six categories:

request

Information related to the request.

response

Information related to the response.

config

Information related to configuration.

cache

Information related to caching.

warning

Reported warnings.

error

Reported errors.

Any of these categories can be reported as “debug”, “info”, or “warn” logging messages. This is configured with the system property xml.catalog.logging. That property is interpreted as a comma-delimited list of category:level pairs. For example, cache:warn,response:info would log cache messages as warnings and response messages as info.

If you prefer to specify this in a configuration file, use the catalog-logging property. Note, however, that this setting only applies if the system property is not set. (Because the logging class doesn’t have access to the resolver configuration, it can’t apply the usual defaulting rules.)
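To make the format of the category:level list concrete, the sketch below parses such a value into a map. It is an illustration only: the class and method names are hypothetical, and skipping malformed pairs is an assumption rather than documented resolver behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of the XML Resolver API.
final class LoggingSpecSketch {
    // Parse "category:level,category:level" into an ordered map.
    static Map<String, String> parse(String spec) {
        Map<String, String> levels = new LinkedHashMap<>();
        if (spec == null) {
            return levels;
        }
        for (String pair : spec.split(",")) {
            String[] parts = pair.trim().split(":", 2);
            // Pairs without both a category and a level are skipped (an assumption).
            if (parts.length == 2 && !parts[0].isEmpty() && !parts[1].isEmpty()) {
                levels.put(parts[0].trim(), parts[1].trim());
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(parse(System.getProperty("xml.catalog.logging")));
    }
}
```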

Example xmlresolver.properties file

An xmlresolver.properties file might look like this:

# xmlresolver.properties

relative-catalogs=yes

# Always use semicolons in this list
catalogs=./catalog.xml;/Users/ndw/Documents/catalog.xml

prefer=public
cache=/Users/ndw/Library/Caches/xmlresolver.org/cache
allow-oasis-xml-catalog-pi=no
prefer-property-file=false
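Because this is a standard Java properties file, it can be inspected with java.util.Properties. The sketch below loads a few of the lines above from a string and prints each catalog entry. The class name is hypothetical, and the code only reads the file format; it does not configure a resolver.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical illustration; not part of the XML Resolver API.
final class PropertiesFileSketch {
    // Parse properties-file text with the standard library.
    static Properties load(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String text = String.join("\n",
                "# xmlresolver.properties",
                "relative-catalogs=yes",
                "catalogs=./catalog.xml;/Users/ndw/Documents/catalog.xml",
                "prefer=public");
        Properties props = load(text);
        // The catalogs value is a semicolon-delimited list.
        for (String catalog : props.getProperty("catalogs").split(";")) {
            System.out.println(catalog);
        }
    }
}
```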