Introduction
After releasing a first implementation of the EXPath Packaging System for eXist, here is a version for Saxon. You can read this previous blog entry to get more information on the packaging system; in particular, it says: "The concept is quite simple: defining a package format to enable users to install libraries in their processor with just a few clicks, and to enable library authors to provide a single package to be installed on every processors, without the need to document (and maintain) the installation process for each of them."
The package manager for Saxon is a graphical application (a textual front-end will be provided soon) and is provided as a single JAR file. Go to the implementations page, or use the following direct link to get the JAR. Run it as usual, for instance by double-clicking on it or by executing the command java -jar expath-pkg-saxon-0.1.jar. That will launch the package manager window.
Repositories
The implementation for Saxon differs from the one for eXist in a fundamental way: Saxon does not have a home directory where you can put the installed packages, and you can invoke Saxon in so many different ways (while the eXist core is always started the same way). That involves two different aspects regarding package management with Saxon: the package manager itself, which installs and removes packages, and a way to configure Saxon itself, regardless of the way you invoke it. In addition, because Saxon has no home directory, we need to introduce the concept of a package repository.
A repository is a directory dedicated to installing packages, and should only be modified through the package manager. It contains the packages themselves (in a form usable by Saxon) as well as the administrative information needed to use them (like catalogs, etc.). The graphical package manager allows you to create a new repository directly from the graphical interface, as well as to switch between different repositories (if you need to maintain several repositories for several purposes).
Importing stylesheet
But as I said above, having a repository full of packages is not enough. You have to configure Saxon to use this repository. Because you can invoke Saxon in plenty of ways, the configuration itself is implemented as a Java helper class that you can use in your own code if you invoke Saxon from within Java (for instance in a Java EE web application). If you use Saxon from the command line, there is a script that takes care of configuring everything for you.
But before looking in detail at how to configure Saxon to use a repository, let's have a look at how a stylesheet can use an installed package. This is the whole point of the packaging system, after all. The goal is simply to be able to use a public import URI in an import statement, this URI being automatically resolved to its local copy in the repository. Just as a namespace URI is only a kind of identifier (it is just used as a string, your processor does not try to actually access anything at that address), the public import URI is an identifier for a specific stylesheet. This mechanism also supports functions implemented in Java. So all you need to do is to use this public URI, like the following:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:h="http://www.example.org/hello"
                    version="2.0">

       <xsl:import href="http://www.example.org/hello.xsl"/>

       <xsl:template ...>
          ...
          <xsl:value-of select="h:hello('world')"/>
For XQuery, this is a bit different as XQuery does have a module system. But this is actually very similar. XQuery library modules are identified by their namespace URI. Once again, it can be seen as a public identifier for that XQuery module. So let's say we have an XQuery library module for the namespace URI http://www.example.org/hello, then you can simply write a module that imports it as follows:
    import module namespace h = "http://www.example.org/hello";

    h:hello('world')
And that's it! In the package samples section below, you can see complete examples of such importing stylesheets and queries, as well as the packages they use.
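To make the idea of resolving a public import URI to a local copy more concrete, here is a minimal sketch using the standard JAXP URIResolver interface. It is only an illustration of the concept, not how EXPath Pkg is implemented (the real repository relies on its own administrative files, such as catalogs). The class name PublicUriResolver, the register method and the path repo/hello/hello.xsl are hypothetical.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical illustration: maps public import URIs to local files. */
public class PublicUriResolver implements URIResolver {

    // public import URI -> local copy (assumed mapping, for illustration only)
    private final Map<String, File> mapping = new HashMap<>();

    /** Declares that the given public URI is available as a local file. */
    public void register(String publicUri, File localCopy) {
        mapping.put(publicUri, localCopy);
    }

    @Override
    public Source resolve(String href, String base) {
        File local = mapping.get(href);
        // returning null lets the processor fall back to its default resolution
        return local == null ? null : new StreamSource(local);
    }

    public static void main(String[] args) {
        PublicUriResolver resolver = new PublicUriResolver();
        // hypothetical location of the local copy
        resolver.register("http://www.example.org/hello.xsl",
                          new File("repo/hello/hello.xsl"));
        Source src = resolver.resolve("http://www.example.org/hello.xsl", null);
        // prints the file: URI of the local copy
        System.out.println(src.getSystemId());
    }
}
```

Such a resolver could be set on a JAXP TransformerFactory with setURIResolver, so that xsl:import with the public URI loads the local file. The public URI is used purely as a key: nothing is ever fetched from that address.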
Java configuration
To configure Saxon to use a repository from Java, you need to get a Configuration object. This is a central class in Saxon, which is used almost everywhere in the Saxon code base. You can get it from a Saxon TransformerFactory or from a S9API Processor. With that object on the one hand, and a File object pointing to the repository directory on the other hand, you can just call:
    // the repo directory
    File repo = ...;
    // the Saxon config object
    Configuration config = ...;
    // the EXPath Pkg configurer
    ConfigHelper helper = new ConfigHelper(repo);
    // actually configure Saxon
    helper.config(config);
Besides the Java code itself, you have to be sure 1/ to have an actual repository at the location you pass to the ConfigHelper constructor and 2/ to have the JAR files containing the extension functions written in Java, and the JAR files they use, on your classpath. The only exception to this rule is when you register such an extension function (written in Java) with Saxon 9.2; in this case EXPath Pkg will try to dynamically add the JAR files from the repository to the classpath. But playing with the classpath at runtime is not something I would recommend in Java.
Shell script
When using Saxon from the command line, EXPath Pkg comes with an alternate class to launch Saxon (this class automatically uses ConfigHelper to configure Saxon) as well as with a shell script to launch Saxon with the correct classpath.
To use this shell script (only available on Unix-like systems for now, including Cygwin under Windows) you have to set the environment variables SAXON_HOME to the directory where you put the Saxon JAR files, EXPATH_PKG_JAR to the EXPath Pkg JAR file, and APACHE_XML_RESOLVER_JAR to the XML Resolver JAR file from Apache. Additionally, you can set EXPATH_REPO to the repository directory, so that you do not have to give it explicitly as an option each time you invoke Saxon. If all the above environment variables have been correctly set, and the script added to your PATH, you can just invoke Saxon as usual: saxon -s:source.xml -xsl:stylesheet.xsl.
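Before running the script for the first time, it can be handy to check that the environment is complete. The following is a small sketch of such a check, written in Java to stay consistent with the other examples; the class EnvCheck and its missing method are hypothetical and not part of EXPath Pkg. Only the variable names come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: checks the variables the shell script relies on. */
public class EnvCheck {

    // the three variables that must be set for the script to work
    static final String[] REQUIRED = {
        "SAXON_HOME", "EXPATH_PKG_JAR", "APACHE_XML_RESOLVER_JAR"
    };

    /** Returns the names of the required variables that are unset or empty. */
    public static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        List<String> unset = missing(env);
        if (unset.isEmpty()) {
            System.out.println("All required variables are set.");
        } else {
            System.out.println("Missing: " + unset);
        }
        // EXPATH_REPO is optional: without it, the repository has to be
        // given on the command line each time
        if (env.get("EXPATH_REPO") == null) {
            System.out.println("EXPATH_REPO is not set; pass the repository as an option instead.");
        }
    }
}
```

The check takes the environment as a Map rather than calling System.getenv() directly, which makes it easy to try out with a hand-built map.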
Use saxon --help to get the usage help of this script. You can set the EXPath repository (and thus override EXPATH_REPO if it is set) with the option --repo=. You can add items to the classpath with the option --add-cp=. You can set the classpath (thus overriding SAXON_HOME and the other environment variables) with the option --cp=. The script detects whether Saxon SA is present, and if so uses the SA version. You can force either the B or the SA version with --b or --sa respectively. You can also pass any option to the Java Virtual Machine by using --java=, for instance to set a system property, and use --mem= to set the amount of memory of the virtual machine (a shortcut for the Java option -Xmx). And finally, you can also set the HTTP and HTTPS proxy information with --proxy=host:port (for instance --proxy=proxyhost:8080).
Package samples
The first example is a packaged version of Priscilla Walmsley's FunctX. This package contains both the XSLT and the XQuery versions of this library. Of course, the XQuery module defines a module namespace, but the XSLT stylesheet does not have any public import URI (as this is beyond the standard). I chose the URI http://www.functx.com/functx-1.0.xsl, but keep in mind this is not official by any means, this is just the URI I chose. It is intended that library authors package their own libraries and choose the public URIs themselves.
The package itself is a plain ZIP file. If you open it or unzip it with your preferred tool, you can see that at the top level, there is a file named expath-pkg.xml. This is the package descriptor, which defines what the package contains (at least what is publicly exported from the package, that is, what can be used from within a stylesheet or a query). In the case of this FunctX package, this descriptor looks like:
    <package xmlns="http://expath.org/mod/expath-pkg">
       <module version="1.0" name="functx">
          <title>FunctX library for XQuery 1.0 and XSLT 2.0</title>
          <xsl>
             <import-uri>http://www.functx.com/functx-1.0.xsl</import-uri>
             <file>functx-1.0-doc-2007-01.xsl</file>
          </xsl>
          <xquery>
             <namespace>http://www.functx.com</namespace>
             <file>functx-1.0-doc-2007-01.xq</file>
          </xquery>
       </module>
    </package>
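If you are curious about what a package exports before installing it, a descriptor like this one can be read with nothing but the JDK. The following is a rough sketch, not part of EXPath Pkg: the class DescriptorReader and its methods are hypothetical, and it only assumes what is stated above, namely that the package is a ZIP file with expath-pkg.xml at the top level and that the descriptor uses the namespace http://expath.org/mod/expath-pkg.

```java
import java.io.File;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: lists what a package descriptor exports. */
public class DescriptorReader {

    static final String PKG_NS = "http://expath.org/mod/expath-pkg";

    /** Returns the text of every descriptor element with the given local name. */
    public static List<String> values(InputStream descriptor, String localName)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // the descriptor elements are in PKG_NS
        Document doc = factory.newDocumentBuilder().parse(descriptor);
        NodeList nodes = doc.getElementsByTagNameNS(PKG_NS, localName);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    /** Same as values(), reading expath-pkg.xml from the top level of the ZIP. */
    public static List<String> valuesFromPackage(File pkg, String localName)
            throws Exception {
        try (ZipFile zip = new ZipFile(pkg)) {
            ZipEntry entry = zip.getEntry("expath-pkg.xml");
            if (entry == null) {
                throw new IllegalArgumentException("No expath-pkg.xml in " + pkg);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return values(in, localName);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File pkg = new File(args[0]); // path to the downloaded package
        System.out.println("XSLT import URIs:  " + valuesFromPackage(pkg, "import-uri"));
        System.out.println("XQuery namespaces: " + valuesFromPackage(pkg, "namespace"));
    }
}
```

Run against the FunctX package, this should list the import URI and the namespace shown in the descriptor above, which are exactly the two identifiers used in the test stylesheet and query.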
To install the package, just download it to a temporary location, launch the package manager as explained at the beginning of this blog post, choose "install" in the file menu, and choose the package on your filesystem. To test if it is correctly installed, write the following stylesheet:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:f="http://www.functx.com"
                    version="2.0">

       <xsl:import href="http://www.functx.com/functx-1.0.xsl"/>

       <xsl:template match="/" name="main">
          <result>
             <xsl:sequence select="f:date(1979, 9, 1)"/>
          </result>
       </xsl:template>

    </xsl:stylesheet>
and/or the following XQuery main module (depending on what you want to test):
    import module namespace f = "http://www.functx.com";

    <result> {
       f:date(1979, 9, 1)
    }
    </result>
To evaluate them, make sure you configured the shell script correctly, as explained above, then open a shell and type one of the following commands (or both), where style.xsl is the file where you saved the above stylesheet and query.xq is the file where you saved the above query:
    $ saxon -xsl:style.xsl -it:main
    <result>1979-09-01</result>
    $ saxon --xq query.xq
    <result>1979-09-01</result>
    $
If you prefer to test from Java, just write a simple main class that evaluates the above stylesheet and/or query, taking care to use ConfigHelper to set up the Saxon Configuration object. For instance, if you want to use the S9API, you can configure the Processor object as follows (don't forget to add the EXPath Pkg and the Apache XML resolver JAR files to your classpath):
    // the repo directory
    File repo = new File("...");
    // the EXPath Pkg configurer
    ConfigHelper helper = new ConfigHelper(repo);
    // the Saxon processor
    Processor proc = new Processor(false);
    // actually configure Saxon
    helper.config(proc.getUnderlyingConfiguration());
    // then use 'proc' as usual...
The second sample package provides a single function: ext:hello($who). It is written in Java. Besides other stuff related to the packaging itself, it contains a JAR file with the implementation of that extension function. To test it, just follow the same steps as for the FunctX package, except that you have to add the installed JAR file (from within the repository) to your classpath (this is done automatically for you if you use the shell script, but not if you test it from a Java program).
Conclusion
This is just a prototype implementation of a package manager for Saxon, which is consistent with the one for eXist. The main issue is the configuration of the classpath, but I think this is better left to the user than handled automatically, in particular within the context of a Java EE application. This issue also shows up in your IDE configuration. For now, I configure oXygen by adding the catalogs from the repository to oXygen's main catalog list, and the extension JAR files to the oXygen classpath, so the built-in Saxon processors can be used exactly as usual. But such issues can be resolved by native support right in the processors and IDEs.
Besides this classpath issue, I am convinced that package management will really improve the current situation, and could be the missing piece needed to distribute real general-purpose libraries for XQuery and XSLT, as well as one of the bases for other systems, like an implementation-independent XRX system.
Posted by Florent Georges, on 2009-10-02T11:12:00, tags: expath, saxon, xquery and xslt.