Introduction
Writing an extension for Calabash in Java involves three different things: 1/ the Java class itself, which has to implement the interface XProcStep, 2/ binding a step name to the implementation class, and 3/ declaring the step in XProc.
Java
Let's take, as an example, a step evaluating a query using the standalone BaseX processor.
The goal is not to have a fully functional step, nor to have a best-quality-ever step with error reporting and such, but rather to emphasize how to glue all the things together. The step has one input port, named source, and one output port, named result. The step gets the string value of the input port (typically a c:query element) and evaluates it as an XQuery, using BaseX. The result is parsed as an XML document and sent to the output port (it is a parse error if the result of the query is not an XML document or element). Let's start with the Java class implementing the extension step:
    /****************************************************************************/
    /*  File:       BasexStandaloneQuery.java                                   */
    /*  Author:     F. Georges - H2O Consulting                                 */
    /*  Date:       2011-08-31                                                  */
    /*  Tags:                                                                   */
    /*      Copyright (c) 2011 Florent Georges.                                 */
    /* ------------------------------------------------------------------------ */

    package org.fgeorges.test;

    import com.xmlcalabash.core.XProcException;
    import com.xmlcalabash.core.XProcRuntime;
    import com.xmlcalabash.io.ReadablePipe;
    import com.xmlcalabash.io.WritablePipe;
    import com.xmlcalabash.library.DefaultStep;
    import com.xmlcalabash.runtime.XAtomicStep;
    import java.io.StringReader;
    import javax.xml.transform.Source;
    import javax.xml.transform.stream.StreamSource;
    import net.sf.saxon.s9api.DocumentBuilder;
    import net.sf.saxon.s9api.SaxonApiException;
    import net.sf.saxon.s9api.XdmNode;
    import org.basex.core.BaseXException;
    import org.basex.core.Context;
    import org.basex.core.cmd.XQuery;

    /**
     * Sample extension step to evaluate a query using BaseX.
     *
     * @author Florent Georges
     * @date   2011-08-31
     */
    public class BasexStandaloneQuery
            extends DefaultStep
    {
        public BasexStandaloneQuery(XProcRuntime runtime, XAtomicStep step)
        {
            super(runtime,step);
        }

        @Override
        public void setInput(String port, ReadablePipe pipe)
        {
            mySource = pipe;
        }

        @Override
        public void setOutput(String port, WritablePipe pipe)
        {
            myResult = pipe;
        }

        @Override
        public void reset()
        {
            mySource.resetReader();
            myResult.resetWriter();
        }

        @Override
        public void run()
                throws SaxonApiException
        {
            super.run();

            XdmNode query_doc = mySource.read();
            String query_txt = query_doc.getStringValue();
            XQuery query = new XQuery(query_txt);
            Context ctxt = new Context();
            // TODO: There should be something more efficient than serializing
            // everything and parsing it again...  Plus, if the result is not an XML
            // document, wrap it into a c:data element.  But that's beyond the point.
            String result;
            try {
                result = query.execute(ctxt);
            }
            catch ( BaseXException ex ) {
                throw new XProcException("Error executing a query with BaseX", ex);
            }
            DocumentBuilder builder = runtime.getProcessor().newDocumentBuilder();
            Source src = new StreamSource(new StringReader(result));
            XdmNode doc = builder.build(src);

            myResult.write(doc);
        }

        private ReadablePipe mySource = null;
        private WritablePipe myResult = null;
    }
An extension step has to implement the Calabash interface XProcStep. Calabash provides a convenient class DefaultStep that implements all the methods with default behaviour, good for most usages. The only thing we have to do is to save the input and output for later use, and to reset them in case the step object is reused. And of course to provide the main processing in run(). In the run() method, we read the document from the source port, get its string value, execute it as a query using the BaseX API, and parse the result as XML to write it to the result port.
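The TODO comment in run() mentions wrapping a non-XML result into a c:data element instead of failing with a parse error. Here is a minimal sketch of that idea, using only the JDK so it can be tried without Calabash or BaseX. The class ResultWrapper and its methods are hypothetical names made up for this illustration, and the text/plain content type is an assumption; none of this is part of the Calabash or BaseX API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class ResultWrapper {

    // Namespace of the XProc 1.0 step vocabulary (the c: prefix).
    static final String C_NS = "http://www.w3.org/ns/xproc-step";

    // True if the text parses as a well-formed XML document.
    static boolean isWellFormedXml(String text) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException ex) {
            return false;
        }
    }

    // Escape the characters that are significant in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Return the result unchanged if it is XML, else wrap it in c:data.
    static String toXml(String result) {
        if (isWellFormedXml(result)) {
            return result;
        }
        return "<c:data xmlns:c=\"" + C_NS + "\" content-type=\"text/plain\">"
                + escape(result) + "</c:data>";
    }

    public static void main(String[] args) {
        System.out.println(toXml("<res>2</res>")); // already XML, left as is
        System.out.println(toXml("1 < 2"));        // not XML, gets wrapped
    }
}
```

In the step above, the string returned by toXml() could be handed to the Saxon DocumentBuilder in place of the raw query result. It still parses the result twice in the XML case, so it does nothing about the efficiency concern from the same TODO.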
As you can see, there is nothing in the class itself about the interface of the step: its type name, its inputs and outputs, its options, etc. This is done in two different places. First you link the step type to the implementation class, then you declare the step with XProc.
Tell Calabash about the class
Linking the step type to the implementation class is done in a Calabash config file. So you have to create a new config file, and pass it to Calabash on the command line with the option --config (or -c for short). The file itself is very simple, and links the step type (a QName) to the class (a fully qualified Java class name):
    <xproc-config xmlns="http://xmlcalabash.com/ns/configuration"
                  xmlns:fg="http://fgeorges.org/ns/tmp/basex">

       <implementation type="fg:ad-hoc-query"
                       class-name="org.fgeorges.test.BasexStandaloneQuery"/>

    </xproc-config>
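To make the binding concrete, the following sketch reads such a config with the JDK's DOM parser and prints each step type, expanded to a full QName, next to its class name. It only illustrates what the file expresses; it is not how Calabash itself loads its configuration. ConfigBindings, read() and SAMPLE are hypothetical names introduced here.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader, for illustration only.
class ConfigBindings {

    static final String CONFIG_NS = "http://xmlcalabash.com/ns/configuration";

    // The config file from the article, as a string.
    static final String SAMPLE =
          "<xproc-config xmlns='http://xmlcalabash.com/ns/configuration'"
        + "              xmlns:fg='http://fgeorges.org/ns/tmp/basex'>"
        + "  <implementation type='fg:ad-hoc-query'"
        + "                  class-name='org.fgeorges.test.BasexStandaloneQuery'/>"
        + "</xproc-config>";

    // Map each declared step type (as an expanded QName) to its class name.
    static Map<QName, String> read(String configXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            Map<QName, String> bindings = new LinkedHashMap<>();
            NodeList impls = doc.getElementsByTagNameNS(CONFIG_NS, "implementation");
            for (int i = 0; i < impls.getLength(); i++) {
                Element impl = (Element) impls.item(i);
                bindings.put(resolve(impl, impl.getAttribute("type")),
                             impl.getAttribute("class-name"));
            }
            return bindings;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Cannot read the config", ex);
        }
    }

    // Resolve a lexical QName using the namespaces in scope on the element.
    static QName resolve(Element scope, String lexical) {
        int colon = lexical.indexOf(':');
        if (colon < 0) {
            return new QName(lexical); // no prefix: no namespace
        }
        String prefix = lexical.substring(0, colon);
        String uri = scope.lookupNamespaceURI(prefix);
        if (uri == null) {
            throw new IllegalArgumentException("Undeclared prefix: " + prefix);
        }
        return new QName(uri, lexical.substring(colon + 1));
    }

    public static void main(String[] args) {
        read(SAMPLE).forEach((type, cls) -> System.out.println(type + " -> " + cls));
    }
}
```

The point to notice is that the prefix fg is only a local shorthand: what identifies the step is the namespace URI plus the local name, so the config file and the XProc library must agree on the URI, not on the prefix.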
Declare the step
Finally, declaring the step in XProc is done using the standard p:declare-step. If it contains no subpipeline (that is, if it contains only p:input, p:output and p:option children), then it is considered a declaration of a step whose implementation is somewhere else; if it contains a subpipeline, then it is a step type definition, with the implementation defined in XProc itself. The declaration can be copied and pasted into the main pipeline itself, but as with any other language, the best practice is rather to declare it in an XProc library and to import this library (composed only of step declarations) within the main pipeline using p:import. In our case, we define the step type to have an input port source and an output port result (both primary), without any option:
    <p:library xmlns:p="http://www.w3.org/ns/xproc"
               xmlns:fg="http://fgeorges.org/ns/tmp/basex"
               xmlns:pkg="http://expath.org/ns/pkg"
               pkg:import-uri="http://fgeorges.org/tmp/basex.xpl"
               version="1.0">

       <p:declare-step type="fg:ad-hoc-query">
          <p:input  port="source" primary="true"/>
          <p:output port="result" primary="true"/>
       </p:declare-step>

    </p:library>
Using it
Now that we have all the pieces, we can write an example main pipeline using this new extension step:
    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                    xmlns:c="http://www.w3.org/ns/xproc-step"
                    xmlns:fg="http://fgeorges.org/ns/tmp/basex"
                    name="pipeline"
                    version="1.0">

       <p:import href="basex-lib.xpl"/>

       <p:output port="result" primary="true"/>

       <fg:ad-hoc-query>
          <p:input port="source">
             <p:inline>
                <c:query>
                   &lt;res> { 1 + 1 } &lt;/res>
                </c:query>
             </p:inline>
          </p:input>
       </fg:ad-hoc-query>

    </p:declare-step>
To run it, just issue the following command on the command line (where basex-steps.jar is the JAR file you compiled the extension step class into):
    > java -cp ".../calabash.jar:.../basex-6.7.1.jar:.../basex-steps.jar" \
           com.xmlcalabash.drivers.Main \
           -c basex-config.xml \
           example.xproc
If you use this script, you can then use the following command (update: the script can now be found on GitHub):
    > calabash ++add-cp .../basex-6.7.1.jar \
               ++add-cp .../basex-steps.jar \
               -c basex-config.xml \
               example.xproc
Packaging
Update: The mechanism described in this section has been implemented since then, see this blog entry.
If you want to publicly distribute your extension, you have to provide your users with 1/ the JAR file, 2/ the config file and 3/ the library file. Thus the user needs to correctly configure Java with the JAR file, to correctly configure Calabash with the config file, and to use a suitable URI in the p:import/@href in his/her pipeline. This is a lot of different places where the user can make a mistake.
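The first of those three places, the classpath, can be checked before Calabash is started at all. The sketch below only tests whether a class name can be loaded from the current classpath; ClasspathCheck is a hypothetical helper written for this post, not something shipped with Calabash.

```java
// Hypothetical diagnostic, for illustration only.
class ClasspathCheck {

    // True if the named class can be loaded from the current classpath.
    static boolean isLoadable(String className) {
        try {
            // 'false' skips static initializers: we only test visibility.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError ex) {
            // LinkageError covers a missing superclass, e.g. when the JAR
            // providing DefaultStep is not on the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.fgeorges.test.BasexStandaloneQuery";
        System.out.println(name + (isLoadable(name) ? " is" : " is NOT")
                + " loadable from the classpath");
    }
}
```

Run it with the same -cp value as the one passed to Calabash, plus the directory containing ClasspathCheck. It says nothing about the other two places, the config file and the p:import URI.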
The EXPath Packaging open-source implementation for Calabash does not support Java extension steps yet, but it is planned to support them, in order to handle that configuration part automatically. The goal is to have the library author define an absolute URI for the XProc library (declaring the steps), which the user uses in p:import, regardless of where the library is actually installed (the URI will be resolved automatically). The details (classpath setting, XProc library resolving, and Calabash config) should then be handled by the packaging support. Once the package of the extension step has been installed in the repository, one can then execute the following pipeline (note the import URI has changed):
    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                    xmlns:c="http://www.w3.org/ns/xproc-step"
                    xmlns:fg="http://fgeorges.org/ns/tmp/basex"
                    name="pipeline"
                    version="1.0">

       <p:import href="http://fgeorges.org/tmp/basex.xpl"/>

       <p:output port="result" primary="true"/>

       <fg:ad-hoc-query>
          <p:input port="source">
             <p:inline>
                <c:query>
                   &lt;res> { 1 + 1 } &lt;/res>
                </c:query>
             </p:inline>
          </p:input>
       </fg:ad-hoc-query>

    </p:declare-step>
by invoking simply the following command:
> calabash example.xproc
Posted by Florent Georges, on 2011-09-04T22:46:00, tags: basex, calabash, expath and xproc.