XForms over REST with MarkLogic

Researching for an upcoming MarkLogic project, I've been looking at the REST interface. I'd set one up last month on the laptop so that I could test whether we could push data in from our Apache Camel engine. That worked nicely, and the thought occurred to me that I might be able to put an XForms interface directly over the top of the REST endpoints – a 'one-tier' application, so to speak.

For my demo, I took the idea of a herd book rather than the canonical name/address game.
For those who don't keep animals, a herd book is simply a database of the animals you have and what happens to them in terms of breeding, medicines etc. The very simplest 'toy' one you might need for alpacas might be as shown here.

Setup: the same-origin issue.

To set up for my experiment, all I did was alter my working REST interface in the MarkLogic admin so that the 'modules-database' was on the file-system. That gave me somewhere to put my XForm and the xsltforms library to drive it. I could of course have created a new Application Server on a different port, but that would have put the REST endpoint in a different domain and hence triggered same-origin security.

In a real application you'd use a proxy or CORS headers or some such, but for this it was simply easiest to run it all together (this came back to bite me later with search, though).

Creating documents via POST

First stop, now my form had somewhere to live, was to get it to store a document. In REST, the endpoint for this lives at /v1/documents, where you can PUT a document into the system at an address URI. However, that requires you to create URIs, and often you don't really care what a new document is actually called; you just need a unique name. MarkLogic has a nifty solution to this: you can POST a document instead and get your URI generated for you in a directory you provide (you can read about it here). In essence, you POST the data and you get a header back saying what it got called. Translated into XForms, that gives this sort of <submission/>:

 <xf:submission id="newanimal"
 resource="v1/documents?extension=xml&amp;directory=/my/herdbook/"
 method="post" replace="none" separator="&amp;" ref="instance('animal')">
 <xf:action ev:event="xforms-submit-done">
     <xf:setvalue ref="instance('control')/uri"
 value="substring-after(event('response-headers'),'=')" />
     <xf:message>
         <xf:output value="event('response-reason-phrase')" />
     </xf:message>
 </xf:action>
 </xf:submission>

Here I POST my form data, in the 'animal' instance, to REST and give it the target directory. I then use the 'xforms-submit-done' event to read the returned header via the event mechanism and put it in a 'control' instance, so I know the URI for editing. Success!
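The substring-after() call above pulls the generated URI out of the response header. The same extraction in plain JavaScript looks like this (the header value in the test comment is an illustrative example, not captured MarkLogic output):

```javascript
// Sketch: extract the generated document URI from a header such as
// "/v1/documents?uri=/my/herdbook/123.xml" (illustrative value).
// Mirrors the XPath substring-after(header, '=') used in the form above.
function uriFromHeader(header) {
  var idx = header.indexOf('=');
  return idx === -1 ? '' : header.substring(idx + 1);
}
```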

Updating documents via PUT

The simplest way to update a document is often just to write over it with a PUT to the same URI. That's fine for documents if they're small enough, but it's worth saying that the MarkLogic REST interface also supports partial updates to both the document and the metadata, which might be better in some cases. Getting back to this form though, now I have a URI, if I update a document I can simply replace the data with the following <submission/>:

<xf:submission id="upcase"
 method="put" replace="none" separator="&amp;" ref="instance('animal')">
<xf:resource value="concat('v1/documents?uri=',instance('control')/uri)"/>
<xf:action ev:event="xforms-submit-done">
    <xf:message>
        <xf:output value="'Updated'" />
    </xf:message>
</xf:action>
</xf:submission>

This is straightforward apart from the fact that I have to use an in-line <xf:resource/> element to construct the URL with the URI I stored earlier. [Note to self: AVT?]
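For reference, the same URL construction in plain JavaScript, with URI-encoding added (the herd-book URI in the test is illustrative):

```javascript
// Sketch: mirror of concat('v1/documents?uri=', $uri) from the submission,
// with the stored document URI percent-encoded for safety.
function buildUpdateUrl(uri) {
  return 'v1/documents?uri=' + encodeURIComponent(uri);
}
```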

Searching documents

The last step in this little demo is getting lists of animals, searching etc. This bit took the most time, due to a problem with how search works. In MarkLogic, the modules-database gets used to store all the search options. Now this is great; you can pre-set a load of search templates, give them a name and put that in the search URL in 'options'. However, if you've hijacked the modules database for your code, well, you can't. This is a pain because, in my case, it doesn't allow me to override the snippet settings to get particular fields back. I thought that was that, until I found out that ML7 has the ability to send <options/> with a Structured Query (thanks to @adamfowleruk).

Turns out all I needed for search was to POST a Structured Query format <instance/> using the accompanying <submission/> like so:

<xf:instance id="squery">
  <search xmlns="http://marklogic.com/appservices/search">
    <query/>
    <options>
      <return-metrics>false</return-metrics>
      <transform-results apply="metadata-snippet">
        <preferred-elements>
          <element name='name' ns=''/>
          <element name='dob' ns=''/>
          <element name='colour' ns=''/>
        </preferred-elements>
      </transform-results>
    </options>
  </search>
</xf:instance>

<xf:submission id="list-all" action="v1/search?directory=/my/herdbook"
  method="post"
  ref="instance('squery')"
  replace="instance"
  instance="results"
  omit-xml-declaration="yes" />

That gets me all the alpacas with their name, colour and dob so I can display them in a list. Note in this example the <query/> section is empty, as I want every alpaca. In a real app, of course, I could add something like a range query to get a particular animal linked to a search box. A bit of judicious code in some triggers and I can paginate using header data from the reply as well.


Lastly, there's an <action/> in the list so that each row in the results triggers a fetch of that document, so I can edit it if I want. Given the URI is in the search results, I simply need to copy it over to become my 'current' URI and send the instance using GET to /v1/documents. At the moment I send category=content (also the default) as I want the document, but I could easily switch the category to start working with the metadata instead.

More to do?

That's pretty much it so far. I haven't had time to work this up into a full example yet, but it seems a very handy way to quickly put an XForms interface over REST, even if there are compromises, such as making the search a little harder.
Of course I've used MarkLogic for this, but in theory it's an approach that could work with any REST system, as long as it will take POST data in XML.
As to whether this is a viable way to write an application? I’d say probably yes for a prototype or something simple, especially as you get the REST endpoints thrown in for free.

Reading usb serial data with Apache Camel

I've done a couple of posts on using Apache Camel as part of my home-monitoring system, and why I think it's a good fit as an IoT switchboard. However, one of the flaws in my master plan is that there isn't a component for serial communications (Camel MINA does socket comms, but not serial). Note I'm not talking actual RS232 here, but the version that comes in USB form. I'm using mostly JeeNodes, which talk RF12 between them, but even in this modern era, without a Wi-Fi shield or some similar way to get onto the LAN and TCP/IP, at some point you're talking to a serial port on some box into which a controlling device – in my case a JeeLink – is plugged.

Now, I've got around this like I guess most people have: by creating a serial-to-TCP/IP bridge in code and getting on with life. But it isn't, well, elegant, and it bugs the engineer in me. It would be nice to link directly to Camel and use the considerable advantages it brings, like throttling and queues and all the other goodies. But sadly, my Java isn't up to creating modules, and it's a fairly niche use-case. So, up to now, I'd pretty much given up on it.

What I hadn't realised was that the Stream component not only lets you talk to the standard input and output streams, but also to files and URLs. Reading further, Stream has support for reading a file continually, as if you were doing 'tail -f' on it. Now in Unix, famously, "everything is a file", so in theory you should be able to read a serial device as if it were a file, and if you can, Stream should be able to as well. Cue 'Road to Damascus' moment.

For my test I quickly grabbed a JeeNode and plugged it into my MacBook. Then I wrote a short sketch that spat out a clock to the serial port:

#include <stdio.h>
#include <Time.h>

char buffer[9] = "";

void setup()
{
    Serial.begin(9600); // Tried 19200 but didn't work!
    setTime(0,0,0,0,0,0); // reset starting time
}

void loop()
{
    digitalClockDisplay();
    delay(5000);
}

void digitalClockDisplay(){
    // Send a whole line at a time
    sprintf(buffer, "%02d:%02d:%02d", hour(), minute(), second());
    Serial.println(buffer);
}

I could see it running in the Serial Monitor in Arduino, and I could see the device in /dev, but nothing if I tried "tail -f /dev/cu.usbserial-A900acSz".

Turns out the default baud rate is 9600 on Macs, and you can't alter it using Unix stty commands either! After reading a whole lot of internet, it seemed that, at least for my demo, I'd just have to stick to 9600 or get pulled into a whole pile of Mac-mess. Also note, at least on a Mac, I got both /dev/cu.* and /dev/tty.* devices. Only the cu.* ones seemed to work. I think this is something to do with the tty.* ones expecting handshaking, but I'm happy to be corrected.

Once I'd altered the baud rate, everything worked fine. I could tail -f my JeeNode device and it would happily spit out stuff like this:

00:00:20
00:00:25
00:00:30
00:00:35

On the Camel side, I made a copy of camel-example-console from the /examples folder and modified the camel-context.xml to hold my new routes. First a route to read from my JeeNode – nothing fancy, just get the data and put it in a SEDA queue as a string. Then a route to get it off the queue and send it to the screen via stdout:

<route>
  <from uri="stream:file?fileName=/dev/cu.usbserial-A900acSz&amp;scanStream=true&amp;scanStreamDelay=1000"/>
  <convertBodyTo type="java.lang.String"/>
  <to uri="seda:myfeed"/>
</route>
<route>
  <from uri="seda:myfeed"/>
  <to uri="stream:out"/>
</route>

Note: scanStream tells it to scan continuously (to get the tail -f effect) and the scanStreamDelay tells it to wait a second between scans.

A quick mvn clean compile exec:java later and it works! So, it seems that I can have my cake after all.

Now, there are caveats. For a start Camel gets pretty upset if you unplug the JeeNode and the device disappears. Also, it seems I’m stuck with 9600, at least on my Mac. But it does work and means a complete Camel solution is possible. Time for more experiments, but at least for the moment my inner engineer is quiet.

PS. It works the other way as well 🙂

<route>
  <from uri="stream:in?promptMessage=Enter something: "/>
  <transform>
    <simple>${body.toUpperCase()}</simple>
  </transform>
  <to uri="stream:file?fileName=/dev/tty.usbserial-A900acSz"/>
</route>


XForms Push with MQTT over Websockets

The idea…

I've been looking recently at a project that creates PDFs via a ConTeXt typesetter. The front-end is XForms, feeding a backend queue of jobs that get done with a delay of perhaps a few minutes. The problem is how to tell the web app that a particular job has finished, because the user isn't waiting; they're on to another document.

What I need is some sort of Push Notification that I can integrate into the XForms world.

Of course within XForms it's possible to do a sort of polling, with a native <xf:dispatch delay="number"/> calling a <xf:submit/> to see if anything interesting has happened, but it really doesn't scale. At that point you need a 'push' technology that will notify you when something you are interested in happens instead. Handily, HTML5 has scope for that in WebSockets. These allow you to create a two-way link back to your server along which you can send/receive pretty much anything. Now there is nothing to say you can't write your own system, but there are plenty of fine libraries out there that will do the heavy lifting for you.

In my case Pub/Sub was the way to go as a behaviour, and MQTT specifically as a protocol, because it's one of the best (if you don't believe me, ask Facebook). Pub/Sub, in brief, is a messaging pattern (2lemetry have a great Nutshell). Publishers send data (of any sort) to one or more topics, e.g. house/temperature/lounge, on a Broker (that's the server bit). Subscribers, without any knowledge of who is doing the publishing, subscribe to the broker to receive data from one or more topics, either explicitly, i.e. house/temperature/lounge, or by wildcard (+ = one level, # = n levels), e.g. house/# or house/+/temperature.

The Broker mediates between the publishers and subscribers and makes sure messages get delivered. In my use-case, for instance, for a user Fred working on Case 00123, you might assume a topic tree something like fred/docs/00123. When Fred logged onto the web app, he'd subscribe to fred/# and start receiving anything on that branch and down. The reason not to subscribe to, say, fred/docs/# is that it leaves the door open to using the same tech for something like system messages targeted at him via fred/messages, or some way to lock a form via fred/locks/00123, or track form changes via fred/forms/00123, etc. Note: one of the nice benefits of doing it the MQTT way is that there's nothing to stop another system subscribing to, say, +/docs/# to watch for problems or gather stats!
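The wildcard rules are simple enough to sketch. Here's a minimal topic matcher for + and # (my own illustration of the rules described above, not code from any MQTT library):

```javascript
// Sketch: MQTT topic filter matching. '+' matches exactly one level,
// '#' matches the current level and everything below it.
function topicMatches(filter, topic) {
  var f = filter.split('/'), t = topic.split('/');
  for (var i = 0; i < f.length; i++) {
    if (f[i] === '#') return true;               // rest of the tree matches
    if (i >= t.length) return false;             // topic ran out of levels
    if (f[i] !== '+' && f[i] !== t[i]) return false; // literal mismatch
  }
  return f.length === t.length;                  // no trailing topic levels
}
```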

However, before any of those bells and whistles, perhaps I need to solve the basic problem of getting it to work at all via a small example. All code and runnable examples are as usual on GitHub.

Demo v1: Connect/Disconnect.


click to run example.

First thing is an MQTT broker. The nice people at DC Square (@DCsquare) have a public broker running at mqtt-dashboard.com, based on HiveMQ, which I'll use in the example. Please be nice to their broker and use it responsibly. Want to set up your own broker? The Mosquitto open-source broker comes highly recommended. Second, we need a JavaScript library to talk to the broker. The Eclipse Paho project has one that is pretty good. To get it you'll have to clone their repository like so:

git clone http://git.eclipse.org/gitroot/paho/org.eclipse.paho.mqtt.javascript.git

Lastly, some code. I'm going for the simplest xsltforms XForm, with my MQTT connection details for the DC Square broker on its WebSocket port, and an initial topic to subscribe to (unused just yet) in the <xf:instance/>:

<xf:instance id="connection" xmlns="">
  <data>
    <host>broker.mqttdashboard.com</host>
    <port>8000</port>
    <clientId></clientId>
    <initialtopic>xforms/chat</initialtopic>
    <status>disconnected</status>
  </data>
</xf:instance>

Note the clientId is empty. I'm using a bit of XPath in the xforms-ready event to set this to a random value. This makes sure I don't get kicked off the broker by someone connecting with the same id (I'll come back to this later). I've two trigger/buttons to connect and disconnect, which each call JavaScript using the <xf:load><xf:resource/> method. Here's the connect call as an example:

<xf:trigger id="mqttconnect">
  <xf:label>Connect</xf:label>
  <xf:load ev:event="DOMActivate">
    <xf:resource value="concat('javascript:mqttstart(&quot;',host,'&quot;,&quot;',
    port,'&quot;,&quot;',clientId,'&quot;)')" />
  </xf:load>
</xf:trigger>
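As an aside, the random clientId mentioned above could be generated with something like this (the prefix and the timestamp-plus-random scheme are my own choices, not from the Paho library):

```javascript
// Sketch: generate a throwaway clientId so two visitors to the public
// broker don't collide. A timestamp plus a random suffix is usually enough.
function randomClientId(prefix) {
  return prefix + '-' + Date.now().toString(36) +
         Math.random().toString(36).substring(2, 8);
}
```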

Having to put quotes around the params makes it a bit messy, and it can be a pain. On bigger forms I tend to move this sort of JS stuff up into an <xf:event/> at the top of the form and call it, rather than spread it out through the code. Now I just need some JS to call. The main mqttstart() function sets up a Messaging Client with our parameters. The MQTT library does pretty much everything via callbacks, so most of the code is references to other functions.

function mqttstart(host, port, clientId) {
  client = new Messaging.Client(host, Number(port), clientId);
  client.onConnectionLost = onConnectionLost;
  var options = {
    timeout: 30,
    onSuccess: onConnect,
    onFailure: onFailure
  };
  client.connect(options);
}

For this version they’re pretty simple and I’m not interacting with the XForm the other way at all, i.e.

function onConnect() {
  alert('Connected');
}
function onFailure() {
  alert('Failed');
}
function onConnectionLost(responseObject) {
  alert(client.clientId + " disconnected: " + responseObject.errorCode);
}

Only onConnectionLost() is at all fancy, and that's just to get an error code. I'll call this Version 1 – here it is.

Version 2 : Feeding events back into the XForm.


click to run example

Now I know the library works, I need to feed the results from the JS back into my XForm. This is done by calling an <xf:event/> (note this call format is xsltforms-specific). I've got a generic JS function for this that takes the name of the event and a payload for the properties. The code is below, along with my new onConnect() function to call it.

Important! The JS side needs to know what the <xf:model/> id is for the call. I’ve hardcoded it here. Make sure it’s set on your XForm as well or it won’t work.

On the XForms side, I've now got the <xf:event/> that sets the connection status in my instance(). To add a visual cue, I've added a <xf:bind> to set the instance to readonly once connected, and reset it if we disconnect:

function call_xform_event(xfevent, xfpayload) {
  var model = document.getElementById("modelid");
  XsltForms_xmlevents.dispatch(model, xfevent, null, null, null, null, xfpayload);
}
function onConnect() {
  call_xform_event("mqtt_set_status", {status: "connected"});
}

<xf:bind nodeset="instance('connection')" readonly="status = 'connected'" />
<xf:action ev:event="mqtt_set_status">
  <xf:setvalue ref="status" value="event('status')" />
  <xf:reset model="modelid" if="event('status') = 'disconnected'" />
</xf:action>

This is a much cleaner system than calling JS. I can set any number of properties on the JS side as a JSON string and they'll appear as event properties in the XForm.

Lastly, I need to add a subscription. In this case I've chosen xforms/chat. To subscribe I have to wait to be connected, and of course I could have done that at the JS level. However, this is about XForms, so I've added a <xf:dispatch> to my 'mqtt_set_status' event to call another event to do it for me if I'm connected.

 <xf:dispatch name="mqtt_sub_first" targetid="modelid" if="event('status') = 'connected'" />

Next, I need to add support for messages arriving from this subscription. This is done by adding another callback into mqttstart() for onMessageArrived, and then fleshing it out so that the message and its topic can be sent into my XForm:

function onMessageArrived(message) {
  call_xform_event("mqtt_message", {
    message: message.payloadString,
    topic: message.destinationName
  });
}

On the XForm side, I now need the event code. I'm going to drop all the messages into a new top-down 'messagelog' instance. So it doesn't escape off the screen, I've limited the log to 15 rows before I prune the bottom row. A <xf:repeat> then outputs it onto the screen.

<xf:action ev:event="mqtt_message">
  <xf:setvalue ref="message" value="event('message')" />
  <xf:insert nodeset="instance('messagelog')/event" at="1" position="before" />
  <xf:setvalue ref="instance('messagelog')/event[1]" value="concat(now(),' : ',event('topic'),' : ',event('message'))" />
  <xf:delete nodeset="instance('messagelog')/event[last()]" if="count(instance('messagelog')/event) &gt; 15" />
</xf:action>
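The insert/prune logic above maps to a few lines of plain JavaScript, if you prefer to see it that way (a sketch, with the 15-row cap passed in as a parameter):

```javascript
// Sketch: a capped, newest-first message log, like the XForms action above.
function logMessage(log, entry, max) {
  log.unshift(entry);                // insert at position 1, i.e. the front
  if (log.length > max) log.pop();   // prune the oldest (bottom) row
  return log;
}
```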

You can try out v2 here. Once you connect you'll see, hopefully, the classic "Hello". This is what is called a Retained Message, and it is sent to anyone subscribing as a default. If you now go to the mqtt-dashboard.com site and use their Publish option, you should be able to publish anything you like to the xforms/chat topic and see it appear in the XForm – neat!

Version 3 : Publishing.


Click to run example

We're almost there. The last thing is how to publish a message to the broker. That's actually the easiest bit of all. All I need is another instance to hold the topic and the message, so we can put them in the XForm, along with a <xf:trigger/> to call a JS function with the two values to send it. However, I can make it a little more interesting by talking about QoS and cleanSession.

QoS, or Quality of Service, is a flag that gets sent with a message or a subscription that sets how hard the broker and client work to handle your message.

At QoS 0, the default, the client deletes the message as soon as it’s sent to the broker. The broker in turn only sends it on to clients that are connected – it’s fire and forget.

At QoS 1, the client keeps the message stored until it knows the broker got it. In turn, the broker keeps a copy until not only those connected get it, but those not currently connected but subscribed as well – you'll get the message at least once, but might get a duplicate if the broker/client aren't sure.

At QoS 2, the same happens as at QoS 1, with added actions by the broker/client to make sure you get the message exactly once.

So how does the broker keep track? It uses the clientId as the key to handle messages and subscriptions for a given client. When the client connects, there is a flag that tells the broker to either carry on or reset all messages and subscriptions for this client – that’s cleanSession and it’s true by default.

It's worth knowing about, since for a practical system you'd want cleanSession to be false and a fixed clientId. Also, you'd probably end up with your messages being transferred at at least QoS 1. As they're just flags, it's easy to add them to the XForm for experimentation, as in version 3.
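In the Paho-style JavaScript client, those flags end up as plain properties on the connect options and on the message. Here's a sketch of the shapes involved, built as plain objects so they're visible (the property names follow the client docs as I read them; treat them as assumptions and check your client version):

```javascript
// Sketch: connect options for a durable session, plus the flags you'd set
// on an outgoing message for at-least-once delivery.
function durableConnectOptions(onConnect) {
  return {
    cleanSession: false,  // keep subscriptions and queued messages across reconnects
    timeout: 30,
    onSuccess: onConnect
  };
}
function qos1Flags() {
  return { qos: 1, retained: false };  // at-least-once, not retained
}
```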

Wrap Up.

It's been fairly easy to integrate XForms with MQTT, as it is with most JS libraries. Now I've a working example, I should be able to generate my notification system fairly simply. Pub/Sub is a great technology and opens the way for some interesting form-to-form and form-to-server interactions.

Resources.

  • MQTT.org All things MQTT and where to go for libraries and brokers.
  • Eclipse Paho. M2M and IoT protocols. Home to the reference MQTT clients.
  • XSLTForms. Great client-side XForms engine.

 

Creating MarkLogic content with Apache Camel

MarkLogic with JMS?

A few weeks back we started looking at MarkLogic at work as a possible replacement for our mix of Cocoon and eXistDB systems. One of the side issues that's been raised is how we would get messages to/from our ActiveMQ EIP system. Now, MarkLogic doesn't have a JMS connector, although, to be fair, it seems to have a pretty good system for slurping up content from URLs, files etc. However, there is a Java API, which gave me the idea of using my old friend Apache Camel.

If I could get Camel to talk to MarkLogic then, not only could I talk to any sort of queue, I could also pull content into MarkLogic from the huge range of other things Camel will talk to, and I would get all the EIP magic thrown in for good measure.

The Camel Routes.

The easiest way to test this was of course to set up a simple Camel project to slurp up some data. The most expedient producer I could think of was an RSS feed from somewhere; JIRA being the most obvious, as it would reliably produce something new at reasonably short intervals. This would need transforming into XML and pushing into MarkLogic via a queue. The MarkLogic side would be handled by their Java API mounted as a Bean, in my case written in Groovy so I could work with it as a script. So much for the basic plan. Here's the start route as Camel sees it:

<route id="jira">
    <from uri="rss:https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml
    /temp/SearchRequest.xml?jqlQuery=&amp;tempMax=100&amp;delay=10s"/>
    <marshal><rss /></marshal>
    <setHeader headerName="ml_doc">
        <simple>/jira/${exchangeId}</simple>
    </setHeader>
    <to uri="seda:mlfeed"/>
</route>

The Camel RSS module calls a basic Jira RSS feed, in this case polling every 10 seconds. I've used the module defaults, so each entry is separated out of the feed and passed down the route one at a time. At this point the message body is a Java SyndFeed object, not XML, so it has to be 'marshalled'. Now the message body is an XML string ready for upload, but before I can send it I need to make a URI for MarkLogic to use. Each run of the route, or 'exchange', has a unique id, so I've used that via the inbuilt <simple/> language. Alternatively, I could have parsed something out of the content, like the Jira id, or made something up, like the current date-time. Finally, the message is dropped into a queue via the SEDA module.
Note, this in-memory queue type isn't persistent, like JMS or ActiveMQ, but it's built into camel-core, so it was just handy.

There is another route to pull messages from the queue and into MarkLogic.

<route id="marklogic">
    <from uri="seda:mlfeed"/>
    <to uri="bean:marklogic?method=upload_xml"/>
    <!-- <to uri="file:outbox"/> -->
</route>

This route takes messages off the queue and passes them to a Bean written using Camel’s Groovy support. Lastly there’s an optional entry to put the message into a file in an /outbox folder. This is handy if you can’t get the MarkLogic bit working and want to look at the input: comment out the bean instead and just drop the data into files.

The Groovy Code.

The Groovy Bean is mounted in the configuration file, along with some parameters needed to connect to MarkLogic.
Note: to get this working, you'll need to supply your own parameters and have a MarkLogic REST server listening, as REST is the basis of their API. You can get instructions here.

<lang:groovy id="marklogic" script-source="classpath:groovy/marklogic.groovy">
    <lang:property name="host" value="YOURHOST" />
    <lang:property name="port" value="YOURPORT" />
    <lang:property name="user" value="YOURNAME" />
    <lang:property name="password" value="YOURPASSWORD" />
</lang:groovy>

Once the Bean is running, you simply call its methods in the route. You get the entire Exchange as input, so you have access to everything, as well as the ability to alter it as you like. In this case, I've simply written the data out and not altered the message at all. In real life it would probably be more complex. The salient bit of Groovy code (the getters for the parameters are not shown) is below. This is the MarkLogic basic example with a couple of mods to a) get the header that has the URI in it, and b) get the body of the input Message as an InputStream:

public void upload_xml(Exchange exchange) {
    // Get the doc url from Camel
    String docId = exchange.getIn().getHeader("ml_doc");
    if (docId == null) docId = "/twitter/" + exchange.getExchangeId();

    // create the client
    DatabaseClient client = DatabaseClientFactory.newClient(host, port,
            user, password, Authentication.DIGEST);

    // try to make use of the client connection
    try {
        XMLDocumentManager XMLDocMgr = client.newXMLDocumentManager();

        // Get an InputStream from the Message Body
        InputStreamHandle handle = new InputStreamHandle(exchange.getIn().getBody(InputStream.class));

        // Write out the XML Doc
        XMLDocMgr.write(docId, handle);
    } catch (Exception e) {
        System.out.println("Exception : " + e.toString());
    } finally {
        // release the client
        client.release();
    }
}

Note: I've connected and disconnected to the MarkLogic database each time. I'm sure this can't be efficient in anything but a basic use case, but it will do for the present. There's nothing to stop me creating an init() method, called as the Bean starts, to create a persistent connection if that's better, but all the examples I could find seem to do it this way. [If I've made any MarkLogic Java API gurus out there wince, I'm sorry. Happy to do it a better way.]

Putting it all together.

If you’ve got a handy MarkLogic server, you can try this all out. I’ve put the code here on GitHub as a Maven project, and all you need to do is pull it and run “mvn compile exec:java”. Ten seconds or so after it starts, you should see something similar to this on the console:

[l-1) thread #1 – seda://mlfeed] DocumentManagerImpl INFO Writing content for /jira/ID-twiglet-53205-1398451322665-0-2
[l-1) thread #1 – seda://mlfeed] DatabaseClientImpl INFO Releasing connection

On the MarkLogic side, if you go to the Query Console you can use Explore to look at your database. You should see the files in the database – query them to your heart’s content.

  • I’m using MarkLogic 7 and Java API 2.0-2 with Camel 2.12.0.
  • If you want to change the routes, you’ll find them in src/resources/camel-context.xml.
  • The Groovy script is in resources/groovy/marklogic.groovy.
  • Remember, if you want to use modules outside of camel-core, you’ll need them in the pom.xml!

Bells and Whistles.

Now I've got the basic system working, there are a couple of other things I could do. As the MarkLogic component reads from the end of a queue, I could, for instance, add another route that puts messages into the same queue from another source, for example Twitter (for which there's a module), assuming I had appropriate Twitter OAuth keys, like so:

<route id="twitter-demo">
    <from uri="twitter://search?type=polling&amp;delay=60&amp;keywords=marklogic&amp;consumerKey={{twitter.consumerKey}}&amp;consumerSecret={{twitter.consumerSecret}}&amp;accessToken={{twitter.accessToken}}&amp;accessTokenSecret={{twitter.accessTokenSecret}}" />
    <setHeader headerName="ml_doc">
        <simple>/twitter/${body.id}</simple>
    </setHeader>
    <log message="${body.user.screenName} tweeted: ${body.text}" />
    <to uri="seda:mlfeed"/>
</route>

Of course, once you start doing that, you need some way to throttle the speed at which things get added to the queue, to avoid overwhelming the consumer. Camel has several strategies for this, but my favourite is RoutePolicy. With this you can specify rules that allow the input routes to be shut down and restarted as necessary to throttle the number of in-flight exchanges. You simply add the Bean, like so, with an appropriate configuration:

<bean id="myPolicy" class="org.apache.camel.impl.ThrottlingInflightRoutePolicy">
    <property name="scope" value="Context"/>
    <property name="maxInflightExchanges" value="20"/>
    <property name="loggingLevel" value="WARN"/>
</bean>

and then add this policy to any route you wish to control, like so:

<route routePolicyRef="myPolicy">

Once there are more than 20 messages in-flight (‘context’ means all routes/queues) the inbound routes will be suspended. Once the activity drops below 70% (you can configure this) they’ll start up again – neat.
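As I understand the policy, the suspend/resume behaviour boils down to a pair of thresholds, which can be sketched like so (a simplification of what ThrottlingInflightRoutePolicy actually does; 70 is its default resume percentage):

```javascript
// Sketch: the hysteresis behind the throttling policy. Suspend inbound
// routes when in-flight exchanges exceed the maximum; resume once they
// drop to resumePercent of the maximum or below.
function shouldSuspend(inflight, max) {
  return inflight > max;
}
function shouldResume(inflight, max, resumePercent) {
  return inflight <= Math.floor(max * resumePercent / 100);
}
```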

This only really skims the surface. Camel is a marvellous system, and being able to use it to push content to MarkLogic is very handy (once I polish the code a bit). Wiring routes up in Camel is so much easier, more flexible and more maintainable than writing custom, one-off code.

Finally, of course, there's nothing to say you couldn't have a route that went away and got some key, which was then sent to MarkLogic via the Bean to retrieve some data instead, which then got added to the body (Content Enrichment in EIP). That'll have to be the subject of another day.

Resources.

  • MarkLogic Developer. http://developer.marklogic.com
  • Apache Camel. http://camel.apache.org
  • Enterprise Integration Patterns (nice hardback book) http://www.eaipatterns.com

Camel, Groovy and Beans

Last year I did an article on using Apache Camel as a switchboard for home monitoring – and a bit of a nod to IoT, perhaps. One of the decisions I'd made was, as far as possible, to use Spring XML to configure rather than compile my solution, as I was interested in whether I could use Camel as a general tool. [More on artisan use and tools here.] So far, it's worked out pretty well, until I wanted to upload some files via HTTP POST.

The plan.

I’ve a Raspberry Pi with a camera module, to take stop-motion images. There’s not much room on the Pi, so it’s attached to the WiFi and uploads each photo after taking it, every minute or so. My Camel engine (2.12) is sitting on the server as a servlet inside a copy of Tomcat 7.

Now you might say that all I need is a bit of PHP or a servlet or similar to just dump the file. But if I did that, not only would I get a ‘point-solution’ just for this need, I’d also be reducing my choices as to what can be done with the data afterwards. What if I want to send every 100th one to Flickr, or SMS my phone if the Pi stops sending? If I can pull the image (and its metadata) into a Camel route, not only can I just save them, they’re ready for anything else I might want to do with them later.

The technical problem is that the Camel Servlet component I’m using works fine for GET as you, well, get the parameters as Headers. If however you POST, as you need to in order to upload a file, then you get one long stream of data in the message body with everything mashed together as “Multipart Form-data” or RFC1867. What I need is a way to parse out the image file and headers myself, and there’s even a Java library to do it, called Commons FileUpload. In the normal scheme of things you would create a Java Bean which would be called in the pipeline to do the work for you. But that seems a little against the configure-only theme, so I need a way to write code without writing “code”, i.e. script it in.
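To see what the Bean is up against, here’s a toy illustration – not the Commons FileUpload API, just plain Java – of the RFC1867 wire format: everything arrives in one stream, with parts separated by a boundary string, so the filename and bytes have to be dug out of it by hand:

```java
// Toy sketch of the RFC1867 "multipart/form-data" wire format -- NOT the
// Commons FileUpload API, just an illustration of why a parser is needed:
// every field and file arrives in one stream, separated by a boundary line.
public class MultipartDemo {

    // Find the first uploaded file's name by splitting on the boundary
    // and scanning each part's Content-Disposition header.
    public static String filenameOf(String body, String boundary) {
        for (String part : body.split("--" + boundary)) {
            int i = part.indexOf("filename=\"");
            if (i >= 0) {
                int start = i + "filename=\"".length();
                return part.substring(start, part.indexOf('"', start));
            }
        }
        return null; // no file part found
    }

    public static void main(String[] args) {
        // Roughly what an HTML form or curl file upload puts in the POST body:
        String body = "--XYZ\r\n"
            + "Content-Disposition: form-data; name=\"filedata\"; filename=\"image.jpg\"\r\n"
            + "Content-Type: image/jpeg\r\n\r\n"
            + "...binary bytes...\r\n"
            + "--XYZ--\r\n";
        System.out.println(filenameOf(body, "XYZ")); // prints image.jpg
    }
}
```

The real library does this properly (streaming, encodings, multiple parts), which is exactly why I want it in a Bean rather than in-line.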

Note: This Bean doesn’t need to save the image, just move it into the body in a format where I can use components like File to save it or route it somewhere else later.

Going Groovy

In my previous article, I’d already used a bit of Javascript in my route, just in-line with the XML. Another Camel-supported script language is Groovy. If you’ve not come across Groovy before, it’s worth a look: a Java engine underneath, but with the rough edges taken off and a rather nicer syntax. Happily, it also understands straight Java as well as cool Groovy constructs like closures, so you can simply drop code in and it works. You can use Groovy in-line in predicates, filters, expressions etc. in routes and everything will be lovely, but it is also supported by Spring (and if you want, you can read about Dynamic Language Beans here) so you can create a Bean with it, which seems just the ticket.
Dynamic, flexible and easily shared, drop-in Beans. That’s more like it.

I’m using the Camel Servlet-Tomcat example as the basis for my current engine. To use Groovy, you have to have the following added to your pom.xml :

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-script</artifactId>
</dependency>

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-groovy</artifactId>
</dependency>

Build the war file and deploy it onto Tomcat. Try the built-in example to make sure it’s all working. Next add the following route in camel-config.xml (in WEB-INF/classes) :

<route>
  <from uri="servlet:///gmh"/>
  <setBody>
    <groovy>
      def props = exchange.getIn().getHeaders()
      response = "&lt;doc&gt;"
      props.entrySet().each {
        response += "&lt;header key='" + it.getKey() + "'&gt;" +
          it.getValue() + "&lt;/header&gt;"
      }
      response += "&lt;/doc&gt;"
      response
    </groovy>
  </setBody>
  <setHeader headerName="Content-Type">
    <simple>text/xml</simple>
  </setHeader>
</route>

Now try http://localhost:8080/[YOUR SERVLET]/camel/gmh and you should get a nice XML list of headers and a warm feeling that Groovy is working.
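For reference, the response looks something like this – the exact set of headers depends on your container and client; these are just illustrative Camel/HTTP ones:

```xml
<doc>
  <header key='CamelHttpMethod'>GET</header>
  <header key='CamelHttpUri'>/camel/gmh</header>
  <header key='host'>localhost:8080</header>
</doc>
```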

Adding Beans

The problem with adding code in-line is that not only does it become unwieldy pretty easily, you’re also limited in how you can put it together. This is where Beans come in. They not only allow you to hide, reuse and even share Groovy code, they work at the Exchange level, giving you far more options. To set up the code snippet above as a Bean is pretty easy. a) Create a folder in /classes called /groovy. b) In that, create a file (or download) called gmh.groovy.
c) Into that file drop the following code:

import org.apache.camel.Exchange
class handler {
    public void gmh(Exchange exchange) {
        def props = exchange.getIn().getHeaders()
        def response = "<doc>"
        props.entrySet().each {
            response += "<header key='" + it.getKey() + "'>" + it.getValue() + "</header>"
        }
        response += "</doc>" 
        exchange.getIn().setBody(response)
    }
}

Notice that you now need to declare things that were previously hard-wired for in-line code, like the exchange, response etc. Also, there is now a class wrapped around the code, which is itself now in a method, plus I’ve had to set the body explicitly. But you don’t need all that tedious XML escaping, and the whole Exchange is available to play with.

To wire gmh.groovy up into the route you need to add the following above the camelContext entry:

<lang:groovy id="gmh" script-source="classpath:groovy/gmh.groovy"/>

Note you may need to declare the “lang” namespace at the top of the file before it will work in which case add the following:

xmlns:lang="http://www.springframework.org/schema/lang"

Lastly, the route can be altered to get rid of the in-line code and use the bean instead:

<route>
  <from uri="servlet:///gmh"/>
  <to uri="bean:gmh?method=gmh"/>
  <setHeader headerName="Content-Type"><simple>text/xml</simple></setHeader>
</route>

Note it’s the bean id that gets used in the URI, not the class name – only the method name appears in the route.

Now if you re-start the servlet and re-try the URL, you should get the same answer as before. There’s of course a lot more you can do with this – there always is – but that’s the basics and it works. So, where does that leave me with my POST problem?

Getting the POST

Suffice to say, it’s not much more difficult: once Groovy is running, it’s just a bit more script. I used the Apache Commons FileUpload library, and a bit of code (60 lines). Basically, it reads the encoded data in the body and does one of two things. a) If it’s a header, it creates an Exchange property called groovy.[header]. b) If it’s a file, it creates a property with the file name and turns the stream into a byte array which gets put in the body.
That gave me a new postMe.groovy script which I wired into this route (Here’s the camel-config as well) :

<route id="POSTTest">
  <from uri="servlet:///posttest"/>
  <to uri="bean:PostMe?method=getPost"/>
  <to uri="file:outbox"/>
</route>

If you set this up and call curl with a file to upload like so:

curl -i -F filedata=@image.jpg http://localhost:8080/[SERVLET]/camel/posttest

Then you should see your file in the /outbox folder. You can also add headers like -F stuff=foo and throw in the previous bean to show them in the list as groovy.stuff=foo.

Last thoughts

Groovy is, well, groovy, and fixes my POST issue nicely. However, it’s proved that running Camel without doing any coding may be optimistic. Having said that, once you have a Groovy-enabled .war deployed, it’s a great way to add code snippets into your route XML, and to create Beans that open up a much wider range of possibilities and can be shared. I can see it would be fairly easy to create a “toolkit” Camel with all these things in – perhaps SQL, SEDA (for queues) etc. – plus a few of these Beans as well as ‘recipes’.

Keyboard events in XForms

A couple of weeks ago I came across a post by Steven Pemberton on Mouse Events in XForms. It gave me the idea of capturing “key” events to make my own forms a bit smoother. So, perhaps get the Esc key in an <xf:input/> to delete the contents, or Return to trigger a <xf:submission/>. Another was whether I could get the arrow keys to work in a <xf:repeat/>. Now I hasten to add, fiddling with the key events can break forms used on other devices like phones. In my case I’m only dealing with browsers in an office, so your mileage may vary, but the general idea may be of use.

So, over a coffee, I put together a little demo on GitHub using XSLTForms 1.0 that you can run here.
I used Wikipedia as the source of my data, with a hat-tip to Alain Couthures for his OpenSearch demo showing XSLTForms’ neat JSON handling, which I used to get the submission working.
If you fire up the example, you’ll see that you can use Esc and Return in the search box, and if you click into the results, you’ll see you can use the up and down arrows to move in the list. If you get to the bottom of the list and use down-arrow again, you’ll trigger the next page. Up-arrow does the same for the previous page. This is how it works:

First of all there is a little bit of XML and an <xf:submission/> that gets data from Wikipedia using their API.
Note: Use text/jsonp as the mediatype. You need it so XSLTForms can do its magic.

<xf:instance xmlns="" id="query">
  <data>
    <action>query</action>
    <list>search</list>
    <srprop>snippet</srprop>
    <srsearch>xforms</srsearch>
    <format>json</format>
    <sroffset>0</sroffset>
    <srlimit>10</srlimit>
  </data>
</xf:instance>
<xf:submission id="search" method="get" replace="instance"
  mode="synchronous" instance="results" ref="instance('query')"
  action="http://en.wikipedia.org/w/api.php" mediatype="text/jsonp">
</xf:submission>
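Because the submission is method="get" with ref="instance('query')", the instance’s child elements are serialized as query parameters, so the request that goes out is roughly (wrapped here for readability):

```
http://en.wikipedia.org/w/api.php?action=query&list=search&srprop=snippet
  &srsearch=xforms&format=json&sroffset=0&srlimit=10
```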

Now, <srsearch/> is the field we’re querying for, so that’s bound in as an input. Normally, I’d use something like so:

<xf:input ref="srsearch" incremental="false">
  <xf:send submission="search" ev:event="DOMActivate"/>
</xf:input>

This gives you an input box that submits if you leave the field, press Return etc. Also, normally there’d be a button as well, just in case. To alter this state of affairs, all I need to do is wrap the control I want to capture in an <xf:group/> and add some event handlers.
Note: You can’t put your event code inside the <xf:input/>, which you’d think would be the place for it – it has to be wrapped to work. The input now looks like this:

<xf:group ref="instance('query')" navindex="0">
  <!-- ESC Key -->
  <xf:action ev:event="keydown" if="event('keyCode') = 27">
    <xf:setvalue ref="srsearch" value="''"/>
  </xf:action>

  <!-- RETURN -->
  <xf:action ev:event="keydown" if="event('keyCode') = 13">
    <xf:dispatch targetid="searchme" name="DOMActivate"/>
  </xf:action>

  <xf:input ref="srsearch" incremental="false"/>

  <xf:trigger id="searchme">
    <xf:label>Search</xf:label>
    <xf:action ev:event="DOMActivate">
      <xf:setvalue ref="instance('query')/sroffset" value="0"/>
      <xf:send submission="search"/>
    </xf:action>
  </xf:trigger>
</xf:group>

If you’re in the input, Esc will delete the contents and Return will trigger a search. It’s worth noting that the key event is captured regardless, and the if statement then acts to control the action, not the other way around. Since I’ve added a ‘Search’ <xf:trigger/> now, I can use <xf:dispatch/> in my Return code to send the DOMActivate event to it. That way I don’t have to duplicate the submission code. It’s a great way to, say, put prev/next buttons at the top and bottom of a results list, and have the code in just one place.
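As a sketch of that pattern (the second trigger here is hypothetical – in the demo the extra dispatch comes from the key handler instead), one trigger owns the paging logic and any other control just forwards DOMActivate to it:

```xml
<!-- one trigger owns the 'next page' logic -->
<xf:trigger id="list_next">
  <xf:label>Next</xf:label>
  <xf:action ev:event="DOMActivate">
    <xf:setvalue ref="instance('query')/sroffset"
      value=". + instance('query')/srlimit"/>
    <xf:send submission="search"/>
  </xf:action>
</xf:trigger>

<!-- a second button elsewhere simply forwards the event to it -->
<xf:trigger>
  <xf:label>Next</xf:label>
  <xf:dispatch ev:event="DOMActivate" targetid="list_next" name="DOMActivate"/>
</xf:trigger>
```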

Next up is the results. I’ve used a standard <xf:repeat/> with a CSS table layout, again wrapped in an <xf:group/> with a couple of buttons to page through the results. Since the snippet element comes back as encoded HTML with the search term highlighted, I’ve used the mediatype attribute on <xf:output/> to display it properly.

As with the input, I’ve caught the keys I want. For example, down-arrow:

<xf:action ev:event="keydown" ev:defaultAction="cancel"
  if="event('keyCode') = 40">
  <xf:dispatch targetid="list_next" name="DOMActivate" if="index('resultslist') = 10"/>
  <xf:setindex repeat="resultslist" index="index('resultslist') + 1" if="index('resultslist') &lt;= 9"/>
</xf:action>

There are two actions here, driven by appropriate ‘if’ statements. The first calls the Next <xf:trigger/> if you’re at the bottom of the list, i.e. index() = 10. The other increments the list index if you’re not at the bottom, i.e. index() <= 9. Capturing up-arrow simply works in reverse.
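For completeness, a sketch of the mirrored up-arrow handler – keyCode 38; list_prev is the id I’m assuming for the Prev trigger:

```xml
<xf:action ev:event="keydown" ev:defaultAction="cancel"
  if="event('keyCode') = 38">
  <xf:dispatch targetid="list_prev" name="DOMActivate" if="index('resultslist') = 1"/>
  <xf:setindex repeat="resultslist" index="index('resultslist') - 1" if="index('resultslist') &gt;= 2"/>
</xf:action>
```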

There are a couple of things to note here:

  • The use of ‘navindex=1’ in the <xf:group/>. This makes it possible for the list, which isn’t a control, to get keystrokes. It won’t work without it.
  • Each <xf:action/> has ev:defaultAction="cancel" set. This cancels the event’s default behaviour. Be careful if you use this with an input control, as normal keys you don’t handle will get cancelled too.

That’s pretty much it, and I hope it’s been useful: even if you don’t capture keys, it’s handy to be able to send events to <xf:trigger/>s, as they’re used so frequently. Enjoy.

SOLR with Tomcat 7

The SOLR search engine is a great project, which keeps getting better. Not having installed one for a while, I went through the usual motions of someone with Tomcat 7 running on Debian:

  • Downloaded the 4.6.0 zip onto the laptop.
  • Found the .war file.
  • Opened Tomcat on the server at the management screen and used the ‘deploy’ option to push it onto the server (saves fiddling with permissions on the server).
  • Added in a database from multicore in /examples
  • Told SOLR where it was in its web.xml.
  • Restarted the servlet.

And at that point it didn’t work – it wouldn’t start the servlet. Hmmm. I did the usual stuff, checked things, ran the downloaded .war in the Jetty engine it comes with – it worked perfectly. Several hours went past as I explored whether it was my Java 6 or some config mistake. Finally I found the issue – the nice folks at Apache had done clever things with the logging in 4.3+.

According to this post on their site:

These versions do not include any logging jars in the WAR file. They must be provided separately. The Solr example for these versions includes jars (in the jetty lib/ext directory) that set up SLF4J with a binding to the Apache log4j library.

Basically you can’t just drop the .war onto a servlet container of choice and run it, which I’ve got to say seems to me a little against the spirit of the whole .war gig. Now I can see that it’s very clever to have a really flexible, separate logging mechanism, but annoying that you have to get involved with it by default.

So, the fix is?

Well, they provide you with the logging jars in a folder handily named “solr-4.6.0/example/lib/ext”, which you need to copy into the Tomcat /lib. Next, you need to copy “solr/example/resources/log4j.properties” into the classpath, e.g. dropped into /lib as well. Restart the container and you should be good to go.
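Scripted, the fix amounts to two copies. The paths below are assumptions for a Debian-ish install; the mkdir/touch lines just mock up the layout so you can dry-run the commands anywhere – drop them and set the variables for a real box:

```shell
SOLR_DIST=${SOLR_DIST:-/tmp/solr-4.6.0}
CATALINA_HOME=${CATALINA_HOME:-/tmp/tomcat7}

# mock layout for a dry run -- remove these two lines for a real install
mkdir -p "$SOLR_DIST/example/lib/ext" "$SOLR_DIST/example/resources" "$CATALINA_HOME/lib"
touch "$SOLR_DIST/example/lib/ext/slf4j-api.jar" "$SOLR_DIST/example/resources/log4j.properties"

# the actual fix: logging jars plus log4j.properties into Tomcat's lib
cp "$SOLR_DIST"/example/lib/ext/*.jar "$CATALINA_HOME/lib/"
cp "$SOLR_DIST/example/resources/log4j.properties" "$CATALINA_HOME/lib/"
# then restart the container, e.g.: sudo service tomcat7 restart
```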

Now this is fine if you know what you’re doing, but what’s wrong with a .war already set up for those who aren’t? Or perhaps, even if you do know, something in the README to let you know you needed to do something extra, and a pointer into the wiki?