MarkLogic magic in Jupyter Notebook

Jupyter Notebook (and its alternatives) is being seen more and more outside the confines of the data-science space. People have realised that you can do much more with it than Markdown, Python and MatLab – not that those things aren’t interesting! I’ve been looking at using Jupyter as a way to capture documentation and code, largely through the use of ‘magics’: cell-level interpreters that, for instance, let you execute some R or run queries in a SQL database, plus a host of other things large and small. I found the bash magic to run commands and the dot magic to create diagrams particularly useful.


But wouldn’t it be even more useful to be able to call out to a MarkLogic server via the REST interface – even better if the output could be captured for subsequent use in the Notebook? Of course, it’s pretty easy in Python to POST out to somewhere with the requests library and get results back, but it’s also far from elegant. Why not build a magic?

A huge hat-tip to Nicolas Kruchten (@nicolaskruchten), whose fabulous PyCon15 talk “Make Jupyter/IPython Notebook even more magical with cell magic extensions!” showed me how easy it is to make a magic.
Oh, and that Jupyter has its own editor (who knew)?

So, make a cup of tea, watch his video (it’ll be 30 min well spent, but skip to about +17:15 in if you’re impatient) and come back….

import requests
from requests.auth import HTTPDigestAuth
from requests_toolbelt.multipart import decoder
from IPython.core.magic import needs_local_scope
import json

def dispatcher(line, cell):
    # Split the URI up
    r = requests.utils.urlparse(line)
    session = requests.session()
    session.auth = HTTPDigestAuth(r.username, r.password)
    payload = {r.scheme: cell}
    uri = 'http://%s:%s/v1/eval' % (r.hostname, r.port)
    r = session.post(uri, data=payload)
    # Output is a list of dicts
    out = []
    if r.status_code == 200 and 'content-type' in r.headers:
        if r.headers['content-type'].startswith("multipart/mixed"):
            multipart_data = decoder.MultipartDecoder.from_response(r)
            for part in multipart_data.parts:
                ctype = part.headers['Content-Type']
                data = json.loads(part.content) if (ctype == 'application/json') else part.content
                out.append({'data': data, 'type': ctype})
    return out


def load_ipython_extension(ipython, *args):
    ipython.register_magic_function(dispatcher, 'cell', magic_name="marklogic")

def unload_ipython_extension(ipython):
    pass

Interesting isn’t it? Now you have a good idea how magics work (you watched the video didn’t you?) and the code above should make some sense.

Encouraged by his example and a read of the docs, it was pretty straightforward to create a basic magic for MarkLogic REST in about 30 lines of code. If you want to play along, use the built-in editor (still can’t get over that), create a file called sample_ext.py in the same folder as your notebook and drop the code above in.

The meat is in the dispatcher method:

  • It takes the first line of the cell and then the rest of the cell as its arguments, assuming the first line is a connection string and the rest of the cell is the code.
  • The connection string is in the format xquery://admin:admin@localhost:8000 which is then split up into uri components.
  • The requests lib is used to construct the call, sent to the standard REST eval endpoint (using the default XDBC port 8000 in this case).
  • The HTTP ‘scheme’ part of the URI – either xquery or javascript – tells eval what sort of code is being sent (sparql might be nice too, but I didn’t get to it).
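Under the hood the connection-string trick is just URL parsing: a made-up scheme like xquery:// splits into components quite happily. A quick sketch of what the dispatcher sees, using the stdlib urllib.parse (which behaves the same as requests.utils.urlparse here):

```python
from urllib.parse import urlparse

# The first line of the cell, i.e. the connection string.
r = urlparse("xquery://admin:admin@localhost:8000")

print(r.scheme)                                  # 'xquery' -> the language for /v1/eval
print(r.username, r.password)                    # credentials for HTTPDigestAuth
print("http://%s:%s/v1/eval" % (r.hostname, r.port))
```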

There isn’t anything special about the output: a couple of basic checks, then each part of the multipart response is made into a dictionary and added to a list (JSON parts are parsed first; everything else is passed through as it comes). The list is returned as the output from the cell. Certainly not production grade, but good enough to play with.


Next you load or reload the magic and it’s ready to use. Above you can see the results of a trivial bit of XQuery being run against the default REST port of my local MarkLogic server, with the results in the output of the cell. One of the reasons for using a list of dicts as the return format is that it makes it trivial to create a Pandas DataFrame from the result, which in turn allows all sorts of subsequent data-munging and charting shenanigans. Notice especially how ‘the cell above’ is referred to with “_”. Both In[] and Out[] variables are maintained by the notebook for all the cells, so Out[196] could just as easily have been used.
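Since each part comes back as a {'data': …, 'type': …} dict, the DataFrame conversion really is one line: pd.DataFrame(out). A tiny sketch with made-up result data (no server needed):

```python
# Made-up example of what the magic returns: one dict per multipart part.
out = [
    {'data': {'name': 'Kirk'},  'type': 'application/json'},
    {'data': {'name': 'Spock'}, 'type': 'application/json'},
]

# With Pandas this would simply be: df = pd.DataFrame(d['data'] for d in out)
# but even plain Python makes the rows easy to get at:
names = [d['data']['name'] for d in out if d['type'] == 'application/json']
print(names)
```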

It works fine with javascript too, with the added ease of use that JSON brings to the table:


Now that it’s possible to include output from a MarkLogic server, a few things come to mind where this capability might be handy: from server monitoring and project configuration (especially matched with cell input controls) to developing code, not to forget simply having access to all that data.

Now it probably isn’t best to start pulling millions of lines of data from your MarkLogic server into a Notebook as a DataFrame. However, what you might do is use MarkLogic to do the heavy lifting across your structured/unstructured data that Jupyter can’t do: search for instance, or BI analytics or semantic inference and then pull that resultant dataset forward into the Python or R space to do further statistical analysis, machine learning, fancy charting, dashboards etc.


Data transforms in Apache Camel with BeanIO

Apache Camel has so many ways of making your life easier; here’s one.

I needed to import a fixed-format file, the kind of thing that reminds you to hug XML and even give JSON a break every so often. In this case, I was importing the Yale “Bright Star Catalogue“, featuring a load of numbers about all the visible stars, about 9000 all told. Not a huge database, but a pain to parse, like all fixed format data, and I needed output in XML.

I looked at what Camel had to offer and came across the BeanIO component, which handles CSV, fixed and XML formats. Now this immediately made life easier: for a start there’s an external XML mapping file to tell the parser what fields to expect and what to do (for all the options, see here). Here are the first few fields in my star data:

<?xml version="1.0" encoding="UTF-8"?>
<beanio xmlns="http://www.beanio.org/2012/03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.beanio.org/2012/03 
http://www.beanio.org/2012/03/mapping.xsd">

<stream name="stars" format="fixedlength">
     <record name="star" class="java.util.HashMap">
         <field name="HR" length="4" trim="true" />
         <field name="NAME" length="10" trim="true" />
         <field name="DM" length="11" trim="true" />
         <field name="HD" length="6" trim="true" />
     </record> 
 </stream>

I’m using vanilla Spring XML in my Camel, so I’ve no class to map my data to, hence I’m using HashMap, but if you’ve got one, your class goes in the record. Also, as I’m hoping for XML output, I’m trimming each field so I don’t get a file full of spaces.
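To see what BeanIO is doing for you, here’s the same fixed-length record pulled apart by hand in Python, using the field widths from the mapping above (4, 10, 11, 6); the sample row is padded out from the example star that appears in the output later:

```python
# Field widths taken from the BeanIO mapping: HR=4, NAME=10, DM=11, HD=6.
FIELDS = [("HR", 4), ("NAME", 10), ("DM", 11), ("HD", 6)]

def parse_star(line):
    """Slice one fixed-length record into trimmed fields, as BeanIO does."""
    record, pos = {}, 0
    for name, width in FIELDS:
        record[name] = line[pos:pos + width].strip()   # trim="true" in the mapping
        pos += width
    return record

# A row laid out to those widths, values from the example output below.
row = "   3" + "33 Psc    " + "BD-06 6357 " + "    28"
print(parse_star(row))
```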

BeanIO can run as either a DataFormat or a Component; I’m using the former. Now all I needed was a folder to put the file in and a bit of Camel:

<dataFormats>
  <beanio id="stars" mapping="mapping.xml" streamName="stars"
          ignoreUnexpectedRecords="true"/>
</dataFormats>

<route>
  <from uri="file:inbox?noop=true"/>
  <split streaming="true" parallelProcessing="true">
    <tokenize token="\r\n|\n" xml="false" trim="true" />
    <to uri="direct:stars"/>
  </split>
</route>

<route>
  <from uri="direct:stars"/>
  <unmarshal ref="stars"/>
</route>

This is standard stuff, the dataFormat section points to the mapping file and tells it what stream definition I want from it, then I split the file, send each line on and “unmarshal” it into a HashMap using that definition.

Now at this point I was fairly happy, the split was simple, but I was still faced with having to create some sort of Groovy bean to assemble the HashMap into the XML I wanted. I actually started down that road and then came across the following in the docs:

Our original mapping file from Section 2.1 can now be updated to parse XML instead of CSV with only two minor changes. First, the stream format is changed to xml. And second, the hire date field format is removed…

Lightbulb moment. All I needed was to add a second stream format using the fields I wanted in my XML, and BeanIO would “marshal” it for me. No bean, no mess, no fuss. Again there’s a load of options: you can rename elements, make some things attributes, change formats. I just needed the plain version with a couple of tweaks to suppress the <?xml… header and indent the output, just for readability’s sake:

<stream name="stars2" format="xml" xmlType="none">
 <parser>
    <property name="suppressHeader" value="true" />
    <property name="indentation" value="2" />
</parser>
 <record name="star" class="java.util.HashMap">
    <field name="HR" />
    <field name="NAME" />
    <field name="DM" />
    <field name="HD" />
</record>
</stream>

Now I just need to add in the DataFormat and mod my route a little so my filename comes from the data:

<dataFormats>
  <beanio id="stars" mapping="mapping.xml" streamName="stars"
          ignoreUnexpectedRecords="true"/>
  <beanio id="stars2" mapping="mapping.xml" streamName="stars2"/>
</dataFormats>

<route>
  <from uri="file:inbox?noop=true"/>
  <split streaming="true" parallelProcessing="true">
    <tokenize token="\r\n|\n" xml="false" trim="true" />
    <to uri="direct:stars"/>
  </split>
</route>

<route>
  <from uri="direct:stars"/>
  <unmarshal ref="stars"/>
  <setHeader headerName="CamelFileName">
    <simple>${body[0].get('HD')}.xml</simple>
  </setHeader>
  <marshal ref="stars2"/>
  <to uri="file:filebox"/>
</route>

That’s it, 9000 XML files in a few lines of configuration:

<star>
<HR>3</HR>
<NAME>33 Psc</NAME>
<DM>BD-06 6357</DM>
<HD>28</HD>
</star>

Now the neat thing about this is that this file is one of dozens of astronomy data files – many in fixed format. Same code: add a new stream to the mapping file and you’re parsing out the “ACT Reference Catalog” of 100,000 stars.


RDFa Email signatures

Hot from reading about RDFa Lite this morning (notwithstanding the vile respelling of ‘light’), I started thinking about ways to employ publicly available vocabularies to make information machine-reusable. One obvious use was the ubiquitous email signature. These days a lot of them are HTML anyway; what if we just added some tags?

Take for example the famous Sherlock Holmes. Imagine him emailing from his sitting room with the following signature:

Sherlock Holmes
Consulting Detective
221b Baker Street
London
mobile: +44 800 221221
email: sholmes@plotsfoiled.com

Anyone getting this, say Professor Moriarty, would probably actually see the following due to their email client pattern matching for email address and phone numbers:

Sherlock Holmes
Consulting Detective
221b Baker Street
London
mobile: +44 800 221221
email: sholmes@plotsfoiled.com

It’s a bit sad really. Despite all the recent strides forward in social media and ‘linked data’, that’s as far as email engines have got. They have no idea about the brave detective’s name, job or where he lives; those are invisible.

Now imagine for a moment a rather smarter mailer sending and receiving these messages. Both the sender and receiver might benefit from the invisible tagging that allows the rest of the signature to be brought into play. RDFa Lite is a subset of the full spec and goes out of its way to make it easy to add hints as to what things are and how they relate to one another. Let’s start with the simple signature in HTML:

<span>
<p>Rgds,</p>
<p>
<span>Sherlock Holmes</span><br/>
<span>Consulting Detective</span><br/>
<span>221B Baker Street</span>
<span>London</span>
mobile: <span>+44 800 221221</span><br/>
email: <span>sholmes@plotsfoiled.com</span>
</p>
</span>

First off, we need to add the vocabulary header to the second paragraph:

<p  vocab="http://schema.org/" typeof="Person">

I’ve used “vocab” to point at the vocabulary and “typeof” to say what we’re describing, i.e. a Person. You can see all the types at https://schema.org/docs/schemas.html – there are all sorts. Next, simply add a property to each line in the signature that describes the item, like so:

<p vocab="http://schema.org/" typeof="Person">
<span property="name">Sherlock Holmes</span><br/>
<span property="jobTitle">Consulting Detective</span><br/>
<span property="address" typeof="PostalAddress">
    <span property="streetAddress">221B Baker Street</span>
    <span property="addressLocality">London</span>
</span>
mobile: <span property="telephone">+44 800 221221</span><br/>
email: <span property="email">sholmes@plotsfoiled.com</span>
</p>

Note the address needs an enclosing span, as addresses have properties of their own: the wrapper declares the parent type and the address components nest inside it.

Now, our smart email program will know that London is a locality; what will it do with it? I’ve no idea: perhaps show it to you on a map, tell you how far they were away, but something! The important things are smart email programs will recognise what it is and be able to link it to something else: that’s the semantic web in action.
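No mail program does this yet, but pulling the tags back out takes very little code. Here’s a toy extractor over the marked-up signature – a regex sketch for flat property/value pairs only, not a real RDFa parser (nested types like the address need one, e.g. rdflib):

```python
import re

# The marked-up signature from above (flat properties only, for brevity).
signature = """
<p vocab="http://schema.org/" typeof="Person">
<span property="name">Sherlock Holmes</span><br/>
<span property="jobTitle">Consulting Detective</span><br/>
mobile: <span property="telephone">+44 800 221221</span><br/>
email: <span property="email">sholmes@plotsfoiled.com</span>
</p>
"""

# Good enough for property="x">value< pairs; a real consumer would use an RDFa parser.
pairs = dict(re.findall(r'property="([^"]+)">([^<]*)<', signature))
print(pairs["name"], "-", pairs["jobTitle"])
```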

Sadly, as far as I know there isn’t a mailer that does this, and it’s really difficult to get a mailer even to send HTML marked up like this, as most thoroughly mangle the HTML you give them. Outlook certainly did, as did Gmail. In fact, the only program that sent the signature as it looked above was Mac Mail. If, however, you want a glimpse into the semantic future, you need look no further than the Google Structured Data Testing Tool. Simply copy the marked-up signature above into the window and voilà!

Perhaps if this became common practice, as HTML signatures did before it, mainstream programs would include this sort of linkage. I’ll live in hope.


CheerLights by Camel


It’s that time of year again when, up and down the country, people are sticking together electronics and lights for the CheerLights project. If you don’t know of it, it’s a wheeze from ioBridge Labs to connect us at this festive season. Essentially, if you tweet one or more of a set of colours using the #cheerlights tag, their server will pick it up and publish it to a ThingSpeak channel. Once there, a simple API makes it possible to construct something that sets your lights to the current colour. It’s a simple idea, but very powerful when you think of the thousands of lights and gadgets all changing colour simultaneously across the world in response to a tweet.

Last year I went virtual with a bit of Processing, but this year I’m looking to do a light based on a Ciseco RFµ328 board. It’s basically a tiny Arduino, but with an SRF radio. So it’s CheerLights API -> Raspberry Pi (SRF dongle) -> RFµ328 + RGB LED. What could be simpler?

Well, it started out ok. I did a Tcl script that polled the ThingSpeak API and got the current colour every 10 seconds, spat that out to the RFµ and wrote a little bit of code on that to set the RGB LED. The problem then is that you have to wait 10s for it to notice changes, by which time it might have missed some tweets if it’s busy; or, you are constantly sending ‘red’ over SRF when it’s quiet. Plus, some clever folk send out things like “#cheerlights red blue green red” and of course, you’ll just get the last one. That’s the problem with REST, it’s a polling technology.

Now, they’ve a longer feed which gives you a bit of a history, but you’re going to have to parse it and work out where in the list your light is, plus store some sort of status between polls etc. It’s getting more complex, and with a fixed poll interval still not ideal as the other end, the twitter end, is an unknown. You might of course be thinking “Get a life, it’s a light” and you’d be right in some ways. However, as an engineer, it’s an interesting problem and to be honest, you never know when you might want to use Twitter to control/inform some other process when you’ve little control over the tweeters.

Let’s start by bringing the problem under our control, by looking at Twitter ourselves. Now the steps are:

  • Tell the Twitter API what we’re searching for, i.e. the #cheerlights hashtag. It’s an event API, so we’ll get results only as they’re tweeted. That neatly fixes the polling issue, whilst still getting us tweets as they happen.
  • Pull any colours out of the tweet – bit of regex here perhaps.
  • Send those colours out to the widget. That doesn’t change.
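Step two, scraping the colours, is the only fiddly part, and even that is a one-liner. A Python sketch using the same colour list the Groovy route uses later:

```python
import re

# The colour words the CheerLights project recognises.
COLOURS = re.compile(
    r"purple|cyan|red|green|blue|white|warmwhite|yellow|orange|magenta|pink|oldlace"
)

def colours_in(tweet):
    """Return every recognised colour in the tweet, in the order tweeted."""
    return COLOURS.findall(tweet.lower())

print(colours_in("#cheerlights set everything to blue. Now some red and green"))
# ['blue', 'red', 'green']
```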

Ok, it’s a bit more complex, especially the Twitter side, but we’ve got a Camel on standby, so let’s ride!

Using Camel Routes

Now Apache Camel has a Twitter component and a very nice example of its use, so I won’t go into the process of creating Twitter keys. Suffice to say, they’re in a properties file and I can use them in a route to get the tweets.

Our starting route is therefore:

<route id="twitter-feed">
  <from uri="twitter://streaming/filter?type=event&amp;keywords=#cheerlights&amp;consumerKey={{twitter.consumerKey}}&amp;consumerSecret={{twitter.consumerSecret}}&amp;accessToken={{twitter.accessToken}}&amp;accessTokenSecret={{twitter.accessTokenSecret}}" />
<!-- Body is Twitter4J Status Object -->
  <log message="${body.user.screenName} tweeted: ${body.text}" />
<!-- Queue them up -->
  <to uri="seda:statusFeed"/>
</route>

One of the things to like about Camel is the ability to build up a problem in pieces; it’s ‘loosely coupled’, which is good. This route watches for #cheerlights and returns the tweet – it does just one job. Notice the body isn’t a string but a tweet object, with full data like author, georef, replies etc. Here I’ve dropped the results into a queue, but I could have started with a file, or simply printed them out. And once the route works, I can go on to the next part in confidence.

Next step is get any colours. Time for a bit of Groovy here.

<route id="colours">
  <from uri="seda:statusFeed"/>
<!-- Find the colours and create delimited string as new body. Groovy rocks for this! -->
  <setBody>
    <groovy>request.getBody().getText().toLowerCase().findAll(/(purple|cyan|red|green|blue|white|warmwhite|yellow|orange|magenta|pink|oldlace)/).join(',')</groovy>
  </setBody>
  <log message="colours ${body}" />
<!-- Drop each colour into the colour queue -->
  <split>
    <tokenize token=","/>
    <to uri="seda:colourFeed"/>
  </split>
</route>

Here I replace the body of the message with a delimited string of any colours in it, e.g. the tweet “#cheerlights set everything to blue. Now some red and green” becomes “blue,red,green” via a bit of Groovy magic. Since I might get one colour or ten in a given tweet, next I use the Splitter to drop each colour as a separate message into a new queue to be consumed by the widget driver. Note that because of the queues, each route doesn’t know anything about, or depend on, the others, apart from there needing to be consumers. This is pretty handy as I can, for instance, feed the colours from a file rather than test-tweeting. And because the original full-fat tweet is preserved in the initial queue, I can pick out other facts, process them and reuse the information if I want to: there could be a database of tweet lat/lon pairs, or an analysis of tweeters, or a mashup of colours picked. All just by altering the routes slightly to tap into the information flow at the right point.

The last bit of the puzzle is outputting the right data over SRF. Now the folks at Ciseco have made it pretty easy: you send serial data to the USB dongle on the Pi, and it turns up on the RFµ328. But they also have a neat protocol called LLAP that’s ideal for this sort of stuff and handles a lot of the housekeeping for you. It uses 12-character messages, which is fine for us if we send an RGB string. So, I’ll create a new message type called PWM and send an RGB colour to my RFµ, which has the address “AC”. All LLAP messages start with an ‘a’, so the message for blue would be:

aACPWM0000FF
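Building an LLAP packet is simple enough to sketch: ‘a’, the two-character device address, then the body, padded out to exactly 12 characters (LLAP pads short messages with ‘-’; PWM is this post’s own made-up message type):

```python
def llap(address, body):
    """Build a 12-character LLAP packet: 'a' + 2-char address + up to 9-char body."""
    assert len(address) == 2 and len(body) <= 9
    return ("a" + address + body).ljust(12, "-")   # short messages are padded with '-'

print(llap("AC", "PWM0000FF"))   # the 'blue' message: aACPWM0000FF
```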

All the final route needs to do is read a colour, turn it into RGB via a smidgen more of Groovy and then send it via the Stream component to the USB port the dongle is on.

 <route id="changer">
   <from uri="seda:colourFeed"/>
     <!-- throttle 1 messages per sec -->
     <throttle>
       <constant>1</constant>
       <log message="switching light to ${body}"/>
       <transform>
         <groovy>def cmap = [red:"FF0000", green:"008000", blue:"0000FF", cyan:"00FFFF", white:"FFFFFF", warmwhite:"FDF5E6", oldlace:"FDF5E6", purple:"800080", magenta:"FF00FF", yellow:"FFFF00", orange:"FFA500", pink:"FFC0CB"]
           "aACPWM" + cmap.get(request.getBody())
         </groovy>
       </transform>
       <log message="Sending LLAP Msg ${body}" />
       <to uri="stream:file?fileName=/dev/tty.usbmodem000001"/>
     </throttle>
 </route>

Notice I’ve wrapped the route in a call to the Throttler component so that the colour doesn’t change more than once a second. This makes sure that tweets of “red green blue” don’t end up as just a flicker and then blue. The input route could be throttled in a similar way so only so many colours end up in the queue in case there’s a flurry. See RoutePolicy for details.

Wrap up.

I’ve left the Arduino/RFµ328 side out of this post – it’s easy enough to get something like this with a few lines of code and a bit of soldering:

All the Groovy is inline in this example. It’s not the most efficient approach; really it should be a bean, so that things like the colour map are only initialised once.

The point is more that Camel is a fantastic environment for the IoT’er 🙂

Camel and CSV when you need XML

It’s no secret that the public sector is in love with CSV. You only have to look at sites like data.gov.uk to see that. If it’s not CSV then it’s Excel, which comes down to pretty much the same thing. The reason is simple: there’s loads of CSV and you can create and consume it easily with office-level tools. However, in the IT world CSV tends to be an intermediate format on the way to something like SQL, or in my case XML. I often get situations where the ‘seed’ data in a project comes in as CSV, from which N XML files need to be made, one from each row, e.g. people.csv into N <person/> files. The follow-on problem is that some of the original CSV files can be big. That’s not big as in Big Data big, but too large to load into an editor or process whole in memory – “We’ll have to send you the CSV, zipped, it’s 2 million rows” irritatingly big.

Now of course most of the platforms you might want to use to consume this stuff come with tools, but you need to know them, and if, as I do, you want to turn the CSV into XML as well, there might be a couple of places you need to explain this, with specific idioms to remember from the last time that you didn’t write down. All these things tend to come to a head when you’ve a day to create some whizzy demo site from a file someone emailed you.

If I get a file even vaguely in XML and I want another XML then I tend to use XQuery or XSLT. If not I tend to use Ant or Apache Camel. These days Camel is my favourite as it neatly merges the modern need for both transport and transformation into one system. So, I’ve a CSV file on the system, what to do next?

First choice is whether you can consume your file whole or you need to read it line by line or in chunks. The latter is the normal situation; it’s not often you get just a few hundred lines to read, and streaming allows you to read any size of file. Whichever way you go, you can use the CSV data format as your base (there’s also the heavy-hitter Bindy module, which I’m ignoring for this post). This adds the ability to marshal (or transform) data between a Java object (like a file) and CSV in a <route/>. At its simplest, it means you can read a file from a folder into memory and unmarshal it into a Java List (actually a List inside a List), like so:

<route id="csvfileone">
    <from uri="file:inbox" />
    <unmarshal><csv delimiter="|" skipFirstLine="true"/></unmarshal>
    <log message="First Line is ${body[0][0]}"/>
    <to uri="seda:mlfeed"/>
</route>

Here I’ve used the option to ignore the header line in my file and use pipe as delimiter rather than comma. The whole thing is sent to a seda queue and I’ll assume something is processing the List at that end. Just to prove it really is a List (you can talk to Lists in <simple/> which is also cool), I’ve logged the first line. Now if you want to read a small file and pick out say the first, second and fourth field from a given line this might be all you need. The problem with this approach is that you don’t need a huge file before memory and performance become issues.
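If the List-inside-a-List shape seems odd, here’s a plain-Python equivalent (stdlib csv module, made-up rows) of what ${body[0][0]} is pointing at after the unmarshal:

```python
import csv
import io

data = "AAA111|John|Smith\nBBB222|Jane|Jones\n"   # made-up pipe-delimited rows

# Camel's <csv delimiter="|"/> unmarshals to this same nested-list shape:
body = list(csv.reader(io.StringIO(data), delimiter="|"))

print(body[0][0])   # first field of the first line, i.e. ${body[0][0]}
print(body[1])      # the whole second row as a list
```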

If you’re looking at a big file, then what you can do is use the Splitter to, well, split it into lines (or blocks of lines) first and then unmarshal each line afterwards. This is ideal if, as here, each line is to become a separate file in your output later. Now the route looks like this:

<route id="csvfilereader">
    <from uri="file:inbox" />
    <split streaming="true" parallelProcessing="true">
       <tokenize token="\r\n|\n" xml="false" trim="true" />
           <filter>
               <simple>${property.CamelSplitIndex} &gt; 0</simple>
               <unmarshal><csv/></unmarshal>
               <to uri="seda:mlfeed"/>
           </filter>
    </split> 
</route>

To reduce memory use, the splitter is told to stream the file in chunks. Note a side effect of this is that the lines won’t necessarily turn up in the order they appear in the input file. The splitter has also been told to process each chunk in parallel, which speeds up the process. The Tokenize language is used to tell the splitter how to perform the split – in this case, to use either Windows or Unix line endings (got to love that) and to trim the results. Each line is then unmarshalled and fed into our queue as before. Note I couldn’t use skipFirstLine here, as each entry is only one line, so I’ve added a <filter> based on the counter from the split instead. One of the things I like about Camel is the way you can start off with a simple route and then add complexity incrementally.

Now I’ve a simple and robust way to suck up my CSV file, I just need to turn each record into XML by transforming the data with a bit of self-explanatory Groovy:

import org.apache.camel.Exchange

class handler {
    public void makeXML(Exchange exchange) {
        def response = ""
        def crn = ""
        def csvdata = exchange.getIn().getBody()

        /* Example data
        CRN, Forename, Surname, DoB, PostCode
        [11340, David, Wright, 1977-10-06, CV6 1LT]
        */

        csvdata.each {
            crn = it.get(0)
            response = "<case>\n"
            response += "<crn>" + crn + "</crn>\n"
            response += "<forename>" + it.get(1) + "</forename>\n"
            response += "<surname>" + it.get(2) + "</surname>\n"
            response += "<dob>" + it.get(3) + "</dob>\n"
            response += "<postcode>" + it.get(4) + "</postcode>\n"
            response += "</case>"
        }
        exchange.getIn().setBody(response)
        exchange.getIn().setHeader(Exchange.FILE_NAME, crn + '.xml')
    }
}

As a bonus, I’ve dropped the unique id (CRN) field into a header so it will be used as the filename, and each output file will be called something like 11340.xml. Last of all, I need to wrap the code up in a route that reads the queue, creates the file and spits it out into a folder:

 <route id="xmlfilewriter">
     <from uri="seda:mlfeed"/>
     <log message="hello ${body[0][0]}"/>
     <to uri="bean:gmh?method=makeXML"/>
     <to uri="file:outbox"/>
 </route>

Of course, in the real world, you’d probably not store the file this way and it would go straight to Hadoop, or MarkLogic etc. Also, of course, it could stay in CSV and you could do other cool things to it. That’s what I like about Camel, flexibility.

Simple Arduino code on the EMFCamp Badge. Part 1.

One of the high points of this summer for me was going to EMF Camp. Apart from an amazing experience, I came away with a rather smart electronic badge called a TILDAe. Back home I’ve had a look to see what I could do with it.

There are detailed instructions on the emfcamp wiki on how to download the code and set up a development environment (it uses Arduino 1.5.7) so you can re-flash the badge or extend the code on it. Having given it a go, I found I could reload the code that was on the badge perfectly. But to do anything useful would involve putting my C++ hat on, which, to say the least, is pretty dusty these days. Also, doing anything in the Arduino IDE was quickly going to get pretty painful. So, time to move over to Eclipse and get the books out? Then I remembered reading about it being ‘Arduino Due’ compatible:

At its heart the badge is an Arduino Due compatible 32bit ARM Cortex M3. A rechargeable battery will keep it running for days, and you can charge it over USB when the juice runs out. We added a 128×64 pixel LCD screen, two RGB LEDs, a radio transceiver, joystick, accelerometer, gyroscope, speaker, infrared, and all sorts of other fun parts. It’s compatible with Arduino shields and has dedicated connectors for electronic textiles.

So instead, I started to look at how to set up a straightforward, native Arduino environment that would let me talk to the badge, put stuff on the screen, work the radio*, flash LEDs and get information from the gyro/accelerometer. What I wanted was *simple*, and maybe useful enough to others to get a few more of these things out of drawers.

*If you want to play with the radio, you’ll need one of these £20 SRF/USB sticks from Ciseco or another badge – I’d recommend the stick as they’re dead easy to use.

Setup.

  1. Download the badge code from github https://github.com/emfcamp/Mk2-Firmware and unzip it somewhere for reference later.
  2. Download the Arduino 1.5.7beta code (this has support for the Due). http://arduino.cc/en/main/software
  3. Make yourself a new directory, open the arduino IDE and in preferences change the Sketchbook folder to this directory.
  4. Restart Arduino. [This gives you a clean distinct environment to play in]
  5. Plug in your badge and turn it on.
  6. Tools/Board. Change to Arduino Due (NativeUSB Port)
  7. Tools/Port. Pick your USB port. Mine says /dev/tty.usbmodem1d111 (Arduino Due (NativeUSB Port))
  8. You’re ready to go.

A word about ports.

On the Due, there are two USB ports. If you have spent the last few years writing Serial.println(“some debug message”) in your code, you’ll find nothing happens. That’s because the port attached to the IDE is actually called SerialUSB in Arduino-speak. So, if you want debug messages, it’s SerialUSB.println(). Now you might ask, what does Serial do? Well, in this case it’s attached to the SRF radio (so your messages may be going somewhere after all). I’ll come back to that.

Hardware/wiring.

The badge code, if installed as per the wiki instructions, contains new board definitions you can then pick in the IDE. One of the side effects of this is that it sets the definitions for where the various Arduino pins point. So, for instance, it lets you simply use LED1_RED without knowing what pin it actually is on the badge. In this case I haven’t used it because a) I wanted to know, and b) it comes with ‘baggage’, such as the libraries, header files etc. – remember the simple-environment goal? Not to mention that a lot of the Arduino folk seem to think that if you’re doing board definitions, you’re past the point where you should be using Eclipse or similar anyway. The result, though, is that sketches need to be told what thing is attached to what pin, but since it’s all neatly listed in this header file, that’s not really an issue. If you’re looking for where the joystick is, or what buttons do what, it’s all in there and more.

Note. If you’re interested in the actual hardware, there’s also TiLDA Mk2 Prototype v0.333.pdf, which shows how it’s all put together.

Test one: Blink.

No Arduino write-up would be complete without a Blink sketch. There’s no LED on pin 13 on the badge, but there are a couple of tri-colour LEDs available and I’ll use one of them instead. Here’s the sketch:

/*
Tilda LED test.
Hacked from Adafruit Arduino – Lesson 3. RGB LED
*/

//Pulled from the Hardware definition
#define LED1_BLUE (37u)
#define LED1_GREEN (39u)
#define LED1_RED (41u)
 
void setup()
{
 pinMode(LED1_RED, OUTPUT);
 pinMode(LED1_GREEN, OUTPUT);
 pinMode(LED1_BLUE, OUTPUT); 
}
 
void loop()
{
 setColor(255, 0, 0); // red
 delay(1000);
 setColor(0, 255, 0); // green
 delay(1000);
 setColor(0, 0, 255); // blue
 delay(1000);
 setColor(255, 255, 0); // yellow
 delay(1000); 
 setColor(80, 0, 80); // purple
 delay(1000);
 setColor(0, 255, 255); // aqua
 delay(1000);
}
void setColor(int red, int green, int blue)
{
 analogWrite(LED1_RED, red);
 analogWrite(LED1_GREEN, green);
 analogWrite(LED1_BLUE, blue); 
}

Upload this to your badge and a single LED should flash. Don’t worry if your screen is now blank. That’s next.

LCD Screen.

It’s no good having code on the badge if you can’t put stuff on the screen. The LCD used on the badge is an ST7565, but I couldn’t get the provided GLCD library to work – no doubt due to a lack of C++ skills on my part. Luckily, a scout around on the internet led me to the heroic Kimball Johnson (@drrk), who had a GLCD library for the badge that didn’t reference the rtos software and which you can get from https://github.com/drrk/glcd-tilda. Once downloaded, Sketch -> Import Library -> Add Library will add it into your libraries folder as glcd-tilda.

You need to make one change. Remember what I said about the board definition? At the moment the library won’t compile, since it doesn’t know what pins the LCD connects to. So, in the libraries/glcd-tilda folder, find glcd_Device.cpp, and right after the #include "SPI.h" line (line 31 of the file) add the following lines, then save the file.

#define LCD_CS (52u)
#define LCD_POWER (40u)
#define LCD_BACKLIGHT (35u)
#define LCD_A0 (38u)
#define LCD_RESET (34u)

That’s it. Now we can go for the classic “Hello World”.

Test two: Hello World.

Start a new sketch and put in the following. Notice we need the SPI library from the standard Arduino library as well, since the LCD is a serial device:

#include <SPI.h>
#include <glcd.h>
#include <fonts/allFonts.h>
#define LCD_POWER (40u)
#define LCD_BACKLIGHT (35u)
void setup() {
 //Turn LCD On :-)
 pinMode(LCD_POWER, OUTPUT);
 digitalWrite(LCD_POWER, LOW);
 //Turn Backlight On
 pinMode(LCD_BACKLIGHT, OUTPUT);
 digitalWrite(LCD_BACKLIGHT, HIGH);
 //Init LCD
 GLCD.Init(NON_INVERTED); 
 GLCD.SelectFont(System5x7);
 GLCD.print("Hello, world!");
 GLCD.display();
}
void loop() {
 GLCD.CursorTo(0, 1);
 // print the number of seconds since reset:
 GLCD.print(millis()/1000);
 GLCD.display();
 delay(1000);
}

Notice that I had to physically turn the LCD on (that foxed me for a while), as well as the backlight if needed. Also, the LCD code actually writes to a 1k buffer first, so nothing will happen until you call GLCD.display(), which writes the buffer to the actual device. If you upload this sketch, the screen should spring into life in classic style.

It’s likely you’ll want to do something more complicated than just ‘Hello’. To find out what’s available, have a look in the docs at http://playground.arduino.cc/Code/GLCDks0108, as this library is modified from glcd-arduino (GLCDv3) and has the same functions.

This probably isn’t the only library that will work, and with more effort the provided GLCD one might as well. That would be handy, as this library doesn’t have some niceties like screen rotation. I’ve heard that the U8glib library works with the Due/ST7565 combo, so that might also be worth a try.

Something useful? Basic Clock.

It’s nice to end on something that’s a little bit useful. The M3 chip at the heart of the board has a handy Real-Time Clock (RTC), so a simple clock is the obvious choice. It uses another library, https://github.com/MarkusLange/Arduino-Due-RTC-Library, from the pen of Markus Lange, which is in the badge code, or you can get it from the link above. Add it to your libraries folder as you did for the LCD and you’re ready to go. Start a new sketch and put in the following:

#include <rtc_clock.h>
#include <SPI.h>
#include <glcd.h>
#include <fonts/allFonts.h>
#define LCD_POWER (40u)
#define LCD_BACKLIGHT (35u)
// Select the clock source
RTC_clock rtc_clock(RC);
int old_unixtime;
char* daynames[]={"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
 
void setup() {
 SerialUSB.begin(9600);
 //Turn LCD On :-)
 pinMode(LCD_POWER, OUTPUT);
 digitalWrite(LCD_POWER, LOW);
 //Turn Backlight On
 pinMode(LCD_BACKLIGHT, OUTPUT);
 digitalWrite(LCD_BACKLIGHT, HIGH);
 //Init LCD
 GLCD.Init(NON_INVERTED);
 rtc_clock.init();
//This doesn't work as RTC settings don't survive power-off - not sure why
//So, the default is the last compile Date/Time
 if (rtc_clock.date_already_set() == 0) {
  rtc_clock.set_time(__TIME__);
  rtc_clock.set_date(__DATE__);
 }
//Small font for header
 GLCD.SelectFont(System5x7);
 GLCD.print("EMF Camp TILDAe Clock");
//Big font for time
 GLCD.SelectFont(fixednums15x31);
 GLCD.display();
 }

void loop() {
  if (rtc_clock.unixtime() != old_unixtime) {
    old_unixtime = rtc_clock.unixtime();
    char buffer[10];
    sprintf(buffer, "%02d:%02d:%02d",
            rtc_clock.get_hours(),
            rtc_clock.get_minutes(),
            rtc_clock.get_seconds());
    GLCD.CursorTo(0, 1);
    GLCD.print(buffer);
    GLCD.display();
  }
}

Note. Even though there’s a battery on the badge, it doesn’t remember the RTC settings if you turn it off. I reset it in the code to the last compile date/time as a default. More investigation needed.

What next?

  • There’s no way to set the time yet. I’ll leave that to you – there’s buttons aplenty.
  • No date display.

Wrap Up.

The TILDA badges are fine bits of kit. With a few mods, I’ve a nice simple environment on the Arduino IDE, the basics working and the beginnings of a handy clock. Next time I’ll have a look at the accelerometer, gyroscope and the radio. Until then, as always, it’s all on GitHub: https://github.com/Tingenek/EMFBadge

Review: Apache Camel Developer’s Cookbook

A week or so ago, the nice people at Packt Publishing offered me the chance to review “Apache Camel Developer’s Cookbook”. I’m always happy to read another book on my favourite integration framework (and get a free ebook 🙂), as you always learn something new. I’m also glad to report that this is a fine effort by Scott Cranton (@scottcranton) and Jakub Korab (@jakekorab), and well worth getting to go alongside the canonical “Camel in Action” by Claus Ibsen (@davsclaus).

Where CiA dives deep into the guts of Camel, ACDC is presented as a recipe book. You can read it section by section, starting with the basics of routes and endpoints and moving on through various message patterns to security, testing and performance. Or, if you have some Camel already under your belt, you can drop into the section you want and pick out a given recipe.

It was heartening to see most of the recipes described not only using the Java DSL, but also in Spring XML. It might be more verbose, but it made it a lot easier to read for people like myself coming from the XML document side with only a smattering of Java and using Camel more as a tool.

Each recipe is arranged identically:

  • The goal of the recipe is described.
  • Getting Ready. Prerequisites and how to set up the example code.
  • How to do it. Detailed steps for the recipe.
  • How it works. The background/architecture description in Camel.
  • There’s more. Further steps/more advanced use.
  • See also. Links to resources.

It’s a neat layout that reads easily, with only a couple of places where the material felt a little forced, and each recipe is backed up with code ready to run via a ‘drop-in’ maven project.

The recipes are grouped into themed chapters:

  1. Structuring Routes
  2. Message Routing
  3. Routing to your Code
  4. Transformation
  5. Splitting and Aggregating
  6. Parallel Processing
  7. Error Handling and Compensation
  8. Transactions and Idempotency
  9. Testing
  10. Monitoring and Debugging
  11. Security
  12. Web Services

These were all informative, and showcase how a wide variety of problems can be addressed in Camel, with some background on the EIP message patterns they represent. The chapters on error handling, testing and monitoring are excellent and provide a practical balance, while the chapter on Parallel Processing addresses some of the issues of scale. If I had a complaint, and it’s probably just my take on Camel use, it would be that some of the recipes went straight for the more complex offering, e.g. Bindy for CSV handling rather than starting with a data handler and a POJO. It shows that Camel is ready for the big time, and it is, but I think it obscures Camel’s great flexibility as a framework not only for complex problems but for doing perhaps simpler or mundane things really well.

All in all it’s a good, informative book. If you’ve used Camel before, there’ll be a few things you haven’t seen and some good examples of best practice. If you haven’t, it’s got a good mixture of background and drop-in code to get you started.