Archive for January, 2009

Meeting with Talis

Had a really interesting meeting this afternoon with a couple of folk from Talis. Mark Bush, who is new to the company, contacted me a few weeks ago to arrange it. The purpose of the meeting was to gain a greater understanding of different approaches to reading list management. Mark was accompanied by Ian Corns, who, despite his job title of “User Experience Champion”, didn’t arrive wearing a cape or with his underpants over his clothes.

The first part of the meeting turned into a show and tell, with me detailing the birth of LORLS and our ongoing project to redevelop the system, and Mark showing me a little of Aspire, Talis’ replacement for their existing Talis List product. The thing that struck me was how similar in concept their new solution is to what we’re doing with LORLS.

The second half of the meeting was given over to discussing the various possible strategies to use when implementing a reading list solution. Obviously selecting an appropriate system is important. However, many of the key issues that will determine whether it is a success are not necessarily system dependent.

For a listing of some of these key issues send £12.99 and a stamped self-addressed envelope to Ga… OK OK I’ll tell, please just stop hitting me Jon 🙂

  1. Involve all stakeholders as soon as possible in the implementation process – pretty obvious I know, but still important to remember
  2. From a Library perspective it’s much easier to work with academics if you’re not seen as the ones forcing the system upon them
  3. Pump priming the system with academics’ existing (often Word based) reading lists can be a real winner – once a list is on the system it is much easier to get academics to update it, or at least to be aware when they haven’t so you can nag them about it!
  4. Training, training, training
  5. Local champions can often do more for the success of a project than official promotions – identify your champions and support them
  6. It’s important to know the lie of the land – what may work with one department won’t necessarily work with another. For example, Engineers have a very different approach to reading lists from Social Scientists.
  7. Competition between academic departments or faculties can be a useful means of encouraging adoption of the system but needs to be done with care
  8. Use every opportunity to stress the importance of reading lists to academic departments. Bad module feedback? That could be down to the lack of reading lists on the system. External review approaching? Why not invest some time in updating your reading lists to demonstrate clear communication between academics, librarians and students?

Why the formatting Perl is held in the database

The current ER model includes tables that hold formatting information in order to allow the layout of reading lists in different formats (HTML, BibTeX, Refer, etc). This formatting information is currently planned to be snippets of Perl code, allowing some serious flexibility in formatting the output.

However, the question is: why is this code in the database? Here at Loughborough we could just as easily have it stored in the filesystem along with all the other Perl code for the back end. The reason for having the Perl fragments held in the database is that it will potentially make life easier at other institutions, where there is a separation between the folk who run servers and install the software, and the folk who actually manage the library side of the database. In some cases getting software updates loaded on machines can be tricky and time-consuming, so having the formatting code held in the database will allow the library administrators to tweak it without having to jump through hoops.
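To give a flavour of the idea, here’s a minimal sketch of how a back end script might use such a fragment. The table, column and field names here are invented for illustration and aren’t the actual LUMP schema:

    #!/usr/bin/perl
    # Sketch only: the format_snippets table and its columns are hypothetical.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=lump', 'lumpuser', 'secret',
                           { RaiseError => 1 });

    # One Perl fragment per output format (HTML, BibTeX, Refer, ...).
    my ($fragment) = $dbh->selectrow_array(
        'SELECT code FROM format_snippets WHERE format = ?', undef, 'HTML');

    # Metadata for a single item; in reality this would come from the database too.
    my %item = (author => 'Smith, J.', title => 'An Example Book', year => 2008);

    # A stored HTML fragment could be as simple as:
    #   qq{$item{author} ($item{year}) <em>$item{title}</em>}
    my $formatted = eval $fragment;
    die "Formatting fragment failed: $@" if $@;
    print "$formatted\n";

The obvious caveat is that eval’ing code straight out of the database means you have to trust whoever can edit it, but those are exactly the library administrators the scheme is intended to empower.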

New Blood

Introducing the latest member of our development team: Jason Cooper!

Jason has been around for a while and is a survivor of our data design wars (it’s all just structural units! why do we need data type groups?). We’ve given him the task of developing a new whizzy interface which will communicate with Jon’s APIs.

Like Jon, Jason has a PhD, which means I’m now in the middle of a doctor-doctor joke.

REST versus SOAP

We want LUMP to have a well separated client-server architecture, so that we can deploy several different types of front end and also so that third party developers can implement their own front (and back!) ends independently. We’ve pretty much decided on using XML to return the results (as most client programming environments have some sort of XML handling these days, and if the worst comes to the worst there are network XSLT transformation services that can turn the XML into renderable HTML). However, we needed to decide how to send requests from the client to the server in the first place.

The initial plan was to have clients lovingly hand craft XML documents for the request and then send these as a single parameter to a CGI script. This XML document would contain all the authentication, version and other metadata for the LUMP protocol, as well as the method being requested and any parameters required. It would have worked OK, but it meant that clients needed to generate an XML document even for a relatively simple request.
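As a sketch of what such a hand crafted request might have looked like (the element names are invented for illustration, not an actual LUMP schema), a simple lookup could have ended up as something like:

    <lump-request version="1.0">
      <auth username="jbloggs" token="abc123"/>
      <method name="FindSuid">
        <param name="what">Module</param>
        <param name="aspect">Module Code</param>
        <param name="value">08LBA100</param>
      </method>
    </lump-request>

Which is quite a lot of machinery for what is essentially a three parameter lookup.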

Next we considered the Simple Object Access Protocol (SOAP). SOAP basically does the XML packaging of parameters and results in a standardised manner that is supposed to allow different programming languages on different platforms to remotely access objects and methods as though they were part of the local application. That’s the theory at least. SOAP works better when coupled with the Web Services Description Language (WSDL), but that’s also where things start to come unravelled.

Firstly, WSDL appears to have real difficulties in handling dynamic data structures. Most of the examples of WSDL on the web pass a string or two into a routine and then get out a single string, floating point number or integer. But we’d like to return far more complex data structures such as nested hashes of hashes, arrays of hashes, etc. We could describe those in terms of an XML DTD for our hand crafted responses, but not easily in WSDL for use with SOAP. WSDL also imposes an extra transfer overhead on clients and/or requires that the client cache the WSDL file (which makes development a pain, as the WSDL would normally be in constant flux during development and thus uncachable).

There were also some practical problems with SOAP. The main development of the LUMP server is likely to be in Perl and, whilst there is a SOAP Perl module available, it’s nowhere near as well developed as many of the other Perl modules out there. Secondly, we found that sometimes we could get SOAP to work with, say, Perl, but then not work with PHP (PHP being the other major target language, this time for the Moodle client block). Fixing access for the PHP client could then cause problems for the Perl client. Why things sometimes worked and sometimes didn’t was never particularly clear, which hardly showed SOAP to be platform independent. Debugging the SOAP transactions was also very hard: even with debugging turned on in the SOAP libraries the interactions can be very complicated, and you end up sitting with the SOAP specification trying to work out which bit of the libraries that were supposed to make life easy for you was actually making it much harder!

So we opted for a third choice, and one we were already quite familiar with: REpresentational State Transfer (REST). APIs that use REST (so called “RESTful” APIs) make use of the existing web infrastructure such as HTTP, SSL, web server authentication technologies and CGI scripts. A RESTful API uses a base URI pointing at some sort of program (such as a CGI script or mod_perl instance) combined with URI query parameters to specify the operation required, user authentication information, request details, etc. Many programming environments have built-in support for accessing URIs, so these RESTful APIs can reuse existing technologies without needing new, complex and potentially buggy libraries. Generating the URIs is relatively simple and straightforward in most languages.

The results of a REST transaction can be anything you like, so there was no reason why it couldn’t be the sort of XML result documents that we looked at originally. This gave the results some platform neutrality, whilst still allowing us to return what were effectively dynamic data structures. It also means that you can still easily do funky things with XSLT if you want to. Clients can process returned XML fairly easily these days (as evidenced by the AJAX mash-ups out there), but they wouldn’t need to go through the overhead of generating a new XML document for the outgoing request itself (unless it was required for a complex parameter perhaps – that’s flexibility for you!).
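Again purely as a sketch (the element names are made up for illustration), the sort of nested result we have in mind might come back as something like:

    <lump-response status="ok">
      <structural-unit suid="1234" type="Module">
        <data-element name="Module Code">08LBA100</data-element>
        <data-element name="Module Title">An Example Module</data-element>
        <children>
          <structural-unit suid="5678" type="Book">
            <data-element name="Title">An Example Book</data-element>
          </structural-unit>
        </children>
      </structural-unit>
    </lump-response>

which maps straightforwardly onto the hashes of hashes and arrays of hashes we want, without WSDL having to be told about any of it in advance.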

RESTful APIs also have a big conceptual win over a SOAP based API: the interface gets real URLs for each request. This means that not only can you retrieve object details using a suitably crafted URI, but you can also create, update and delete them (assuming you have the rights to, obviously). Having such URIs available means that other parts of the W3C’s web standards development dovetail nicely with RESTful APIs: things like RDF and XPointer only really become useful if they can reference resources using URIs.

Having chosen (for the moment!) REST as the API mechanism, there appeared to be one decision still outstanding: where to draw the line between the “base” URI and the parameters. One option is to have a single base URI for all operations and then a parameter (amongst others) that describes the operation required. For example, a FindSuid call might look something like:

http://example.com/cgi-bin/LUMP?operation=findsuid&what=Module&aspect=Module%20Code&value=08LBA100

The other option is to have different base URIs for each of the “methods” that the API has. For example the base URI for the above operation might be:

http://example.com/cgi-bin/LUMP/FindSuid

and then we’d add parameters to the end of that, giving a URI of:

http://example.com/cgi-bin/LUMP/FindSuid?what=Module&aspect=Module%20Code&value=08LBA100

There’s not really much in it as far as we can see at the moment. On the one hand, having a single base URI might mean that the client’s configuration is very, very slightly simpler. On the flip side, having separate URIs might translate to having lots of small, simple CGI scripts, as opposed to a single monolithic script. The small scripts might or might not be easier to maintain, depending on how much code reuse and modularisation we can come up with. Also, if we use something like mod_perl with the scripts to speed things up, the smaller units will hog less memory in the web server, take less time to get running, and can be updated with less impact on a running service.

Either of these would appear to be RESTful as they both result in workable URIs. Luckily we can have our cake and eat it in this case. If we opt for individual URIs for each method then we can take advantage of the back end benefits, but also then use existing web technologies to make them appear as a single monolithic interface with a single base URI and an “operation” parameter if we wish to. This could either be through a wrapper CGI script or one of the web server URL rewriting technologies (though Apache may have issues with rewriting parameters).
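To show just how little client code a RESTful call actually needs, here’s a rough Perl sketch of the FindSuid example above. It assumes the per-method URI style and the hypothetical response layout sketched earlier, so treat it as illustrative rather than the final API:

    #!/usr/bin/perl
    # Sketch of a RESTful LUMP client; the URI layout and XML format are assumptions.
    use strict;
    use warnings;
    use LWP::UserAgent;
    use URI;
    use XML::LibXML;

    my $uri = URI->new('http://example.com/cgi-bin/LUMP/FindSuid');
    $uri->query_form(
        what   => 'Module',
        aspect => 'Module Code',   # query_form does the %20 escaping for us
        value  => '08LBA100',
    );

    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($uri);
    die 'Request failed: ' . $response->status_line unless $response->is_success;

    # Parse the returned XML and pull out the SUID(s) we asked for.
    my $parser = XML::LibXML->new;
    my $doc    = $parser->parse_string($response->decoded_content);
    foreach my $unit ($doc->findnodes('//structural-unit')) {
        print $unit->getAttribute('suid'), "\n";
    }

No SOAP toolkit, no WSDL cache, and nothing a PHP client couldn’t do just as easily with its own HTTP and XML support.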
