Posts tagged Perl

Prototyping Goodreads integration

Part of my day job is developing and gluing together library systems. This week I’ve been making a start on doing some of this “gluing” by prototyping some code that will hopefully link our LORLS reading list management system with the Goodreads social book reading site. Now most of our LORLS code is written in either Perl or JavaScript; I tend to write the back end Perl stuff that talks to our databases and my partner in crime Jason Cooper writes the delightful, user friendly front ends in JavaScript. This means that I needed to get a way for a Perl CGI script to take some ISBNs and then use them to populate a shelf in Goodreads. The first prototype doesn’t have to look pretty – indeed my code may well end up being a LORLS API call that does the heavy lifting for some nice pretty JavaScript that Jason is far better at producing than I am!

Luckily, Goodreads has a really well thought out API, so I lunged straight in. They use OAuth 1.0 to authenticate requests to some of the API calls (mostly the ones concerned with updating data, which is exactly what I was up to) so I started looking for a Perl OAuth 1.0 module on CPAN. There’s some choice out there! OAuth 1.0 has been round the block for a while so it appears that multiple authors have had a go at making supporting libraries with varying amounts of success and complexity.

So in the spirit of being super helpful, I thought I’d share with you the prototype code that I knocked up today. Its far, far, far from production ready and there’s probably loads of security holes that you’ll need to plug. However it does demonstrate how to do OAuth 1.0 using the Net::OAuth::Simple Perl module and how to do both GET and POST style (view and update) Goodreads API calls. Its also a great way for me to remember what the heck I did when I next need to use OAuth calls!

First off we have a new Perl module I called Goodreads.pm. Its a super class of the Net::OAuth::Simple module that sets things up to talk to Goodreads and provides a few convenience functions. Its obviously massively stolen from the example in the Net::OAuth::Simple perldoc that comes with the module.

#!/usr/bin/perl

package Goodreads;

use strict;
use base qw(Net::OAuth::Simple);

sub new {
    my $class = shift;
    my %tokens = @_;

    return $class->SUPER::new( tokens => %tokens,
                               protocol_version => '1.0',
                               return_undef_on_error => 1,
                               urls => {
                                 authorization_url => 'http://www.goodreads.com/oauth/authorize',
                                 request_token_url => 'http://www.goodreads.com/oauth/request_token',
                                 access_token_url => 'http://www.goodreads.com/oauth/access_token',
                               });
}

sub view_restricted_resource {
    my $self = shift;
    my $url = shift;
    return $self->make_restricted_request($url, 'GET');
}

sub update_restricted_resource {
    my $self = shift;
    my $url = shift;
    my %extra_params = @_;
    return $self->make_restricted_request($url, 'POST', %extra_params);
}

sub make_restricted_request {
    my $self = shift;
    croak $Net::OAuth::Simple::UNAUTHORIZED unless $self->authorized;

    my( $url, $method, %extras ) = @_;

    my $uri = URI->new( $url );
    my %query = $uri->query_form;
    $uri->query_form( {} );

    $method = lc $method;

    my $content_body = delete $extras{ContentBody};
    my $content_type = delete $extras{ContentType};

    my $request = Net::OAuth::ProtectedResourceRequest->new(
        consumer_key => $self->consumer_key,
        consumer_secret => $self->consumer_secret,
        request_url => $uri,
        request_method => uc( $method ),
        signature_method => $self->signature_method,
        protocol_version => $self->oauth_1_0a ?
            Net::OAuth::PROTOCOL_VERSION_1_0A :
            Net::OAuth::PROTOCOL_VERSION_1_0,
        timestamp => time,
        nonce => $self->_nonce,
        token => $self->access_token,
        token_secret => $self->access_token_secret,
        extra_params => { %query, %extras },
    );
    $request->sign;
    die "COULDN'T VERIFY! Check OAuth parameters.n"
        unless $request->verify;

    my $request_url = URI->new( $url );

    my $req = HTTP::Request->new(uc($method) => $request_url);
    $req->header('Authorization' => $request->to_authorization_header);
    if ($content_body) {
        $req->content_type($content_type);
        $req->content_length(length $content_body);
        $req->content($content_body);
    }

    my $response = $self->{browser}->request($req);
    return $response;
}
1;

Next we have the actual CGI script that makes use of this module. This shows how to call the Goodreads.pm (and thus Net::OAuth::Simple) and then do the Goodreads API calls:

#!/usr/bin/perl

use strict;
use CGI;
use CGI::Cookie;
use Goodreads;
use XML::Mini::Document;
use Data::Dumper;

my %tokens;
$tokens{'consumer_key'} = 'YOUR_CONSUMER_KEY_GOES_IN_HERE';
$tokens{'consumer_secret'} = 'YOUR_CONSUMER_SECRET_GOES_IN_HERE';

my $q = new CGI;
my %cookies = fetch CGI::Cookie;

if($cookies{'at'}) {
    $tokens{'access_token'} = $cookies{'at'}->value;
}
if($cookies{'ats'}) {
    $tokens{'access_token_secret'} = $cookies{'ats'}->value;
}
if($q->param('isbns')) {
    $cookies{'isbns'} = $q->param('isbns');
}
my $oauth_token = undef;
if($q->param('authorize') == 1 && $q->param('oauth_token')) {
    $oauth_token = $q->param('oauth_token');
} elsif(defined $q->param('authorize') && !$q->param('authorize')) {
    print $q->header,
    $q->start_html,
    $q->h1('Not authorized to use Goodreads'),
    $q->p('This user does not allow us to use Goodreads');
    $q->end_html;
    exit;
}

my $app = Goodreads->new(%tokens);

unless ($app->consumer_key && $app->consumer_secret) {
    die "You must go get a consumer key and secret from Appn";
}

if ($oauth_token) {
    if(!$app->authorized) {
        GetOAuthAccessTokens();
    }
    StartInjection();
} else {
    my $url = $app->get_authorization_url(callback => 'https://example.com/cgi-bin/good-reads/inject');
    my @cookies;
    foreach my $name (qw(request_token request_token_secret)) {
        my $cookie = $q->cookie(-name => $name, -value => $app->$name);
        push @cookies, $cookie;
    }
    push @cookies, $q->cookie(-name => 'isbns',
                              -value => $cookies{'isbns'} || '');
    print $q->header(-cookie => @cookies,
                     -status=>'302 Moved',
                     -location=>$url,
                     );
}

exit;

sub GetOAuthAccessTokens {
    foreach my $name (qw(request_token request_token_secret)) {
        my $value = $q->cookie($name);
        $app->$name($value);
    }
    ($tokens{'access_token'},
     $tokens{'access_token_secret'}) =
        $app->request_access_token(
            callback => 'https://example.com/cgi-bin/goodreads-inject',
            );
}

sub StartInjection {
    my $at_cookie = new CGI::Cookie(-name=>'at',
                                    -value => $tokens{'access_token'});
    my $ats_cookie = new CGI::Cookie(-name => 'ats',
                                     -value => $tokens{'access_token_secret'});
    my $isbns_cookie = new CGI::Cookie(-name => 'isbns',
                                       -value => '');
    print $q->header(-cookie=>[$at_cookie,$ats_cookie,$isbns_cookie]);
    print $q->start_html;

    my $user_id = GetUserId();
    if($user_id) {
        my $shelf_id = LoughboroughShelf(user_id => $user_id);
        if($shelf_id) {
            my $isbns = $cookies{'isbns'}->value;
            print $q->p("Got ISBNs list of $isbns");
            AddBooksToShelf(shelf_id => $shelf_id,
                            isbns => $isbns,
                           );
        }
    }

    print $q->end_html;
}

sub GetUserId {
    my $user_id = 0;
    my $response = $app->view_restricted_resource(
                       'https://www.goodreads.com/api/auth_user'
                   );
    if($response->content) {
        my $xml = XML::Mini::Document->new();
        $xml->parse($response->content);
        my $user_xml = $xml->toHash();
        $user_id = $user_xml->{'GoodreadsResponse'}->{'user'}->{'id'};
    }
    return $user_id;
}

sub LoughboroughShelf {
    my $params;
    %{$params} = @_;

    my $shelf_id = 0;
    my $user_id = $params->{'user_id'} || return $shelf_id;

    my $response = $app->view_restricted_resource('https://www.goodreads.com/shelf/list.xml?key=' .  $tokens{'consumer_key'} . '&user_id=' . $user_id);
    if($response->content) {
        my $xml = XML::Mini::Document->new();
        $xml->parse($response->content);
        my $shelf_xml = $xml->toHash();
        foreach my $this_shelf (@{$shelf_xml->{'GoodreadsResponse'}->{'shelves'}->{'user_shelf'}}) {
            if($this_shelf->{'name'} eq 'loughborough-wishlist') {
                $shelf_id = $this_shelf->{'id'}->{'-content'};
                last;
            }
        }
        if(!$shelf_id) {
            $shelf_id = MakeLoughboroughShelf(user_id => $user_id);
        }
    }
    print $q->p("Returning shelf id of $shelf_id");
    return $shelf_id;
}

sub MakeLoughboroughShelf {
    my $params;
    %{$params} = @_;

    my $shelf_id = 0;
    my $user_id = $params->{'user_id'} || return $shelf_id;

    my $response = $app->update_restricted_resource('https://www.goodreads.com/user_shelves.xmluser_shelf[name]=loughborough-wishlist',
        );
    if($response->content) {
        my $xml = XML::Mini::Document->new();
        $xml->parse($response->content);
        my $shelf_xml = $xml->toHash();
        $shelf_id = $shelf_xml->{'user_shelf'}->{'id'}->{'-content'};
        print $q->p("Shelf hash: ".Dumper($shelf_xml));
    }
    return $shelf_id;
}

sub AddBooksToShelf {
    my $params;
    %{$params} = @_;

    my $shelf_id = $params->{'shelf_id'} || return;
    my $isbns = $params->{'isbns'} || return;
    foreach my $isbn (split(',',$isbns)) {
        my $response = $app->view_restricted_resource('https://www.goodreads.com/book/isbn_to_id?key=' . $tokens{'consumer_key'} . '&isbn=' . $isbn);
        if($response->content) {
            my $book_id = $response->content;
            print $q->p("Adding book ID for ISBN $isbn is $book_id");
            $response = $app->update_restricted_resource('http://www.goodreads.com/shelf/add_to_shelf.xml?name=loughborough-wishlist&book_id='.$book_id);
        }
    }
}

You’ll obviously need to get a developer consumer key and secret from the Goodreads site and pop them into the variables at the start of the script (no, I’m not sharing mine with you!). The real work is done by the StartInjection() subroutine and the subordinate subroutines that it then calls once the OAuth process has been completed. By this point we’ve got an access token and its associated secret so we can act as whichever user has allowed us to connect to Goodreads as them. The code will find this user’s Goodreads ID, see if they have a bookshelf called “loughborough-wishlist” (and create it if they don’t) and then add any books that Goodreads knows about with the given ISBN(s). You’d call this CGI script with a URL something like:

https://example.com/cgi-bin/goodreads-inject?isbns=9781565928282

Anyway, there’s a “works for me” simple example of talking to Goodreads from Perl using OAuth 1.0. There’s plenty of development work left in turning this into production level code (it needs to be made more secure for a start off, and the access tokens and secret could be cached in a file or database for reuse in subsequent sessions) but I hope some folk find this useful.

LORLS’s Installer

We’re getting to the point with LUMP and CLUMP where Jason and I are thinking about installers.  If we were just going to use the code at Loughborough this would be a moot point – we’d just copy the various directories around and tweak under the hood to get the thing to work as we want, just as we have done during development.

However its not going to just be us that use it (or at least hopefully it won’t be!) so we need to think of a nice way for people to install it.  The previous versions of LORLS have a Perl command line script that I wrote years ago that asks lots of questions and then does its funky thang.  It works, but it could be friendlier, especially for the newbie admins.  I’ve also seen a number of PHP based packages that allow you to do most of the installation via a web browser which is rather nice for the new admins, so I thought, “Hey, why not make an installer CGI script that knows very little about the system other than there is Perl there, a MySQL database server is running, the CGI.pm module (which has been part of the Perl distribution since 5.4 so should be there already on most machines installed in the 21st century) and is an executable CGI script?”  Why not indeed…

Obviously the prospective LUMP admin needs to install (or have installed for him) Perl, MySQL and a webserver configured to execute CGI scripts in the directory that we tell them to unpack LUMP into, but once that’s done surely we can then check to make sure that all the other modules are present, install them if they aren’t, set up the MySQL database ready for LUMP to use and then configure all the LUMP scripts so that they are ready to go, all with a nice point and drool web interface?

Checking if a Perl module is available is relatively easy, thanks to the eval { } function in Perl.  For example, say we want to check if Data::Dumper is installed.  That can be done using some code such as:

eval { use Data::Dumper; };
if($@) {
  # Module missing
} else {
  # Module present, and available for use
}

Shimples!

However things start to get “interesting” if the module isn’t installed.  I thought this would be easy, as the Comprehensive Perl Archive Network (CPAN) has a nice module that I’ve used for years to interactively install and update Perl modules with – things like:

perl -MCPAN -e install Data::Dumper

It has a programmatic option as well as an interactive side, so we’re sorted right?  Well no, we’re not.  Unfortunately the programmatic side really just generates the stream of stuff you see when you run the interactive side.  If you include something like:

my $result = CPAN::Shell->install('Data::Dumper');

in your CGI script, eventually you’ll get a result in the web browser of a load of unformated raw text from this command interspersed with your active HTML.  The $result variable on the other hand will stay completely empty, with no indication as to whether the installation has worked or not.   Pants – not what we want here.

The long and short of it is, to get round this “feature” in CPAN::Shell it seems that you have to do a bit of fork() action.  In other words folk off a child process to run the CPAN::Shell method in and then, back in the parent, capture its STDOUT stream into a variable which can then be scanned with a regexp or two looking for signs of success in the output text.  Not terribly clean but it works.

There’s another “gotcha” as well: the web server is unlikely to be running as root (or at least it shouldn’t be!) and so the CGI script can’t install Perl modules into the system library directories.  This is a more minor pain: you can tell CPAN::Shell that it should do a local installation to a place that the user executing it can write to.  So that’s another requirement for running this CGI script: create a directory that’s readable and writable by the user running the web server (usually called something like apache or http) but which isn’t part of the web server document root.  In other words, if the web server document root is /var/www/html, we might want to put this LUMP specific CPAN directory tree in /var/ww/LUMPCPAN where it can’t be seen by web browsers.  You have to hack up a MyConfig.pm to put in this directory and then point @INC and $ENV{‘PERL5LIBS’} towards it, but that can be done automagically by the CGI installer script once the directory exists.

Now some readers (we do have readers, right?) might be wondering why I don’t use one of the fancy pants CPAN modules such local::libs or Module::Install to do this rather than tackling CPAN::Shell head on.  Well its a chicken and egg problem really: I wanted to have the minimum requirements for the installer script to run and, if I’d have asked the user to install a load of libraries and modules to make using CPAN easier I might as well have just given them a list of modules to install.  Which actually I have done anyway, just in case they want to install them system wide and/or not trut my installer script to get them right.  But nevertheless I’d like to give the newbies (and also folk who have sluggish server admins if they don’t run their own boxes) the option of installing via the CGI installer script.  More work for me, but hopefully less for them.

So, its a bit of kludge at the moment but it seems to be the way to go unless anyone else can come up with a better scheme.  Once I’m happy it can get the required modules in place, my next trick will be to ask for database server details (database name, username, password, server, port) and then check if the database already exists, or create a fresh one.  In the former case I’m also planning on checking the schema in the database against the one in this LUMP distribution and then offer to update it if needs be, and allow fresh databases to either be empty, have some simple test data in them or copy data from another database.  Hopefully I can also bolt the importer from older LORLS versions in here so that there’s no command line interaction required at all.  With a bit of luck all of those should be alot less hassle than CPAN::Shell has proved to be.

I hope… 🙂

Go to Top