{"id":1652,"date":"2013-03-25T17:29:55","date_gmt":"2013-03-25T17:29:55","guid":{"rendered":"https:\/\/copyright.lboro.ac.uk\/lorls\/?p=1652"},"modified":"2013-03-25T17:29:55","modified_gmt":"2013-03-25T17:29:55","slug":"purchasing-prediction","status":"publish","type":"post","link":"https:\/\/blog.lboro.ac.uk\/lorls\/lorls\/purchasing-prediction","title":{"rendered":"Purchasing Prediction"},"content":{"rendered":"<p>For a while now we&#8217;ve been feverishly coding away at the top secret LORLS bunker on a new library management tool attached to LORLS: a system to help predict which books a library should buy and roughly how many will be required. \u00a0The idea behind this was that library staff at Loughborough were spending a lot of time trying to work out what books needed to be bought, mostly based on trawling through lots of data from systems such as the LORLS reading list management system and the main Library Management System (Aleph in our case). \u00a0We realised that all of this work was suitable for &#8220;mechanization&#8221; and we could extract some rules from the librarians about how to select works for purchase and then do a lot of the drudgery for them.<\/p>\n<p>We wrote the initial hack-it-up-and-see-if-the-idea-works-at-all prototype about a year ago. \u00a0It was pretty successful and was used in Summer 2012 to help the librarians decide how to spend a bulk book purchasing budget windfall from the University. \u00a0We wrote up our experiences with this code as a <a href=\"http:\/\/www.dlib.org\/dlib\/march13\/knight\/03knight.html\" target=\"_blank\">Dlib article<\/a> which was published recently. \u00a0That article explains the process behind the system, so if you&#8217;re interested in the basis behind this idea it&#8217;s probably worth a quick read (or to put it in other terms, I&#8217;m too lazy to repeat it here! 
\ud83d\ude09 ).<\/p>\n<p>Since then we&#8217;ve stepped back from the initial proof of concept code and spent some time re-writing it to be more maintainable and extensible. \u00a0The original hacked together prototype was just a single monolithic Perl script that evolved and grew as we tried ideas out and added features. \u00a0It has to call out to external systems to find loan information and book pricing estimates with bodges stuffed on top of bodges. \u00a0We always intended it to be the &#8220;one we threw away&#8221; and after the summer tests it became clear that the purchase prediction could be useful to the Library staff and it was time to do version 2. \u00a0And do it properly this time round.<\/p>\n<p>The new version is modular&#8230; using good old Perl modules! Hoorah! But we&#8217;ve been a bit sneaky: we&#8217;ve a single Perl module that encompasses the purchase prediction process but doesn&#8217;t actually <em>do<\/em> any of it. \u00a0This is effectively a &#8220;wrapper&#8221; module that a script can call and which knows about what other purchase prediction modules are available and what order to run them in. \u00a0The calling script just sets up the environment (parsing CGI parameters, command line options, hardcoded values or whatever is required), instantiates a new <strong>PurchasePrediction<\/strong> object and repeatedly calls that object&#8217;s\u00a0DoNextStep() method with a set of input state (options, etc) until the object returns an untrue state. \u00a0At that point the prediction has been made and suggestions have, hopefully, been output. 
\u00a0Each call to DoNextStep() runs the next Perl module in the purchase prediction workflow.<\/p>\n<p>The <strong>PurchasePrediction<\/strong> object comes with sane default state stored within it that provides a standard workflow for the process and some handy information that we use regularly (for example ratios of books to students in different departments and with different levels of academic recommendation in LORLS. \u00a0You tend to want more essential books than you do optional ones obviously, whilst English students may require use of individual copies longer than Engineering students do). \u00a0This data isn&#8217;t private to the <strong>PurchasePredictor<\/strong> module though &#8211; the calling script could if it wished alter any of it (maybe change the ratio of essential books per student for the Physics department or even alter the workflow of the purchase prediction process).<\/p>\n<p>The workflow that we currently follow by default is:<\/p>\n<ol>\n<li><span style=\"line-height: 12.796875px\">Acquire ISBNs to investigate,<\/span><\/li>\n<li>Validate those ISBNs,<\/li>\n<li>Make an initial estimate of the number of copies required for each work we have an ISBN for,<\/li>\n<li>Find the cost associated with buying a copy of each work,<\/li>\n<li>Find the loan history of each work,<\/li>\n<li>Actually do the purchasing algorithm itself,<\/li>\n<li>Output the purchasing suggestions in some form.<\/li>\n<\/ol>\n<p><span style=\"line-height: 19px\">Each of those steps is broken down into one or more (mostly more!) separate operations, with each operation then having its own Perl module to implement it. 
\u00a0For example to acquire ISBNs we&#8217;ve a number of options:<\/span><\/p>\n<ol>\n<li><span style=\"line-height: 12.796875px\">Have an explicit list of ISBNs given to us,<\/span><\/li>\n<li>Find the ISBNs of works on a set of reading list modules,<\/li>\n<li>Find the ISBNs of works on reading lists run by a particular department (and optionally stage within that department &#8211; ie First Year Mech Eng)<\/li>\n<li>Find ISBNs from reading list works that have been edited within a set number of days,<\/li>\n<li>Find all the ISBNs of books in the entire reading list database<\/li>\n<li>Find ISBNs of works related to works that we already have ISBNs for (eg using LibraryThing&#8217;s FRBR code to find alternate editions, etc).<\/li>\n<\/ol>\n<p><span style=\"line-height: 19px\">Which of these will do anything depends on the input state passed into the <strong>PurchasePredictor&#8217;<\/strong>s DoNextStep() method by the calling script &#8211; they all get called but some will just return the state unchanged whilst others will add new works to be looked at later by subsequent Perl modules.<\/span><\/p>\n<p>Now this might all sound horribly confusing at first but it has big advantages for development and maintenance. 
When we have a new idea for something to add to the workflow we can generate a new module for it, slip it into the list in the <strong>PurchasePredictor<\/strong> module&#8217;s data structure (either by altering this on the fly in a test script or, once the new code is debugged, altering the default workflow data held in the <strong>PurchasePredictor<\/strong> module code) and then pass some input to the <strong>PurchasePredictor<\/strong> module&#8217;s DoNextStep() method that includes flags to trigger this new workflow.<\/p>\n<p>For example until today the workflow shown in the second list above did not include the step &#8220;Find ISBNs from reading list works that have been edited within a set number of days&#8221;; that got added as a new module written from scratch to a working state in a little over two hours. \u00a0And here&#8217;s the result, rendered as an HTML table:<\/p>\n<div id=\"attachment_1656\" style=\"width: 310px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/blog.lboro.ac.uk\/lorls2\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1.png\" rel=\"lightbox[1652]\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1656\" class=\"size-medium wp-image-1656\" alt=\"Purchase Predictor pp2 HTML showing suggested purchases for works in LORLS edited in the last day.\" src=\"https:\/\/blog.lboro.ac.uk\/lorls2\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1-300x225.png\" width=\"300\" height=\"225\" srcset=\"https:\/\/blog.lboro.ac.uk\/lorls\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1-300x225.png 300w, https:\/\/blog.lboro.ac.uk\/lorls\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1-1024x768.png 1024w, https:\/\/blog.lboro.ac.uk\/lorls\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1-900x675.png 900w, https:\/\/blog.lboro.ac.uk\/lorls\/wp-content\/uploads\/sites\/3\/2013\/03\/pp-screenshot1.png 1068w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><p 
id=\"caption-attachment-1656\" class=\"wp-caption-text\">Purchase Predictor pp2 HTML showing suggested purchases for works in LORLS edited in the last day.<\/p><\/div>\n<p>As you can see the purchase predictor HTML output in this case tries to fit in with Jason&#8217;s new user interface, which it can do easily as that&#8217;s all encompassed in Perl modules as well!<\/p>\n<p>There&#8217;s still lots more work we can do with purchase prediction. \u00a0As one example, the next thing on my &#8216;To Do&#8217; list is to make an output module that generates emails for librarians so that it can be run as a batch job by cron every night. \u00a0The librarians can then sip their early morning coffee whilst pondering which book purchasing suggestions to follow up. \u00a0The extensible modular scheme also means we&#8217;re free to plug in different actual purchasing algorithms&#8230; maybe even incorporating some machine learning, with feedback provided by the actual purchases approved by the librarians against the original suggestions that the system made.<\/p>\n<p>I knew those undergraduate CompSci Artificial Intelligence lectures would come in handy eventually&#8230; \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For a while now we&#8217;ve been feverishly coding away at the top secret LORLS bunker on a new library management tool attached to LORLS: a system to help predict which books a library should buy and roughly how many will be required. 
\u00a0The idea behind this was that library staff at Loughborough were spending a [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[3],"tags":[27,52,67],"class_list":["post-1652","post","type-post","status-publish","format-standard","hentry","category-lorls","tag-dlib","tag-lorls-v7","tag-purchase-prediction","count-0","even alt","author-cojpk","last"],"_links":{"self":[{"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/posts\/1652","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/comments?post=1652"}],"version-history":[{"count":0,"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/posts\/1652\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/media?parent=1652"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/categories?post=1652"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/lorls\/wp-json\/wp\/v2\/tags?post=1652"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}