{"id":669,"date":"2016-01-20T15:57:17","date_gmt":"2016-01-20T15:57:17","guid":{"rendered":"https:\/\/copyright.lboro.ac.uk\/middleware\/?p=669"},"modified":"2016-01-20T16:03:12","modified_gmt":"2016-01-20T16:03:12","slug":"shibboleth-apis-cors-and-guest-access","status":"publish","type":"post","link":"https:\/\/blog.lboro.ac.uk\/middleware\/blog\/lab-availability\/shibboleth-apis-cors-and-guest-access","title":{"rendered":"Shibboleth, APIs, CORS and guest access"},"content":{"rendered":"<p>I&#8217;m going to start this blog post with a &#8220;user story&#8221; &#8211; the sort of thing that we use to shape systems design. \u00a0Then I&#8217;ll talk about one aspect of the design that this user story gave rise to, and the resulting fun with hacking on Shibboleth and cross site scripting. But first the user story:<\/p>\n<p><strong>User story<\/strong><\/p>\n<p><em>We have computer labs on campus, some of which are public (any members of the institution can use them) and some of which are &#8220;private&#8221; &#8211; only available for staff and students in certain groups (departments or modules for example). The campus is large, with long walks between buildings so students (and to some extent staff) would like to know which labs have machines that are available for them to drop in and use. \u00a0To complicate matters some labs are also bookable for classes, during which time no machines are available for drop in use.<\/em><\/p>\n<p><em>The University is also very keen on using single sign on technologies to minimise the number of times that our users are prompted for their username and password. 
\u00a0We use <a href=\"https:\/\/shibboleth.net\/\" target=\"_blank\">Shibboleth<\/a> and <a href=\"https:\/\/simplesamlphp.org\/\" target=\"_blank\">simpleSAMLphp<\/a> to provide this on campus, and where possible new services should make use of this technology and avoid popping up unnecessary log in dialogues.<\/em><\/p>\n<p><em>In this case we would like to provide a view of machine availability in the public labs for people that have not authenticated, and a tailored view for people that have got an existing Shibboleth session running. \u00a0We also need to make this available in a dynamic HTML page using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Representational_state_transfer\" target=\"_blank\">RESTful<\/a> APIs because some departments may wish to take the data and mash it up into their own displays and reporting systems.<\/em><\/p>\n<p><strong>Shibboleth and APIs<\/strong><\/p>\n<p>OK, so that&#8217;s our user story &#8211; the reason why we&#8217;re going to need to write some code. Most of the API details aren&#8217;t important here &#8211; they just talk to back end databases that are populated regularly with details of labs, machine availability and bookings. The APIs then format the data into some nice <a href=\"http:\/\/www.json.org\/\" target=\"_blank\">JSON<\/a> output that the Javascript on the client can turn into pretty HTML.<\/p>\n<p>However we need to tailor the output for the user if they have already authenticated and have active Shibboleth sessions running so that we can show them specific information about the private labs they have access to. 
\u00a0To do this from client side Javascript we need to know what username (if any) a Shibboleth session is associated with, so that we can then provide a list of the labs that this person has access to using other API calls.<\/p>\n<p>The obvious first approach was to write a simple CGI API script on a web server that has <a href=\"https:\/\/wiki.shibboleth.net\/confluence\/display\/SHIB2\/NativeSPApacheConfig\" target=\"_blank\">Apache and mod_shib<\/a> installed. \u00a0The CGI script would be called from the client side Javascript and would get the user&#8217;s <em>eppn<\/em> or <em>cn<\/em> details. These come from the environment variables that mod_shib provides. The CGI script would return them in a JSON structure for the client side code to then use. \u00a0The resulting script is quite simple:<\/p>\n<p><code>#!\/usr\/bin\/perl<br \/>\nuse strict;<br \/>\nuse JSON;<br \/>\nprint \"Content-type: application\/json\\r\\n\\r\\n\";<br \/>\n# Copy whichever Shibboleth attributes mod_shib has put in the environment<br \/>\nmy $output = {};<br \/>\nforeach my $env_var ('cn', 'eppn') {<br \/>\n&nbsp;&nbsp;if($ENV{$env_var}) {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;$output-&gt;{$env_var} = $ENV{$env_var};<br \/>\n&nbsp;&nbsp;}<br \/>\n}<br \/>\n# Return them to the client as a JSON object (empty if there is no session)<br \/>\nmy $json = JSON-&gt;new;<br \/>\nprint $json-&gt;pretty-&gt;encode($output);<br \/>\n<\/code><\/p>\n<p>The first problem with this is that we also need to support people who aren&#8217;t logged in. This appeared to mean that we couldn&#8217;t use the common Apache mod_shib config that we use with our other server side Shibbolized CGI scripts:<\/p>\n<p><code>&lt;Location \/cgi-bin\/somewhere\/whoami&gt;<br \/>\n&nbsp;&nbsp;AuthType shibboleth<br \/>\n&nbsp;&nbsp;ShibRequestSetting requireSession 1<br \/>\n&nbsp;&nbsp;require valid-user<br \/>\n&lt;\/Location&gt;<br \/>\n<\/code><\/p>\n<p>Not to worry though: reading the Shibboleth documentation there is an option for &#8220;passive&#8221; or &#8220;lazy&#8221; authentication. 
This means that if a Shibboleth session is active, mod_shib makes use of it to fill in the environment variables with user details as before and approves running the CGI script. Otherwise it just passes authentication back to Apache, which can then run the CGI script without the additional Shibboleth variables in the environment. All we need to do is remove the &#8220;<tt>require valid-user<\/tt>&#8221; and change the 1 to a 0 for the <tt>requireSession<\/tt> setting. Sounds like just what we want, right?<\/p>\n<p>Wrong. What passive Shibboleth authentication lacks is the ability to check with the IdP if there is an existing Shibboleth session known to the web browser that wasn&#8217;t made to our web server. Effectively it allows &#8220;guest&#8221; access to the CGI script, with the option of going through a manual IdP login process for that one site if the user wishes. Not that it really matters, as it soon became apparent that there were other issues with doing Shibbolized API calls from Javascript.<\/p>\n<p><strong>XMLHttpRequest and CORS<\/strong><\/p>\n<p>OK, so passive authentication in Shibboleth isn&#8217;t going to help much here. Let&#8217;s step back a moment, and put the normal, non-passive mod_shib configuration shown above back in place. If the user has a valid Shibboleth session running, this should give us their user details; otherwise they&#8217;ll get an HTML page from the IdP asking them to log in.<\/p>\n<p>However, we want to use this CGI script as an API from Javascript, so we&#8217;re going to be calling it using the <tt><a href=\"https:\/\/www.w3.org\/TR\/XMLHttpRequest\/\" target=\"_blank\">XMLHttpRequest<\/a><\/tt> object&#8217;s methods. Maybe we could make the call and then see what the returned document is? If it&#8217;s JSON with <em>cn<\/em> or <em>eppn<\/em> details we know the user is logged in via Shibboleth. 
If it&#8217;s an HTML page of some sort, it&#8217;s probably a login or error page from the IdP intended to be displayed to the user, so we know they aren&#8217;t logged in.<\/p>\n<p>Now, when we call the API CGI script from <tt>XMLHttpRequest<\/tt> we&#8217;re actually going to end up with a <a href=\"https:\/\/www.switch.ch\/aai\/demo\/expert\/\" target=\"_blank\">set of HTTP 302 redirects<\/a> from the API&#8217;s server to the IdP server and possibly back again. Effectively one call to a Shibbolized resource may end up as multiple HTTP transactions. This is where Shibboleth stops playing nicely because of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Cross-site_scripting\" target=\"_blank\">cross domain security in Javascript<\/a> in web browsers:<\/p>\n<ol>\n<li>Cookies cannot be set on the request, and often don&#8217;t propagate across the 302 redirects if a server attempts to set them with a Set-Cookie: HTTP header.<\/li>\n<li>We can&#8217;t intercept any of the 302 redirects in Javascript to try to inject headers or cookies. The browser will do those redirects itself until it hits a 200, 500, 403, etc. response from a web server.<\/li>\n<li>By default, <tt>XMLHttpRequest<\/tt> ignores the output if the responding server doesn&#8217;t match the Origin of the request (i.e. the server where the Javascript came from originally).<\/li>\n<\/ol>\n<p>W3C have been working on\u00a0<a href=\"https:\/\/www.w3.org\/TR\/cors\/\" target=\"_blank\">Cross-Origin Resource Sharing (CORS)<\/a> technologies. \u00a0These can help with some of these issues. For example, web servers can issue an\u00a0<a href=\"https:\/\/www.w3.org\/TR\/cors\/#access-control-allow-origin-response-header\" target=\"_blank\">Access-Control-Allow-Origin HTTP header<\/a>, which allows suitably equipped modern browsers to overcome the Origin checking. 
\u00a0However these are limited: your server can only have one\u00a0Access-Control-Allow-Origin header value, otherwise browser Javascript interpreters will throw an error. \u00a0You can specify &#8220;*&#8221; for the\u00a0Access-Control-Allow-Origin header value, which gives a wild card match against any Origin, but we found that if you do that, browsers then disallow the passing of credentials (including cookies).<\/p>\n<p>So, calling a Shibbolized API from <tt>XMLHttpRequest<\/tt> looks like a non-starter. Every time a hand seems to reach out to help us, another hand comes along and slaps us down. \u00a0We need to be sneakier and&#8230; well, cruftier.<\/p>\n<p><strong>Evil iframe Hacking<\/strong><\/p>\n<p>Let me just say up front: <a href=\"https:\/\/www.w3.org\/wiki\/HTML\/Elements\/iframe\" target=\"_blank\">iframes<\/a> are ugly, are a hack and I&#8217;d rather not use them.<\/p>\n<p>However they do offer us a sneaky solution to this problem, in that they don&#8217;t appear to have some of the restrictions that the <tt>XMLHttpRequest<\/tt> calls do. \u00a0Specifically they appear to set cookies for a remote web server based on ones known to the browser and also honour cookie setting during HTTP 302 redirects.<\/p>\n<p>What we can do is create a hidden iframe dynamically using client side Javascript, set up an <tt>onLoad()<\/tt> handler function and then point the iframe at our Shibboleth protected API CGI script. It will then do the 302 redirection chain to the IdP and possibly back to the API script, and the iframe contents will end up as either a bit of JSON, or the HTML for the login or error page from the IdP. In other words, unlike <tt>XMLHttpRequest<\/tt>, the iframe behaves much more like the web browser session the user experiences.<\/p>\n<p>Our <tt>onLoad()<\/tt> handler function can then use this to determine if the user is logged in to Shibboleth or not. 
There is one more &#8220;gotcha&#8221; though, and again it&#8217;s related to cross site scripting protection in browsers. If we get a resource in an iframe that comes from the same server as the page that the Javascript was included in, we can peer into the contents of that iframe using Javascript object calls. However if the iframe is filled from another server, our Javascript in the client can&#8217;t fiddle with its contents. There&#8217;s a good reason for this: you don&#8217;t want naughty people including your banking site inside an iframe and then extracting your account details as you&#8217;re using it. This also applies if we request a resource from our server in the iframe but, due to HTTP 302 redirects, the final document comes from a different server (as will happen if a user who is not logged in gets sent to\u00a0our IdP).<\/p>\n<p>Luckily, in our case we&#8217;ve got one hack left up our sleeve. If we try to access iframe contents that have come from the IdP (which isn&#8217;t the server we put in the <em>src<\/em> attribute of the iframe), Javascript in the browser throws an error. However we can use the try-catch error handling mechanism to grab this error. As it only happens when we&#8217;ve got a document that hasn&#8217;t come from our CGI API script (assuming our CGI API is hosted on the same server as the HTML and Javascript came from), we know that at that point the user isn&#8217;t logged in with a Shibboleth session. We don&#8217;t need to see the IdP&#8217;s document &#8211; the fact that we can&#8217;t see it tells us what we need to know.<\/p>\n<p>And we&#8217;re there! We can finally have client side Javascript that can deduce whether or not the user&#8217;s browser has had any Shibboleth session with our IdP open and, if so, can find out what the <em>cn<\/em> or <em>eppn<\/em> for the user is. 
Just for completeness, here&#8217;s a test HTML document with embedded Javascript to show that this works &#8211; you&#8217;ll need to serve it from the same web server as the Perl API CGI script above and modify the iframe.src accordingly.<\/p>\n<p><code><br \/>\n&lt;html&gt;<br \/>\n&nbsp;&lt;head&gt;<br \/>\n&nbsp;&nbsp;&lt;title&gt;Javascript playpen&lt;\/title&gt;<br \/>\n&nbsp;&lt;\/head&gt;<br \/>\n&nbsp;&lt;body&gt;<br \/>\n&nbsp;&nbsp;&lt;button type=\"button\" onclick=\"checkIdp()\"&gt;Test against IdP&lt;\/button&gt;<br \/>\n&nbsp;&nbsp;&lt;p id=\"response\"&gt;&lt;\/p&gt;<br \/>\n&nbsp;&nbsp;&lt;script&gt;<br \/>\nfunction IsJsonString(str) {<br \/>\n&nbsp;&nbsp;try {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;JSON.parse(str);<br \/>\n&nbsp;&nbsp;} catch (e) {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;return false;<br \/>\n&nbsp;&nbsp;}<br \/>\n&nbsp;&nbsp;return true;<br \/>\n}<br \/>\nfunction checkIdp() {<br \/>\n&nbsp;&nbsp;document.getElementById(\"response\").innerHTML = '';<br \/>\n&nbsp;&nbsp;var iframe = document.getElementById(\"iframe\");<br \/>\n&nbsp;&nbsp;if(!iframe) {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;iframe = document.createElement('iframe');<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;iframe.id = \"iframe\";<br \/>\n&nbsp;&nbsp;}<br \/>\n&nbsp;&nbsp;iframe.innerHTML = '';<br \/>\n&nbsp;&nbsp;iframe.onload = function() {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;var iframe = document.getElementById(\"iframe\");<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;try {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var text = iframe.contentDocument.body.innerText ;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;} catch (e) {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;document.getElementById(\"response\").innerHTML = 'Not logged in';<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;}<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;if(IsJsonString(iframe.contentDocument.body.innerText)) {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var res = JSON.parse(iframe.contentDocument.body.innerText);<br 
\/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;document.getElementById(\"response\").innerHTML = 'Logged in as ' + res.cn;<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;} else {<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;document.getElementById(\"response\").innerHTML = 'Not logged in';<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;}<br \/>\n&nbsp;&nbsp;};<br \/>\n&nbsp;&nbsp;iframe.hidden = true;<br \/>\n&nbsp;&nbsp;iframe.src = \"https:\/\/www.example.org\/cgi-bin\/ssocheck\/whoami\";<br \/>\n&nbsp;&nbsp;document.body.appendChild(iframe);<br \/>\n&nbsp;&nbsp;iframe.src = iframe.src;<br \/>\n}<br \/>\n&nbsp;&nbsp;&lt;\/script&gt;<br \/>\n&nbsp;&lt;\/body&gt;<br \/>\n&lt;\/html&gt;<br \/>\n<\/code><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;m going to start this blog post with a &#8220;user story&#8221; &#8211; the sort of thing that we use to shape systems design. \u00a0Then I&#8217;ll talk about one aspect of the design that this user story gave rise to, and &hellip; <a href=\"https:\/\/blog.lboro.ac.uk\/middleware\/blog\/lab-availability\/shibboleth-apis-cors-and-guest-access\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[37,5,50],"tags":[],"class_list":["post-669","post","type-post","status-publish","format-standard","hentry","category-apis","category-lab-availability","category-shibboleth"],"_links":{"self":[{"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/posts\/669","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/comments?post=669"}],"version-history":[{"count":28,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/posts\/669\/revisions"}],"predecessor-version":[{"id":697,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/posts\/669\/revisions\/697"}],"wp:attachment":[{"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/media?parent=669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/categories?post=669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/middleware\/wp-json\/wp\/v2\/tags?post=669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}