{"id":181,"date":"2021-06-08T11:14:38","date_gmt":"2021-06-08T10:14:38","guid":{"rendered":"https:\/\/blog.lboro.ac.uk\/cmc\/?p=181"},"modified":"2026-07-23T09:12:03","modified_gmt":"2026-07-23T08:12:03","slug":"measuring-deep-learning-in-educational-research","status":"publish","type":"post","link":"https:\/\/blog.lboro.ac.uk\/cmc\/2021\/06\/08\/measuring-deep-learning-in-educational-research\/","title":{"rendered":"Measuring deep learning in educational research"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><em><span style=\"color:#5d6f77\" class=\"has-inline-color\">Written by Dr Ian Jones. Ian is a Reader in Mathematics Assessment at the Mathematics Education Centre, Loughborough University. Please see <a href=\"https:\/\/www.lboro.ac.uk\/departments\/mec\/staff\/ian-jones\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a> for more information about Ian and his work.<\/span><\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Quantitative research studies are increasingly valued by researchers, policymakers and teachers, but the findings are only as good as the measures of learning used. It is straightforward to measure some types of learning, such as recalling facts and applying algorithms, but we are typically interested in deeper learning, such as understanding of concepts or applying knowledge to novel problems. Unfortunately, deep learning is harder to define and harder to measure, and this inhibits both the quantity and quality of educational research.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To address this problem, researchers at Loughborough investigated efficient methods for producing high-quality measures of deep learning. To do this we adapted and applied measures based on comparative judgement methods. The measures we produced are quite distinct from the traditional tests and scoring rubrics that dominate quantitative studies in educational research. 
Subject experts are presented with two pieces of student work and asked, simply, which student has demonstrated the deeper learning based on the evidence presented. Many such pairwise decisions from a group of subject experts are collected and then sent to an algorithm to produce a score for each piece of work. The algorithm, based on the Bradley-Terry model, is like a more sophisticated version of calculating points from match results in football. Our comparative judgement-based methods have been shown to be efficient, reliable and valid across a range of target domains and learning contexts.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">An example of using comparative judgement to measure deep learning was provided by research led by&nbsp;<a href=\"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/full\/10.1002\/berj.3519\">Dr <\/a><a href=\"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/full\/10.1002\/berj.3519\" target=\"_blank\" rel=\"noreferrer noopener\">Ian<\/a><a href=\"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/full\/10.1002\/berj.3519\"> Jones at <\/a><a href=\"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/full\/10.1002\/berj.3519\" target=\"_blank\" rel=\"noreferrer noopener\">Loughborough<\/a><a href=\"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/full\/10.1002\/berj.3519\"> University<\/a>. We ran an intervention study in which older primary students were introduced to simple algebra using one of two software packages:&nbsp;<a href=\"https:\/\/link.springer.com\/article\/10.1007\/s40751-016-0018-4\">Grid <\/a><a href=\"https:\/\/link.springer.com\/article\/10.1007\/s40751-016-0018-4\" target=\"_blank\" rel=\"noreferrer noopener\">Algebra<\/a> or&nbsp;<a href=\"https:\/\/onlinelibrary.wiley.com\/doi\/full\/10.1111\/j.1365-2729.2011.00469.x\" target=\"_blank\" rel=\"noreferrer noopener\">MiGen<\/a>. 
Following the intervention, the main measure was based on an open-ended mathematics prompt as follows.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Explain how letters are used in algebra to someone who has never seen them before. You can use examples and writing to help you give the best explanation that you can.&nbsp;<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Students had 10 minutes to complete their answer on a single page. A group of subject experts then made pairwise judgements of students&#8217; responses to the mathematics prompt, and from these decisions we generated a score for each participant. The results showed that students in the Grid Algebra intervention outperformed those in the MiGen intervention.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">An example comparison<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"656\" src=\"http:\/\/blog.lboro.ac.uk\/cmc\/wp-content\/uploads\/sites\/54\/2021\/06\/Ian-1024x656.png\" alt=\"\" class=\"wp-image-184\" srcset=\"https:\/\/blog.lboro.ac.uk\/cmc\/wp-content\/uploads\/sites\/54\/2021\/06\/Ian-1024x656.png 1024w, https:\/\/blog.lboro.ac.uk\/cmc\/wp-content\/uploads\/sites\/54\/2021\/06\/Ian-300x192.png 300w, https:\/\/blog.lboro.ac.uk\/cmc\/wp-content\/uploads\/sites\/54\/2021\/06\/Ian-768x492.png 768w, https:\/\/blog.lboro.ac.uk\/cmc\/wp-content\/uploads\/sites\/54\/2021\/06\/Ian.png 1384w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To validate our results, we also administered&nbsp;<a href=\"https:\/\/faculty.weber.edu\/eamsel\/Research%20Groups\/Math%20Research\/Kuchemann%20(1978).pdf\">a standard algebra test that we<\/a><a rel=\"noreferrer noopener\" href=\"https:\/\/faculty.weber.edu\/eamsel\/Research%20Groups\/Math%20Research\/Kuchemann%20(1978).pdf\" target=\"_blank\"> <\/a><a 
href=\"https:\/\/faculty.weber.edu\/eamsel\/Research%20Groups\/Math%20Research\/Kuchemann%20(1978).pdf\">adapted from the literature<\/a>. The standard test purports to measure understanding of algebra concepts and so provided a yardstick for our novel comparative judgement-based method. When we conducted the analysis again, but this time using scores from the standard test, we replicated the results produced using scores from the open-ended mathematics prompt. Importantly, the design and implementation of the comparative judgement-based method was far more efficient than the design and implementation of the standard test. Moreover, our approach is flexible and can be readily applied to any target concept without the time and expense required to develop and validate a traditional measure. Therefore, we concluded that comparative judgement-based methods have the potential to improve the quantity and the quality of quantitative educational research studies.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Researchers interested in using comparative judgement methods can do so using the freely available comparative judgement engine at&nbsp;<a rel=\"noreferrer noopener\" href=\"http:\/\/www.nomoremarking.com\/\" target=\"_blank\">www.nomoremarking.com<\/a>. We have recently developed a how-to guide for researchers interested in comparative judgement, which is available at&nbsp;<a href=\"https:\/\/tinyurl.com\/NMM4researchers\">tinyurl.com\/<\/a><a rel=\"noreferrer noopener\" href=\"https:\/\/tinyurl.com\/NMM4researchers\" target=\"_blank\">NMM4researchers<\/a>. You are also welcome to get in touch with Ian at&nbsp;<a href=\"mailto:I.Jones@lboro.ac.uk\">I.Jones@lboro.ac.uk<\/a> for further advice and assistance.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Written by Dr Ian Jones. Ian is a Reader in Mathematics Assessment at the Mathematics Education Centre, Loughborough University. Please see here for more information about Ian and his work. 
Quantitative research studies are increasingly valued by researchers, policymakers and teachers, but the findings are only as good as the measures of learning used. It [&hellip;]<\/p>\n","protected":false},"author":676,"featured_media":184,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"lboro_blog_alternative_thumbnail_image":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[45],"tags":[43,40,38,39,44,42],"class_list":["post-181","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-methods","tag-algebra","tag-comparative-judgement","tag-education-assessment","tag-educational-assessment","tag-marking","tag-no-more-marking"],"_links":{"self":[{"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/posts\/181","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/users\/676"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/comments?post=181"}],"version-history":[{"count":5,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/posts\/181\/revisions"}],"predecessor-version":[{"id":187,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/posts\/181\/revisions\/187"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/media\/184"}],"wp:attachment":[{"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/media?parent=181"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/categories?post=181"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.lboro.ac.uk\/cmc\/wp-json\/wp\/v2\/tags?post=181"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","te
mplated":true}]}}