How to measure learning gain
You'd think that for something of such fundamental importance there would be one clear, agreed method. Alas, there is not. We reviewed a lot of papers to figure out what works best.
Calculating learning gain from a pre-test and a post-test is not as straightforward as one might think. Assuming that the tests exist (whether the same test or a similar one) and that each participant has a score out of 100 for both the pre-test and the post-test, we still need to decide how best to calculate the learning gain.
The simplest way is just to calculate the difference between the two, which is called ‘raw gain’ or absolute gain.
That is raw (or absolute) gain = postTest - preTest
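As a quick sketch (the function name is ours, purely for illustration), raw gain is just a subtraction:

```python
def raw_gain(pre: float, post: float) -> float:
    """Raw (absolute) gain: difference between post-test and pre-test scores."""
    return post - pre

print(raw_gain(10, 20))  # 10
print(raw_gain(85, 95))  # 10 -- the same raw gain, despite being much harder to achieve
```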
While this method is simple and easy to interpret, one of its main shortcomings is that it fails to account for the fact that those with higher pre-test scores have a smaller range in which to show progress. In other words, increasing one's score from 10 to 20 (which is very easy) is treated the same as increasing it from 85 to 95 (which is very hard).
Using the equation (postTest - preTest) / preTest, which some call 'normalised change', is easy to understand but also problematic: first, it fails if the pre-test score is zero, and second, it vastly favours lower pre-test scores over those who start higher, so it does not solve the problem raw gain has.
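Both problems are easy to see in a short sketch (again, the function name is ours):

```python
def normalised_change(pre: float, post: float) -> float:
    """(post - pre) / pre. Undefined when the pre-test score is zero."""
    if pre == 0:
        raise ValueError("normalised change is undefined for a zero pre-test score")
    return (post - pre) / pre

print(normalised_change(10, 20))  # 1.0 -- the low starter gets a huge 'gain'
print(normalised_change(85, 95))  # ~0.12 -- the high starter is heavily penalised
```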
Accordingly, one of the most widely used approaches to calculating learning gain is 'normalised gain' (Hake, 1998), which takes into account the range of improvement possible for each student and is very popular in the undergraduate science, technology, engineering and mathematics (STEM) education literature and in medical education.
Normalised gain = (postTest - preTest) / (100% - preTest)
If the scores are not percentages, just replace the 100% with the maximum possible score. While this measure is more sensitive to those starting with a higher score, that is seen as acceptable: increasing a score that is already high is harder than increasing a low one, so one can justify that moving from 85% to 95% counts as a much higher learning gain than moving from 10% to 20%. It is also easy to interpret: if the pre-test is 0%, the normalised gain equals the post-test score, and if the post-test is 100%, the normalised gain is 100%, which also makes a lot of sense. Another important feature of this approach (compared to normalised change, for example) is that the difference between calculating it from the averages of the pre- and post-tests and averaging the individual calculations is minimal, especially for samples above 50.
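A minimal sketch of Hake's formula (function name ours; note it is undefined when the pre-test is already at the maximum):

```python
def normalised_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Hake's normalised gain: gain achieved as a fraction of the gain possible."""
    if pre == max_score:
        raise ValueError("normalised gain is undefined when the pre-test is already the maximum")
    return (post - pre) / (max_score - pre)

print(normalised_gain(10, 20))  # ~0.11
print(normalised_gain(85, 95))  # ~0.67 -- the harder improvement scores higher
print(normalised_gain(0, 75))   # 0.75 -- equals the post-test fraction when pre is 0
```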
According to Hake (1998), whose seminal paper on normalised gain is widely cited, a normalised gain < 0.3 is considered low, a gain >= 0.3 and < 0.7 is considered medium, and a gain >= 0.7 is considered high.
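Those thresholds translate directly into a small classifier (the function name is ours):

```python
def hake_band(g: float) -> str:
    """Classify a normalised gain using Hake's (1998) thresholds."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"

print(hake_band(0.34))  # "medium" -- the average reported in Rogaten et al. (2019)
```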
Other methods discussed in the literature (e.g. Westphale et al., 2022), such as dividing the difference by the sum, or multiplying the difference by a weighted pre-test, never became popular: they are not easy to interpret and, as they are not widely used, they are difficult to use when comparing between studies.
On the other hand, as normalised gain is widely used, it can serve as a basis for comparison between studies, as in the review carried out by Rogaten et al. (2019). Interestingly, the average normalised gain across the 32 studies (71 student samples) analysed in that review was 0.34, which counts as medium according to Hake.
So let’s review the pros and cons of the four predominant methods to measure learning gain.
Raw gain | the difference between the post- and pre-test scores
Equation: postTest - preTest
Pros:
Simplicity
Easy to interpret and understand
Cons:
Fails to account for the fact that those with higher pre-test scores have less room to improve, so their progress is undervalued
Normalised Change Score | the difference between the post- and pre-test scores divided by the pre-test score
Equation: (postTest - preTest) / preTest
Pros:
Easy to understand
Cons:
Vastly favours lower pre-test scores (almost the reverse of raw gain's problem)
Vastly penalises low scorers who do worse in the post-test, far more than it penalises high scorers
Fails if pre-test is 0
Not normalised. Results vary greatly in range with no upper bound
Normalised Gain | the ratio between the average gain from pre to post-test and the maximum possible gain
Equation: (postTest - preTest) / (100% - preTest)
Pros:
Accounts for the fact that some student cohorts have a wider margin for improvement than others depending on how much they already know/don’t know.
Popular in undergraduate science, technology, engineering and mathematics (STEM) education literature and in medical education.
The difference between calculating it from the averages and averaging the individual calculations is minimal
For our scores, it showed the least correlation with the pre-test results
Cons:
More sensitive to higher pre-test scores
Consequently, it severely penalises high pre-test scorers who end up with a lower post-test score
Was primarily designed to compute gains at the group level (classroom/student cohort) with an overall improvement in knowledge
Symmetric Gain | the difference between the post- and pre-test scores divided by their sum (using the mean scores at cohort level)
Equation: (postTest - preTest) / (postTest + preTest)
Pros:
Symmetric about the mean
Cons:
Difficult to interpret
Slightly favours lower pre-test scores
Slightly penalises low scorers who do worse in the post-test more than it penalises high scorers
A big difference between calculating it from the averages and averaging the individual calculations
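To see the four methods side by side, here is a sketch of symmetric gain together with a quick comparison on the two earlier examples (function and variable names are ours, for illustration only):

```python
def symmetric_gain(pre: float, post: float) -> float:
    """(post - pre) / (post + pre); symmetric about the mean but hard to interpret."""
    if pre + post == 0:
        raise ValueError("symmetric gain is undefined when both scores are zero")
    return (post - pre) / (post + pre)

# Compare all four methods for the two earlier examples (scores out of 100).
for pre, post in [(10, 20), (85, 95)]:
    raw = post - pre
    change = (post - pre) / pre
    hake = (post - pre) / (100 - pre)
    sym = symmetric_gain(pre, post)
    print(f"{pre}->{post}: raw={raw}, change={change:.2f}, "
          f"normalised={hake:.2f}, symmetric={sym:.2f}")
```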
For the Kinnu learning gain calculation we went with 'normalised gain' for two reasons: (1) its wide use, meaning that we have a result that can be compared against other studies, and (2) because it does not disadvantage those who already score well in the pre-test (all learning, regardless of the starting point, is recognised and rewarded). When I have said 'we' throughout, I specifically meant Ahmed Kharrufa (our learning and AI expert) and me.
As you have probably gathered from reading the above, we are not faffing around - we are determined (obsessed, dare I say) to design and faithfully measure the best learning method on the planet.
Bibliography:
From the heaps of papers we reviewed, these three really stood out.
Hake, R. R. (1998). Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses. American journal of Physics, 66(1), 64-74. [link]
Rogaten, J., Rienties, B., Sharpe, R., Cross, S., Whitelock, D., Lygo-Baker, S., & Littlejohn, A. (2019). Reviewing affective, behavioural and cognitive learning gains in higher education. Assessment & Evaluation in Higher Education, 44(3), 321-337. [link]
Westphale, S., Backhaus, J., & Koenig, S. (2022). Quantifying teaching quality in medical education: The impact of learning gain calculation. Medical Education, 56(3), 312-320. [link]