Angela Verschoor is presenting at the e-Assessment Question Conference
About the presentation: Machine scoring of open-ended items: The holy grail of e-assessment
Machine scoring is one of the key advantages of e-assessment. Yet it is largely limited to closed-response items: hotspot items, drag-and-drop items, and many other beautiful but laborious-to-develop item types. The easiest items to develop, such as those whose answer is a single word or a single sentence and for which mistakes in spelling and grammar are to be disregarded, are usually left to paper. As long as 30 years ago, optimistic research papers announced that machine scoring would be in widespread use within a few years. Since then it has been rather quiet, despite new techniques such as Machine Learning.
In this presentation we take a closer look at the complexity of the problem and propose two methods for machine scoring. The first method, suitable for detecting a single keyword in a sentence despite typos and poor spelling, can also be used as part of the second method, which checks the correctness of the entire sentence. Single-keyword detection works without item-specific training and gives near-perfect results. Entire-sentence marking uses Machine Learning on a training set of manually scored answers to predict the correctness of an entirely new response.
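The abstract does not say which algorithm underpins the typo-tolerant keyword detection; a minimal sketch of one plausible approach, matching each word of a response against the keyword by Levenshtein (edit) distance, might look like this (the function names and the distance threshold are illustrative assumptions, not the presenter's method):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def contains_keyword(response: str, keyword: str, max_dist: int = 2) -> bool:
    """True if any word of the response is within max_dist edits of the keyword."""
    return any(levenshtein(w.lower(), keyword.lower()) <= max_dist
               for w in response.split())
```

With a threshold of two edits, `contains_keyword("The capitol of France is Pariss", "Paris")` accepts the misspelt answer, while unrelated words stay well above the threshold.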
Unfortunately, the second method has its limitations, chief among them the need for a relatively large sample of manually scored responses. On new responses it typically correlates 0.70 to 0.75 with manual scoring, which is of the same order of magnitude as the correlation between two human markers. The method is deemed not good enough for high-stakes testing environments, but experiments are currently being carried out in which it aids the human marker. The holy grail is still out there…
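The agreement figures quoted above are correlations between machine and human marks. As a minimal sketch of how such a statistic is computed (the 0/1 score lists below are hypothetical toy data, not from the study), the Pearson correlation can be written in a few lines:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical correct/incorrect marks for ten responses (illustrative only)
human_marks   = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
machine_marks = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]
r = pearson(human_marks, machine_marks)  # ≈ 0.58 on this toy data
```

In practice one would also look at agreement rates and at the cases where machine and human disagree, since a single correlation can hide systematic scoring errors.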
About Angela Verschoor
Angela Verschoor is Senior Researcher at CITO, the Netherlands. With a background in discrete optimisation, her interests lie in the development and application of automated test assembly (ATA), optimal design and computerised adaptive testing (CAT). She has been the driving force behind well over 30 operational CAT projects in Europe, as well as major improvements in large-scale projects such as the Dutch Final Primary Education Test, the Central Examinations, the examinations for Dutch as a Second Language, and the theory part of the Driving License Examinations.
Other recent projects included the introduction of ATA in, amongst others, Russia, Kazakhstan, Singapore, the Philippines and Italy.
In 2010 Angela was accredited as a Fellow of the AEA-Europe and in 2018 she won the Lifetime Contribution Award of the e-Assessment Association.