The following is the abstract of the paper presented at the
Loughborough CAA conference on 9 July 2002. You can also request
the full paper.
Towards Robust Computerised Marking of Free-Text Responses
Tom Mitchell (1), Terry Russell (2),
Peter Broomhead (3), Nicola Aldridge (1)
(1) Intelligent Assessment Technologies
(2) Centre for Research in Primary Science and Technology,
University of Liverpool
(3) Dept of Systems Engineering, Brunel University
Abstract
This paper describes and exemplifies an application of AutoMark, a software
system developed in pursuit of robust computerised marking
of free-text answers to open-ended questions. AutoMark employs
the techniques of Information Extraction to provide computerised
marking of short free-text responses. The system incorporates
a number of processing modules specifically aimed at providing
robust marking in the face of errors in spelling, typing,
syntax, and semantics. AutoMark looks for specific content
within free-text answers, the content being specified in the
form of a number of mark scheme templates. Each template represents
one form of a valid (or a specifically invalid) answer. Student
answers are first parsed, and then intelligently matched against
each mark scheme template, and a mark for each answer is computed.
The representation of the templates is such that they can
be robustly mapped to multiple variations in the input text.
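The parse, match, and mark sequence described above can be illustrated
with a minimal sketch. This is not AutoMark's implementation, which
the abstract does not detail: the Template class, the slot-based
representation, the 0.8 similarity threshold, and the example mark
scheme are all assumptions made for illustration, and simple fuzzy
word matching stands in for the system's Information Extraction
techniques.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
import re


@dataclass(frozen=True)
class Template:
    """Hypothetical mark scheme template: every slot must be matched."""
    slots: tuple          # each slot is a tuple of acceptable words
    valid: bool = True    # False represents a specifically invalid answer
    marks: int = 1


def tokenize(answer):
    """Lower-case the answer and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", answer.lower())


def similar(a, b, threshold=0.8):
    """Tolerate small spelling or typing errors (assumed threshold)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def matches(template, tokens):
    """A template matches when some token fills each of its slots."""
    return all(
        any(similar(tok, word) for tok in tokens for word in slot)
        for slot in template.slots
    )


def mark(answer, templates):
    """Return the marks awarded; a matched invalid template gives 0."""
    tokens = tokenize(answer)
    matched = [t for t in templates if matches(t, tokens)]
    if any(not t.valid for t in matched):
        return 0
    return max((t.marks for t in matched), default=0)


# Hypothetical mark scheme for "Why does the puddle disappear?"
SCHEME = [
    Template(slots=(("water", "puddle"),
                    ("evaporates", "evaporated", "evaporation"))),
    Template(slots=(("soaks", "soaked", "absorbed"),), valid=False),
]

print(mark("The watter evaporrates in the sun", SCHEME))  # misspelt but creditworthy
print(mark("It soaked into the ground", SCHEME))          # specifically invalid
```

In this sketch a misspelt answer can still fill a template's slots,
while an answer matching a specifically invalid template receives no
marks regardless of any other match.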
The current paper describes AutoMark for the first time, and presents
the results of a brief quantitative and qualitative study
of the performance of the system in marking a range of free-text
responses in one of the most demanding domains: statutory
national curriculum assessment of science for pupils at age
11. This particular domain has been chosen to help identify
the strengths and weaknesses of the current system in marking
responses where errors in spelling, syntax, and semantics
are at their most frequent. Four items of varying degrees
of open-endedness were selected from the 1999 tests.
These items are drawn from the real world of so-called 'high stakes'
testing experienced by cohorts of over half a million pupils
in England each year since 1995 at ages 11 and 14. A quantitative
and qualitative study of the performance of the system is
provided, together with a discussion of the potential for
further development in reducing marking errors. The aim of this
exploration is to reveal some of the issues which need to
be addressed if computerised marking is to play any kind of
reliable role in the future development of such test regimes.
Keywords: Computer Assisted Assessment, Free-Text Marking