On a recent CBC Early Edition podcast, the question of what standardized testing actually assesses was raised. A similar concern arises with student evaluations of teaching. The debate over their validity and meaning is not new, but recent findings further suggest that when we ask students about their instructors, what we are actually measuring may not be what we expect. We may be looking at the gas gauge to measure speed.
We do not appear to be measuring learning, or at least not the actively engaged involvement with material that produces increased confidence, higher attendance, more useful textbook reading, and better grades, as found in Walker and colleagues’ comparison of a traditional and an active learning section of introductory biology.
For unlike a car, we are not measuring a physical property such as speed or fuel volume in a single vehicle. Instead, we are measuring the understood experiences of many individuals as they relate them to a particular question. We may be measuring their satisfaction with their grades, as demonstrated in Tracy Vaillancourt’s recent set of three experimental studies, where “students did seem to reward professors for good grades with high SETs and did seem to punish them for low grades” (p. 10).
We are also capturing students’ tendencies to remember negative (or positive) moments, as well as their awareness and definitions of what we are asking about. In order to report objectively that they found the course interesting, students need to define “interesting,” connect that definition to their experiences in the course, recall general patterns across the course, make a consistent assessment, and record that assessment carefully. What if “interesting” is a word they associate with movies or favorite activities? And so on.
This is a very messy leap of faith indeed, yet one that can be complemented and framed to improve how accurately we measure what we hope to measure. Useful strategies include:
- Peer evaluations
- Focus groups with students that explore instances, definitions, and experiences
- Surveys immediately after an activity rather than the end of term
- Framing activities to define their rationale or purpose, then assessing specifically against those goals rather than broad constructs
- Asking specific questions that provide or invite definitions and examples from across the term, and that focus reflection on specific details, or at the very least on the breadth of things to start, stop, or continue, and of positives and negatives
The ongoing debate suggests that the usefulness of student evaluations of teaching is highly variable; there are many dials in the car, and sometimes we pick the right one. Perhaps we should also look under the hood, drive a test distance, and hold up a radar speed tracker.