Wednesday, September 21, 2005
Did He Make It? Bring In the Chains!
...the research expectations at this place are on the rise. A large percentage of my department members, especially the younger faculty, have at least one book, for example. At the same time, we are expected to teach well. What’s been getting me is the steps being taken to determine if we are teaching up to par. First, there are the repeated ‘peer evaluations’ of your teaching, when a colleague sits through your course. I’m in the middle of that, and besides being annoying, it very much feels like a bureaucratic hoop. My colleague was required to show up to every one of my classes (all three of them), and the requirements say that she has to make ‘repeated’ visits. And I need another person to do the same. Second, I’m supposed to put together a teaching portfolio...[that] will be hundreds of pages. Third, for every single course, I’m supposed to read my course evaluations and use them to write a self-analysis that describes strengths and weaknesses as well as discusses how I would improve on mistakes. Of course, I always find out about these rules secondhand, because the people who come up with them consider them self-evident.
The point being: The problem is only partially that we are being pulled in two directions. It’s also the bureaucratic enforcement. If this were still primarily a teaching institution, all this crap would be more tolerable. But, when you lose valuable research time and energy creating meaningless documents and sitting through other people’s classes, it seriously sucks.
Yes, it does. It sucks from the management side, too. Let’s say your institution values teaching, and someone you know to be a far below-average teacher is coming up for tenure. You want to deny tenure, since you’re reasonably sure you could hire someone much stronger. (Assume, for the sake of argument, that you’re reasonably sure you won’t lose the position altogether.)
How do you prove that the candidate isn’t up to snuff as a teacher? What can you use? (And yes, in this litigious climate you have to assume that any negative decision will be challenged.)
Peer evaluations don’t work, since they’re subject to all manner of bias. In colleges with ‘consensus’ cultures, the unwritten rule is that no professor ever speaks ill of another to a dean. So the Lake Wobegon effect kicks in, and everybody is far above average, rendering the evaluations worthless. In a conflictual culture, the evaluation will reflect the political fault lines of the department – more interesting to read, but still useless as signs of actual teaching performance.
Evaluations by chairs are subject to both personal whims and political calculations, so their worth is frequently suspect, as well.
Student evaluations are less likely to reflect internal departmental politics, but they have a somewhat justified reputation for being manipulable. More disturbingly, I’ve read that student evaluations correlate with both the physical attractiveness of the teacher (particularly for male teachers, oddly enough), and the extent to which the teacher plays out the assigned gender role (men are supposed to be authoritative, women are supposed to be nurturing – when they switch, students punish them in the evaluations).
Other than evaluations, what should count? Outcomes are tricky, since they’re usually graded by the professor being evaluated. (My previous school used to fire faculty who failed ‘too many’ students. You can imagine what happened to the grading standards.) Outcomes also commonly tell you more about student ability and interest going into the course than about what went on during it.
Attendance isn’t a bad indicator, but it’s hard to get right. If the students hate the class so much that they simply stop showing up, something is probably wrong. But good luck getting that information in a regular, reliable way. And it, too, often reflects time slot and local culture.
Bad measures aren’t unique to academia. On those rare occasions when I actually get to watch football, I always get a kick out of the moments when they aren’t sure if the runner made quite enough yards (meters) for a first down. The referees put the ball on the ground where they think it should be, then march two poles onto the field, each supporting one end of a ten-yard chain. They plunk the first pole down by sort of eyeballing it, then pull the chain taut and use the location of the ball relative to the second pole to see if the runner made it. I’ve seen decisions hinge on inches (centimeters).
The false precision always makes me chuckle. If they can just eyeball the location of the first pole, then exactly how precise can the second pole really be? It’s just the initial eyeballed spot, plus ten yards.
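To put a rough number on the false precision: here's a quick, purely illustrative simulation. The half-yard eyeballing error and the three-inch margin are invented numbers, not anything the NFL publishes; the point is only that when both the spot that set the chains and the current spot of the ball are eyeballed, the chain's exact ten yards can't rescue the call.

```python
import random

SPOT_ERROR_SD = 0.5   # assumed eyeballing error for each spot, in yards (a guess)
TRIALS = 100_000

def wrong_call_rate(true_gain_past_marker, sd=SPOT_ERROR_SD, trials=TRIALS):
    """Estimate how often the chain-measurement call comes out wrong.

    true_gain_past_marker: how far the runner truly got past the line
    to gain, in yards (negative means he was truly short).
    """
    random.seed(42)  # fixed seed so the estimate is repeatable
    wrong = 0
    for _ in range(trials):
        chain_set_error = random.gauss(0, sd)  # error when the chains were set
        ball_spot_error = random.gauss(0, sd)  # error on the current spot
        # The chain itself is exact; only the two eyeballed spots wander.
        measured_gain = true_gain_past_marker + ball_spot_error - chain_set_error
        called_first_down = measured_gain >= 0
        actually_first_down = true_gain_past_marker >= 0
        if called_first_down != actually_first_down:
            wrong += 1
    return wrong / trials

# A runner who is truly three inches past the marker:
print(f"wrong-call rate at 3 inches: {wrong_call_rate(3 / 36):.0%}")
# A runner who gained a full two extra yards:
print(f"wrong-call rate at 2 yards:  {wrong_call_rate(2.0):.0%}")
```

Under these made-up assumptions, a call that "hinges on inches" is close to a coin flip, while a gain of a couple of yards is called correctly almost every time. The chain adds precision only relative to a spot that was never precise to begin with.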
Measuring the quality of teaching, sadly enough, is sort of like that. We use ungainly and even weird measures because we need to use SOMETHING, and nobody has yet come up with a better, practicable idea. Bad teachers rarely confess, so we need evidence. It’s fine to condemn the evidence we use – I certainly think my friend’s school is going way, way overboard – but I don’t foresee any change until we have an alternative.
Question for the blogosphere: how SHOULD we measure teaching quality? Put differently, what evidence would be fair game to use to get rid of a weak, but not criminal, teacher? If there’s a better, less intrusive way to do this that still gets rid of the weak performers, I’m all for it. Any ideas?
Given the different populations that different schools (and different disciplines within schools) attract, I think this approach would quickly penalize anybody who works with high-risk students, and would falsely reward folks at schools with selective admissions.
I can certainly see the difficulties inherent in teaching evals, but still, I find it kind of ridiculous that I don't get evaluated at all as part of my 3rd year review. They just go on the basis of student evals. Shows how little teaching counts at XU...
a) What methods does the teacher use to make materials available to students, and how timely and relevant are they?
b) Conduct a focus group with some of the students who attended at least 70% of the lectures in a course, asking specific, directed questions about the lecturer's performance: not letting vague generalities like "not bad" stand, but pressing for specifics about the lecturing style, how well the information was conveyed, and how the lecturer compares with others.
c) Show their lecture materials to two others elsewhere in the same field to see whether the explanations make sense and how suitable the coverage is relative to the syllabus.
d) The evaluator could attend a lecture and focus on how attentive the students are (or aren't).
I'm sure there are lots more creative evaluation methods that could be used. But I agree it's problematic. The only thing I am pretty convinced of is that you need a variety of sources of information to do a proper measure, because teaching has so many different aspects to it.
One point I would make is that we might, in some ways, try to make the evaluation of teaching more like our evaluation of research. Now, don't go ballistic on me just yet.
This involves outside, anonymous peer review, not at the point of decision-making (tenure, promotion) but on an ongoing basis. In research, we peer review articles, for example. In teaching, we could, if we chose to commit the resources to it, peer review the course or something like it. The structure for doing this might be difficult to establish, and it would be expensive. Peer reviewers would have to be trained.
Actually, my real feeling is that universities are generally not really serious about evaluating teaching/learning. They'd like to be seen as being serious, but they're not. The consequence is what Dean Dad refers to as mission creep--in this context the elevation of research, because it's easier to measure, other people do it for you, and it's free.
Then again, if anyone would like to see commentary on performance evaluation in other contexts, take a look at some of the recent posts in Mark A. R. Kleiman's blog (http://www.markarkleiman.com/archives/cat_military_meritocracy.html).
How about if we convince all schools with master's or doctoral programs in education to require that their students go to one local school (outside of their own) and review a class/instructor/professor each week for some sort of semester/term credit? Or, sister schools, where evaluations would be done by someone who may have less bias but would come from a similar teaching background with regard to the type of student body. For example, a CC would only be a sister school for another CC, not an R1 school. Again, time- and cost-prohibitive, but fun to think about.
i don't think the external approach penalizes anyone if it is done with finesse. seeing as significant parts of the world use it with some degree of success, isn't it more likely that the resistance in the u.s. is ideological? it isn't that you send the material to the external reviewers blindly: you prepare a report on the class, what was covered, what the demographics are, etc. then the assignments are graded, sometimes by two external reviewers. this process removes instructor bias, and the odd habit of teachers of giving out grades for things other than merit.
Measuring strictly by grading outcome will encourage teachers and institutions to shy away from students who aren’t already prepared or are difficult to teach.
The problem is community colleges can’t do that. Defective widgets can be destroyed. Dumb or ill-educated kids at a community college can’t be. Somebody’s got to take on the job. Why rig the system at a CC to cripple the career of that “somebody?”
Hiring a designated “teacher assessor” who travelled from class to class sounds like a catastrophically bad idea. Said person would instantly become the Secret Police of the school. Feared, loathed, and the target of more flattery and bribes than anyone you’ve ever met. Also, teachers would gear their classes to please that single assessor. Not a good idea.
What if the assessor is a jerkass? Or plays favorites? Or doesn’t understand a particular teacher’s approach? Those are risks inherent to any system, but concentrating assessment power in a single person’s hands magnifies those risks beyond any level I’d deem acceptable.
I once argued that he should go himself to observe classes, but to no avail.
Somebody observing is necessary; we can't just count on students, though their input is also important. I would argue for some kind of rotating peer evaluation (one that is not too time-consuming), with the Dean/HoD also actually seeing what is happening, and a meeting together to discuss the feedback.
So overall you have three different sources of information, and the opportunity to put it in context.
Exams etc. should be monitored by someone else too.
Can you tell me where you read this please?
Here's an annotated bibliography of research on how gender affects evaluations.
Hope that helps.