Students Evaluating Teaching – The Unending Conversation

The New York Times Magazine for September 21, 2008 is the “College Issue,” with a cover title, “It’s All About Teaching.” Among the articles is “Judgment Day” by Mark Oppenheimer. He notes that although there have been over 2,000 studies of student teaching evaluations (i.e., those evaluations which all students are required to fill out at the end of our classes), the research on their utility is still mixed. The article calls attention to the way that the evaluations are subject to particular gender/race biases, how they can reward “entertainment value” over good teaching, how it is difficult to rate the sciences/math (i.e., very vertical curricula) against the humanities and social sciences, etc. Oppenheimer closes his article (spoiler alert!) with the following: “When students in the 1960s demanded more say in academic governance, they could not have predicted that their children would play so outsize a role in deciding which professors were fit to teach them. Once there was a student revolution, which then begat a consumer revolution, and along with more variety in the food court and dorm rooms wired for cable, it brought the curious phenomenon of students grading their graders. Whether students are learning more, it’s hard to say. But whatever they believe, they’re asked to say it.”

What do you think? SET’s are required at Oberlin, and we have put a fair amount of time into trying to make them more reliable and uniform. Are there better ways to evaluate teaching? What would you like to see (other than superlative comments from students on ALL your classes)?

3 thoughts on “Students Evaluating Teaching – The Unending Conversation”

  1. Gary Kornblith

    On the assumption that faculty are at least as skilled at recognizing quality teaching as students are, I recommend that we pay more attention to peer review of classroom performance. We should try to develop campus-wide standards and procedures for observing one another’s classes — perhaps even across departmental lines — and incorporate those observations more consistently in evaluating our colleagues’ effectiveness as teachers. This process of peer review of teaching should not end with the award of tenure but continue throughout our careers. –Gary

  2. Nancy Darling

    After 20 years of looking at my own evaluations, I think evaluations reliably differentiate between classes where the students were angry at me and classes where I had put together a solid class based on good pedagogy. The former is not a good learning environment, no matter how much work I’d put into designing the class. It meant I had made a mistake.

    However, I don’t think there was any reliable difference between classes that were in the 3+ and 4+ range. Differences depended on how hard the class was, whether it was required and unpopular, whether I challenged the students and pushed them out of their comfort zone, whether students who liked me were in the class, or how close the evaluation was to the last test.

    In other words, I don’t see a problem with using student evaluations to judge whether or not the professor has designed a solid course and created a good learning environment. I do think there is a problem with setting a minimum number you have to reach, or with making fine distinctions in either the upper OR the lower range.

    Having read the full NYT article, I would also like to take umbrage at the author’s assumption that ‘liberal’ students in a ‘liberal arts’ college would inherently ‘like’ classes on ‘whiteness’ or a historical treatment of race. IMHO, students who go to good schools – like Williams and Wesleyan, in the article, or like Oberlin – are fundamentally conservative. Not politically conservative, but truly conservative. They work hard, dot their i’s and cross their t’s, study, and work strategically to get good grades. That’s why they got into the schools they did and that’s why their parents were willing to pay for them to go. I don’t think this inherently selects for students who are comfortable being challenged about race, who want to take risks in their learning or their grades. It will tend to select for students who are conservative and protective about their grades.

    The danger of relying on course evaluations as the main means of judging pedagogical success is that faculty who do push students may have some great successes, but also be soundly rejected by others. A pattern of lots of high marks and a block of lows will result in an overall ‘average’ rating. But that’s really different from having a professor who gets uniformly average marks.

  3. Kirk Ormand

    Given the dispute over the effectiveness of SET’s and what, exactly, they measure, I think the amount of emphasis that they are given in evaluating professors at Oberlin for tenure, promotion, and (of course) annual salary review is fairly problematic. The oft-cited anecdotes about a correlation between high grades and good course evals are particularly pernicious; if true, it means that the way for me to earn more money at Oberlin is to become an easier grader. The dean’s office doesn’t want this — but is the Dean willing to pay me more money to hold the line? Because if he isn’t, it’s pretty clear where my incentives are.

    The problem, of course, is that if we drop the SET’s, then what does Council have to use for evidence when they evaluate us for teaching? Gary is right – peer evaluation, if regularized and done throughout one’s career, would be a much more reliable and – dare I say – fair way to evaluate us as teachers. But any kind of peer review requires more work from our peers. Council now specifies that department evaluations of teachers (for salary) should include such things as review of syllabi. How many of us do this, in a systematic way? How many of us have time to?

    At the very least, it would be nice to feel that when we say “teaching excellence” we have some agreement on what that means. Does that mean that the students learn a lot? Does it mean that they enjoy their classes? Does it mean that the material is challenging and presented well (independent of measures of student learning and/or enjoyment)? Should we give more credit for classes in which the students are made _uncomfortable_ by the presentation of material, because it means that we are challenging them to expand their thinking? (In my experience, students who are uncomfortable rarely write positive evaluations.) How do we distinguish “uncomfortable because legitimately challenged” from “just plain uncomfortable” on the basis of student evaluations? I don’t know the answers to these questions. But I do know that my salary is dependent on receiving good marks from my students. This strikes me as a less than healthy situation.

    Don’t get me wrong – I’m not opposed to student evaluations. I read them, and sometimes I find that the comments in them result in changes in the way I teach. That, in my view, is the principal use to which they should be put. My concern is that they are significantly used by the college to evaluate my “effectiveness” (whatever that means), and they are not reliable enough to be used for that, except perhaps in a subsidiary role to other, better defined measures.
