Mitigating gender bias in student evaluations of teaching

Friday, September 13, 2019
by Kathy Becker

In a recently published 2018 study, researchers undertook an experiment to examine the impact of an introductory statement drawing students' attention to the existence and potential impact of their own biases as they pertain to student course evaluations. 

The study included two large introductory courses, each with two sections: an intro biology course and an intro political science course. Each course had one section taught by a female instructor and one taught by a male instructor (four different instructors, all white). The students enrolled in these four sections were randomly assigned to receive either the standard course evaluation questionnaire or a version that included the preamble (the Bias Statement) shown below:

“Student evaluations of teaching play an important role in the review of faculty. Your opinions influence the review of instructors that takes place every year. Iowa State University recognizes that student evaluations of teaching are often influenced by students’ unconscious and unintentional biases about the race and gender of the instructor. Women and instructors of color are systematically rated lower in their teaching evaluations than white men, even when there are no actual differences in the instruction or in what students have learned.

As you fill out the course evaluation please keep this in mind and make an effort to resist stereotypes about professors. Focus on your opinions about the content of the course (the assignments, the textbook, the in-class material) and not unrelated matters (the instructor’s appearance).”

A total of 249 students completed a course evaluation: 128 female students, 118 male students, and 3 students of unreported gender.

In reading this study, I was interested to see that two of the questions in the course evaluation being studied were similar to two of our Course Critique questions; this led me to have a closer look at the dataset, which the researchers have made available at the link below.

| Institution | Teaching Rating - Questions and Anchors | Course Rating - Questions and Anchors |
| --- | --- | --- |
| Waterloo Engineering | Q10: What is your overall appraisal of the quality of teaching in this course? (Very high ... ... Very low) | Q17: What is your overall appraisal of this course? (Excellent ... ... Poor) |
| Iowa State University | What is your overall rating of the instructor's teaching effectiveness? (Almost Never Effective … … Almost Always Effective) | Your overall rating of this course is: (Very Poor … … Very Good) |

I took the dataset from this study and applied the Faculty of Engineering Course Critique weights and scoring (available at https://www.eng.uwaterloo.ca/critiques/ - login required). The table below shows how the effect of the Bias Statement on the sections evaluated in this study would be scored in our own context.

Iowa State University (scores calculated using Waterloo Engineering Course Critiques weights & scoring)

| | Female Instructors: Teaching Rating (score/100) | Female Instructors: Course Rating (score/100) | Male Instructors: Teaching Rating (score/100) | Male Instructors: Course Rating (score/100) |
| --- | --- | --- | --- | --- |
| score WITHOUT Bias Statement | 76 | 73 | 77 | 79 |
| score WITH Bias Statement | 84 | 86 | 76 | 76 |
| impact of Bias Statement on score | +8 | +13 | -1 | -3 |
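
The impact row is simply the score with the Bias Statement minus the score without it. As a quick check, here is a minimal Python sketch that reproduces that row from the table values. The dictionary keys and variable names are my own illustrative labels, not fields from the study's dataset.

```python
# Scores (out of 100) copied from the Iowa State table above.
# Keys are (instructor gender, rating type); these labels are illustrative only.
without_statement = {
    ("female", "teaching"): 76,
    ("female", "course"): 73,
    ("male", "teaching"): 77,
    ("male", "course"): 79,
}
with_statement = {
    ("female", "teaching"): 84,
    ("female", "course"): 86,
    ("male", "teaching"): 76,
    ("male", "course"): 76,
}

# Impact = score WITH the Bias Statement minus score WITHOUT it.
impact = {
    key: with_statement[key] - without_statement[key]
    for key in without_statement
}

for (gender, rating), delta in impact.items():
    print(f"{gender:6} {rating:8} {delta:+d}")
```

This only restates the table's arithmetic; it does not reproduce the Course Critique weighting itself, which is not shown here.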

As a comparison, here are the Waterloo Engineering Teaching (Q10) and Course (Q17) scores by instructor gender for the past ten years (excluding any instructors for whom gender data isn't available).

Waterloo Engineering

| | Female Instructors: Teaching Rating (score/100) | Female Instructors: Course Rating (score/100) | Male Instructors: Teaching Rating (score/100) | Male Instructors: Course Rating (score/100) |
| --- | --- | --- | --- | --- |
| score WITHOUT Bias Statement | 76 | 72 | 78 | 73 |
| score WITH Bias Statement | unknown | unknown | unknown | unknown |
| impact of Bias Statement on score | unknown | unknown | unknown | unknown |

Potential Impact in Waterloo Engineering Context

Would a Bias Statement have a similar effect in our own context? It's impossible to know without running our own study. But if a similar effect were to be seen here, the potential impact is outlined below.

The average Teaching Rating (Q10) score for female Waterloo Engineering instructors over the past ten years is 76/100, which is below the overall Waterloo Engineering average for this same timespan (78/100). If a Bias Statement had a similar impact in the Waterloo Engineering context, the average Teaching Rating for female Waterloo Engineering instructors would increase from 76 to 84. The average Teaching Rating for male Waterloo Engineering instructors would decrease from 78 to 77.

The average Course Rating (Q17) score for female Waterloo Engineering instructors over the past ten years is 72/100, which is below the overall Waterloo Engineering average for this same timespan (73/100). If a Bias Statement had a similar impact in the Waterloo Engineering context, the average Course Rating for female Waterloo Engineering instructors would increase from 72 to 85. The average Course Rating for male Waterloo Engineering instructors would decrease from 73 to 70.
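
The projections in the two paragraphs above come from adding the change seen in the Iowa State sections to the ten-year Waterloo Engineering averages. A rough Python sketch of that arithmetic follows. It assumes the Iowa change would carry over unchanged as a simple additive shift, which is the "similar effect" scenario and not a finding; the variable names are illustrative.

```python
# Ten-year Waterloo Engineering averages (score/100), from the table above.
waterloo_baseline = {
    ("female", "teaching"): 76,
    ("female", "course"): 72,
    ("male", "teaching"): 78,
    ("male", "course"): 73,
}

# Change in score observed with the Bias Statement in the Iowa State sections.
iowa_impact = {
    ("female", "teaching"): 8,
    ("female", "course"): 13,
    ("male", "teaching"): -1,
    ("male", "course"): -3,
}

# ASSUMPTION: the Iowa change transfers to Waterloo as a plain additive shift.
projected = {
    key: waterloo_baseline[key] + iowa_impact[key]
    for key in waterloo_baseline
}

for (gender, rating), score in projected.items():
    baseline = waterloo_baseline[(gender, rating)]
    print(f"{gender:6} {rating:8} {baseline} -> {score}")
```

Under that assumption the female averages would move to 84 (teaching) and 85 (course), and the male averages to 77 and 70, matching the figures quoted above.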

Limitations

It should be noted that the Iowa study included a small number of instructors (4), classes (4) and student responses (249). In comparison, the 10-year Waterloo Engineering dataset used above included 1055 instructors, 7065 classes, and over 330,000 student responses. As such, differences between individual instructors may have had a disproportionate impact on the results observed in the Iowa study.

You can find the full paper and complete dataset at https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0216241