Cornell University Teaching Evaluation Handbook
Third Edition, 1997
Chapter IV - Criteria for Evaluating Data on Teaching
This discussion of evaluation criteria is meant to assist the department or college in establishing its own system for the evaluation of teaching. The dictionary uses the terms "standard, rule or test" in defining criterion. Tenure decisions require a general rule for defining excellence that can accommodate the variety of discipline-based instructional traditions, while honoring the individual's freedom to express and develop personal style in teaching. An example of such a general rule might be, "To what degree does the data supplied support the reasoned opinion of those most competent to judge that the candidate has demonstrated, and will continue to demonstrate, the capacity to improve instructional practice?"
A distinction has been made between promotion and tenure criteria: promotion criteria focus on merit of the candidate's professional and scholastic contributions and promise, whereas tenure criteria focus on the long-time worth of the candidate's professional and scholastic contributions and promise. "Worth requires merit, but merit is not a sufficient condition for worth."1
"Merit is free of the specifics of the pool; it is criterion referenced and deals with the candidate's ranking on those criteria. Worth is utility to the hiring party."2 Based on these definitions, if tenure candidates sufficiently prove their capacity and commitment to continually improving research and teaching practice, their long-time worth to the unit and institution will be greater than if they have merely made a case that their performance has measured up to a universally established absolute standard. Normative standards are necessary for determining merit, but merit is a concept that may be more relevant at the point of hiring and when the candidate is being considered for promotion.
In general, criteria for evaluating teaching will be more useful in the tenure and promotion process if they
can discriminate between teachers in terms of specific competencies
can reliably and consistently measure a specific competency both for the same individual over time and between individuals
maintain a neutral orientation relative to individual style and viewpoint
yield information about instructional situations where the teacher functions best3
This chapter provides suggested criteria to use to evaluate teaching, based on the data categories described in Chapter 3. It begins with a general discussion of effective teaching and goes on to include criteria for use by students, criteria relevant to evaluation of teaching materials, and criteria for use by peers, including classroom observations, all of which have been developed through controlled inquiry in work carried out within the last 25 years.
Effective Teachers - A Description
During the past 50 years the debate over effective teaching has moved
from a discussion of technical, classroom skills, or process skills
as they have been called, to a focus on skills necessary to make the subject
matter understandable to the student, what one author calls "the
pedagogy of substance." Thinking dichotomously about teaching (as
either a technical process skill divorced from the subject matter or solely
a matter of translating abstract and technical information into understandable
terms) limits the conception of what teaching is. Looking at teaching as
a scholarly activity that is connected to research suggests a dialogue
between the tasks of understanding a body of knowledge and explaining
it. Effective teaching must be concerned with both
of these areas of expertise: I am no more effective if I have a body of
knowledge to profess but am unable to communicate it than I am if I can
hold students rapt in wonder who do not know what I am talking about.
The dichotomy can be avoided by a more integrative model of teaching: effective teachers are able to understand enough about their students' ways of thinking that they can translate their own understanding of the subject matter into a form that connects with their students.
...one of the things we see when we look at teaching analytically is this combination of an emphasis on understanding the subject matter, understanding how it is represented in the heads of students and then being able to generate representations of your own as a teacher that will be a bridge between the subject matter and the students.4
Recent work on teacher effectiveness has yielded the following observations which support an integrative model that is both process- and content-based (italics in original):
Teachers promote learning by communicating to their students what is expected and why
Effective teachers not only know the subject matter they intend their students to learn but also know the misconceptions their students bring to the classroom that will interfere with their learning of that subject matter
Effective teachers are clear about what they intend to accomplish through their instruction, and they keep these goals in mind both in designing the instruction and in communicating its purposes to the students. They make certain that their students understand and are satisfied by the reasons given for why they should learn what they are asked to learn.
Effective instruction provides students with metacognitive strategies to use in regulating and enhancing their learning. It also provides them with structured opportunities to exercise and practice independent learning strategies.
Effective teachers create learning situations in which their students are expected not just to learn facts and solve given problems but to organize information in new ways and formulate problems for themselves. Such learning situations include creative writing opportunities in language arts, problem-formulation activities in mathematics, and independent projects in science, social studies and literature.
Effective teachers continuously monitor their students' understanding of presentations and responses to assignments. They routinely provide timely and detailed feedback, but not necessarily in the same ways for all students.
Effective teachers realize that what is learned is more likely to be remembered and used in the future if it serves students' purposes beyond meeting school requirements.
. . . effective teachers . . . take time for reflection and self-evaluation, monitor their instruction to make sure that worthwhile content is being taught to all students, and accept responsibility for guiding student learning and behavior. . . . the same research . . . has made it clear that few teachers follow all of these practices all of the time.
. . . teachers must cope with a full agenda that typically precludes time for serious reflection . . .5
The last two points raised in the quotes above deserve to be emphasized. First, teachers are human and not machines. Strict adherence to a set of principles does not in itself establish effectiveness. I may, for any number of acceptable reasons, occasionally exhibit inconsistency in teaching practice. The more important issues are: to what degree is my practice governed by some explicit pedagogical framework, and how frequently am I unable to follow my own guiding principles of teaching, which my experience has shown to produce desirable results. Second, the extent to which I can be effective will be governed, to a certain degree, by the environment and conditions under which I must work. I only have so much time and energy, and I have a life beyond my work, which has its own demands. These are facts we take for granted, but because we take them for granted, we may be in danger of forgetting them during the rigor of a tenure decision. A case where a newly hired faculty member is assigned to teach five courses represents a much more stressful situation than a case with a lighter teaching load. Work load is an important factor to be considered when evaluating a candidate on the following departmentally based criteria:
Has the candidate assumed the responsibilities related to the department's or university's teaching mission?
Does the candidate recognize the problems that hinder good teaching in his or her institution and does he or she take a responsible part in trying to solve them?
If all members of the faculty were like this individual, what would the college be like?
To what extent is the candidate striving for excellence in teaching?6
If teaching is to be adequately rewarded as a valued activity and contribution to the department or unit, the degree to which a candidate has accomplished the following should be recognized:
whether there is sufficient data on teaching quality
whether alternative teaching methods have been explored
whether changes have been made in the candidate's courses over time
whether the candidate sought aid in trying new teaching ideas
whether the candidate developed special teaching materials
whether the candidate participated in teaching improvement opportunities
A study carried out at Berkeley (Hildebrand, Wilson & Dienst, 1971) was designed to discriminate between the best and worst teachers. One set of scales, relevant to the evaluation of teaching by students, was factor-analyzed out of student survey data. The scales are:
Scale 1: Analytic/Synthetic Approach, relates to scholarship, with emphasis on breadth, analytic ability, and conceptual understanding.
Scale 2: Organization/Clarity, relates to skill at presentation, but is subject-related, not student-related, and not concerned merely with rhetorical skill.
Scale 3: Instructor-Group Interaction, relates to rapport with the class as a whole, sensitivity to class response, and skill at securing active class participation.
Scale 4: Instructor-Individual Student Interaction, relates to mutual respect and rapport between the instructor and the individual student.
Scale 5: Dynamism/Enthusiasm, relates to the flair and infectious enthusiasm that come with confidence, excitement for the subject, and pleasure in teaching.7
A second set of scales was factor-analyzed out of survey data from faculty colleagues. These surveys were also designed to discriminate between the best and worst teachers. These scales are relevant for use by colleagues in evaluating a candidate's teaching. They are:
Scale 1: Research Activity and Recognition
Scale 2: Intellectual Breadth
Scale 3: Participation in the Academic Community
Scale 4: Relations with Students
Scale 5: Concern for Teaching8
The criteria associated with each scale that discriminated most strongly between the best and worst teachers, for use by students and by colleagues, are included in the tables below. (The factor analysis coefficients used to associate each item with its scale are included.)
| Scale 1. Analytic/Synthetic Approach | Factor coefficient |
| 1. Discuss points of view other than their own | .70 |
| 2. Contrast implications of various theories | .66 |
| 3. Discuss recent developments in the field | .64 |
| 4. Present origins of ideas and concepts | .60 |
| 5. Give references for more interesting and involved points | .53 |
| 6. Present facts and concepts from related fields | .53 |
| 7. Emphasize conceptual understanding | .46 |
Scale 2. Organization/Clarity
| 8. Explain clearly | .78 |
| 9. Are well prepared | .63 |
| 10. Give lectures that are easy to take notes in | .62 |
| 11. Are careful and precise in answering questions | .61 |
| 12. Summarize major points | .51 |
| 13. State objectives for each class session | .50 |
| 14. Identify what they consider important | .47 |
Scale 3. Instructor-Group Interaction
| 15. Encourage class discussion | .70 |
| 16. Invite students to share their knowledge and experiences | .65 |
| 17. Clarify thinking by identifying reasons for questions | .64 |
| 18. Invite criticism of their own ideas | .62 |
| 19. Know if the class is understanding them or not | .58 |
| 20. Know when students are bored or confused | .57 |
| 21. Have interest and concern in the quality of their teaching | .48 |
| 22. Have students apply concepts to demonstrate understanding | .43 |
Scale 4. Instructor-Individual Student Interaction
| 23. Have a genuine interest in students | .74 |
| 24. Are friendly toward students | .71 |
| 25. Relate to students as individuals | .69 |
| 26. Recognize and greet students out of class | .69 |
| 27. Are accessible to students out of class | .65 |
| 28. Are valued for advice not directly related to the course | .64 |
| 29. Respect students as persons | .60 |
Scale 5. Dynamism/Enthusiasm
| 30. Are dynamic and energetic persons | .80 |
| 31. Have an interesting style of presentation | .76 |
| 32. Seem to enjoy teaching | .74 |
| 33. Are enthusiastic about their subject | .65 |
| 34. Seem to have self-confidence | .64 |
| 35. Vary the speed and tone of their voice | .63 |
| 36. Have a sense of humor | .53 |
*Based on 1968 Survey, N = 1015
These items can be used either to develop end-of-semester summative evaluation questionnaires or to evaluate other student data on teaching, such as letters. If a numeric evaluation schema is adopted, caution should be exercised in establishing normative data. "The usual overall evaluation of teaching will provide for evaluation on a five-point scale and will permit a classification of teachers as poor, adequate, good, excellent, or outstanding. In practice, the bottom end of the scale is rarely used and the actual range varies between a little under 3.0 to a little over 4.5. That is, anything under 3.0 is poor, and anything over 4.5 is outstanding; the other classifications are arranged in between these two extremes. . . . The concept of improvement implies progressing up the scale."10
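The classification described in the quotation can be sketched in a few lines of Python. This is only an illustration: the passage fixes the two extremes (under 3.0 is poor, over 4.5 is outstanding) but does not say where the boundaries between adequate, good, and excellent fall, so the evenly spaced middle bands below are an assumption, as is the function name `classify_rating`.

```python
# Only the two extremes (3.0 and 4.5) come from the quoted passage.
# ASSUMPTION: the three middle classes split the range 3.0-4.5 evenly.
LOW, HIGH = 3.0, 4.5
LABELS = ["poor", "adequate", "good", "excellent", "outstanding"]


def classify_rating(mean_rating: float) -> str:
    """Map a mean rating on a five-point scale to one of five labels."""
    if not 1.0 <= mean_rating <= 5.0:
        raise ValueError("mean rating must lie between 1 and 5")
    if mean_rating < LOW:
        return LABELS[0]
    if mean_rating > HIGH:
        return LABELS[-1]
    # Hypothetical equal-width bands of 0.5 for the three middle classes.
    band = (HIGH - LOW) / 3
    index = min(int((mean_rating - LOW) / band), 2)
    return LABELS[1 + index]


if __name__ == "__main__":
    for rating in (2.9, 3.2, 3.7, 4.2, 4.7):
        print(rating, classify_rating(rating))
```

A department adopting such a scheme would replace the assumed middle boundaries with cutoffs derived from its own normative data, which is where the caution urged above applies.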
Evaluation Criteria for Use by Peers
Making global assessments of an instructor's overall teaching effectiveness
is a practice that is unsatisfactory to the candidate, to those who must
make the evaluation, and to the department. A more useful and practical
practice is for colleagues to focus on certain qualities associated with
good teaching that they are in a good position to judge. The items listed
below were those most discriminative between best and worst teachers as
perceived by their colleagues (Hildebrand, et al., 1971). They are included
here to provide a general profile of effective teaching from which a department
may develop its own profile. Because these items discriminated between
the best and worst teachers at the p < .001 level they have a high
level of validity. The authors of the study suggest they be used as a
supplement to (and not as a substitute for) student ratings.
| Scale 1. Research Activity and Recognition | Factor coefficient |
| 1. Do work that receives serious attention from others | .69 |
| 2. Correspond with others about their research | .69 |
| 3. Do original and creative work | .64 |
| 4. Express interest in the research of colleagues | .55 |
| 5. Give many papers at conferences | .55 |
| 6. Keep current with developments in their field | .49 |
| 7. Have done work to which I refer in teaching | .48 |
| 8. Have talked with me about their research | .38 |
Scale 2. Intellectual Breadth
| 9. Seem well read beyond the subject they teach | .66 |
| 10. Are sought by others for advice on research | .60 |
| 11. Can suggest reading in any area of their general field | .59 |
| 12. Know about developments in fields other than their own | .51 |
| 13. Are sought by colleagues for advice on academic matters | .43 |
Scale 3. Participation in the Academic Community
| 14. Encourage students to talk with them on matters of concern | .60 |
| 15. Are involved in campus activities that affect students | .58 |
| 16. Attend many lectures and other events on campus | .47 |
| 17. Have a congenial relationship with colleagues | .39 |
Scale 4. Relations with Students
| 18. Meet with students informally out of class | .58 |
| 19. Are conscientious about keeping appointments with students | .57 |
| 20. Meet with students out of regular office hours | .57 |
| 21. Encourage students to talk with them on matters of concern | .55 |
| 22. Recognize and greet students out of class | .37 |
Scale 5. Concern for Teaching
| 23. Seek advice from others about the courses they teach | .70 |
| 24. Discuss teaching in general with colleagues | .60 |
| 25. Do not seek close friendships with colleagues (Negative) | -.47 |
| 26. Are people with whom I have discussed my teaching | .45 |
| 27. Are interested in and informed about the work of colleagues | .44 |
| 28. Express interest and concern about quality of their teaching | .40 |
*Based on 1967 survey, N = 119
Once a candidate has been evaluated on these or other criteria, certain precautions are necessary to ensure fairness: "include a review of central tendencies and variations in the rating results; an analysis of the effects of ecological factors, including different types of courses, students, and time frames on ratings in the unit; and the establishment of agreed-upon standards and steps to be taken in the application of the standards."12
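The review of central tendencies and variations mentioned in the quotation could be carried out along the following lines. This is a minimal sketch using Python's standard `statistics` module; the function name `summarize_ratings` and the sample ratings are invented for illustration, and the item names are borrowed from the student scales above.

```python
from statistics import mean, median, stdev


def summarize_ratings(ratings_by_item):
    """Return count, mean, median, and sample standard deviation per item."""
    summary = {}
    for item, ratings in ratings_by_item.items():
        summary[item] = {
            "n": len(ratings),
            "mean": round(mean(ratings), 2),
            "median": median(ratings),
            # A spread cannot be estimated from a single rating.
            "stdev": round(stdev(ratings), 2) if len(ratings) > 1 else 0.0,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical five-point ratings from five respondents.
    sample = {
        "Explain clearly": [4, 5, 4, 3, 5],
        "Encourage class discussion": [2, 5, 1, 5, 3],
    }
    for item, stats in summarize_ratings(sample).items():
        print(item, stats)
```

Two items with similar means can differ sharply in spread, which is why the variation is reviewed alongside the central tendency. The same summary could be computed separately for each course type or term to examine the ecological factors the quotation mentions.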
Classroom Observation by Peers
What happens in the classroom can have a substantial impact on student
relationships with the course material. It is therefore important to add
to students' and the candidate's own perspective a third view of classroom
performance by peers through planned observations. Studies seeking to
determine whether peers can reliably and validly evaluate classroom performance
through observations have been discouraging, however. "It is not
clear . . . whether the validity and reliability of classroom observation
procedures warrant their being considered as a legitimate approach for
summative evaluation."13
Reliability and validity of classroom observations can be enhanced if guidelines are established that address the following issues: how many visits and when are they carried out; who does the observing; how are observers selected and how many people are involved in the observations; what is observed, and, consequently, what is the character of the observational report; and to whom do observers report?
The following guidelines can enhance the quality of classroom observation by peers:
1. use with caution; training of observers is suggested to minimize bias
2. use several observations by several people over time
3. select observers with no biases (use multiple observers)
4. notify the candidate before observations take place
5. orient observational criteria toward currency and accuracy of material and ethical conduct (content- and professionally oriented) rather than toward style or rapport
6. summarize records of colleague observational data with explicit descriptions of the context of the observation14
Staff in the Office of Instructional Support have developed a protocol for classroom observation and performance review, which they have taught successfully to many individuals. This protocol is based on a cognitive development paradigm that fosters the improvement of practice, rather than on a remedial approach that limits teaching to a set of technical skills. More will be said about this process in Chapter 5. The following questions are included to assist the department in developing a comprehensive and consistent classroom observation protocol.
Structure and Goals
The instructor is fully prepared for class.
The instructor provides an overview of what is planned for the class period.
The instructor emphasizes the conceptual basis of the material.
The instructor's lectures are well organized.
The instructor provides periodic summaries of what has been covered or discussed.
The instructor uses class time efficiently.
The instructor ties things together at the end of class.
The instructor chooses appropriate activities for learning the material.
Teaching Behaviors
The instructor asks questions that encourage students to think about the subject.
The instructor is animated.
The instructor clearly explains instructions for completing required tasks.
The instructor leaves enough wait time after asking questions for students to think of a response.
The instructor uses eye contact effectively.
The instructor provides clear and comprehensive explanations when required.
Instructor-Student Rapport
The instructor encourages students to ask questions and express their opinions.
The instructor gives clear and understandable responses to students' questions.
The instructor seems genuinely concerned about the students' learning.
The instructor is actively helpful when students need assistance.
The instructor is skillful at promoting interaction among students.
The instructor is able to involve everyone in the class.
The instructor listens carefully to student questions and comments.
The instructor knows when students seem confused.
The instructor provides clear, relevant and understandable responses to student questions.
The instructor periodically checks to make sure everyone understands what has been covered.
The instructor is able to involve everyone in the class, not just the most outspoken students.
The instructor is interested in students as individuals.
The instructor holds students' attention.
Subject Matter and Instruction
The instructor stimulates interest in the subject matter.
The instructor relates various topics of the course to each other.
The instructor uses real-life anecdotes and examples to illustrate abstract ideas.
The instructor creates a classroom atmosphere conducive to learning.
The instructor seems enthusiastic about teaching the material.
The instructor makes effective use of props, visual aids, illustrations and examples.
The instructor demonstrates command of the subject matter.
Establishing evaluation standards and norms requires additional precautions: "Teachers regarded as excellent by some observers and poor by others should be rated by as many observers as possible. . . . Norms should be calculated at the campus level for some elements of any evaluation form used in promotion procedures . . . Departments or subject areas might find it useful also to calculate their own norms, particularly if they have developed their own evaluation forms, but it is desirable that any norms used be recalculated at frequent intervals to assure that the system of evaluation is being responsive to change."15
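The first precaution in the quotation, that teachers rated excellent by some observers and poor by others should be seen by more observers, can be turned into a simple check. The sketch below is illustrative only: the function name `needs_more_observers` and the disagreement threshold `max_spread` are assumptions, not part of the quoted source.

```python
def needs_more_observers(observer_ratings, max_spread=2):
    """Return True when observers disagree widely about one teacher.

    ASSUMPTION: disagreement is measured as the gap between the highest
    and lowest rating on a five-point scale, and a gap of `max_spread`
    points or more counts as wide. Both choices are hypothetical.
    """
    if len(observer_ratings) < 2:
        # A single observation cannot show agreement or disagreement.
        return True
    return max(observer_ratings) - min(observer_ratings) >= max_spread


if __name__ == "__main__":
    print(needs_more_observers([5, 2, 4]))  # wide disagreement
    print(needs_more_observers([4, 4, 5]))  # close agreement
```

A department would set the threshold to match its own rating form and recalculate it whenever its norms are recalculated.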
Teaching and course materials are evidence that a department can use to evaluate a tenure candidate's course design skills as well as skills necessary to effectively evaluate student learning.
Course organization
The course objectives are congruent with the department curricula.
The course objectives are clearly stated.
The syllabus adequately outlines the sequence of topics to be covered.
Is the syllabus current and relevant to the course outline?
Are the outline and topic sequence logical?
The intellectual level of the course is appropriate for the enrolled students.
Time given to the various major course topics is appropriate.
The course is an adequate prerequisite for other courses.
Written course requirements, including attendance policies, are included in the course syllabus.
Course content
The required or recommended reading list is up to date and includes works of recognized authorities.
A variety of assignments is available to meet individual needs.
Laboratory work, if a part of the course, is integrated into the course.
The assignments are intellectually challenging to the students.
Is it up to date?
Is the instructor's treatment fair and lively?
Are conflicting views presented?
Are the breadth and depth of coverage appropriate for the course?
Has the instructor mastered the subject matter?
Evaluating student learning
The standards used for grading are communicated to the students in the course syllabus.
The written assignments and projects are chosen to reflect course goals.
The examination content is representative of the course content and objectives.
The tests used in the course have been well designed and selected.
The examination questions are clearly written.
The examinations and papers are graded fairly.
The grade distribution is appropriate to the level of the course and the type of student enrolled.
The examinations and papers are returned to the students in a timely fashion.
Students are given ample time to complete the assignments and take-home examinations.
The amount of homework and assignments is appropriate to the course level and to the number of credit hours for the course.
Is the examination suitable to content and course objectives?
Are tests graded and returned promptly?
Are the grading standards understood by students?
Is the grade distribution pattern appropriate for the course level?
How do students perform in more advanced courses?
Do students apply in their papers and projects the principles learned in the course?
What is the general quality of major homework assignments?
Course objectives
Have the objectives been clearly communicated to the students?
Are they consistent with the department's overall objectives?
If the course is a building block for a more advanced course, are the students being properly prepared?
Instructional methodology
Are the instructor's teaching approaches (lectures, discussion, films, fieldwork, outside speakers) suitable to the course objectives?
Is the pacing varied?
Do students use the library for the course?
Would audiovisual or television services strengthen the course?
Homework assignments
Do homework assignments supplement lectures and class discussions?
Do assignments reflect appropriate course goals?
Is the reading list relevant to course and department goals?
Is it appropriate to the course level?
Once a set of evaluation criteria has been established within a department, thought must be given to weighting the various evaluation sources. A departmental standing committee on teaching can be responsible for determining the relative weight attributable to each data source to ensure consistency between tenure cases and to explicitly communicate the department's expectations regarding teaching. An example of how this might be done is included in Table 3.
| Sources of information | Percentage of Total Evaluation |
| Student ratings of in-class activities | |
| Peer rating of course design features: | |
| organization | |
| goals | |
| instructional materials | |
| evaluation devices | |
| Peer rating of teaching qualities: | |
| intellectual breadth | |
| commitment to teaching | |
| improvement of teaching practice | |
| Peer rating of student achievement | |
| Self-rating of overall teaching effectiveness & improvement | |
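Once a committee has settled on percentages, combining the sources is simple arithmetic. The sketch below shows one way to do it; the weights and scores are hypothetical placeholders invented for illustration (the table above gives no percentages), and the function name `weighted_evaluation` is likewise an assumption.

```python
def weighted_evaluation(scores, weights):
    """Combine per-source scores using percentage weights that sum to 100.

    All scores are assumed to be on the same scale (for example, 1 to 5).
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same sources")
    total = sum(weights.values())
    if abs(total - 100) > 1e-9:
        raise ValueError(f"weights must sum to 100, got {total}")
    return sum(scores[source] * weights[source] for source in scores) / 100


if __name__ == "__main__":
    # HYPOTHETICAL weights (percent) and scores, for illustration only.
    weights = {
        "student_ratings": 30,
        "peer_course_design": 25,
        "peer_teaching_qualities": 25,
        "peer_student_achievement": 10,
        "self_rating": 10,
    }
    scores = {
        "student_ratings": 4.0,
        "peer_course_design": 3.5,
        "peer_teaching_qualities": 4.5,
        "peer_student_achievement": 3.0,
        "self_rating": 4.0,
    }
    print(weighted_evaluation(scores, weights))
```

Requiring the weights to sum to 100 makes the department's expectations explicit and keeps the composite comparable from one tenure case to the next.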
A major premise of this handbook has been that the demonstrated improvement of practice should be a major criterion by which a tenure candidate is evaluated. Just what that means and how it can be accomplished, documented and evaluated is the subject of the last chapter.
Footnotes
1. J. Aubrecht, Delaware State College: personal communication.
2. G. Leinhardt (1991). "Evaluating the New Handbook of Teacher
Evaluation," Educational Researcher, 20, 6, p. 24.
3. J. D. McNeil and W. J. Popham (1973). "The Assessment of Teacher
Competence." In R. M. W. Travers, ed., Second Handbook of Research
on Teaching (Skokie, Ill.: Rand McNally), 218-244.
4. L. Shulman (1989). "Toward a Pedagogy of Substance," AAHE
Bulletin, June, 11.
5. Andrew Porter and Jere Brophy (1988). "Synthesis of Research on
Good Teaching: Insights from the Work of the Institute for Research on
Teaching," Educational Leadership, 78-83.
6. G. French-Lazovik (1981). "Peer Review: Documentary Evidence in
the Evaluation of Teaching." In J. Millman, ed., Handbook of Teacher
Evaluation (Beverly Hills: Sage Publications), 77-78.
7. Milton Hildebrand, Robert C. Wilson, and Evelyn R. Dienst (1971). Evaluating
University Teaching, Center for Research and Development in Higher
Education, University of California, Berkeley, 18.
8. Ibid., p. 20.
9. Ibid., pp. 18-19.
10. A. Sullivan (1985). "The Role of Two Types of Research on the
Evaluation and Improvement of University Teaching." In Arthur Sullivan
and J. Donald, eds., Using Research to Improve Teaching: New Directions
for Teaching and Learning (no. 23) (San Francisco: Jossey-Bass), 76.
11. Hildebrand, et al., pp. 21-22.
12. A. Sullivan (1985), p. 16.
13. Peter Cohen and Wilbert McKeachie, p. 148.
14. G. R. Sell and N. Chism, p. 8.
15. Hildebrand, p. 38.
16. R. Miller (1987). Evaluating Faculty for Promotion and Tenure
(San Francisco: Jossey-Bass).
17. Adapted from Peter Cohen and Wilbert McKeachie, p. 152.
