How to develop a sound resident performance evaluation system
Debra A. DaRosa, PhD
Examination results should serve as supplements to, rather than substitutes for, expert judgments about a student’s or resident’s performance. The most common analogy used to illustrate this point is that of the airline pilot trainee. We would be far less confident if informed before the start of our flight that the new pilot about to command our plane was qualified to do so based solely on performance on a multiple-choice exam. Our confidence in the individual’s abilities would be much greater if we were informed that this trainee performed alongside experts who, for hundreds of hours, directly observed and assessed his or her qualifications before graduating him or her to fly without supervision. Hence, it is critical for our performance evaluation system to include more than a form. The purpose of this article is to outline questions every resident performance evaluation system should address in a written policy and procedures document and to outline the general design features of an evaluation system.
Performance Evaluation System Audit Questions
Why are we evaluating the residents?
The program director should outline the purposes the evaluation data will serve so that the information can be managed accordingly. The more purposes served, the more complex the system. For example, does the system simply need to serve educational needs (feedback on progress, motivating learners, etc.), or will it also be used for research (for example, a resident selection study or performance comparisons) and program evaluation? Different uses of the data may require different report formats for easy translation.
What should be evaluated?
How specific do you want the data to be? If you gather information on the basics such as knowledge, skills, and attitudes, will there be sufficient information to offer constructive advice to the resident on the areas that need improvement? The balance between being too specific and being overly general is a challenge warranting attention. It is important to sample broadly the knowledge, skills, or behaviors being evaluated. For example, evaluation data should be collected based on performance in the clinic, OR, skills lab, and other clinical venues. This yields more meaningful information than a collection of global ratings.
Who should evaluate?
Should it be full-time faculty only? Should senior residents evaluate junior residents, or junior residents evaluate the leadership skills of senior residents? Should medical students evaluate the teaching skills of the residents? Some programs use self-evaluation tools, and others ask nurses to evaluate residents’ patient relations and team relations skills. The answer to this question relies heavily on the answer to the “what” question.
When should evaluation take place?
Should residents receive written performance feedback at the end of every rotation or just twice per year, which is when the Residency Review Committee (RRC) requires program directors to evaluate the residents? Do the residents want midrotation written feedback so they don’t become aware of adverse feedback only at the end? Also, when should the data be analyzed, summarized, and disseminated to the faculty and residents?
How should you evaluate?
Although the focus of this article is on the subjective rating system, would a complete evaluation system involve multiple measures? Will the system involve Objective Structured Clinical Examinations (OSCEs), in-house exams, mock orals, etc.? You need to ensure adequate psychometric properties for each.
So What?
How will the data be used, by whom, and what consequences or actions will be triggered given certain results? It is best to anticipate results and determine outcomes at the start. For example, if a resident receives negative ratings from a faculty member, what will be done? A system procedure might include ratings being sent to the program coordinator and anything with negative narrative or ratings being sent directly to the program director, who would then immediately follow up with the author and resident rather than wait until the six-month evaluation meeting. If actions or decisions are difficult or cannot be made from the data because of missing rating forms, tardy forms, or improperly completed forms, the system is broken and needs to be fixed.
There are other design considerations, such as commitment from the faculty to cooperate, faculty “buy-in” and involvement, and plans for how to contend with system “snipers.” There should be an annual review of the system to see how well it is working and whether it is meeting the information needs of the program director, RRC, residents, and faculty.
On What Form?
No perfect performance evaluation form exists, and the ones that are used are only as good as the information recorded by the faculty. Therefore, focus on the raters rather than the form if you want to improve your performance evaluation data. However, the form should provide for global ratings as well as specific comments that note areas of strength and areas needing improvement (even the best can get better!). I’d encourage you to read the paper by Williams, Dunnington, and Klamen titled “Forecasting Residents’ Performance-Partly Cloudy” (1) for a practical guideline to improving resident appraisal.
I hope those of you rethinking your performance evaluation system find that asking these basic questions is helpful as you audit your current system and plan for enhancements.
1. Williams R, Dunnington G, Klamen D. Forecasting Residents’ Performance-Partly Cloudy. Academic Medicine, Vol. 80, No. 5, May 2005.
Online March 1, 2007


