American College of Surgeons: Technical Skills Education in Surgery
FAQs
 

What is reliability?
Reliability can be defined as the ability of an instrument to consistently discriminate between performances across evaluators or over time. It is measured on a scale of 0 to 1, with 0 being totally unreliable and 1 being completely reliable. For high-stakes decisions, a reliability greater than 0.8 is generally considered necessary.

There are several types of reliability including:

Inter-Rater or Inter-Observer Reliability refers to the degree to which different raters/observers provide the same assessment.

Test-Retest Reliability refers to the consistency of a measure from one time point to another.

Internal Consistency or Inter-Station Reliability refers to the ability of different items on an assessment tool or examination to measure the same characteristic or skill.
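
As an illustration of how internal consistency might be quantified, the sketch below computes Cronbach's alpha, one commonly used coefficient. The choice of this coefficient, the function name cronbach_alpha, and the example scores are assumptions made for illustration; they do not come from the studies reviewed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate internal consistency with Cronbach's alpha.

    scores: list of rows, one row per candidate, one column per item
    (or station). All values here are hypothetical.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 candidates")

    # Sample variance of each item across candidates.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each candidate's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Made-up scores for 4 candidates on a 3-item assessment.
example_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(example_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values close to 1 suggest that the items rank candidates in a similar way. Unlike the idealized 0 to 1 scale described above, this coefficient can fall below 0 when items disagree strongly.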

What is validity?
The term validity refers to the ability of an assessment tool to measure what it is supposed to measure. There are several types of validity.

Face validity addresses the extent to which the examination resembles the situation in the "real world". For example, does a suturing task in a bench top laparoscopic model resemble laparoscopic suturing in the real world setting?

Criterion validity refers to the extent to which an assessment tool correlates with other measures of performance. There are two types of criterion validity. Predictive validity addresses the ability of a tool to predict future performance (e.g. the ability of performance on the MIST-VR to predict future performance as a surgeon). Concurrent validity refers to the correlation between an assessment tool and the perceived "gold standard" (e.g. performance in the lab measured by global ratings compared with performance measured by faculty ratings).
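
Because criterion validity is expressed as a correlation, a short sketch may help. The example below computes a Pearson correlation coefficient between two sets of scores for the same trainees. The use of Pearson's coefficient, the function name pearson_r, and the scores are assumptions for illustration only; a given study may use a different statistic.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sum of cross-products and square roots of the sums of squares.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("one of the score lists has no variation")

    return cross / (spread_x * spread_y)


# Made-up scores: a new tool versus a reference ("gold standard") rating.
new_tool_scores = [12, 15, 9, 20, 17]
reference_scores = [30, 34, 25, 41, 36]
print(f"r = {pearson_r(new_tool_scores, reference_scores):.2f}")
```

For concurrent validity the two lists would be collected at about the same time; for predictive validity the second list would come from a later measure of performance.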

Construct validity describes the agreement between a theoretical concept and a specific assessment tool or procedure. For example, in order to demonstrate that a new simulator has construct validity as a measure of technical performance, more senior surgeons should score higher on its assessment parameters than more junior ones.

Content Validity refers to the extent to which a measurement reflects the trait or domain it purports to measure. For example, an assessment of a resident performing a laparoscopic cholecystectomy on an anesthetized pig has higher content validity as a measure of surgical skill than a multiple-choice exam on the anatomy of the gallbladder.

What is a Surgical Skills Center?
Surgical Skills Centers are technical training centers for both new and experienced surgeons, and are growing in popularity. Usually affiliated with medical schools, they provide a dedicated space in which surgical trainees can learn and practice their skills. Common equipment in these centers includes an array of operating room instruments, different types of surgical simulators (low-fidelity, high-fidelity, bench-top, virtual reality, etc.), and the necessary disposables (tubes, sutures, gauze, etc.) to allow numerous different procedures to be replicated adequately.

How do you measure surgical performance?
Traditionally, surgical performance has been assessed almost exclusively through unstructured, subjective observation. Research in the field of surgical education, however, has led to the development of several different assessment tools in the hope of providing more reliable and valid measures of surgical performance. They include the Objective Structured Assessment of Technical Skills (OSATS), hand motion analysis, final product analysis, and time taken for completion of a task.

What is the OSATS?
The Objective Structured Assessment of Technical Skills (OSATS) is an OSCE-like examination in which subjects perform standardized surgical procedures while being evaluated by an expert surgeon. The assessment is made using a task-specific checklist in which all important steps of a procedure are identified and a mark (0 or 1) is assigned for every step, with 1 given when the candidate completes the step correctly. Also included is a global assessment score (GAS), a 7-item rating of overall performance that covers aspects such as respect for tissues and flow of procedure. Studies have repeatedly demonstrated the reliability and validity of this assessment tool.
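
The two components of the OSATS can be sketched as a small scoring routine. This is a minimal illustration, not the official scoring form: the class name OsatsResult and the example marks are hypothetical, and the 1 to 5 rating range for each global item is an assumption that is not stated in the text above.

```python
from dataclasses import dataclass
from typing import List

GLOBAL_ITEMS = 7                # the global assessment has 7 items (from the text)
RATING_MIN, RATING_MAX = 1, 5   # assumed rating range per global item


@dataclass
class OsatsResult:
    checklist: List[bool]       # True if the step was completed correctly
    global_ratings: List[int]   # one rating per global item

    def checklist_score(self) -> int:
        # Each correctly completed step earns 1 mark, otherwise 0.
        return sum(1 for done in self.checklist if done)

    def global_score(self) -> int:
        if len(self.global_ratings) != GLOBAL_ITEMS:
            raise ValueError(f"expected {GLOBAL_ITEMS} global ratings")
        if any(not RATING_MIN <= r <= RATING_MAX for r in self.global_ratings):
            raise ValueError("global rating out of the assumed range")
        return sum(self.global_ratings)


# Made-up marks for one candidate on a hypothetical 5-step task.
result = OsatsResult(
    checklist=[True, True, False, True, True],
    global_ratings=[4, 3, 4, 5, 3, 4, 4],
)
print(result.checklist_score(), "of", len(result.checklist), "checklist marks")
print(result.global_score(), "global rating points")
```

Keeping the checklist and the global score separate mirrors the description above, in which they are distinct parts of the assessment.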

What is Hand Motion Analysis (HMA)?
Borrowed from the field of kinesiology, HMA uses electromagnetic markers to track hand movements during a variety of surgical tasks and derives numerous outcome variables, including the number of hand movements and path length. Studies by several groups have demonstrated its validity and reliability as an assessment tool, although its use to date has been limited to the laboratory-based training environment. Its advantage lies in both its use of objective measures of performance and its lack of dependency on expert raters for assessment.
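
To make the path length variable concrete, the sketch below sums the straight-line distances between consecutive tracked positions of one hand. The function name path_length, the coordinate units, and the sample positions are assumptions for illustration; actual tracking systems apply their own sampling and filtering, which are not shown here.

```python
import math


def path_length(positions):
    """Total distance travelled through a sequence of (x, y, z) positions.

    positions: list of 3-D coordinates sampled over time, all in the
    same unit (for example, centimetres).
    """
    return sum(
        math.dist(start, end)  # Euclidean distance, Python 3.8+
        for start, end in zip(positions, positions[1:])
    )


# Made-up marker positions for one hand.
samples = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_length(samples))  # 5 + 12 = 17.0
```

A shorter path length for the same task is usually read as more economical movement. Counting discrete hand movements requires an additional rule for deciding when one movement ends and the next begins, which is left out of this sketch.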

What is aptitude?
Aptitude refers to an individual's ability to learn or perform certain skills. Aptitude tests are standardized tests designed to measure an individual's ability to develop certain skills. Studies have applied tests of psychomotor ability, cognitive knowledge, and personality and attempted to relate them to measures of surgical skill. Although several studies suggest aptitude tests might be used to predict an individual's ability to perform surgery, the existing literature in this area has reported conflicting results.

What is skill transfer?
Skill transfer refers to the application of a skill learned in one situation to a different but similar situation. It is also called transfer of training. One of the key questions in the technical skills education literature is whether skills acquired on simulators ultimately lead to improved performance of those skills in the operating room. Other forms of skill transfer include transfer between models (e.g. from bench models to virtual reality trainers) or between related tasks (e.g. from open to laparoscopic surgery).

What types of surgical simulators are available?
A large number of simulators have been developed for training and evaluation in various surgical skills. Many are described in the literature; few have undergone a rigorous evaluation of their teaching effectiveness and validity, although there is an emerging body of literature in the field.

There is currently no formal classification of surgical simulators. However, they could be classified according to different criteria, including, for example, their level of fidelity (from low to high), the task they simulate (part-task or whole-task trainers), or the "technology" they employ (bench models or mechanical simulators, animal or human cadavers, part- or full-scale mannequins, computer-based simulators generating a virtual reality, or hybrid simulators).

What is meant by model fidelity?
"Fidelity" refers to the extent to which a given simulator or model imitates reality. Several dimensions of realism can be considered such as visual cues, tactile features, haptic feedback capabilities, and dynamic interaction with the learners.

Model fidelity is best conceptualized as a continuous spectrum, ranging from low to high fidelity. Examples of low-fidelity models include bench models made of simple materials that often bear little anatomical resemblance to reality. However, these models incorporate some of the key constructs of the simulated tasks. At the other end of the spectrum are high-fidelity models such as human or animal cadavers or the new array of virtual reality simulators. These simulators usually incorporate highly realistic visual and tactile cues within a highly interactive model. In between these two extremes, virtually any kind of intermediate fidelity can exist.

Model fidelity possibly plays a role in the training effectiveness of a given simulator. However, it has been argued that it is not so much the level of fidelity that matters but rather the appropriate match between the learner's level of expertise and the model fidelity. Novices may benefit from training with low-fidelity models whereas more experienced learners may require more complex simulators.

For what surgical procedures are virtual reality (VR) simulators available?
Many VR simulators are currently being developed and it is likely that the number of simulators commercially available will increase in the near future.

The list below includes the surgical domains or procedures for which at least one VR simulator is commercially available or for which a prototype has already undergone some testing in the literature. This list is not intended to be exhaustive, but to provide the reader with a quick overview of what is currently available. For more detailed information, the reader should refer to the relevant reviews.

  • Laparoscopic surgical simulators (basic laparoscopic skills or complex procedures)
  • Endoscopic simulators
    • Bronchoscopy
    • Colonoscopy and sigmoidoscopy
    • Upper GI endoscopy
    • Endourological procedures
    • Hysteroscopy
    • Sinusoscopy
  • Anastomosis simulators (bowel or vascular)
  • Orthopedic and arthroscopic simulators
  • Ophthalmologic simulators
  • Intravenous catheter insertion simulators

 

This page and all contents are Copyright © 2006 by the American College of Surgeons, Chicago, IL 60611-3211