How Artificial Intelligence Is Expected to Transform Surgical Training

An idea previously reserved for science fiction, artificial intelligence (AI) has become ubiquitous within our world of modern technology, with tools such as ChatGPT, Google Lens, Microsoft PowerPoint Speaker Coach, and others entering the mainstream at an increasing rate.

AI is loosely defined as a field of computer science that uses theories and algorithms to give computer systems the ability to perform tasks such as problem-solving, language processing, and decision-making.¹ These systems have finally accrued a critical mass of data and knowledge, such that AI is now used across industries and is transforming every field, including medicine, surgery, and medical education.

Over the past 20 years, a substantial transformation in surgical training has occurred. Graduate medical education has shifted away from the traditional apprenticeship model of teaching to a more focused, competency-based training in the 21st century. The breadth of material to learn has expanded over time.

An evolution in operative techniques, from open to minimally invasive and endovascular or image-guided approaches, has had an impact on resident education, requiring extensive and prolonged training to gain proficiency. The American Board of Surgery (ABS) and the American College of Surgeons (ACS) have attempted to address these issues with the introduction of Entrustable Professional Activities (EPAs) and the Mastery in General Surgery Program, respectively.

This shift in surgical education workflow and productivity has not occurred in isolation but rather within a greater movement in information technology. AI now has the ability to impact healthcare and surgical training. This article explores how surgical trainees can interact with AI through surgical education, skills acquisition, and intraoperative decision-making, and it also examines ethical considerations surrounding AI.

AI in Surgical Education

Despite the relatively recent introduction and popularization of AI, there have already been multiple applications of this tool to benefit, advance, and facilitate surgical education.

One of the previously mentioned AI tools is ChatGPT, a large language model developed by OpenAI and trained on a large amount of textual data available on the Internet. ChatGPT was launched on November 30, 2022; by January 2023, it had already reached a total of 100 million users, making it the fastest-growing application to date.² This tool is able to produce text that emulates human language and can be applied in a variety of language-processing tasks.³

AI has been used to assist in surgical education, both from the learner’s and teacher’s perspectives.²,⁴

The Learner Perspective

The rapid advancement of the medical field brings along an overwhelming amount of new information that needs to be triaged, skimmed, and then read and internalized continuously.

Data from 2016 showed that within the biomedical sciences, more than 1 million papers were being added to the PubMed database every year, corresponding to approximately two papers per minute.⁵
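The per-minute figure follows from simple arithmetic. The short check below is only a sketch: it assumes a round 1 million papers (the source says "more than") and a 365-day year, and the function name is chosen here for illustration.

```python
# Rough check of the publication rate quoted above.
# Assumptions: exactly 1,000,000 papers per year and a 365-day year.
PAPERS_PER_YEAR = 1_000_000
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def papers_per_minute(papers_per_year: int = PAPERS_PER_YEAR) -> float:
    """Return the average number of papers added per minute."""
    return papers_per_year / MINUTES_PER_YEAR


if __name__ == "__main__":
    # Roughly 1.9, which rounds to the "two papers per minute" cited.
    print(f"{papers_per_minute():.2f} papers per minute")
```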

AI allows students to quickly triage the most relevant articles within an area of interest and even summarize their content for a quick read so they can stay up to date on new information and research.

Preparing for a big test or developing a study routine largely relies on creating good study habits and having access to study materials suited to an individual’s preferred modality of learning.

ChatGPT and AI in general can assist the trainee in organizing a study routine by forming a study plan, as well as pointing out and generating personalized study materials, including flashcards, quizzes, and practice scenarios/questions. This process also could include generating material to prepare for a surgical case (e.g., operative steps, indications, complications, postoperative evaluation). These approaches to learning can be completed in a quick and efficient way using this technology.

Another way AI can be used by the learner is to have it generate case scenarios in the areas of interest by simulating patient encounters. This strategy allows the student to practice clinical decision-making, from clinical diagnosis to treatment. The student also can practice answering medical questions while preparing for a test or the beginning of a residency program.

AI can be used to practice how to communicate medical concepts to a patient in a clear and objective way, using appropriate terms tailored to a patient’s level of health literacy. This approach also can be helpful for practicing tough conversations, including those about end-of-life goals of care.

AI can help with searching literature and generating bibliographies for a research project. This includes translating material from other languages, which allows a broader range of articles to be screened and included in a project.

The Educator Perspective

Every student learns best in his or her own unique way. Using AI, the educator can recreate the same study content using different materials or methods to satisfy the needs of different styles of learning. This approach includes creating a variety of different exercises, quizzes, and clinical scenarios.

Trainees can leverage AI to prepare for exams, participate in journal clubs, or produce high-yield lectures and materials for their teams.

Algorithms already exist to screen residency applications using set parameters (e.g., standardized test scores, number of publications). However, AI can enhance the application process by using themes within personal statements, recommendation letters, and work-life experience to find a program’s ideal candidate who may otherwise be overlooked.

AI, which is not limited to passive forms of scoring, may eventually be capable of interacting with trainees who are taking examinations such as the ABS Certifying Exam and scoring them in real time.

In addition to being able to filter and sort large quantities of data, AI also has the ability to evaluate what is missing from a dataset. This functionality will allow educators to critically review and assess feedback and surgical resident performance results and evaluate areas of deficiency. With these AI-enabled assessments, instructors will be able to modify their teaching strategies in areas where students struggled.

Skill Acquisition and Surgical Simulation

Advances in surgical innovation have far outpaced advances in surgical training and competency. In 2017, George and colleagues suggested that as many as 20% of graduating residents did not meet competency in core surgical procedures.⁶ Restricted work hours and the increase in nonclinical administrative responsibilities have contributed to decreased operative experience during residency.⁷

This dilemma has led to an unprecedented opportunity for surgery programs to explore cutting-edge technologies and pursue innovative approaches for training surgical residents while reinforcing skill acquisition.⁸

Virtual-reality (VR)-based training modalities may offer a solution to the challenges posed by decreased operative experience during residency. VR platforms provide a closely simulated hands-on operative environment while addressing training gaps through checklists.

While simulators offer trainees the opportunity to hone their surgical skills, they do not inherently provide feedback the way an experienced surgical attending might.

However, machine learning may help bridge this gap. Machine learning, a subfield of AI, involves generating algorithms and models capable of novel prediction by using historical data as examples with the focus of improving accuracy.⁹
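As a minimal sketch of "prediction from historical examples," the snippet below fits a nearest-centroid classifier, one of the simplest machine learning methods. It is not the method used in any cited study. The feature choice (completion time and instrument path length), the labels, and every number are invented for illustration.

```python
from math import dist
from statistics import mean

# Hypothetical historical examples:
# (completion time in seconds, path length in cm) -> skill label.
# All values are invented for illustration, not taken from any study.
HISTORY = [
    ((95.0, 140.0), "expert"),
    ((100.0, 150.0), "expert"),
    ((105.0, 160.0), "expert"),
    ((190.0, 300.0), "novice"),
    ((200.0, 320.0), "novice"),
    ((210.0, 340.0), "novice"),
]


def fit_centroids(examples):
    """Learn one mean feature vector (centroid) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Predict the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(HISTORY)

if __name__ == "__main__":
    # A new, unseen performance is assigned to the nearer group.
    print(predict(centroids, (110.0, 170.0)))
    print(predict(centroids, (185.0, 290.0)))
```

A real system would use many more features, normalize them to comparable scales, and be validated on held-out data. The sketch only shows the basic idea of learning from labeled past examples.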

By combining VR simulation with machine learning, trainees would benefit from a fully immersive experience while receiving unbiased personalized feedback to more efficiently facilitate their skill acquisition. Much like flight simulators in the aviation industry, the demand for VR simulation, skill acquisition, and ultimately certified proficiency prior to operating likely will grow.

Intraoperative Guidance and Assessment

For surgical trainees, accurate and reliable assessment of performance is critical for developing into a competent and safe surgeon.

Given the inherent risks, surgery often has been compared to other disciplines with similar levels of risk, such as aviation or motor racing, and yet a surgical trainee’s assessment is much less sophisticated. Broadly based on the apprenticeship model of Halsted, assessment is largely delivered by a surgeon mentor.

While objective means of performance assessment exist, such as the Objective Structured Assessment of Technical Skills and, most recently, the EPAs launched by the ABS in July 2023, these tools are time-consuming and limited by rater bias.¹⁰

To that end, AI holds the promise of rapid, automated, reproducible surgical performance assessment that is objective and would allow for continuous feedback without necessarily requiring direct observation by an expert surgeon.

New technological innovations such as robotic surgery platforms offer a wealth of digital information, including kinematic, sensor, and video data, that can be used to provide automated objective skill assessment and prompt constructive feedback to surgeons and trainees.

Robotic platforms are capable of mapping the kinematics and video data of their arms and tools and converting these movements into measurable metrics (e.g., speed and number of hand movements, path length, force, torque, time to complete operation). These metrics previously have been used to differentiate expert surgeons from novices post hoc, but only recently have machine learning techniques been able to automatically differentiate the skill level of the robotic operator within seconds.¹¹
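To make these metrics concrete, the sketch below derives three of them (time to complete, path length, and mean speed) from a list of timestamped tool-tip positions. The input format, the function name, and the sample trace are assumptions made for illustration. Real robotic platforms expose kinematic data in their own formats.

```python
from math import dist


def kinematic_metrics(samples):
    """Compute simple metrics from (time_s, x, y, z) tool-tip samples.

    Returns total time, path length, and mean speed. Length units follow
    whatever unit the input positions use.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    times = [sample[0] for sample in samples]
    points = [sample[1:] for sample in samples]
    # Path length is the sum of straight-line distances between
    # consecutive positions.
    path_length = sum(dist(a, b) for a, b in zip(points, points[1:]))
    total_time = times[-1] - times[0]
    mean_speed = path_length / total_time if total_time > 0 else 0.0
    return {
        "time": total_time,
        "path_length": path_length,
        "mean_speed": mean_speed,
    }


# Hypothetical three-sample trace (seconds, then x, y, z positions).
trace = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 3.0, 4.0, 0.0),
    (2.0, 3.0, 4.0, 12.0),
]
metrics = kinematic_metrics(trace)

if __name__ == "__main__":
    print(metrics)
```

Metrics like these are the inputs a skill-classification model would consume. Force and torque would require additional sensor channels not modeled here.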

AI also can be used to analyze surgical videos with the goal of providing decision-making support and improving surgical quality and patient outcomes. A group out of Stanford University in California developed a neural network model that analyzed over a thousand videos of the critical view of safety to investigate the reliability and utility of AI-driven procedure segmentation and annotation.¹²

The model identified whether the critical view of safety was achieved with remarkable efficiency (50 videos in 1 hour). However, the AI model was unable to discern nuances associated with disease severity (for instance, how much of the cystic duct needs to be visualized in a “difficult” gallbladder), which remains a limiting factor.

Ethical Limitations

While the integration of AI technology into surgical training heralds a transformative era in the House of Surgery, the promise of better surgery and patient outcomes must be balanced with caution and consideration of its use.

The output of machine learning and other AI models is only as good as the data used to develop them. A recent study found that AI systems do not account for malignant stereotypes, such as racial and sexual biases; the facial-recognition model in the study was unable to accurately identify women and people of color and ultimately unable to incorporate this information into its algorithm.¹³ Racial background and socioeconomic status are well-recognized contributors to disparities in surgical outcomes.¹⁴,¹⁵ Developing algorithms and technologies that do not critically assess existing biases in datasets may lead to the perpetuation of these disparities.

AI in medicine requires massive amounts of patient data in order to develop reliable and accurate algorithms. With a growing number of cyber attacks, there remains an appropriate concern regarding the storage, management, and protection of patient health records.

Information security is critically important in protecting the patient and their information, and health systems must mitigate these risks by investing in expertise and data infrastructure.

As fascinating as many of these technologies promise to be, few studies demonstrate improvement in care when AI-assisted decision-making is used.¹⁶

Barriers to clinical impact remain, including interpreting the results arrived at by AI. Because of what is known as the “black box” design, researchers often are left with little ability to assess or understand why the system arrived at a particular decision or solution.

In other words, AI is not yet able to determine causal relationships in data in the ways necessary for clinical implementation, nor is it able to provide interpretation of its own analyses.¹ In effect, predictions generated by AI must continue to be critically assessed by humans, who will then interpret the data in clinically relevant ways.

Ultimately, as it stands, AI is not positioned to replace the clinician, particularly regarding the skills associated with bedside acumen and decision-making, and this reality should continue to be reflected in surgical training models.

Looking Ahead

AI has the ability to greatly shape the field of surgery going forward, and this can perhaps best be summarized by AI itself.

User Prompt>>

Write a conclusion for our editorial for the American College of Surgeons on how artificial intelligence will improve surgical training.

ChatGPT>>*

“The era of AI-driven surgical training holds immense potential to revolutionize the way surgeons are educated and trained. By harnessing the power of AI, the American College of Surgeons can empower future generations of surgeons with enhanced technical skills, deeper anatomical knowledge, and data-driven approaches to patient care. Through responsible integration and continued collaboration between humans and machines, we can achieve safer surgeries, improved patient outcomes, and a brighter future for surgical practice.”

Dr. Nathan Rubalcava is a pediatric surgery fellow in the Department of Surgery at the Children’s Mercy Hospital in Kansas City, MO.

*Text generated by OpenAI’s AI Assistant: https://openai.com.

References

1. Hashimoto DA, Rosman G, Rus D, Meireles OR. Artificial intelligence in surgery: Promises and perils. Ann Surg. 2018;268(1):70-76.
2. Eysenbach G. The role of ChatGPT, generative language models, and artificial intelligence in medical education: A conversation with ChatGPT and a call for papers. JMIR Med Educ. 2023;9. Available at: https://mededu.jmir.org/2023/1/e46885/PDF. Accessed July 18, 2023.
3. Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT-reshaping medical education and clinical management. Pakistan J Med Sci. 2023;39(2):605-607.
4. Bergman D. Why ChatGPT is a “game changer” for med students. MedPage Today. April 18, 2023. Available at: https://www.medpagetoday.com/opinion/kevinmd/104085. Accessed June 24, 2023.
5. Landhuis E. Information overload: How to manage the research-paper deluge? Blogs, colleagues and social media can all help. Nature. 2016;535(7612):457-458.
6. George BC, Bohnen JD, Williams RG, et al. Readiness of US general surgery residents for independent practice. Ann Surg. 2017;266(4):582-594.
7. Bell RH, Biester TW, Tabuenca A, et al. Operative experience of residents in US general surgery programs: A gap between expectation and experience. Ann Surg. 2009;249(5):719-724.
8. Rogers MP, DeSantis AJ, Janjua H, et al. The future surgical training paradigm: Virtual reality and machine learning in surgical education. Surg (US). 2021;169(5):1250-1252.
9. Kanevsky J, Corban J, Gaster R, et al. Big data and machine learning in plastic surgery: A new frontier in surgical innovation. Plast Reconstr Surg. 2016;137(5):890e-897e.
10. Davids J, Lam K, Nimer A, Giannarou S, et al. AIM in medical education. In: Artificial Intelligence in Medicine. Springer International Publishing; 2022:319-340.
11. Fard MJ, Ameri S, Darin Ellis R, et al. Automated robot-assisted surgical skill evaluation: Predictive analytics approach. Int J Med Robot Comput Assist Surg. 2018;14(1):1-11.
12. Korndorffer JR, Hawn MT, Spain DA, et al. Situating artificial intelligence in surgery: A focus on disease severity. Ann Surg. 2020;272(3):523-528.
13. Hundt A, Agnew W, Zeng V, et al. Robots enact malignant stereotypes. ACM Int Conf Proceeding Ser. 2022. Available at: https://arxiv.org/pdf/2207.11569.pdf. Accessed July 18, 2023.
14. Suneja N, Kong RM, Tiburzi HA, et al. Racial differences in orthopedic trauma surgery. Orthopedics. 2022;45(2):71-76.
15. Goljo E, Parasher AK, Iloreta AM, et al. Racial, ethnic, and socioeconomic disparities in pituitary surgery outcomes. Laryngoscope. 2016;126(4):808-814.
16. Zhou Q, Chen ZH, Cao YH, Peng S. Clinical impact and quality of randomized controlled trials involving interventions evaluating artificial intelligence prediction tools: A systematic review. NPJ Digit Med. 2021;4(1). Available at: https://www.nature.com/articles/s41746-021-00524-2. Accessed July 18, 2023.