Artificial Intelligence in Colonoscopy: Could It Be Making Us Worse?
Margaret J. Zhou, MD
Clinical Assistant Professor of Medicine, Division of Gastroenterology & Hepatology, Stanford University Medical Center, Stanford, CA
This summary reviews Budzyń K, Romańczyk M, Kitala D, et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. Lancet Gastroenterol Hepatol. 2025 Oct;10(10):896-903.
Correspondence: Margaret J. Zhou, MD. Associate Editor. Email: EBGI@gi.org
Keywords: Artificial intelligence; colonoscopy; observational study
STRUCTURED ABSTRACT
Question: Does exposure to artificial intelligence (AI) tools for colonoscopy impact non-AI assisted colonoscopy quality?
Design: Retrospective, observational study
Setting: Four endoscopy centers in Poland
Patients: This study included patients taking part in the ACCEPT (Artificial Intelligence in Colonoscopy for Cancer Prevention) trial. Patients were excluded if they had a contraindication to biopsy/polypectomy due to anticoagulant use or coagulation disorders, were pregnant, were referred for a known lesion, had a history of bowel resection or inflammatory bowel disease, or if the colonoscopy was not performed with a high-definition colonoscope.
Interventions: The 4 participating centers introduced AI computer-aided detection (CADe) tools in late 2021, after which colonoscopies were randomly assigned to be conducted with or without AI assistance according to exam date. The AI system used was ENDO-AID CADe (OIP-1, Olympus Medical Systems, Tokyo). This study compared non-AI assisted colonoscopies performed 3 months before and 3 months after AI tools were implemented at these centers.
Outcomes: The primary outcome was the change in adenoma detection rate (ADR) of non-AI assisted colonoscopies before and after exposure to the AI tool. ADR included adenomas or cancers, but not sessile serrated lesions (SSLs). Secondarily, the authors measured the change in the mean number of adenomas per colonoscopy (APC) and the mean number of advanced adenomas per colonoscopy (advanced APC) before and after AI exposure.
Data Analysis: ADR before and after AI exposure was compared using a χ² test. Mean APC and mean advanced APC before and after AI exposure were compared using a t-test. ADR was also compared by subgroups of center, physician specialty, and endoscopist sex.
Multivariable logistic regression with a random effect for endoscopist was performed to identify variables associated with ADR. Analyzed variables included patient age and sex, use of sedation, Boston Bowel Preparation Scale score, cecal intubation, endoscopist specialty (gastroenterologists vs surgeons), endoscopist’s years after graduation from medical school, endoscopist sex, center, and AI implementation. Variables with a P value <0.05 in the univariable model were included in the adjusted multivariable model.
Funding: Authors of the study report financial support from the European Commission and the Japan Society for the Promotion of Science.
Results: Between September 2021 and March 2022, 1,443 colonoscopies were performed without AI, of which 795 were performed before the introduction of AI and 648 after. Colonoscopies were performed by 19 endoscopists (16 gastroenterologists and 3 general surgeons), each of whom had performed >2,000 colonoscopies, with a mean experience of 28 years (range 8–39).
Patient characteristics that differed significantly between the before-AI and after-AI groups were the proportion of female patients (62% vs 55%, respectively; P = 0.005) and the proportion of patients receiving sedation (77% vs 82%, respectively; P = 0.02) (Table 1).1 The proportions of colonoscopies performed for alarm symptoms, surveillance, or a positive fecal occult blood test were similar overall between groups.
ADR decreased significantly from 28.4% before AI exposure to 22.4% after AI exposure (absolute difference -6.0%; 95% CI -10.5% to -1.6%; P = 0.009) (Figure 1). Mean APC before vs after AI exposure was not significantly different (0.54 vs 0.43; mean difference 0.11; 95% CI -0.01 to 0.24; P = 0.071). Mean advanced APC was also similar (0.062 vs 0.063; mean difference -0.002; 95% CI -0.03 to 0.03; P = 0.92). Colorectal cancer was detected in 6 (0.8%) colonoscopies before AI exposure vs 8 (1.2%) after AI exposure (P = 0.35).
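As a rough check on the ADR comparison above, the following Python sketch recomputes the absolute difference, a Wald 95% confidence interval, and an uncorrected χ² P value. The event counts (226 of 795 and 145 of 648) are not given in this summary; they are back-calculated from the reported percentages and are an assumption, as is the choice of a Wald interval.

```python
import math

# ASSUMPTION: adenoma-positive exam counts back-calculated from the
# reported ADRs (28.4% of 795 and 22.4% of 648); not taken from the paper.
before_n, before_pos = 795, 226
after_n, after_pos = 648, 145

p_before = before_pos / before_n
p_after = after_pos / after_n
diff = p_after - p_before  # negative value = ADR fell after AI exposure

# Wald 95% CI for a difference in proportions (unpooled standard error).
se = math.sqrt(p_before * (1 - p_before) / before_n
               + p_after * (1 - p_after) / after_n)
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

# Chi-square test (1 df, no continuity correction) equals the squared
# two-proportion z statistic computed with the pooled proportion.
pooled = (before_pos + after_pos) / (before_n + after_n)
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / before_n + 1 / after_n))
chi2 = (diff / se_pooled) ** 2
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df

print(f"difference = {diff:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

Under these assumed counts the results land close to the reported values; small discrepancies are expected because the inputs are rounded percentages.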
Variables significantly associated with ADR in multivariable logistic regression analysis included exposure to AI (adjusted odds ratio [aOR] 0.69; 95% CI 0.53-0.89; P = 0.005), male patient sex (aOR 1.78; 95% CI 1.38-2.30; P < 0.0001), and patient age ≥60 years (aOR 3.60; 95% CI 2.74-4.72; P < 0.0001).
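To put the adjusted odds ratio for AI exposure in context, the sketch below computes a crude (unadjusted) odds ratio with a Woolf (log-based) 95% confidence interval. The 2×2 counts are back-calculated from the reported ADRs and are an assumption; the study's aOR comes from a multivariable model with a random effect for endoscopist, which this calculation does not reproduce.

```python
import math

# ASSUMPTION: 2x2 table back-calculated from the reported ADRs
# (28.4% of 795 before AI exposure, 22.4% of 648 after).
before_pos, before_neg = 226, 795 - 226
after_pos, after_neg = 145, 648 - 145

# Crude odds ratio: odds of detecting an adenoma after vs before AI exposure.
crude_or = (after_pos / after_neg) / (before_pos / before_neg)

# Woolf method: standard error of log(OR) from the four cell counts.
se_log_or = math.sqrt(1 / after_pos + 1 / after_neg
                      + 1 / before_pos + 1 / before_neg)
or_low = math.exp(math.log(crude_or) - 1.96 * se_log_or)
or_high = math.exp(math.log(crude_or) + 1.96 * se_log_or)

print(f"crude OR = {crude_or:.2f} (95% CI {or_low:.2f} to {or_high:.2f})")
```

A crude estimate of this kind will generally differ somewhat from the reported aOR of 0.69, since the adjusted model also accounts for patient age, patient sex, and clustering by endoscopist.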
Figure 1. Change in ADR with standard, non-AI assisted colonoscopy before and after introduction of AI for polyp detection.
Table 1. Patient characteristics for those who had standard, non-AI assisted colonoscopies.
Data are n (%) unless otherwise indicated.
*Defined by score of at least 6 on Boston Bowel Preparation Scale.
†Significance assessment was not done due to too few events.
‡Weight loss, anemia, GI bleeding signs, and tumor seen in CT scan.
§ Change in bowel habits or diarrhea.
COMMENTARY
Why Is This Important?
This is the first study to assess the impact of exposure to AI CADe on colonoscopy quality and physician performance in the absence of AI assistance. One of the posited risks of AI tools has been a decline in human-only colonoscopy quality, and this study provides important, novel observational data on how these AI tools may negatively affect physician performance.
Prior evidence suggests that colonoscopy outcomes, including ADR, likely improve with CADe-assisted colonoscopy compared with conventional colonoscopy. A recent systematic review and meta-analysis of 44 RCTs with >36,000 patients found higher ADR with CADe-assisted than with standard colonoscopy (44.7% vs 36.7%; rate ratio 1.21; 95% CI 1.15-1.28).2 APC was also higher with CADe vs standard colonoscopy (0.98 vs 0.78; incidence rate difference [IRD] 0.22; 95% CI 0.16-0.28). In 22 studies with >19,000 patients, the advanced colorectal neoplasia (ACN) detection rate was slightly higher with CADe (12.7% vs 11.5%; RR 1.16; 95% CI 1.02-1.32). This meta-analysis informed the American Gastroenterological Association (AGA) living clinical practice guideline in 2025,3 which made no recommendation on the use of CADe-assisted colonoscopy because of very low certainty of evidence on its impact on long-term outcomes such as colorectal cancer (CRC) incidence and mortality.
Interestingly, Budzyń et al ask whether the ADR differences seen in prior studies of AI-assisted vs non-AI assisted colonoscopy could in part reflect a reduction in ADR with unassisted colonoscopy after AI exposure. This hypothesis is difficult to support without examining each study’s design more closely (including whether non-CADe colonoscopies were performed by endoscopists who had vs had not been exposed to AI tools). Nonetheless, the authors cite an interesting study of the impact of a CADe system on visual gaze patterns, assessed on colonoscopy video sequences, which found that use of CADe was associated with a significant reduction in eye travel distance compared to non-CADe exams.4 While that study does not address how CADe exposure affects endoscopist metrics, altered visual gaze is one possible mechanism by which prolonged CADe exposure could impair performance over time. A possible decline in skills after AI exposure has been evaluated in other areas as well, including a recent study demonstrating that, during the task of writing SAT essays, users of ChatGPT had the lowest neural engagement (assessed by electroencephalography) compared to those not using AI tools.5 Further investigation into the impact of AI tools on human skills will continue to be important as AI is increasingly deployed in clinical practice.
Key Study Findings
Caution
The authors thoughtfully consider some limitations of this study, the most significant being that this was an observational study susceptible to confounding and selection bias. The study population was a nested cohort within an RCT, and the manuscript does not describe in detail how patients were selected for this study; specifically, it is not clear whether all patients in the non-AI cohort after the introduction of the CADe tool were randomized into that group. A major concern is potential differences in patient population between the groups before vs after AI introduction. Multiple variables differed significantly between the two groups, and although the authors adjusted for these in their analysis, residual confounding is likely. Furthermore, results were obtained from only 19 endoscopists of at least moderate experience level, which may limit generalizability, and there were insufficient colonoscopies to allow per-endoscopist analyses. Importantly, withdrawal time, which can substantially affect ADR, was not reported in this study. ADR for this cohort was also quite low (overall ADR among the 1,443 colonoscopies performed without AI was 25.7%). While Poland does not have set ADR targets as is seen in the U.S., generalizability of these study results to endoscopists in the U.S., who now have an ADR target of 35%, may be limited.
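The overall ADR quoted above can be recovered as a sample-size-weighted average of the two period-specific ADRs. The short Python sketch below uses only figures stated in this summary; the variable names are illustrative.

```python
# Period-specific ADRs and exam counts as reported in the Results.
before_n, after_n = 795, 648
adr_before, adr_after = 0.284, 0.224

# Weighted average: each period's ADR contributes in proportion to its exams.
overall_adr = (adr_before * before_n + adr_after * after_n) / (before_n + after_n)

# Gap to the 35% ADR target mentioned for U.S. endoscopists.
gap_to_target = 0.35 - overall_adr

print(f"overall ADR = {overall_adr:.1%}")
print(f"gap to 35% target = {gap_to_target:.1%}")
```

Because the after-AI period contributed fewer exams (648 vs 795), the pooled figure sits slightly closer to the before-AI ADR than a simple average of the two percentages would.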
My Practice
My institution currently does not use any CADe system for colonoscopy. A 3-month trial of a CADe device (GI Genius; Medtronic, Minneapolis, MN) at my institution was previously described,6 where a retrospective pragmatic trial was conducted at our outpatient endoscopy unit. In contrast to many prior studies, we saw no statistically significant difference in ADR with vs without CADe (40.1% vs 41.8%; OR 1.14; 95% CI 0.83-1.56; P = 0.41) or in mean APC (0.78 vs 0.89; OR 1.08; 95% CI 0.80-1.45; P = 0.63). Based on these findings, our endoscopy unit did not continue with the CADe system, although this remains under active discussion.
With or without the CADe system, I prioritize using best practices for a high-quality colonoscopy.7 First, our unit uses split prep for all patients. During the procedure, I take care to maximize mucosal exposure by irrigating and using an Endocuff, and either perform a second look or retroflex in the right colon. Our endoscopy unit has a robust Colonoscopy Quality Assurance Program,8 which monitors and reports endoscopist ADR, SSL detection rate (SSL-DR), and withdrawal times, with steps to improve performance among endoscopists with lower detection rates.
For Future Research
Prospective trials comparing colonoscopy quality before vs after AI exposure are needed to validate these findings. These studies should evaluate outcomes including ADR and APC as well as other patient-important outcomes such as sessile serrated lesion (SSL) detection rate and CRC incidence.
Conflicts of Interest
Dr. Zhou reports no conflicts of interest related to this study.
REFERENCES
1. Budzyń K, Romańczyk M, Kitala D, et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. Lancet Gastroenterol Hepatol. 2025;10(10):896-903.
2. Soleymanjahi S, Huebner J, Elmansy L, et al. Artificial Intelligence-assisted colonoscopy for polyp detection: A systematic review and meta-analysis. Ann Intern Med. 2024;177(12):1652-1663.
3. Sultan S, Shung DL, Kolb JM, et al. AGA Living Clinical Practice Guideline on Computer-Aided Detection-Assisted Colonoscopy. Gastroenterology. 2025;168(4):691-700.
4. Troya J, Fitting D, Brand M, et al. The influence of computer-aided polyp detection systems on reaction time for polyp detection and eye gaze. Endoscopy. 2022;54(10):1009-1014.
5. Kosmyna N, Hauptmann E, Yuan YT, et al. Your brain on ChatGPT: Accumulation of cognitive debt when using an AI Assistant for essay writing task [preprint]. arXiv. Published June 10, 2025. doi:10.48550/arXiv.2506.08872.
6. Ladabaum U, Shepard J, Weng Y, Desai M, Singer SJ, Mannalithara A. Computer-aided detection of polyps does not improve colonoscopist performance in a pragmatic implementation trial. Gastroenterology. 2023;164(3):481-483.e6.
7. Keswani RN, Crockett SD, Calderwood AH. AGA Clinical Practice Update on Strategies to Improve Quality of Screening and Surveillance Colonoscopy: Expert Review. Gastroenterology. 2021;161(2):701-711.
8. Ladabaum U. The Stanford Colonoscopy Quality Assurance Program: Lessons from the intersection of quality improvement and clinical research. Gastroenterology. 2023;164(6):861-865.

