Introduction: Crohn’s disease (CD) is a chronic, progressive inflammatory condition of the gastrointestinal tract that currently affects an estimated 750,000 children and adults in the United States and is increasing in prevalence worldwide. Timely, accurate diagnosis in the pediatric population is important to prevent associated morbidity, including malnutrition and impaired growth. However, inter-radiologist agreement has been documented as only moderate, and radiologist diagnostic accuracy ranges widely. We sought to develop a machine learning method for predicting small bowel CD using radiomic features from a single noncontrast T2-weighted MRI sequence and limited clinical data, and to compare its performance with that of human radiologists.
Methods: In this retrospective, single-institution study, we used existing magnetic resonance enterography (MRE) and clinical data acquired between 2018 and 2021. Axial T2-weighted single-shot fast spin-echo (SSFSE) images were independently reviewed by three fellowship-trained pediatric abdominal radiologists for the presence of ileal CD, blinded to all clinical data. Using the same sequence, another radiologist identified two images from the area of greatest terminal ileal (TI) wall thickening, from which four 2D regions of interest (ROIs) were segmented. Radiomic features were extracted from each ROI and averaged between slices. Following LASSO feature reduction of the radiomic and clinical data, support vector machine models were used to classify patients as having CD, with the clinical diagnosis of CD as the reference standard. Radiomic-only, clinical-only, and ensemble models were trained and evaluated using nested cross-validation.
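The modeling steps described in the Methods (feature reduction, support vector machine classification, nested cross-validation) can be sketched as follows. This is a minimal illustration and not the study's code: it assumes scikit-learn, uses a synthetic feature table in place of the radiomic data, and the selector settings, SVM kernel, hyperparameter grid, and fold counts are all placeholder assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomic feature table (NOT the study data).
# 135 rows mirrors the cohort size; the 100 features are an arbitrary choice.
X, y = make_classification(
    n_samples=135, n_features=100, n_informative=8, random_state=0
)

# Scaling, LASSO-based feature reduction, and the SVM live in one pipeline so
# that every step is re-fit on the training portion of each fold only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectFromModel(Lasso(alpha=0.02, max_iter=10000))),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical hyperparameter grid, tuned in the inner loop.
param_grid = {
    "select__estimator__alpha": [0.01, 0.03],
    "svm__C": [0.1, 1.0, 10.0],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: hyperparameter search. Outer loop: performance estimate on
# folds that the search never saw.
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner_cv)
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)

print(f"nested CV AUC: {outer_auc.mean():.2f} +/- {outer_auc.std():.2f}")
```

Keeping feature selection inside the pipeline matters here: selecting features on the full dataset before cross-validation would leak label information into the held-out folds and inflate the reported AUC.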
Results: A total of 135 patients (mean age 15.2±3.2 years; 49.6% female) were included in this study. Seventy of 135 (51.9%) patients were clinically diagnosed with CD. The three radiologists had accuracies of 83.7%, 86.7%, and 88.1% for diagnosing CD, with a consensus accuracy of 88.1%. There was substantial inter-reader agreement (Fleiss’ kappa=0.78). The best performing ROI was the bowel core (AUC=0.95±0.01, accuracy=89.5±1.3%); other ROIs had worse diagnostic performance (whole bowel AUC=0.86±0.02; fat core AUC=0.70±0.03; and whole fat AUC=0.73±0.03). A clinical-only model had AUC=0.85±0.01 and accuracy=80.0±1.7%. Ensembling the bowel core and clinical models achieved the best diagnostic performance (AUC=0.98±0.01, accuracy=93.5±1.2%). The five most predictive radiomic features of the bowel core ROI all characterized image texture.
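For readers unfamiliar with the inter-reader agreement statistic reported above, Fleiss' kappa for a fixed number of readers per patient can be computed as in the sketch below. The function name and the toy reads are illustrative assumptions, not the study's data or code.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of per-subject label lists.

    Each inner list holds one label per reader; every subject must be read
    by the same number of readers. The statistic is undefined (division by
    zero) when all readers assign every subject to the same category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})

    # counts[i][j] = number of readers assigning subject i to category j
    counts = [[row.count(c) for c in categories] for row in ratings]

    # Observed agreement: mean over subjects of the fraction of agreeing reader pairs
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall proportion of each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)


# Toy example: 3 readers, 6 patients, 1 = ileal CD present, 0 = absent
toy_reads = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
kappa = fleiss_kappa(toy_reads)
print(f"Fleiss' kappa: {kappa:.3f}")
```

On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are described as substantial agreement, which matches the wording used for the reported kappa of 0.78.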
Conclusions: An ensemble machine learning model using T2-weighted radiomic and clinical data has excellent performance for diagnosing pediatric small bowel CD, exceeding that of expert human radiologists. With continued refinement, this model could improve diagnostic performance and decrease inter-reader variability in clinical practice.
Contact information: liurc@mail.uc.edu
Keywords: Crohn’s Disease, machine learning, radiomics