Case-level artificial intelligence for multi-photo teledermatology submissions: development and internal validation using patient-submitted dermatology images
Abstract
Background
Store-and-forward teledermatology commonly relies on several patient-submitted photographs of the same concern, but most dermatology artificial intelligence models classify single images independently.
Objective
To develop and internally validate a case-level diagnostic-support model that aggregates multiple patient-submitted photographs for common dermatologic conditions.
Methods
We conducted a retrospective diagnostic-modeling study using the Skin Condition Image Network (SCIN), a public dataset of deidentified, self-taken dermatology images from US adults. We curated 2,336 cases comprising 5,041 images across 10 common inflammatory, allergic, and infectious conditions. Cases were split at the submission level into training, validation, and held-out test sets, so that all images from one submission fell in the same set. Embeddings from frozen general-purpose and dermatology-specific encoders were evaluated with image-level classifiers and with a gated-attention multiple instance learning model that generated one case-level output from 1–3 images.
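To make the case-level aggregation step concrete, the following is a minimal sketch of gated-attention pooling over the image embeddings of one submission. It is not the authors' implementation: the embedding size, attention size, random weights, and the function names `matvec` and `gated_attention_pool` are all assumptions chosen for illustration, and a real model would learn `V`, `U`, and `w` during training and pass the pooled vector to a classifier head.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gated_attention_pool(bag, V, U, w):
    """Pool a bag of image embeddings into one case-level embedding.

    bag: list of K embeddings, each of length D (K is 1-3 in the study).
    V, U: hypothetical L x D projection matrices; w: length-L vector.
    Returns (pooled embedding of length D, attention weights of length K).
    """
    scores = []
    for h in bag:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate_branch = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        # Element-wise gating, then projection to a scalar score per image.
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate_branch)))

    # Softmax over the images in the bag (shifted by the max for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the image embeddings.
    dim = len(bag[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


# Toy example with made-up sizes and random (untrained) parameters.
random.seed(0)
D, L = 4, 3
V = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(L)]
U = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(L)]
w = [random.uniform(-1, 1) for _ in range(L)]
case = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(3)]

pooled, weights = gated_attention_pool(case, V, U, w)
print("attention weights:", weights)
print("case-level embedding:", pooled)
```

Because the weights are a softmax over the images in the bag, a single-image submission reduces to that image's embedding, which is why one function can serve cases with 1, 2, or 3 photographs.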
Results
The strongest image-level baseline, dermatology-specific embeddings with random forest classification, achieved macro/micro ROC-AUCs of 0.797/0.854. Case-level aggregation improved discrimination, with dermatology-specific embeddings plus multiple instance learning achieving mean macro/micro ROC-AUCs of 0.819/0.863 across repeated stratified experiments. The locked final model achieved macro/micro ROC-AUCs of 0.800/0.849 on the held-out test set. Balanced-threshold sensitivity/specificity examples were 0.702/0.688 for eczema and 0.818/0.826 for urticaria.
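The difference between macro- and micro-averaged ROC-AUC can be illustrated with a small sketch. It assumes a one-vs-rest formulation, which the abstract does not state explicitly; the toy labels and probabilities are invented and unrelated to the study data, and the helper names `roc_auc` and `macro_micro_auc` are hypothetical.

```python
def roc_auc(y_true, scores):
    """Binary ROC-AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_micro_auc(labels, probs, n_classes):
    """One-vs-rest macro and micro ROC-AUC for multiclass predictions.

    labels: true class index per case; probs: per-case list of class scores.
    """
    per_class = []
    flat_y, flat_s = [], []
    for c in range(n_classes):
        y_c = [1 if y == c else 0 for y in labels]
        s_c = [p[c] for p in probs]
        per_class.append(roc_auc(y_c, s_c))
        flat_y.extend(y_c)
        flat_s.extend(s_c)
    macro = sum(per_class) / n_classes  # every class counts equally
    micro = roc_auc(flat_y, flat_s)     # every (case, class) pair counts equally
    return macro, micro


# Invented toy data: five cases, three classes, class 0 over-represented.
labels = [0, 0, 0, 1, 2]
probs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
macro, micro = macro_micro_auc(labels, probs, n_classes=3)
print(f"macro ROC-AUC = {macro:.3f}, micro ROC-AUC = {micro:.3f}")
```

Macro averaging weights each condition equally regardless of how many cases it has, whereas micro averaging pools all case-class decisions, so the two can diverge when class frequencies are imbalanced.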
Limitations
Internal validation used a 10-condition subset from a US volunteer dataset; external validation, calibration, subgroup performance analysis, and prospective workflow studies are required.
Conclusion
Modeling the teledermatology submission as a multi-image case better reflects asynchronous dermatology workflow than single-image classification. The model is preliminary clinician-facing support for structured review and triage, not autonomous diagnosis.
Key Points
- Store-and-forward teledermatology submissions usually contain multiple patient-submitted photographs, whereas most dermatology AI models classify single images independently.
- This study developed a case-level multiple instance learning model that aggregates 1–3 photographs from the same SCIN submission and produces one clinician-facing diagnostic-support output.
- Case-level aggregation modestly improved discrimination over the strongest image-level baseline and produced threshold-specific sensitivity/specificity outputs suitable for structured review and triage research.
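The threshold-specific sensitivity/specificity outputs in the last key point can be sketched as follows. The abstract does not define "balanced threshold"; this example assumes it means the cut-off at which sensitivity and specificity are closest to each other. The scores are invented, and the helper names `sens_spec` and `balanced_threshold` are hypothetical.

```python
def sens_spec(y_true, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_threshold(y_true, scores):
    """Pick the observed score that minimizes |sensitivity - specificity|.

    This is an assumed reading of "balanced threshold", not the study's
    documented procedure.
    """
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(y_true, scores, t)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, t, sens, spec)
    return best[1], best[2], best[3]


# Invented one-vs-rest scores for a single condition (1 = condition present).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

threshold, sensitivity, specificity = balanced_threshold(y_true, scores)
print(threshold, sensitivity, specificity)  # 0.6 0.75 0.75
```

In practice such a threshold would be chosen on validation data and then applied unchanged to the held-out test set, so that the reported sensitivity and specificity are not tuned on the data used to evaluate them.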