Oral Presentation Skin Cancer 2024

Diagnostic performance of neural networks in dermoscopic assessment of melanocytic lesions: context is critical. (#115)

Nicholas M Muller 1,2, Chantal Rutjes 1, Samuel Tan 1,2, Brigid Betz-Stablein 1, Kiarash Khosrotehrani 1,2, Zongyuan Ge 3, Peter Soyer 1, Ian Katz 4
  1. Frazer Institute, University of Queensland, Woolloongabba, Queensland, Australia
  2. Department of Dermatology, Princess Alexandra Hospital, Woolloongabba, Queensland, Australia
  3. Monash Medical AI Group, Monash University, Clayton, Victoria, Australia
  4. Southern Sun Pathology, Thornleigh, New South Wales, Australia

Introduction:

Artificial intelligence (AI) holds immense promise as a diagnostic tool for clinicians. Here, we examine the utility of convolutional neural network models for the dermoscopic diagnosis of melanoma.

 

Methods:

We compared two convolutional neural network models trained on a dataset that included images from European and American sources: CNN-1, and SMARTI, which had also been pre-trained on an Australian dataset. Dermoscopic images were collected from a prospectively recruited cohort of 210 lesions (from 191 patients) biopsied because of suspicion of melanoma. Each biopsy was diagnosed by five pathologists, and these diagnoses were compared with those of the two AI models.

 

Results:

CNN-1 yielded an area under the receiver operating characteristic curve of 0.682, while SMARTI yielded 0.725. CNN-1 had a specificity of 0.35 (95% confidence interval (CI) 0.27-0.45) and a sensitivity of 0.91 (CI 0.84-0.96), whereas SMARTI demonstrated a specificity of 0.26 (CI 0.19-0.35) at a sensitivity of 0.95 (CI 0.88-0.98). Inter-rater agreement among pathologists was higher for lesions correctly classified by SMARTI (Fleiss’ Kappa 0.788) than for lesions misclassified by SMARTI (Fleiss’ Kappa 0.406). Thus, lesions misclassified by the AI model were also divisive for pathologists.
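
As an illustration only, the kinds of statistics reported above (sensitivity, specificity, a 95% CI for a proportion, and Fleiss' kappa) can be computed along the following lines. This is a minimal sketch, not the analysis code used in the study: the abstract does not state which CI method was used, so the Wilson score interval is an assumption, and all function names and input counts are hypothetical.

```python
from math import sqrt


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the abstract does not say which interval method was used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned lesion i to
    category j. Every row must sum to the same number of raters
    (e.g. five pathologists per lesion).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every lesion needs the same number of ratings")
    n_categories = len(counts[0])
    # Share of all ratings that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per lesion, then averaged over lesions.
    p_obs = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_obs) / n_subjects
    # Agreement expected by chance (undefined if this equals 1).
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    sens, spec = sensitivity_specificity(tp=9, fn=1, tn=3, fp=7)
    print(sens, spec, wilson_interval(9, 10))
    # Five raters, two categories (melanoma / not melanoma), three lesions.
    print(fleiss_kappa([[5, 0], [0, 5], [3, 2]]))
```

Comparing kappa between two subsets of lesions, as in the results above, would amount to calling `fleiss_kappa` once on the rating table for correctly classified lesions and once on the table for misclassified lesions.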

 

Conclusion:

These results demonstrate the impact of population-relevant training data on the performance of dermoscopy AI. Lesions that were incorrectly diagnosed by AI also provoked disagreement between independent pathologists, which highlights the importance of incorporating multiple diagnosticians when establishing ground truth for training datasets.

  1. Barata, C., Rotemberg, V., Codella, N. C. F., Tschandl, P., Rinner, C., Akay, B. N., Apalla, Z., Argenziano, G., Halpern, A., Lallas, A., Longo, C., Malvehy, J., Puig, S., Rosendahl, C., Soyer, H. P., Zalaudek, I., & Kittler, H. (2023). A reinforcement learning model for AI-based decision support in skin cancer. Nature Medicine, 29(8), 1941-1946. https://doi.org/10.1038/s41591-023-02475-5
  2. Barnhill, R. L., Elder, D. E., Piepkorn, M. W., Knezevich, S. R., Reisch, L. M., Eguchi, M. M., Bastian, B. C., Blokx, W., Bosenberg, M., Busam, K. J., Carr, R., Cochran, A., Cook, M. G., Duncan, L. M., Elenitsas, R., de la Fouchardière, A., Gerami, P., Johansson, I., Ko, J., . . . Elmore, J. G. (2023). Revision of the Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis Classification Schema for Melanocytic Lesions: A Consensus Statement. JAMA Network Open, 6(1), e2250613. https://doi.org/10.1001/jamanetworkopen.2022.50613
  3. Elmore, J. G., Barnhill, R. L., Elder, D. E., Longton, G. M., Pepe, M. S., Reisch, L. M., Carney, P. A., Titus, L. J., Nelson, H. D., Onega, T., Tosteson, A. N. A., Weinstock, M. A., Knezevich, S. R., & Piepkorn, M. W. (2017). Pathologists’ diagnosis of invasive melanoma and melanocytic proliferations: observer accuracy and reproducibility study. BMJ, 357, j2813. https://doi.org/10.1136/bmj.j2813
  4. Felmingham, C., MacNamara, S., Cranwell, W., Williams, N., Wada, M., Adler, N. R., Ge, Z., Sharfe, A., Bowling, A., Haskett, M., Wolfe, R., & Mar, V. (2022). Improving Skin cancer Management with ARTificial Intelligence (SMARTI): protocol for a preintervention/postintervention trial of an artificial intelligence system used as a diagnostic aid for skin cancer management in a specialist dermatology setting. BMJ Open, 12(1), e050203. https://doi.org/10.1136/bmjopen-2021-050203
  5. Felmingham, C., Pan, Y., Kok, Y., Kelly, J., Gin, D., Nguyen, J., Goh, M., Chamberlain, A., Oakley, A., Tucker, S., Berry, W., Darling, M., Jobson, D., Robinson, A., de Menezes, S., Wang, C., Willems, A., McLean, C., Cranwell, W., . . . Zallmann, M. (2023). Improving skin cancer management with ARTificial intelligence: A pre-post intervention trial of an artificial intelligence system used as a diagnostic aid for skin cancer management in a real-world specialist dermatology setting. Journal of the American Academy of Dermatology, 88(5), 1138-1142. https://doi.org/10.1016/j.jaad.2022.10.038
  6. Katz, I., O'Brien, B., Clark, S., Thompson, C. T., Schapiro, B., Azzi, A., Lilleyman, A., Boyle, T., Espartero, L. J. L., Yamada, M., & Prow, T. W. (2021). Assessment of a Diagnostic Classification System for Management of Lesions to Exclude Melanoma. JAMA Network Open, 4(12), e2134614. https://doi.org/10.1001/jamanetworkopen.2021.34614
  7. Marchetti, M. A., Cowen, E. A., Kurtansky, N. R., Weber, J., Dauscher, M., DeFazio, J., Deng, L., Dusza, S. W., Haliasos, H., Halpern, A. C., Hosein, S., Nazir, Z. H., Marghoob, A. A., Quigley, E. A., Salvador, T., & Rotemberg, V. M. (2023). Prospective validation of dermoscopy-based open-source artificial intelligence for melanoma diagnosis (PROVE-AI study). npj Digital Medicine, 6(1), 127. https://doi.org/10.1038/s41746-023-00872-1
  8. Rotemberg, V., Kurtansky, N., Betz-Stablein, B., Caffery, L., Chousakos, E., Codella, N., Combalia, M., Dusza, S., Guitera, P., Gutman, D., Halpern, A., Helba, B., Kittler, H., Kose, K., Langer, S., Lioprys, K., Malvehy, J., Musthaq, S., Nanda, J., . . . Soyer, H. P. (2021). A patient-centric dataset of images and metadata for identifying melanomas using clinical context. Scientific Data, 8(1), 34. https://doi.org/10.1038/s41597-021-00815-z
  9. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3), 211-252. https://doi.org/10.1007/s11263-015-0816-y