The focus score (FS) in labial minor salivary gland biopsies is, together with anti-SSA positivity, one of the two most important criteria for the diagnosis of primary Sjögren’s syndrome (pSS). To reduce diagnostic error rates, deep learning algorithms based on artificial neural networks could support pathologists in the future.
According to the ACR/EULAR criteria for pSS, the focus score (FS) is one of three classification criteria, but assessing it requires expertise that is not always available in practice: in 53% of cases, the diagnosis is revised when the biopsy is reassessed by experts.
Machine learning comprises families of algorithms that can take in large amounts of data and use them, for example, to make predictions. Deep learning is a subset of this field that focuses on a single family of algorithms known as “deep neural networks”, explained Louis Basseto, Scienta Lab, Research Department, Paris, by way of introduction [1].
Deep learning is already used in pathology, for example in breast cancer, where algorithms detect metastases in tissue samples from lymph nodes very accurately and with better performance than pathologists. “Our group has also shown that machine learning can be used in rheumatoid arthritis to predict response to methotrexate and TNF inhibitors.” The researchers have now set themselves the task of applying deep learning to Sjögren’s syndrome to find out whether it can help with focus score classification and pSS diagnosis.
Deep learning based on salivary gland biopsies
To this end, they developed two deep learning networks that use digitized labial salivary gland biopsies to predict the focus score (FS ≥1 or FS <1) and the diagnosis of primary Sjögren’s syndrome based on histology alone (pSS+ or pSS-). The study included 325 patients (145 from the University of Paris-Saclay, Bicêtre Hospital, 71 from Queen Mary University London and 109 from the University of Birmingham), using biopsies taken in routine clinical practice. The participants were divided into three groups:
- pSS- and FS <1 (32%, sicca symptoms)
- pSS+ and FS ≥1 (47%)
- pSS+ and FS <1 (21%)
All FS were previously confirmed by pathologists and the pSS diagnoses by experts.
One or more images from the biopsy are used to make a prediction. Each image is divided into tiles and the algorithm assigns a risk score to each of these tiles independently. All this information is then combined to make a prediction of either the focus score or the diagnosis, depending on which algorithm is used.
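The tile-based procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tile size, the `score_tile` function (a stand-in for a trained network), the aggregation rule and the 0.5 threshold are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A tile is described by its pixel box: (left, top, right, bottom).
Tile = Tuple[int, int, int, int]


def tile_grid(width: int, height: int, tile_size: int) -> List[Tile]:
    """Cover a width x height image with non-overlapping tiles.

    Tiles at the right and bottom edges may be smaller than tile_size.
    """
    tiles = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            tiles.append((left, top, right, bottom))
    return tiles


def predict_slide(
    tiles: Sequence[Tile],
    score_tile: Callable[[Tile], float],
    aggregate: Callable[[Sequence[float]], float] = max,
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Score every tile independently, then combine the scores.

    score_tile is a hypothetical stand-in for the trained network.
    aggregate (max or mean here) is an assumed pooling rule.
    """
    if not tiles:
        raise ValueError("at least one tile is required")
    scores = [score_tile(tile) for tile in tiles]
    slide_score = aggregate(scores)
    return slide_score, slide_score >= threshold


# Toy example: one tile looks suspicious, all others look normal.
def toy_scorer(tile: Tile) -> float:
    return 0.9 if tile == (256, 256, 512, 512) else 0.1


tiles = tile_grid(1000, 600, 256)
print(predict_slide(tiles, toy_scorer, aggregate=max))
print(predict_slide(tiles, toy_scorer, aggregate=mean))
```

The choice of aggregation matters: with max pooling a single high-risk tile is enough for a positive call, whereas mean pooling dilutes it across all tiles. Which rule the study used is not stated in the report.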
“We use two different data sets,” explained Basseto: a training set (70% of patients) and a validation set (30% of patients). The two sets contain the same proportion of positive and negative classes (pSS+/pSS- or FS ≥1/FS <1), and the different centers from which the participants were recruited are mixed across training and validation. These two tasks are referred to as “semi-supervised learning”. “This is the key to understanding the work: semi-supervised learning means that we do give the model information during training, for example we tell it when a patient has an FS ≥1. However, we do not say which parts of the image would lead a pathologist to believe this. The model determines this from the data itself.”
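A patient-level 70/30 split that keeps class proportions equal in both sets can be sketched as below. This is an illustrative sketch under assumptions: the record fields (`id`, `center`, `label`) are hypothetical, and stratifying by center as well as by class is one possible way to keep the centers mixed, not necessarily what the authors did.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Patient = Dict[str, object]  # hypothetical record: {"id", "center", "label"}


def stratified_split(
    patients: List[Patient],
    train_fraction: float = 0.7,
    seed: int = 0,
) -> Tuple[List[Patient], List[Patient]]:
    """Split patients into training and validation sets.

    Patients are grouped by (center, label) and each group is split
    separately, so both sets keep similar class and center proportions.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["center"], patient["label"])].append(patient)

    train, validation = [], []
    for key in sorted(strata):
        group = list(strata[key])
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        validation.extend(group[cut:])
    return train, validation


# Toy cohort: two centers, two classes, ten patients per combination.
cohort = [
    {"id": f"{center}-{label}-{i}", "center": center, "label": label}
    for center in ("A", "B")
    for label in (0, 1)
    for i in range(10)
]
train_set, validation_set = stratified_split(cohort)
print(len(train_set), len(validation_set))
```

Splitting by patient rather than by image or tile avoids the same patient appearing in both sets, which would inflate validation performance.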
Potential new biomarkers
The performance of the algorithms was measured using the area under the receiver operating characteristic curve (AUROC). For the focus score (FS ≥1 vs. FS <1), the algorithm achieved an AUROC of 0.88: 77% of positive predictions were in patients who actually had a focus score ≥1 (positive predictive value), and 83% of negative predictions correctly indicated an FS <1 (negative predictive value). For the diagnosis prediction, the AUROC was 0.84: 83% of patients with a positive prediction were actually Sjögren’s positive (pSS+) and 67% of patients with a negative prediction were actually Sjögren’s negative (pSS-) (Table 1).
“These predictions were made solely on the basis of the images. We didn’t tell the algorithm where to look to make its decision – but after training, we can review the algorithms to see on what basis they made their predictions.” The researchers found that the focus score algorithm identified lymphocyte foci to make its prediction (Fig. 1). This provides explainability and enables the pathologist to visually confirm the prediction. This is reassuring for the FS, says Basseto, “but what is reassuring for the FS is very exciting for the diagnosis, as it could lead to the potential identification of new histological biomarkers that are only of interest for the (pSS+ and FS <1) population”. This aspect is of particular interest for the future work of the research group.
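For readers less familiar with these metrics, the sketch below shows how AUROC and the positive and negative predictive values are computed from labels and predicted scores. The data are made-up toy values, not study data, and the 0.5 decision threshold is an assumption.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores higher than a
    random negative case (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def ppv_npv(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, float]:
    """Positive and negative predictive value at a given threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy values for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(toy_labels, toy_scores))
print(ppv_npv(toy_labels, toy_scores))
```

AUROC is independent of any threshold, whereas the predictive values depend both on the chosen threshold and on how common the positive class is in the cohort.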
Take-home messages
- Deep learning accurately predicts the focus score and diagnosis of primary Sjögren’s syndrome based on labial salivary gland biopsies.
- Deep learning could potentially reduce the diagnostic error rate by a factor of about 2.5: the error rate with deep learning is 19.7%, compared with 53% in non-expert centers.
Perspectives:
- Further clinical assessments are required to validate the algorithms in real clinical practice.
- Additional validation on a larger cohort.
- Ongoing work aims to identify new histologic biomarkers associated with the diagnosis of pSS.
Congress: EULAR 2023
Source:
- Basseto L: Presentation “Deep Learning Accurately Predicts Focus Score and Diagnosis of Primary Sjögren Syndrome using Labial Salivary Gland Biopsies”; EULAR 2023, Milan, June 2, 2023 (online).
- Basseto L: Scientific Abstract OP0232, EULAR 2023; doi: 10.1136/annrheumdis-2023-eular.418.
InFo RHEUMATOLOGIE 2023: 5(2): 20–21