A test of performance of breast MRI interpretation in a multicentre screening study.

R. M. Warren, C. Hayes, L. J. Pointon, R. J. Hoff, Fiona Jane Gilbert, A. R. Padhani, C. Rubin, G. Kaplan, K. Raza, L. Wilkinson, M. Hall-Craggs, P. Kessar, S. Rankin, A. Dixon, J. Walsh, L. W. Turnbull, P. Britton, R. Sinnatamby, D. Easton, D. Thompson, S. R. Lakhani, M. O. Leach, Collaborators UK MRC study MRI br

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)


Objectives: The aim of this study was to assess the consistency and performance of radiologists interpreting breast magnetic resonance imaging (MRI) examinations.

Materials and Methods: Two test sets of eight cases comprising cancers, benign disease, technical problems and parenchymal enhancement were prepared from two manufacturers' equipment (X and Y) and reported by 15 radiologists using the recording form and scoring system of the UK MRI breast screening study [MAgnetic Resonance Imaging in Breast Screening (MARIBS)]. Variations in assessments of morphology, kinetic scores and diagnosis were measured by assessing intraobserver and interobserver variability and agreement. The sensitivity and specificity of reporting performance were determined using receiver operating characteristic (ROC) curve analysis.

Results: Intraobserver variation was seen in 13 (27.7%) of 47 of the radiologists' conclusions (four technical and seven pathological differences). Substantial interobserver variation was observed in the scores recorded for morphology, pattern of enhancement, quantification of enhancement and washout pattern. The overall sensitivity of breast MRI was high [88.6%, 95% confidence interval (CI) 77.4-94.7%], combined with a specificity of 69.2% (95% CI 60.5-76.7%). The sensitivities were similar for the two test sets (P=.3), but the specificity was significantly higher for the Manufacturer X dataset (P<.001). ROC curve analysis gave an area under the curve of 0.85 (95% CI 0.79-0.92).

Conclusions: Substantial variation in all elements of the scoring system and in the overall diagnostic conclusions was observed between radiologists participating in MARIBS. High overall sensitivity was achieved with moderate specificity. Manufacturer-related differences in specificities possibly occurred because the numerical thresholds set for the scoring system were not optimised for both equipment manufacturers. Scoring systems developed on one equipment software may not be transferable to other manufacturers. © 2006 Elsevier Inc. All rights reserved.

Original language: English
Pages (from-to): 917-929
Number of pages: 12
Journal: Magnetic Resonance Imaging
Publication status: Published - 2006


  • breast MRI
  • quality control
  • reporting performance
  • RISK

