Rationale and Objectives: Multiple diagnostic tests are often available for a disease. Their diagnostic accuracy may depend on the characteristics of the subjects being tested. The investigators propose a new tree-structured data-mining method that identifies subgroups and, for each subgroup, the diagnostic test that achieves the maximum area under the receiver-operating characteristic curve. Materials and Methods: The Osteoporosis and Ultrasound Study is a prospectively designed, population-based European multicenter observational study to evaluate state-of-the-art diagnostic methods for assessing osteoporosis. A total of 2837 women underwent dual-energy x-ray absorptiometry (DXA) and quantitative ultrasound (QUS). Prevalent vertebral fractures were determined by a centralized radiology laboratory on the basis of radiographs. The data-mining algorithm includes three steps: defining the criteria for node splitting and selection of the best diagnostic test on the basis of the area under the curve, using a random forest to estimate the probability of DXA being the preferred diagnostic method for each participant, and building a single regression tree to describe subgroups for which either DXA or QUS is the more accurate test or for which the two tests are equivalent. Results: For participants with weights ≤54.5 kg, QUS had a higher area under the curve in identifying prevalent vertebral fracture. For participants whose weights were >58.5 kg and whose heights were ≤167.5 cm, DXA was better, and for the remaining participants, DXA and QUS had comparable accuracy and could be used interchangeably. Conclusions: The proposed tree-structured subgroup analysis successfully defines subgroups and their best diagnostic tests. The method can be used to develop optimal diagnostic strategies in personalized medicine.
- Classification and regression tree
- Personalized medicine
- Random forest
- Receiver-operating characteristic curve
- Subgroup analysis
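
The core comparison behind the node-level test selection described in Materials and Methods, and the subgroup rule reported in Results, can be illustrated with a minimal Python sketch. This is not the investigators' implementation: the function names `auc`, `preferred_test`, and `recommend`, and the `margin` parameter used to declare two tests equivalent, are hypothetical and chosen only for illustration. The random-forest and regression-tree steps are omitted. Scores are assumed to be oriented so that higher values indicate greater likelihood of fracture (for example, a negated bone density measurement).

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Empirical AUC as the Mann-Whitney probability that a case
    (label 1) scores higher than a non-case (label 0); ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def preferred_test(test_a: Sequence[float], test_b: Sequence[float],
                   labels: Sequence[int], margin: float = 0.0) -> str:
    """Compare two tests within one subgroup by AUC.

    `margin` is an assumed tolerance (not from the source): AUC
    differences no larger than it are reported as 'equivalent'.
    """
    diff = auc(test_a, labels) - auc(test_b, labels)
    if diff > margin:
        return "A"
    if diff < -margin:
        return "B"
    return "equivalent"


def recommend(weight_kg: float, height_cm: float) -> str:
    """Subgroup rule using the cut points stated in Results."""
    if weight_kg <= 54.5:
        return "QUS"
    if weight_kg > 58.5 and height_cm <= 167.5:
        return "DXA"
    return "either"  # remaining participants: comparable accuracy
```

For example, `recommend(50, 160)` falls in the low-weight subgroup and returns `"QUS"`, while `recommend(70, 175)` falls in the remaining group and returns `"either"`. A raw AUC difference is only a stand-in here; a practical analysis would also account for sampling variability before declaring one test better.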