Objective: A deep learning-based classification system (DLCS) that uses structural brain magnetic resonance imaging (MRI) to diagnose Alzheimer’s disease (AD) was developed in a recent study. Here, we evaluate its performance in a single-center, case-control clinical trial.

Methods: We retrospectively collected T1-weighted brain MRI scans from subjects whose amyloid-beta (Aβ) positivity had been determined by an 18F-florbetaben positron emission tomography scan. The dataset included 188 Aβ-positive patients with mild cognitive impairment or dementia due to AD and 162 Aβ-negative controls with normal cognition. We calculated the sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) of the DLCS in distinguishing Aβ-positive AD patients from Aβ-negative controls.

Results: The DLCS showed excellent performance, with sensitivity, specificity, positive predictive value, negative predictive value, and AUC of 85.6% (95% confidence interval [CI], 79.8–90.0), 90.1% (95% CI, 84.5–94.2), 91.0% (95% CI, 86.3–94.1), 84.4% (95% CI, 79.2–88.5), and 0.937 (95% CI, 0.911–0.963), respectively.

Conclusion: The DLCS shows promise for clinical settings, where it could be applied routinely to MRI scans, regardless of the original purpose of the scan, to improve the early detection of AD.
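As an illustration of how the four threshold-based metrics relate to one another, the following minimal sketch computes them from a 2×2 confusion matrix. The abstract does not report the cell counts; the values used here (161 true positives, 27 false negatives, 146 true negatives, 16 false positives) are an assumption, back-calculated from the reported percentages and the group sizes of 188 and 162. The function name `diagnostic_metrics` is hypothetical, and the sketch covers neither the confidence intervals nor the AUC, which would require the per-subject classifier scores.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of AD patients classified as AD
        "specificity": tn / (tn + fp),  # share of controls classified as controls
        "ppv": tp / (tp + fp),          # share of AD calls that are true AD
        "npv": tn / (tn + fn),          # share of control calls that are true controls
    }


# Assumed counts, inferred from the reported percentages (not given in the abstract):
# 188 Abeta-positive patients = 161 TP + 27 FN; 162 controls = 146 TN + 16 FP.
metrics = diagnostic_metrics(tp=161, fn=27, tn=146, fp=16)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these inferred counts, the four values round to the reported 85.6%, 90.1%, 91.0%, and 84.4%. Note that the predictive values depend on the proportion of patients in the sample (188 of 350 here), so they would differ in a population with a different prevalence of AD.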