Abstract

Accurate differentiation among asthma, bronchiectasis, and chronic obstructive pulmonary disease (COPD) remains a critical challenge due to overlapping clinical symptoms and the limitations of conventional diagnostic tools. This study establishes a transparent, reproducible baseline for classifying these three diseases from gas chromatography-mass spectrometry (GC-MS) data derived from exhaled breath. Using a publicly available clinical dataset comprising 121 breath samples and 76 shared volatile organic compounds (VOCs), we evaluated seven supervised classifiers under nested cross-validation. Among the classifiers, XGBoost achieved the highest performance, with a mean accuracy of 95.83% and a macro-averaged AUC of 0.998. To enhance clinical interpretability, we applied SHapley Additive exPlanations (SHAP) to identify the most influential VOCs for each disease class. This analysis revealed several candidate biomarkers with disease-specific or cross-disease relevance, such as 2-pentylfuran and hexadecane. This integrative approach demonstrates the potential of breathomics combined with explainable AI as a scalable, non-invasive tool for respiratory disease classification and biomarker discovery. By providing this reproducible baseline, our work offers a reference point for future methodological advances and clinical validation using breathomics data.
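For readers unfamiliar with the evaluation protocol named above, the following is a minimal sketch of nested cross-validation. It is not the pipeline used in this study: the data are synthetic (three Gaussian classes standing in for the three diseases), a simple k-nearest-neighbour classifier replaces the seven classifiers evaluated here, and all function names (`stratified_folds`, `knn_predict`, `nested_cv`, `make_toy_data`) and fold counts are illustrative assumptions. The point it shows is structural: the hyperparameter is selected in an inner loop that only sees the development samples, so each outer-fold score is measured on samples that played no role in model selection.

```python
import random
from collections import Counter


def stratified_folds(labels, n_folds, rng):
    """Split positions 0..len(labels)-1 into folds, preserving class proportions."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(by_class):
        idxs = by_class[cls]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % n_folds].append(idx)
    return folds


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(X, y, train_idx, test_idx, k):
    """Fraction of test samples classified correctly by k-NN fit on train_idx."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def nested_cv(X, y, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    """Return one accuracy per outer fold; k is tuned inside each outer split."""
    rng = random.Random(seed)
    scores = []
    for held_out in stratified_folds(y, outer, rng):
        held = set(held_out)
        dev_idx = [i for i in range(len(y)) if i not in held]
        # Inner loop: choose k using only the development samples.
        inner_folds = stratified_folds([y[i] for i in dev_idx], inner, rng)

        def inner_score(k):
            total = 0.0
            for fold in inner_folds:
                val = [dev_idx[j] for j in fold]
                val_set = set(val)
                train = [i for i in dev_idx if i not in val_set]
                total += accuracy(X, y, train, val, k)
            return total / inner

        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the selected k on the untouched held-out fold.
        scores.append(accuracy(X, y, dev_idx, held_out, best_k))
    return scores


def make_toy_data(n_per_class=40, n_features=5, seed=1):
    """Synthetic stand-in for a samples-by-VOCs matrix with three classes."""
    rng = random.Random(seed)
    X, y = [], []
    for cls in range(3):
        for _ in range(n_per_class):
            X.append([rng.gauss(cls * 2.0, 1.0) for _ in range(n_features)])
            y.append(cls)
    return X, y


X, y = make_toy_data()
scores = nested_cv(X, y)
print("outer-fold accuracies:", [round(s, 3) for s in scores])
print("mean accuracy:", round(sum(scores) / len(scores), 3))
```

The mean of the outer-fold scores is the quantity that corresponds to a reported "mean accuracy" under this protocol; the numbers printed by this sketch come from toy data and have no bearing on the results stated above.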