How AI powers breath analysis at RespiQ

Human breath contains over 1,000 volatile organic compounds (VOCs). Some combinations of these VOCs reflect basic metabolic processes and are common to most people. Other combinations act as signatures of disease onset and can therefore be treated as disease biomarkers [1,2]. VOCs make up less than 1% of exhaled breath; the remaining 99% consists of nitrogen, oxygen, carbon dioxide, and water vapour. Detecting disease biomarkers therefore means finding signals in the parts-per-billion range, many orders of magnitude weaker than the background: a needle-in-a-haystack problem that has constrained breath analysis for decades.
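To make the scale concrete, the short sketch below converts the concentrations involved into orders of magnitude. The 1 ppb biomarker level is an illustrative assumption, not a measured value, and the helper name is hypothetical.

```python
import math

PPM = 1e-6  # one part per million, expressed as a fraction
PPB = 1e-9  # one part per billion, expressed as a fraction


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Return log10 of the ratio between two concentrations given as fractions."""
    return math.log10(larger / smaller)


voc_fraction = 0.01    # upper bound on the total VOC share of breath (<1%)
bulk_fraction = 0.99   # nitrogen, oxygen, carbon dioxide, water vapour
biomarker = 1 * PPB    # assumed level of a single biomarker (illustrative)

# Gap between ppm-level and ppb-level detection: about 3 orders of magnitude.
print(orders_of_magnitude(PPM, PPB))
# Gap between the whole VOC fraction and one ppb-level biomarker: about 7.
print(orders_of_magnitude(voc_fraction, biomarker))
# Gap between the bulk gases and one ppb-level biomarker: close to 9.
print(orders_of_magnitude(bulk_fraction, biomarker))
```

Under these assumptions, moving from ppm to ppb sensitivity closes a gap of three orders of magnitude, while the gap to the bulk gases is larger still.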

At RespiQ, we are developing an AI-driven approach that pushes detection sensitivity from parts-per-million (PPM) to parts-per-billion (PPB) levels, unlocking the potential for early disease detection through breath analysis. 


Breath composition: Nitrogen (74%), Oxygen (15%), Carbon Dioxide (4.5%), Water (5%), and Argon (0.9%) dominate exhaled breath, while disease biomarkers hide in the <1% VOC fraction 


The technical challenge 

At RespiQ, we use plasma emission spectroscopy to analyse breath composition.

Traditional analytical chemistry can identify known compounds at characteristic wavelengths. These methods struggle with breath analysis, however, because target biomarkers are typically present at PPB levels, three orders of magnitude below conventional PPM-level detection limits. VOCs also often have overlapping emission bands, so hundreds of coupled wavelengths must be analysed simultaneously rather than one wavelength at a time. Further complicating matters, everyone's baseline breath is different: age, diet, genetics, and medication all shift the chemical landscape, so what looks like a disease signature in one person may be normal variation in another.

These challenges have led to reproducibility problems across the field [3,4]: biomarkers identified in discovery cohorts frequently fail to replicate in validation studies.

 

How we do AI at RespiQ 

Our approach centres on collaboration between experimental scientists and data scientists: we integrate machine learning throughout the measurement and analysis process rather than using it as a black box after data collection. 

We use machine learning to analyse the complex patterns in each data sample. While traditional methods look for known compounds at single characteristic wavelengths, our ML models can spot subtle combinations of signals across hundreds of coupled wavelengths simultaneously, enabling us to detect biomarkers at the PPB levels that conventional approaches miss. 
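The difference between a single-wavelength reading and a multi-wavelength analysis can be illustrated with a toy example. The sketch below is not RespiQ's model: it uses classical least-squares unmixing on a noise-free synthetic spectrum as a simple stand-in, and the band positions, widths, and amounts are all made-up assumptions.

```python
import math


def gaussian_band(wavelengths, centre, width):
    """Unit-height Gaussian emission band sampled at the given wavelengths."""
    return [math.exp(-0.5 * ((w - centre) / width) ** 2) for w in wavelengths]


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def unmix_two(spectrum, band_a, band_b):
    """Least-squares amounts (a, b) so that a*band_a + b*band_b fits spectrum."""
    aa, ab, bb = dot(band_a, band_a), dot(band_a, band_b), dot(band_b, band_b)
    sa, sb = dot(spectrum, band_a), dot(spectrum, band_b)
    det = aa * bb - ab * ab  # non-zero as long as the two bands differ in shape
    return (sa * bb - sb * ab) / det, (sb * aa - sa * ab) / det


# Hypothetical wavelength grid and two strongly overlapping bands.
wavelengths = [300 + i for i in range(201)]  # 300 to 500 nm in 1 nm steps
interferent = gaussian_band(wavelengths, centre=400, width=15)
biomarker = gaussian_band(wavelengths, centre=410, width=15)

# Synthetic mixture: a large interferent signal hiding a small biomarker signal.
true_interferent, true_biomarker = 100.0, 0.5
spectrum = [true_interferent * i + true_biomarker * b
            for i, b in zip(interferent, biomarker)]

# Reading only the biomarker's peak wavelength is dominated by the interferent.
single_reading = spectrum[wavelengths.index(410)]

# Fitting all wavelengths together separates the two contributions.
est_interferent, est_biomarker = unmix_two(spectrum, interferent, biomarker)

print(single_reading)
print(est_interferent, est_biomarker)
```

In this idealised setting, the single-wavelength reading is far larger than the biomarker's true contribution, whereas the joint fit recovers both amounts. Real spectra add noise, unknown compounds, and person-to-person variation, which is where learned models come in.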

 

Continuous collaboration between the experimental scientists and the data scientists drives iterative improvements in both device performance and model accuracy.


Making this work requires close integration between disciplines, with spectroscopy experts and data scientists working in continuous feedback loops rather than sequential handoffs. When models identify unexpected patterns, experimental scientists use that insight to design validation tests. Conversely, when lab results uncover new behaviour, data scientists refine the model architecture and preprocessing accordingly. Each iteration improves both the experimental protocols and the computational methods. This iterative process has been essential in lowering our limit of detection for VOCs, bringing us closer to finding the needle in the haystack.
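For readers unfamiliar with the term, a limit of detection is often estimated as a multiple of the blank-signal noise divided by the calibration slope. The sketch below uses that common 3-sigma convention, which is not necessarily the one used at RespiQ, and every number in it is a hypothetical placeholder.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def limit_of_detection(blank_signals, slope, k=3.0):
    """Concentration whose signal equals k standard deviations of the blank."""
    return k * statistics.stdev(blank_signals) / slope


# Hypothetical calibration data: known concentrations and measured signals.
concentrations_ppb = [0, 50, 100, 200, 400]
signals = [1.0, 6.1, 10.9, 21.2, 40.8]

# Hypothetical repeated measurements of a blank (no analyte present).
blanks = [0.9, 1.1, 1.0, 1.2, 0.8]

slope, intercept = fit_line(concentrations_ppb, signals)
lod_ppb = limit_of_detection(blanks, slope)
print(f"slope = {slope:.5f} signal units per ppb, LoD = {lod_ppb:.2f} ppb")
```

The formula shows the two levers available: reduce the noise on the blank (the numerator) or increase the sensitivity of the response (the slope).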

 

Our vision: Continuous health monitoring 

Traditional medicine operates episodically, with symptom onset triggering diagnosis and intervention. Breath analysis powered by AI could enable non-invasive, frequent, longitudinal monitoring that detects metabolic shifts before clinical symptoms appear. Machine learning models trained on breath data could support early warning systems for COPD exacerbations, treatment adjustments tailored to individual metabolic responses, and daily health monitoring in home settings.
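One way to picture longitudinal monitoring is to compare each new reading with the same person's own history rather than with a population threshold. The sketch below is a deliberately simple illustration of that idea, not a description of RespiQ's models: the function name, the seven-day baseline window, the 3-sigma threshold, and the readings are all assumptions.

```python
import statistics


def flag_shifts(readings, baseline_days=7, threshold=3.0):
    """Return indices of readings that deviate from the personal baseline.

    The first `baseline_days` readings define the baseline mean and spread;
    later readings are flagged when they differ from that mean by more than
    `threshold` standard deviations.
    """
    baseline = readings[:baseline_days]
    mean = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    return [
        i
        for i, value in enumerate(readings[baseline_days:], start=baseline_days)
        if abs(value - mean) > threshold * spread
    ]


# Hypothetical daily signal levels (arbitrary units) for two people.
person_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 10.1, 12.5, 12.8]
person_b = [12.4, 12.6, 12.5, 12.3, 12.7, 12.5, 12.6, 12.5, 12.4, 12.6]

print(flag_shifts(person_a))  # [8, 9]: a shift away from person A's baseline
print(flag_shifts(person_b))  # []: the same level is normal for person B
```

A level of about 12.5 is flagged for one person and ignored for the other, which is the practical benefit of tracking an individual baseline over time.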

As we advance toward clinical deployment, we're focused on building technology that healthcare professionals can trust. This means developing models that explain their predictions in terms clinicians can understand, creating transparent AI systems that meet regulatory requirements, and conducting rigorous testing to demonstrate real-world effectiveness across diverse patient populations. 

We understand that technical innovation alone isn't enough: AI in healthcare requires transparency, validation, and trust.

 

RespiQ is developing AI-powered breath analysis technology for early disease detection and continuous health monitoring. Our approach combines advanced spectroscopy with machine learning to achieve parts-per-billion sensitivity in complex biological samples.

 

References

  1. Sharma, A., Kumar, R. & Varadwaj, P. Smelling the Disease: Diagnostic Potential of Breath Analysis. Mol Diagn Ther 27, 321–347 (2023). doi: 10.1007/s40291-023-00640-7

  2. Cao, W. & Duan, Y. Current status of methods and techniques for breath analysis. Crit Rev Anal Chem 37, 3–13 (2007). doi: 10.1080/10408340600976499

  3. Smith, D. & Španěl, P. Pitfalls in the analysis of volatile breath biomarkers: suggested solutions and SIFT–MS quantification of single metabolites. J Breath Res 9, 022001 (2015). doi: 10.1088/1752-7155/9/2/022001

  4. Smith, D., Španěl, P., Herbig, J. & Beauchamp, J. Mass spectrometry for real-time quantitative breath analysis. J Breath Res 8, 027101 (2014). doi: 10.1088/1752-7155/8/2/027101


Written by Chandan Sreedhara, 05/12/2025