IEEE J Biomed Health Inform. 2025 May 2;PP. doi: 10.1109/JBHI.2025.3566767. Online ahead of print.

ABSTRACT

Data scarcity is a common and serious problem in depression detection, often leading to overfitting and bias that degrade the performance of depression detectors. We propose a counterfactual augmentation (CF-aug) framework that generates latent features for speech-based depression detection under data-scarce conditions. The generation method explores how changes in features affect the outcome. To this end, we add a counterfactual layer to a deep network to transform the representation of the original data into that of its opposite class, while a group-wise vector quantization module helps the model explore how changes in the vectors (or entries) sampled from codebooks affect the outcome. Experimental results demonstrate that CF-aug can alleviate the overfitting and bias problems caused by data scarcity. Our CF-aug framework achieves competitive performance compared to state-of-the-art methods on two depression datasets. We also demonstrate the potential of CF-aug in other domains and modalities for medical diagnosis under data-scarce settings.
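
The abstract does not specify how the group-wise vector quantization module is implemented, so the following is only a minimal sketch of the general idea under stated assumptions: a latent vector is split into equal-width groups, each group is replaced by its nearest entry in that group's codebook, and a single entry can then be swapped to probe how the change affects a downstream prediction. The function names `group_quantize` and `swap_entry`, the squared-Euclidean nearest-entry rule, and the toy codebooks are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Vector]  # K entries, each as wide as one group


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_quantize(
    x: Sequence[float], codebooks: Sequence[Codebook]
) -> Tuple[List[int], Vector]:
    """Quantize x group by group (assumed scheme, not the paper's exact one).

    x is split into len(codebooks) contiguous groups of equal width; each
    group is replaced by the nearest entry of its own codebook.
    Returns the chosen entry index per group and the quantized vector.
    """
    groups = len(codebooks)
    if groups == 0 or len(x) % groups != 0:
        raise ValueError("len(x) must be a positive multiple of the group count")
    width = len(x) // groups
    indices: List[int] = []
    quantized: Vector = []
    for g, codebook in enumerate(codebooks):
        chunk = x[g * width:(g + 1) * width]
        best = min(range(len(codebook)), key=lambda k: _sq_dist(chunk, codebook[k]))
        indices.append(best)
        quantized.extend(codebook[best])
    return indices, quantized


def swap_entry(
    indices: Sequence[int], codebooks: Sequence[Codebook], group: int, new_index: int
) -> Vector:
    """Rebuild the quantized vector with one group's codebook entry replaced."""
    edited = list(indices)
    edited[group] = new_index
    rebuilt: Vector = []
    for g, k in enumerate(edited):
        rebuilt.extend(codebooks[g][k])
    return rebuilt


# Toy example: a 4-dimensional latent vector, 2 groups, 2 entries per codebook.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
toy_indices, toy_quantized = group_quantize([0.9, 1.2, 0.1, -0.1], toy_codebooks)
toy_variant = swap_entry(toy_indices, toy_codebooks, group=1, new_index=1)
```

In a setup like this, feeding `toy_quantized` and `toy_variant` to the same classifier and comparing the two outputs would be one simple way to observe how a single codebook entry influences the outcome.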

PMID:40315097 | DOI:10.1109/JBHI.2025.3566767