Conference Agenda
Session
121: From Behavior to Biology: Contactless Digital Biomarkers in Stress and Mental Health Research
Session Abstract
Digital biomarkers are rapidly transforming biopsychological research by enabling the quantification of stress- and emotion-related processes from digital data streams such as video and audio, capturing behavioral and expressive patterns including movement, facial dynamics, and vocal characteristics. Despite substantial methodological advances, conceptual integration, validation standards, and accessible analytical infrastructures remain ongoing challenges. This symposium presents methodological advances and empirical applications in stress and mental health research. First, Jost Blasberg will open the symposium with findings from two studies examining facial dynamics during acute psychosocial stress. Frame-wise facial activity time series were extracted from video recordings and analyzed for their potential to serve as temporally sensitive digital markers of stress responses. Second, Magdalena Wekenborg will introduce a multimodal digital biomarker framework based on vision-language models to derive psychophysiological stress- and emotion-related signatures from videos and real-world photographs in ecological momentary settings. The presentation will highlight both the methodological potential and the current limitations of large-scale multimodal models in ambulatory assessment. Third, Lara Puhlmann will address inter- and intra-individual variability in frame-wise facial and audio features from a fully structured psychological interview. By comparing feature expression across question-specific responses, she will discuss methodological challenges and opportunities for ambulatory and context-sensitive assessment. Finally, Robert Richer will present openViDA, an open, modular Python-based framework for unified extraction, synchronization, and analysis of multimodal digital biomarker data. Designed to lower technical barriers while maintaining methodological rigor, openViDA supports reproducible workflows across modalities and study designs. Different application examples will illustrate the integrative potential of openViDA.
Presentations
Facial Dynamics during Psychosocial Stress

Universitätsklinikum Jena, Germany

Psychosocial stress detection via cost-effective means could prove a useful tool for diagnosing long-term stress experience and, in turn, preventing stress-related health disorders. Facial expressions are both part of and a relay for complex affective states such as acute stress. To examine this, we investigated facial dynamics during a standardized laboratory stressor, the Trier Social Stress Test (TSST), in two independent samples. Participants' facial activity was extracted from video recordings frame by frame across multiple action units. The resulting facial time series were reduced using principal component analysis (PCA) and subsequently modeled using dynamic structural equation modeling (DSEM) to capture autoregressive and innovation processes. Although overall psychological and physiological stress markers were not associated with facial activity in either sample, participants exposed to stress exhibited clear differences in facial dynamics compared to a stress-free control group. Contrary to expectations, facial activity during psychosocial stress was characterized by tension and rigidity rather than overall increased activity. Given the growing evidence on the severity of stress-associated mental and bodily diseases, early stress detection is an important avenue to pursue.
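A minimal sketch of the dimensionality-reduction step described above, assuming frame-wise action unit (AU) intensities have already been exported to a CSV by an AU-extraction tool (file and column names here are hypothetical, not the authors' actual pipeline). The DSEM stage itself is typically fit in dedicated software such as Mplus and is only indicated in the comments:

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Frame-wise AU intensities: one row per video frame, one column per AU.
aus = pd.read_csv("participant_01_aus.csv")      # hypothetical export
au_cols = [c for c in aus.columns if c.startswith("AU")]

# Standardize so PCA is not dominated by high-variance action units.
z = StandardScaler().fit_transform(aus[au_cols])

# Retain components explaining 90% of variance (threshold is illustrative).
pca = PCA(n_components=0.90)
scores = pca.fit_transform(z)
print(f"{pca.n_components_} components explain "
      f"{pca.explained_variance_ratio_.sum():.2f} of the variance")

# 'scores' is the component time series (n_frames x n_components) that
# would enter a DSEM to estimate autoregressive and innovation-variance
# parameters, e.g., in Mplus or via Bayesian multilevel AR models.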
Stress in Sight? Decoding Affective Signatures with Vision-Language Models

1 Else Kroener Fresenius Center for Digital Health, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany; 2 Health Psychology, University of Graz, Graz, Austria; 3 Department of Medicine I, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany; 4 Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany

Vision-language models (VLMs) offer a novel pathway toward quantifying behavioral and environmental information from visual data for stress and mental health research. This presentation highlights the promises and limitations of VLM-based approaches for decoding affective signatures across two lines of research. In the first study, we coupled ecological momentary assessment with VLM-based scene analysis to quantify visual environmental features in 2,674 participant-generated photographs. VLM-derived greenness closely tracked self-rated greenness and robustly predicted momentary affect and chronic stress, matching established benchmarks. To extend these analyses, we developed a semi-autonomous large language model (LLM)-based system that mined over seven million scientific publications to extract nearly 1,000 environmental features empirically linked to mental health. This literature-derived feature set allowed us to evaluate the scalability and generalizability of VLM-based scene analysis. Applied to real-world imagery, up to 33% of VLM-extracted context ratings significantly predicted stress, affect, and other mental-health-relevant outcomes, demonstrating reliable detection of theoretically expected associations and establishing a scalable framework for first-person "visual exposomics". In the second line of research, we applied VLMs to video recordings from the Trier Social Stress Test. Stress classification performed poorly, raising questions about the lack of a unitary stress ground truth, the divergence between psychological and biological stress responses, and the current limits of VLMs in capturing subtle affective cues. These findings show that VLMs can reliably decode affective signatures from environmental context yet face challenges in targeting fine-grained psychophysiological processes. Integrating physiological measures represents a critical next step toward validated multimodal digital biomarker frameworks.

Digital Biomarkers of Subclinical Mental Health Symptoms: Variability and Question Effects in a Structured Video Interview

1 Leibniz Institute for Resilience Research, Mainz, Germany; 2 TUD Dresden University of Technology, Dresden, Germany; 3 University of Zurich, Zurich, Switzerland; 4 Department of Psychiatry, NYU Grossman School of Medicine, New York, NY, USA; 5 Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA

In the context of a longitudinal resilience study (DynaM-INT), we investigate the use of digital biomarkers (DBMs) as objective mental health indicators in healthy but stressor-exposed young adults. DBMs derived from ambulatory audiovisual recordings offer promising avenues for scalable early detection, but their application is hindered by limited interpretability and individual variability across contexts and time. Here, we examine such variability using data from an online video interview that captures DBMs evoked by different question prompts. The structured interview comprises thirteen questions about internalizing symptoms and positive and negative events. From participants' video-recorded responses, facial action units, emotional expressivity (e.g., happiness), and vocal acoustic features (e.g., voice pitch) were extracted using pre-trained machine-learning algorithms. Using elastic net regression, we additionally assessed whether facial expressivity and vocal acoustic features predict self-reported symptom load, measured via the General Health Questionnaire (GHQ-28). We identified substantial within-subject variation in feature expression across frame-wise time series and interview prompts, alongside noticeable between-subject differences. Between-subject variability exceeded question-related differences, but some question-wise modulation was detected. Most notably, participants displayed greater facial happiness and lower sadness when describing positive events, indicating systematically evoked behavior. In a preliminary analysis, symptom load was most strongly associated with reduced overall facial expressivity across all interview questions. These data highlight that DBM signals are influenced by both question-dependent responses and dynamic within-subject variation. Important next steps for the field include minimizing noise in ambulatory assessments and leveraging context-specific responses for more clinically interpretable symptom detection and prediction.
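A minimal sketch of the symptom-prediction step described above, assuming facial and vocal features have been aggregated per participant across the interview (the file name, feature names, and aggregation scheme are hypothetical, not the study's actual pipeline):

import pandas as pd
from sklearn.linear_model import ElasticNetCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("interview_features.csv")         # hypothetical aggregates
feature_cols = ["mean_happiness", "mean_sadness",  # expressivity summaries
                "mean_pitch", "pitch_variability"] # vocal acoustics
X = df[feature_cols].to_numpy()
y = df["ghq28_total"].to_numpy()                   # GHQ-28 symptom load

# Cross-validated elastic net; the l1_ratio grid spans ridge-like
# to lasso-like fits, letting the data choose the sparsity level.
model = make_pipeline(
    StandardScaler(),
    ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=0),
)
model.fit(X, y)

enet = model.named_steps["elasticnetcv"]
for name, coef in zip(feature_cols, enet.coef_):
    print(f"{name:>18}: {coef:+.3f}")  # coefficients shrunk to zero were dropped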
OpenViDA – Towards an Open, Flexible, and Modular Video-based Digital Biomarker Analysis Framework for Biobehavioral Health

1 Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany; 2 Munich Center for Machine Learning (MCML), Munich, Germany; 3 Chair of Health Psychology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany; 4 Chair of AI-supported Therapy Decisions, Institute for Medical Information Processing, Biometry, and Epidemiology, LMU München, Munich, Germany; 5 Translational Digital Health Group, Institute of AI for Health, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany

Video-based digital biomarkers offer a scalable and unobtrusive means to quantify behavior and its relation to underlying psychophysiological processes in situations such as acute stress, pain, and social interaction. However, current practice remains fragmented and lacks methodological standards: analytic pipelines are often study-specific, difficult to reproduce, and limited in their ability to integrate multimodal signals or capture temporal dynamics. This fragmentation constrains comparability across studies and slows methodological convergence in psychological research. This talk introduces openViDA, an open, flexible, and modular Python-based framework designed to address these challenges by providing end-to-end workflows for video-based digital biomarker analysis. Building on a wide range of readily available but fragmented open-source libraries, openViDA enables the unified extraction, synchronization, and analysis of multimodal behavioral signals, including facial dynamics, body posture and movement, voice, and speech. The framework emphasizes three core principles: (i) modularity, allowing interchangeable algorithmic components and extensibility across modalities; (ii) methodological rigor, through transparent, reproducible pipelines aligned with open-science standards; and (iii) accessibility, by lowering technical barriers while avoiding opaque black-box solutions. Beyond standard feature extraction, openViDA supports advanced analytics capturing temporal structure, multimodal interactions, and context-dependent behavioral signatures. Application examples from controlled laboratory paradigms and naturalistic settings illustrate how openViDA facilitates integrative analyses of underlying psychophysiological processes. By unifying currently dispersed methods into a shared infrastructure, openViDA contributes to standardization, accelerates the reuse of existing datasets, and supports the development and validation of robust digital biomarkers. Overall, the framework represents a step toward scalable and reproducible analysis of behavior in biobehavioral research.
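The openViDA API itself is not shown in the abstract, so the following is only an illustrative sketch of the modularity principle it describes: interchangeable per-modality extractors behind a common interface, composed into a single pipeline. All class and method names here are hypothetical.

from abc import ABC, abstractmethod
import numpy as np

class FeatureExtractor(ABC):
    """Interchangeable per-modality component of a video analysis pipeline."""

    @abstractmethod
    def extract(self, frames: np.ndarray) -> np.ndarray:
        """Return a (n_frames, n_features) feature time series."""

class FaceExtractor(FeatureExtractor):
    def extract(self, frames: np.ndarray) -> np.ndarray:
        # Placeholder: a real implementation would wrap an action unit
        # or landmark library and return frame-wise facial features.
        return np.zeros((len(frames), 17))

class Pipeline:
    """Runs all extractors on a shared frame axis so outputs stay aligned."""

    def __init__(self, extractors: dict[str, FeatureExtractor]):
        self.extractors = extractors

    def run(self, frames: np.ndarray) -> dict[str, np.ndarray]:
        return {name: ex.extract(frames) for name, ex in self.extractors.items()}

# Swapping an algorithm or adding a modality only changes the mapping:
pipeline = Pipeline({"face": FaceExtractor()})
features = pipeline.run(np.zeros((100, 224, 224, 3)))  # dummy video frames

This interface-based composition is one common way to realize the "interchangeable algorithmic components" principle the abstract names; the actual framework may organize its modules differently.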
