Publications

Browse our research papers published in top-tier conferences and journals in machine learning, computer vision, natural language processing, and AI ethics.

Interspeech 2025
NLP & Speech
2025

GenECA: A General-Purpose Framework for Real-Time Adaptive Multimodal Embodied Conversational Agents

Santosh Patapati, Trisanth Srinivasan

We present GenECA, a general-purpose framework for creating real-time adaptive multimodal embodied conversational agents. Our framework integrates advanced natural language processing, computer vision, and speech synthesis to create more natural and effective human-computer interactions. Accepted to the Show & Tell Track.

DG-EBF @ CVPR 2025
Computer Vision
2025

PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications

Trisanth Srinivasan, Santosh Patapati

We present PhysNav-DG, a novel adaptive framework for robust vision-language model (VLM) and sensor fusion in navigation applications. Our framework addresses the challenges of integrating visual perception with other sensor modalities in dynamic and uncertain environments.

VERDI @ DSN 2025
AI Dependability
2025

CPS-Guard: Multi-Role Orchestration System for Dependability Assurance of AI-Enhanced Cyber-Physical Systems

Trisanth Srinivasan, Santosh Patapati, Himani Musku, Idhant Gode, Aditya Arora, Abubakr Nazriev, Sanika Hirave, Zaryab Kanjiani, Srinjoy Ghose

We present CPS-Guard, a multi-role orchestration system for ensuring the dependability and security of AI systems in critical infrastructure and cyber-physical systems. Our system provides comprehensive monitoring, verification, and adaptation capabilities for AI-enhanced CPS.

CVPR 2025 Demo
Computer Vision
2025

VIZ: Virtual & Physical Navigation System for the Visually Impaired

Trisanth Srinivasan, Santosh Patapati

We present VIZ, a navigation system that helps visually impaired individuals navigate both physical and virtual environments with greater confidence and independence. VIZ uses computer vision and natural language processing to provide real-time guidance and information.

IEEE ICICT 2025
NLP
2025

Towards Leveraging Semantic Web Technologies for Automated UI Element Annotation

Trisanth Srinivasan

We present a novel approach to automated UI element annotation using semantic web technologies. Our approach improves the accessibility of web and mobile applications by providing more accurate and meaningful annotations for screen readers and other assistive technologies.

CVPR 2025 Demo
NLP & Speech
2025

GenECA: A Generalizable Framework for Real-Time Multimodal Embodied Conversational Agents

Santosh Patapati, Trisanth Srinivasan

We introduce GenECA, a robust framework for multimodal interaction with embodied conversational agents, with an emphasis on emotion sensitivity.