ViLaBot: Connecting Vision and Language for Robots That Assist Humans at Home

Asfand Yaar, Marco Rosano, Antonino Furnari, Aki Harma, Giovanni Maria Farinella

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceeding; Academic; peer-review

Abstract

Despite significant advancements in vision, language, and robotics, integrating these capabilities to create an autonomous robot assistant remains a challenge. This paper presents ViLaBot (Vision and Language roBot), a system designed to aid humans in daily activities at home. ViLaBot combines a language model with a library of basic visuomotor skills to understand human needs, create action plans, and execute them. The system relies solely on onboard visual and proprioceptive sensing, eliminating the need for pre-built maps or precise object locations and facilitating real-world deployment in a variety of environments. Experimental validation in the Habitat simulator, conducted in 11 realistic home environments featuring simulated human agents, indicated that ViLaBot can achieve promising results when using ground-truth image segmentation, yet exhibits modest performance in scenarios involving imperfect visual perception. The results support the validity of the proposed pipeline and highlight the critical components of the system that should be improved to increase its overall success rate and reliability.
Original language: English
Title of host publication: 2024 IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2024 - Proceedings
Publisher: IEEE
Pages: 1206-1211
Number of pages: 6
ISBN (Electronic): 9798350378009
Publication status: Published - 2024
Event: 3rd IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2024 - St Albans, United Kingdom
Duration: 21 Oct 2024 to 23 Oct 2024
https://metroxraine.org/metroxraine2024/index

Publication series

Series: IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE - Proceedings

Conference

Conference: 3rd IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2024
Abbreviated title: MetroXRAINE 2024
Country/Territory: United Kingdom
City: St Albans
Period: 21/10/24 to 23/10/24

Keywords

  • assistive tasks
  • human-robot interaction
  • navigation and manipulation
  • task planning
