We aim to create a virtual environment to improve models for action recognition.
Computer vision has achieved great success on a large number of tasks by learning from large-scale annotated datasets. However, in applications such as cognitive or social robotics, autonomous agents are not simply passive observers but must act in order to learn from their environment. The focus of our work is to create a realistic virtual environment that mimics a household, where agents can perform everyday actions by executing sets of instructions. Such a framework will allow the creation of potentially unlimited video datasets, which in turn can lead to better systems for action recognition.