Egocentric Perception (Ego4D)

Ego4D: An Achievement to Celebrate!


Egocentric 4D Perception (Ego4D) is a Facebook Artificial Intelligence (AI) Research project dedicated to creating a dataset of egocentric, or first-person, videos. Its purpose is to understand how people interact in society and use objects, with a view to applying that understanding in AI. It is the largest dataset in the world that captures daily-life activities around the globe from a first-person point of view. Facebook AI collaborated with 13 universities around the world, which collected over 3600 hours of unscripted, in-the-wild video from 9 countries using a variety of head-mounted cameras.

International Institute of Information Technology Hyderabad (IIITH) is the only university in India taking part in the Ego4D project. The IIITH team, from the Centre for Visual Information Technology (CVIT), is led by Professor C.V. Jawahar and includes Raghava Modhugu (student), Siddhant Bansal (student), Ram Sharma (Technical Support), Aradhana Vinod (Project Coordinator), and Varun Bhargavan (Network Engineer). Of the 13 participating universities, IIITH has contributed the highest number of hours to Ego4D. To capture the videos, 14 GoPro cameras were used by 138 people in different states of India such as Telangana, Uttar Pradesh, Kerala, West Bengal, and Gujarat. The videos show daily-life scenarios like cooking, cleaning, traveling, shopping, and much more. To protect the privacy of the participants, faces, credit cards, and other identifiers were de-identified, and participants signed consent forms where relevant. “Since our participation began in March 2020, a great deal of effort was taken by the team to collect videos, especially due to COVID-19 restrictions,” says Aradhana Vinod.

Professor Jawahar, the Principal Investigator (PI), guided the team toward the goal of contributing 1000 hours of egocentric videos to Ego4D. Once the videos were recorded, they were reviewed by four annotators, who also wrote descriptions of and validated each video. Aradhana Vinod and Ram Sharma carried out a final review before submission. Siddhant Bansal submitted a paper to the CVPR workshop, contributed to the benchmark track, and hosted the Ego4D ‘challenge’ on detecting an object’s state change.

A tea party with around 60 attendees was held at IIITH on Thursday, April 14 at 3:00 p.m. to celebrate the accomplishment of providing 1000 hours of videos to Ego4D. The event began with a slide show presentation of IIITH’s participation in the Ego4D project. Professor Jawahar then spoke a few words appreciating the team for their remarkable work. Thereafter, Professor P.J. Narayanan, Director of IIITH, felicitated Ram Sharma for the exemplary effort of contributing the largest data collection and awarded him a certificate of appreciation. The event concluded on a positive note with the customary cutting of a specially designed ‘Ego4D cake’. Professor Jawahar stated, “The dedication, commitment, and hard work of each team member have made this an achievement to celebrate!”

The videos were collected in 5 different states in India, geographically well apart:

  • Telangana
  • Andhra Pradesh
  • Kerala
  • West Bengal (Kolkata)
  • Uttar Pradesh

We cover 36 different scenarios, such as making bricks by hand, knitting, making egg cartons, and hairstyling. The subjects ranged in age from 18 to 84 years and came from 10 distinct professional backgrounds (teachers, students, farmers, blacksmiths, homemakers, etc.). Of all the subjects, 94 were male and 44 were female. We used GoPro Hero 6 and GoPro Hero 7 cameras to record the videos; the cameras were shipped to participants in different parts of the country. Videos were shared back either on external hard disks or over cloud storage. Each video was manually inspected for sensitive content before sharing.


Primary contributors: Raghava Modhugu - data collection pipeline, design of the setup and workflow. Siddhant Bansal - IRB application, consent forms and de-identification. C.V. Jawahar - lead contributor for data collection. We also acknowledge the contributions of Aradhana Vinod (coordination and communication), Ram Sharma (local data management and verification), and Varun Bhargavan (systems and resources).

- Ram Sharma