- Facebook has announced a research project in which it collected 2,200 hours of first-person footage from around the world to train next-generation AI models.
- The project is called Ego4D, and it could prove to be crucial to Facebook's Reality Labs division, which is working on smart glasses, augmented reality and virtual reality projects.
- Facebook said it will make the Ego4D data set publicly available to researchers in November.
Facebook on Thursday announced a research project in which it collected 2,200 hours of first-person footage from around the world to train next-generation artificial intelligence models.
The project is called Ego4D, and it could prove crucial to Facebook's Reality Labs division, which is working on numerous projects that could benefit from AI models trained on video footage shot from a human's perspective. These include smart glasses, like the Ray-Ban Stories that Facebook released last month, and virtual reality, in which Facebook has invested heavily since its $2 billion acquisition of Oculus in 2014.
The footage could teach artificial intelligence to understand or identify something in the real world, or a virtual world, that you might see from a first-person perspective through a pair of glasses or an Oculus headset.
Facebook said it will make the Ego4D data set publicly available to researchers in November.
"This release, which is an open data set and research challenge, is going to catalyze progress for us internally but also widely externally in the academic community and [allow] other researchers to get behind these new problems but now be able to do it in a more meaningful way and at a greater scale," Kristen Grauman, lead research scientist at Facebook, told CNBC.
The data set could be used to train AI models that help technology such as robots understand the world more rapidly, Grauman said.
"Traditionally a robot learns by doing stuff in the world or being literally handheld to be shown how to do things," Grauman said. "There's openings to let them learn from video just from our own experience."
Facebook and a consortium of 13 university partners relied on more than 700 participants across nine countries to capture the first-person footage. Facebook says Ego4D contains more than 20 times as many hours of footage as any other data set of its kind.
Facebook's university partners included Carnegie Mellon in the U.S., the University of Bristol in the U.K., the National University of Singapore, the University of Tokyo in Japan and the International Institute of Information Technology in India, among others.
The footage was captured in the U.S., U.K., Italy, India, Japan, Singapore and Saudi Arabia. Facebook said it is hoping to expand the project to more countries, including Colombia and Rwanda.
"An important design decision for this project is we wanted partners that first of all are leading experts in the field, interested in these problems and motivated to pursue them but also have geographic diversity," Grauman said.
The announcement of Ego4D comes at an interesting time for Facebook.
The company has steadily been ramping up its efforts in hardware. Last month, it released the $299 Ray-Ban Stories, its first smart glasses. And in July, Facebook announced the formation of a product team to work specifically on the "metaverse," which is a concept that involves creating digital worlds that multiple people can inhabit at the same time.
Over the past month, however, Facebook has been hit by a barrage of news stories stemming from a trove of internal company research leaked by Frances Haugen, a former Facebook product manager turned whistleblower. Among the research released were slides that showed Instagram was harmful to the mental health of teenagers.
For the sake of privacy, Facebook said participants were instructed to avoid capturing personally identifying characteristics when collecting footage indoors. These include people's faces, conversations, tattoos and jewelry. Facebook said it removed personally identifiable information from the videos and blurred bystanders' faces and vehicle license plate numbers. The audio was also removed from many of the videos, the company said.
"The university partners who did this video collection, step No. 1 for all of them was a pretty intensive and important process to create a policy for proper collection," Grauman said.