Toward a machine learning model that can reason about everyday actions
by Kim Martineau for MIT News
Boston MA (SPX) Sep 01, 2020

A computer vision model developed by researchers at MIT, IBM, and Columbia University can compare and contrast dynamic events captured on video to tease out the high-level concepts connecting them. In a set of experiments, the model picked out the video in each vertical-column set that conceptually didn't belong. Highlighted in red, the odd-one-out videos show a woman folding a blanket, a dog barking, a man chopping greens, and a man offering grass to a llama. Image credit: Allen Lee

The ability to reason abstractly about events as they unfold is a defining feature of human intelligence. We know instinctively that crying and writing are means of communicating, and that a panda falling from a tree and a plane landing are variations on descending.

Organizing the world into abstract categories does not come easily to computers, but in recent years researchers have inched closer by training machine learning models on words and images infused with structural information about the world and about how objects, animals, and actions relate. In a new study presented at the European Conference on Computer Vision this month, researchers unveiled a hybrid language-vision model that can compare and contrast a set of dynamic events captured on video to tease out the high-level concepts connecting them.

Their model did as well as or better than humans at two types of visual reasoning tasks - picking the video that conceptually best completes the set, and picking the video that doesn't fit. Shown videos of a dog barking and a man howling beside his dog, for example, the model completed the set by picking a video of a crying baby from a set of five candidates. Researchers replicated their results on two datasets for training AI systems in action recognition: MIT's Multi-Moments in Time and DeepMind's Kinetics.
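Both tasks boil down to measuring similarity between learned video representations. As a rough illustration only - not the authors' code - the sketch below assumes each video has already been embedded as a vector by a model like the one described later in this article, and scores candidates by cosine similarity:

```python
# Hypothetical sketch of the two reasoning tasks, assuming each video
# has already been embedded as a NumPy vector by an upstream model.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def complete_set(set_vecs, candidate_vecs):
    """Set completion: pick the candidate closest to the set's centroid."""
    center = np.mean(set_vecs, axis=0)
    return int(np.argmax([cosine(center, c) for c in candidate_vecs]))

def odd_one_out(vecs):
    """Odd-one-out: pick the video least similar to all the others."""
    scores = []
    for i, v in enumerate(vecs):
        others = np.mean([u for j, u in enumerate(vecs) if j != i], axis=0)
        scores.append(cosine(v, others))
    return int(np.argmin(scores))
```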

"We show that you can build abstraction into an AI system to perform ordinary visual reasoning tasks close to a human level," says the study's senior author Aude Oliva, a senior research scientist at MIT, co-director of the MIT Quest for Intelligence, and MIT director of the MIT-IBM Watson AI Lab. "A model that can recognize abstract events will give more accurate, logical predictions and be more useful for decision-making."

As deep neural networks become expert at recognizing objects and actions in photos and video, researchers have set their sights on the next milestone: abstraction, and training models to reason about what they see. In one approach, researchers have merged the pattern-matching power of deep nets with the logic of symbolic programs to teach a model to interpret complex object relationships in a scene. In the new study, the researchers take a different approach, capitalizing on the relationships embedded in the meanings of words to give their model visual reasoning power.

"Language representations allow us to integrate contextual information learned from text databases into our visual models," says study co-author Mathew Monfort, a research scientist at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). "Words like 'running,' 'lifting,' and 'boxing' share some common characteristics that make them more closely related to the concept 'exercising,' for example, than 'driving.' "

Using WordNet, a database of word meanings, the researchers mapped the relation of each action-class label in Moments and Kinetics to the other labels in both datasets. Words like "sculpting," "carving," and "cutting," for example, were connected to higher-level concepts like "crafting," "making art," and "cooking." Now when the model recognizes an activity like sculpting, it can pick out conceptually similar activities in the dataset.
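Readers who want to experiment can approximate this kind of mapping with NLTK's WordNet interface. The sketch below is illustrative only; the helper function is hypothetical, and the paper's actual label-mapping procedure may differ:

```python
# Illustrative only: relate two action labels through their most
# specific shared WordNet hypernym.
# Requires: pip install nltk, then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

def shared_abstraction(label_a, label_b):
    """Hypothetical helper: lowest common hypernym of two verb labels."""
    best = None
    for sa in wn.synsets(label_a, pos=wn.VERB):
        for sb in wn.synsets(label_b, pos=wn.VERB):
            for c in sa.lowest_common_hypernyms(sb):
                # Keep the deepest, i.e. most specific, shared concept.
                if best is None or c.min_depth() > best.min_depth():
                    best = c
    return best

print(shared_abstraction("sculpt", "carve"))  # most specific shared concept
```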

This relational graph of abstract classes is used to train the model to perform two basic tasks. Given a set of videos, the model creates a numerical representation for each video that aligns with the word representations of the actions shown in the video. An abstraction module then combines the representations generated for each video in the set to create a new set representation that is used to identify the abstraction shared by all the videos in the set.
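In code, that pipeline might look roughly like the following PyTorch sketch. Every module name and dimension here is an illustrative assumption, not the published architecture: a linear projection aligns video features with word embeddings, a small abstraction network pools the set, and a classifier scores the abstract classes:

```python
# A hedged sketch of the described pipeline; not the authors' model.
import torch
import torch.nn as nn

class SetAbstractionModel(nn.Module):
    def __init__(self, video_dim=2048, embed_dim=300, n_classes=339):
        super().__init__()
        # Projects each video feature into the word-embedding space.
        self.video_proj = nn.Linear(video_dim, embed_dim)
        # Combines the per-video embeddings of a set into one vector.
        self.abstraction = nn.Sequential(
            nn.Linear(embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Scores the pooled set representation against abstract classes.
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, video_feats):  # shape: (batch, set_size, video_dim)
        per_video = self.video_proj(video_feats)            # align with words
        set_repr = self.abstraction(per_video.mean(dim=1))  # pool the set
        return self.classifier(set_repr)  # logits over shared abstractions
```

A full training loop would additionally pull each per-video embedding toward the WordNet-derived representations of its action labels, matching the alignment step described above.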

To see how the model would do compared to humans, the researchers asked human subjects to perform the same set of visual reasoning tasks online. To their surprise, the model performed as well as humans in many scenarios, sometimes with unexpected results. In a variation on the set completion task, after watching a video of someone wrapping a gift and covering an item in tape, the model suggested a video of someone at the beach burying someone else in the sand.

"It's effectively 'covering,' but very different from the visual features of the other clips," says Camilo Fosco, a PhD student at MIT who is co-first author of the study with PhD student Alex Andonian. "Conceptually it fits, but I had to think about it."

Limitations of the model include a tendency to overemphasize some features. In one case, it suggested completing a set of sports videos with a video of a baby and a ball, apparently associating balls with exercise and competition.

A deep learning model that can be trained to "think" more abstractly may be able to learn from less data, the researchers say. Abstraction also paves the way toward higher-level, more human-like reasoning.

"One hallmark of human cognition is our ability to describe something in relation to something else - to compare and to contrast," says Oliva. "It's a rich and efficient way to learn that could eventually lead to machine learning models that can understand analogies and are that much closer to communicating intelligently with us."

Other authors of the study are Allen Lee from MIT, Rogerio Feris from IBM, and Carl Vondrick from Columbia University.

Research Report: We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos


Related Links
MIT Quest for Intelligence

