AI News, Artificial intelligence in action
Artificial intelligence in action
A person watching videos that show things opening — a door, a book, curtains, a blooming flower, a yawning dog — easily understands the same type of action is depicted in each clip.
Learning from dynamic scenes
The goal is to provide deep-learning algorithms with large coverage of an ecosystem of visual and auditory moments that may enable models to learn information that isn’t necessarily taught in a supervised manner and to generalize to novel situations and tasks, say the researchers.
“This dataset can serve as a new challenge to develop AI models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis,” Oliva adds, describing the factors involved.
Oliva and Gutfreund, along with additional researchers from MIT and IBM, met weekly for more than a year to tackle technical issues, such as how to choose the action categories for annotations, where to find the videos, and how to put together a wide array so the AI system learns without bias.
“Now we have reached the milestone of 1 million videos for visual AI training, and people can go to our website, download the dataset and our deep-learning computer models, which have been taught to recognize actions.”
Qualitative results so far have shown models can recognize moments well when the action is well-framed and close up, but they misfire when the category is fine-grained or there is background clutter, among other things.
This first version of the Moments in Time dataset is one of the largest human-annotated video datasets capturing visual and audible short events, all of which are tagged with an action or activity label among 339 different classes that include a wide range of common verbs.
The researchers intend to produce more datasets with a variety of levels of abstraction to serve as stepping stones toward the development of learning algorithms that can build analogies between things, imagine and synthesize novel events, and interpret scenarios.
The limitations of current approaches to these problems are stark: they are unable to understand social interactions, unable to recognize events from new viewpoints, unable to recognize that the same event can be described in different ways depending on the listener, unable to adapt to new scenarios, etc.
A deep understanding of massive video and text data that adapts to new scenarios will enable new applications of AI and machine learning that are well beyond what can be achieved today, and we are excited to see where these techniques can be applied.
Helping AI master video understanding
I am part of the team at the MIT-IBM Watson AI Lab that is carrying out fundamental AI research to push the frontiers of core technologies that will advance the state of the art in AI video comprehension.
Great progress has been made, and I am excited to share that we are releasing the Moments in Time Dataset, a large-scale dataset of one million annotated three-second video clips for action recognition, to accelerate the development of technologies and models that enable automatic video understanding for AI.
A lot can happen in a moment of time: a girl kicking a ball, behind her on the path a woman walks her dog, on a park bench nearby a man is reading a book, and high above a bird flies in the sky.
When asked to describe such a moment, a person can quickly identify objects (girl, ball, bird, book), the scene (park) and the actions that are taking place (kicking, walking, reading, flying).
While new algorithmic ideas have emerged over the years, this success can be largely credited to two other factors: massive labeled datasets and significant improvements in computational capacities, which allowed processing these datasets and training models with millions of parameters in reasonable time scales.
We have been working over the past year in close collaboration with Dr. Aude Oliva and her team from MIT, where we are tackling the specific challenge of action recognition, an important first step in helping computers understand activities which can ultimately be used to describe complex events (e.g.
In other words, this is a relatively short period of time, but still long enough for humans to process consciously (as opposed to time spans associated with sensory memory, which unconsciously processes events that occur in fractions of a second).
I encourage you to leverage the Dataset for your own research and share your experiences to foster progress and new thinking. Visit the website to obtain the dataset, read our technical paper that explains the approach we took in designing the dataset, and see examples of annotated videos that our system was tested on.
MIT-IBM Watson AI Lab researchers train computers to understand dynamic events
Researchers from the MIT-IBM Watson AI Lab want to make computers more “human.” The researchers are currently working on a project that will help computers understand and recognize dynamic events.
“This dataset can serve as a new challenge to develop AI models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis,” Oliva added, describing the factors involved.
According to the researchers, the dataset took more than a year to put together, and the team faced many technical issues, such as choosing the action categories, finding the videos, and assembling them in a way that lets an AI system learn without bias.
“Now we have reached the milestone of 1 million videos for visual AI training, and people can go to our website, download the dataset and our deep-learning computer models, which have been taught to recognize actions,” Oliva said.
- On Tuesday, February 25, 2020