Regina Barzilay has been appointed the Delta Electronics Professor of Electrical Engineering and Computer Science at MIT. The appointment recognizes Barzilay's leadership in the area of human language technologies and her outstanding mentorship and educational contributions.
"Professor Barzilay is internationally known in the fields of natural language processing and computational linguistics, and is widely respected as a creative thought leader," Anantha Chandrakasan, head of the Department of Electrical Engineering and Computer Science and the Vannevar Bush Professor of Electrical Engineering and Computer Science, wrote in a note announcing the appointment. "In addition to this research, she has made truly outstanding educational contributions."
Barzilay's research focuses on developing models of natural language and using those models to solve real-world language processing tasks. Her work in computational linguistics deals with multilingual learning, interpreting text to solve control problems, and finding document-level structure within text. Barzilay's work enables the automated summarization of documents, machine interpretation of natural language instructions, and the deciphering of ancient languages. As the world produces ever more text to be searched and interpreted, applications for this work grow year by year...
MIT researchers have developed a novel "unsupervised" language translation model (meaning it runs without the need for human annotations and guidance) that could lead to faster, more efficient computer-based translations of far more languages.
Translation systems from Google, Facebook, and Amazon require training models to look for patterns in millions of documents (such as legal and political documents, or news articles) that have been translated into various languages by humans. Given new words in one language, they can then find the matching words and phrases in the other language.
But this translational data is time-consuming and difficult to gather, and it simply may not exist for many of the 7,000 languages spoken worldwide. Recently, researchers have been developing "monolingual" models that translate between texts in two languages without any direct translation examples linking the two.
In a paper being presented this week at the Conference on Empirical Methods in Natural Language Processing, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) describe a model that runs faster and more efficiently than these monolingual models...
On the island of Java, in Indonesia, the silvery gibbon, an endangered primate, lives in the rainforests. In a behavior that's unusual for a primate, the silvery gibbon sings: It can vocalize long, complicated songs, using 14 different note types, that signal territory and send messages to potential mates and family.
Far from being a mere curiosity, the silvery gibbon may hold clues to the development of language in humans. In a newly published paper, two MIT professors assert that by re-examining contemporary human language, we can see indications of how human communication could have evolved from the systems underlying the older communication modes of birds and other primates.
From birds, the researchers say, we derived the melodic part of our language, and from other primates, the pragmatic, content-carrying parts of speech. Sometime within the last 100,000 years, those capacities fused into roughly the form of human language that we know today.
But how? Other animals, it appears, have finite sets of things they can express; human language is unique in allowing for an infinite set of new meanings. What allowed unbounded human language to evolve from bounded language systems?...
MIT researchers have concluded that the well-known ImageNet data set has "systematic annotation issues" and is misaligned with ground truth or direct observation when used as a benchmark data set.
âOur analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for,â the researchers write in a paper titled âFrom ImageNet to Image Classification: Contextualizing Progress on Benchmarks.â âWe believe that developing annotation pipelines that better capture the ground truth while remaining scalable is an important avenue for future research.â
When the Stanford University Vision Lab introduced ImageNet at the Conference on Computer Vision and Pattern Recognition (CVPR) in 2009, it was much larger than many previously existing image data sets. The ImageNet data set contains millions of photos and was assembled over the span of more than two years. ImageNet uses the WordNet hierarchy for data labels and is widely used as a benchmark for object recognition models. Until 2017, annual competitions with ImageNet also played a role in advancing the field of computer vision....
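ImageNet's role as a benchmark comes down to a simple measurement: each image carries a single label, and models are scored by whether that label appears among their top predictions (top-1 or top-5 accuracy). The annotation issues discussed above matter precisely because this single "ground truth" label may not match what is actually in the image. A minimal sketch of the metric itself, with a hypothetical helper name and toy scores, might look like this:

```python
import numpy as np

def topk_accuracy(logits, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring
    classes. `logits` is (n_samples, n_classes); `labels` is a list of
    single ground-truth class indices, as in ImageNet's annotation scheme."""
    topk = np.argsort(logits, axis=1)[:, -k:]  # indices of k largest scores
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Toy scores for 2 images over 3 classes (no ties, for determinism).
scores = np.array([[0.1, 0.7, 0.2],
                   [0.6, 0.3, 0.1]])
true_labels = [1, 1]

top1 = topk_accuracy(scores, true_labels, k=1)  # 0.5: second image misses
top2 = topk_accuracy(scores, true_labels, k=2)  # 1.0: both within top 2
```

Because only one label per image is recorded, an image containing several valid objects can count as an error even when the model's prediction is reasonable, which is one face of the misalignment the MIT paper describes.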