Vector Space Semantics Symposium Today

Come out to CSLI today for the Cognition & Language Workshop’s Symposium on Compositional Vector Space Semantics. The emerging field of compositional probabilistic vector space semantics for natural languages and other symbol systems is being approached from multiple perspectives: language, cognition, and engineering. This symposium aims to promote fruitful discussions of interactions between approaches, with the goal of increasing collaboration and integration. Be sure to click here if you’re planning to come, so they can estimate attendance.

Schedule of Events:

9:00 – 9:30 Light breakfast

9:30 – 11:00 Chung-chieh Shan (Indiana)
From Language Models to Distributional Semantics
Discussant: Noah Goodman

11:15 – 12:45 Richard Socher (Stanford)
Recursive Deep Learning for Modeling Semantic Compositionality
Discussant: Thomas Icard

12:45 – 2:00 Lunch

2:00 – 3:30 Stephen Clark (Cambridge)
A Mathematical Framework for a Compositional Distributional Model of Meaning
Discussant: Stanley Peters

3:45 – 5:00 Breakout Groups and Discussion

5:00 – Snacks & Beverages

Chung-chieh Shan
Indiana University
Talk time: 9:30am – 11:00am

Title: From Language Models to Distributional Semantics

Abstract: Distributional semantics represents what an expression means as a vector that summarizes the contexts where it occurs. This approach has successfully extracted semantic relations such as similarity and entailment from large corpora. However, it remains unclear how to take advantage of syntactic structure, pragmatic context, and multiple information sources to overcome data sparsity. These issues also confront language models used for statistical parsing, machine translation, and text compression.

Thus, we seek guidance by converting language models into distributional semantics. We propose to convert any probability distribution over expressions into a denotational semantics in which each phrase denotes a distribution over contexts. Exploratory data analysis led us to hypothesize that the more accurate the expression distribution is, the more accurate the distributional semantics tends to be. We tested this hypothesis on two expression distributions that can be estimated using a tiny corpus: a bag-of-words model, and a lexicalized probabilistic context-free grammar à la Collins.
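The core idea of the first half of the abstract can be illustrated with a toy sketch (my own illustration, not the speaker's code): under a bag-of-words model over a tiny corpus, each word denotes a normalized distribution over the words it co-occurs with, and distributions can then be compared for similarity. The corpus and helper names here are invented for the example.

```python
# Toy illustration: distributional word meanings as distributions over
# contexts, derived from sentence-level co-occurrence in a tiny corpus.
from collections import Counter, defaultdict
import math

corpus = [
    "dogs chase cats".split(),
    "dogs chase squirrels".split(),
    "cats chase mice".split(),
]

# For each word, count the other words appearing in the same sentence.
context_counts = defaultdict(Counter)
for sentence in corpus:
    for w in sentence:
        for c in sentence:
            if c != w:
                context_counts[w][c] += 1

def context_distribution(word):
    """Normalize co-occurrence counts into a distribution over contexts."""
    counts = context_counts[word]
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse context distributions."""
    dot = sum(p.get(k, 0) * q.get(k, 0) for k in set(p) | set(q))
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in q.values()))
    return dot / norm

# Words that occur in overlapping contexts get nonzero similarity.
print(cosine(context_distribution("dogs"), context_distribution("cats")))
```

A richer expression distribution (e.g. the lexicalized PCFG the abstract mentions) would replace the sentence-level co-occurrence counts with context distributions induced by the grammar.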

Richard Socher
Stanford University
Talk time: 11:15am – 12:45pm

Title: Recursive Deep Learning for Modeling Semantic Compositionality

Abstract: Compositional and recursive structure is commonly found in different modalities, including natural language sentences and scene images. I will introduce several recursive deep learning models that, unlike standard deep learning methods, can learn compositional meaning vector representations for phrases, sentences, and images. These recursive neural network-based models obtain state-of-the-art performance on a variety of syntactic and semantic language tasks such as parsing, paraphrase detection, relation classification, and sentiment analysis.

Besides the good performance, the models capture interesting phenomena in language such as compositionality. For instance, the models learn different types of high-level negation and how it can change the meaning of longer phrases with many positive words. They can learn that the sentiment following a “but” usually dominates that of the phrases preceding the “but.” Furthermore, unlike many other machine learning approaches that rely on human-designed feature sets, features are learned as part of the model.
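The compositional machinery behind these models can be sketched in a few lines (a minimal illustration with assumed dimensions and random parameters, not Socher's implementation): a single shared matrix combines two child meaning vectors into a parent vector of the same dimension, so composition can be applied bottom-up over any binary parse tree.

```python
# Toy recursive neural network composition: p = tanh(W [left; right] + b).
import numpy as np

rng = np.random.default_rng(0)
d = 4                                       # meaning-vector dimension (assumed)
W = rng.standard_normal((d, 2 * d)) * 0.1   # shared composition matrix
b = np.zeros(d)

def compose(left, right):
    """Combine two child vectors into a parent vector of the same size."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Hypothetical word vectors; in practice these are learned jointly with W.
very, good, movie = (rng.standard_normal(d) for _ in range(3))

# Compose according to the parse ((very good) movie).
phrase = compose(compose(very, good), movie)
print(phrase.shape)  # (4,) -- same dimension as the word vectors
```

Because the parent lives in the same space as its children, the same `compose` step applies at every node of the tree, which is what lets one model handle phrases and full sentences alike.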

Stephen Clark
University of Cambridge
Talk time: 2:00pm – 3:30pm

Title: A Mathematical Framework for a Compositional Distributional Model of Meaning

Abstract: In this talk I will describe a mathematical framework for a unification of the distributional theory of meaning, in terms of vector space models, and a compositional theory for grammatical types (based on categorial grammar). A key idea is that the meanings of functional words, such as verbs and adjectives, will be represented using tensors of various types. This mathematical framework enables us to compute the distributional meaning of a well-typed sentence from the distributional meanings of its constituents. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model.
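A minimal sketch of the tensor idea (my own illustration with invented dimensions, not Clark's code): a transitive verb is an order-3 tensor that contracts with subject and object noun vectors, yielding a vector in a fixed sentence space where any two sentences can be compared by inner product.

```python
# Toy categorial composition: verb tensor contracts with noun vectors.
import numpy as np

rng = np.random.default_rng(1)
n = 5   # noun-space dimension (assumed)
s = 3   # sentence-space dimension (assumed)

verb = rng.standard_normal((s, n, n))   # order-3 tensor for a transitive verb

def sentence_meaning(subj, verb, obj):
    """Contract the verb tensor with subject and object noun vectors."""
    return np.einsum('sij,i,j->s', verb, subj, obj)

dogs, cats = rng.standard_normal(n), rng.standard_normal(n)
m1 = sentence_meaning(dogs, verb, cats)   # "dogs chase cats"
m2 = sentence_meaning(cats, verb, dogs)   # "cats chase dogs"

# Both meanings live in the same s-dimensional sentence space, so they
# can be compared by a (normalized) inner product regardless of structure.
similarity = m1 @ m2 / (np.linalg.norm(m1) * np.linalg.norm(m2))
print(m1.shape, similarity)
```

The open questions at the end of the abstract correspond to the two things this sketch simply invents: the basis of the sentence space (here an arbitrary 3-dimensional space) and the entries of the verb tensor (here random numbers).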

There are two key questions that the framework leaves open: 1) what are the basis vectors of the sentence space? and 2) how can the values in the tensors be acquired? I will sketch some of the ideas we have for how to answer these questions.