Issue 2013/05/10

Vector Space Semantics Symposium Today

Come out to CSLI today for the Cognition & Language Workshop’s Symposium on Compositional Vector Space Semantics. The emerging field of compositional probabilistic vector space semantics for natural languages and other symbol systems is being approached from multiple perspectives: language, cognition, and engineering. This symposium aims to promote fruitful discussions of interactions between approaches, with the goal of increasing collaboration and integration. Be sure to click here if you’re planning to come, so they can estimate attendance.

Schedule of Events:

9:00 – 9:30 Light breakfast

9:30 – 11:00 Chung-chieh Shan (Indiana)
From Language Models to Distributional Semantics
Discussant: Noah Goodman

11:15 – 12:45 Richard Socher (Stanford)
Recursive Deep Learning for Modeling Semantic Compositionality
Discussant: Thomas Icard

12:45 – 2:00 Lunch

2:00 – 3:30 Stephen Clark (Cambridge)
A Mathematical Framework for a Compositional Distributional Model of Meaning
Discussant: Stanley Peters

3:45 – 5:00 Breakout Groups and Discussion

5:00 – Snacks & Beverages

Chung-chieh Shan
Indiana University
Talk time: 9:30am – 11:00am

Title: From Language Models to Distributional Semantics

Abstract: Distributional semantics represents what an expression means as a vector that summarizes the contexts where it occurs. This approach has successfully extracted semantic relations such as similarity and entailment from large corpora. However, it remains unclear how to take advantage of syntactic structure, pragmatic context, and multiple information sources to overcome data sparsity. These issues also confront language models used for statistical parsing, machine translation, and text compression.

Thus, we seek guidance by converting language models into distributional semantics. We propose to convert any probability distribution over expressions into a denotational semantics in which each phrase denotes a distribution over contexts. Exploratory data analysis led us to hypothesize that the more accurate the expression distribution is, the more accurate the distributional semantics tends to be. We tested this hypothesis on two expression distributions that can be estimated using a tiny corpus: a bag-of-words model, and a lexicalized probabilistic context-free grammar à la Collins.
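To make the starting point concrete: in a distributional model, a word or phrase is represented by a vector summarizing the contexts it occurs in, and semantic similarity is read off that vector. The toy Python sketch below (invented corpus and window size; not the construction described in the talk, which derives context distributions from a language model) builds co-occurrence count vectors and compares two words by cosine similarity.

    from collections import Counter
    from math import sqrt

    def context_vectors(sentences, window=2):
        """Map each word to a Counter of the words seen within +/- `window` positions."""
        vectors = {}
        for sent in sentences:
            for i, word in enumerate(sent):
                ctx = vectors.setdefault(word, Counter())
                for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                    if j != i:
                        ctx[sent[j]] += 1
        return vectors

    def cosine(u, v):
        """Cosine similarity between two sparse count vectors."""
        dot = sum(u[k] * v[k] for k in u if k in v)
        norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
        return dot / norm if norm else 0.0

    # invented toy corpus (illustration only)
    corpus = [["the", "cat", "chased", "the", "mouse"],
              ["the", "dog", "chased", "the", "cat"],
              ["the", "dog", "ate", "the", "bone"]]
    vecs = context_vectors(corpus)
    print(cosine(vecs["cat"], vecs["dog"]))   # words appearing in similar contexts score higher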

Richard Socher
Stanford University
Talk time: 11:15am – 12:45pm

Title: Recursive Deep Learning for Modeling Semantic Compositionality

Abstract: Compositional and recursive structure is commonly found in different modalities, including natural language sentences and scene images. I will introduce several recursive deep learning models that, unlike standard deep learning methods, can learn compositional meaning vector representations for phrases, sentences, and images. These recursive neural network-based models obtain state-of-the-art performance on a variety of syntactic and semantic language tasks such as parsing, paraphrase detection, relation classification, and sentiment analysis.

Besides the good performance, the models capture interesting phenomena in language such as compositionality. For instance, the models learn different types of high-level negation and how it can change the meaning of longer phrases with many positive words. They can learn that the sentiment following a “but” usually dominates that of phrases preceding the “but.” Furthermore, unlike many other machine learning approaches that rely on human-designed feature sets, features are learned as part of the model.
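For readers unfamiliar with recursive neural networks, the core composition step builds a parent vector from its two children with a single learned layer, applied bottom-up along a parse tree. The Python/NumPy sketch below is only an illustration of that step, with made-up dimensions and random parameters, not Socher's trained models.

    import numpy as np

    d = 4                                          # toy vector dimensionality (illustration only)
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(d, 2 * d))     # composition matrix (learned in practice)
    b = np.zeros(d)

    def compose(left, right):
        """Parent vector from two child vectors: p = tanh(W [left; right] + b)."""
        return np.tanh(W @ np.concatenate([left, right]) + b)

    # toy word vectors (also learned in practice)
    vec = {w: rng.normal(scale=0.1, size=d) for w in ["not", "very", "good"]}

    # compose bottom-up along a small parse: (not (very good))
    very_good = compose(vec["very"], vec["good"])
    not_very_good = compose(vec["not"], very_good)
    print(not_very_good)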

Stephen Clark
University of Cambridge
Talk time: 2:00pm – 3:30pm

Title: A Mathematical Framework for a Compositional Distributional Model of Meaning

Abstract: In this talk I will describe a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types (based on categorial grammar). A key idea is that the meanings of functional words, such as verbs and adjectives, will be represented using tensors of various types. This mathematical framework enables us to compute the distributional meaning of a well-typed sentence from the distributional meanings of its constituents. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner-product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model.

There are two key questions that the framework leaves open: 1) what are the basis vectors of the sentence space? and 2) how can the values in the tensors be acquired? I will sketch some of the ideas we have for how to answer these questions.
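As a rough illustration of the framework: functional words are interpreted as tensors whose order reflects their grammatical type, and composition is tensor contraction. The Python/NumPy sketch below uses arbitrary dimensions and random values (an invented noun space and sentence space) to show an adjective acting as a matrix on a noun vector and a transitive verb acting as an order-3 tensor contracted with subject and object; it is a sketch of the idea, not an implementation of the framework.

    import numpy as np

    n = 5                       # toy noun-space dimensionality (illustration only)
    s = 3                       # toy sentence-space dimensionality
    rng = np.random.default_rng(1)

    cat = rng.random(n)                     # noun vector
    dogs = rng.random(n)
    black = rng.random((n, n))              # adjective: matrix mapping noun vectors to noun vectors
    chase = rng.random((s, n, n))           # transitive verb: order-3 tensor

    black_cat = black @ cat                 # adjective applied to noun
    # sentence meaning: contract the verb tensor with subject and object vectors
    sentence = np.einsum('sij,i,j->s', chase, dogs, black_cat)

    print(sentence)   # lives in the sentence space, comparable to other sentences by inner product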

Deo Colloq today, Social after

Our very own alum Ashwini Deo will be giving a colloquium today at 3:30pm in the Greenberg Room. Come on out to hear her talk “The Semantic and Pragmatic Underpinnings of Grammaticalization Paths: the progressive and the imperfective”. Afterwards there will be a social, so stick around for drinks, snacks, dinner, and progressively imperfect conversation!

In this talk I offer an analysis of a robustly attested semantic change in which progressive markers “spontaneously” emerge in languages, get entrenched in the grammatical system, and diachronically grammaticalize into imperfective markers.

Boyd-Meredith and Connor MS Project Presentations Monday

The SymSys Forum presents M.S. Project Presentations by Jonathan Tyler Boyd-Meredith and Miriam Connor. Join them on Monday, 5/13 in the Greenberg Room from 12:15–1:05 to hear the following talks:

Detecting Long-Term, Autobiographical Memories Using fMRI (Jonathan Tyler Boyd-Meredith, advised by Anthony Wagner, Psychology)
Machine learning techniques are increasingly being applied in neuroscience to interpret and make use of the data sets generated by fMRI experiments. This has led to striking results for both basic and applied research in many subfields of neuroscience, including learning and memory. In particular, there has been preliminary success in classifying previously encountered and novel stimuli as either remembered or perceived as novel by a subject. However, these experiments have frequently been limited to memory for stimuli encountered in the lab shortly before the memory test. This study investigates the performance of similar classifiers on memories for events that happen outside the lab, using images collected by a wearable camera that regularly takes still photographs, at three time intervals before the memory test (6 months, 3 months, and 2 weeks).
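The classification setting described above, labeling per-trial fMRI activity patterns as remembered versus novel, is conventionally framed as a supervised learning problem over voxel feature vectors. The sketch below is a generic scikit-learn example on synthetic data, not the study's actual pipeline, features, or classifier.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # synthetic stand-in: one row of voxel features per trial; label 1 = remembered, 0 = novel
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 500))            # 200 trials x 500 voxel features
    y = rng.integers(0, 2, size=200)           # trial labels

    clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    scores = cross_val_score(clf, X, y, cv=5)  # cross-validated classification accuracy
    print(scores.mean())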

Unsupervised Disambiguation of Preposition Senses with an LDA Topic Model (Miriam Connor, advised by Beth Levin, Linguistics)
Though it has received relatively little attention in the sense disambiguation literature, preposition sense disambiguation (PSD) represents a challenging task with important applications in machine translation and relation extraction. Most work on PSD has involved supervised systems, but only a small amount of reliable annotated data is available for preposition senses. I present an unsupervised model for PSD, which performs sense discrimination using a Latent Dirichlet Allocation model and discovers semantic relations among the prepositions with group average agglomerative clustering. I compare my system’s performance with previous work on both supervised and unsupervised PSD models and suggest future directions and applications for PSD.
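As a rough sense of the two components named in the abstract, the sketch below runs LDA over bag-of-words context features of preposition instances, takes the dominant topic as an induced sense, and then applies group-average agglomerative clustering to per-preposition topic profiles. It uses scikit-learn on invented toy data and is not the presenter's system.

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.cluster import AgglomerativeClustering

    # toy instances: the context words around each occurrence of a preposition (illustration only)
    instances = [("in", "meeting held in the morning"),
                 ("in", "keys left in the drawer"),
                 ("on", "meeting held on monday morning"),
                 ("on", "book lying on the table"),
                 ("at", "meeting held at noon"),
                 ("at", "waiting at the station")]

    contexts = [ctx for _, ctx in instances]
    X = CountVectorizer().fit_transform(contexts)          # bag-of-words context features

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    topics = lda.fit_transform(X)                          # per-instance topic mixture
    senses = topics.argmax(axis=1)                         # dominant topic as induced sense

    # represent each preposition by its mean topic mixture, then cluster with
    # group-average (average-linkage) agglomerative clustering
    preps = sorted({p for p, _ in instances})
    profiles = np.array([topics[[i for i, (p, _) in enumerate(instances) if p == q]].mean(axis=0)
                         for q in preps])
    clusters = AgglomerativeClustering(n_clusters=2, linkage="average").fit_predict(profiles)
    print(dict(zip(preps, clusters)))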