
Cibelli for P&P Workshop

Today at 12:15 in the Greenberg Room, Emily Cibelli (Berkeley) will be presenting on some of her neurolinguistic work for the Phonetics and Phonology Workshop.

Early processing pathways of words and pseudowords: Evidence from electrocorticography

Pseudowords – phonotactically legal novel forms like “blick” and “piteretion” – are common tools in studies of lexical processing. They are often compared to words, under the assumption that these novel forms isolate sub-lexical levels of processing; however, there is some debate about whether words and pseudowords utilize shared or distinct pathways at early stages of processing. Critically, the answer to this question affects the interpretation of what is being isolated in word-pseudoword comparisons.

McCloskey Colloquium Today, Social After

Come to the Greenberg Room (460-126) this afternoon for Jim McCloskey’s (UCSC) colloquium on syntax and Irish. The talk will start at 3:30, and there will be a social after!

Examining the syntax of nonfinite clauses in modern varieties of Irish reveals a pattern of variation which is intricate, rapidly shifting, and revealing about how the fundamental grammatical relations should be understood. This paper tries to better understand those patterns and to learn from them about how variation could or should be understood in theoretical terms.

Markman SymSys Forum Monday

Come to the Greenberg Room on Monday 5/20 from 12:15-1:05. Ellen Markman (Psychology) will be presenting a talk for the SymSys Forum entitled “How children generalize what they have learned: Factors that affect the scope, importance, and robustness of generalization”.

A fundamental component of learning is how to extend what was learned to new exemplars, situations, and contexts. Recent advances in the field have revealed that accumulating statistical evidence over time is only one of the factors that affect generalization. Moreover, generalization is itself multifaceted: Is the new information deemed applicable to a narrow or broad range of exemplars or situations? Is the information acquired construed as central, definitive, essential or as less important? Is the generalization robust, made with confidence, or tentative and easily revised? To sort all of this out, children rely on a variety of sources of information, including: (a) prior knowledge; (b) linguistically conveyed information such as generic versus non-generic language; and (c) other communicative and social means of conveying information such as pragmatics, intentional versus accidental actions, the pedagogical stance, and trust in testimony. I will review recent research that highlights how children navigate these complicated issues.

Edwards for Anthro brown bag Monday

Terra Edwards (Berkeley) will be presenting for the Anthropology brown bag forum on Monday 5/20 from 12-1:05 in 50-51A. The title and abstract are below. Come on by!

Language Emergence as Condensation in the Seattle Deaf-Blind Community
This paper examines the socio-genesis of a tactile language currently emerging among Deaf-Blind people in Seattle, Washington. Language emergence has been understood in recent work on signed languages as a moment when form-meaning correspondences abstract away from the contexts of their use. Language emergence in the Seattle Deaf-Blind community suggests instead that via “condensation”, the linguistic system grows dense with its history of use.

Vector Space Semantics Symposium Today

Come out to CSLI today for the Cognition & Language Workshop’s Symposium on Compositional Vector Space Semantics. The emerging field of compositional probabilistic vector space semantics for natural languages and other symbol systems is being approached from multiple perspectives: language, cognition, and engineering. This symposium aims to promote fruitful discussions of interactions between approaches, with the goal of increasing collaboration and integration. Be sure to click here if you’re planning to come, so they can estimate attendance.

Schedule of Events:

9:00 – 9:30 Light breakfast

9:30 – 11:00 Chung-chieh Shan (Indiana)
From Language Models to Distributional Semantics
Discussant: Noah Goodman

11:15 – 12:45 Richard Socher (Stanford)
Recursive Deep Learning for Modeling Semantic Compositionality
Discussant: Thomas Icard

12:45 – 2:00 Lunch

2:00 – 3:30 Stephen Clark (Cambridge)
A Mathematical Framework for a Compositional Distributional Model of Meaning
Discussant: Stanley Peters

3:45 – 5:00 Breakout Groups and Discussion

5:00 – Snacks & Beverages

Chung-chieh Shan
Indiana University
Talk time: 9:30am – 11:00am

Title: From Language Models to Distributional Semantics

Abstract: Distributional semantics represents what an expression means as a vector that summarizes the contexts where it occurs. This approach has successfully extracted semantic relations such as similarity and entailment from large corpora. However, it remains unclear how to take advantage of syntactic structure, pragmatic context, and multiple information sources to overcome data sparsity. These issues also confront language models used for statistical parsing, machine translation, and text compression.

Thus, we seek guidance by converting language models into distributional semantics. We propose to convert any probability distribution over expressions into a denotational semantics in which each phrase denotes a distribution over contexts. Exploratory data analysis led us to hypothesize that the more accurate the expression distribution is, the more accurate the distributional semantics tends to be. We tested this hypothesis on two expression distributions that can be estimated using a tiny corpus: a bag-of-words model, and a lexicalized probabilistic context-free grammar à la Collins.
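
For readers new to the area, here is a minimal sketch of the basic distributional idea mentioned in the abstract: a word is represented by counts of the words that occur near it, and two words are compared by the cosine of their count vectors. This is not the speaker's method. The toy corpus, the window size, and the function names `context_vectors` and `cosine` are all assumptions made up for illustration.

```python
from collections import Counter
import math

# Toy corpus, invented for illustration only.
CORPUS = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the bone",
    "the cat ate the fish",
]

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = context_vectors(CORPUS)
print("cat ~ dog :", cosine(vectors["cat"], vectors["dog"]))
print("cat ~ bone:", cosine(vectors["cat"], vectors["bone"]))
```

With only four sentences the counts are extremely sparse, which is a small-scale picture of the data sparsity problem the abstract raises.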

Richard Socher
Stanford University
Talk time: 11:15am – 12:45pm

Title: Recursive Deep Learning for Modeling Semantic Compositionality

Abstract: Compositional and recursive structure is commonly found in different modalities, including natural language sentences and scene images. I will introduce several recursive deep learning models that, unlike standard deep learning methods, can learn compositional meaning vector representations for phrases, sentences and images. These recursive neural network based models obtain state-of-the-art performance on a variety of syntactic and semantic language tasks such as parsing, paraphrase detection, relation classification and sentiment analysis.

Besides the good performance, the models capture interesting phenomena in language such as compositionality. For instance, the models learn different types of high level negation and how it can change the meaning of longer phrases with many positive words. They can learn that the sentiment following a “but” usually dominates that of phrases preceding the “but.” Furthermore, unlike many other machine learning approaches that rely on human designed feature sets, features are learned as part of the model.
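
To give a rough sense of what "recursive" means here, the sketch below composes word vectors bottom-up along a binary tree, using one shared layer of the form p = tanh(W [left; right] + b) at every node. It is only an illustration of that general idea, not the models from the talk: the weights are random and untrained, and the vector size `DIM`, the table `EMBED`, and the functions `compose` and `encode` are hypothetical names chosen for this example.

```python
import math
import random

DIM = 4  # size of every word and phrase vector (an arbitrary choice)
rng = random.Random(0)

# Randomly initialised, untrained parameters: W is DIM x 2*DIM, b has length DIM.
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * DIM)] for _ in range(DIM)]
b = [0.0] * DIM

# Hypothetical word vectors; a trained model would learn these.
EMBED = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in ["not", "very", "good"]}

def compose(left, right):
    """Parent vector p = tanh(W [left; right] + b)."""
    children = left + right  # list concatenation, length 2*DIM
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
        for row, bias in zip(W, b)
    ]

def encode(tree):
    """A tree is either a word (str) or a pair (left_subtree, right_subtree)."""
    if isinstance(tree, str):
        return EMBED[tree]
    left, right = tree
    return compose(encode(left), encode(right))

# "not (very good)": the phrase vector has the same size as a word vector.
phrase = encode(("not", ("very", "good")))
print(phrase)
```

Because every phrase vector has the same dimension as a word vector, the same layer can be reused at any depth of the tree, and a classifier (for sentiment, say) could in principle be attached to any node.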

Stephen Clark
University of Cambridge
Talk time: 2:00pm – 3:30pm

Title: A Mathematical Framework for a Compositional Distributional Model of Meaning

Abstract: In this talk I will describe a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types (based on categorial grammar). A key idea is that the meanings of functional words, such as verbs and adjectives, will be represented using tensors of various types. This mathematical framework enables us to compute the distributional meaning of a well-typed sentence from the distributional meanings of its constituents. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model.

There are two key questions that the framework leaves open: 1) what are the basis vectors of the sentence space? and 2) how can the values in the tensors be acquired? I will sketch some of the ideas we have for how to answer these questions.
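
As a toy illustration of the tensor idea in the abstract, the sketch below treats a transitive verb as an order-3 tensor that is contracted with a subject vector and an object vector to yield a vector in a sentence space, and then compares two sentences with an inner product. All numbers are invented, the two-dimensional noun and sentence spaces are assumptions, and the names `transitive` and `inner` are hypothetical. How to choose the sentence space and fill in the tensor values are exactly the open questions the abstract mentions, so nothing here should be read as the framework's answer.

```python
N_DIM, S_DIM = 2, 2  # toy noun-space and sentence-space sizes (assumed)

# Hypothetical noun vectors.
dogs = [1.0, 0.0]
cats = [0.0, 1.0]
mice = [0.5, 0.5]

# Hypothetical verb tensor, indexed as chase[subject][sentence][object].
chase = [
    [[0.0, 1.0], [0.2, 0.0]],
    [[0.5, 0.0], [0.0, 0.3]],
]

def transitive(subject, verb, obj):
    """Contract the verb tensor with the subject and object vectors."""
    return [
        sum(
            subject[i] * verb[i][s][j] * obj[j]
            for i in range(N_DIM)
            for j in range(N_DIM)
        )
        for s in range(S_DIM)
    ]

def inner(u, v):
    """Inner product of two vectors in the same space."""
    return sum(a * b for a, b in zip(u, v))

s1 = transitive(dogs, chase, cats)  # "dogs chase cats"
s2 = transitive(cats, chase, mice)  # "cats chase mice"
print(s1, s2, inner(s1, s2))
```

Both sentence vectors land in the same space regardless of which nouns fill the argument slots, which is what makes the inner product between them meaningful.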