Workshop (10-12 June 2024)
Artificial Neural Networks (ANNs) have proven to be powerful learning devices for language-related tasks, as demonstrated by recent progress in artificial intelligence driven by large, Transformer-based language models. But how can ANNs inform us about human language learning and processing? Our three-day workshop brings together researchers working on cognitively motivated and linguistic questions about the language processing mechanisms and learning trajectories of ANNs.
For the first two days of the programme, we hope to stimulate discussion on the workshop theme through contributed presentations from our workshop participants and keynote speakers. The final day is focussed on active interactions and collaboration between participants, through small-scale tutorials and joint group work on a collaborative task. See our provisional programme below for more information and currently confirmed participants!
Registration
Registration is now closed, and we have reached our maximum capacity for in-person participation. We have opened a form to sign up for the wait list. Please note that spaces may only become available on very short notice before the workshop.
Streaming
Items marked on our programme will be streamed via Zoom. Please leave your e-mail address in this form if you would like to receive the link!
Organizers
Tamar Johnson (t.johnson@uva.nl)
Marianne de Heer Kloots (m.l.s.deheerkloots@uva.nl)
Venue
Institute for Logic, Language and Computation
SustainaLab event space, Amsterdam Science Park campus, University of Amsterdam
Keynote speakers:
Arianna Bisazza (University of Groningen): Can modern Language Models be truly polyglot? Language learnability and inequalities in NLP
Despite their impressive advances, modern Language Models (LMs) are still far from reaching language equality, i.e. comparable performance in all languages. The uneven amount of data available in different languages is often recognized as the main culprit. However, another obstacle to language equality is posed by the observation that some languages are intrinsically more difficult to model than others with modern LM architectures, even when training data size is controlled for. In this talk, I will present evidence supporting this observation, drawn from different tasks and different evaluation methodologies (e.g. using natural versus synthetic languages). I will then argue for the usefulness of artificial languages in unravelling the complex interplay between language properties and learnability by neural networks. Finally, I will provide an outlook on my upcoming project aimed at improving language modeling for (low-resource) morphologically rich languages, taking inspiration from child language acquisition.
Eva Portelance (Mila; HEC Montréal): What neural networks can teach us about how we learn language
How can modern neural networks like large language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved by leaps and bounds, with models like GPT-4, the question of how they can inform our understanding of human language acquisition has re-emerged. This talk addresses how AI models as objects of study can indeed be useful tools for understanding how humans learn language. It will present three approaches for studying human learning behaviour using different types of neural networks and experimental designs, each illustrated through a specific case study. Understanding how humans learn is an important problem for cognitive science and a window into how our minds work. Additionally, human learning is in many ways the most efficient and effective algorithm there is for learning language; understanding how humans learn can help us design better AI models in the future.
Ethan Wilcox (ETH Zürich): Using artificial neural networks to study human language processing: Two case studies and a warning
Neural network language models are pure prediction engines: they have no communicative intent, and they do not learn language through social interactions. Despite this, I argue that they can be used to study human language processing, in particular, to empirically evaluate theories that are based on probability distributions over words. In the first half of this talk, I discuss two case studies in this vein, focusing on psycholinguistic theories of incremental processing, as well as regressions, or backward saccades between words. In the second half of the talk, I take a step back and discuss the impact of scaling on the usefulness of ANNs in psycholinguistics research. Scaling is the trend toward producing ever-larger models, both in terms of parameter counts and in terms of the amount of data they are trained on. While largely beneficial to performance on downstream benchmarking tasks, scaling has several downsides for computational psycholinguistics. I will discuss the scientific and practical challenges presented by scaling for neural network modeling, as well as the benefits that would result from human-scale language modeling research.
Programme
| ▾ Monday June 10th: First day of keynote, talks and posters | |
|---|---|
| 13.30 - 14.00 | Walk-in & Registration |
| 14.00 - 14.15 | Opening |
| 14.15 - 15.15 | Keynote lecture by Arianna Bisazza (University of Groningen): Can modern Language Models be truly polyglot? Language learnability and inequalities in NLP |
| 15.15 - 15.45 | Talk by Tessa Verhoef (Leiden University): The emergence of language universals in neural agents and vision-and-language models |
| 15.45 - 16.05 | Break |
| 16.05 - 16.35 | Talk by Lukas Galke (Max Planck Institute for Psycholinguistics): Emergent communication and learning pressures in language models |
| 16.35 - 18.00 | Poster session (see list of posters) |
| 19.00 - 21.00 | Workshop dinner |
| ▾ Tuesday June 11th: Second day of keynotes, talks and discussions | |
|---|---|
| 09.45 - 10.45 | Keynote lecture by Eva Portelance (Mila; HEC Montréal): What neural networks can teach us about how we learn language |
| Session on language learning | |
| 10.45 - 11.15 | Talk by Raquel G. Alhama (University of Amsterdam): The Development of a Syntactic Category: Modeling Determiner Productivity in English-speaking Children |
| 11.15 - 11.45 | Break |
| 11.45 - 12.15 | Talk by Kyle Mahowald (University of Texas at Austin): ANNs and AANNs: Linguistic Insights from Language Models |
| 12.15 - 12.45 | Talk by Yevgen Matusevych (University of Groningen): Mutual exclusivity bias in visually grounded speech models |
| 12.45 - 13.15 | Discussion on using ANNs in modelling language learning. Moderator: Jelle Zuidema (University of Amsterdam) |
| 13.15 - 14.30 | Lunch |
| 14.30 - 15.30 | Keynote lecture by Ethan Wilcox (ETH Zürich): Using artificial neural networks to study human language processing: Two case studies and a warning |
| Session on language processing | |
| 15.30 - 16.00 | Talk by Irene Winther (University of Edinburgh): Cumulative frequency can explain cognate facilitation in language models |
| 16.00 - 16.20 | Break |
| 16.20 - 16.50 | Talk by Pierre Orhan (École Normale Supérieure): Algebraic structures emerge from the self-supervised learning of natural sounds |
| 16.50 - 17.20 | Position statements by Micha Heilbron (University of Amsterdam) and Stefan Frank (Radboud University Nijmegen) |
| 17.20 - 17.50 | Discussion on using ANNs in modelling language processing. Moderator: Stefan Frank (Radboud University) |
| 17.50 - 18.00 | Closing |
| ▾ Wednesday June 12th: Interactions on learning trajectories in smaller- and larger-scale models | |
|---|---|
| 09.45 - 10.00 | Introduction to the tutorials and collaborative tasks |
| 10.00 - 10.30 | First tutorial by Oskar van der Wal & Marianne de Heer Kloots (University of Amsterdam): What's in a developmental phase? Training dynamics & Behavioural characterizations of grammar learning |
| 10.30 - 11.30 | Second tutorial by Henry Conklin (University of Edinburgh): Language learning as regularization: A non-parametric probing method for studying the emergence of structured representations over model training |
| 11.30 - 12.15 | Split up in groups & brainstorm |
| 12.15 - 13.15 | Group work on collaborative tasks |
| 13.15 - 14.15 | Lunch |
| 14.15 - 16.15 | Group work on collaborative tasks |
| 16.15 - 17.30 | Discussion of findings |
| 17.30 | Drinks |
List of posters
- Georgia Carter (University of Edinburgh): Predicting long context effects using surprisal
- Victor Zimmermann (Göttingen University): Compositional phrase embeddings from latent Tree-LSTM representations
- Michelle Suijkerbuijk (Radboud University): How (not) to test the syntactic knowledge in artificial neural networks: the case of island constraints
- Michael Hanna (University of Amsterdam): Building Mechanistic Bridges Between Biological and Artificial Neural Networks
- Dota Dong (Max Planck Institute for Psycholinguistics): Multimodal Video Transformers Partially Align with Multimodal Grounding and Compositionality in the Brain
- Joséphine Raugel (École Normale Supérieure, Meta AI): Decoding of Hierarchical Inference in the Human Brain during Speech Processing with Large Language Models
- Shangmin Guo (University of Edinburgh): Cultural evolution in the age of generative AI
- Tom Kouwenhoven (Leiden University): The Curious Case of Representational Alignment: Unravelling Visio-Linguistic Tasks in Emergent Communication
- Jaap Jumelet (University of Amsterdam): Do Language Models Exhibit Human-like Structural Priming Effects?
- Henry Conklin (University of Edinburgh): Representations as Language: An Information Theoretic Approach to Interpretability
- Polina Tsvilodub (University of Tübingen): Models of pragmatic language use with scaffolded LLMs
- Torrey Snyder (University of Amsterdam): Encoding Semantic Scene Descriptions: Cross-Modal Representations from Language to Vision
- Nitya Shah (University of Amsterdam): Social Reasoning (Theory of Mind) in Language-and-Vision Models and Humans
- Matthias Brucklacher (University of Amsterdam): Prediction as a mechanism for cognitive representation learning
- Jelke Bloem (University of Amsterdam): Using Collostructional Analysis to evaluate BERT’s representation of linguistic constructions
- Dang Thi Thao Anh (Radboud University): Harnessing Cross-lingual Morphological Generalization Abilities in Large Language Models with a Multilingual Wug Test
Acknowledgements
This workshop is supported by and organized as part of the Language in Interaction consortium (NWO Gravitation Grant 024.001.006). We are also very thankful to the SustainaLab for lending us their space, to Jelle Zuidema, Robert van Rooij and the ILLC office for organizational advice and support, and to Maithe van Noort for on-site assistance!