Workshop (10-12 June 2024)
Artificial Neural Networks (ANNs) have proven to be powerful learning devices for language-related tasks, as demonstrated by recent progress in artificial intelligence driven by large, Transformer-based language models. But how can ANNs inform us about human language learning and processing? Our three-day workshop brings together researchers working on cognitively motivated and linguistic questions about the language processing mechanisms and learning trajectories of ANNs.
For the first two days of the programme, we hope to stimulate discussion on the workshop theme through contributed presentations from our workshop participants and keynote speakers. The final day is focussed on active interactions and collaboration between participants, through small-scale tutorials and joint group work on a collaborative task. See our provisional programme below for more information and currently confirmed participants!
Registration
Registration is now closed, and we have reached our maximum capacity for in-person participation. We have opened a form to sign up for the wait list. Please note that spaces may only become available on very short notice before the workshop.
Streaming
Items marked on our programme will be streamed via Zoom. Please leave your e-mail address in this form if you would like to receive the link!
Organizers
Tamar Johnson (t.johnson@uva.nl)
Marianne de Heer Kloots (m.l.s.deheerkloots@uva.nl)
Venue
Institute for Logic, Language and Computation
SustainaLab event space, Amsterdam Science Park campus, University of Amsterdam
Keynote speakers:
Arianna Bisazza (University of Groningen): Can modern Language Models be truly polyglot? Language learnability and inequalities in NLP
Despite their impressive advances, modern Language Models (LMs) are still far from reaching language equality, i.e. comparable performance in all languages. The uneven amount of data available in different languages is often recognized as the main culprit. However, another obstacle to language equality is posed by the observation that some languages are intrinsically more difficult to model than others with modern LM architectures, even when training data size is controlled for. In this talk, I will present evidence supporting this observation, drawn from different tasks and different evaluation methodologies (e.g. using natural versus synthetic languages). I will then argue for the usefulness of artificial languages in unravelling the complex interplay between language properties and learnability by neural networks. Finally, I will provide an outlook on my upcoming project aimed at improving language modeling for (low-resource) morphologically rich languages, taking inspiration from child language acquisition.
Eva Portelance (Mila; HEC Montréal): What neural networks can teach us about how we learn language
How can modern neural networks like large language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved by leaps and bounds, with models like GPT-4, the question of how they can inform our understanding of human language acquisition has re-emerged. This talk addresses how AI models as objects of study can indeed be useful tools for understanding how humans learn language. It will present three approaches for studying human learning behaviour using different types of neural networks and experimental designs, each illustrated through a specific case study. Understanding how humans learn is an important problem for cognitive science and a window into how our minds work. Additionally, human learning is in many ways the most efficient and effective algorithm there is for learning language; understanding how humans learn can help us design better AI models in the future.
Ethan Wilcox (ETH Zürich): Using artificial neural networks to study human language processing: Two case studies and a warning
Neural network language models are pure prediction engines: they have no communicative intent, and they do not learn language through social interactions. Despite this, I argue that they can be used to study human language processing, in particular, to empirically evaluate theories that are based on probability distributions over words. In the first half of this talk, I discuss two case studies in this vein, focusing on psycholinguistic theories of incremental processing, as well as regressions, or backward saccades between words. In the second half of the talk, I take a step back and discuss the impact of scaling on the usefulness of ANNs in psycholinguistics research. Scaling is the trend toward producing ever-larger models, both in terms of parameter counts and in terms of the amount of data they are trained on. While largely beneficial to performance on downstream benchmarking tasks, scaling has several downsides for computational psycholinguistics. I will discuss the scientific and practical challenges presented by scaling for neural network modeling, as well as the benefits that would result from human-scale language modeling research.
Programme
| ▾ Monday June 10th: First day of keynote, talks and posters | |
|---|---|
| 13.30 - 14.00 | Walk-in & Registration |
| 14.00 - 14.15 | Opening |
| 14.15 - 15.15 | Keynote lecture by Arianna Bisazza (University of Groningen): Can modern Language Models be truly polyglot? Language learnability and inequalities in NLP |
| 15.15 - 15.45 | Talk by Tessa Verhoef (Leiden University): The emergence of language universals in neural agents and vision-and-language models |
| 15.45 - 16.05 | Break |
| 16.05 - 16.35 | Talk by Lukas Galke (Max Planck Institute for Psycholinguistics): Emergent communication and learning pressures in language models |
| 16.35 - 18.00 | Poster session (see list of posters) |
| 19.00 - 21.00 | Workshop dinner |
| ▾ Tuesday June 11th: Second day of keynotes, talks and discussions | |
|---|---|
| 09.45 - 10.45 | Keynote lecture by Eva Portelance (Mila; HEC Montréal): What neural networks can teach us about how we learn language |
| Session on language learning | |
| 10.45 - 11.15 | Talk by Raquel G. Alhama (University of Amsterdam): The Development of a Syntactic Category: Modeling Determiner Productivity in English-speaking Children |
| 11.15 - 11.45 | Break |
| 11.45 - 12.15 | Talk by Kyle Mahowald (University of Texas at Austin): ANNs and AANNs: Linguistic Insights from Language Models |
| 12.15 - 12.45 | Talk by Yevgen Matusevych (University of Groningen): Mutual exclusivity bias in visually grounded speech models |
| 12.45 - 13.15 | Discussion on using ANNs in modelling language learning. Moderator: Jelle Zuidema (University of Amsterdam) |
| 13.15 - 14.30 | Lunch |
| 14.30 - 15.30 | Keynote lecture by Ethan Wilcox (ETH Zürich): Using artificial neural networks to study human language processing: Two case studies and a warning |
| Session on language processing | |
| 15.30 - 16.00 | Talk by Irene Winther (University of Edinburgh): Cumulative frequency can explain cognate facilitation in language models |
| 16.00 - 16.20 | Break |
| 16.20 - 16.50 | Talk by Pierre Orhan (École Normale Supérieure): Algebraic structures emerge from the self-supervised learning of natural sounds |
| 16.50 - 17.20 | Position statements by Micha Heilbron (University of Amsterdam) and Stefan Frank (Radboud University Nijmegen) |
| 17.20 - 17.50 | Discussion on using ANNs in modelling language processing. Moderator: Stefan Frank (Radboud University) |
| 17.50 - 18.00 | Closing |
| ▾ Wednesday June 12th: Interactions on learning trajectories in smaller- and larger-scale models | |
|---|---|
| 09.45 - 10.00 | Introduction to the tutorials and collaborative tasks |
| 10.00 - 10.30 | First tutorial by Oskar van der Wal & Marianne de Heer Kloots (University of Amsterdam): What's in a developmental phase? Training dynamics & Behavioural characterizations of grammar learning |
| 10.30 - 11.30 | Second tutorial by Henry Conklin (University of Edinburgh): Language learning as regularization: A non-parametric probing method for studying the emergence of structured representations over model training |
| 11.30 - 12.15 | Split up in groups & brainstorm |
| 12.15 - 13.15 | Group work on collaborative tasks |
| 13.15 - 14.15 | Lunch |
| 14.15 - 16.15 | Group work on collaborative tasks |
| 16.15 - 17.30 | Discussion of findings |
| 17.30 | Drinks |
List of posters
- Georgia Carter (University of Edinburgh): Predicting long context effects using surprisal
- Victor Zimmermann (Göttingen University): Compositional phrase embeddings from latent Tree-LSTM representations
- Michelle Suijkerbuijk (Radboud University): How (not) to test the syntactic knowledge in artificial neural networks: the case of island constraints
- Michael Hanna (University of Amsterdam): Building Mechanistic Bridges Between Biological and Artificial Neural Networks
- Dota Dong (Max Planck Institute for Psycholinguistics): Multimodal Video Transformers Partially Align with Multimodal Grounding and Compositionality in the Brain
- Joséphine Raugel (École Normale Supérieure, Meta AI): Decoding of Hierarchical Inference in the Human Brain during Speech Processing with Large Language Models
- Shangmin Guo (University of Edinburgh): Cultural evolution in the age of generative AI
- Tom Kouwenhoven (Leiden University): The Curious Case of Representational Alignment: Unravelling Visio-Linguistic Tasks in Emergent Communication
- Jaap Jumelet (University of Amsterdam): Do Language Models Exhibit Human-like Structural Priming Effects?
- Henry Conklin (University of Edinburgh): Representations as Language: An Information Theoretic Approach to Interpretability
- Polina Tsvilodub (University of Tübingen): Models of pragmatic language use with scaffolded LLMs
- Torrey Snyder (University of Amsterdam): Encoding Semantic Scene Descriptions: Cross-Modal Representations from Language to Vision
- Nitya Shah (University of Amsterdam): Social Reasoning (Theory of Mind) in Language-and-Vision Models and Humans
- Matthias Brucklacher (University of Amsterdam): Prediction as a mechanism for cognitive representation learning
- Jelke Bloem (University of Amsterdam): Using Collostructional Analysis to evaluate BERT’s representation of linguistic constructions
- Dang Thi Thao Anh (Radboud University): Harnessing Cross-lingual Morphological Generalization Abilities in Large Language Models with a Multilingual Wug Test
Acknowledgements
This workshop is supported by and organized as part of the Language in Interaction consortium (NWO Gravitation Grant 024.001.006). We are also very thankful to the SustainaLab for lending us their space, to Jelle Zuidema, Robert van Rooij and the ILLC office for organizational advice and support, and to Maithe van Noort for on-site assistance!