Why do infants learn language so fast? A reverse engineering approach

This project develops a computational model to explore how infants learn language so efficiently through statistical learning and three additional mechanisms, aiming to produce outcomes comparable to those of children's language acquisition.

Grant
€ 2.494.625
2025

Project details

Introduction

How do infants learn their first language(s)? The popular yet controversial 'statistical learning hypothesis' posits that they learn by gradually collecting statistics over their language inputs. This is strikingly similar to how current AI Large Language Models (LLMs) learn, and it suggests that simple statistical mechanisms may be sufficient to attain adult-like language competence.
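
As an illustration of what "collecting statistics over language inputs" can mean, the sketch below estimates transitional probabilities between adjacent syllables, one common way statistical learning is operationalized in the literature. It is not the project's model: the function name `transitional_probabilities` and the toy syllable stream are assumptions made up for this example.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    # Count how often each ordered pair (current, next) occurs.
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count how often each syllable occurs in "current" position.
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical toy stream built from two invented "words":
# "bi-da-ku" and "pa-do-ti", concatenated without pauses.
stream = (
    "bi da ku pa do ti bi da ku bi da ku "
    "pa do ti pa do ti bi da ku"
).split()

tp = transitional_probabilities(stream)

# Within-word transitions are fully predictable in this toy stream,
# while transitions across word boundaries are not.
print(tp[("bi", "da")])  # within a word
print(tp[("ku", "pa")])  # across a word boundary
```

In this toy stream, syllable pairs inside a word always co-occur, whereas pairs that straddle a word boundary are less predictable, so dips in transitional probability can serve as a cue to word boundaries.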

Language Input Estimates

But does it? Estimates of children's language input show that by age 3, they have received two to three orders of magnitude less data than LLMs of similar performance. The gap grows exponentially with age. Worse, when models are fed speech instead of text, they learn even more slowly.

Research Question

How are infants such efficient learners? This project tests the hypothesis that in addition to statistical learning, infants benefit from three mechanisms that accelerate their learning rate:

  1. They are born with a vocal tract, which helps them understand the link between abstract motor commands and speech sounds and decode noisy speech inputs more efficiently.
  2. They have an episodic memory enabling them to learn from unique events, instead of gradually learning from thousands of repetitions.
  3. They start with an evolved learning architecture optimized for generalization from few and noisy inputs.

Methodology

Our approach is to build a computational model of the learner (an infant simulator) which, when fed realistic language input, produces outcome measures comparable to children's (laboratory experiments, vocabulary estimates). This gives a quantitative estimate of the efficiency of each of the three mechanisms, as well as new testable predictions.

Language Focus

We start with English and French, which both have large, accessible annotated speech corpora and documented acquisition landmarks, and we focus on the first three years of life.

Community Building

We then help build similar resources across a larger set of languages by fostering a cross-disciplinary community that shares tools, data, and analysis methods.

Financial details & Timeline

Financial details

Grant amount: € 2.494.625
Total project budget: € 2.494.625

Timeline

Start date: 1-1-2025
End date: 31-12-2029
Grant year: 2025

Partners & Locations

Project partners

  • ECOLE DES HAUTES ETUDES EN SCIENCES SOCIALES (coordinator)
  • ECOLE NORMALE SUPERIEURE

Country(ies)

France

Similar projects within the European Research Council

ERC STG

MANUNKIND: Determinants and Dynamics of Collaborative Exploitation

This project aims to develop a game theoretic framework to analyze the psychological and strategic dynamics of collaborative exploitation, informing policies to combat modern slavery.

€ 1.497.749
ERC STG

Elucidating the phenotypic convergence of proliferation reduction under growth-induced pressure

The UnderPressure project aims to investigate how mechanical constraints from 3D crowding affect cell proliferation and signaling in various organisms, with potential applications in reducing cancer chemoresistance.

€ 1.498.280
ERC STG

Uncovering the mechanisms of action of an antiviral bacterium

This project aims to uncover the mechanisms behind Wolbachia's antiviral protection in insects and develop tools for studying symbiont gene function.

€ 1.500.000
ERC STG

The Ethics of Loneliness and Sociability

This project aims to develop a normative theory of loneliness by analyzing ethical responsibilities of individuals and societies to prevent and alleviate loneliness, establishing a new philosophical sub-field.

€ 1.025.860

Similar projects from other funding schemes

ERC COG

Multiple routes to memory for a second language: Individual and situational factors

This project investigates alternative routes to second language acquisition by applying memory research theories to understand individual and situational differences in learning processes.

€ 2.000.000
ERC STG

Infant verbal Memory in Development: a window for understanding language constraints and brain plasticity from birth

IN-MIND investigates the development of verbal memory in infants to understand its role in language learning, using innovative methods to identify memory capacities and intervention windows.

€ 1.499.798
ERC COG

DEep COgnition Learning for LAnguage GEneration

This project aims to enhance NLP models by integrating machine learning, cognitive science, and structured memory to improve out-of-domain generalization and contextual understanding in language generation tasks.

€ 1.999.595
ERC STG

Gates to Language

The GALA project investigates the biological mechanisms of language acquisition in humans and nonhuman species to uncover why only humans can learn language.

€ 1.490.057