How Do Infants Learn Sounds in Their Native Language?

Infants can differentiate most sounds soon after birth, and by age 1, they become language-specific listeners. But researchers are still trying to understand how babies recognize which acoustic dimensions of their language are contrastive, a linguistics term that describes differences between speech sounds that can change the meanings of words. For example, in English, [b] and [d] are contrastive, because changing the [b] in ‘ball’ to a [d] makes it into a different word, ‘doll’.

A recent paper in the Proceedings of the National Academy of Sciences (PNAS) by two computational linguists affiliated with the University of Maryland offers new insight into this question, which is central to understanding how infants learn the sounds of their native language.

Their research shows that an infant’s ability to interpret acoustic differences as either contrastive or non-contrastive may come from the contexts in which different sounds occur.

For a long time, researchers believed that there would be obvious differences between the way that contrastive sounds, such as short and long vowels in Japanese, are pronounced. However, although the pronunciations of these two sounds are different in careful speech, the acoustics are often much more ambiguous in more natural settings.

“This is one of the first phonetic learning accounts that has been shown to work on spontaneous data, suggesting that infants could be learning which acoustic dimensions are contrastive after all,” says Kasia Hitczenko, lead author of the paper.

Hitczenko graduated from the University of Maryland in 2019 with a doctorate in linguistics. She is currently a postdoctoral scholar in the Cognitive Sciences and Psycholinguistics Laboratory at École Normale Supérieure in Paris.

Hitczenko’s work suggests that babies could tell speech sounds apart using contextual cues, such as neighboring sounds. Her team tested the theory in two case studies, each with a different definition of context, comparing data from Japanese, Dutch, and French.

The researchers collected speech that occurred in different contexts and plotted the vowel durations found in each context. In Japanese, the plots differed clearly from one context to another, because some contexts had more short vowels and others had more long vowels. In French, where vowel length is not contrastive, the plots were similar across all contexts.
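
For readers who want a concrete picture of this kind of comparison, here is a minimal Python sketch. It is not the researchers' method or code: the function names, the 50-millisecond bin width, the choice of total variation distance as the comparison measure, and the duration values are all assumptions made up for illustration.

```python
from collections import Counter


def duration_histogram(durations_ms, bin_width_ms=50):
    """Return the share of vowels that fall into each duration bin."""
    counts = Counter(int(d // bin_width_ms) for d in durations_ms)
    total = len(durations_ms)
    return {bin_id: count / total for bin_id, count in counts.items()}


def histogram_distance(hist_a, hist_b):
    """Total variation distance: 0 means identical shapes, 1 means no overlap."""
    bins = set(hist_a) | set(hist_b)
    return 0.5 * sum(abs(hist_a.get(b, 0.0) - hist_b.get(b, 0.0)) for b in bins)


# Hypothetical toy durations in milliseconds (invented, not data from the study).
# "varied" mimics a language where contexts differ in their mix of short and
# long vowels; "similar" mimics a language where contexts look alike.
varied = {
    "context_1": [60, 70, 80, 90, 160],
    "context_2": [70, 150, 160, 170, 180],
}
similar = {
    "context_1": [60, 70, 80, 90, 110],
    "context_2": [65, 75, 85, 95, 120],
}

distance_varied = histogram_distance(
    duration_histogram(varied["context_1"]),
    duration_histogram(varied["context_2"]),
)
distance_similar = histogram_distance(
    duration_histogram(similar["context_1"]),
    duration_histogram(similar["context_2"]),
)

print(f"varied contexts:  {distance_varied:.2f}")
print(f"similar contexts: {distance_similar:.2f}")
```

In this sketch, a large distance between contexts stands in for the Japanese-like pattern and a distance near zero stands in for the French-like pattern. A learner that tracks such differences would have a cue that vowel duration matters in the first case but not in the second.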

“We believe this work presents a compelling account on how infants learn the speech contrasts of their language, and shows that the necessary signal is present in naturalistic speech, advancing our understanding of early language learning,” says co-author Naomi Feldman, an associate professor of linguistics with an appointment in the University of Maryland Institute for Advanced Computer Studies.

Feldman adds that the signal they studied holds true across most languages, and it’s likely that their result can be generalized to other contrasts.

The recently published research is an extension of Hitczenko’s Ph.D. thesis, which examined how to use context for phonetic learning and perception from naturalistic speech.

The work was supported in part by the National Science Foundation, including a $520K award on “Modeling the Development of Phonetic Representations” and a $240K award on “Cognitive Models of the Acquisition of Vowels in Context.”

–Original story by Maria Herd published on the University of Maryland Institute for Advanced Computer Studies website.

Published September 19, 2022