| Time: | Thursday 13:30 | Place: | 302 | Type: | Oral |
| Chair: | Keiichi Tajima | ||||
| 13:30 | Modelling the effect of speaker familiarity and noise on infant word recognition |
| (Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands; International Max Planck Research School for Language Sciences, Radboud University Nijmegen, The Netherlands) (Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands) (Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands) | |
| In the present paper we show that a general-purpose word learning model can simulate several important findings from recent experiments in language acquisition. Both the addition of background noise and varying the speaker have been found to influence infants' performance during word recognition experiments. We were able to replicate this behaviour in our artificial word learning agent. We use the results to discuss both advantages and limitations of computational models of language acquisition. | |
| 13:50 | Unsupervised Learning of Vowels from Continuous Speech based on Self-organized Phoneme Acquisition Model |
| (Graduate School of Human Sciences, Waseda University, Japan) (Graduate School of Human Sciences, Waseda University, Japan) (RIKEN Brain Science Institute, Japan) | |
| All normal humans acquire their native phoneme system naturally. However, it is unclear how infants learn the acoustic realization of each phoneme of their language. Recent studies have investigated phoneme acquisition using computational models. However, these studies used read speech with a limited vocabulary as input and did not handle continuous speech. We therefore use natural speech and build a self-organizing model that simulates this cognitive ability, and we analyze the information that is necessary for acquiring native vowels. Our model is designed to learn from natural continuous utterances and to estimate the number and boundaries of the vowel categories. In a simulation, we investigate the relationship between the amount of learning and vowel accuracy for a single Japanese speaker’s speech. We find that the vowel recognition rate of our model is comparable to that of an adult. | |
| 14:10 | Learning speaker normalization using semisupervised manifold alignment |
| (Department of Linguistics, The Ohio State University, Columbus, OH, USA) (Department of Linguistics, The Ohio State University, Columbus, OH, USA) (Dept. of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA) (Dept. of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA) (Department of Speech-Language-Hearing Sciences, University of Minnesota, USA) | |
| As a child acquires language, he or she: perceives acoustic information in his or her surrounding environment; identifies portions of the ambient acoustic information as language-related; and associates that language-related information with his or her perception of his or her own language-related acoustic productions. The present work models the third task. We use a semisupervised alignment algorithm based on manifold learning. We discuss the concepts behind this approach, and the application of the algorithm to this task. We present experimental evidence indicating the usefulness of manifold alignment in learning speaker normalization. | |
| 14:30 | Fully Unsupervised Word Learning from Continuous Speech Using Transitional Probabilities of Atomic Acoustic Events |
| (Department of Signal Processing and Acoustics, Aalto University School of Science and Technology, Finland) | |
| This work presents a learning algorithm based on transitional probabilities of atomic acoustic events. The algorithm learns models for word-like units in speech without any supervision, and without a priori knowledge of phonemic or linguistic units. The learned models can be used to segment novel utterances into word-like units, supporting the theory that transitional probabilities of acoustic events could work as a bootstrapping mechanism of language learning. The performance of the algorithm is evaluated using a corpus of Finnish infant-directed speech. | |
| 14:50 | Language acquisition and cross-modal associations - computational simulation of the results of infant studies |
| (Radboud University Nijmegen) (Radboud University Nijmegen) | |
| This paper discusses recent results obtained with a computational model of language acquisition. This model, developed in the ACORNS project, has been shown to be able to learn word-like units from stimuli in which utterances are paired with visual information. In this paper we extend the ACORNS experiments to ambiguous stimuli, so as to obtain a computational correlate of the findings reported by Smith and Yu in 2008. Smith and Yu stipulate that a young infant is confronted with an uncertainty problem: how to pair a word, embedded in a sentence, with a referent, embedded in a rich visual scene. They show that young infants can resolve the uncertainty problem by evaluating the statistical evidence across many individually ambiguous words and scenes. We investigate to what extent the ACORNS model is able to deal with cross-modal ambiguity. Moreover, we show the positive effect of an 'active' role during learning, based on internal confidence, when the learner is confronted with ambiguity. | |
| 15:10 | Active word learning under uncertain input conditions |
| (Radboud University Nijmegen, International Max Planck Research School for Language Sciences, Nijmegen) (Radboud University Nijmegen) (Radboud University Nijmegen) | |
| In this paper we investigate a cognitively plausible computational model of word learning. The model is partly trained on incorrect form-referent pairings, modelling the input to a word-learning child, which may contain such mismatches due to inattention to a joint communicative scene. We introduce a procedure of active learning, based on attested cognitive processes. We then show how this procedure can help overcome the unreliability of the input by detecting and correcting the mismatches, relying on previously built-up experience. |