The Talkidz project, funded by Fondazione CRT, aims to develop intelligent software to assist speech therapists working with children with speech disorders.
The main aim is to create a speech-to-text tool capable of transcribing children’s speech as it is actually produced, without the automatic error correction that commercial tools apply.
The Paideia Foundation has already developed software that analyses speech samples produced by typically developing children and by children with speech disorders. So far, however, transcription is still carried out manually by the operator, which makes data collection time-consuming and subjective, with a negative impact on the timeliness and objectivity of the diagnosis.
We therefore intend to integrate machine learning models into the existing software and then validate them. These models should both automate the data collection process (transcription of speech without correction of perceived errors) and provide intelligent suggestions in the post-assessment phase to inform the habilitation-rehabilitation pathway.
For these machine learning models to be trained effectively, it is necessary to collect a normative set of speech samples, elicited with purpose-made illustrations that stimulate production of the entire phonological inventory of the Italian language.
The final objective is to provide software with an interface where the speech therapist can upload an audio file in .wav format and receive as output the phonetic transcription, together with a set of linguistic analyses (lexical, morphological, syntactic), in order to support the diagnosis and therapy of the patient.