Mireia Farrús: “Technology can normalise a language like Catalan”
Mireia Farrús Cabeceran works in two fields that may not seem related at first glance, but which can be interconnected. Farrús studied physics and linguistics and now focuses her research on speech technology and prosody, specifically in Catalan. She says that “the voice is a reflection of the soul”, and her own voice conveys serenity and a teacher’s clarity as she discusses computational linguistics, a complex subject with many applications: accessibility, the detection of illnesses and the normalisation of Catalan in the digital sphere.
You studied physics and then linguistics. How do you bring these two fields together?
I always liked mathematics and language. I loved logic and syntactic structures, the formalisms with which we can describe the world. I hesitated a lot between mathematics and philology, and in the end my physics teacher in my final year of secondary school, who was very good, led me to choose physics. I had also studied music for many years, and I loved acoustics: understanding how sound is transmitted, speech, phonetics, and so on. And so I ended up studying physics and linguistics, which for me have a lot in common, and doing a PhD on speaker recognition, whose aim is to automatically identify people using their voice.
And then you turned to computational linguistics. What is it?
Computational linguistics is the processing of language using computers. We develop language-related tools that facilitate communication between us or with machines. Machine translation, for example, is one of the star applications of computational linguistics. Nowadays, voice-operated applications are becoming increasingly popular, such as speech recognition, which attempts to simulate human understanding computationally: you speak and a machine understands and transcribes it; and speech synthesis, which simulates the reverse process: you provide a text and the machine reads it. These technologies have allowed us to develop speech-enabled virtual assistants, such as Siri or Alexa, which, in addition to understanding what you say, can establish a dialogue using the information they have in storage.
Can speech synthesis and speech recognition be applied to accessibility issues, which are some of the great social challenges of this decade?
Undoubtedly. It is essential to work on the social component of these technologies. Both speech recognition and speech synthesis can also be applied to the specific needs of a person. For a person with a hearing impairment, it may be useful to use a speech recogniser and have in writing what is being said so that they can read it, while for a visually impaired person, having a machine that reads newspapers makes things easier. It is very important that these systems work, and above all that they are accessible to all and adapted to the needs of each user group. It is also important for the voices generated by these systems not to be robotic, but rather expressive and as close as possible to the human voice. Expressiveness and authenticity are sometimes difficult to achieve, and we research the characteristics of the human voice in terms of pitch, prosody, expression, etc. in order to incorporate them into these systems.
You also work with medical professionals to detect clinical disorders through speech.
Yes, it is sometimes said that the voice is a reflection of the soul. When we speak we communicate many things along with the message we want to convey: what we are like, our age and gender, whether we are tired, happy, or worried, among other things. Our voice can reflect both our emotional state and our physical condition, and therefore it can help us detect possible illnesses, both physical and emotional. In addition to the voice itself, which has certain characteristics, speech also reveals much through the way in which we communicate linguistic content: vocabulary, syntactic and discursive constructions, intonation, etc. And this is where linguistics plays an important role.
And what can you detect through speech?
On a physical level, diseases such as COPD (chronic obstructive pulmonary disease), for example, have an impact on the characteristics of the voice, but also on speech. Language is a very important cognitive process, and is therefore connected to cognitive disorders or emotional changes, such as bipolar disorder or Alzheimer’s disease. For instance, a depressed person will speak with a much flatter intonation. This is a phenomenon also observed in people suffering from Alzheimer’s, who also use a more generic vocabulary, or particular syntactic structures. Speech allows us to detect the disease early by means of machine learning algorithms and to act quickly, as well as to monitor its evolution without the patient having to see a doctor every week for an exhaustive check-up. It should be borne in mind, though, that we provide a warning indicator, an aid to physicians, and not a diagnosis.
Not only do we do research on these topics, but we also work directly with companies to implement real-time applications.
A few months ago, the Plataforma per la Llengua warned that Catalan had lost over half a million speakers in the last 16 years. How can technology help to reverse this situation?
Technology can normalise a language like Catalan. Nowadays, when everything is digital, if a language does not jump on the digital bandwagon, it has no chance. It is not only audiovisual products that need to be in Catalan, but also video games and other computer applications. In the case of video games and other applications, which tend to use a reduced vocabulary, it is as easy as automatically translating commands and implementing a speech synthesiser when necessary. If we develop the necessary technologies and make them open and free, with institutional support, companies will find it much easier to implement Catalan in all their applications. Another example: manual captioning is very expensive, but if you have a speech recognition system and effective automatic translation into Catalan in place, you can caption thousands of films and TV series. However, all this must go hand in hand with the social use of the language, otherwise it will be useless.
There is the handicap of being a minority language in a context of diglossia.
Catalan is in a minority position; there is a dominant language that is taking over, and Catalan is in danger of becoming redundant. As far as the computational resources of Catalan are concerned, this context could also be an opportunity. Current systems are based on neural networks, on very powerful mathematical algorithms. The problem is that they need a lot of data, millions of data points. English, Chinese, Spanish and a few other languages can afford it. Now there are algorithms that make it possible to adapt a system built for a language with a lot of data to a language with less data, and the more closely the two languages resemble each other linguistically, the better. We must take advantage of all these advances to promote technologies in minority and minoritised languages.
More about Mireia Farrús
The best invention in history
Without a doubt, life-saving inventions, such as penicillin. As a physicist I would add the telescope, because it was the key instrument that marked the beginning of modern science. And as a linguist, I like the printing press, which democratised books and brought about a revolution against the power of the Church.
What would you like to see in the future?
I would like to see more collaboration between researchers and less egotism in universities. I would also like to see more recognition for teaching and not just for research. In the field of linguistics, I would like to see greater recognition for languages, especially minority languages, because they are a treasure for humanity.
A future advancement that scares you
Anything that has to do with artificial intelligence and its use. There will come a time when a machine will be able to reproduce all the characteristics of a human being, not just our voice, and a lot of ethical questions will arise.
A role model
I will mention three classics. Galileo as a great communicator, because he began to publish in Italian at a time when all science was in Latin. Einstein because he represents an entirely different dimension of thought, and Maria Skłodowska (Marie Curie) as a female role model.
The FBG is…
It has been such a discovery! A bridge to reality, and a fantastic team that supports you in everything you need.