Semantic Reconstruction of Language
Researchers have introduced a new non-invasive brain decoder that can reconstruct continuous speech from brain activity measured by fMRI. Unlike previous brain-computer interfaces, this technology has the potential to decode a person’s thoughts without surgery. The discovery can also be applied to artistic research. As an example, I present the work Voice in My Head by Lauren Lee McCarthy from 2023.
In May 2023, the study “Semantic reconstruction of continuous language from non-invasive brain recordings” [1] was published, containing an extensive analysis of experimental results obtained with a thought-decoder application. This non-invasive decoder reconstructs language from semantic representations captured via functional magnetic resonance imaging (fMRI), generating “intelligible word sequences” from the recordings. A central obstacle to reading thoughts is that cortical activity measured by fMRI through blood oxygenation manifests significantly later, about 10 seconds after the thought takes the form of speech, whether spoken or internal. To bridge this gap, the authors trained models on the thought and language specifics of the studied subjects and used the GPT-1 language model for reconstruction.
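To make this concrete, below is a minimal schematic sketch, emphatically not the authors’ code, of the decoding loop the study describes: a language model proposes candidate word continuations, an encoding model predicts the BOLD response each candidate would evoke, and a beam search keeps the candidates whose predictions best match the measured data. The functions propose_words, predict_bold, and similarity are hypothetical stand-ins for the study’s trained models.

```python
# Schematic sketch of the decoding loop described in [1] (not the authors'
# code): a language model proposes continuations, an encoding model predicts
# the BOLD response each candidate would evoke, and a beam search keeps the
# candidates whose predictions best match the measured data. All three
# helper functions are hypothetical stand-ins for the study's trained models.
import numpy as np

def propose_words(prefix: str, k: int = 3) -> list:
    """Stand-in for a GPT-style language model returning k likely next words."""
    vocab = ["the", "voice", "in", "my", "head", "said"]
    rng = np.random.default_rng(abs(hash(prefix)) % 2**32)
    return list(rng.choice(vocab, size=k, replace=False))

def predict_bold(words: list, dim: int = 8) -> np.ndarray:
    """Stand-in for an encoding model mapping a word sequence to voxel responses."""
    rng = np.random.default_rng(abs(hash(" ".join(words))) % 2**32)
    return rng.normal(size=dim)

def similarity(pred: np.ndarray, measured: np.ndarray) -> float:
    """Correlation between predicted and measured voxel responses."""
    return float(np.corrcoef(pred, measured)[0, 1])

def decode(measured: np.ndarray, steps: int = 5, beam_width: int = 3) -> str:
    beams = [[]]  # each beam is one candidate word sequence
    for _ in range(steps):
        candidates = [b + [w] for b in beams for w in propose_words(" ".join(b))]
        candidates.sort(key=lambda c: similarity(predict_bold(c), measured),
                        reverse=True)
        beams = candidates[:beam_width]
    return " ".join(beams[0])

print(decode(measured=np.random.default_rng(0).normal(size=8)))
```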
Attempts to create interfaces between computers and the human brain have existed for quite some time [2, 3]. However, most of these attempts required intracranial surgical intervention. Such implementations were carried out on people with damaged speech-comprehension centers or speech impairments, with the aim of restoring or augmenting linguistic-speech functions. For instance, the study “High-performance brain-to-text communication via handwriting” [4] explored an intracortical brain-computer interface (BCI) that decodes attempted handwriting movements from the motor cortex and translates them into text in real time using a recurrent neural network. Another example is the study “Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria” [5], which explores the possibility of decoding words and sentences directly from the cortical activity of paralyzed patients, potentially representing an advance over existing assisted-communication methods. Non-invasive methods have great potential to capture a wide range of linguistic information. The study “Natural speech reveals the semantic maps that tile human cerebral cortex” [6] reports findings indicating that language is represented in areas of the cerebral cortex collectively known as the “semantic system.” The authors mapped semantic selectivity across the cortex using voxel-wise modeling of functional MRI data collected while subjects listened to hours of narrative stories, revealing a system organized into complex patterns. A new generative model was then used to create a detailed semantic atlas. The results support the hypothesis that most areas within the semantic system represent information about specific semantic domains or groups of related concepts, and the resulting atlas shows which domains are represented in individual areas [6].
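For illustration, here is a minimal sketch of voxel-wise encoding in the spirit of [6], assuming ridge regression from semantic word features to per-voxel BOLD responses; the data below are random placeholders, not the study’s stimuli or recordings.

```python
# Minimal sketch (not the authors' pipeline) of voxel-wise encoding as in [6]:
# ridge regression maps semantic features of heard words to each voxel's BOLD
# response; fitted weights then indicate each voxel's semantic selectivity.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)
n_trs, n_features, n_voxels = 300, 50, 100   # time points, semantic dims, voxels

X = rng.normal(size=(n_trs, n_features))      # word-embedding features per TR
true_w = rng.normal(size=(n_features, n_voxels))
Y = X @ true_w + rng.normal(scale=0.5, size=(n_trs, n_voxels))  # simulated BOLD

model = Ridge(alpha=10.0).fit(X[:250], Y[:250])   # fit on "training runs"
pred = model.predict(X[250:])                     # predict held-out data

# Per-voxel prediction accuracy (correlation), the usual evaluation metric
# in voxel-wise modeling studies.
r = [np.corrcoef(pred[:, v], Y[250:, v])[0, 1] for v in range(n_voxels)]
print(f"median held-out correlation: {np.median(r):.2f}")
```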
The decoder presented in the source study non-invasively scans the cortical activity of the human brain through fMRI and reconstructs continuous natural language from perceived or imagined stimuli. Natural language should be understood as language that has naturally evolved through human use [7]. To achieve this goal, it was necessary to overcome the low temporal resolution of fMRI: the blood-oxygen-level-dependent (BOLD) signal rises and falls over approximately 10 seconds. For comparison, speech in European languages proceeds at roughly 2–3 words per second [8, 9]. Within one such interval, a single fMRI image can therefore be influenced by more than 20 words. Having far more words than decodable brain images posed a conceptual problem, and its resolution through AI marked a significant advance.
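A back-of-the-envelope calculation makes the mismatch explicit, using the figures cited above:

```python
# Words falling within one BOLD response window, assuming a ~10 s window
# and the 2-3 words-per-second speaking rates cited above [8, 9].
bold_window_s = 10
for words_per_s in (2, 3):
    print(f"{words_per_s} words/s -> {bold_window_s * words_per_s} words per BOLD window")
# Output: 20 and 30 words per window, i.e. far more words than fMRI images.
```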
The interpretation of human speech from measurements of cortical oxygenation, further processed by artificial intelligence systems, represents a significant contribution to the cognitive sciences, but also to artistic research itself. Understanding how we think, and the content of our inner speech, is also important for understanding the so-called emergence of art, which initially occurs only in the artist’s mind. The cited study also opened up the possibility of measuring cortical oxygenation with a portable device. Mobile fNIRS technology is based on measuring blood oxygenation through near-infrared spectroscopy; the infrared sensors can be mounted on the surface of the skull in a headset.
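As general background not drawn from the cited study: fNIRS devices conventionally recover these oxygenation changes via the modified Beer–Lambert law, which relates the measured change in light attenuation at wavelength $\lambda$ to concentration changes of oxy- and deoxyhemoglobin:

$$\Delta A(\lambda) = \left(\varepsilon_{\mathrm{HbO_2}}(\lambda)\,\Delta[\mathrm{HbO_2}] + \varepsilon_{\mathrm{HbR}}(\lambda)\,\Delta[\mathrm{HbR}]\right) \cdot d \cdot \mathrm{DPF}(\lambda)$$

where $\varepsilon$ denotes the extinction coefficients, $d$ the source–detector separation on the scalp, and $\mathrm{DPF}$ the differential pathlength factor.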
There are likely many proposals for artistic research that could be realized through the semantic reconstruction of language using fNIRS. One applicable project is “Voice in My Head” by Kyle McDonald and Lauren Lee McCarthy [10]. This artistic research project deploys a ChatGPT-based chatbot aimed at influencing and shaping personal conversational decisions. The interaction is built on instructions the participant receives through a wireless earpiece in a so-called reception room. Symbolically, the voice from the earpiece becomes the inner voice, taking over the process of questioning and self-answering. Moreover, the chatbot is prepared to intervene in the participant’s social reality.
Keywords: semantic, reconstruction, language, non-invasive, brain, recordings, fMRI, decoder, cortical, oxygenation, GPT-1, interfaces, intracranial, BCI, neuroprosthesis, anarthria, semantic maps, voxel modeling, generative model, semantic atlas, natural language, cognitive sciences, artistic research, fNIRS, chatbot, Lauren Lee McCarthy, Voice in My Head
Voice in My Head (2023)
Source: https://lauren-mccarthy.com/Voice-In-My-Head
Author: Tomas Marusiak, 2024
REFERENCES:
1. TANG, Jerry, LEBEL, Amanda, JAIN, Shailee and HUTH, Alexander G. Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience. 1 May 2023. Vol. 26, no. 5, p. 858–866. DOI 10.1038/s41593-023-01304-9.
2. PASLEY, Brian N., DAVID, Stephen V., MESGARANI, Nima, FLINKER, Adeen, SHAMMA, Shihab A., CRONE, Nathan E., KNIGHT, Robert T. and CHANG, Edward F. Reconstructing Speech from Human Auditory Cortex. PLoS Biology. 31 January 2012. Vol. 10, no. 1, p. e1001251. DOI 10.1371/journal.pbio.1001251.
3. ANUMANCHIPALLI, Gopala K., CHARTIER, Josh and CHANG, Edward F. Speech synthesis from neural decoding of spoken sentences. Nature. 24 April 2019. Vol. 568, no. 7753, p. 493–498. DOI 10.1038/s41586-019-1119-1.
4. WILLETT, Francis R., AVANSINO, Donald T., HOCHBERG, Leigh R., HENDERSON, Jaimie M. and SHENOY, Krishna V. High-performance brain-to-text communication via handwriting. Nature. 13 May 2021. Vol. 593, no. 7858, p. 249–254. DOI 10.1038/s41586-021-03506-2.
5. MOSES, David A., METZGER, Sean L., LIU, Jessie R., ANUMANCHIPALLI, Gopala K., MAKIN, Joseph G., SUN, Pengfei F., CHARTIER, Josh, DOUGHERTY, Maximilian E., LIU, Patricia M., ABRAMS, Gary M., TU-CHAN, Adelyn, GANGULY, Karunesh and CHANG, Edward F. Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria. New England Journal of Medicine. 15 July 2021. Vol. 385, no. 3, p. 217–227. DOI 10.1056/NEJMoa2027540.
6. HUTH, Alexander G., DE HEER, Wendy A., GRIFFITHS, Thomas L., THEUNISSEN, Frédéric E. and GALLANT, Jack L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature. 27 April 2016. Vol. 532, no. 7600, p. 453–458. DOI 10.1038/nature17637.
7. LANGENDOEN, D. Terence and LYONS, John. Natural Language and Universal Grammar. Language. December 1993. Vol. 69, no. 4, p. 825. DOI 10.2307/416893.
8. CRYSTAL, Thomas H. and HOUSE, Arthur S. Articulation rate and the duration of syllables and stress groups in connected speech. The Journal of the Acoustical Society of America. 1 July 1990. Vol. 88, no. 1, p. 101–112. DOI 10.1121/1.399955.
9. LIBERMAN, A. M., COOPER, F. S., SHANKWEILER, D. P. and STUDDERT-KENNEDY, M. Perception of the speech code. Psychological Review. 1967. Vol. 74, no. 6, p. 431–461. DOI 10.1037/h0020279.
10. MCCARTHY, Lauren. Voice-In-My-Head. Online. 2023. Available from: https://lauren-mccarthy.com/Voice-In-My-Head