Auteur : Dagenais, Mylène

Séminaire DIC-ISC-CRIA – 29 février 2024 par Alessandro LENCI

Alessandro LENCI – 29 février 2024

TITRE: The Grounding Problem in Language Models is not only about Grounding

RÉSUMÉ:

The Grounding Problem is typically assumed to concern the lack of referential competence of AI models. Language Models (LMs) that are trained only on texts, without direct access to the external world, are indeed rightly regarded as affected by this limitation: they are ungrounded. Multimodal LMs, on the other hand, do have extralinguistic training data and show important abilities to link language with the visual world. In my talk, I will argue that incorporating multimodal data is a necessary but not sufficient condition for properly addressing the Grounding Problem. When applied to statistical models based on distributional co-occurrences, like LMs, the Grounding Problem should be reformulated in a more extensive way, which poses an even greater challenge for current data-driven AI models.
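
As a minimal illustration of what "statistical models based on distributional co-occurrences" involves, here is a toy sketch (illustrative only; the corpus, window size, and similarity measure are invented for the example and are not taken from Lenci's work):

    # Toy distributional semantics: word "meaning" as co-occurrence statistics.
    from collections import Counter, defaultdict
    from math import sqrt

    corpus = [
        "the cat chased the mouse",
        "the dog chased the cat",
        "the mouse ate the cheese",
        "the dog ate the bone",
    ]

    window = 2
    cooc = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i, w in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    cooc[w][tokens[j]] += 1

    def cosine(u, v):
        shared = set(u) & set(v)
        dot = sum(u[k] * v[k] for k in shared)
        norm = lambda c: sqrt(sum(x * x for x in c.values()))
        return dot / (norm(u) * norm(v)) if u and v else 0.0

    # "cat" and "dog" come out as similar purely from their textual contexts,
    # with no link to any actual cat or dog: the sense in which such a model
    # is ungrounded.
    print(cosine(cooc["cat"], cooc["dog"]))     # higher
    print(cosine(cooc["cat"], cooc["cheese"]))  # lower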

BIOGRAPHIE:

Alessandro LENCI is Professor of Linguistics and Director of the Computational Linguistics Laboratory (CoLing Lab) at the University of Pisa. His main research interests are computational linguistics, natural language processing, semantics, and cognitive science.

Lenci, A., & Sahlgren, M. (2023). Distributional Semantics. Cambridge: Cambridge University Press.

Lenci, A. (2018). Distributional models of word meaning. Annual Review of Linguistics, 4, 151-171.

Lenci, A. (2023). Understanding Natural Language Understanding Systems: A Critical Analysis. Sistemi Intelligenti; arXiv preprint arXiv:2303.04229.

Lenci, A., & Padó, S. (2022). Perspectives for natural language processing between AI, linguistics and cognitive science. Frontiers in Artificial Intelligence, 5, 1059998.

Séminaire DIC-ISC-CRIA – 22 février 2024 par Gary LUPYAN

Gary LUPYAN – 22 février 2024

TITRE : What counts as understanding?

RÉSUMÉ:

The question of what it means to understand has taken on added urgency with the recent leaps in capabilities of generative AI such as large language models (LLMs). Can we really tell from observing the behavior of LLMs whether some notion of understanding underlies that behavior? What kinds of successes are most indicative of understanding, and what kinds of failures are most indicative of a failure to understand? If we applied the same standards to our own behavior, what might we conclude about the relationship between understanding, knowing, and doing?

BIOGRAPHIE:

Gary Lupyan is Professor of Psychology at the University of Wisconsin-Madison. His work focuses on how natural language scaffolds and augments human cognition and attempts to answer the question of what the human mind would be like without language. He also studies the evolution of language and the ways that language adapts to the needs of its learners and users.

Liu, E., & Lupyan, G. (2023). Cross-domain semantic alignment: Concrete concepts are more abstract than you think. Philosophical Transactions of the Royal Society B. DOI: 10.1098/rstb.2021.0372

Duan, Y., & Lupyan, G. (2023). Divergence in Word Meanings and its Consequence for Communication. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 45, No. 45).

van Dijk, B. M. A., Kouwenhoven, T., Spruit, M. R., & van Duijn, M. J. (2023). Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding (arXiv:2310.19671). arXiv.

Aguera y Arcas, B. (2022). Do large language models understand us? Medium.

Titus, L. M. (2024). Does ChatGPT have semantic understanding? A problem with the statistics-of-occurrence strategy. Cognitive Systems Research, 83.

Pezzulo, G., Parr, T., Cisek, P., Clark, A., & Friston, K. (2024). Generating meaning: Active inference and the scope and limits of passive AI. Trends in Cognitive Sciences, 28(2), 97–112.

Séminaire DIC-ISC-CRIA – 1er février 2024 par Robert GOLDSTONE

Robert GOLDSTONE – 1er février 2024

TITRE : Learning Categories by Creating New Descriptions

RÉSUMÉ:

In Bongard problems, problem-solvers must come up with a rule for distinguishing visual scenes that fall into two categories, given only a handful of examples of each category. This requires the open-ended creation of new descriptions. Physical Bongard Problems (PBPs) additionally require perceiving and predicting the spatial dynamics of the scenes. We compare the performance of a new computational model (PATHS) to human performance. During continual perception of new scene descriptions over the course of category learning, hypotheses are constructed by combining descriptions into rules for distinguishing the categories. Spatially or temporally juxtaposing similar scenes promotes category learning when the scenes belong to different categories but hinders learning when they belong to the same category.
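
A toy sketch of the general idea of building rules from perceived scene descriptions (the features and scenes are invented; this is not the PATHS model itself, only an illustration of combining descriptions into candidate rules and scoring them against the two categories):

    # Toy rule construction over scene descriptions (not the actual PATHS model).
    from itertools import combinations

    # Each scene is the set of perceptual/dynamic descriptions noticed so far.
    category_A = [{"small", "moves_left", "collides"},
                  {"small", "moves_left", "stable"}]
    category_B = [{"large", "moves_right", "collides"},
                  {"large", "moves_right", "falls"}]

    features = set().union(*category_A, *category_B)

    def rule_accuracy(rule):
        """A rule is a set of descriptions; it predicts category A when all are present."""
        hits = sum(rule <= scene for scene in category_A)
        hits += sum(not (rule <= scene) for scene in category_B)
        return hits / (len(category_A) + len(category_B))

    # Hypotheses combine one or two descriptions into a candidate rule.
    candidates = [frozenset(c) for r in (1, 2) for c in combinations(features, r)]
    best = max(candidates, key=rule_accuracy)
    print(best, rule_accuracy(best))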

BIOGRAPHIE:

Robert GOLDSTONE is a Distinguished Professor in the Department of Psychological and Brain Sciences and Program in Cognitive Science at Indiana University. His research interests include concept learning and representation, perceptual learning, educational applications of cognitive science, and collective behavior.

Goldstone, R. L., Dubova, M., Aiyappa, R., & Edinger, A. (2023). The spread of beliefs in partially modularized communities. Perspectives on Psychological Science, 0(0). https://doi.org/10.1177/17456916231198238

Goldstone, R. L., Andrade-Lotero, E., Hawkins, R. D., & Roberts, M. E. (2023). The emergence of specialized roles within groups. Topics in Cognitive Science, DOI: 10.1111/tops.12644.

Weitnauer, E., Goldstone, R. L., & Ritter, H. (2023). Perception and simulation during concept learning. Psychological Review, https://doi.org/10.1037/rev0000433.

Séminaire DIC-ISC-CRIA – 25 janvier 2024 par Stevan HARNAD

Stevan HARNAD – 25 janvier 2024

TITRE: Language Writ Large: LLMs, ChatGPT, Meaning and Understanding

RÉSUMÉ:

Apart from what (little) OpenAI may be concealing from us, we all know (roughly) how ChatGPT works (its huge text database, its statistics, its vector representations with their huge number of parameters, its next-word training, etc.). But none of us can say (hand on heart) that we are not surprised by what ChatGPT has proved to be able to do with these resources. It has even driven some of us to conclude that it actually understands. It’s not true that it understands. But it is also not true that we understand how it can do what it can do. I will suggest some hunches about benign “biases” -- convergent constraints that emerge at LLM-scale that may be helping ChatGPT do so much better than we would have expected. These biases are inherent in the nature of language itself, at LLM-scale, and they are closely linked to what it is that ChatGPT lacks, which is direct sensorimotor grounding to connect its words to their referents and its propositions to their meanings. These benign biases are related to (1) the parasitism of indirect verbal grounding on direct sensorimotor grounding, (2) the circularity of verbal definition, (3) the “mirroring” of language production and comprehension, (4) iconicity in propositions at LLM-scale, (5) computational counterparts of human “categorical perception” in category learning by neural nets, and perhaps also (6) a conjecture by Chomsky about the laws of thought.
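
Points (1) and (2) can be made concrete with a small sketch in the spirit of the dictionary studies cited below (the mini-dictionary and the grounded kernel are invented for illustration; this is not the method of Vincent-Lamarre et al. 2016): a word can be grounded indirectly, by verbal definition alone, only if every word used in its definition is already grounded, so purely circular definitions never bottom out.

    # Toy sketch of indirect verbal grounding from a directly grounded kernel.
    dictionary = {
        "apple": ["red", "fruit"],
        "fruit": ["food", "plant"],
        "food": ["thing", "eat"],
        "justice": ["fair", "treatment"],
        "fair": ["justice"],           # circular pair: defined via each other
        "treatment": ["act"],
    }

    def groundable(kernel, dictionary):
        grounded = set(kernel)
        changed = True
        while changed:
            changed = False
            for word, definition in dictionary.items():
                if word not in grounded and all(w in grounded for w in definition):
                    grounded.add(word)
                    changed = True
        return grounded

    # Words assumed to be directly (sensorimotorically) grounded:
    kernel = {"red", "plant", "thing", "eat", "act"}
    print(groundable(kernel, dictionary))
    # "apple", "fruit", "food", and "treatment" become groundable via definitions;
    # "justice" and "fair" never do: definitions alone cannot break the circle.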

BIOGRAPHIE:

Stevan HARNAD is Professor of Psychology and Cognitive Science at UQÀM. His research is on category learning, symbol grounding, language evolution, and Turing-testing.

Bonnasse-Gahot, L., & Nadal, J. P. (2022). Categorical perception: a groundwork for deep learning. Neural Computation, 34(2), 437-475.

Harnad, S. (2012). From sensorimotor categories and pantomime to grounded symbols and propositions. In: Gibson, K. R., & Tallerman, M. (eds.) The Oxford Handbook of Language Evolution, 387-392.

Harnad, S. (2008). The Annotation Game: On Turing (1950) on Computing, Machinery, and Intelligence. In: Epstein, R., Roberts, G., & Beber, G. (eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer, pp. 23-66.

Thériault, C., Pérez-Gay, F., Rivas, D., & Harnad, S. (2018). Learning-induced categorical perception in a neural network model. arXiv preprint arXiv:1805.04567.

Vincent-Lamarre, P., Blondin-Massé, A., Lopes, M., Lord, M., Marcotte, O., & Harnad, S. (2016). The latent structure of dictionaries. Topics in Cognitive Science, 8(3), 625-659.

Pérez-Gay Juárez, F., Sicotte, T., Thériault, C., & Harnad, S. (2019). Category learning can alter perception and its neural correlates. PloS one, 14(12), e0226000.

Séminaire DIC-ISC-CRIA – 18 janvier 2024 par Ben GOERTZEL

Ben GOERTZEL – 18 janvier 2024

Titre: Toward AGI via Embodied Neural-Symbolic-Evolutionary Cognition

RÉSUMÉ:

A concrete path toward AGI with capability at the human level and beyond is outlined, centered on a common mathematical meta-representation capable of integrating neural, symbolic, evolutionary and autopoietic aspects of intelligence. The instantiation of these ideas in the OpenCog Hyperon software framework is discussed. An in-progress research programme is reviewed, in which this sort of integrative AGI system is induced to ground its natural language dialogue in its experience, via embodiment in physical robots and virtual-world avatars.

BIOGRAPHIE:

Ben Goertzel is a cross-disciplinary scientist, entrepreneur, and author. He leads the SingularityNET Foundation, the OpenCog Foundation, and the AGI Society, which runs the annual Artificial General Intelligence conference. His research encompasses multiple areas, including artificial general intelligence, natural language processing, cognitive science, machine learning, computational finance, bioinformatics, virtual worlds, gaming, parapsychology, theoretical physics, and more. He has published more than 25 scientific books and around 150 technical papers, as well as numerous journalistic articles, and has given talks at events around the globe.

Goertzel, B. (2023). Generative AI vs. AGI: The Cognitive Strengths and Weaknesses of Modern LLMs. arXiv preprint arXiv:2309.10371.

Rodionov, S., Goertzel, Z. A., & Goertzel, B. (2023). An Evaluation of GPT-4 on the ETHICS Dataset. arXiv preprint arXiv:2309.10492.

Huang, K., Wang, Y., Goertzel, B., & Saliba, T. (2023). ChatGPT and Web3 Applications. In Beyond AI: ChatGPT, Web3, and the Business Landscape of Tomorrow (pp. 69-95). Cham: Springer Nature Switzerland.

Séminaire DIC-ISC-CRIA – 11 janvier 2024 par Raphaël MILLIÈRE

Raphaël MILLIÈRE – 11 janvier 2024

Titre: Mechanistic Explanation in Deep Learning

RÉSUMÉ:

Deep neural networks such as large language models (LLMs) have achieved impressive performance across almost every domain of natural language processing, but there remains substantial debate about which cognitive capabilities can be ascribed to these models. Drawing inspiration from mechanistic explanations in the life sciences, the nascent field of "mechanistic interpretability" seeks to reverse-engineer human-interpretable features to explain how LLMs process information. This raises two questions: (1) Are causal claims about neural network components, based on coarse intervention methods (such as “activation patching”), genuine mechanistic explanations? (2) Does the focus on human-interpretable features risk imposing anthropomorphic assumptions? My answer will be "yes" to (1) and "no" to (2), closing with a discussion of some ongoing challenges.
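
For readers unfamiliar with the intervention mentioned in (1), here is a minimal sketch of activation patching on a toy network (assuming PyTorch; the model and inputs are invented, and this is far coarser than work on actual LLMs): cache an intermediate activation from a "clean" run, splice it into a "corrupted" run, and measure how much of the clean behavior returns.

    # Minimal activation-patching sketch using forward hooks (toy model, not an LLM).
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
    layer = model[1]  # intervene on the ReLU output

    clean, corrupted = torch.randn(1, 8), torch.randn(1, 8)
    cache = {}

    def save_hook(module, inputs, output):
        cache["clean_act"] = output.detach()

    def patch_hook(module, inputs, output):
        return cache["clean_act"]  # replace the corrupted-run activation

    handle = layer.register_forward_hook(save_hook)
    clean_out = model(clean)
    handle.remove()

    handle = layer.register_forward_hook(patch_hook)
    patched_out = model(corrupted)
    handle.remove()

    corrupted_out = model(corrupted)
    # If patching this layer pulls the corrupted output toward the clean output, the
    # layer is treated as causally implicated in the behavior. (In this toy model the
    # restoration is total, because everything downstream depends only on this layer.)
    print((patched_out - clean_out).norm(), (corrupted_out - clean_out).norm())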

BIOGRAPHIE:

Raphaël Millière is Lecturer in Philosophy of Artificial Intelligence at Macquarie University in Sydney, Australia. His interests are in the philosophy of artificial intelligence, cognitive science, and mind, particularly in understanding artificial neural networks based on deep learning architectures such as Large Language Models. He has investigated syntactic knowledge, semantic competence, compositionality, variable binding, and grounding.

Elhage, N., et al. (2021). A mathematical framework for transformer circuits. Transformer Circuits Thread.

Machamer, P., Darden, L., & Craver, C. F. (2000). Thinking about Mechanisms. Philosophy of Science, 67(1), 1–25.

Millière, R. (2023). The Alignment Problem in Context. arXiv preprint arXiv:2311.02147.

Mollo, D. C., & Millière, R. (2023). The vector grounding problem. arXiv preprint arXiv:2304.01481.

Yousefi, S., et al. (2023). In-Context Learning in Large Language Models: A Neuroscience-inspired Analysis of Representations. arXiv preprint arXiv:2310.00313.

Séminaire DIC-ISC-CRIA – 14 décembre 2023 par Frédéric ALEXANDRE

Frédéric ALEXANDRE – 14 décembre 2023

Titre : Apprentissage continu et contrôle cognitif

Résumé :

Frédéric Alexandre explores the difference between the efficiency of human learning and that of large language models in terms of computation time and energy costs. The talk focuses on the continual character of human learning and its associated challenges, such as catastrophic forgetting. Two types of memory are examined: working memory and episodic memory. The prefrontal cortex is described as essential for cognitive control and working memory, while the hippocampus is central to episodic memory. Alexandre suggests that these two regions collaborate to enable continual and efficient learning, thereby supporting thought and imagination.
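
As a minimal illustration of catastrophic forgetting (a toy sketch with invented data and a throwaway network, not the Mnemosyne team's models), training sequentially on a second task degrades performance on the first, which is what replay of stored episodes is meant to counteract:

    # Toy illustration of catastrophic forgetting under sequential training.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_task(shift):
        x = torch.randn(200, 2) + shift
        y = (x[:, 0] > shift[0]).long()   # each task: a different threshold rule
        return x, y

    task_a = make_task(torch.tensor([0.0, 0.0]))
    task_b = make_task(torch.tensor([4.0, 4.0]))

    net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
    opt = torch.optim.Adam(net.parameters(), lr=0.05)
    loss_fn = nn.CrossEntropyLoss()

    def train(x, y, steps=200):
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(net(x), y).backward()
            opt.step()

    def accuracy(x, y):
        return (net(x).argmax(1) == y).float().mean().item()

    train(*task_a)
    print("task A after training on A:", accuracy(*task_a))
    train(*task_b)   # sequential training on B, with no replay of A
    print("task A after training on B:", accuracy(*task_a))
    # Interleaving stored episodes of task A while training on B ("replay", one
    # functional analogue of hippocampal episodic memory) mitigates the drop.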

Bio :

Frédéric ALEXANDRE is a research director at Inria and heads the Mnemosyne team in Bordeaux, which specializes in Artificial Intelligence and Computational Neuroscience. The team studies the different forms of memory in the brain and their role in cognitive functions such as reasoning and decision-making. They explore the dichotomy between explicit and implicit memories and how the two interact. Their recent projects range from language acquisition to planning and deliberation. The models they build are validated experimentally and have medical and industrial applications, as well as applications in the humanities and social sciences, notably in education, law, linguistics, economics, and philosophy.

Some online references:

Frédéric Alexandre. A global framework for a systemic view of brain modeling. Brain Informatics, 2021, 8 (1), pp.22. https://braininformatics.springeropen.com/articles/10.1186/s40708-021-00126-4 

Snigdha Dagar, Frédéric Alexandre, Nicolas P. Rougier. From concrete to abstract rules: A computational sketch. The 15th International Conference on Brain Informatics, Jul 2022. https://inria.hal.science/hal-03695814

Randa Kassab, Frédéric Alexandre. Pattern Separation in the Hippocampus: Distinct Circuits under Different Conditions. Brain Structure and Function, 2018, 223 (6), pp.2785-2808. https://link.springer.com/article/10.1007/s00429-018-1659-4 

Hugo Chateau-Laurent, Frédéric Alexandre. The Opportunistic PFC: Downstream Modulation of a Hippocampus-inspired Network is Optimal for Contextual Memory Recall. 36th Conference on Neural Information Processing Systems, Dec 2022, New Orleans, United States. https://hal.science/hal-03885715

Pramod Kaushik, Jérémie Naudé, Surampudi Bapi Raju, Frédéric Alexandre. A VTA GABAergic computational model of dissociated reward prediction error computation in classical conditioning. Neurobiology of Learning and Memory, 2022, 193 (107653), https://www.sciencedirect.com/science/article/abs/pii/S1074742722000776 

NOTE: The video of the seminar will be posted online the day after the presentation.

Séminaire DIC-ISC-CRIA – 7 décembre 2023 par Jake HANSON

Jake HANSON – 7 décembre 2023

Titre : Falsification of the Integrated Information Theory of Consciousness

Résumé :

Integrated Information Theory (IIT) is a prominent theory of consciousness in contemporary neuroscience, based on the premise that feedback, quantified by a mathematical measure called Phi, corresponds to subjective experience. A straightforward application of the mathematical definition of Phi fails to produce a unique solution, owing to unresolved degeneracies inherent in the theory. This undermines nearly all published Phi values to date. As for the mathematical relationship between feedback and input-output behavior, automata theory shows that feedback can always be disentangled from a finite-state system's input-output behavior, resulting in Phi=0 for all possible input-output behaviors. This process, known as "unfolding," can be accomplished without increasing the system's size, leading to the conclusion that Phi measures something fundamentally disconnected from anything that could ground the theory experimentally. These findings demonstrate that IIT lacks a well-defined mathematical framework and may either be already falsified or inherently unfalsifiable by scientific standards.
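
A minimal sketch of the unfolding idea, using a toy parity system rather than the construction from the papers cited below: a system with feedback (an internal state) and a feedforward function of the input history can have identical input-output behavior, so no behavioral observation can distinguish them.

    # Feedback vs. feedforward systems with identical input-output behavior.
    from itertools import product

    def recurrent_system(inputs):
        """Output depends on an internal state fed back at each step (parity of 1s seen)."""
        state, outputs = 0, []
        for x in inputs:
            state = state ^ x          # feedback: new state depends on old state
            outputs.append(state)
        return outputs

    def feedforward_system(inputs):
        """Same behavior, computed directly from the input history with no feedback."""
        return [sum(inputs[: t + 1]) % 2 for t in range(len(inputs))]

    horizon = 6
    assert all(recurrent_system(list(seq)) == feedforward_system(list(seq))
               for seq in product([0, 1], repeat=horizon))
    print("behaviorally identical over all length-6 input sequences")
    # A measure like Phi that assigns different values to the two architectures is
    # therefore not pinned down by input-output behavior alone.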

Bio :

Jake HANSON is a Senior Data Scientist at a financial tech company in Salt Lake City, Utah. His doctoral research in astrophysics at Arizona State University focused on the origin of life via the relationship between information processing and fundamental physics. He demonstrated that there are multiple foundational issues with IIT, ranging from poorly defined mathematics to problems with experimental falsifiability and a pseudoscientific handling of core ideas.

References

Hanson, J.R., & Walker, S.I. (2019). Integrated information theory and isomorphic feed-forward philosophical zombies. Entropy, 21.11, 1073.

Hanson, J.R., & Walker, S.I. (2021). Formalizing falsification for theories of consciousness across computational hierarchies. Neuroscience of Consciousness, 2021.2, niab014.

Hanson, J.R. (2021). Falsification of the Integrated Information Theory of Consciousness. Doctoral dissertation, Arizona State University.

Hanson, J.R., & Walker, S.I. (2023). On the non-uniqueness problem in Integrated Information Theory. Neuroscience of Consciousness, 2023.1, niad014.

Séminaire DIC-ISC-CRIA – 30 novembre 2023 par Christoph DURT

Christoph DURT – 30 novembre 2023

Titre : LLMs, Patterns, and Understanding

Résumé :

It is widely known that the performance of LLMs is contingent on their being trained on very large text corpora. But what in these corpora allows LLMs to extract the parameters that enable them to produce text that sounds as if it had been written by an understanding being? In my presentation, I argue that the text corpora reflect not just “language” but language use. Language use is permeated with patterns, and LLMs model the statistical contours of the patterns of written language use. LLMs do not model understanding directly; rather, they model statistical patterns that correlate with patterns of language use. Although the recombination of statistical patterns does not require understanding, it enables the production of novel text that continues a prompt and conforms to patterns of language use, and thus can make sense to humans.
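
As a minimal illustration of how statistical patterns of language use alone can yield novel, pattern-conforming continuations (a toy bigram sketch over an invented corpus, vastly simpler than an LLM but making the same point in miniature):

    # Next-word statistics extracted from usage patterns generate novel text
    # that conforms to those patterns, without any understanding.
    import random
    from collections import Counter, defaultdict

    corpus = ("the cat sat on the mat . the dog sat on the rug . "
              "the cat chased the dog .").split()

    bigrams = defaultdict(Counter)
    for w1, w2 in zip(corpus, corpus[1:]):
        bigrams[w1][w2] += 1

    def continue_prompt(word, length=8, seed=0):
        random.seed(seed)
        out = [word]
        for _ in range(length):
            candidates = bigrams[out[-1]]
            if not candidates:
                break
            words, counts = zip(*candidates.items())
            out.append(random.choices(words, weights=counts)[0])
        return " ".join(out)

    # A word sequence that need not occur in the corpus, yet follows its usage patterns.
    print(continue_prompt("the"))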

Biographie :

Christoph DURT is a philosophical and interdisciplinary researcher at Heidelberg University. He investigates the human mind and its relation to technology, especially AI. Going beyond the usual side-by-side comparison of artificial and human intelligence, he studies the multidimensional interplay between the two. This involves the study of human experience and language, as well as the relation between them. If you would like to join an international online exchange on these issues, please check the “courses and lectures” section of his website.

References

Durt, Christoph, Tom Froese, and Thomas Fuchs. Preprint. “Against AI Understanding and Sentience: Large Language Models, Meaning, and the Patterns of Human Language Use.”

Durt, Christoph. 2023. “The Digital Transformation of Human Orientation: An Inquiry into the Dawn of a New Era.” Winner of the $10,000 HFPO Essay Prize.

Durt, Christoph. 2022. “Artificial Intelligence and Its Integration into the Human Lifeworld.” In The Cambridge Handbook of Responsible Artificial Intelligence, Cambridge University Press.

Durt, Christoph. 2020. “The Computation of Bodily, Embodied, and Virtual Reality.” Winner of the Essay Prize “What Can Corporality as a Constitutive Condition of Experience (Still) Mean in the Digital Age?” Phänomenologische Forschungen, no. 2: 25–39.

Séminaire DIC-ISC-CRIA – 23 novembre 2023 par Anders SØGAARD

Anders SØGAARD – 23 novembre 2023

Titre : LLMs: Indication or Representation?

Résumé :

People talk to LLMs - their new assistants, tutors, or partners - about the world they live in, but are LLMs parroting, or do they (also) have internal representations of the world? There seem to be five popular views:

(i) LLMs are all syntax, no semantics.

(ii) LLMs have inferential semantics, no referential semantics.

(iii) LLMs (also) have referential semantics through picturing.

(iv) LLMs (also) have referential semantics through causal chains.

(v) Only chatbots have referential semantics (through causal chains).

I present three sets of experiments to suggest LLMs induce inferential and referential semantics and do so by inducing human-like representations, lending some support to view (iii). I briefly compare the representations that seem to fall out of these experiments to the representations to which others have appealed in the past.
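
As an illustration of the general kind of analysis behind such claims (a hedged sketch, not the actual experiments; the items, embeddings, and reference similarities below are randomly generated stand-ins), one can ask whether the pairwise similarity structure of model representations correlates with a human or perceptual similarity structure:

    # Representational-similarity sketch: model embeddings vs. a reference space.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    items = ["red", "orange", "yellow", "green", "blue"]

    model_embeddings = rng.normal(size=(len(items), 16))   # stand-in for LM vectors
    reference_points = rng.normal(size=(len(items), 3))    # stand-in for perceptual space

    model_distances = pdist(model_embeddings, metric="cosine")
    reference_distances = pdist(reference_points)

    rho, p = spearmanr(model_distances, reference_distances)
    print(f"representational alignment (Spearman rho): {rho:.2f}, p = {p:.2f}")
    # A reliably positive correlation over many items is the sort of evidence taken to
    # show that text-only representations mirror perceptual structure (cf. Abdou et al. 2021).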

Biographie :

Anders SØGAARD is University Professor of Computer Science and Philosophy and leads the newly established Center for Philosophy of Artificial Intelligence at the University of Copenhagen. Known primarily for work on multilingual NLP, multi-task learning, and using cognitive and behavioral data to bias NLP models, Søgaard is an ERC Starting Grant and Google Focused Research Award recipient and the author of Semi-Supervised Learning and Domain Adaptation for NLP (2013), Cross-Lingual Word Embeddings (2019), and Explainable Natural Language Processing (2021).

References

Søgaard, A. (2023). Grounding the Vector Space of an Octopus. Minds and Machines 33, 33-54.

Li, J.; et al. (2023) Large Language Models Converge on Brain-Like Representations. arXiv preprint arXiv:2306.01930

Abdou, M.; et al. (2021) Can Language Models Encode Perceptual Structure Without Grounding? CoNLL

Garneau, N.; et al. (2021) Analogy Training Multilingual Encoders. AAAI
