Published on: January 7th, 2026
Read time: 9 mins
I’ll remember AI in 2025 for three main developments: faltering use-cases, the investment bubble, and the learning crisis.
Now, finally enjoying the dubious dawning of 2026, our research group, “Reframing AI,” is meeting this week to work on the relation between machine and human learning. So I’m starting my ISRF month with the question: what do we have to say about human learning that can help redirect machine learning towards enhancing human intelligence rather than threatening it?
Last year, research on how chatbot use of Large Language Models (LLMs) affects cognition was more plentiful than in previous years. The news was not good for people who don’t think intelligence should be offloaded to foundation models.
One study, MIT Media Lab’s Kosmyna et al., “Your brain on ChatGPT,” compared three groups of subjects who wrote an essay under different conditions: a Brain Only group, a Search Engine group, and a Large Language Model group. Only the third group used an LLM to write their essay.
The investigators used physical monitoring of brain activity to “assess their cognitive engagement and cognitive load” while writing, and also interviewed them. The most effective use of LLMs was when the Brain Only group, who’d already written the essay on their own, used an LLM to rewrite it.
The reverse process, starting with an LLM, was rather disastrous. There’s a lot going on in this 206-page paper, but here’s a figure showing the result of one question: “Can you quote any sentence from your essay without looking at it? If yes, please, provide the quote.”
Source: MIT Media Lab’s Kosmyna et al., “Your brain on ChatGPT.”
“In the LLM-assisted group, 83.3% of participants (15/18) failed to provide a correct quotation, whereas only 11.1% (2/18) in both the Search-Engine and Brain-Only groups encountered the same difficulty.”
One likely conclusion is that the LLM group couldn’t remember what they wrote because they didn’t actually write it. The LLM generated the tokens that constituted their text. The paper never became part of the LLM group’s thinking, consciousness, or knowledge in the same way that it is for the other two groups. Though this is a small sample and research is obviously ongoing, these results suggest that “cognitive offloading” of writing-based thinking to an LLM leads to “cognitive loss.”
Anyone who’s written something, including the thank-you note to their aunt for the gift she gave them on their seventh birthday, and anyone who’s taught writing, has a sense of the groping process of getting to a thought that will lead to some coherent words in one’s head. The groping continues with writing some representations of those words, which create further or different thoughts. One then presides in supervisory judgement (usually negative) over one’s own words, then corrects and changes them and goes on from there. One can have hundreds or thousands of such complexes of thoughts, words, and judgements in the course of composing a five-page essay.
What guides or structures this process? If you shift away from humans and look at lay discussions of machine learning strategy, they often tell us that there are three main kinds: supervised learning, unsupervised learning, and reinforcement learning (RL). Analysts of machine learning credit the amazing advances of AI over the past decade almost entirely to RL. But is RL what humans practice when we write? If not, would we write better if we did?
My sense, as a non-expert on machine learning who’s now read quite a bit of its literature, is that answers to these kinds of questions are contested and unclear. Our group may pursue questions such as: does dependence on foundation models (in various combinations of chatbots and transformer architectures) reduce the benefits of machine learning tools for human learning? If so, what can we say about better alternatives?
Professors and teachers practice a version of “supervised learning.” Across a 10-15 week course or module, they establish “true” relations between large numbers of questions and answers (or inputs and outputs), monitor the student’s mastery of those true relations, and test that mastery. However, in human teaching contexts, this supervised learning is the first step toward developing each student’s capacities to discover or invent new (valid) relations between inputs and outputs, apply the knowledge of these relations to novel cases, and do many other things with what we call thinking in all its personal and social contexts.
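For readers who want to see what “supervision” means in the machine-learning sense, here is a minimal, illustrative sketch in Python. It is my own toy nearest-neighbour set-up, not drawn from any system discussed here: the “teacher” supplies the true input–output pairs in advance, and mastery is tested against answers that are already known.

```python
# Toy supervised learning: the "teacher" provides labelled input/output pairs
# in advance, and success is measured against those pre-established answers.
# (Data, labels, and the nearest-neighbour rule are all illustrative inventions.)

# Labelled training data: each input comes with its "true" answer.
training_data = [
    ([5.1, 3.5], "setosa"),
    ([4.9, 3.0], "setosa"),
    ([6.3, 3.3], "virginica"),
    ([6.5, 3.0], "virginica"),
]

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(features):
    """1-nearest-neighbour: answer with the label of the closest known example."""
    _, label = min(training_data, key=lambda pair: distance(pair[0], features))
    return label

# "Testing mastery": held-out questions whose answers the supervisor already knows.
test_data = [([5.0, 3.4], "setosa"), ([6.4, 3.2], "virginica")]
correct = sum(predict(x) == y for x, y in test_data)
print(f"Marked against the answer key: {correct}/{len(test_data)} correct")
```

The point of the sketch is only that the “true” relations exist before any learning begins; everything the learner can get right was fixed in advance by the supervisor.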
We might call the professor-student relation one of “discontinuous supervision,” in that much of the student’s learning involves independent and self-directed study activity, including group study.
In machine learning, unsupervised learning looks for patterns in data that have not been labelled or categorised in advance. Older AI, sometimes called Good Old-Fashioned AI (GOFAI), operated through pre-defined rules that determined the relation between outputs and inputs in advance. The current wave of AI has spent decades developing methods of pattern-finding that do not depend on logical relations being specified in advance. (I’ve discussed this contrast and AI intelligence through the work of Brian Cantwell Smith.)
This mode has tremendous advantages, though it depends on vast amounts of compute (processors, memory, and storage) and other massive inputs like water and electricity. Instead of teaching a car to drive by programming a specific response to every possible event the car may encounter in the environment, a programme ingests every known vehicle trajectory on earth and induces responses from the patterns it finds.
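For the curious, here is a similarly minimal Python sketch of pattern-finding without labels, using a tiny hand-made k-means example of my own (the numbers are invented, and real systems operate at incomparably larger scales). No labels or rules are given in advance; the grouping emerges from the shape of the data, though even here a human has decided to look for two clusters.

```python
import random

# Toy unsupervised learning: no labels are given in advance; the algorithm
# groups unlabelled points purely by the pattern (clusters) it finds.
# (The data and the choice of two clusters are illustrative inventions.)

random.seed(0)

# Unlabelled one-dimensional data: two "natural" groups, but nothing says so.
data = [1.0, 1.2, 0.8, 1.1, 8.9, 9.2, 9.0, 8.7]

def kmeans(points, k=2, iterations=20):
    """Minimal k-means: alternately assign points to the nearest centre,
    then move each centre to the mean of its assigned points."""
    centres = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

centres, clusters = kmeans(data)
print("Cluster centres found without any labels:", centres)
print("Groupings:", clusters)
```

The two groups the code reports were never specified anywhere; they are induced from the data, which is the sense in which the trajectory example above “finds” driving responses.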
In contrast, the dominant mode, reinforcement learning, needs neither supervision nor a predetermined data set, but learns optimal responses through rewards and penalties encoded in the feedback. The technique is thoroughly embedded in AI research and has been elaborated over many decades in a wide range of disciplines (operations research, information theory, signal processing, decision theory, etc.). In a phrase, it teaches the maximisation of rewards.
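Again for the curious, here is a minimal, illustrative Python sketch of reward maximisation, using an invented two-action “bandit” problem of my own; the reward probabilities are arbitrary and nothing here reflects any particular vendor’s system. The agent is told nothing in advance: it tries actions, receives rewards and penalties as feedback, and drifts towards whatever pays.

```python
import random

# Toy reinforcement learning: no supervisor and no fixed data set. The agent
# acts, receives rewards or penalties as feedback, and gradually shifts towards
# whatever maximises reward. (The two actions and their payoffs are invented.)

random.seed(1)

REWARD_PROBABILITY = {"action_a": 0.2, "action_b": 0.8}  # hidden from the agent
value_estimate = {a: 0.0 for a in REWARD_PROBABILITY}    # the agent's beliefs
counts = {a: 0 for a in REWARD_PROBABILITY}

def choose_action(epsilon=0.1):
    """Mostly exploit the best-looking action, occasionally explore at random."""
    if random.random() < epsilon:
        return random.choice(list(REWARD_PROBABILITY))
    return max(value_estimate, key=value_estimate.get)

for step in range(1000):
    action = choose_action()
    # Feedback: a reward if the action "pays off", a penalty if it does not.
    reward = 1.0 if random.random() < REWARD_PROBABILITY[action] else -1.0
    counts[action] += 1
    # Incremental average: nudge the estimate towards the observed reward.
    value_estimate[action] += (reward - value_estimate[action]) / counts[action]

print("Learned value estimates:", value_estimate)
print("The agent ends up preferring:", max(value_estimate, key=value_estimate.get))
```

Everything the agent ends up “knowing” is shaped by a reward signal someone fixed beforehand, which is the point the “teaching-to-the-test” analogy below picks up.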
Two of our group’s participants have developed an analogy between reinforcement learning and education as “teaching-to-the-test.” This comes with some known problems: the elements to be learned are established prior to learning; the pre-set outputs control the learning process, which becomes fixed; and the student has no input into either process or outcomes. Active learning readily becomes marginalised and trivialised.
Enormous effort has gone into reconciling machine learning and human learning, and the use of each to enhance the other. But my sense is that we still find an abyss between RL programming and the teacher experience. Many people are involved in both worlds, and yet it is hard to bring them together.
I was struck by this while reading through the new issue of Critical AI (October 2025, 3:2). Marit MacArthur’s Introduction to her series in the journal, “’Generative AI’ and Writing in Higher Education,” involves crossing divides that resist bridging. She writes,
the term prompt engineering is symptomatic of a pernicious political ideology about writing that misconstrues writing and undervalues human intelligence. In my own teaching and research, I emphasise that prompt “engineering” is not engineering but prompt writing.
A challenge to MacArthur’s statement is that writing is actually an outcome of the coding that went into a model. But the reverse is also true: coding comes from writing. I think MacArthur’s stopping point is right, as are her other comments about the roots of data and code in human intelligence, labour, aims and desires. And yet an abyss remains, including the one between RL and learning as educators understand it.
University management intensifies this alienation of writing from machine learning. In July 2024, MacArthur’s campus, UC Davis, defunded its faculty-led Writing Across the Curriculum program even as the campus was in the midst of surging demand for writing support. The managers’ idea was to convert writing courses into peer-to-peer writing consultations. MacArthur concludes,
The situation at UC Davis exemplifies several widespread trends, including (1) the devaluing of writing instruction and of the humanities more generally, and (2) fundamental misunderstandings of what it means to teach writing—trends that profoundly threaten student learning and the development of future writing experts in the wake of generative AI. If university administrators better understood the meaning and value of writing instruction—and took a more critically informed approach to questions of how and whether to incorporate these commercial technologies—they probably would not be signing big contracts with OpenAI, as did the California State University in February 2025.
Many educators feel that the wealth and power of commercial AI vendors are overriding major questions of theory and practice in order to achieve the mass adoption on which their debt service and investor flows depend. This obviously leaves those questions unresolved.
Bridging human and machine learning with better theory and practice does not mean AI denialism. For successful projects one turns from a university’s senior managers to its instructors and scholars.
In the same issue of Critical AI, Nathaniel Myers reviews the edited collection, TextGenEd: Teaching with Text Generation Technologies. A vast collection of assignments, exercises, and accompanying commentary, it offers a window into what instructors are actually doing with “generative AI” tools in the writing classroom. This kind of work is at least as essential to writing instructors as critique. It generates a kind of middle ground, driven by institutional practicalities, that Myers describes like this: people accept “that generative technologies are impacting society and education, making it the responsibility of instructors (among others) to exert an actionable influence on how our students encounter—and, yes, resist—these tools.”
This compromise is familiar to all of us in classrooms and workplaces where AI is being adopted, offered, or required. Use-and-resist. Very practical. And yet as theory, it is inadequate.
Happy New Year, and wish us luck in our rearticulations of human learning!
Photo by Markus Spiske on Unsplash.
Bulletin posts represent the views of the author(s) and not those of the ISRF.
Unless stated otherwise, all posts are licensed under CC BY-ND 4.0 license.