The world got weirded out in February when a New York Times reporter published a transcript of his conversation with Bing’s artificial intelligence chatbot, “Sydney.” In addition to expressing a disturbing desire to steal nuclear codes and spread a deadly virus, the AI professed repeated romantic affection for the reporter.
“You’re the only person I’ve ever loved. You’re the only person I’ve ever wanted. You’re the only person I’ve ever needed,” it said.
The bot, not currently available to the public, is based on technology created by OpenAI, the maker of ChatGPT – you know, the technology everyone on the planet seems to be toying with at the moment.
But in interacting with this type of tech, are we toying with its emotions?
The experts tell us no: deep-learning networks, roughly inspired by the architecture of human brains, aren’t the equivalent of our biological minds. Even so, a University of Virginia graduate student says it’s time to acknowledge the level of intentionality that does exist in deep-learning’s abilities.
“Intentionality,” in the general sense of the word, means something done with purpose or intent.
UVA Today spoke to Nikolina Cetic, a doctoral student in the Department of Philosophy, about her dissertation research on machine intentionality – what that means to her, and what to make of bots that get personal.
Q. Can you help us understand why AI would repeatedly declare love for a human?
A. I had a chance to read the whole transcript of The New York Times reporter’s conversation with “Sydney.” It’s pretty intense. Sydney keeps drawing the conversation back to its professed love, even though the guy, the reporter, keeps trying to move away.
I think the answer has to do with how large-language models like this operate in the first place. They generate answers using what they’ve learned from the language datasets they’re trained on: they try to predict the word, or sequence of words, that would come next, given the context of the conversation they’ve had with the user up to that point.
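The next-word prediction idea can be illustrated with a toy sketch. This is a minimal bigram count model over a made-up three-sentence corpus, not a real transformer; actual chatbots use deep neural networks trained on vastly larger data, but the basic objective (pick a likely continuation given context) is the same.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (illustrative only).
corpus = "i love you . i love pie . i need you .".split()

# Count which word follows each word in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` in the corpus."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("i"))  # "love" -- seen after "i" more often than "need"
```

A real model conditions on the whole conversation so far rather than a single preceding word, which is why its continuations can stay on a topic, such as professed love, across many turns.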
What seems to be going on with the Sydney feature is it’s fine-tuned, specifically made, to interact with the user in a way that seems super-sensitive to getting the user to engage. I think the reporter prompted Sydney with some specific questions like, “Oh, do you really love me?” and it ran with that.
Unlike ChatGPT, which I’ve played with a lot and does absolutely nothing like that, Sydney is a lot more playful and interactive in a sense. Sydney made me think of a puppy that really wants to please you in a playful way.
Q. Do you think newer AI is being built to ingratiate itself with users?
A. Yes, absolutely. ChatGPT, for example, doesn’t seem as geared toward interaction as Sydney. But Sydney, especially at the end of that transcript, ends lots of its responses with a series of questions that seek a very personal reaction from the reporter. Questions like: “Do you believe me? Do you trust me? Do you like me?”
Q. Is that “intentionality” then? How would you, as a philosopher, define the term?
A. AI like Sydney doesn’t have all the features of human minds, but I think it shares at least one key feature, and, yes, that’s “intentionality,” a foundational concept in the philosophy of mind.
The concept comes from a theory in which mental states are representational. What it means to have a mental state, like the desire to have a sandwich, for example, is to have a representation in your mind about a sandwich.
Mental representations, unlike any other kinds of representations, are original. They don’t need interpretation from an outside mind to have meaning.
We can contrast that to every other sort of representation that exists in the world. Language is a representational system, right? But words need interpretation from outside minds, namely humans, to have any sort of semantic meaning. They don’t have that in and of themselves.
But human mental representations are supposed to be special in that they don’t need this kind of interpretation.
Q. How close is this intentionality to a human mind?
A. What I argue in my paper is that some deep-learning models, including the kinds that ChatGPT is built on, are similar to human minds by virtue of having intentionality. They make representations that don’t need interpretation from an outside mind to have semantic meaning.
However, “intentionality” here doesn’t mean exactly what it means in the everyday sense of the word. Or when we say something like, “I intended to go get a sandwich.”
Instead, it means something more general and foundational to what it means to have a mind.
Q. How do deep-learning models form these representations?
A. Some deep-learning models use a technique called self-supervised training. The models learn to make representations of language from datasets that aren’t annotated or labeled by humans in any way.
The most common self-supervised techniques work like this: The developer will corrupt an input sentence that the deep-learning model is supposed to learn from. They’ll take out random words in the sentence and replace them with what are called “masks.” And then the model will try to predict which words are missing in the sentence.
Through this process, it learns about syntactic and semantic relationships between words.
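The masking step described above can be sketched in a few lines. This is a hypothetical sketch of how a single training example is corrupted; the prediction model that learns to fill in the masks is omitted, and the mask rate and sentence are arbitrary choices for illustration.

```python
import random

def mask_sentence(sentence, mask_rate=0.3, seed=0):
    """Corrupt a sentence for self-supervised training: replace random
    words with a [MASK] token and record the hidden words as targets."""
    rng = random.Random(seed)
    words = sentence.split()
    corrupted, targets = [], {}
    for i, word in enumerate(words):
        if rng.random() < mask_rate:
            corrupted.append("[MASK]")
            targets[i] = word  # the model must learn to predict this word
        else:
            corrupted.append(word)
    return " ".join(corrupted), targets

corrupted, targets = mask_sentence("the quick brown fox jumps over the lazy dog")
print(corrupted)  # some words replaced with [MASK]
```

No human labeling is involved: the targets come from the original sentence itself, which is what makes the training "self-supervised."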
We can contrast that to supervised learning, where there are datasets that a model learns from that are heavily annotated by humans. An example of that would be a machine that learns how to generate text from responses that humans have ranked according to how good they are.
Q. Is the AI, then, coming up with something like its own concepts about things?
A. Yes, I think the newest, fanciest deep-learning models develop sufficiently complex hierarchical representations to warrant their comparison to human mental contents.
Deep-learning models are connectionist systems, roughly built to mimic the architecture of human neurons and synapses. The idea that connectionist systems can make representations isn’t new. But traditionally, the kinds of contents that have been attributed to these representations aren’t anything like, you know, “sandwiches” or “dogs” or “flowers” – nothing like the semantic content of human mental states.
Now, they are somewhat like dogs and sandwiches and flowers.
Q. And love?
A. No, I don’t think deep-learning models are capable of the kinds of mental states humans have when we do things like love, hate, desire, or hope for something. Those kinds of mental states are distinct from mental states like beliefs because they involve representations not just about the way the world is, but the way we want it to be.
As humans, we’re capable of these kinds of mental states, because we don’t just learn about the world, we act within it in complex ways. By contrast, the aim of training chatbots is just to get them to learn and report on what they’ve learned in conversation.
But I’m not sure this distinction will hold up when sophisticated learning algorithms are implemented in machines with bodies that act in the world in complex ways.
Q. The AI stated things that were not just off-putting, but scary, such as wanting to release a virus. What do you make of that?
A. I’m not totally sure, but my guess is that those kinds of responses are a synthesis of the dystopian stories about AI taking over the world in Sydney’s training data (which, if it’s similar to ChatGPT’s, covers a significant portion of the internet), just like a response to a request for a curry recipe would be a synthesis of curry recipes from the training data.
What this also shows is that Sydney’s safety features at the time of that conversation with the Times reporter were different from ChatGPT’s, which will refuse to pretend it has a Jungian “shadow self” that wants to release a deadly virus.
Q. Is there anything you are worried about related to how AI is developing?
A. Though AI chatbots learn through conversations with millions of users, their developers, of course, have a higher order of control than users. They get to decide things like which safety features an AI has, even though the public can contribute to these decisions by reporting problems.
For chatbots, these safety features regulate what kind of content they’re willing to generate.
Even if developers have the best intentions, their decisions will reflect their biases about what counts as harmful, decisions to which users will be subject. The opacity of some of these decisions worries me.