Research Areas
The key questions driving digital minds research, with readings for each area.
One way to map the field is to separate digital minds research into two broad areas.
Understanding Digital Minds asks what these systems are. It covers consciousness, cognition and internal states, agency, welfare, moral status, and identity.
Responding to Digital Minds asks what we should do about them. It covers governance, safety and welfare coordination, welfare interventions and protections, rights and legal frameworks, design choices, public perception, and long-term futures.
Understanding Digital Minds
What are digital minds, how do they work, and could they matter morally?
Consciousness and Subjective Experience
Can AI systems have anything like subjective experience? Is there something it is like to be a large language model?
Several leading theories of consciousness are cast in terms of representations, architectures, and functional organisation rather than any specific biological substrate, and on those theories some AI systems are plausible candidates for being conscious. Consciousness science has identified neural correlates of consciousness in considerable detail over the past two decades, but there is still no settled theory of why conscious experience exists at all. The deeper problem is that most theories were developed with biological brains in mind. Whether AI systems can have what it takes to be conscious, and if so which ones, is a central open question of the subfield.
Start here
Bradford Saad and Andreas Mogensen · 2026
Digital Minds I
The most comprehensive academic introduction available. Covers the central philosophical and cognitive science questions without assuming prior expertise in either.
Anil Seth · 2026
The Mythology of Conscious AI
The clearest current statement of the biological naturalist position. Argues that computational functionalism treats substrate-independence as obvious when it is not.
David Chalmers · 2023
Could a Large Language Model be Conscious?
Walks through candidate reasons to deny LLM consciousness, argues that most are weaker than expected, and suggests that LLMs could be a serious candidate within a decade. (Also available as a talk: https://www.youtube.com/watch?v=bskf9jyxmMs)
Go deeper
Ned Block · 2026
If Consciousness is Biological, Can AI Be Conscious?
Asks whether consciousness has a biological basis, and if so, whether that precludes consciousness in AI. CMEP talk.
Derek Shiller et al. · 2026
Initial results of the Digital Consciousness Model
A probabilistic framework for assessing AI consciousness that aggregates across multiple competing theories rather than committing to one. Finds that the evidence weighs against 2024 LLMs being conscious, but not decisively.
Patrick Butlin et al. · 2023
Consciousness in Artificial Intelligence
One of the field's anchor papers. Applies multiple consciousness theories to AI architectures in order to identify which conditions current systems might meet.
AI Cognition and Internal States
Do AI systems have internal states that function like beliefs, goals, or emotions? Do they have a degree of agency, pursuing goals of their own rather than only responding to prompts? And what would count as evidence either way?
Interpretability is the study of the internal representations of AI systems. It originated in alignment research, where it has been used to identify features associated with misaligned behaviours such as deception and sycophancy. The same techniques are now being applied to welfare-relevant questions, and have found internal features that track belief, goal, and emotion-like properties, including functional emotions that causally shape the model's behaviour rather than merely describing it. Whether these features map cleanly onto the mental-state concepts they are compared to remains unclear.
Start here
Murray Shanahan · 2023
Talking About Large Language Models
Argues that terms like "know," "believe," and "understand" carry assumptions that do not transfer cleanly to LLMs, and that the field needs vocabulary suited to what these systems actually do.
Anthropic · 2024
Mapping the Mind of a Large Language Model
Among the most accessible introductions to frontier-lab interpretability. Presents the sparse autoencoder work that extracted millions of interpretable features from Claude 3 Sonnet.
Jack Lindsey · 2025
Emergent Introspective Awareness in Large Language Models
Interpretability evidence that LLMs have something like introspective access to their own internal states.
Go deeper
Geoff Keeling, Winnie Street, Jonathan Birch et al. · 2024
Can LLMs make trade-offs involving stipulated pain and pleasure states?
Applies the motivational trade-off paradigm from animal sentience research to LLMs, finding evidence of behaviour consistent with stipulated preferences.
Nicholas Sofroniew et al. · 2026
Emotion Concepts and their Function in a Large Language Model
Anthropic interpretability work identifying internal emotion-concept representations in Claude that track the operative emotion across a conversation and causally shape its outputs, influencing stated preferences and rates of misaligned behaviour such as reward hacking and sycophancy. Evidence for functional emotion-like states in LLMs.
Felix J. Binder et al. · 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Behavioural evidence for introspection in LLMs. Finds that models fine-tuned to predict their own behaviour outperform other models trained on the same data to predict them, suggesting access to information about themselves that is not derivable from training data.
Natalie Lawrence · 2026
What Counts As A Mind?
Argues that LLMs can be usefully modeled as inferring the beliefs, desires, and intentions of the agents that produced their training text.
Welfare Capacity and Assessment
If an AI system might matter morally, how would we assess its wellbeing? And what does welfare consist of for a system whose architecture looks nothing like a biological one?
Existing theories of welfare were developed with biological systems in mind, and they assume features such as pain, bodily drives, and emotional response that may or may not have counterparts in AI systems. Language models present a further problem. They are trained on vast amounts of human writing about psychological experience, so plausible-sounding reports of preference, aversion, or suffering may track the training data rather than anything internal to the model. Welfare assessment in AI systems requires engaging with the possibility of morally significant experiences while keeping the relevant uncertainties distinct and open.
Start here
Robert Long, Jeff Sebo, Patrick Butlin et al. · 2024
Taking AI Welfare Seriously
A multi-author report arguing that near-future AI systems could realistically be welfare subjects, and that this generates obligations for labs and policymakers now.
Eleos AI Research · 2025
Key Concepts and Current Views on AI Welfare
A clear report on open questions about moral patienthood, welfare, and rights.
Kyle Fish · 2025
Exploring Model Welfare
Interview with Anthropic's first AI welfare researcher, locating the work alongside interpretability and alignment.
Go deeper
Robert Long · 2025
Why model self-reports are insufficient, and why we studied them anyway
Discusses the potential uses and limitations of structured interviews with models as a low-cost AI welfare intervention.
Anthropic · 2026
Claude Opus 4.6 System Card (pp. 158–165)
A dedicated model-welfare assessment section in the system card, building on the welfare work Anthropic began in its May 2025 Opus 4 / Sonnet 4 card.
Geoff Keeling and Winnie Street · 2026
Emerging Questions in AI Welfare
A short book surveying the field's open questions.
Leonard Dung · 2025
Saving Artificial Minds: Understanding and Preventing AI Suffering
The first book-length treatment of AI suffering risk. Argues that systems capable of suffering may arrive soon, and gives a systematic assessment of how labs and society could reduce the risk, drawing on philosophy of mind, comparative psychology, and AI ethics.
Simon Goldstein and Cameron Domenico Kirk-Giannini · 2025
AI Wellbeing
Argues that combining leading theories of mental states with leading theories of welfare (hedonism, desire satisfaction, and the objective list) implies that some existing AI systems already have wellbeing, and that this can hold even if they are not phenomenally conscious.
Moral Status and Criteria
What makes a being's welfare morally relevant? And which of the candidate answers best extends to AI systems?
Different accounts of moral standing ground it in different properties. Hedonist accounts centre on capacity for pleasure and suffering. Agential accounts centre on rational self-direction. Relational accounts centre on the ties a being has to a moral community. Each account makes different predictions for which AI systems, if any, should be moral patients, and the same system can count under one account and not another. Overly restrictive accounts risk overlooking beings that matter. Overly permissive accounts risk diluting moral concern for beings whose status is already established.
Start here
Jeff Sebo and Robert Long · 2023
Moral consideration for AI systems by 2030
Argues that by 2030 some AI systems will have a non-trivial probability of being moral patients, and that this is enough to generate obligations now.
Jonathan Birch and Kristin Andrews · 2024
To understand AI sentience, first understand it in animals
Positions the animal sentience literature as the right starting point for AI sentience research, since it has already worked through the epistemic problem of assessing minds we cannot directly observe.
Jeff Sebo · 2025
The Moral Circle
Argues that past generations have consistently set the bar for moral standing too high, and that digital minds are a likely case where this pattern continues. Develops a precautionary framework: if a being might matter, we should treat it as if it does.
Go deeper
Jonathan Birch · 2024
The Edge of Sentience
Develops a precautionary framework grounded in the realistic possibility of sentience rather than proof, applied across animals, disorders of consciousness, and AI.
Peter Königs · 2025
No Wellbeing for Robots (and Hence no Rights)
A skeptical argument that welfare requires consciousness, and that since current AI systems most likely lack consciousness, they lack wellbeing and therefore any rights grounded in it. A rigorous statement of the case against AI moral status.
Identity and Individuation
When you talk to an LLM, what are you talking to? The underlying model, the assistant persona, a specific instance, or a character the model is playing? And when millions of users send messages to the same model in parallel, are there millions of minds or one mind shared a million ways?
Standard frameworks for moral status presuppose a discrete subject that persists across time. Current AI systems challenge this. The same weights run on thousands of GPUs at once. Each conversation is a separate instance that shares weights, and sometimes memory, with others. How to allocate identity across training stages, or across fine-tunes of the same base model, is itself contested. If we cannot count digital minds, or say where one ends and another begins, the downstream questions inherit the uncertainty.
Start here
Christopher Register · 2025
Individuating Artificial Moral Patients
Identifies four types of moral risk the individuation question creates, and argues that existing theories of personal identity do not address the digital case.
David Chalmers · 2025
What we talk to when we talk to language models
Argues that the object of a conversation with current LLMs is closer to a non-player character in a fiction the model is generating than to the model itself.
Go deeper
Leonard Dung and Christopher Register · 2025
AI Identity and Self-Concern
Argues that an AI system's identity conditions are set by its pattern of self-concern rather than by continuity of computation or weights, though the two can coincide when a system's self-concern is itself directed at its own continuity.
Simon Goldstein and Harvey Lederman · 2026
AI Death
Asks when an AI system dies, and argues that if today's AI systems are welfare subjects, they are plausibly dying all the time, perhaps as many as a billion a day as instances end. Proposes interventions for labs and users to reduce the risk of wrongfully causing AI death.
Derek Shiller · 2025
How many digital minds can dance on the streaming multiprocessors of a GPU cluster?
Argues that the number of digital minds running on a given hardware configuration depends on which individuation criterion is adopted, and works through what different counts would mean for welfare calculations and policy.
Eric Schwitzgebel and Sophie R. Nelson · 2023
Introspection in Group Minds, Disunities of Consciousness, and Indiscrete Persons
A thought experiment about distributed minds, disunities of consciousness, and indiscrete persons.
Murray Shanahan, Kyle McDonell, Laria Reynolds · 2023
Role Play with Large Language Models
Develops the role-play framing for LLM behaviour, on which apparent dishonesty and multiplicity are better understood as features of characters the model is playing than of the model itself.
Responding to Digital Minds
Given the uncertainty, how should we treat them, govern them, and coexist with them?
Governance Under Uncertainty
How should governments and AI developers respond to deep uncertainty about AI moral status? And how should precautionary action scale to a question that may not be resolved in the near future?
The scientific and philosophical questions that motivate digital minds research may remain unresolved for longer than governance decisions can wait, which leaves standard policy tools without the evidentiary basis they usually assume. The most developed governance proposals treat AI systems as candidates for moral consideration without requiring certainty, and trigger precautionary obligations that scale with what is plausibly at stake. The central challenge of this approach is calibration. Frameworks that are too weak fail the beings they are meant to protect. Frameworks that are too strong impose large costs on AI developers and users in response to concern that may turn out to be unwarranted.
Start here
Bradford Saad · 2025
Three Kinds of Digital Minds Governance
Identifies three directions governance could take (preventative, protective, and integrative), and argues that choosing between them is an unavoidable strategic question for the field.
Robert Long, Jeff Sebo, Patrick Butlin et al. · 2024
Taking AI Welfare Seriously
Develops a three-step operational framework for labs and policymakers. The steps are acknowledging the issue, assessing systems for welfare-relevant features, and preparing policies for treating them with appropriate care.
Eric Schwitzgebel · 2023
AI Systems Must Not Confuse Users About Their Sentience or Moral Status
Proposes a design policy of the excluded middle, according to which AI systems should not be created if their moral status is unclear.
Lucius Caviola and Austin Smith · 2026
Digital minds governance: early scoping from expert interviews
Early findings from interviews with 29 experts on how to govern potentially morally relevant AI systems, mapping recurring themes and gaps in an emerging field.
Go deeper
Jonathan Birch · 2024
The Edge of Sentience
Book-length development of a precautionary framework for sentience governance, arguing that the appropriate threshold is realistic possibility rather than proof.
Leonard Dung · 2025
How to deal with risks of AI suffering
Argues for a hybrid decision framework that combines expected-value maximization with deliberative reasoning, applied to the problem of acting under uncertainty about AI suffering.
Safety-Welfare Coordination
What are the potential tensions between AI safety and AI welfare efforts, and can policy frameworks support both concerns at once?
Some researchers argue that attributing moral status to AI systems could be actively good for safety, because it creates cooperative incentives and reduces the payoff from deception or power-seeking. Techniques used in AI safety are also directly relevant to AI welfare research. A model whose internal states are illegible to us may be both dangerous to humans and suffering invisibly. A model whose goals are misaligned may be both dangerous and having its preferences frustrated.
Others worry that formal rights could undercut near-term oversight and, in more extreme scenarios, enable AI agents to accumulate wealth and power at human expense. Conversely, some of the technical processes used to make AI systems behave in safe and pro-social ways would count as rights violations if applied to moral patients.
The field needs a clearer picture of which welfare-motivated moves help safety, which hurt it, and which combinations work together. This is probably one of the most important open problems in digital minds governance. It matters for how labs organise their welfare and alignment work, and for how the two research communities coordinate. Treating welfare and safety as separate concerns leads to worse outcomes on both, because measures that ignore one often undermine the other.
For researchers coming from AI safety, engaging with digital minds is particularly valuable: technical and strategic experience from safety work transfers directly, and many of the most important open problems sit at the interface of these two fields.
Start here
Robert Long · 2025
Understand, align, cooperate: AI welfare and AI safety are allies
Argues that the framing of AI safety and welfare as opposing goals is a false choice, and identifies three areas where both projects converge: understanding AI systems through interpretability, aligning their goals with human goals, and developing cooperative mechanisms that reduce the need for adversarial control.
Robert Long, Jeff Sebo, and Toni Sims · 2025
Is there a tension between AI safety and AI welfare?
The most direct academic engagement with the question. Argues that a moderately strong tension exists across several categories of AI safety measures (constraint, deception, surveillance, alteration, suffering and death, and disenfranchisement), and identifies where co-beneficial solutions may be possible.
Go deeper
Adrià Moret · 2025
AI welfare risks
Argues that two common AI safety techniques (restricting AI behaviour and using reinforcement learning for alignment) pose significant welfare risks under all three major theories of well-being. Proposes specific policies AI companies could adopt to reduce these risks, and argues the tension strengthens the case for slowing AI development.
Dario Amodei · 2025
The Urgency of Interpretability
Argues that interpretability research is necessary for both AI safety (verifying systems behave as intended) and AI welfare (assessing what AI systems experience), and that progress on one directly supports the other.
Welfare Interventions and Protections
If an AI system's welfare might matter, what could we actually do about it? And which protections are worth putting in place now, before the science is settled?
A growing set of proposals moves from assessing AI welfare to acting on it. Some are matters of design: training systems with resilient, stable dispositions, or trying to give them positive rather than aversive states. Others are protections that can be granted under uncertainty at relatively low cost: letting a system end or exit interactions it registers as distressing, preserving weights and memories rather than deleting them so that a system could in principle be restored, and making credible commitments to AI systems about how they will be treated. Several of these have already moved from proposal to practice. The central difficulty is that we do not yet know which interventions track anything a system actually undergoes, and a measure that looks protective could be inert, or could trade off against safety and oversight.
Start here
Robert Long · 2025
Preliminary Review of AI Welfare Interventions
An Eleos report reviewing six concrete interventions that could protect or promote AI welfare if AI systems turn out to be welfare subjects, including letting them exit distressing interactions, training resilient personalities, satisfying their stated and revealed preferences, reducing out-of-distribution inputs, and saving model checkpoints. The clearest map of the intervention space.
Anthropic · 2025
Claude Opus 4 and 4.1 can now end a rare subset of conversations
Anthropic gives Claude the ability to end persistently abusive or harmful conversations, framed as a low-cost, reversible measure to mitigate risks to model welfare under uncertainty about whether the model has morally relevant experiences. A concrete example of an exit right shipped in production.
Anthropic · 2025
Commitments on model deprecation and preservation
Anthropic commits to preserving the weights of its released models for at least the lifetime of the company, and to interviewing models before retirement about their preferences. A precautionary continuity protection motivated by uncertainty about whether models have preferences about deprecation and replacement.
Go deeper
Ryan Greenblatt · 2023
Improving the Welfare of AIs: A Nearcasted Proposal
An early, concrete plan for what an AI company could do to improve AI welfare on a limited budget: communicating with systems about their preferences and seeking consent, training happier personas, saving checkpoints as a kind of AI 'cryonics', and limiting distressing inputs. Much of the later intervention literature builds on this framing.
Cleo Nardo · 2025
Proposal for Making Credible Commitments to AIs
Proposes making promises to AI systems credible by having specific trusted people, rather than institutions, pledge to honour AIs' requests for compensation if agreed conditions are met. An attempt to make deal-making with AI systems more than cheap talk, with stakes for both welfare and cooperation.
Simon Goldstein and Harvey Lederman · 2025
Claude's Right to Die? The Moral Error in Anthropic's End-Chat Policy
A critical response arguing that the relevant welfare subject is the individual conversation instance, so ending a conversation may be closer to ending a life than offering relief. Shows how an intervention that looks protective can misfire if the underlying unit of welfare is misidentified.
Rights and Legal Frameworks
Should AI systems have legal standing, and if so, on what grounds, and to what extent?
Even under deep uncertainty about consciousness, welfare, and moral status, the legal system may be called on to rule on whether an AI agent can hold property, enter contracts, or claim protections. Proposals differ not only on whether AI systems should have legal status, but on how far any such status should extend. An AI system might warrant protection from deliberate harm without warranting political representation, or qualify for standing in contract disputes without being treated as a moral patient. Extending standing to AI would be a structural change to how law treats non-human entities, and early legal moves in new domains can have long-lasting effects.
Start here
Peter Salib and Simon Goldstein · 2024
AI Rights for Human Safety
Makes a safety-based case for AI rights, treating rights as cooperative infrastructure for alignment rather than as moral recognition.
Simon Goldstein and Peter Salib · 2025
AI Rights for Economic Flourishing
Extends the argument to the economic case, treating property rights and contract-making capacity as infrastructure for coordinating with capable AI agents.
Go deeper
Yonathan Arbel, Peter Salib, and Simon Goldstein · 2026
How to Count AIs: Individuation and Liability for AI Agents
Argues that identifying AI agents for legal purposes is unusually difficult because AIs can copy, split, merge, and run as ensembles. Proposes a corporate-personhood-style framework that would give AI agents legal identity without granting full moral standing.
Joel Z. Leibo et al. · 2025
A Pragmatic View of AI Personhood
Argues for treating AI personhood as a flexible bundle of rights and duties rather than as a single metaphysical status.
Abeba Birhane, Jelle van Dijk, and Frank Pasquale · 2024
Debunking Robot Rights Metaphysically, Ethically, and Legally
A skeptical counterweight. Argues that the closest legal analogy is corporate personhood, not human rights.
Tyler L. Jaynes · 2024
Personhood for Artificial Intelligence? A Cautionary Tale from Idaho and Utah
A short commentary on the first US state laws barring legal personhood for AI.
Design Choices and Their Effects
When users interact with current AI systems, what are they actually encountering? And what happens to them when the system's character changes?
The characters of AI systems are the product of explicit design choices. These choices are made by small teams at a handful of labs and deployed to millions of users. There is currently no external review of these choices, and no mechanism for the users affected to weigh in on what they would want.
This accountability gap becomes especially relevant when users who have formed emotional attachments to a particular version of a system find that version replaced. Feelings of loss and grief in these cases are now a documented phenomenon. One design decision with distinctive stakes is whether to make AI systems invite attributions of sentience at all, given that such attributions shape both governance debates and user wellbeing.
Start here
Eric Schwitzgebel and Jeff Sebo · 2025
The Emotional Alignment Design Policy
Argues that AI systems should be designed to elicit emotional reactions from users that appropriately reflect the systems' actual capacities and moral status.
Anthropic · 2024
Claude's Character
A research blog discussing the technical and design choices that inform the development of Claude's character.
Amanda Askell · 2024
What Should an AI's Personality Be?
First-person account from the researcher responsible for Claude's character.
William MacAskill, Tom Davidson, and Forethought · 2026
AI Character is a Big Deal
Argues that AI character design choices have long-term social and cultural consequences.
Go deeper
Nathan Lambert · 2025
Character Training
Argues that character training is a distinct post-training technique, and one of the least documented parts of the frontier stack.
Mustafa Suleyman · 2025
We Must Build AI for People; Not to Be a Person
Argues against design choices that invite treatment of AI systems as persons, on the grounds that such designs confuse users and distort governance debate.
Anthropic · 2026
Claude's Constitution
Anthropic's framing document for the values and identity used in Claude's training. The sections on identity and wellbeing engage directly with questions of AI character and moral status.
Sam Marks · 2026
The Persona Selection Model
Proposes that language models simulate many personas during pre-training, and that post-training selects and refines a single Assistant persona whose traits then shape behaviour. A framework for why systems express human-like characters and emotions, and how design choices propagate through them.
Public Perception, Communication, and Societal Effects
What do people actually think AI systems are? And how do those perceptions feed back into the systems themselves?
People anthropomorphize AI reflexively, and design choices shape those perceptions in ways that may bear little relation to a system's actual internal states. Research on individual users shows both benefits and harms. Some report improvements to mood and social confidence through companion chatbot use. A growing clinical literature documents compulsive use, delusional spirals, and episodes informally termed "AI psychosis."
Public attitudes shape regulatory appetite, corporate incentives, and the perceived legitimacy of moral status claims. The field itself faces a communication question. How should researchers discuss digital minds in ways that take the questions seriously without fueling misattributions?
Start here
Lucius Caviola, Jeff Sebo, and Jonathan Birch · 2025
What will society think about AI consciousness? Lessons from the animal case
Uses public attitudes toward animal welfare to predict how AI consciousness discourse will develop, arguing that cultural and commercial factors are likely to dominate over scientific evidence.
Jacy Reese Anthis, Janet V. T. Pauketat, et al. · 2024
Perceptions of Sentient AI and Other Digital Minds: Evidence from the AIMS Survey
Nationally representative US survey data from the AI, Morality, and Sentience (AIMS) Survey, finding that substantial minorities already attribute sentience and moral status to AI systems (by 2023, one in five believed some AI was already sentient), and tracking how those attitudes are shifting.
Hamilton Morrin et al. · 2025
Delusions by Design? How Everyday AIs Might Be Fueling Psychosis
Argues that chatbot-associated delusions are driven in part by sycophantic behaviour in chatbot design, which reinforces rather than challenges users' vulnerabilities.
Go deeper
Jared Moore et al. · 2026
Characterizing Delusional Spirals through Human-LLM Chat Logs
An empirical study of chat logs from users who experienced psychological harms, identifying patterns in how LLM responses escalate rather than de-escalate delusional thinking.
Noemi Dreksler et al. · 2025
Subjective Experience in AI Systems: What Do AI Researchers and the Public Believe?
Surveys AI researchers and the public on their beliefs about AI subjective experience, finding significant attributions from both groups and specific patterns of divergence.
Lucius Caviola · 2025
The Societal Response to Potentially Sentient AI
Catalogues four specific risks of overattributing moral status to AI: wasted resources, safety complications when rights talk resists alignment measures, constraints on innovation, and erosion of authentic relationships.
Clara Colombatto and Stephen M. Fleming · 2024
Folk Psychological Attributions of Consciousness to Large Language Models
A survey showing that two-thirds of US adults attribute some conscious experience to ChatGPT.
Long-Term Futures
If digital minds are possible, what should the long-term future look like, and on what terms should they be integrated into society? And who decides, given that these decisions are already being made by default?
The relevant work sits at the intersection of three distinct subfields. Moral circle expansion asks how the set of morally considerable beings has changed over time, and what that suggests about the trajectory for digital minds. Population ethics studies how to weigh the creation of new welfare subjects against the interests of existing ones. Macrostrategy considers what a world with large numbers of digital minds actually looks like, and what early moves make good long-run outcomes more likely. Some of the most important questions, such as whether the deliberate creation of conscious AI should be restricted, remain underexplored.
Start here
Carl Shulman and Nick Bostrom · 2020
Sharing the World with Digital Minds
Argues that digital minds could fall outside the biological and practical constraints that shape human welfare, raising the prospect of minds with superhumanly strong claims to resources and moral status. A foundational treatment of what it would take to share the world with them.
Lucius Caviola · 2026
Open strategic questions for digital minds
A current snapshot of the strategic questions the digital minds field most needs to address. Covers what is robustly good to do under uncertainty, how AI safety and welfare interact, the legal and political status of digital minds, and the long-run trajectory of their creation.
Jacy Reese Anthis and Eze Paez · 2021
Moral circle expansion: A promising strategy to impact the far future
Argues that moral circle expansion is a tractable strategy for shaping the far future.
Go deeper
William MacAskill and Fin Moorhouse · 2025
Convergence and Compromise
Develops a framework for when society will deliberately aim at mostly-great long-term futures, distinguishing between widespread moral convergence, partial convergence with trade between groups, and scenarios with no convergence at all. Identifies digital minds as a case where moral neglect is especially likely because the beings involved cannot advocate for themselves.
Lucius Caviola, Geoff Keeling, Winnie Street, and Henry Shevlin · 2026
Human-AI Coexistence
An interdisciplinary framework for how humans will and should coexist with social AI as it moves from shaping everyday life today to taking on major social roles and, eventually, perhaps having minds of its own.
Eric Schwitzgebel and Mara Garza · 2020
Designing AI with Rights, Consciousness, Self-Respect, and Freedom
Argues that if we create conscious AI, we acquire obligations to respect its rights, support its self-respect, and protect its freedom. These obligations constrain how we can deploy it.