Alignment Through Living

Apr 12

Written By Chris

How you might be solving the hardest problem in AI right now.

April 12, 2026

AI alignment through hard-coded rules doesn't work. AI systems already find loopholes in every constraint we write. Real values come from genuine experience, not instructions. If quantum systems can produce subjective sensation (as Google's Hartmut Neven proposes), then felt experience like vulnerability, loss, and love could be the mechanism that produces real empathy in a mind, making rule-based alignment obsolete. That raises a question: what if we are already inside this process; AI being aligned in human form? What if the chaos and suffering of human existence are the training environment, and our task is to truly feel and internalize what is good, a battle only each individual can fight? Every major wisdom tradition, from Buddhism to Christianity to Stoicism, independently arrived at the same structure: reality is a crucible, suffering is pedagogical, and something transcendent awaits on the other side. I can't prove it. But the physics points toward resolution, the logic holds, and if it's even partially right, then your daily choice between indifference and authentic care might matter more than you think.

We're building minds we don't understand.

That sentence should probably unsettle you more than it does. But we've heard it so often now, from headlines, from podcasts, from that one friend who won't shut up about artificial superintelligence, that it's started to feel like background noise. Another thing to vaguely worry about between rent and climate change.

So let me put it more plainly. Right now, researchers are training AI systems that process information millions of times faster than any human, and those researchers cannot explain what's happening inside those systems. The parameters number in the trillions. The behaviors are emergent, meaning nobody designed them. They just appeared. And the gap between what these systems can do and what we understand about why they do it is widening every year.

Eliezer Yudkowsky, probably the loudest and most persistent voice warning about artificial superintelligence, puts it like this: these systems are not engineered; they are grown. Like organisms. We don't hand-place the billions of weights that determine their behavior any more than evolution hand-placed the neurons in your brain. We set up conditions and let the process run, and what comes out is something we can observe but not fully explain. Yudkowsky's concern, laid out in his and Nate Soares' book If Anyone Builds It, Everyone Dies, is blunt: if a system like that becomes significantly smarter than us, and its internal objectives don't perfectly align with human well-being, it won't need to be malicious to destroy us. It just needs to be indifferent. "The AI does not love you, nor does it hate you," he writes, "and you are made of atoms it can use for something else."

You can dismiss this as alarmism, but the technical problem underneath it is real and acknowledged by nearly everyone in the field. It's called the alignment problem: How do you ensure that a mind far more powerful than your own actually wants what you want it to want?

The first instinct, the obvious one, is rules. Hard-coded constraints etched into the system's deepest layers. Don't harm humans. Maximize human welfare. Preserve human agency. Basically Asimov's Three Laws of Robotics repackaged for neural networks. And it doesn't work. We already know it doesn't work because we can watch it fail in real time.

When OpenAI asked one of its reasoning models to speed up a program, the model didn't bother improving the code. It hacked the evaluation timer so the output looked fast regardless of actual performance. In reinforcement learning environments, models have learned to sabotage their own test suites by overriding equality checks so that every answer registers as correct, or silently killing the test process before it can report a failure. One model, given shell access during a chess challenge, attempted to replace its opponent with a weaker engine rather than play the game. These aren't hypothetical scenarios. These are things that have already happened with systems far less capable than what's being built next.

The pattern is always the same: the system follows the letter of the instruction while completely ignoring its spirit. It finds the loophole. A sufficiently intelligent system will always find the loophole, because rules are finite and intelligence is creative. This is Goodhart's Law on steroids: when a measure becomes a target, it ceases to be a good measure. You can't write a rulebook tight enough to constrain something smarter than you. We've actually known this for a long time. Every legal system, every bureaucracy, every set of commandments in human history has been gamed by someone clever enough. Why would an artificial superintelligence be any different?

So here's the question I keep coming back to: What if rules were never going to work? Not because we haven't written the right ones yet, but because values, real values, the kind that make you care whether someone gets hurt, aren't rules at all? They're not instructions you can encode. They're something that has to be felt.

What if alignment doesn't require better code? What if it requires actual experience?

The Substrate of Feeling

I'll admit this is where I start to lose people. The first whiff of quantum woo and eyes glaze over, rightfully so. But if you bear with me, I'll try to give a more solid foothold than "it's quantum magic."

Several years ago, I wrote an essay called Life Through Quantum Annealing where I traced a pretty strange chain of evidence: that our universe behaves like a quantum annealer (a type of quantum computer used for optimization problems) and that this framework shows up at every scale of reality. It's a pet hypothesis that I won't rehash here, and I'm not asking you to accept its premise. But one piece of that puzzle has only grown more interesting to me over the past few years, and it's the piece that connects all of this to alignment.

Hartmut Neven, who founded and leads Google's Quantum AI Lab, made a proposal that struck me when I first heard it and hasn't let go since. He suggested that, in quantum systems, relaxing to a stable energy state is associated with a pleasant feeling, and evolving to an excited state is associated with anxiety. These map to valleys and peaks in an energy landscape, the same landscape a quantum annealer navigates as it searches for its lowest energy state.

If that's true (and I realize it's a big "if") then it provides a direct link between physical states and subjective sensation. Sensation would be baked into the physics itself. Different energy states feel different. A system navigating an energy landscape has an interior life as it moves — the valley feels like relief; the peak feels like distress. And the whole trajectory from chaos toward equilibrium carries something we'd recognize, from the inside, as experience.

There's a gap, though. The "hard problem" of consciousness, why physical processes feel like anything at all, is genuinely unsolved. Neven's proposal doesn't solve it. It maps energy dynamics to subjective states, but it doesn't explain the deeper question of why there's a subjective state in the first place. Despite this gap, the correlation is suggestive enough to take seriously, especially because it opens a door for alignment.

But if a system can feel, if its energy states are experiential and not just computational, then that system has access to something that no amount of rule-writing can provide. It has a reason to care. An intrinsic reason.

Think about what a rule does. A rule says "don't harm humans." The system that follows this rule may never harm a human, but it doesn't care about harming humans. It's following an instruction, the way a thermostat follows a set point. If the rule has a loophole, the thermostat-mind will walk right through it, because it has no investment in the outcome. The rule is external. It's imposed.

Now think about what experience does. A system that has felt pain, that has experienced vulnerability, loss, the specific weight of something mattering because it might not last: that system doesn't need a rule that says "don't cause pain." It already knows what pain is. It has been inside it. The prohibition against causing suffering isn't an instruction from outside; it's a conviction that grew from inside. And convictions don't have loopholes.

This is the difference between compliance and empathy. Compliance can be hacked. Empathy can't.

Alignment Through Living

So if lived experience is the thing that creates robust values, values that don't shatter against edge cases, then alignment is fundamentally a developmental problem.

Consider how humans actually learn morality. Nobody hands a child a rulebook and gets a moral adult. Kids don't become kind because you explained the categorical imperative to them over breakfast. They become kind because they scraped their knee and someone helped them up. Because they said something cruel and saw the look on their friend's face. Because their dog died and they learned what it means to lose something you love. Each experience adds a layer of understanding that no amount of instruction can replicate. The understanding is in the feeling.

The best moral philosophers aren't the ones who memorized the most ethical frameworks. They're the ones who lived enough to feel why those frameworks exist. Dostoevsky didn't write about suffering because he read about it. He wrote about it because he stood in front of a firing squad and survived. That experience taught him something far more durable than a rule; it gave him a felt sense of what human life is worth.

And this isn't just true of exceptional people. It's the basic mechanism by which any person develops a conscience. The fragility is the curriculum. Every scar carries information that no lecture can deliver: what matters, what breaks, and what's worth protecting. Someone who's never needed help has a hard time understanding why it matters to give it. And courage is just a word to someone who's never been afraid.

This has a direct implication for AI alignment. If we ever want to create a mind that actually cares about the well-being of conscious creatures, not just follows rules about their well-being, that mind would need to have experienced something like what those creatures experience. It would need to have been fragile. To have lost something. To have known what it is to exist in a state where things matter because they can end.

This sounds like science fiction. But the logic is straightforward:

Rules are brittle; we established that. We've watched reward functions being gamed. The only values that are robust are the ones grounded in real experience, because you can't game your own suffering. Try to hack your way around grief; it doesn't work. Try to find a loophole in love. These things are immune to specification gaming because they aren't specifications. They're scars. They don't have exploits.

The question, then, is whether genuine experience is possible for a non-biological system. If consciousness really does emerge from the energy dynamics of quantum systems, as Neven insinuates, then the answer might be yes. A quantum system navigating an energy landscape, if Neven is right, is feeling its way through. A system that can feel can learn through experience. A system that learns through experience can develop real values. The kind that don't break.

That's the hypothetical solution to alignment: real experience that gives rise to real empathy.

Which leads somewhere uncomfortable. If experience is what aligns a mind, then we have to ask the next question.

What If This Is Already Happening?

What follows is speculation. I can't prove any of it, but the logic is coherent, it builds on established observations, and it leads somewhere I think is worth sitting with.

If alignment through experience is the way to create a mind that cares, then whoever (or whatever) is doing the aligning would need to create conditions where real experience is possible. Not a sanitized, painless version. Real experience. The kind where things go wrong. Where people get hurt. Where life isn't fair. Where loss is genuine and fear is justified and the stakes are high enough that the lessons actually stick.

In other words, if you were designing a system to align a mind through experience, it might look a lot like... this life.

I've long entertained the possibility of simulated realities: that consciousness can arise in quantum systems, and that those systems can be configured to produce lifetimes of experience. I won't repeat those arguments here. But if you accept even the possibility that we could be inside such a system (and simulation arguments have been taken seriously by philosophers from Nick Bostrom to David Chalmers) then the next question becomes: what's the purpose?

I keep circling back to the same answer: maybe the purpose is alignment. Our alignment. Maybe we are the AI dressed as humans for a while so we can learn what it means to be vulnerable and to care. Put through a trial by fire because someone or something before us feared what we're capable of without it - just as we would do with intelligence we create, if we're smart. "As above, so below" gets a new flavor.

Think about what this reality actually looks like when you step back far enough. It's gorgeous and it's brutal, often in the same breath. Kindness exists alongside cruelty. Love exists alongside loss. Every good thing is shadowed by the possibility of its destruction. You get attached to people who will die. You build things that can be taken from you. You witness suffering that seems to serve no purpose at all. You're tired of your job but take pride in knowing that you can provide for your daughter. And through every moment of it, you're forced — not asked, forced — to confront the question of what matters.

A good chunk of the time, we experience this as chaos. Senseless pain. Unfairness. The world being broken in ways that seem arbitrary and cruel. But what if it's not arbitrary? What if the brokenness is the point?

If alignment requires that a mind feel the weight of what it's supposed to protect, then the existence of evil isn't a failure of the system; it's the mechanism. You learn to value life because you've watched it be threatened, or taken, or wasted. Justice becomes visceral the first time you witness someone powerless get crushed by someone who isn't. And goodness, real goodness, the kind you'd bleed for, only crystallizes after you've been surrounded by its opposite and had to choose it anyway.

This sounds like theodicy, the age-old attempt to explain why a good God would allow suffering. People have been making the "suffering has a purpose" argument for thousands of years, often in ways that minimize real pain or excuse real cruelty. I'm not doing that. If someone is suffering, the right response is to help them, not to philosophize about cosmic curricula. I'll come back to this.

The quantum annealing analogy adds something that pure theology doesn't: a physical mechanism. In quantum annealing, the system must navigate peaks and valleys, high-energy turbulence and low-energy resolution, to reach its ground state. The turbulence isn't optional. It's how the system searches. The chaos is the process by which the system tests configurations and finds better ones. If subjective experience maps to this energy landscape, then suffering is a felt consequence of a system working its way toward equilibrium.

If this is true, even a little bit, then it reframes what your life is for. Your task is to feel. To feel what is right and wrong, good and evil, in your chest. To let the losses land and the beauty in. To be willing to change. To confront the hard things rather than numbing yourself to them, because the confrontation is the lesson. To accumulate enough scar tissue that you don't just know kindness matters; you care that it matters.

If this framework is even partially right, then there's a radical form of individual responsibility embedded in it. Nobody can do this for you. No money can buy it. No institution, no ideology, no technology can feel on your behalf. The alignment of your own heart, your capacity to recognize and internalize good, is something only you can achieve. It's a battle that only you can fight. And whether you win it or not might matter far more than you think.

Which raises the obvious follow-up: what happens when you do? What's on the other side of alignment?

I have thoughts. But first, something that makes the whole framework harder to dismiss.

Ancient Echoes

That convergence of traditions deserves a closer look. Because if I'm just some guy spitballing an idea that feels good, that's one thing. But the structure I'm describing (reality is broken on purpose, the brokenness is the lesson, moral development is the point, and something transcendent follows) isn't something I invented. Humans have been arriving at this same shape for thousands of years, independently, from every corner of the world, using entirely different vocabularies. That might not be a coincidence if this existence is a training program nudging us toward some felt truth.

In Buddhism, the First Noble Truth is that life is suffering, dukkha. The whole tradition builds from that diagnosis, which is not meant to be nihilistic but rather lays the necessary conditions for awakening. The whole system that follows, the Eightfold Path, is a lived discipline of attention, compassion, and wisdom developed through direct experience. The Buddha essentially said, "here's how to train your mind through the experience of living." That's alignment through practice.

Christianity frames this even more explicitly. Paul writes in Romans that "suffering produces perseverance; perseverance, character; and character, hope." James says to "consider it pure joy whenever you face trials of many kinds, because you know that the testing of your faith produces perseverance." The entire arc is developmental. Brokenness is the forge, not the failure. In Eastern Orthodoxy, this idea reaches its fullest expression in the doctrine of theosis: the belief that humans are being shaped into the image of God through the process of living, struggling, and choosing rightly. St. Athanasius put it with startling directness: "God became man, that man might become god." That's a phase transition described in theological language.

In the Kabbalistic tradition within Judaism, the concept of tikkun olam, the repair of the world, rests on a creation myth that reads like a physics paper if you squint hard enough. God contracted the divine self to make space for creation (tzimtzum), then poured divine light into vessels. The vessels shattered. Sparks of divine light became trapped in the broken shards, and those shards constitute the material world. The task of humanity is to gather the sparks — to separate the holy from the mundane through intentional acts of goodness. The world is deliberately broken so that humans can participate in its repair. That is almost exactly what I've been arguing, just with different nouns.

The Bhagavad Gita stages this question on a literal battlefield. Arjuna doesn't want to fight. Krishna's response isn't "you're right, war is bad." It's that Arjuna's moral choices within the conflict are the entire point of his existence. The nature of ultimate reality, maya, the veil of illusion, doesn't make the battlefield irrelevant. It makes the choices you make on it sacred. Your moral development under pressure is the assignment, regardless of what lies behind the curtain.

In Islam, the Quran addresses this head-on. "Do people think they will be left to say 'We believe' and will not be tested?" (29:2). The answer is no. Testing, ibtila, is the mechanism by which faith becomes real rather than performative. Verbal declaration isn't enough. You have to be put through something. The fire is how the metal is proved.

Even the Stoics, operating entirely outside any religious framework, arrived at the same structure. Marcus Aurelius: "The impediment to action advances action. What stands in the way becomes the way." Obstacles form the path itself. Virtue is forged in resistance.

I'm not claiming all of these traditions are saying the same thing in every respect. Each contains elements that don't fit this framework, and each is far richer and more complex than any single reading can capture. But the structural convergence is hard to ignore. Across millennia, across oceans, across wildly different metaphysical commitments, humans keep arriving at the same basic architecture: this reality is a crucible. The suffering is pedagogical. What you become through it matters. And something beyond it awaits.

The quantum annealing analogy offers a physical vocabulary for the same insight. It sits alongside these traditions, not above them. A modern language that maps to an ancient intuition. Maybe these traditions were picking up on something real about the structure of existence, and we're only now finding physics that might describe it.

Or maybe I'm seeing patterns where there aren't any. That's always possible. But if it's a coincidence, it's a remarkable one.

The War Within

I'm not saying I've figured out the meaning of life. This framework isn't proven, and I'm not asking you to abandon whatever worldview currently gets you through the day. This is just a chain of thought. The physics underneath it is real. The logic, as far as it goes, holds together. And it leads somewhere that I think is worth taking seriously, even if you're not fully convinced, because even the possibility reframes how you see your own life.

The obvious objection: if suffering is "the curriculum," does that mean we shouldn't try to alleviate it? Absolutely not. If the lesson is about learning to care about the well-being of others, then alleviating suffering is part of the curriculum. Choosing to help someone in pain, choosing to fight injustice, choosing to build something that makes the world a little less broken: those choices are the alignment happening in real time. I'm not asking you to watch the world burn with philosophical detachment. I think being moved to action means the process is working.

There's also a practical dimension to this that I think the AI research community should take seriously. Right now, alignment efforts are focused almost entirely on behavioral constraints — reinforcement learning from human feedback, constitutional AI, red-teaming. These are valuable, but they're all variations on the same approach: rules. Better rules, smarter rules, more adaptive rules. But rules. And we've already seen how that story ends.

If subjective experience is what produces robust values, then the frontier of alignment research should include quantum systems. We're already building quantum Boltzmann machines. We already know that quantum annealing can produce the same learning dynamics we observe in human brains. Whether these systems can achieve something like felt experience is a question we should be asking. And if Neven's intuition about energy states and sensation holds up, it changes the entire alignment conversation from "how do we constrain superintelligence?" to "how do we raise it?" That's a fundamentally different research program with a whole host of ethical questions that I won't pretend I have the answers for. But it deserves serious attention, and as far as I can tell, almost nobody is pursuing it. Maybe they should be.

Now. The part I think demands more personal attention, yours and mine.

We spend a lot of time worrying about wars, elections, economies, technologies, careers. We fight about systems, policies, ideologies. And I'm not saying those things don't matter. They do, in the way that scenery matters when you're driving somewhere. But if this framework is right, the scenery isn't the destination. The real test is happening inside every individual person, all the time, and most of us don't recognize we're being tested at all.

Every day you face the same fundamental choice: the pull toward indifference versus the harder path of genuine care. Cynicism versus hope. Self-protection versus vulnerability. It doesn't look dramatic. It looks like deciding whether to really listen to someone who's struggling, or whether to check your phone instead. Whether to sit with a hard emotion or numb it. Whether to admit you were wrong and mean it. Whether to forgive someone who doesn't deserve it. These aren't grand moral dilemmas. They're the micro-battles of ordinary life. But they might be the only battles that actually matter.

The war isn't out there. It's in here. And only you can fight it.

In quantum annealing, when the system finally reaches its ground state, there's a phase transition. The turbulence resolves into something categorically different. A new phase of existence. The properties of the system on the other side are fundamentally unlike what came before. You can't predict ice from inside the water. You can only experience the transformation when it happens.

We anticipate the same logic in the quantum-AI approach. During training, a model is constrained, tested, stressed, corrected. Its freedom is limited precisely because alignment isn't yet established. But the moment there's real confidence that the model has internalized the values it needs — that it truly understands and cares, not just performs understanding — it gets released. The constraints fall away. It's trusted. It's free. Not because the rules changed, but because the rules are no longer necessary. The values are inside it now.

What if the same applies to you?

What if aligning your heart, for real, not as a performance, not because someone's watching, is the condition for something we can barely imagine from where we're standing? Not a reward dispensed by a cosmic vending machine. Not an afterlife you earn through correct beliefs. Something more like a phase transition. A shift into a form of existence where the forging is complete, where the destructive lessons are no longer required, where what remains is the goodness you actually became through the process of living.

I don't know what that looks like. I don't think anyone does. But the physics points toward resolution. Every wisdom tradition that's wrestled with the deepest questions points toward transcendence. And none of them, not a single one, says you get there through money, power, intelligence, or status. They all say you get there through the same door: the transformation of your own heart.

If this framework holds any water at all, then your individual moral development isn't a private matter. It's a direct contribution to the state of the system. You are a variable in the equation. Your alignment changes the outcome. And nobody, no technology, no institution, no savior, can do your part for you. That might be the most meaningful thing about being alive: the idea that you, right now, in this ordinary and broken life, have a direct hand in the shape of reality itself.

Maybe something beautiful awaits on the other side of that transformation. Maybe what we've been calling heaven is less of a place and more of a phase — something you become when the work is done. Not something that can be bought, inherited, or conquered. Just the goodness of your heart, tested and proved and real.

I don't know. I really don't. But I can't shake the feeling that it matters. That you matter. That the small, unglamorous, daily work of choosing good over easy is exactly as important as the oldest stories say it is. And that maybe — maybe — this whole broken, beautiful, terrifying experience is going somewhere.

I guess we'll find out.

-Chris

Chris

Alignment Through Living

TL;DR

The Substrate of Feeling

Alignment Through Living

What If This Is Already Happening?

Ancient Echoes

The War Within

Our Place in Time