MAN\SH AI

Linux Internals Deep Dive

When the Kernel Loses Its Mind

A thorough guide to kernel panics, the OOM killer, hung tasks, soft lockups, and everything that makes a Linux system scream into the void.

📅 June 6, 2026 ⏱ 20 min read 🐧 Linux Internals 🔧 Intermediate / Advanced

§ 01 Introduction — The Operating System's Last Resort

Every operating system, no matter how well-engineered, has moments where it hits a wall it cannot safely recover from. On Linux these moments show up as kernel panics, OOM kills, soft lockups, and a whole taxonomy of other dramatic failures.

Understanding these events is not just academic. If you run servers, debug embedded systems, or maintain production infrastructure, these failure modes are the difference between a five-minute recovery and a three-hour war room call at 2 AM. This piece goes deep: the mechanics, the source-level logic, real log examples, and concrete mitigations.

Scope

Examples are drawn primarily from Linux kernel 5.x and 6.x on x86_64, though the concepts apply broadly. Where specific sysctls or file paths are mentioned, they refer to mainline Linux.

We start with the most dramatic case, the kernel panic, and work downward into softer but still dangerous failure modes. By the end, you should understand not just what happened, but why the kernel made the choice it did and how to bend that behavior toward safer production outcomes.

§ 02 Kernel Panic — The Point of No Return

A kernel panic is Linux throwing its hands up. It is the kernel's admission that continuing execution would be more dangerous than stopping entirely. Once panic() is invoked, the system halts or reboots, depending on configuration, and the message is sent to every console it can reach.

What actually triggers a panic?

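The kernel calls panic() when continuing would risk data corruption, security boundary failure, or undefined behavior at the hardware level. Common triggers include fatal hardware exceptions, loss of PID 1, an oops raised in interrupt context, and policy decisions such as panic_on_oops and panic_on_oom. A null dereference in kernel code normally produces an oops first; whether that oops escalates to a panic depends on the context and on panic_on_oops.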

Trigger                     | Example scenario                                    | Severity
Oops with panic_on_oops     | Kernel code dereferences a null pointer             | Configurable
Double fault                | Fault handler itself faults                         | Always fatal
Machine check exception     | CPU detects an uncorrectable hardware error         | Usually fatal
No init process             | PID 1 exits or never starts                         | Always fatal
OOM with panic_on_oom       | System runs out of memory and policy prefers panic  | Configurable
Soft or hard lockup policy  | Watchdog sees CPU stuck in kernel mode              | Configurable

A real panic message, annotated

dmesg output — kernel panic
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 2847 Comm: kworker/3:2 Tainted: G           O
RIP: 0010:my_driver_handle_event+0x1f/0x60 [my_driver]
Call Trace:
 process_one_work+0x1cd/0x380
 worker_thread+0x4d/0x3e0
 kthread+0x12a/0x160
CR2: 0000000000000008
Kernel panic - not syncing: Fatal exception in interrupt

The address 0x8 is a classic clue: something was null, and the kernel tried to dereference a struct field 8 bytes in. RIP tells you where execution was. CR2 confirms the faulting address. The call trace gives you the breadcrumb trail back to the source line or module.
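
As a rough illustration of that triage, the sketch below pulls the faulting symbol, offset, module, and CR2 value out of an oops excerpt. The helper name summarize_oops, the regular expressions, and the "first page" heuristic are assumptions made for this example, not part of any kernel tool, and oops formats differ between kernel versions.

```python
import re

# Patterns written against the excerpt in this article; other kernels may differ.
RIP_RE = re.compile(
    r"RIP: \w+:(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)
CR2_RE = re.compile(r"CR2: (?P<addr>[0-9a-f]+)")


def summarize_oops(log_text):
    """Hypothetical helper: extract the first-look facts from an oops."""
    summary = {}
    rip = RIP_RE.search(log_text)
    if rip:
        summary["symbol"] = rip.group("symbol")
        summary["offset"] = int(rip.group("offset"), 16)
        summary["module"] = rip.group("module")  # None when no [module] tag
    cr2 = CR2_RE.search(log_text)
    if cr2:
        addr = int(cr2.group("addr"), 16)
        summary["fault_addr"] = addr
        # Assumption: a fault in the first 4 KiB suggests NULL plus a field offset.
        summary["likely_null_deref"] = addr < 4096
    return summary


SAMPLE = """\
RIP: 0010:my_driver_handle_event+0x1f/0x60 [my_driver]
CR2: 0000000000000008
"""

print(summarize_oops(SAMPLE))
```

A small fault address together with a module tag on the RIP line is usually enough to decide which source tree to open first.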

The panic() path

kernel/panic.c — simplified flow
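void panic(const char *fmt, ...) {
    static char buf[1024];
    va_list args;
    va_start(args, fmt);
    vscnprintf(buf, sizeof(buf), fmt, args);   /* format the panic reason */
    va_end(args);
    preempt_disable_notrace();
    pr_emerg("Kernel panic - not syncing: %s\n", buf);
    __crash_kexec(NULL);                       /* jump to the kdump kernel if one is loaded */
    atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
    if (panic_timeout > 0)
        mdelay(panic_timeout * 1000);          /* wait kernel.panic seconds */
    if (panic_timeout != 0)
        emergency_restart();                   /* reboot */
    for (;;)                                   /* kernel.panic == 0: stay halted */
        mdelay(PANIC_TIMER_STEP);
}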

That sequence is why panic behavior matters operationally. A panic is not just a crash. It is a policy surface: crash dump, notifier chain, delay, and reboot behavior all live here.
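
To make that policy surface concrete, here is a minimal sketch that turns two sysctl values into a plain description of what the machine will do. The function name describe_panic_policy is hypothetical; the arguments mirror kernel.panic and kernel.panic_on_oops, and the oops wording is a simplification (an oops in interrupt context panics regardless of the setting).

```python
def describe_panic_policy(panic_timeout, panic_on_oops):
    """Describe post-failure behavior for given kernel.panic / panic_on_oops values."""
    if panic_timeout == 0:
        after_panic = "halt and wait for manual intervention"
    elif panic_timeout > 0:
        after_panic = f"reboot after {panic_timeout} seconds"
    else:
        # Negative values request an immediate reboot.
        after_panic = "reboot immediately"

    if panic_on_oops:
        on_oops = "escalate to panic"
    else:
        on_oops = "kill the offending task and keep running"

    return {"after_panic": after_panic, "on_oops": on_oops}


# On a live system these values would come from
# /proc/sys/kernel/panic and /proc/sys/kernel/panic_on_oops.
print(describe_panic_policy(panic_timeout=30, panic_on_oops=1))
```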

§ 03 Kernel Oops — The Warning Before the Abyss

A kernel oops is a serious kernel fault that does not necessarily stop the machine immediately. Think of it as the kernel saying, “something illegal just happened, but I might limp onward.”

This makes oopses more dangerous in one sense than panics. A panic is loud and final. An oops can leave the system partially corrupted while workloads continue running. That is why many production teams prefer panic_on_oops=1 for critical environments: if the kernel has crossed an integrity boundary, fail loudly rather than continue in an undefined state.

Why oopses matter

An oops often means the kernel recovered control flow, not correctness. The machine may still be alive, but the assumptions you were relying on no longer hold cleanly.

Out-of-tree modules are a common source of oopses, and the kernel helpfully marks this in the taint flags. If you see taint plus an oops plus a custom module in the trace, start there before blaming the scheduler, VM, or filesystem.
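
The taint state is also exposed as a bitmask in /proc/sys/kernel/tainted. The sketch below decodes a subset of those bits; the table is deliberately incomplete, the helper name decode_taint is an assumption for illustration, and the kernel's tainted-kernels documentation remains the authoritative list.

```python
# Subset of taint bits: bit number -> (flag letter, meaning).
TAINT_FLAGS = {
    0: ("P", "proprietary module loaded"),
    1: ("F", "module was force-loaded"),
    7: ("D", "kernel died recently (oops or BUG)"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "out-of-tree module loaded"),
    13: ("E", "unsigned module loaded"),
    14: ("L", "soft lockup occurred"),
}


def decode_taint(value):
    """Return the (letter, meaning) pairs set in a taint bitmask."""
    return [
        flag
        for bit, flag in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


# 4224 = bit 12 (out-of-tree module) + bit 7 (oops or BUG already happened)
for letter, meaning in decode_taint(4224):
    print(letter, meaning)
```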

§ 04 Virtual Memory — The Ground Beneath the OOM Killer

You cannot understand the OOM killer without understanding Linux virtual memory. Linux does not only track “free RAM.” It tracks committed memory, reclaimable memory, anonymous pages, page cache, slab usage, and the promises the system has already made to processes.

The kernel is usually willing to overcommit memory because not every process will touch every page it asked for. That policy is rational most of the time, but it means OOM events often arrive later than application teams expect. The system appears healthy until it suddenly is not.

Simplified memory pressure path
anonymous memory grows
        ↓
page cache reclaim starts
        ↓
slab and file cache pressure rises
        ↓
direct reclaim / compaction
        ↓
allocation still fails
        ↓
OOM killer enters the room

This is why “free memory” in dashboards is often misleading. MemAvailable, reclaim behavior, swap activity, and commit ratios are much better indicators of where the machine really stands.
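
A minimal sketch of how those indicators could be derived from /proc/meminfo-style text. The function names and the sample numbers are invented for illustration. Note that CommitLimit is only enforced under strict overcommit (vm.overcommit_memory = 2), so in the default mode the commit ratio is a trend signal rather than a hard ceiling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_pressure_summary(fields):
    """Compute the two ratios discussed above, as percentages."""
    return {
        "available_pct": 100.0 * fields["MemAvailable"] / fields["MemTotal"],
        "commit_pct": 100.0 * fields["Committed_AS"] / fields["CommitLimit"],
    }


# Made-up sample; on a live system read the text from /proc/meminfo instead.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:    1200000 kB
CommitLimit:    10000000 kB
Committed_AS:    9000000 kB
"""

print(memory_pressure_summary(parse_meminfo(SAMPLE)))
```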

§ 05 OOM Killer — Controlled Violence Under Memory Pressure

The OOM killer exists because sometimes the system simply cannot satisfy a memory allocation and no amount of reclaim will save it. At that moment, Linux makes a hard choice: kill something so the machine as a whole can survive.

How the kernel chooses a victim

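The kernel computes a badness score for each candidate from its memory footprint (resident pages, swap entries, and page tables), then shifts that score by oom_score_adj, which ranges from -1000 to 1000. Candidates are limited to the allocation domain that ran out, such as a memory cgroup or a NUMA policy. Large, memory-hungry processes are natural targets. Important daemons can protect themselves with a lower or negative oom_score_adj, and a value of -1000 exempts a process entirely.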
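
The scoring can be approximated in a few lines. This is a simplified model of the heuristic in recent kernels, not the kernel's code: it considers only the memory footprint and oom_score_adj, ignores cgroup and NUMA constraints, and the function name and example numbers are assumptions.

```python
OOM_SCORE_ADJ_MIN = -1000  # this value exempts a task from OOM killing


def oom_badness(rss_pages, swap_pages, pgtable_pages, oom_score_adj, total_pages):
    """Approximate badness in pages; None means the task is never selected."""
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return None
    points = rss_pages + swap_pages + pgtable_pages
    # Each adjustment unit is worth one thousandth of the memory in the domain.
    points += oom_score_adj * (total_pages // 1000)
    return points


TOTAL_PAGES = 4_000_000  # roughly 16 GB with 4 KiB pages (illustrative)

worker = oom_badness(3_000_000, 0, 8_000, 0, TOTAL_PAGES)
database = oom_badness(3_500_000, 0, 9_000, -500, TOTAL_PAGES)

print("worker:", worker)
print("database:", database)
```

In this made-up case the database is the larger process, yet its oom_score_adj of -500 drops its score below the worker's, so the worker would be chosen first.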

typical OOM output
Out of memory: Killed process 18324 (python) total-vm:24567892kB,
anon-rss:17345012kB, file-rss:420kB, shmem-rss:0kB, UID:1000 pgtables:33124kB
oom_score_adj:0

That line is pure production gold. It tells you which task died, how much virtual memory it had mapped, how much anonymous memory it actually held resident, and whether it had any special OOM weighting.
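
For alerting or post-mortems, that line is easy to turn into structured data. The sketch below is one way to do it; the regular expression is written against the sample above, the helper name parse_oom_kill is hypothetical, and field order may differ on other kernel versions.

```python
import re

OOM_KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\)"
    r".*?total-vm:(?P<total_vm>\d+)kB"
    r".*?anon-rss:(?P<anon_rss>\d+)kB"
    r".*?oom_score_adj:(?P<adj>-?\d+)",
    re.DOTALL,  # tolerate the message being wrapped across lines
)


def parse_oom_kill(text):
    """Return the key fields of an OOM kill message, or None if absent."""
    match = OOM_KILL_RE.search(text)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "comm": match.group("comm"),
        "total_vm_kb": int(match.group("total_vm")),
        "anon_rss_kb": int(match.group("anon_rss")),
        "oom_score_adj": int(match.group("adj")),
    }


SAMPLE = """\
Out of memory: Killed process 18324 (python) total-vm:24567892kB,
anon-rss:17345012kB, file-rss:420kB, shmem-rss:0kB, UID:1000 pgtables:33124kB
oom_score_adj:0
"""

victim = parse_oom_kill(SAMPLE)
print(victim["comm"], round(victim["anon_rss_kb"] / 1024 / 1024, 1), "GiB anonymous RSS")
```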

Practical ops tip

If the wrong process keeps dying, do not just “add RAM” blindly. Inspect oom_score_adj, memory limits, reclaim behavior, and whether one runaway worker is forcing the kernel into terrible choices.

§ 06 Soft Lockups, Hard Lockups, Hung Tasks

Not every kernel crisis is a panic or OOM. Sometimes the machine is alive but one CPU is spinning forever, a task is stuck in uninterruptible sleep, or RCU grace periods never complete. These are the lockup family of failures.

Soft lockups

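A soft lockup means a CPU stayed in kernel context too long without giving the scheduler a chance to run anything else. The watchdog reports it once the stall exceeds twice kernel.watchdog_thresh, which is 20 seconds with the default threshold of 10. Common causes include long spinlock hold times, busy-wait loops, interrupt storms, or badly behaved driver code.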

dmesg — soft lockup
BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:1938]
RIP: 0010:my_driver_poll+0x91/0x150 [my_driver]
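
A small sketch for extracting the useful fields from that report. The pattern matches the sample line above, the helper name is an assumption, and the threshold arithmetic relies on the documented rule that the soft lockup threshold is twice kernel.watchdog_thresh.

```python
import re

SOFT_LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line, watchdog_thresh=10):
    """Return CPU, duration, and task for a soft lockup line, or None."""
    match = SOFT_LOCKUP_RE.search(line)
    if match is None:
        return None
    secs = int(match.group("secs"))
    return {
        "cpu": int(match.group("cpu")),
        "stuck_secs": secs,
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        # Seconds beyond the reporting threshold of 2 * watchdog_thresh.
        "over_threshold_secs": secs - 2 * watchdog_thresh,
    }


SAMPLE = "BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:1938]"
print(parse_soft_lockup(SAMPLE))
```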

Hard lockups

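A hard lockup is worse: the CPU is not even servicing ordinary interrupts such as the timer tick. The NMI watchdog still fires, because NMIs cannot be masked, and reports the lockup when it sees that the CPU's timer interrupt count has stopped advancing. This usually means interrupts are disabled and the core is wedged in a much deeper way.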

Hung tasks

Hung tasks are usually D-state processes waiting on I/O or kernel locks for too long. They are often symptoms of storage stalls, dead devices, network filesystem pain, or lock chains that never resolve.

Why D-state scares operators

A D-state task usually cannot be fixed with SIGKILL. Unless the kernel code chose a killable wait (TASK_KILLABLE), the signal is only acted on once the wait completes, and the whole point is that the wait may not be completing.
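
Counting D-state tasks is a cheap early-warning signal. The sketch below reads the state field from /proc/<pid>/stat-style lines; the helper names and the truncated sample lines are made up for illustration, and on a live system the lines would come from each /proc/<pid>/stat file.

```python
def task_state(stat_line):
    """Return (comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return comm, state


def count_d_state(stat_lines):
    """Count tasks currently in uninterruptible sleep."""
    return sum(1 for line in stat_lines if task_state(line)[1] == "D")


# Truncated, invented samples in the /proc/<pid>/stat layout.
SAMPLES = [
    "1938 (kworker/2:1) I 2 0 0 0 -1 69238880 0",
    "4012 (nfs client) D 1 4012 4012 0 -1 4194560 120",
    "4100 (dd) D 3999 4100 3999 34816 4100 4194304 98",
]

print(count_d_state(SAMPLES), "task(s) in D state")
```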

§ 07 Debugging Toolkit — From Crash Site to Root Cause

Magic SysRq

SysRq is the emergency hatch when the system is half-dead but still has enough life to answer the kernel directly. It is one of the best tools for extracting forensic signal before a forced reboot.

sysrq via /proc/sysrq-trigger
echo t > /proc/sysrq-trigger   # dump task states
echo m > /proc/sysrq-trigger   # dump memory info
echo w > /proc/sysrq-trigger   # show blocked tasks
echo s > /proc/sysrq-trigger   # sync filesystems
echo u > /proc/sysrq-trigger   # remount read-only

KASAN, KMEMLEAK, perf, ftrace

When debugging kernel code, sanitizers and tracing tools are how you turn folklore into evidence. KASAN catches use-after-free and bounds bugs. KMEMLEAK helps with leaks. perf and ftrace help explain recurring soft lockups and CPU stalls.

If the machine is so gone that SSH, SysRq, and local consoles are worthless, a serial console or crash dump path becomes the only truthful witness left. That is why boring setup like kdump and serial console wiring pays off disproportionately when something genuinely bad happens.

§ 08 Production Hardening — A Practical Checklist

Armed with the failure taxonomy, the next step is hardening. Good production posture is not about preventing every failure forever. It is about choosing sane policies, preserving evidence, and making the blast radius smaller.

/etc/sysctl.d/99-production-hardening.conf
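# Reboot 30 seconds after a panic, and treat any oops as a panic
kernel.panic = 30
kernel.panic_on_oops = 1
# Report tasks blocked in D state for more than 120 seconds
kernel.hung_task_timeout_secs = 120
# Log soft lockups without panicking; a 10 s threshold reports them at 20 s
kernel.softlockup_panic = 0
kernel.watchdog_thresh = 10
# Let the OOM killer act instead of panicking
vm.panic_on_oom = 0
# Swap anonymous memory less eagerly; retain dentry and inode caches longer
vm.swappiness = 10
vm.vfs_cache_pressure = 50
# Keep the NMI watchdog on for hard lockup detection
kernel.nmi_watchdog = 1
# SysRq keyboard mask: sync (16) + remount read-only (32) + reboot/poweroff (128)
kernel.sysrq = 176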
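
The kernel.sysrq value is a bitmask, which is easy to misread. The sketch below decodes it using the bit meanings from the kernel's SysRq documentation; the helper name is an assumption. The mask only governs keyboard-triggered SysRq, while writes to /proc/sysrq-trigger by root are always allowed.

```python
# Bit value -> function group, per the kernel SysRq documentation.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(mask):
    """List the SysRq function groups enabled by a kernel.sysrq value."""
    if mask == 0:
        return ["SysRq disabled"]
    if mask == 1:
        return ["all SysRq functions enabled"]
    return [name for bit, name in sorted(SYSRQ_BITS.items()) if mask & bit]


for group in decode_sysrq(176):
    print(group)
```

With 176, the keyboard can sync, remount read-only, and reboot, but the process and memory dumps (t, m, w) are only reachable through /proc/sysrq-trigger.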

What to monitor

Metric                       | Why it matters                            | Alert idea
Committed_AS vs CommitLimit  | Approaching commit exhaustion             | >80%
Swap in/out                  | Active paging means degraded performance  | >0 sustained
MemAvailable                 | Better than raw free memory               | <10%
dmesg for OOM / BUG / Oops   | Direct kernel distress signal             | Any occurrence
D-state process count        | Storage or lock starvation                | >5 sustained
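
For the dmesg row, a minimal classifier sketch follows. The labels, pattern list, and function name are assumptions for illustration; the input lines could come from dmesg or journalctl -k, and a real deployment would likely hand this to an existing log pipeline.

```python
import re

# Order matters: more specific patterns come before the generic "BUG:" one.
DISTRESS_PATTERNS = [
    ("oom_kill", re.compile(r"Out of memory: Killed process")),
    ("soft_lockup", re.compile(r"BUG: soft lockup")),
    ("hung_task", re.compile(r"blocked for more than \d+ seconds")),
    ("panic", re.compile(r"Kernel panic - not syncing")),
    ("oops", re.compile(r"\bOops: ")),
    ("bug", re.compile(r"\bBUG: ")),
]


def classify(line):
    """Return a distress label for a kernel log line, or None if it looks benign."""
    for label, pattern in DISTRESS_PATTERNS:
        if pattern.search(line):
            return label
    return None


SAMPLES = [
    "BUG: unable to handle kernel NULL pointer dereference at 0000000000000008",
    "Oops: 0000 [#1] SMP PTI",
    "BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:1938]",
    "Out of memory: Killed process 18324 (python) total-vm:24567892kB,",
    "Kernel panic - not syncing: Fatal exception in interrupt",
    "usb 1-1: new high-speed USB device number 4 using xhci_hcd",
]

for line in SAMPLES:
    print(classify(line), "<-", line)
```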

Closing thought

Kernel panics, OOM kills, and lockups are not shameful events. They are the kernel being maximally honest. A machine that crashes loudly with a vmcore is often more trustworthy than one that stays up while silently corrupting state.

The Linux kernel's failure philosophy is brutally useful: be explicit, be loud, preserve forensic signal. Learn to read that signal and these scary-looking messages become some of the most valuable debugging output in your stack.