Why am I trying to write an OS again?

So I'm back on my OS development kick. I came across The Little Book about OS Development, and given that understanding the futility of a task isn't enough to stop me from attempting it, here we are. :)

Instead of continuing down the monolithic kernel path (which is the right way to write a kernel), I'm exploring a microkernel design. The separation of concerns seems more manageable for someone who tends to work in concentrated bursts of motivation.

The microkernel approach brings some clear benefits for me:

  • only the absolute essentials run in ring 0
  • device drivers and other services run in userspace (ring 3)
  • much more fault-tolerant: a crashing driver takes down one process, not the whole kernel

The microkernel comes with a performance cost: every request to a driver or service becomes a message between address spaces, with the context switches that implies. That's why mainstream operating systems use monolithic or hybrid kernels. But I'm not writing a mainstream kernel. Realistically, I'm probably not even writing a kernel.

I've been digging into Mach and L4 implementations for inspiration. The message-passing paradigm they use for inter-process communication is elegant in theory, but implementing it efficiently seems like it would be a bit out of my wheelhouse if we're being honest.
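
To make the message-passing idea concrete, here is a minimal sketch of a fixed-size message and a mailbox built on a ring buffer. This is not how Mach or L4 actually do it; the names `ipc_msg`, `mailbox`, `mbox_send`, and `mbox_recv`, along with both sizes, are made up for illustration, and there is no locking or blocking.

```c
#include <stdint.h>

#define MSG_PAYLOAD_BYTES 32  /* assumed size, small enough to copy cheaply */
#define MBOX_SLOTS 8          /* assumed capacity; must be a power of two */

struct ipc_msg {
    uint32_t sender;                     /* hypothetical task ID of the sender */
    uint32_t type;                       /* application-defined message type */
    uint8_t payload[MSG_PAYLOAD_BYTES];
};

struct mailbox {
    struct ipc_msg slots[MBOX_SLOTS];
    uint32_t head;  /* count of messages read so far */
    uint32_t tail;  /* count of messages written so far */
};

/* Copy a message into the mailbox. Returns 0 on success, -1 if full. */
int mbox_send(struct mailbox *mb, const struct ipc_msg *msg) {
    if (mb->tail - mb->head == MBOX_SLOTS) return -1;
    mb->slots[mb->tail % MBOX_SLOTS] = *msg;
    mb->tail++;
    return 0;
}

/* Copy the oldest message out. Returns 0 on success, -1 if empty. */
int mbox_recv(struct mailbox *mb, struct ipc_msg *out) {
    if (mb->head == mb->tail) return -1;
    *out = mb->slots[mb->head % MBOX_SLOTS];
    mb->head++;
    return 0;
}
```

Every send and receive here is a full copy of the message, and in a real kernel each one would also sit behind a trap and a context switch. That is the overhead the efficient implementations work so hard to shave down.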

This time, I'm self-aware enough to implement a decent kernel panic early on, knowing that it's going to get called - uh - a lot. It's obviously a small function, but having this output makes troubleshooting a hell of a lot easier (the version below writes to the VGA text buffer; the serial port is the other obvious target), and running it in QEMU makes getting that output possible without actual hardware.

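/* Minimal panic: print to the VGA text buffer (top-left), then halt forever. */
void panic(const char *msg) {
    volatile char *vga = (volatile char *)0xB8000;  /* VGA text-mode buffer */
    const char *p = "KERNEL PANIC: ";
    int i = 0;

    /* Each cell is two bytes: character, then attribute (0x4F = white on red). */
    while (*p) { vga[i * 2] = *p++; vga[i * 2 + 1] = 0x4F; i++; }

    while (*msg) { vga[i * 2] = *msg++; vga[i * 2 + 1] = 0x4F; i++; }

    /* Disable interrupts and halt; loop in case an NMI wakes the CPU. */
    while (1) { __asm__ volatile("cli; hlt"); }
}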
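
For output that lands in the terminal QEMU was started from, the first serial port is the usual route. Here is a rough sketch, assuming COM1 at its conventional I/O base of 0x3F8 and a UART that is already configured. The `port_io` struct and the `serial_putc`/`serial_puts` names are hypothetical; the port access sits behind function pointers only so the logic can be tried outside a kernel, where real code would wrap the `inb` and `outb` instructions. The `fake_*` helpers are stand-ins for that off-target case.

```c
#include <stdint.h>

#define COM1_PORT    0x3F8  /* conventional I/O base of the first serial port */
#define LSR_OFFSET   5      /* line status register */
#define LSR_TX_EMPTY 0x20   /* bit 5: transmit holding register is empty */

/* Hypothetical indirection so the same code runs in a kernel or on a host. */
struct port_io {
    uint8_t (*in)(uint16_t port);
    void (*out)(uint16_t port, uint8_t value);
};

/* Wait until the UART can accept a byte, then send it. */
void serial_putc(const struct port_io *io, char c) {
    while ((io->in(COM1_PORT + LSR_OFFSET) & LSR_TX_EMPTY) == 0) { }
    io->out(COM1_PORT, (uint8_t)c);
}

void serial_puts(const struct port_io *io, const char *s) {
    while (*s) serial_putc(io, *s++);
}

/* Off-target stand-ins for inb/outb: always ready, record bytes in a buffer. */
static char fake_log[64];
static int fake_len;
static uint8_t fake_in(uint16_t port) { (void)port; return LSR_TX_EMPTY; }
static void fake_out(uint16_t port, uint8_t value) {
    if (port == COM1_PORT && fake_len < 63) fake_log[fake_len++] = (char)value;
}
```

Starting QEMU with `-serial stdio` sends the guest's serial output to the host terminal, which is a lot easier to copy and paste from than a VGA window.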

My current progress: I'm booting with GRUB until I write my own bootloader, and I'm setting up the GDT (Global Descriptor Table) for memory segmentation. Here's where I'm fighting with the segment descriptors:

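#include <stdint.h>

// Setting up the GDT entries for a flat memory model
struct gdt_entry {
    uint16_t limit_low;    // limit bits 0-15
    uint16_t base_low;     // base bits 0-15
    uint8_t base_middle;   // base bits 16-23
    uint8_t access;        // present bit, privilege level, segment type
    uint8_t granularity;   // low nibble: limit bits 16-19; high nibble: flags
    uint8_t base_high;     // base bits 24-31
} __attribute__((packed));

// Table size is a placeholder: null, kernel code/data, user code/data.
static struct gdt_entry gdt_entries[5];

void gdt_set_gate(int num, uint32_t base, uint32_t limit, uint8_t access, uint8_t gran) {
    gdt_entries[num].base_low = (base & 0xFFFF);
    gdt_entries[num].base_middle = (base >> 16) & 0xFF;
    gdt_entries[num].base_high = (base >> 24) & 0xFF;

    gdt_entries[num].limit_low = (limit & 0xFFFF);
    gdt_entries[num].granularity = (limit >> 16) & 0x0F;

    // The flags live in the high nibble, next to the top bits of the limit.
    gdt_entries[num].granularity |= gran & 0xF0;
    gdt_entries[num].access = access;
}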

The tricky part here is getting the access flags right. Get the combination wrong and you end up with either no protection at all or a CPU that refuses to execute your code. Last time around I spent three days debugging an issue that turned out to be a single wrong bit in the access byte.
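
One way to make those bits harder to get wrong is to name them. This is a sketch of the access byte for ordinary code and data segments; the macro names and the `gdt_access` helper are my own invention, not something from the book.

```c
#include <stdint.h>

/* Access-byte bits for code and data segment descriptors. */
#define GDT_PRESENT    0x80                   /* bit 7: segment is present */
#define GDT_DPL(ring)  (((ring) & 0x3) << 5)  /* bits 5-6: privilege level */
#define GDT_CODE_DATA  0x10                   /* bit 4: code/data, not a system segment */
#define GDT_EXECUTABLE 0x08                   /* bit 3: code if set, data if clear */
#define GDT_RW         0x02                   /* bit 1: readable code / writable data */

/* Build the access byte for a flat code or data segment at a given ring. */
static inline uint8_t gdt_access(int ring, int executable) {
    uint8_t access = GDT_PRESENT | GDT_DPL(ring) | GDT_CODE_DATA | GDT_RW;
    if (executable) access |= GDT_EXECUTABLE;
    return access;
}
```

With that, a flat ring 0 code segment would look something like `gdt_set_gate(1, 0, 0xFFFFFFFF, gdt_access(0, 1), 0xCF)`, where slot 1 is an assumption and 0xCF selects 4 KiB granularity and 32-bit mode in the high nibble.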

The next challenge is setting up paging, which is where things always seem to fall apart. Modern memory management with demand paging is surprisingly complex when you have to implement it from scratch. The book covers it in some detail and I've gotten a lot better at writing functional C code over the past few years. I think I'll be able to pull it off this time.
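
The part of paging that fits in a few lines is how an address gets carved up. This sketch assumes plain 32-bit x86 paging with 4 KiB pages (no PAE): the top 10 bits pick a page directory entry, the next 10 pick a page table entry, and the low 12 are the offset into the page. The helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_PRESENT 0x1u  /* bit 0 of a directory or table entry */
#define PAGE_WRITE   0x2u  /* bit 1: page is writable */

/* Split a 32-bit virtual address into its three parts. */
static inline uint32_t pd_index(uint32_t vaddr)    { return vaddr >> 22; }
static inline uint32_t pt_index(uint32_t vaddr)    { return (vaddr >> 12) & 0x3FF; }
static inline uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xFFF; }

/* Build an entry pointing at a 4 KiB-aligned physical frame. */
static inline uint32_t make_pte(uint32_t frame_addr, uint32_t flags) {
    return (frame_addr & 0xFFFFF000u) | (flags & 0xFFFu);
}
```

For example, the address 0xC0101234 lands in directory entry 768, table entry 257, at offset 0x234. The demand paging part, where the page fault handler fills in missing entries, is where the real work is.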

Why am I doing this? Partly because I'm fascinated by the lowest levels of system architecture, partly because there's something deeply satisfying about creating the fundamental software layer that everything else runs on top of, and partly, obviously, for the bragging rights. Even when this project eventually joins my graveyard of unfinished OS attempts, it will have taught me something new about the intricate dance between hardware and software, just like each iteration before it.

Next up: interrupts and a proper IDT (Interrupt Descriptor Table) implementation. Although I've been eyeing that Ben Eater breadboard computer series as well - building a physical CPU from scratch might be my next rabbit hole.
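
For reference, the IDT side will probably look a lot like the GDT code. This is a sketch of a 32-bit protected-mode gate descriptor, not something I've wired up yet; `idt_entries` and `idt_set_gate` are placeholder names, and it assumes the kernel code segment sits at selector 0x08.

```c
#include <stdint.h>

struct idt_entry {
    uint16_t offset_low;   /* handler address, bits 0-15 */
    uint16_t selector;     /* code segment selector from the GDT */
    uint8_t zero;          /* reserved, always 0 */
    uint8_t type_attr;     /* present bit, privilege level, gate type */
    uint16_t offset_high;  /* handler address, bits 16-31 */
} __attribute__((packed));

#define IDT_INTERRUPT_GATE 0x8E  /* present, ring 0, 32-bit interrupt gate */

static struct idt_entry idt_entries[256];

/* Fill in one gate; the handler address gets split across two fields. */
void idt_set_gate(uint8_t num, uint32_t handler, uint16_t selector, uint8_t type_attr) {
    idt_entries[num].offset_low = handler & 0xFFFF;
    idt_entries[num].offset_high = (handler >> 16) & 0xFFFF;
    idt_entries[num].selector = selector;
    idt_entries[num].zero = 0;
    idt_entries[num].type_attr = type_attr;
}
```

The table still has to be loaded with `lidt`, and each handler needs an assembly stub that saves registers and returns with `iret`, which is where I expect the panic function to earn its keep.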
