Bootloaders from Scratch: Implementing Fail-Safe OTA and A/B Switching
Author: Srihari Maddula
Reading Time: 25 mins
Topic: Flash Management & Firmware Reliability

A broken device is a design failure, not an accident. Photo via Unsplash.
Imagine you’ve just shipped 1,000 smart sensors to a client in another country. A week later, you find a small bug in the WiFi stack. No problem, you think—I’ll just push an Over-The-Air (OTA) update. You click "Send," the progress bar hits 50%, and then... nothing. The device disconnects. It never comes back online.
A single bit error or a power failure during that update has "Bricked" the device. It is now a $200 paperweight. To fix it, you have to fly an engineer to the site or pay for 1,000 returns.
But here is the industry reality: Bricking a device is a design failure, not a hardware accident. Professional embedded systems use custom Bootloaders and A/B Partitioning to ensure that a device can always recover, even if the power is cut in the middle of a flash write.
Senior Secret: If you aren't writing your own bootloader, you aren't building a product; you're building a liability. The bootloader is the one piece of code you cannot fix with an OTA update, so it has to be as close to bug-free as you can make it.
1. Technical Pillar 1: The Boot Process (The Vector Table Offset)
What actually happens when you press the Reset button? On an ARM Cortex-M, the CPU reads the first two words of the vector table (by default at address 0x00000000, which many parts alias to the start of Flash) to find the Initial Stack Pointer and the Reset Handler. A bootloader is a small, permanent piece of code that lives at the very beginning of Flash, so it is the first thing the CPU runs.
The Bootloader's Job
The Switch: Instead of jumping to your app, the CPU starts the Bootloader.
The Jump: Once the bootloader verifies the application is valid, it points the Vector Table Offset Register (VTOR) at the application's vector table, loads the application's Initial Stack Pointer, and then "Jumps" to the application's Reset Handler.
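Here is a rough sketch of that handover in C. The addresses and sizes (APP_BASE, APP_SIZE, RAM_BASE, RAM_SIZE) are made-up values for illustration, so take the real ones from your part's memory map. The first function is a cheap plausibility check on the first two words of the application's vector table. The Cortex-M-only part shows one common way to set VTOR and branch, assuming GCC-style inline assembly, a core that implements VTOR, and a bootloader that has already shut down every interrupt source it enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory map: replace with values from your datasheet. */
#define APP_BASE 0x08010000u /* start of the application slot */
#define APP_SIZE 0x00030000u /* 192 KB slot */
#define RAM_BASE 0x20000000u
#define RAM_SIZE 0x00020000u /* 128 KB SRAM */

/* First two words of a Cortex-M vector table. */
typedef struct {
    uint32_t initial_sp;    /* loaded into MSP at reset */
    uint32_t reset_handler; /* entry point, bit 0 set for Thumb state */
} vector_head_t;

/* Cheap sanity check that catches an erased or obviously corrupt slot. */
bool vector_head_plausible(const vector_head_t *v)
{
    assert(v != NULL);
    bool sp_in_ram  = v->initial_sp > RAM_BASE &&
                      v->initial_sp <= RAM_BASE + RAM_SIZE;
    bool sp_aligned = (v->initial_sp & 0x3u) == 0u;
    bool is_thumb   = (v->reset_handler & 1u) != 0u;
    uint32_t pc     = v->reset_handler & ~1u;
    bool pc_in_slot = pc >= APP_BASE && pc < APP_BASE + APP_SIZE;
    return sp_in_ram && sp_aligned && is_thumb && pc_in_slot;
}

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* System Control Block VTOR register on Cortex-M cores that implement it. */
#define SCB_VTOR (*(volatile uint32_t *)0xE000ED08u)

/* Hands control to the application; returns only if the slot looks invalid. */
void jump_to_app(void)
{
    const vector_head_t *v = (const vector_head_t *)APP_BASE;
    if (!vector_head_plausible(v)) {
        return;
    }
    uint32_t sp = v->initial_sp;
    uint32_t pc = v->reset_handler;
    SCB_VTOR = APP_BASE;              /* application's vector table */
    __asm volatile ("dsb\n\tisb");    /* make the new VTOR take effect */
    /* Set the stack pointer and branch in one statement so the compiler
       cannot touch the stack between the two instructions. */
    __asm volatile ("msr msp, %0\n\tbx %1" : : "r"(sp), "r"(pc) : "memory");
}
#endif
```

A check like this only filters out empty or badly damaged slots. It does not replace the checksum and signature verification described later.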
Production Rule: The Bootloader must be "Immutable." Once it is programmed at the factory, it should never be updated OTA. This is your "Golden Recovery" code that survives even the worst application crash.
2. Technical Pillar 2: A/B Switching (The Dual-Slot Strategy)
How do you survive a power failure during an update? You never overwrite the code that is currently running. Professional designs split the Flash memory into two equal slots: Slot A and Slot B.
| Step | Action | Status |
| --- | --- | --- |
| 1. Initial | Slot A Running | Stable |
| 2. Update | Download to Slot B | A is still safe |
| 3. Verify | Check Signature on B | Validating |
| 4. Swap | Bootloader activates Slot B | Transitioning |
| 5. Rollback | If B crashes, revert to A | Auto-Recovery |
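Architecture Logic: This is the same architecture used in Android phones and Tesla vehicles. By decoupling the "Downloading" from the "Executing," it removes the most common cause of "Bricking": a failed download or an interrupted write can only damage the inactive slot, never the image the device is currently running.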
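The swap and rollback steps boil down to a small state machine that the bootloader runs on every reset. The following is a minimal sketch, not a reference implementation: the names boot_state_t, boot_select, boot_confirm and the limit MAX_BOOT_ATTEMPTS are invented for illustration, and the state lives in a plain struct here, whereas a real bootloader would persist it in Flash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u /* assumption: tune per product */

typedef enum { SLOT_A = 0, SLOT_B = 1 } slot_t;

typedef struct {
    slot_t  active;   /* last slot known to work */
    bool    pending;  /* a verified image is waiting in the other slot */
    uint8_t attempts; /* trial boots of the pending image so far */
} boot_state_t;

static slot_t other_slot(slot_t s)
{
    return s == SLOT_A ? SLOT_B : SLOT_A;
}

/* Bootloader side: called on every reset, returns the slot to boot. */
slot_t boot_select(boot_state_t *st)
{
    assert(st != NULL);
    if (!st->pending) {
        return st->active;            /* normal boot */
    }
    if (st->attempts >= MAX_BOOT_ATTEMPTS) {
        st->pending  = false;         /* new image never checked in */
        st->attempts = 0;
        return st->active;            /* roll back to the known-good slot */
    }
    st->attempts++;                   /* trial boot of the new image */
    return other_slot(st->active);
}

/* Application side: the "check-in" once the new firmware is healthy. */
void boot_confirm(boot_state_t *st)
{
    assert(st != NULL);
    if (st->pending && st->attempts > 0u) {
        st->active   = other_slot(st->active);
        st->pending  = false;
        st->attempts = 0;
    }
}
```

If the new image crashes or hangs before calling boot_confirm, a watchdog reset brings the device back into boot_select, and after MAX_BOOT_ATTEMPTS failed trials the old slot is selected again.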
3. Technical Pillar 3: Flash Management & Atomicity
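Flash memory cannot be written to like RAM. On typical NOR Flash a write can only clear bits from 1 to 0, so you must erase a whole "Sector" (often 4 KB to 128 KB) before you can rewrite a single byte in it. If the power fails while you are erasing or writing, the contents of that sector are undefined. We use Atomic Updates via a "Swap Buffer" and a small metadata record so that, whenever the power is lost, the bootloader still finds one complete and valid image.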
// The Atomic Swap Sequence
1. Write new data to Slot B
2. Verify Checksum of B
3. Update 'Slot_B_Ready' flag in Meta-Data
4. Reboot & Jump to B
5. If Application doesn't 'Check-In', Revert to A
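Step 3 is the one that has to be atomic. One common way to get there, sketched below under simplifying assumptions, is to keep two copies of the metadata record in separate sectors, protect each with a CRC and a sequence number, and always overwrite the older copy. A write that is cut short leaves a record with a bad CRC, so the reader falls back to the previous one. The Flash is simulated with a RAM array, and meta_t, meta_commit, and the sim_ helpers are hypothetical names; the bytes_written parameter exists only to simulate a power cut.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t seq;         /* increases with every commit */
    uint32_t active_slot; /* 0 = Slot A, 1 = Slot B */
    uint32_t crc;         /* CRC-32 over the two fields above */
} meta_t;

/* Two metadata copies, each standing in for its own erase sector. */
meta_t meta_flash[2];

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t meta_crc32(const uint8_t *p, size_t n)
{
    uint32_t c = 0xFFFFFFFFu;
    while (n--) {
        c ^= *p++;
        for (int k = 0; k < 8; k++) {
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        }
    }
    return ~c;
}

bool meta_valid(const meta_t *m)
{
    return m->crc == meta_crc32((const uint8_t *)m, offsetof(meta_t, crc));
}

/* NOR-style primitives: erase sets all bits, program can only clear them. */
void sim_erase(int copy)
{
    memset(&meta_flash[copy], 0xFF, sizeof(meta_t));
}

void sim_program(int copy, const meta_t *src, size_t nbytes)
{
    uint8_t *dst = (uint8_t *)&meta_flash[copy];
    const uint8_t *s = (const uint8_t *)src;
    for (size_t i = 0; i < nbytes && i < sizeof(meta_t); i++) {
        dst[i] &= s[i];
    }
}

/* Index of the newest valid copy, or -1 if neither copy is valid. */
int meta_newest(void)
{
    int best = -1;
    for (int i = 0; i < 2; i++) {
        if (meta_valid(&meta_flash[i]) &&
            (best < 0 || meta_flash[i].seq > meta_flash[best].seq)) {
            best = i;
        }
    }
    return best;
}

/* Commit a new active slot. Passing bytes_written < sizeof(meta_t)
   simulates losing power partway through the write. */
void meta_commit(uint32_t slot, size_t bytes_written)
{
    int cur = meta_newest();
    int target = (cur == 0) ? 1 : 0; /* never touch the newest valid copy */
    meta_t m = { .seq = (cur < 0) ? 1u : meta_flash[cur].seq + 1u,
                 .active_slot = slot, .crc = 0u };
    m.crc = meta_crc32((const uint8_t *)&m, offsetof(meta_t, crc));
    sim_erase(target);
    sim_program(target, &m, bytes_written);
}
```

The important property is that the record the bootloader currently trusts is never erased or modified during a commit, so an interrupted update leaves the device booting exactly what it booted before.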
Memory safety begins with the layout of the flash sectors. Photo via Unsplash.
4. Technical Pillar 4: Security (The Signed Binary)
An OTA update is the ultimate backdoor. If an attacker can push an update to your device, they own your network. Professional firmware must use Image Signing.
Your build server signs the firmware with a Private Key. The Public Key is burned into the bootloader's code. Before it activates a new image, the bootloader uses that key to verify the firmware's signature. If even a single bit has changed, verification fails and the device rejects the update.
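The order of the checks matters as much as the algorithm: bound the length before reading anything, and trust no field until the signature has been verified. The sketch below shows only that flow. The image layout (img_header_t, IMG_MAGIC, a 64-byte signature appended after the payload) is an assumption for illustration, and the actual signature check is passed in as a function pointer that a real crypto library would provide. demo_verify is a stand-in so the example can run; it is NOT cryptography and must never ship.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMG_MAGIC 0x474D4946u /* hypothetical header magic */
#define SIG_LEN   64u         /* size of an Ed25519 or raw ECDSA P-256 signature */

/* Assumed layout: [header][payload][signature over header + payload]. */
typedef struct {
    uint32_t magic;
    uint32_t payload_len; /* firmware bytes that follow the header */
} img_header_t;

/* Signature primitive, to be supplied by a real crypto library. */
typedef bool (*sig_verify_fn)(const uint8_t *pubkey, const uint8_t *msg,
                              size_t msg_len, const uint8_t *sig);

bool image_is_authentic(const uint8_t *slot, size_t slot_size,
                        const uint8_t *pubkey, sig_verify_fn verify)
{
    img_header_t h;
    if (slot == NULL || pubkey == NULL || verify == NULL) {
        return false;
    }
    if (slot_size < sizeof h + SIG_LEN) {
        return false;                        /* too small to hold an image */
    }
    memcpy(&h, slot, sizeof h);              /* avoids unaligned access */
    if (h.magic != IMG_MAGIC) {
        return false;                        /* erased or foreign data */
    }
    if (h.payload_len > slot_size - sizeof h - SIG_LEN) {
        return false;                        /* length would run past the slot */
    }
    size_t signed_len = sizeof h + h.payload_len; /* header is signed too */
    return verify(pubkey, slot, signed_len, slot + signed_len);
}

/* DEMO STAND-IN ONLY, NOT CRYPTOGRAPHY: accepts when the first signature
   byte equals a byte-sum of the message plus pubkey[0]. */
bool demo_verify(const uint8_t *pubkey, const uint8_t *msg,
                 size_t msg_len, const uint8_t *sig)
{
    uint8_t sum = pubkey[0];
    for (size_t i = 0; i < msg_len; i++) {
        sum = (uint8_t)(sum + msg[i]);
    }
    return sig[0] == sum;
}
```

Signing the header together with the payload means an attacker cannot alter the length field to make the bootloader verify one region and execute another.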
Summary: The Bootloader Roadmap
Don't Use Defaults: Build a custom bootloader that fits your specific memory map and safety requirements.
Implement A/B Slots: If you have the Flash space, dual slots are the most dependable way to keep every device recoverable in the field.
Use a Hardware Watchdog: Ensure the bootloader can detect a "Bad Boot" and roll back automatically.
Sign Your Images: Cryptographic verification is a mandatory foundation for any connected product.
Engineering at EurthTech
At EurthTech, we build for the "Unreliable Reality." We understand that power fails, WiFi drops, and hackers are watching. Our bootloader architectures ensure that no matter what happens in the field, your product remains a functional, secure asset—not a brick.
