
Bootloaders from Scratch: Implementing Fail-Safe OTA and A/B Switching


Author: Srihari Maddula

Reading Time: 25 mins

Topic: Flash Management & Firmware Reliability

A broken device is a design failure, not an accident. Photo via Unsplash.

Imagine you’ve just shipped 1,000 smart sensors to a client in another country. A week later, you find a small bug in the WiFi stack. No problem, you think—I’ll just push an Over-The-Air (OTA) update. You click "Send," the progress bar hits 50%, and then... nothing. The device disconnects. It never comes back online.

A single bit error or a power failure during that update has bricked the device. It is now a $200 paperweight. To fix it, you have to fly an engineer to the site or pay for 1,000 returns.

But here is the industry reality: Bricking a device is a design failure, not a hardware accident. Professional embedded systems use custom Bootloaders and A/B Partitioning to ensure that a device can always recover, even if the power is cut in the middle of a flash write.

Senior Secret: If you aren't writing your own bootloader, you aren't building a product; you're building a liability. The bootloader is the only piece of code that must be 100% bug-free.

1. Technical Pillar 1: The Boot Process (The Vector Table Offset)

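What actually happens when you press the Reset button? On an ARM Cortex-M, the CPU reads the vector table at a fixed address (usually 0x00000000, which many parts alias to the start of Flash) to find the Initial Stack Pointer and the address of the Reset Handler. A bootloader is a small, permanent piece of code that lives at the very beginning of Flash, so its vector table is the one the CPU finds first.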

The Bootloader's Job

  • The Switch: Instead of jumping to your app, the CPU starts the Bootloader.

  • The Jump: Once the bootloader verifies the application is valid, it points the Vector Table Offset Register (VTOR) at the application's vector table, loads the application's Initial Stack Pointer, and then jumps to the application's Reset Handler.
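The jump described above can be sketched in C. This is a minimal illustration, assuming an ARMv7-M (Cortex-M3/M4/M7) part; the memory map constants, the `vector_head_t` type, both function names, and the `BOOT_ON_TARGET` build flag are hypothetical and not taken from any vendor SDK. The plausibility check runs on a host PC, while the jump itself only compiles for the target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory map: take the real values from your datasheet and linker script. */
#define RAM_START 0x20000000u
#define RAM_END   0x20020000u /* assumed 128 KB of SRAM */
#define APP_START 0x08008000u /* assumed: first byte after the bootloader */
#define APP_END   0x08080000u /* assumed end of the application slot */

/* The first two words of a Cortex-M vector table. */
typedef struct {
    uint32_t initial_sp;    /* value loaded into the main stack pointer */
    uint32_t reset_handler; /* address of the application's entry point */
} vector_head_t;

/* Cheap sanity check; it does not replace a checksum or signature check. */
bool vector_head_is_plausible(const vector_head_t *v)
{
    bool sp_ok = v->initial_sp > RAM_START && v->initial_sp <= RAM_END &&
                 (v->initial_sp & 0x3u) == 0u;
    bool pc_ok = (v->reset_handler & 0x1u) == 0x1u && /* Thumb bit must be set */
                 v->reset_handler >= APP_START && v->reset_handler < APP_END;
    return sp_ok && pc_ok;
}

#if defined(__arm__) && defined(BOOT_ON_TARGET)
/* Target-only part: relocate the vector table, load the stack pointer, branch.
 * A real bootloader would also quiesce interrupts and peripherals first. */
static void jump_to_app(uint32_t app_base)
{
    const vector_head_t *v = (const vector_head_t *)app_base;
    *(volatile uint32_t *)0xE000ED08u = app_base; /* SCB->VTOR */
    __asm volatile("dsb\n\tisb\n\tmsr msp, %0\n\tbx %1"
                   :
                   : "r"(v->initial_sp), "r"(v->reset_handler)
                   : "memory");
}
#endif
```

An erased slot reads back as 0xFFFFFFFF in both words, so the check rejects it before the bootloader ever branches into blank flash.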

Production Rule: The Bootloader must be "Immutable." Once it is programmed at the factory, it should never be updated OTA. This is your "Golden Recovery" code that survives even the worst application crash.

2. Technical Pillar 2: A/B Switching (The Dual-Slot Strategy)

How do you survive a power failure during an update? You never overwrite the code that is currently running. Professional designs split the Flash memory into two equal slots: Slot A and Slot B.

| Step        | Action                      | Status          |
|-------------|-----------------------------|-----------------|
| 1. Initial  | Slot A Running              | Stable          |
| 2. Update   | Download to Slot B          | A is still safe |
| 3. Verify   | Check Signature on B        | Validating      |
| 4. Swap     | Bootloader activates Slot B | Transitioning   |
| 5. Rollback | If B crashes, revert to A   | Auto-Recovery   |
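The swap and rollback steps can be sketched as a small decision function that the bootloader runs on every reset. This is only an illustration: the `boot_meta_t` record, the retry budget of three, and the function names are assumptions, and the record is held in RAM here, whereas a real design would persist it in flash.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u /* assumed retry budget before rolling back */

typedef enum { SLOT_A = 0, SLOT_B = 1 } slot_t;

/* Hypothetical metadata record kept in its own flash sector. */
typedef struct {
    slot_t  active;        /* last slot that booted and checked in */
    slot_t  pending;       /* slot holding a verified but untested image */
    bool    has_pending;   /* true between the swap and the app's check-in */
    uint8_t boot_attempts; /* trial boots of 'pending' so far */
} boot_meta_t;

/* Bootloader side: decide which slot to run on this reset. */
slot_t choose_boot_slot(boot_meta_t *m)
{
    if (!m->has_pending) {
        return m->active; /* nothing to try, run the stable slot */
    }
    if (m->boot_attempts >= MAX_BOOT_ATTEMPTS) {
        m->has_pending = false; /* rollback: trial image never checked in */
        m->boot_attempts = 0;
        return m->active;
    }
    m->boot_attempts++; /* count this trial boot before jumping */
    return m->pending;
}

/* Application side: call once the new firmware has proven itself healthy. */
void confirm_boot(boot_meta_t *m)
{
    if (m->has_pending) {
        m->active = m->pending;
        m->has_pending = false;
        m->boot_attempts = 0;
    }
}
```

Counting the attempt before the jump is the design choice that matters. If the new image hangs and the watchdog resets the chip, the counter has already advanced, so the device cannot loop forever on a bad slot.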

Architecture Logic: This is the same architecture used in Android phones and Tesla vehicles. By decoupling the "Downloading" from the "Executing," an interrupted or corrupt update can no longer brick the device, because a known-good image always remains in the other slot.

3. Technical Pillar 3: Flash Management & Atomicity

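Flash memory cannot be written to like RAM. Programming can only flip bits from 1 to 0, so you must erase a whole "Sector" (often 4KB to 128KB) back to all 1s before you can rewrite a single byte. If the power fails mid-erase or mid-write, that sector is left in an undefined state. We use Atomic Updates to ensure the flash is never in an inconsistent state: the new image is written and verified in full, and only then committed by one small metadata write (or, in swap-based designs, via a "Swap Buffer").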
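To make the erase-before-write rule concrete, here is a tiny RAM-backed model of one flash sector. It is a sketch of typical NOR flash behaviour, not of any specific part: the 4 KB sector size and the `sim_` names are assumptions, and some MCUs with ECC-protected flash additionally forbid programming the same word twice.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 4096u /* assumed sector size; check your part's datasheet */

/* A one-sector stand-in for NOR flash, kept in RAM for experimenting. */
typedef struct {
    uint8_t cells[SECTOR_SIZE];
} sim_sector_t;

/* Erase works on the whole sector and sets every bit to 1. */
void sim_erase(sim_sector_t *s)
{
    memset(s->cells, 0xFF, sizeof s->cells);
}

/* Programming can only pull bits from 1 to 0, so new data is ANDed in. */
void sim_program(sim_sector_t *s, uint32_t offset, const uint8_t *data, uint32_t len)
{
    for (uint32_t i = 0; i < len && offset + i < SECTOR_SIZE; i++) {
        s->cells[offset + i] &= data[i];
    }
}
```

Programming 0xF0 and then 0x0F to the same cell leaves 0x00 rather than 0x0F, which is why overwriting in place corrupts data and why an update must go to a freshly erased slot.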

// The Atomic Swap Sequence
1. Write new data to Slot B
2. Verify Checksum of B
3. Update 'Slot_B_Ready' flag in Meta-Data
4. Reboot & Jump to B
5. If Application doesn't 'Check-In', Revert to A
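The metadata flag in step 3 is itself a flash write that can be interrupted. One common way to guard it, sketched here under assumptions, is to keep two CRC-protected copies of the record and always trust the newest copy that still passes its check. The record layout and the `meta_` function names are hypothetical; the CRC is the standard reflected CRC-32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record; two copies live in separate flash sectors. */
typedef struct {
    uint32_t sequence;     /* incremented on every commit */
    uint32_t slot_b_ready; /* 1 once Slot B has passed verification */
    uint32_t crc;          /* CRC-32 over the two fields above */
} meta_record_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), small rather than fast. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Stamp the record with its CRC just before writing it to flash. */
void meta_seal(meta_record_t *r)
{
    r->crc = crc32_calc((const uint8_t *)r, offsetof(meta_record_t, crc));
}

bool meta_is_valid(const meta_record_t *r)
{
    return r->crc == crc32_calc((const uint8_t *)r, offsetof(meta_record_t, crc));
}

/* Pick the newest copy that passes its CRC; NULL means neither is usable. */
const meta_record_t *meta_pick(const meta_record_t *c0, const meta_record_t *c1)
{
    bool v0 = meta_is_valid(c0);
    bool v1 = meta_is_valid(c1);
    if (v0 && v1) {
        return (c1->sequence > c0->sequence) ? c1 : c0;
    }
    if (v0) return c0;
    if (v1) return c1;
    return NULL;
}
```

Because a commit only ever rewrites the older copy, a power cut during the write damages at most that copy, and the bootloader falls back to the previous, still-valid record.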

Memory safety begins with the layout of the flash sectors. Photo via Unsplash.

4. Technical Pillar 4: Security (The Signed Binary)

An OTA update is the ultimate backdoor. If an attacker can push an update to your device, they own your network. Professional firmware must use Image Signing.

Your build server signs the firmware with a Private Key, which never leaves the build infrastructure. The matching Public Key is burned into the bootloader's code. Before accepting an update, the bootloader uses that key to verify the firmware's signature. If a single bit of the image has changed, verification fails and the device rejects the update.
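The control flow around that check can be sketched as follows. The image layout (header, payload, then a 64-byte signature), the magic value, and all names are assumptions made for illustration. The `sig_verify_fn` callback stands in for a call into a real, vetted crypto library, and `toy_verify` exists only so the flow can be exercised on a host; it is not cryptography.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_MAGIC 0x46574F54u /* arbitrary marker chosen for this sketch */
#define SIG_LEN     64u         /* size of a raw ECDSA P-256 or Ed25519 signature */

/* Hypothetical image layout: header, then payload, then signature. */
typedef struct {
    uint32_t magic;
    uint32_t payload_len; /* bytes of firmware following the header */
} image_header_t;

/* Placeholder for a real crypto library call: returns true if sig is valid
 * for msg[0..msg_len) under the public key built into the bootloader. */
typedef bool (*sig_verify_fn)(const uint8_t *msg, size_t msg_len,
                              const uint8_t sig[SIG_LEN]);

bool image_is_authentic(const uint8_t *slot, size_t slot_size, sig_verify_fn verify)
{
    image_header_t hdr;
    if (slot_size < sizeof hdr + SIG_LEN) {
        return false;
    }
    memcpy(&hdr, slot, sizeof hdr); /* copy out to avoid unaligned access */
    if (hdr.magic != IMAGE_MAGIC) {
        return false;
    }
    /* Bounds-check the untrusted length before using it. */
    if (hdr.payload_len > slot_size - sizeof hdr - SIG_LEN) {
        return false;
    }
    size_t signed_len = sizeof hdr + hdr.payload_len;
    /* The signature covers header and payload, so the length is protected too. */
    return verify(slot, signed_len, slot + signed_len);
}

/* Test stand-in only, NOT cryptography: the "signature" is the byte sum repeated. */
bool toy_verify(const uint8_t *msg, size_t msg_len, const uint8_t sig[SIG_LEN])
{
    uint8_t sum = 0;
    for (size_t i = 0; i < msg_len; i++) {
        sum = (uint8_t)(sum + msg[i]);
    }
    for (size_t i = 0; i < SIG_LEN; i++) {
        if (sig[i] != sum) {
            return false;
        }
    }
    return true;
}
```

The bounds check runs before the signature check on purpose: the length field comes from the untrusted image, so it must be validated before it is used to locate the signature.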

Summary: The Bootloader Roadmap

  1. Don't Use Defaults: Build a custom bootloader that fits your specific memory map and safety requirements.

  2. Implement A/B Slots: If you have the Flash space, dual slots are the most dependable way to guarantee that a known-good image is always available in the field.

  3. Use a Hardware Watchdog: Ensure the bootloader can detect a "Bad Boot" and roll back automatically.

  4. Sign Your Images: Cryptographic verification is a mandatory foundation for any connected product.

Engineering at EurthTech

At EurthTech, we build for the "Unreliable Reality." We understand that power fails, WiFi drops, and hackers are watching. Our bootloader architectures ensure that no matter what happens in the field, your product remains a functional, secure asset—not a brick.

Ready to scale your next production-grade embedded project? Let’s get deep.

 
 
 


 

© 2025 by Eurth Techtronics Pvt Ltd.

 
