3. Continuous Deployment/Delivery

Continuous Deployment/Delivery - Getting your code to the User

Everything we've talked about so far is about the developer side of a project - organizing the code changes, quality, and testing. This is well and good, but now we need to consider our second goal - deploying new code and data architecture (databases, servers, etc.) seamlessly to users, with ways to address bugs and fatal crashes.

Like CI, Continuous Deployment (CD) may seem trivial at first - just let the user download the new update from your company website. But as we said before, it's all about scale and coordination.

NOTE: Like before, we will be talking about some other concepts in order to properly cover the CD pipeline. If you are unfamiliar with bootloaders or general Internet of Things concepts, I'd suggest you follow our guides here and here, respectively.

Another Anecdote - Download Binaries

Let's say you're considering the simplest deployment method - letting the user download and install the latest software themselves. Perhaps this means letting a user flash the code to the device, or maybe just load the binaries onto an SD card for your bootloader. Maybe it's ssh-ing into the device and running some scripts. However, while these are definitely options that are still used today, they come with their own tradeoffs.

The main benefit is simplicity. No extra teams, extra code, or even extra time is really needed for this method outside of hosting the correct binaries on your website and maybe maintaining some scripts for the bootloader. In a way, it's not too dissimilar to what programmers do for their code libraries, compilers, or any other code environment setup.

But there lies the problem - you must depend on your users to perform the upgrade. When your users are a bunch of programmers, engineers, and/or product managers, this is usually fine. But not all users are the same - some users may not be tech-savvy, some may find updates inconvenient and not bother, and some might have their own special hardware/environment modifications that make your update problematic. Even technical users won't always find it trivial - they might just not have the tools or knowledge to deal with it for one reason or another. But if your goal is to develop IoT for everyone, you have to make your deployment easy for everyone.

Additionally, there is security - when an update patches a major vulnerability, every device left out of the upgrade process is a weak link for attackers to target. Depending on users can leave these weak links everywhere - when your user base is in the hundreds, the room for human error can skyrocket. Let alone what happens with thousands of users.

But speaking of security patches, we also must consider the vulnerabilities we introduce in our devices whenever there are code updates. There are also no version rollbacks (no way to recall a version from your devices). If you release code with bugs or major issues, there is no easy way to revert the code version on most systems.

Giving the users the code binaries might be simple, but it only really scales to users that are already developers at some level themselves. If we want a truly scalable solution for every user, we'll need to be more systematic.

Continuous Deployment - Live Service Updates

Let us look at a related non-embedded system that can show us a better path forward. Consider webpages and servers that operate 24/7 like Amazon.com or Google.com or the Apple Store. These systems cannot afford any downtime - being down for 20 minutes may mean losing thousands of active users and thousands of dollars.

So, how could you roll out a new system without interrupting the servers/service? This is where the CD pipeline comes in. A good continuous deployment (CD) pipeline will be able to track the active users and offload the old systems while onboarding updates behind the scenes.

Staged Rollouts and Resource Monitoring

This brings us to staged rollouts. Instead of releasing an update to everyone at once, you could release the update to a certain portion of users. For web servers, this might mean redirecting a certain portion of users to the new systems (based on their location, whether they are currently active, etc).
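One common way to pick "a certain portion of users" is to hash each user ID into a stable bucket and compare the bucket to the current rollout percentage. The snippet below is a minimal sketch of that idea, not a description of any particular service: the function names, the `salt` parameter, and the `"release-2"` label are all made up for illustration.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "release-2") -> int:
    """Map a user to a stable bucket in the range 0..99.

    The salt (a hypothetical release label) reshuffles users between
    releases so the same people are not always first in line.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def gets_update(user_id: str, rollout_percent: int, salt: str = "release-2") -> bool:
    """True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id, salt) < rollout_percent
```

Because the bucket only depends on the ID and the salt, a user who received the update at 10% still receives it when the rollout widens to 50%, and no per-user state needs to be stored on the server.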

By keeping track of usage, it is possible to schedule updates so that they occur when the user is not active on the service and so is not affected by their individual downtime. Even more importantly, this allows for effective user testing - as early users adopt the update, any new issues they hit can initiate bug fixes on the developer side or even prompt a reversal of the update. In the case of servers, they'd just reverse the ratio of users who are redirected to the updated systems.
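To make the "expand or reverse" decision concrete, here is a rough sketch of how a pipeline might choose the next rollout percentage from crash counts. Everything in it is an assumption for illustration: the `CohortStats` type, the `min_sessions` and `tolerance` thresholds, and the "double the exposure" policy are placeholders, and a real pipeline would likely look at more signals than crashes alone.

```python
from dataclasses import dataclass


@dataclass
class CohortStats:
    """Usage counters for one group of users (hypothetical shape)."""
    sessions: int
    crashes: int

    @property
    def crash_rate(self) -> float:
        return self.crashes / self.sessions if self.sessions else 0.0


def next_rollout_percent(current: int, baseline: CohortStats, updated: CohortStats,
                         min_sessions: int = 500, tolerance: float = 1.5) -> int:
    """Decide what share of users should be routed to the update next."""
    if updated.sessions < min_sessions:
        return current  # not enough data yet: hold at the current share
    if updated.crash_rate > baseline.crash_rate * tolerance:
        return 0  # clear regression: route everyone back to the old version
    return min(100, max(1, current * 2))  # healthy: double the exposure
```

For example, with a baseline of 50 crashes in 10,000 sessions, an updated cohort with 5 crashes in 1,000 sessions moves a 5% rollout to 10%, while 20 crashes in 1,000 sessions drops it back to 0%.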

Of course, this is very high level - like with PRs and integration, the details really depend on what system you're deploying and what pipeline services you have either developed or are using.

Continuous Deployment - Over-the-Air Firmware Updates and Golden Images

So how does this even work for embedded?

Well, first off, we're not a web server - all our systems are theoretically tied to a particular device and have some code file flashed to particular microcontrollers (we can't just hot-swap the service).

However, we can borrow some concepts if we embrace the "I" in IoT - the Internet of Things. If our embedded device can connect to a network and thereby a server, we as developers can pull off similar tricks as web developers to deploy, update, and rollback our code as needed.

First off, we will require a bootloader and some way to store our new firmware update (likely some non-volatile memory (NVM) like an SD card). This bootloader should be able to flash the files from the NVM into our device's flash memory like any other flashing device. Now, we can make our main application code connect to our company server and download any code updates before it restarts.
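The application-side half of this flow can be sketched in a few lines. This is a desktop-Python simulation under stated assumptions rather than real firmware: the manifest layout (`"version"` and `"sha256"` keys), the dotted version format, and the idea of a staging file that the bootloader later reads are all hypothetical, and the actual download step is left out.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "1.10.0" into (1, 10, 0) for comparison."""
    return tuple(int(part) for part in text.split("."))


def stage_update(current_version: str, manifest: dict, image: bytes,
                 staging_file: Path) -> bool:
    """Write the downloaded image to the staging area if it is newer and intact.

    Returns True when an update was staged for the bootloader to flash
    on the next restart, False when the download should be ignored.
    """
    if parse_version(manifest["version"]) <= parse_version(current_version):
        return False  # the server has nothing newer than what is running
    if hashlib.sha256(image).hexdigest() != manifest["sha256"]:
        return False  # corrupted or truncated download: do not stage it
    staging_file.write_bytes(image)
    return True
```

Note that a checksum like this only catches accidental corruption. It does not prove who produced the image, so it is not a substitute for a signature check in the bootloader.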

We can now realize all the tricks of live service systems that are at our fingertips. We can provide gradual rollouts as devices ping the company server for updates. We can also track errors and crashes from devices without needing user intervention. This practice is a bit less like web server deployment, so you might hear it called "Continuous Delivery", since the process is more about how code updates themselves are sent to devices. The folks at Memfault have made major strides in this and I'd highly recommend their articles about these topics.

Admittedly, it's not all sunshine and easy pickings - sending the firmware over Wi-Fi or any other type of network inherently exposes attack vectors for your device. Security measures in your network protocols and your bootloader become 1000% more important as your first line of defense. Additionally, rolling back updates requires a bit of forethought. A good practice is for your devices to save a known working version of the firmware - the "golden image" - as a backup that can be restored on the device in case of critical failure. However, this practice may not save you from malicious code on your board or from over-exciting the electrons in your device (starting an electrical fire).
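The golden image fallback usually comes down to a small decision the bootloader makes on every reset. The sketch below models that decision in Python, assuming a persistent `state` record (here just a dict) with hypothetical fields `update_present`, `update_confirmed`, and `boot_attempts`, and an arbitrary limit of three trial boots. A real bootloader would keep this state in NVM and implement the same logic in its own language.

```python
MAX_BOOT_ATTEMPTS = 3  # assumed limit on unconfirmed trial boots


def select_image(state: dict) -> str:
    """Return which image to boot: "update" or "golden"."""
    if not state.get("update_present", False):
        return "golden"
    if state.get("update_confirmed", False):
        return "update"  # the update already proved itself
    if state.get("boot_attempts", 0) >= MAX_BOOT_ATTEMPTS:
        state["update_present"] = False  # give up on this update
        return "golden"
    # Count the attempt before booting, so a crash still leaves a record.
    state["boot_attempts"] = state.get("boot_attempts", 0) + 1
    return "update"


def confirm_update(state: dict) -> None:
    """Called by the application once it has run long enough to be trusted."""
    state["update_confirmed"] = True
    state["boot_attempts"] = 0
```

With this policy, an update that crashes before calling `confirm_update` gets three tries and then the device falls back to the golden image on its own, with no user involved.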

The In-Between

We've covered both CI and CD, but it's worth noting that the line between these two can be non-existent - and that's a good thing! With an ideal pipeline, a major PR being accepted can initiate an initial rollout phase for a few users, which can generate initial user feedback, which can prompt the next changes to be integrated into the branch, which starts the cycle over again.

The end result is a constant, continuous development cycle that will grow and scale with the project.

Future of Embedded DevOps

All in all, embedded CI/CD pipelines and DevOps are a relatively young field that lags behind the analogous pipelines and processes seen in web-based tech companies. But, in the same vein, this indicates a highly untapped niche with unique opportunities that, if perfected, could come to play a vital role in embedded development.