In July 2021, Porsche recalled 43,000 of its newest EVs: the Taycan and Taycan Cross Turismo. Why? Software issues that could result in a loss of power. How could this have been prevented, fixing the defect on all cars in one go while reducing costs? The answer is short, and you will hear it from everyone working in the automotive industry: over-the-air upgrade.
Although it is hard to implement correctly, the cost of not being able to remotely upgrade software and firmware in the vehicle is huge. Today the question is no longer “IF” or “WHEN” (the automotive industry has long known the answers to those); today the question is “HOW”.
Upgrading a GPS or infotainment application is one thing, but upgrading the vehicle’s firmware is another. And it does not matter whether it’s a car, an e-scooter, or a smartphone. The principles are always the same. We will try to outline them in this article.
Let’s start from the beginning – what are the core benefits of the over-the-air upgrade?
OTA allows for remote diagnosis. Initial diagnosis done remotely helps with better planning of repairs, as well as with predictive maintenance – both giving a better customer experience and reducing the cost for the OEMs, especially during the warranty period.
The upgrade can also happen on the production line while waiting for shipment. The vehicle always has the newest stable version of the firmware and software, reducing the amount of manual work required for the entire vehicle lifecycle.
The only part of the car life cycle where the Over-The-Air Upgrade is not really useful is aftersales.
Key benefits of implementing an over-the-air upgrade are:
- An ability to remain compliant with evolving industry standards throughout the vehicle’s lifetime.
- A reduction in warranty and recall costs thanks to fewer service center visits and help desk calls for the vehicle (it also works on the production line, while waiting for shipment).
- The vehicle always has the newest stable version of the firmware and software, reducing the amount of manual work required for the entire vehicle lifecycle.
- An ability to resolve issues remotely, so the customer does not have to waste time traveling on-site.
- An ability to update multiple vehicles simultaneously, reducing the time required to update the whole fleet.
SOTA – the most common implementation of over-the-air upgrade
SOTA (software over-the-air) is used widely by almost every OEM to update navigation systems (maps, POIs) and sometimes other infotainment applications, like voice assistance. As opposed to a firmware update, the failure of a software update is rarely critical to vehicle operations. It can still cause inconvenience when, due to an update failure, the navigation system crashes or fails to display a map.
This is also the part that hurts the customer experience if SOTA is done without due diligence, because the software is what makes the infotainment appealing and responsive. No one likes slow or difficult-to-use applications or services, especially when they’re intended to boost driving satisfaction.
Firmware over-the-air upgrade is a different beast
With FOTA, we play a much more demanding game. That’s why it’s important to separate software updates from firmware updates.
First, it’s just easier for developers to focus on their part of the job, the specific application. Second, the firmware part is riskier and more complex, and the update might not be required that often.
The complication comes partly from the idea of replacing the operating system of the ECU/SoC and partly from the criticality of the systems. Computers controlling engine operations, ESP/TC, the gearbox, or the electronic chassis controller are required for safe and reliable operation of the vehicle.
A failure in the update process that results in a critical fault in this kind of subsystem will, in most cases, make the vehicle inoperable, beyond the repair capabilities of regular users. The cost of restoring the vehicle to an operational state falls fully on the manufacturer. This is obviously a scenario to be avoided at all costs.
Key requirements for implementing (F)OTA successfully
- Automatic recovery from corrupted updates
Firmware updates should be atomic. Either the whole process succeeds, or the system automatically rolls back to the previous (existing) version of the software. The problem does not have to be caused by a bug in the original image – the package can be corrupted in transit, or the transfer might be interrupted, leaving only a partial package to be processed.
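To make the "all or nothing" idea concrete, here is a minimal sketch in Python. It assumes the package comes with an expected SHA-256 digest and, for simplicity, that the "firmware" is a single file; the function name `install_atomically` and its parameters are hypothetical, and a real ECU would work on partitions rather than files.

```python
import hashlib
import os
import tempfile


def install_atomically(package: bytes, expected_sha256: str, target_path: str) -> bool:
    """Install `package` at `target_path` only if its digest matches.

    Returns True when the new image replaced the old one, and False when the
    package was rejected and the existing file was left untouched.
    (Hypothetical helper for illustration, not a real OTA client API.)
    """
    if hashlib.sha256(package).hexdigest() != expected_sha256:
        # Corrupted or truncated package: keep the current version.
        return False

    # Stage the new image next to the target so the final rename stays
    # on the same filesystem.
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, staging_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as staged:
            staged.write(package)
            staged.flush()
            os.fsync(staged.fileno())
        # The old image stays in place until this single replace step.
        os.replace(staging_path, target_path)
    except OSError:
        if os.path.exists(staging_path):
            os.remove(staging_path)
        return False
    return True
```

The point of the design is that the old version is never modified until the new one has been fully written and verified, so an interrupted or corrupted transfer leaves the device exactly where it was.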
- Internet connectivity consistency
The parts of the firmware being updated, especially those responsible for device-to-network connectivity, should never break the SoC’s connection to the internet – otherwise, the next version might never be installed automatically. This is especially important if the device has no way to notify the user about the problem or to let them reconfigure the network settings.
- Code provenance, code identity, code compatibility and code integrity – security of the executed program
A firmware update in most cases concerns critical systems. The wireless update is tempting, but it must be secure, especially when it comes to verifying the identity of the authors of the change and the source of the update, as well as confirming that the code was not replaced or altered in transit. Only if the edge device can cryptographically verify the code signature should the update be installed. Additionally, there should be a way for the update system to confirm that the package is built for the specific device it’s being installed on.
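The three checks described above (provenance, integrity, compatibility) could be sketched as follows. This is only an illustration under loud assumptions: the manifest layout (`payload_sha256`, `compatible_hardware`) and the function name are made up, and HMAC-SHA256 from the Python standard library stands in for a real code signature. A production system would normally use asymmetric signatures so that the vehicle only stores a public key.

```python
import hashlib
import hmac
import json


def verify_package(manifest_bytes: bytes, signature_hex: str, payload: bytes,
                   signing_key: bytes, device_hardware_id: str) -> bool:
    """Return True only if the package passes all three checks.

    Hypothetical manifest format (JSON):
      {"payload_sha256": "<hex>", "compatible_hardware": ["<id>", ...]}
    """
    # 1. Provenance: the manifest must carry a valid tag from the signer.
    #    (HMAC is a symmetric stand-in for a real asymmetric signature.)
    expected = hmac.new(signing_key, manifest_bytes, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        return False

    manifest = json.loads(manifest_bytes)

    # 2. Integrity: the payload must match the digest in the signed manifest.
    if hashlib.sha256(payload).hexdigest() != manifest["payload_sha256"]:
        return False

    # 3. Compatibility: the package must be built for this hardware.
    return device_hardware_id in manifest["compatible_hardware"]
```

Signing the manifest rather than the payload keeps the signed blob small, while the digest inside it still ties the signature to the exact payload bytes.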
- Secure communication medium for package transport
All channels used for the update should be secure. Ideally, this means mutual TLS, but even a regular TLS connection is sufficient as long as the whole path is secure (both the local connection and the cloud side).
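On the client side, the difference between regular and mutual TLS comes down to whether the device also presents its own certificate. A minimal sketch using Python’s standard `ssl` module is shown below; the function name and the idea of passing certificate paths as optional arguments are assumptions made for illustration.

```python
import ssl
from typing import Optional


def build_update_tls_context(client_cert: Optional[str] = None,
                             client_key: Optional[str] = None,
                             ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context for talking to the update server.

    Without `client_cert` this is regular TLS (only the server is verified).
    With `client_cert` / `client_key` the device also authenticates itself,
    which is the client half of mutual TLS.
    """
    # Verifies the server certificate and hostname by default.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse legacy protocol versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if client_cert is not None:
        # Hypothetical paths to the device certificate and private key.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context
```

For mutual TLS to be effective, the server must additionally be configured to require and verify the client certificate; that part is outside this sketch.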
- [NICE-TO-HAVE] Sending OTA firmware updates in chunks and partial updates support
It’s easier to handle updates that are sent in chunks. When the connection is unstable, the whole download process does not have to be repeated. Additionally, if partial updates are supported, a small update takes less time to install and less bandwidth to transfer.
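A rough sketch of a resumable, chunked download is shown below. The `fetch_chunk(offset, size)` callable is a hypothetical stand-in for whatever transport the device uses (for example an HTTP range request); the key idea is that after a dropped connection the download continues from the last received byte instead of starting over.

```python
from typing import Callable


def download_in_chunks(fetch_chunk: Callable[[int, int], bytes], total_size: int,
                       chunk_size: int = 4, max_retries: int = 3) -> bytes:
    """Download `total_size` bytes chunk by chunk, resuming after failures.

    `fetch_chunk(offset, size)` is an assumed transport function that returns
    up to `size` bytes starting at `offset`, or raises ConnectionError.
    """
    received = bytearray()
    retries = 0
    while len(received) < total_size:
        size = min(chunk_size, total_size - len(received))
        try:
            chunk = fetch_chunk(len(received), size)
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            continue  # Retry the same offset; earlier chunks are kept.
        if not chunk:
            raise ConnectionError("server returned an empty chunk")
        received.extend(chunk)
        retries = 0  # A successful chunk resets the retry budget.
    return bytes(received)
```

In a real device the received chunks would be persisted to storage, so that the download can also survive a reboot; the in-memory buffer here only keeps the example short.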
- [NICE-TO-HAVE] Separate base system layer from the installed software
If the application and data layer is not part of the firmware update, it’s easier to develop the applications, safely update the system without breaking the data, and securely update the system without breaking the applications. Combined with partial updates, it also helps with making updates faster.
Unlike chip flashing over a wired connection, failure is not really an option here – if the device cannot boot, even to some basic OS functions, it is bricked. Unless you are an expert with specialist hardware, it may be really hard to write new firmware directly to the chip to overwrite the faulty or broken version.
And what if a broken package is written to the device?
It does not matter whether it was a human error, a device issue, or just really bad luck – in the end, the important part is to make sure the user does not end up with a broken vehicle. The battle-tested solution to this problem is A/B filesystems, also known as A/B slots.
The idea is rather simple – system areas in storage are duplicated. In other words, two fully operational versions of the system are installed side by side on a single device, and a programmatic switch in the bootloader selects which OS to start.
In regular operation, a single system, let’s call it “A”, is continuously in use, while the other one, “B”, is an exact copy of “A” that serves as a backup. If “A” fails to start, the bootloader switches to the other version. During the update, the inactive partition is overwritten with the update packages – either the whole partition or a subset of files, depending on the type of update. If the update finishes and the checksum of the result is correct, then as the last step the bootloader configuration is changed to run from the “B” slot, and the device restarts.
As previously stated – if something fails, the bootloader, after a failed attempt, will switch back to the previous, working version. This makes the approach safe and allows us to retry the upgrade process. If the update is successful, there are two ways to proceed:
- Leave the old version on the other partition and keep booting from the slot selected during the update process.
- Copy the contents of the upgraded partition to the other slot to have two copies of the same version.
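The A/B flow described above can be modeled with a small simulation. This is a toy sketch, not how any particular bootloader is implemented: the `Device` class, its method names, and the `is_bootable` callback (which stands in for an actual boot attempt) are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Device:
    """Toy model of a device with two system slots, "A" and "B"."""
    slots: Dict[str, bytes]
    active: str = "A"

    def other(self) -> str:
        return "B" if self.active == "A" else "A"

    def apply_update(self, image: bytes, expected_sha256: str) -> bool:
        """Write the update to the inactive slot and switch only if it verifies."""
        target = self.other()
        self.slots[target] = image  # The running slot is never touched.
        if hashlib.sha256(self.slots[target]).hexdigest() != expected_sha256:
            return False  # Checksum mismatch: keep booting the current slot.
        self.active = target  # Last step: flip the bootloader switch.
        return True

    def boot(self, is_bootable: Callable[[bytes], bool]) -> str:
        """Try the selected slot; fall back to the other one if it fails."""
        if not is_bootable(self.slots[self.active]):
            self.active = self.other()
        return self.active

    def sync_slots(self) -> None:
        """Second post-update approach: copy the active image to the other slot."""
        self.slots[self.other()] = self.slots[self.active]
```

A possible usage: call `apply_update` with the new image, reboot via `boot`, and once the new version is confirmed to work either leave the old image in the other slot as a fallback or call `sync_slots` to keep two copies of the same version.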
The same approach is used in modern smartphones and, as a direct continuation, was selected for Android Automotive OS (AAOS) – an implementation of the Google Android Open Source Project (AOSP) specific to the automotive industry.
Currently, both Volvo (including, of course, Polestar) and General Motors use AAOS as the infotainment system in their newest vehicles. Because it is an open system, applications can be developed for cars from different OEMs and leverage the bigger, open market. On top of that, the code is open source, and a lot of work on things like the upgrade system (OTA), application delivery, and connection to subsystems (air conditioning, navigation, interior buttons) is already finished and can be reused.
Building using open and tested frameworks and code is just easier – and a proven way to update both application and system is an asset when starting from scratch with new infotainment firmware and software.