Performing manual Flatcar Container Linux rollbacks
In the event of an upgrade failure, Flatcar Container Linux will automatically boot with the version on the rollback partition. Immediately after an upgrade reboot, the active version of Flatcar Container Linux can be rolled back to the version installed on the rollback partition, or downgraded to the version current on any lower release channel. There is no method to downgrade to an arbitrary version number.
This section describes the automated upgrade process, performing a manual rollback, and forcing a channel downgrade.
Note: Neither performing a manual rollback nor forcing a channel downgrade are recommended.
Automated rollbacks
The rollback to the previously installed version is done by GRUB and happens automatically if update-engine
had no chance to mark the version as successful.
This marking happens when the new version is booted and keeps running for around two minutes, at which point update-engine
will mark the version as successful (how this works in detail is explained below).
To extend the automatic rollback logic to cover your important systemd services, you could make them as requirement for the update-engine.service
.
Note that update-engine
will still try to update which can cause a loop with disruptions due to the reboots.
You can disable automatic updates by setting SERVER=disabled
in /etc/flatcar/update.conf
.
Rollback with flatcar-update
While you can rollback to the previously installed version manually with the rest of this guide, you can also install any version to the inactive partition with the flatcar-update
tool.
To rollback to a known-good version, run it as follows:
$ sudo flatcar-update --to-version 2905.2.6 --disable-afterwards
The --disable-afterwards
switch writes SERVER=disabled
to /etc/flatcar/update.conf
which disables updates.
This ensures that you will stay on the version you specified.
How do updates work
The system’s GPT tables are used to encode which partition is currently active and which is passive. This can be seen using the cgpt
command.
$ cgpt show /dev/sda
start size part contents
0 1 Hybrid MBR
1 1 Pri GPT header
2 32 Pri GPT table
4096 262144 1 Label: "EFI-SYSTEM"
Type: EFI System Partition
UUID: 596FF08E-5617-4497-B10B-27A23F658B73
Attr: Legacy BIOS Bootable
266240 4096 2 Label: "BIOS-BOOT"
Type: BIOS Boot Partition
UUID: EACCC3D5-E7E9-461D-A6E2-1DCDAE4671EC
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=2 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=1 tries=0 successful=0
4464640 262144 6 Label: "OEM"
Type: Alias for linux-data
UUID: 726E33FA-DFE9-45B2-B215-FB35CD9C2388
4726784 131072 7 Label: "OEM-CONFIG"
Type: Flatcar Container Linux reserved
UUID: 8F39CE8B-1FB3-4E7E-A784-0C53C8F40442
4857856 37085151 9 Label: "ROOT"
Type: Flatcar Container Linux auto-resize
UUID: D9A972BB-8084-4AB5-BA55-F8A3AFFAD70D
41943007 32 Sec GPT table
41943039 1 Sec GPT header
Looking specifically at “USR-A” and “USR-B”, we see that “USR-A” is the active USR partition (this is what’s actually mounted at /usr; you can verify this with rootdev -s /usr
). Its priority is higher than that of “USR-B”. When the system boots, GRUB (the bootloader) looks at the priorities, tries, and successful flags to determine which partition to use.
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=2 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=1 tries=0 successful=0
You’ll notice that on this machine, “USR-B” hasn’t actually successfully booted. Not to worry! This is a fresh machine that hasn’t been through an update cycle yet. When the machine downloads an update, the partition table is updated to allow the newer image to boot.
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=1 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=2 tries=1 successful=0
In this case, we see that “USR-B” now has a higher priority and it has one try to successfully boot. Once the machine reboots, the partition table will again be updated.
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=1 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=2 tries=0 successful=0
Now we see that the number of tries for “USR-B” has been decremented to zero. The successful flag still hasn’t been updated though. Once update-engine has had a chance to run, it marks the boot as being successful.
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=1 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=2 tries=0 successful=1
Note: You may also see Alias for coreos-rootfs
shown for the /usr
partition instead of the flatcar-rootfs
. To refer to them you can use both names or the more appropriate flatcar-usr
name which we will use from now on.
Performing a manual rollback
So, now that we understand what happens when the machine updates, we can tweak the process so that it boots an older image (assuming it’s still intact on the passive partition). The first command we’ll use is cgpt find -t flatcar-usr
. This will give us a list of all of the USR partitions available on the disk.
$ cgpt find -t flatcar-usr
/dev/sda3
/dev/sda4
To figure out which partition is currently active, we can use rootdev
.
$ rootdev -s /usr
/dev/sda4
So now we know that /dev/sda3
is the passive partition on our system. We can compose the previous two commands to dynamically figure out the passive partition.
$ cgpt find -t flatcar-usr | grep --invert-match "$(rootdev -s /usr)"
/dev/sda3
In order to rollback, we need to mark that partition as active using cgpt prioritize
.
cgpt prioritize /dev/sda3
If we take another look at the GPT tables, we’ll see that the priorities have been updated.
270336 2097152 3 Label: "USR-A"
Type: Alias for flatcar-rootfs
UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
Attr: priority=2 tries=0 successful=1
2367488 2097152 4 Label: "USR-B"
Type: Alias for flatcar-rootfs
UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
Attr: priority=1 tries=0 successful=1
Composing the previous two commands produces the following command to set the currently passive partition to be active on the next boot:
cgpt prioritize "$(cgpt find -t flatcar-usr | grep --invert-match "$(rootdev -s /usr)")"
In the above scenario, tries can stay 0 because the partition was marked as successful. If the partition was not successfully booted, we also need to set the available tries to 1 again:
cgpt add -T 1 /dev/sda3
Forcing a Channel Downgrade
The procedure above restores the last known good Flatcar Container Linux version from immediately before an upgrade reboot. The system remains on the same
Flatcar Container Linux channel
after rebooting with the previous USR partition. It is also possible, though not recommended, to switch a Flatcar Container Linux installation to an older release channel, for example to make a system running an Alpha release downgrade to the Stable channel. Root privileges are required for this procedure, noted by sudo
in the commands below.
First, edit /etc/coreos/update.conf
to set GROUP
to the name of the target channel, one of stable
or beta
:
GROUP=stable
Next, clear the current version number from the release
file so that the target channel will be certain to have a higher version number, triggering the “upgrade,” in this case a downgrade to the lower channel. Since release
is on a read-only file system, it is convenient to temporarily override it with a bind mount. To do this, copy the original to a writable location, then bind the copy over the system release
file:
cp /usr/share/coreos/release /tmp
sudo mount -o bind /tmp/release /usr/share/coreos/release
The file is now writable, but the bind mount will not survive the reboot, so that the default read-only system release
file will be restored after this procedure is complete. Edit /usr/share/coreos/release
to replace the value of COREOS_RELEASE_VERSION
with 0.0.0
:
COREOS_RELEASE_VERSION=0.0.0
Restart the update service so that it rescans the edited configuration, then initiate an update. The system will reboot into the selected lower channel after downloading the release:
update_engine_client -update