OTA Updates

The SocketXP OTA update tool is extremely useful when you have to deploy software updates to a fleet of Linux-based edge devices.

The OTA update tool can be used to update software packages, application binaries, and config files, and to run scripts on a group of remote edge devices.

You can use the OTA update tool to automate your DevOps workflow.

Creating and Deploying OTA Updates

Creating and deploying OTA updates using the tool is a two-step process:

  1. Create an OTA Update Workflow
  2. Create an OTA Update Deployment from the workflow

The basic concept behind this two-step approach is to reuse the OTA update workflow code across multiple deployments.

OTA Update Best Practices

For a successful OTA update deployment, have a small fleet of devices in your development lab as a Test Group. These devices should be similar to the ones in the Production Group.

First, deploy your OTA updates to the Test Group. Verify that the update is successful and everything works as expected. Review the OTA update deployment logs for each of the devices in the Test Group.

Fine-tune your OTA update workflow or fix bugs in your app until the OTA updates to the Test Group are successful.

Only then deploy your OTA updates to devices in your Production Group.

Some sophisticated development teams maintain Development, Testing, Staging, and Production Groups. They may even create a small subgroup under the Production Group, named the Canary Group, and roll out their OTA updates to the Canary Group first (following the Canary Deployment Model) before deploying the OTA update to all other devices in the Production Group.

Step #1: Create an OTA Update Workflow

An OTA Update Workflow is the logic behind the OTA update.

You first need to define the software update workflow logic in the form of a Workflow Definition Script. The script can be written as a Linux shell script, a Python or Perl script, or in any other scripting language.

The workflow script defines the various steps in the software update process and the order in which these steps need to be executed. The workflow script should also include rollback-on-failure logic to roll back the device software to its original state (the state before the OTA update).

Workflow Script Blueprint

An OTA update workflow script should have the following logic:

  1. Stop the app or service running on the device. (On failure, do what? - Undo the current action.)
  2. Back up the existing working app/version and working config files. (On failure, do what? - Undo the previous actions.)
  3. Download and install the new app binary to the app directory. (On failure, do what? - Restore from the backup.)
  4. Update the app configuration and settings, or download a new config file with the required settings. (On failure, do what? - Restore from the backup.)
  5. Start the app or service. (On failure, do what? - Restore from the backup.)
  6. Verify that the app is up and running. (On failure, do what? - Restore from the backup.)
  7. Clean up the backup.
  8. Done. Exit.

Here are some sample Workflow Script files, written in Python and shell script respectively, that follow the above blueprint. You can leverage these scripts and adapt them to your application's OTA update workflow definition.

import os
import subprocess
import sys

def run_command(command):
    """
    Run a shell command and return the output, error and return code.
    """
    # text=True returns output and error as strings instead of bytes
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, text=True)
    output, error = process.communicate()
    return output, error, process.returncode

def create_backup():
    """
    Create a backup of the myapp binary and configuration files.
    """
    print("Creating backup of myapp configuration files...")

    # Define the source directories for the backup
    src_dirs = ["/etc/myapp/", "/var/lib/myapp/"]
    dst_dir = "/tmp/myapp_backup/"

    # Create the destination directory if it doesn't exist
    os.makedirs(dst_dir, exist_ok=True)

    # Run the rsync command to create the backup
    for src_dir in src_dirs:
        dst_subdir = os.path.join(dst_dir, src_dir.lstrip('/'))
        os.makedirs(dst_subdir, exist_ok=True)
        output, error, return_code = run_command(f"rsync -a {src_dir} {dst_subdir}")

        if return_code != 0:
            print(f"Error creating backup: {error}")
            return False

    print("Backup created successfully.")
    return True

def restore_backup():
    """
    Restore the backup of the myapp configuration files.
    """
    print("Restoring backup of myapp binary and configuration files...")

    # Define the destination directories for the backup
    dst_dirs = ["/etc/myapp/", "/var/lib/myapp/"]
    src_dir = "/tmp/myapp_backup/"

    # Run the rsync command to restore the backup
    for dst_dir in dst_dirs:
        src_subdir = os.path.join(src_dir, dst_dir.lstrip('/'))
        output, error, return_code = run_command(f"rsync -a --delete {src_subdir}/ {dst_dir}")

        if return_code != 0:
            print(f"Error restoring backup: {error}")
            return False

    # Clean up the backup directory
    output, error, return_code = run_command(f"rm -rf {src_dir}")

    if return_code != 0:
        print(f"Error cleaning up backup directory: {error}")
        return False

    print("Backup restored and cleaned up successfully.")
    return True

def handle_error(error_message, restore=True):
    """
    Handle an error by printing an error message, restoring from backup if necessary,
    and restarting the myapp systemd service.
    """
    print(error_message)

    if restore:
        # Attempt to restore the backup and revert the changes
        if not restore_backup():
            sys.exit("Error restoring backup.")

    # Restart the myapp systemd service
    output, error, return_code = run_command("systemctl start myapp")

    if return_code != 0:
        sys.exit(f"Error starting myapp service: {error}")

    return False

def install_myapp():
    """
    Install the myapp package.
    """
    print("Installing myapp...")

    # Stop the myapp systemd service if it is running
    output, error, return_code = run_command("systemctl stop myapp")

    if return_code != 0:
        return handle_error(f"Error stopping myapp service: {error}", restore=False)

    # Create a backup before performing any changes
    if not create_backup():
        sys.exit("Error creating backup.")

    # Download the package from AWS S3 bucket
    url = "https://abcdefghxyz.amazonaws.com/v41/bin/myapp.deb"

    output, error, return_code = run_command(f"wget -O /tmp/myapp.deb {url}")

    if return_code != 0:
        return handle_error(f"Error downloading package: {error}")

    # Install the package
    output, error, return_code = run_command("dpkg -i /tmp/myapp.deb")

    if return_code != 0:
        return handle_error(f"Error installing package: {error}")

    # Start and enable the myapp systemd service
    output, error, return_code = run_command("systemctl start myapp && systemctl enable myapp")

    if return_code != 0:
        return handle_error(f"Error starting or enabling myapp service: {error}")

    print("myapp installed successfully.")
    return True


# Entry point: run the full OTA update workflow when this script is executed
if __name__ == "__main__":
    if not install_myapp():
        sys.exit(1)

Here is a shell script version of the workflow script for downloading and installing the myapp binary:

#!/bin/sh

#================================================
# Example Workflow Script:
# MyApp Update Workflow Script 
#================================================

# On-failure (before the backup) just restore the service
invoke_restore() {
    systemctl start myapp
    systemctl status myapp
    exit 1
}
# On-failure clean up and restore from backup.
invoke_restore_backup() {
    systemctl stop myapp
    cp /usr/local/bin/myapp.bkup /usr/local/bin/myapp
    cp -R /var/lib/myapp.bkup/. /var/lib/myapp/
    rm -rf /var/lib/myapp.bkup
    cp /etc/myapp/config.json.bkup /etc/myapp/config.json   
    systemctl start myapp
    systemctl status myapp
    exit 1
}

debug_error() {
    echo "$(date +"%Y-%m-%d %H:%M:%S.%N"): Error: $1"
}

debug_log() {
    echo "$(date +"%Y-%m-%d %H:%M:%S.%N"): $1"
}

# Workflow Begins.

# stop the service first
systemctl stop myapp

# backup the existing working app and configuration.
cp /usr/local/bin/myapp /usr/local/bin/myapp.bkup
if [ $? -eq 0 ]; then
    debug_log "Binary copied."
else 
    debug_error "Binary backup failed."
    invoke_restore
fi

mkdir -p /var/lib/myapp.bkup
cp -R /var/lib/myapp/. /var/lib/myapp.bkup/
if [ $? -eq 0 ]; then
    debug_log "Working dir copied."
else 
    debug_error "Working dir backup failed."
    invoke_restore
fi

cp /etc/myapp/config.json /etc/myapp/config.json.bkup
if [ $? -eq 0 ]; then
    debug_log "Config copied."
else 
    debug_error "Config backup failed."
    invoke_restore
fi

# Download the new myapp file from your online repository such as AWS S3 Bucket
curl -O https://abcdefghxyz.amazonaws.com/v41/bin/myapp && chmod +wx myapp && sudo mv myapp /usr/local/bin
if [ $? -eq 0 ]; then
    debug_log "New binary downloaded."
else 
    debug_error "New binary download failed."
    invoke_restore_backup
fi

# Download a new myapp config.json file from your online repository such as AWS S3 Bucket
curl -O https://abcdefghxyz.amazonaws.com/v41/cfg/config.json && sudo mv config.json /etc/myapp/config.json
if [ $? -eq 0 ]; then
    debug_log "New config file downloaded."
else 
    debug_error "New config file download failed."
    invoke_restore_backup
fi

# Start the service again
systemctl start myapp

# Check the status of the service
STATUS="$(systemctl status myapp | grep 'Active: active (running)')"
if [ -z "$STATUS" ]; then
    debug_error "myapp service failed to run."
    invoke_restore_backup
else 
    debug_log "myapp service is running."
fi
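
The sample shell script above stops after verifying that the service is running; it does not perform step 7 of the blueprint (clean up the backup). On the success path, you could append a cleanup block along the lines of the sketch below. The paths match the backups created earlier in this sample script; adjust them if your backup locations differ. As a side note, systemctl is-active --quiet myapp returns a non-zero exit code when the service is not running, and can be used as a more robust check than grepping the systemctl status output.

# Step 7 of the blueprint: clean up the backups after a successful update
rm -f /usr/local/bin/myapp.bkup /etc/myapp/config.json.bkup
rm -rf /var/lib/myapp.bkup
debug_log "Backups cleaned up. OTA update completed successfully."
exit 0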

How to create an OTA Update Workflow

To create an OTA Update Workflow, go to the DevOps Automation section in the SocketXP Portal.

Click the CREATE WORKFLOW tab and start creating an OTA Update Workflow as described below:

  • Workflow Name - Provide a name to remember this workflow for your reference. Eg: MyIoTApp upgrade to v3.0 workflow, Fix antenna issue, security fixes

  • Upload a Workflow Script File - Upload the workflow script file that you have written to update the software on your devices.

  • Destination Filename - Specify the file path where the workflow script needs to be downloaded on your edge Linux devices. Eg: /usr/local/bin/workflow-script.sh, /var/lib/my-iot-app/my-workflow-script.py, /home/pi/wf-script.pl

  • Script Execution Command - Specify a Linux command to run on your edge devices to execute the workflow script downloaded in the previous step. Eg: sh /usr/local/bin/workflow-script.sh, python /var/lib/my-iot-app/my-workflow-script.py, perl /home/pi/wf-script.pl

  • Finally click the CREATE WORKFLOW button to save and finish.

Go to the WORKFLOWS tab and hit the Refresh button there to view your newly created workflow.

Now that the OTA Update Workflow is created, it is ready to be deployed to a select group of devices.

Note

If you plan to use the Remote Jobs REST APIs to create a workflow and deploy a new job to your remote IoT devices, the "Command" field in the JSON data of the REST API's HTTP request body should contain the following command:

"echo `copy paste the contents of the script.sh file above` > /usr/local/bin/script.sh; sh /usr/local/bin/script.sh"

The above command uploads the contents of the local script.sh file to the /usr/local/bin/script.sh file on your remote IoT device and then executes the uploaded file as a shell script on the device.
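
For illustration, such a request could be sent with curl, as sketched below. Note that only the "Command" field shown here comes from this documentation; the endpoint URL and the authorization header are placeholder assumptions, so refer to the SocketXP Remote Jobs REST API reference for the actual endpoint, authentication mechanism, and any additional payload fields.

# Hypothetical sketch only: the endpoint URL and auth header below are placeholders.
curl -X POST "https://<socketxp-remote-jobs-api-endpoint>" \
     -H "Authorization: Bearer <your-auth-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "Command": "echo `copy paste the contents of the script.sh file above` > /usr/local/bin/script.sh; sh /usr/local/bin/script.sh"
         }'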

Step #2: Create an OTA Update Deployment

An OTA Update Deployment is the actual deployment of the OTA Update Workflow, defined in the previous step, to a select group of devices. This step is very simple: you select a workflow and specify a device group or tag on which you want the workflow to be deployed.

How to create an OTA Update Deployment

Go to the WORKFLOWS tab and click the '+' icon next to any workflow listed there. This will pop up a Create a new deployment window.

Create a new deployment as described below:

  • Workflow ID - This pre-filled data shows the workflow ID used for the deployment to be created.
  • Workflow Name - This pre-filled data shows the workflow name used for the deployment to be created.
  • Deployment Name - Specify a name to remember this deployment for your reference.
  • Device Group or Tag - Specify a Device Group or a Device Tag (Eg: testing, production, customer-xyz, temp-sensor, etc.) on which to deploy the OTA update.
  • Finally, click the CREATE DEPLOYMENT button to save and finish.

Go to the DEPLOYMENTS tab and hit the Refresh button to view your newly created deployment.

Note

You can create any number of OTA Update Deployments from a single OTA Update Workflow. Each deployment can target a different device group or tag, such as development, testing, or production. This way the workflow and the workflow script can be reused across deployments.

You can monitor the progress of the OTA Update Deployment by clicking the Refresh button.

You can monitor the progress of the deployment on individual devices in the Device Group or Device Tag by simply clicking the Deployment row. A Deployment Progress window will pop up. The window contains a table showing the deployment progress for each of the devices in the group or tag on which the deployment was scheduled. Hit the Refresh button in the window to view the progress.

You can also view the logs (workflow script execution logs and error logs) for each device by clicking the VIEW LOG button next to the device. If your job shows a failed or error status, check the stderr logs.