Production Hotfix Release

3AM Quick Guide

Use this when production needs one urgent fix and the current release branch may contain unreleased staging changes.
Record the currently deployed production SHA from the target repository's Operate -> Environments page.
Restore release to what production is already running, commit that restore, and push it. The restore replaces the files in your local release checkout with the files from the production commit, so your next commit makes release match production again.

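# PRODUCTION_SHA1 is the SHA recorded from Operate -> Environments
git fetch --all
git checkout -B release origin/release
# Rewrite the index and working tree to match production, then commit and push
git restore --source="${PRODUCTION_SHA1}" --staged --worktree .
git commit --allow-empty --message "fix: restore release to last production state"
git push origin release
git fetch origin release
git diff "${PRODUCTION_SHA1}"..origin/release
# output should be empty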

After that clean baseline is on origin/release, cherry-pick the approved hotfix into release using the GitLab UI.
The release pipeline is manually activated, so it is safe to push, revert, and cherry-pick on release until you are ready to start the release.
Open the orchestration pipeline and run only release <service name>: https://gitlab.com/evenergi/develop/-/pipelines/new
Run the downstream manual production deployment pipeline as well. Both gates are manual by design for safety and control.
Do not run cut-off. Running it can repopulate release with unrelated unreleased changes.
Watch the production deploy settle, run the focused smoke checks, and update the GitLab incident record.
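
Because the restore depends entirely on the recorded SHA, it can help to sanity-check the value before running anything. The following is a minimal sketch, assuming the repository uses 40-character lowercase SHA-1 commit IDs (as the variable name PRODUCTION_SHA1 suggests). The helper name is_full_sha is hypothetical and not part of any existing tooling.

```shell
# Hypothetical guard: succeed only for a full 40-character lowercase hex SHA.
# Assumes SHA-1 object IDs; an abbreviated SHA or a leftover placeholder fails.
is_full_sha() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # reject any character that is not lowercase hex
  esac
  [ "${#1}" -eq 40 ]           # reject empty, abbreviated, or overlong values
}
```

A possible use is `is_full_sha "${PRODUCTION_SHA1}" || echo "PRODUCTION_SHA1 is not a full commit SHA"` before the restore command.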

Purpose And When To Use It

This runbook explains how to rebuild the release branch from the last known production deployment, push that clean baseline, cherry-pick a merged hotfix onto it, and trigger a production release for the affected service.

Use this workflow when production needs an urgent fix and the current release branch may already contain unreleased staging changes that must not be promoted with the hotfix.

Do not use this workflow for the normal weekly release, for routine staging promotion, or when the issue can wait for the standard release cadence.

Prerequisites And Permissions

Before using this workflow:

the hotfix merge request must already be merged and approved for production
you must be able to identify the last known-good production deployment commit
you must have permission to push to the release branch
you must have permission to create and run the production release pipeline and the downstream manual production deployment pipeline
you must have access to the target service's production environment page, logs, and health checks
you must have permission to create or update the GitLab incident record
you must know the smoke checks or focused validations required after the production deploy

In the current BetterFleet setup, operators typically read the deployed production SHA on the target repository's Operate -> Environments page, and they trigger the shared orchestration pipeline from:

https://gitlab.com/evenergi/develop/-/pipelines/new

The service-specific manual job usually follows the pattern release <service name>, for example 01 release vemo-core.
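
The naming pattern above can be checked mechanically before a job is started, which is a cheap guard against selecting the wrong job at 3AM. This is only a sketch: the helper name is_release_job_for is hypothetical, and the optional numeric prefix is an assumption inferred from the single example 01 release vemo-core.

```shell
# Hypothetical check: does a job label match "release <service name>"?
# Assumption: labels may carry a numeric ordering prefix such as "01 ".
is_release_job_for() {
  job_label="$1"
  service="$2"
  # Drop an optional leading run of digits followed by spaces.
  stripped="$(printf '%s' "$job_label" | sed -E 's/^[0-9]+ +//')"
  [ "$stripped" = "release $service" ]
}
```

For example, `is_release_job_for "01 release vemo-core" "vemo-core"` succeeds, while a label such as cut-off does not.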

Normal Procedure

1. Capture the last known-good production SHA. Open the target repository's production environment and record the commit currently deployed to production.
2. Restore, commit, and push release from that production SHA. Use a clean local checkout to replace the files on release with the files from the recorded production commit, then create and push a new commit that returns release to the last production state. The git restore command below does not move the release branch pointer. It only rewrites the index and working tree to match PRODUCTION_SHA1, so the next commit recreates that last known-good production state on top of the existing release history. Run it from the repository root, because the . pathspec is relative to the current directory.

export PRODUCTION_SHA1="<last-production-sha>"

git fetch --all
git checkout -B release origin/release

git restore --source="${PRODUCTION_SHA1}" --staged --worktree .
git commit --allow-empty --message "fix: restore release to last production state"

git push origin release
git fetch origin release

git diff "${PRODUCTION_SHA1}"..origin/release
# output should be empty

3. Cherry-pick the merged hotfix into release. After the restore commit is on origin/release, use the GitLab merge request cherry-pick action so only the approved hotfix is layered on top of the restored production baseline. The release pipeline is manual, so pushing the restore commit first is safe and gives you a clean point from which to cherry-pick, revert, or retry as needed before releasing.
4. Verify the release branch contents. Confirm that origin/release now contains the restore commit plus only the intended hotfix commit or commits.
5. Create a new orchestration pipeline. Open the shared orchestration pipeline page and start a fresh pipeline for the release flow.
6. Run the service release job. Select the manual job for the target service, usually named release <service name>.
7. Do not run the cut-off job. The cut-off job syncs the default branch into release, which can pull in unrelated unreleased changes and defeat the purpose of this recovery path.
8. Run and watch the manual production deployment pipeline through to completion. The downstream production deployment pipeline is manual as well, for the same safety and control reasons as the release pipeline. Trigger only the intended production deployment path, then follow the downstream pipeline, environment page, service logs, and health checks until the deployment settles.
9. Validate the production hotfix. Run the focused production smoke checks needed for the incident or defect.
10. Record the operational incident. Create or update the GitLab incident with the problem, timeline, deployed fix, validation result, and any remaining follow-up actions.
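
The branch-content verification in the procedure above can be sketched with two small git helpers. Both helper names and the RESTORE_SHA variable are hypothetical. The idea is that, once the restore commit and the hotfix are on origin/release, the net file difference against production should be exactly the files the hotfix touched, and the commits after the restore commit should be only the hotfix commit or commits.

```shell
# Hypothetical helper: paths whose content differs between two commits/refs.
# Compared against production, this should list only the hotfix's files.
changed_paths() {
  git diff --name-only "$1" "$2"
}

# Hypothetical helper: commits reachable from $2 but not from $1.
# With $1 set to the restore commit, this should show only the hotfix.
commits_since() {
  git log --oneline --no-decorate "$1..$2"
}
```

After `git fetch origin release`, a possible invocation is `changed_paths "${PRODUCTION_SHA1}" origin/release` followed by `commits_since "${RESTORE_SHA}" origin/release`, where RESTORE_SHA is the SHA of the restore commit you pushed.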

Reference Screenshots

These screenshots use one example service and repository. Your target project name and release <service name> job label may differ.

Steps 1 And 8: Check The Environments Page Before And After The Release

Use Operate -> Environments to identify the last known-good production deployment before restoring release, and then revisit the same page after the release to confirm production now points at the intended hotfix deployment.

[Screenshot: GitLab environments view showing the production and staging deployment history]

Step 3: Cherry-Pick Only The Approved Hotfix Into `release`

What to check:
Start from the merged merge request and use the `Cherry-pick` action.
In the cherry-pick dialog, target the `release` branch so the hotfix lands on the restored production baseline.

Steps 5 Through 7: Run The Service Release Job

Use the orchestration pipeline to run the release <service name> job for the affected service. The key safety check is to choose the service release job and not the cut-off job, because cut-off can repopulate release with unrelated unreleased changes.

[Screenshot: Release-stage pipeline showing a service-specific release job]

Decision Points And Exceptions

If the current release branch already matches the last production state and contains no unreleased changes, you can skip the restore step and promote only the hotfix.
If the hotfix depends on unreleased release-branch changes, stop and escalate the decision. A selective production hotfix is no longer the right workflow.
If the production environment page does not clearly show the last deployed SHA, stop and gather evidence from deploy logs, release records, or the last successful production pipeline before changing release.
If the cherry-pick conflicts, resolve the conflict locally, inspect the final tree carefully, and push only after confirming that no unrelated changes were pulled into release.
If multiple services are involved, repeat the release and validation steps for each affected service and record which services were changed.
If the issue turns out to be operational rather than code-related, stop using this runbook and switch to the service-specific incident or infrastructure recovery path.
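
The first decision point, whether release already matches production, can be tested from the command line instead of judged by eye. A minimal sketch, assuming origin/release has just been fetched. The helper name release_matches_production is hypothetical. It relies on git diff --quiet, which exits 0 when the two trees are identical and 1 when they differ.

```shell
# Hypothetical helper: succeed when the release ref has the same tree as the
# production commit. $1 = production SHA, $2 = ref to compare (defaults to
# origin/release).
release_matches_production() {
  git diff --quiet "$1" "${2:-origin/release}"
}
```

A possible use is `release_matches_production "${PRODUCTION_SHA1}" && echo "restore step can be skipped"`. Note that this compares file contents only. It says nothing about whether the hotfix depends on unreleased changes, which remains a human decision.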

Validation And Evidence

Treat the production hotfix as complete only when you can point to evidence for each of these checks:

the last known-good production SHA was recorded before the restore
the git diff "${PRODUCTION_SHA1}"..origin/release check was empty immediately after the restore commit was pushed
the release branch contains only the restore step plus the intended hotfix
the correct service release job was run, not cut-off
the production deployment completed successfully for the target service
the production environment reflects the intended hotfix commit or build
the focused production smoke checks passed
the GitLab incident records the timeline, fix, validation evidence, and any follow-up work
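
One lightweight way to keep this evidence is to append a timestamped line per check to a local file and paste it into the incident afterwards. This is only a sketch of that idea. The helper name log_evidence, the tab-separated layout, and the file name are all hypothetical.

```shell
# Hypothetical helper: append "UTC timestamp <TAB> check <TAB> result" to a file.
# $1 = evidence file, $2 = name of the check, $3 = observed result
log_evidence() {
  printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$2" "$3" >> "$1"
}
```

For example, `log_evidence hotfix-evidence.tsv "restore diff" "empty"` records that the post-restore diff check passed and when it was run.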

Rollback And Recovery

If the production hotfix fails or causes a regression:

repeat the release-branch restore procedure using the last known-good production SHA
push the restored release branch
rerun the relevant release <service name> job
recheck production health, smoke tests, and monitoring
update the GitLab incident with the rollback time and observed outcome
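
Since the rollback list above reuses the restore procedure, it may be worth wrapping the local part of it in one function so it runs the same way each time. The sketch below uses the same git commands as the normal procedure. The function name restore_tree_to is hypothetical, and it deliberately stops before the push so the operator still reviews and pushes release by hand. It assumes a clean checkout of release with no untracked files that matter.

```shell
# Hypothetical wrapper around the documented restore commands.
# Runs in a subshell so the caller's working directory is left untouched.
restore_tree_to() (
  sha="$1"
  # Refuse to continue if the SHA is not a commit known to this clone.
  git cat-file -e "${sha}^{commit}" &&
  # The "." pathspec is relative to the current directory, so start at the root.
  cd "$(git rev-parse --show-toplevel)" &&
  git restore --source="$sha" --staged --worktree . &&
  git commit --quiet --allow-empty --message "fix: restore release to last production state" &&
  # Succeeds only if the new commit's tree is identical to the target commit.
  git diff --quiet "$sha" HEAD
)
```

A possible use is `restore_tree_to "${PRODUCTION_SHA1}" && git push origin release`, followed by the same fetch and empty-diff check against origin/release described in the normal procedure.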

If the service does not recover after the code rollback, stop using this runbook and follow the service-specific incident, infrastructure, or data recovery documentation.

Links To Service-Specific Details

Shared CI and release context: CI and Release Integration
BetterFleet service lookup: Service Matrix
BetterFleet Manage service docs: Manage Services
BetterFleet Plan service docs: Plan Services
GitLab incident template: https://gitlab.com/evenergi/develop/-/issues/new?issuable_template=incident&type=INCIDENT
Target repository details: the target repository's .gitlab-ci.yml, release jobs, environment page, and service-specific operational docs