Production Hotfix Release¶
3AM Quick Guide¶
- Use this when production needs one urgent fix and the current release branch may contain unreleased staging changes.
- Record the currently deployed production SHA from the target repository's Operate -> Environments page.
- Restore release back to what production is already running, commit that reset, and push it. This replaces the files in your local release checkout with the files from the production commit so your next commit makes release match production again.
git restore --source="${PRODUCTION_SHA1}" --staged --worktree .
git commit --allow-empty --message "fix: restore release to last production state"
git push origin release
git fetch origin release
git diff "${PRODUCTION_SHA1}"..origin/release
# output should be empty
- After that clean baseline is on origin/release, cherry-pick the approved hotfix into release using the GitLab UI.
- The release pipeline is manually activated, so it is safe to push, revert, and cherry-pick on release until you are ready to start the release.
- Open the orchestration pipeline and run only release <service name>: https://gitlab.com/evenergi/develop/-/pipelines/new
- Run the downstream manual production deployment pipeline as well. Both gates are manual by design for safety and control.
- Do not run cut-off. That can repopulate release with unrelated unreleased changes.
- Watch the production deploy settle, run the focused smoke checks, and update the GitLab incident record.
Purpose And When To Use It¶
This runbook explains how to rebuild the release branch from the last known
production deployment, push that clean baseline, cherry-pick a merged hotfix
onto it, and trigger a production release for the affected service.
Use this workflow when production needs an urgent fix and the current
release branch may already contain unreleased staging changes that must not
be promoted with the hotfix.
Do not use this workflow for the normal weekly release, for routine staging promotion, or when the issue can wait for the standard release cadence.
Prerequisites And Permissions¶
Before using this workflow:
- the hotfix merge request must already be merged and approved for production
- you must be able to identify the last known-good production deployment commit
- you must have permission to push to the release branch
- you must have permission to create and run the production release pipeline and the downstream manual production deployment pipeline
- you must have access to the target service's production environment page, logs, and health checks
- you must have permission to create or update the GitLab incident record
- you must know the smoke checks or focused validations required after the production deploy
In the current BetterFleet setup, operators typically inspect the deployed
production SHA from the target repository's Operate -> Environments page, and
they trigger the shared orchestration pipeline from:
https://gitlab.com/evenergi/develop/-/pipelines/new
The service-specific manual job usually follows the pattern
release <service name>, for example 01 release vemo-core.
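Before touching release, it can help to confirm that the recorded SHA actually exists in your clone and that the checkout is clean. The following is a minimal sketch, not part of the existing tooling: the helper name preflight_check is hypothetical, and it assumes you run it inside a clone of the target repository.

```shell
#!/usr/bin/env bash
# Preflight sketch. preflight_check is a hypothetical helper name.
# Run it inside a clone of the target repository.
preflight_check() {
  local sha="$1"

  # The recorded SHA must resolve to a commit object in this clone.
  if ! git cat-file -e "${sha}^{commit}" 2>/dev/null; then
    echo "preflight: ${sha} is not a known commit; try 'git fetch --all'" >&2
    return 1
  fi

  # The restore step overwrites tracked files, so require a clean tree
  # (no modified, staged, or untracked files).
  if [ -n "$(git status --porcelain)" ]; then
    echo "preflight: working tree is not clean" >&2
    return 1
  fi

  echo "preflight: ok ($(git rev-parse --short "${sha}"))"
}
```

A typical call would be `preflight_check "${PRODUCTION_SHA1}"` right after exporting the variable. A non-zero exit status means the restore should not proceed yet.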
Normal Procedure¶
- Capture the last known-good production SHA. Open the target repository's production environment and record the commit currently deployed to production.
- Restore, commit, and push release from that production SHA. Use a clean local checkout to replace the files on release with the files from the recorded production commit, then create and push a new commit that represents the rollback to the last production state. The git restore command below does not change which commit release points to yet. It simply swaps the checked-out files to match PRODUCTION_SHA1 so the next commit recreates that last known-good production state on release.
export PRODUCTION_SHA1="<last-production-sha>"
git fetch --all
git checkout -B release origin/release
git restore --source="${PRODUCTION_SHA1}" --staged --worktree .
git commit --allow-empty --message "fix: restore release to last production state"
git push origin release
git fetch origin release
git diff "${PRODUCTION_SHA1}"..origin/release
# output should be empty
- Cherry-pick the merged hotfix into release. After the restore commit is on origin/release, use the GitLab merge request cherry-pick action so only the approved hotfix is layered on top of the restored production baseline. The release pipeline is manual, so pushing the restore commit first is safe and gives you a clean point from which to cherry-pick, revert, or retry as needed before releasing.
- Verify the release branch contents. Confirm that origin/release now contains the restore commit plus only the intended hotfix commit or commits.
- Create a new orchestration pipeline. Open the shared orchestration pipeline page and start a fresh pipeline for the release flow.
- Run the service release job. Select the manual job for the target service, usually named release <service name>.
- Do not run the cut-off job. The cut-off job syncs the default branch into release, which can pull in unrelated unreleased changes and defeat the purpose of this recovery path.
- Run and watch the manual production deployment pipeline through to completion. The downstream production deployment pipeline is manual as well, for the same safety and control reasons as the release pipeline. Trigger only the intended production deployment path, then follow the downstream pipeline, environment page, service logs, and health checks until the deployment settles.
- Validate the production hotfix. Run the focused production smoke checks needed for the incident or defect.
- Record the operational incident. Create or update the GitLab incident with the problem, timeline, deployed fix, validation result, and any remaining follow-up actions.
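The "verify the release branch contents" step can be done from the command line after the cherry-pick lands. The sketch below is one possible way to do it, under stated assumptions: show_release_delta is a hypothetical helper, RESTORE_COMMIT stands for the SHA of the restore commit you pushed, and `git fetch origin release` has already been run.

```shell
#!/usr/bin/env bash
# Sketch: show what the release branch carries on top of production.
# show_release_delta is a hypothetical helper name.
show_release_delta() {
  local production_sha="$1" restore_commit="$2" target="${3:-origin/release}"

  # Commits added after the restore commit:
  # expect only the hotfix cherry-pick(s).
  git log --oneline "${restore_commit}..${target}"

  # Tree difference against production:
  # expect only the files the hotfix touches.
  git diff --name-status "${production_sha}" "${target}"
}
```

For example, `show_release_delta "${PRODUCTION_SHA1}" "${RESTORE_COMMIT}"` should list the hotfix commit and its files only. Any unrelated path in the second half of the output suggests that unreleased changes are still present on release.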
Reference Screenshots¶
These screenshots use one example service and repository. Your target project
name and release <service name> job label may differ.
Steps 1 And 8: Check The Environments Page Before And After The Release¶
Use Operate -> Environments to identify the last known-good production
deployment before restoring release, and then revisit the same page after the
release to confirm production now points at the intended hotfix deployment.

Step 3: Cherry-Pick Only The Approved Hotfix Into release¶
| What To Check | Screenshot |
|---|---|
| Start from the merged merge request and use the Cherry-pick action. | ![]() |
| In the cherry-pick dialog, target the release branch so the hotfix lands on the restored production baseline. | ![]() |
Steps 5 Through 7: Run The Service Release Job¶
Use the orchestration pipeline to run the release <service name> job for the
affected service. The key safety check is to choose the service release job and
not the cut-off job, because cut-off can repopulate release with
unrelated unreleased changes.

Decision Points And Exceptions¶
- If the current release branch already matches the last production state and contains no unreleased changes, you can skip the restore step and promote only the hotfix.
- If the hotfix depends on unreleased release-branch changes, stop and escalate the decision. A selective production hotfix is no longer the right workflow.
- If the production environment page does not clearly show the last deployed SHA, stop and gather evidence from deploy logs, release records, or the last successful production pipeline before changing release.
- If the cherry-pick conflicts, resolve locally, inspect the final tree carefully, and push only after confirming that unrelated changes were not pulled into release.
- If multiple services are involved, repeat the release and validation steps for each affected service and record which services were changed.
- If the issue turns out to be operational rather than code-related, stop using this runbook and switch to the service-specific incident or infrastructure recovery path.
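The first decision point, whether release already matches production, can be checked mechanically by comparing the two trees. This is a minimal sketch under assumptions: release_matches_production is a hypothetical helper, and origin/release is assumed to be freshly fetched.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the restore step can be skipped.
# release_matches_production is a hypothetical helper name.
release_matches_production() {
  local production_sha="$1" target="${2:-origin/release}"

  # git diff --quiet exits 0 when the two trees are identical
  # and 1 when they differ.
  git diff --quiet "${production_sha}" "${target}"
}

# Example usage (commented out so the helper can be sourced safely):
# if release_matches_production "${PRODUCTION_SHA1}"; then
#   echo "release already matches production; restore can be skipped"
# else
#   echo "release differs from production; run the restore step first"
# fi
```

The check compares file contents only, not commit history, which matches the intent of the restore step: what matters is that the files on release equal the files production is running.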
Validation And Evidence¶
Treat the production hotfix as complete only when you can point to evidence for each of these checks:
- the last known-good production SHA was recorded before the restore
- the git diff "${PRODUCTION_SHA1}"..release check was empty immediately after the restore commit
- the release branch contains only the restore step plus the intended hotfix
- the correct service release job was run, not cut-off
- the production environment reflects the intended hotfix commit or build
- the focused production smoke checks passed
- the GitLab incident records the timeline, fix, validation evidence, and any follow-up work
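Some of the git-side evidence above can be captured as plain text and pasted into the incident. The sketch below is illustrative only: print_hotfix_evidence and the field names it prints are hypothetical, not an existing BetterFleet convention, and it covers only what git can report (pipeline and smoke-check evidence still has to be gathered separately).

```shell
#!/usr/bin/env bash
# Sketch: print git-side evidence for the incident record.
# print_hotfix_evidence and its field names are hypothetical.
print_hotfix_evidence() {
  local production_sha="$1" target="${2:-origin/release}"

  echo "recorded_at_utc: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "production_sha_before: $(git rev-parse "${production_sha}")"
  echo "release_head: $(git rev-parse "${target}")"

  # Files on the target that differ from the production baseline.
  echo "files_changed_vs_production:"
  git diff --name-only "${production_sha}" "${target}" | sed 's/^/  - /'
}
```

Running `print_hotfix_evidence "${PRODUCTION_SHA1}"` after the cherry-pick gives a short summary that can sit alongside the pipeline links and smoke-check results in the incident.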
Rollback And Recovery¶
If the production hotfix fails or causes a regression:
- repeat the release-branch restore procedure using the last known-good production SHA
- push the restored release branch
- rerun the relevant release <service name> job
- recheck production health, smoke tests, and monitoring
- update the GitLab incident with the rollback time and observed outcome
If the service does not recover after the code rollback, stop using this runbook and follow the service-specific incident, infrastructure, or data recovery documentation.
Links To Service-Specific Details¶
- Shared CI and release context: CI and Release Integration
- BetterFleet service lookup: Service Matrix
- BetterFleet Manage service docs: Manage Services
- BetterFleet Plan service docs: Plan Services
- GitLab incident template: https://gitlab.com/evenergi/develop/-/issues/new?issuable_template=incident&type=INCIDENT
- Target repository details: the target repository's .gitlab-ci.yml, release jobs, environment page, and service-specific operational docs

