
Release pipeline follow-up — ADR-035

Status: Open
Owner: backbone maintainers
Tracking issue: #287
Source ADR: ADR-035 (Deployment Posture and Release Pipeline)

This is the working document for the open items left behind by ADR-035. The release pipeline (.github/workflows/release.yml) and the self-hoster compose setup are shipped; what remains is validation, hardening, and the managed-hosting extension. Each section below says what to do, when to do it, and how to know it is done.


Item 1 — Smoke-test the release workflow

When: Immediately. Do this before tagging any real release.

Prerequisite: .github/workflows/release.yml merged to main (done as of 2026-04-28).

Steps

  1. Create a throwaway branch from main:
git checkout main && git pull
git checkout -b chore/smoke-test-release
  2. Push a release-candidate tag from the branch. Use a clearly non-production name:
git tag v0.0.0-rc1
git push origin v0.0.0-rc1

  3. Watch the workflow run at https://github.com/CortoMaltese3/climate-lama/actions/workflows/release.yml. Both jobs (build-backbone and build-worker) must complete green.

  4. Verify rc tag semantics. Because v0.0.0-rc1 contains -rc, the workflow must push only the versioned tag and leave :latest alone. Confirm:

     • ghcr.io/cortomaltese3/climate-lama:v0.0.0-rc1 exists in GHCR.
     • ghcr.io/cortomaltese3/climate-lama:latest was not updated (check the "Last pushed" timestamp on the latest tag in GHCR; if no stable tag has been pushed before, latest should not exist yet).
     • The same holds for the worker image under climate-lama-worker.

  5. Verify that a stable tag pushes latest. A tag that does not contain -rc makes the workflow push :latest as well:
git tag v0.0.0-test1
git push origin v0.0.0-test1

Wait for the workflow to finish again, then confirm both v0.0.0-test1 and latest are updated in GHCR.

  6. Test docker compose pull on a clean checkout:
cd /tmp && git clone https://github.com/CortoMaltese3/climate-lama.git cl-smoke
cd cl-smoke
CLIMATE_LAMA_TAG=v0.0.0-test1 docker compose pull api worker

Both images must pull without error.

  7. Clean up: delete the throwaway tags and GHCR images:
git push origin --delete v0.0.0-rc1
git push origin --delete v0.0.0-test1
git tag -d v0.0.0-rc1 v0.0.0-test1

Then delete the packages from the GHCR UI:

  • https://github.com/CortoMaltese3/climate-lama/pkgs/container/climate-lama
  • https://github.com/CortoMaltese3/climate-lama/pkgs/container/climate-lama-worker

  8. Delete the smoke-test branch (nothing was committed to it):
git checkout main && git branch -d chore/smoke-test-release

Verify

  • Both build-backbone and build-worker jobs green.
  • rc tag present in GHCR; latest unchanged.
  • Stable tag present in GHCR; latest updated.
  • docker compose pull on a fresh checkout succeeds.
  • All throwaway tags and images deleted from GHCR and git.
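
The tag rule exercised in steps 4 and 5 can be written down as a small shell sketch, useful as a checklist of which image tags to expect in GHCR after each push. It assumes that "the tag contains -rc" is the whole rule, as described above; the function names is_rc_tag and expected_image_tags are hypothetical and do not come from release.yml.

```shell
#!/bin/sh
# Sketch only: predicts the image tags a pushed git tag should produce.
# Assumption: the workflow skips :latest exactly when the tag contains "-rc".

# Succeeds (exit 0) when the git tag is a release candidate.
is_rc_tag() {
  case "$1" in
    *-rc*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Prints one expected image tag per line for the given git tag.
expected_image_tags() {
  if is_rc_tag "$1"; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$1" latest
  fi
}
```

Calling expected_image_tags v0.0.0-rc1 lists only the versioned tag, while expected_image_tags v0.0.0-test1 also lists latest. If GHCR shows anything else, the workflow's tag logic differs from this document.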

Item 2 — Multi-arch builds (arm64)

When: When the managed-hosting platform is decided (see docs/plan/phase-managed-hosting.md). If the platform runs on arm64 nodes (e.g., Hetzner ARM, Graviton on AWS), or self-hosters on Apple Silicon request it, pull this item forward. Otherwise, defer until the hosting decision lands.

Effort: Low — one-line change to the release workflow plus a QEMU step.

Steps

  1. Confirm the target platform from the hosting ADR. Proceeding without a confirmed platform wastes CI minutes on an arch nobody runs.

  2. Add QEMU and Buildx setup to both jobs in .github/workflows/release.yml:

- name: Set up QEMU
  uses: docker/setup-qemu-action@v3

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3
  with:
    platforms: linux/amd64,linux/arm64
  3. Add platforms: to docker/build-push-action in both jobs:
- name: Build and push backbone image
  uses: docker/build-push-action@v6
  with:
    platforms: linux/amd64,linux/arm64
    # ... rest unchanged
  4. Verify Dockerfiles are architecture-clean. On an amd64 host this requires QEMU emulation to be available to Docker. Run:
docker buildx build --platform linux/arm64 --load -t cl-arm-test -f docker/core.Dockerfile .
docker buildx build --platform linux/arm64 --load -t cl-worker-arm-test -f docker/worker.Dockerfile .

Both must complete without error. Common failure: pinned base images without multi-arch manifests, or native binary dependencies compiled for amd64 only.

  5. Push a test rc tag and verify both amd64 and arm64 manifests appear in the GHCR image's "OS / Arch" column.

  6. Update ADR-035: add a one-liner under the Decision section noting that arm64 support was added, with the date.

Verify

  • GHCR shows a multi-arch manifest list (not a single-arch image) for both climate-lama and climate-lama-worker.
  • docker pull --platform linux/arm64 ghcr.io/cortomaltese3/climate-lama:latest succeeds on an arm64 machine or via --platform emulation.
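
The first check above can also be done from a terminal instead of the GHCR UI. The sketch below reads manifest JSON on stdin, for example from docker manifest inspect, and succeeds only when both architectures are listed. The function names has_arch and is_multi_arch are hypothetical; the grep pattern assumes the usual "architecture" field of a manifest list, which a single-arch image manifest does not carry.

```shell
#!/bin/sh
# Sketch only: checks manifest JSON on stdin for the expected architectures.

# Usage: docker manifest inspect IMAGE | has_arch arm64
has_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Succeeds only when both amd64 and arm64 entries are present.
is_multi_arch() {
  manifest_json=$(cat)
  printf '%s\n' "$manifest_json" | has_arch amd64 &&
    printf '%s\n' "$manifest_json" | has_arch arm64
}

# Example (needs registry access, so it is left commented out):
# docker manifest inspect ghcr.io/cortomaltese3/climate-lama:latest | is_multi_arch \
#   && echo "multi-arch" || echo "single-arch or missing arm64"
```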

Item 3 — Image signing (cosign / Sigstore)

When: Hardening wave — after the first public release (v0.1.0) ships and the basic pipeline is proven stable. Filing a separate issue to track this is recommended before starting.

Rationale from ADR-035: Explicitly out of scope for §6.13; called out as a separate hardening item. Image signing proves supply-chain integrity — that the image in GHCR was built by this repo's workflow and not tampered with.

Steps

  1. Install cosign in the release workflow (keyless signing via OIDC — no secrets required):
- name: Install cosign
  uses: sigstore/cosign-installer@v3
  2. Add the id-token: write permission to the job (required for OIDC keyless signing):
permissions:
  contents: read
  packages: write
  id-token: write
  3. Sign the pushed image digest after the build-push-action step. Use the digest output from the build step (not the tag, which is mutable). The steps.build reference below assumes the build-push-action step is declared with id: build:
- name: Sign image
  env:
    DIGEST: ${{ steps.build.outputs.digest }}
    IMAGE: ghcr.io/cortomaltese3/climate-lama
  run: cosign sign --yes "${IMAGE}@${DIGEST}"
  4. Verify the signature in a test run:
cosign verify \
  --certificate-identity-regexp "https://github.com/CortoMaltese3/climate-lama/.*" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  ghcr.io/cortomaltese3/climate-lama:v0.1.0
  5. Document the verification command in docs/deployment/upgrading.md so self-hosters can verify images before running them.

  6. Update ADR-035: note the signing method and date under Decision.

Verify

  • cosign verify succeeds for a freshly pushed image tag using the commands above.
  • The verification command is documented in docs/deployment/upgrading.md.
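
Step 3 insists on signing the digest rather than the tag. A guard like the following makes that rule hard to break by accident. It is a sketch under stated assumptions: is_digest_pinned and sign_by_digest are hypothetical helper names, and only the cosign sign --yes invocation is taken from the steps above.

```shell
#!/bin/sh
# Sketch only: refuse to sign anything that is not pinned to a sha256 digest.

# Succeeds for references of the form NAME@sha256:<64 lowercase hex characters>.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Signs the reference with cosign, but only when it is digest-pinned.
sign_by_digest() {
  if ! is_digest_pinned "$1"; then
    echo "refusing to sign mutable reference: $1" >&2
    return 1
  fi
  cosign sign --yes "$1"
}
```

With this in place, sign_by_digest ghcr.io/cortomaltese3/climate-lama:latest fails before cosign is ever invoked, because a tag can be moved to a different image after signing.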

Item 4 — SBOM generation

When: Compliance wave — when a design partner or regulatory requirement demands it, or when the project reaches a maturity level where a published SBOM is expected. File a separate issue before starting.

Rationale from ADR-035: Explicitly out of scope for §6.13; called out as a separate compliance item.

Steps

  1. Add anchore/sbom-action (which runs syft under the hood) to both release jobs after the image push:
- name: Generate SBOM
  uses: anchore/sbom-action@v0
  with:
    image: ghcr.io/cortomaltese3/climate-lama:${{ github.ref_name }}
    artifact-name: sbom-backbone-${{ github.ref_name }}.spdx.json
    output-file: sbom-backbone.spdx.json
  2. Attach the SBOM to the GitHub Release (create the release if not already part of the workflow) or upload it as a workflow artifact.

  3. Optionally attest the SBOM using cosign (requires Item 3 complete):

- name: Attest SBOM
  run: |
    cosign attest --yes \
      --predicate sbom-backbone.spdx.json \
      --type spdxjson \
      ghcr.io/cortomaltese3/climate-lama@${{ steps.build.outputs.digest }}
  4. Verify by downloading the attestation:
cosign download attestation ghcr.io/cortomaltese3/climate-lama:v0.1.0 \
  | jq '.payload | @base64d | fromjson | .predicate.packages | length'

Should return a non-zero package count.

  5. Update ADR-035 and add a note to docs/deployment/upgrading.md.

Verify

  • SBOM artifact is attached to the GitHub Release or published as an attestation on the GHCR image.
  • Package count in the SBOM is non-zero and includes known dependencies.
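
The package-count check can also run against the generated file directly, before anything is attached or attested. The sketch below assumes the SBOM is SPDX JSON with a top-level packages array (the layout the jq expression in step 4 relies on) and that jq is installed. The function name sbom_package_count is hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the package count of an SPDX JSON file and fails on zero.
# Assumption: packages live in a top-level "packages" array.

sbom_package_count() {
  count=$(jq '.packages | length' "$1") || return 1
  echo "$count"
  [ "$count" -gt 0 ]
}

# Example, using the output-file name from step 1:
# sbom_package_count sbom-backbone.spdx.json || echo "SBOM is empty" >&2
```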

Item 5 — Helm chart skeleton (managed-hosting unpark)

When: Managed-hosting phase unparks. See docs/plan/phase-managed-hosting.md. Do not start until the platform decision (Kubernetes provider, ArgoCD vs. Helm-only) is made as part of that phase. Starting earlier just produces a chart that gets thrown away.

Scope: Helm chart per service (api, worker) under infra/helm/; ArgoCD Application manifest; cert-manager Certificate resource for TLS. See ADR-035 Decision §Hosted path (sketched) for the intended architecture.

Steps

  1. Confirm the platform from the hosting ADR (will be a follow-up to ADR-035 or a new ADR). The chart structure depends on whether the provider is a managed K8s (EKS, GKE, AKS, DOKS, Hetzner K8s) or a self-managed cluster.

  2. Scaffold the charts:

infra/helm/
  climate-lama/          # API service
    Chart.yaml
    values.yaml
    templates/
      deployment.yaml
      service.yaml
      ingress.yaml
      hpa.yaml
  climate-lama-worker/   # Worker service
    Chart.yaml
    values.yaml
    templates/
      deployment.yaml
      hpa.yaml

values.yaml for both charts must expose:

  • image.repository and image.tag (image.tag defaults to latest; CI overrides it to the release tag)
  • replicaCount
  • resources.requests / resources.limits
  • env block for environment variables not sourced from the secrets backend
  • secretsBackend stanza (provider-specific; wired up in Item 6 of the secrets-backend follow-up, see docs/ops/secrets-backend-followup.md)

  3. Add an ArgoCD Application manifest under infra/argocd/ pointing at the chart source and the target cluster namespace.

  4. Add cert-manager Certificate and Issuer resources for TLS. Use Let's Encrypt staging first, then production once DNS is confirmed.

  5. Test the chart against a local kind/k3d cluster before pointing ArgoCD at it:

helm install climate-lama ./infra/helm/climate-lama \
  --set image.tag=latest \
  --dry-run --debug
  6. Wire up CI: add a helm lint + helm template check to the existing ci.yml workflow so chart regressions are caught on every PR.

  7. Update ADR-035: mark the hosted path as "implemented" and record the platform and chart repo path.

Verify

  • helm lint infra/helm/climate-lama and helm lint infra/helm/climate-lama-worker both pass.
  • helm install --dry-run against a local kind cluster produces valid manifests.
  • ArgoCD Application syncs green in the staging namespace.
  • TLS certificate is issued by Let's Encrypt for the staging domain.
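
The CI check from step 6 could be a short script that the ci.yml job calls. This is a minimal sketch, assuming every chart sits in its own directory under infra/helm/ as in the scaffold above. The function name check_charts and the release name ci-check are hypothetical.

```shell
#!/bin/sh
# Sketch only: lint and render every chart under a root directory.
# Returns non-zero if any chart fails either check.

check_charts() {
  root=${1:-infra/helm}
  rc=0
  for chart in "$root"/*/; do
    # Skip directories that are not Helm charts.
    [ -f "${chart}Chart.yaml" ] || continue
    helm lint "$chart" || rc=1
    # Render with default values; only the exit status matters here.
    helm template ci-check "$chart" >/dev/null || rc=1
  done
  return "$rc"
}

# Example (from the repository root):
# check_charts infra/helm
```

Both checks run for every chart even after a failure, so a single CI run reports all broken charts rather than only the first one.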

Definition of done (all items)

Item                     Done when
Item 1 (smoke test)      Both workflow jobs green; rc/stable tag semantics verified; all throwaway artefacts deleted
Item 2 (arm64)           Multi-arch manifest in GHCR; docker pull --platform linux/arm64 succeeds
Item 3 (image signing)   cosign verify passes for new tags; verification command in upgrading.md
Item 4 (SBOM)            SBOM attached to GitHub Release or GHCR attestation; non-zero package count
Item 5 (Helm chart)      helm lint green; ArgoCD sync green; TLS issued in staging

Item 1 must be done before any real release tag is pushed. Items 2-5 can be picked up in any order once their respective "when" trigger fires. The only cross-dependency is the optional SBOM attestation in Item 4, which requires Item 3.