v0.1.0 CI/Container Redesign Proposal

Purpose

Define a reproducible, modular, and cache-efficient developer/CI model for the v0.1.0 refactor so local macOS development and GitHub Actions execute the same verification logic with the same container identity.

Executive Summary

Keep the current fingerprinted CI image strategy and expand it into a single canonical cache architecture.
Use a configurable cache root for local container runs (default: .cargo-cache/) with explicit volume mounts for registry, git, sccache, and target outputs.
Keep one source of truth for commands in Make targets and scripts; GitHub Actions should only orchestrate those targets.
Use cargo nextest as the default PR test runner, with test tiers split by change risk (minimal PR gate vs full verification on push to main/develop).
Preserve granular PR visibility (fmt, clippy, compile, test, smoke) as separate required checks.
Keep build and verify execution inside the containerized CI image to preserve environment parity with local repro.

Current-State Evaluation

This repository already has strong foundations:

Content-addressed image fingerprinting (Dockerfile hash) and GHCR-backed CI image identity.
Containerized local reproducibility via ./dflow ci-verify.
Shared cache directories for containerized runs: .cargo-cache/{registry,git,sccache} and target/ci.
cargo nextest is already used in test-ci.
Granular status reporting via gh-check-wrapper.sh.

Primary gaps to address for v0.1.0:

Fingerprint scope drift: docs/scripts currently state broader fingerprint inputs, while vars.mk hashes only Dockerfile.
Cache root is not fully standardized/configurable as one top-level contract for local + CI.
CI check topology is efficient but still tightly coupled to shell backgrounding in one step; explicit check job contracts can improve maintainability.
PR gating tiers are not explicitly documented as “minimal gate” vs “full verification”.
The design is highly optimized for this Rust repository but is not yet framed as a reusable cross-stack platform.

Target Design

1) Canonical Build Identity

Keep image fingerprinting as the environment identity key.
Expand fingerprint inputs to match policy: Dockerfile, Makefile, install.sh, src-scripts/**, and rust-toolchain.toml.
Publish the CI image as ghcr.io/<repo>-ci:<fingerprint> and keep :latest as a convenience tag only.
Use immutable digest pinning in CI job container declarations when practical.
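
As a minimal sketch of the expanded fingerprint, the function below hashes the policy inputs listed above in a stable order. The function names, the choice of SHA-256, and the decision to mix each file path into the hash are assumptions for illustration; the existing vars.mk logic may differ.

```shell
set -eu

# Use whichever SHA-256 tool is present (sha256sum on Linux, shasum on macOS).
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# compute_fingerprint ROOT: print a hex digest over the fingerprint inputs
# found under ROOT. Inputs that do not exist are skipped.
compute_fingerprint() (
  cd "$1"
  {
    for f in Dockerfile Makefile install.sh rust-toolchain.toml; do
      if [ -f "$f" ]; then printf '%s\n' "$f"; fi
    done
    if [ -d src-scripts ]; then find src-scripts -type f; fi
  } | LC_ALL=C sort | while IFS= read -r path; do
    # Hash the path as well as the content so renames change the result.
    printf '%s\n' "$path"
    cat "$path"
  done | sha256_stdin
)
```

Running `compute_fingerprint .` from the repository root prints one digest that could serve as the image tag. Sorting with `LC_ALL=C` keeps the input order identical on macOS and Linux.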

2) Unified Cache Contract

Define one configurable cache root:

KROKI_CACHE_ROOT (default .cargo-cache)
Subdirs:
- ${KROKI_CACHE_ROOT}/registry
- ${KROKI_CACHE_ROOT}/git
- ${KROKI_CACHE_ROOT}/sccache
- ${KROKI_CACHE_ROOT}/target-ci
- ${KROKI_CACHE_ROOT}/image-tar
- ${KROKI_CACHE_ROOT}/buildx

Local container mounts:

Cargo registry/git -> mounted from KROKI_CACHE_ROOT.
SCCACHE_DIR -> mounted from KROKI_CACHE_ROOT/sccache.
CARGO_TARGET_DIR -> mounted from KROKI_CACHE_ROOT/target-ci.
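
The mounts above could be wired up roughly as follows. This is a sketch under stated assumptions: the container-side paths assume CARGO_HOME=/usr/local/cargo (as in the official rust images) and a /work checkout directory, and CONTAINER_ENGINE is a hypothetical override that allows dry runs. None of these are confirmed by the current scripts.

```shell
set -eu

# run_in_ci_container IMAGE CMD...: run CMD inside IMAGE with project-scoped
# caches mounted from KROKI_CACHE_ROOT (default .cargo-cache).
run_in_ci_container() (
  image="$1"
  shift
  root="${KROKI_CACHE_ROOT:-.cargo-cache}"
  mkdir -p "$root/registry" "$root/git" "$root/sccache" "$root/target-ci"
  abs_root=$(cd "$root" && pwd)  # bind mounts need absolute host paths

  # Container-side paths below are assumptions, not the project's actual layout.
  "${CONTAINER_ENGINE:-docker}" run --rm \
    -v "$PWD:/work" -w /work \
    -v "$abs_root/registry:/usr/local/cargo/registry" \
    -v "$abs_root/git:/usr/local/cargo/git" \
    -v "$abs_root/sccache:/cache/sccache" -e SCCACHE_DIR=/cache/sccache \
    -v "$abs_root/target-ci:/cache/target-ci" -e CARGO_TARGET_DIR=/cache/target-ci \
    "$image" "$@"
)
```

For example, `CONTAINER_ENGINE=echo run_in_ci_container ghcr.io/org/repo-ci:abc make test-ci` prints the full command line without starting a container, which is handy when checking the mount layout.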

Guidance:

Do not mount host ~/.cargo directly by default; use project-scoped cache roots for reproducibility and isolation.
Keep target caches scoped by Rust version + fingerprint + feature set to avoid artifact poisoning.

Environment nuance impact:

A standardized cache root does not remove environment-specific capability differences (kernel, filesystem semantics, CPU arch, cgroup limits, sandboxing).
The container image and run flags define behavior parity; the cache root only standardizes persisted artifacts and speeds reruns.
Cache key dimensions should include os/arch and runtime profile (for example Linux amd64 container vs local native macOS) to avoid cross-environment contamination.
For deterministic CI, treat cache as an optimization only; correctness must not depend on cache hits.

3) GitHub Actions Cache Layers

Use layered caching, ordered by ROI:

CI image reuse from GHCR by fingerprint.
Optional image tar cache (actions/cache) for fast same-runner restore.
Cargo + sccache cache (registry, git, sccache, target-ci) keyed by:
- runner.os
- image fingerprint
- hash(Cargo.lock, rust-toolchain.toml)
- cargo feature mode (for example native-browser vs lean)
BuildKit cache for image builds (cache-from/cache-to gha, optionally registry cache manifest).
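
One way to keep the Cargo + sccache key identical between local diagnostics and GitHub Actions is a small helper that joins the four dimensions listed above. The key layout, the `cargo-` prefix, and the 16-character truncation are illustrative assumptions.

```shell
set -eu

# Use whichever SHA-256 tool is present (sha256sum on Linux, shasum on macOS).
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# cache_key OS FINGERPRINT FEATURE_MODE FILE...: print a cache key built from
# the runner OS, image fingerprint, a hash of the given files, and feature mode.
cache_key() (
  os="$1"
  fingerprint="$2"
  mode="$3"
  shift 3
  # Hash the lock and toolchain files together; truncated for readability.
  files_hash=$(cat "$@" | sha256_stdin | cut -c1-16)
  printf 'cargo-%s-%s-%s-%s\n' "$os" "$fingerprint" "$files_hash" "$mode"
)
```

In a workflow step this might be called as `cache_key "$RUNNER_OS" "$FINGERPRINT" native-browser Cargo.lock rust-toolchain.toml`, where FINGERPRINT is assumed to come from the prep job.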

4) Workflow Modularity and Check Visibility

Single CI workflow file (.github/workflows/ci.yml) for PR/push on main and develop.

Job topology:

prep: resolve fingerprint + image, restore image tar cache, emit outputs (runner job, no compile/test logic).
build: compile once (make build-ci) and save cargo/sccache cache (runs inside the CI container image).
Parallel verify jobs (granular reviewer visibility):
- fmt -> make fmt-check (or cargo fmt --check)
- clippy -> make lint-clippy
- test -> make test-ci (cargo nextest run)
- smoke -> make smoke-test

All verify jobs:

Run in the same fingerprinted CI container image as build.
Restore caches read-only.
Avoid duplicate compilation by relying on warmed cache from build.
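
To keep check names and make targets from drifting apart, the job-to-target mapping above could live in one lookup that every verify job calls. The helper names below are hypothetical, and MAKE is read from the environment only so the sketch can be dry-run.

```shell
set -eu

# check_target NAME: print the make target behind a check name.
check_target() {
  case "$1" in
    fmt)    echo fmt-check ;;
    clippy) echo lint-clippy ;;
    test)   echo test-ci ;;
    smoke)  echo smoke-test ;;
    *)      echo "unknown check: $1" >&2; return 1 ;;
  esac
}

# run_check NAME: invoke the mapped make target for one check.
run_check() (
  target=$(check_target "$1")
  "${MAKE:-make}" "$target"
)
```

Each verify job then reduces to a single call such as `run_check clippy`, so the workflow file carries no target names of its own.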

5) Test Tiering Policy

PR minimal gate:

fmt, clippy, build-ci, nextest (default profile), smoke.

Push to main/develop full verification:

Includes security/load/integration-heavy suites and packaging validation.

Implementation note:

Encode tiers as explicit make targets (test-pr, test-full) to keep local and CI behavior identical.
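
A sketch of how the tier could be chosen from the triggering event, assuming the test-pr and test-full targets named above exist. The function name is hypothetical; GITHUB_EVENT_NAME and GITHUB_REF_NAME are the default variables GitHub Actions sets for every run.

```shell
set -eu

# select_test_target EVENT BRANCH: print the make target for the given trigger.
# Pushes to main or develop get full verification; everything else gets the
# minimal PR gate.
select_test_target() {
  case "$1:$2" in
    push:main|push:develop) echo test-full ;;
    *)                      echo test-pr ;;
  esac
}
```

In CI the call would look like `make "$(select_test_target "$GITHUB_EVENT_NAME" "$GITHUB_REF_NAME")"`; locally a developer can pass the event and branch by hand to reproduce either tier.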

6) Local/CI Command Parity

Entrypoints:

Local: ./dflow ci-verify <target>
CI: make <target> from inside container job

Rule:

No business logic in workflow YAML; logic lives in Makefile and src-scripts/.
YAML only wires triggers, permissions, cache restore/save, and target invocation.

7) Impact of Multi-Surface Refactor

If Kroki-rs adopts the multi-surface architecture (core/interface/transport/adapters), CI/container strategy should shift from a single-project pipeline to a contract-oriented workspace pipeline.

Changes to consider:

Split CI targets by layer:
- core (pure Rust, fastest checks)
- interface/contracts (schema compatibility checks)
- transport/adapters (HTTP, CLI, plugin integration checks)
Add contract compatibility gates:
- schema diff checks
- backward-compatibility policy enforcement
- generated type/API conformance tests
Partition caches by crate group and feature profile to avoid unnecessary invalidation.
Keep one shared base CI image, but allow optional adapter extension images for stack-specific dependencies.

8) Generic Platform Model (Beyond This Repo)

This proposal can be generalized into a reusable CI/dev orchestration model with stack extensions.

Platform core (generic):

Fingerprinted container identity
Standard cache root contract
Canonical command graph (setup/build/lint/test/package/release)
Provider-agnostic check reporting and status mapping
Policy engine for PR gate tiers and branch protection contracts

Stack extensions (examples):

Rust:
- cargo, nextest, clippy, rustfmt, sccache
- optional workspace-aware cache slicing by crate
Node.js/TypeScript:
- pnpm/npm cache, lockfile-aware install, tsc --noEmit, eslint, vitest/jest
- cache keys include package manager + lockfile + Node version
React/Lit front-end:
- build/test/lint plus storybook/e2e extension gates
- optional browser test container profile
VS Code extension:
- vsce packaging, extension host tests, API compatibility checks
Tauri:
- host OS dependency profiles, frontend + Rust dual cache strategy, bundling/signing stages

Principle:

Keep 85-90% of behavior in the core orchestrator; implement stack-specific logic as plugins/extensions.

9) Generic CLI Tool Design (Installable via Homebrew)

Proposed tool concept: a standalone Rust CLI named devflow (alias dwf) that abstracts CI/container/caching complexity while preserving local developer UX.

Design goals:

One command surface across projects (devflow setup, devflow verify, devflow ci plan, devflow release).
Project-level config with minimal required fields and optional extensions.
Deterministic container-first workflows with portable local fallback.
Native integration with GitHub Actions generation and validation.

Canonical command interface:

Long form: devflow <command>
Short alias: dwf <command>
Examples: dwf setup, dwf lint, dwf test, dwf verify, dwf ci generate

High-level architecture:

devflow-core:
- DAG/task executor
- cache key builder
- container resolver (local daemon, GHCR)
- status/check emitter abstraction
devflow-ext-*:
- ext-rust
- ext-node
- ext-frontend-react
- ext-frontend-lit
- ext-vscode
- ext-tauri
devflow-gh:
- workflow templating, required-check contract generation, cache strategy synthesis
devflow-policy:
- branch rules, PR gates, release constraints, environment matrix policy

Configuration model (example):

[project]
name = "kroki-rs"
stack = ["rust"]

[container]
image = "ghcr.io/org/repo-ci"
fingerprint_inputs = ["Dockerfile", "Makefile", "install.sh", "src-scripts/**", "rust-toolchain.toml"]

[cache]
root = ".cache/devflow"
strategy = "layered"

[targets]
pr = ["fmt", "lint", "build", "test", "smoke"]
main = ["fmt", "lint", "build", "test", "smoke", "security"]

[extensions.rust]
features = ["native-browser"]
test_runner = "nextest"

CLI capabilities:

init: detect stack and scaffold config + make targets + workflow skeleton.
doctor: validate container engine, cache root permissions, toolchain drift.
verify: run the canonical local containerized pipeline.
ci generate: generate/update GitHub Actions workflow from config.
cache stats/prune: inspect and clean cache by scope/age.
policy check: ensure repo protections and required checks match config.

Canonical command mapping by stack:

| Canonical command | Rust stack | Node/TypeScript (`pnpm`) | React/Lit frontend | VS Code extension | Tauri |
| --- | --- | --- | --- | --- | --- |
| `dwf setup` | `rustup show`, toolchain sync, container/image resolve | `pnpm install --frozen-lockfile` | `pnpm install --frozen-lockfile` | `pnpm install --frozen-lockfile` | Rust + `pnpm install` + platform deps check |
| `dwf fmt` | `cargo fmt --all` | `pnpm prettier -w .` (or repo formatter) | `pnpm prettier -w .` | `pnpm prettier -w .` | `cargo fmt --all` + frontend formatter |
| `dwf fmt-check` | `cargo fmt --all -- --check` | `pnpm prettier -c .` | `pnpm prettier -c .` | `pnpm prettier -c .` | `cargo fmt --all -- --check` + frontend fmt check |
| `dwf lint` | `cargo clippy --all-targets -- -D warnings` | `pnpm eslint .` + `pnpm tsc --noEmit` | `pnpm eslint .` + typecheck | `pnpm eslint .` + extension typecheck | Rust clippy + frontend eslint/typecheck |
| `dwf build` | `cargo build --release` | `pnpm build` | `pnpm build` | `pnpm package` / extension bundle | `cargo build --release` + frontend build |
| `dwf test` | `cargo nextest run` | `pnpm test` (vitest/jest) | `pnpm test` | extension host tests | `cargo nextest run` + frontend tests |
| `dwf smoke` | app health/CLI smoke script | app startup + HTTP smoke | preview/build smoke | install/package smoke | app startup/bundle smoke |
| `dwf verify` | `fmt-check + lint + build + test + smoke` | same canonical sequence mapped to JS toolchain | same canonical sequence | same canonical sequence | same canonical sequence |
| `dwf ci generate` | generate Rust-oriented CI jobs/cache keys | generate pnpm/node CI jobs/cache keys | add frontend/e2e job templates | add extension packaging jobs | add OS matrix + bundling/signing jobs |

Notes:

The canonical command names are stable for developers; stack-specific executors are resolved from config extensions.
dwf verify should always run inside the configured fingerprinted container in CI, and optionally locally via containerized mode.

Rollout path for this project:

Phase A: treat the current dflow as a compatibility facade and map it to devflow/dwf commands.
Phase B: move cache/image/check contracts into typed config.
Phase C: keep project-specific extensions only for diagram-tool dependencies and release packaging nuances.

Required Repo Changes

Standardize fingerprint algorithm docs + scripts to one input set.
Introduce KROKI_CACHE_ROOT and replace the hardcoded .cargo-cache/ and target/ci paths in scripts and make fragments.
Add explicit CI-tier targets:
- fmt-check
- lint-clippy
- test-pr
- test-full
Consolidate CI triggers into one workflow (main, develop) and keep release/package workflows separate.
Add cache key normalization helper (script or make target) reused by both local diagnostics and GHA.
Add CI observability summary:
- fingerprint used
- cache hit/miss stats
- sccache hit ratio per run
Add an explicit note in workflow docs that build and all verify jobs (fmt, clippy, test, smoke) execute in the fingerprinted CI container.
Add extension points in docs/config for future multi-surface and multi-stack orchestration.
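
The CI observability summary item in the list above could be implemented with a short helper that appends to the job summary file. This is a sketch: the function name and output layout are assumptions, and the hit/miss counts are expected to be parsed elsewhere (for example from `sccache --show-stats`), which is not shown here.

```shell
set -eu

# write_ci_summary FINGERPRINT CACHE_HIT SCCACHE_HITS SCCACHE_MISSES
# Appends a short report to $GITHUB_STEP_SUMMARY, or stdout when unset.
write_ci_summary() (
  fingerprint="$1"
  cache_hit="$2"
  hits="$3"
  misses="$4"
  total=$((hits + misses))
  if [ "$total" -gt 0 ]; then
    ratio="$((100 * hits / total))%"   # integer percentage, rounded down
  else
    ratio="n/a"
  fi
  {
    echo "### CI cache summary"
    echo "- fingerprint: $fingerprint"
    echo "- cache restore hit: $cache_hit"
    echo "- sccache hit ratio: $ratio ($hits hits, $misses misses)"
  } >> "${GITHUB_STEP_SUMMARY:-/dev/stdout}"
)
```

Because the helper falls back to stdout, the same call can be used from `./dflow ci-verify` to print the report locally.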

Suggested Implementation Plan (v0.1.0)

Phase 1: Cache Contract and Fingerprint Consistency

Add KROKI_CACHE_ROOT contract and migrate mounts/env vars.
Align fingerprint implementation and documentation.
Validate no regression in ./dflow ci-verify.

Phase 2: CI Job Contract Refactor

Create/adjust make targets for granular checks.
Refactor CI workflow into prep/build/parallel-verify jobs.
Keep required checks mapped to the same stable names.

Phase 3: Test Tiering + Runtime Optimization

Formalize PR vs full test tiers.
Tune nextest profile and retries for flaky integration suites.
Add run summaries with cache and compile diagnostics.

Phase 4: Hardening (Optional, Recommended)

Add SBOM + image scan gates.
Add provenance/signing for release images.
Add periodic cache hygiene workflow.

Risks and Mitigations

Risk: over-broad target cache reuse can cause stale artifacts.
- Mitigation: include fingerprint + toolchain + feature mode in cache key.
Risk: too many parallel jobs increase image pull overhead.
- Mitigation: keep verify jobs lightweight and rely on prep/build warm cache.
Risk: local cache growth on macOS.
- Mitigation: configurable cache root + retention/prune targets.
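
For the local cache growth risk, a retention target could be as small as the sketch below. The function name and the age-based policy are assumptions; a real prune target might also cap total size.

```shell
set -eu

# prune_cache ROOT DAYS: delete files under ROOT whose last modification is
# more than DAYS days old (find's -mtime counts whole 24-hour periods).
prune_cache() (
  root="$1"
  days="$2"
  # Refuse obviously dangerous roots.
  case "$root" in
    ""|"/") echo "refusing to prune '$root'" >&2; exit 1 ;;
  esac
  # Nothing to do when the cache root has not been created yet.
  [ -d "$root" ] || exit 0
  find "$root" -type f -mtime +"$days" -exec rm -f {} +
)
```

A make target could wrap it as `prune_cache "${KROKI_CACHE_ROOT:-.cargo-cache}" 30`. Since the cache is an optimization only, deleting old entries costs a slower next build and nothing more.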

Decision Notes on Requested Strategies

Mounting cargo registry/git/target caches is correct and should remain, with stronger key scoping and a configurable root.
cargo nextest is the right default for CI; keep it as the primary PR test executor.
Granular checks should remain visible as separate required statuses/check-runs.
Single CI workflow for PR/push on main and develop is recommended; keep release/publish workflows independent.

Success Criteria

Local ./dflow ci-verify and PR CI run identical targets in the same image identity.
Warm PR reruns complete measurably faster than cold runs, with sccache and cache hit ratios reported for each run.
Reviewers can identify failure domain immediately from independent check status names.
Pipeline logic remains modular and centralized in scripts/make targets, not duplicated in YAML.