# EZSCALE Website — Production Docker + Kubernetes Deployment

**Date:** 2026-04-26
**Status:** Approved
**Scope:** Production-ready container build and Helm chart for deploying the EZSCALE website Laravel app to the existing K3s cluster. Two environments: `local` (a developer's k3d/minikube cluster) and `us-prod` (the existing US K3s cluster, namespace `ezscale`).

## Goals

- Ship a Helm chart that mirrors the sister `ezscale_api` chart's shape, so cluster-side deploy/rollback/ops scripts work without modification.
- Bake source, vendor, and built assets into immutable images. No host bind-mounts in production.
- Reuse the cluster's existing infrastructure: mariadb-operator, Longhorn PVCs, Traefik IngressRoute, cert-manager `letsencrypt` ClusterIssuer, Gitea Container Registry, Storj for object storage.
- Preserve the encryption-critical state (`APP_KEY`, Passport keys) across deploys without ever regenerating it.
- Keep the existing `docker-compose.yml` dev stack untouched — production is a separate, additional path.

## Non-goals

- Cloudflare Zero Trust for the admin panel (deferred — initial chart ships plain Let's Encrypt TLS).
- EU region deployment (single region for now).
- ExternalDNS automation (DNS managed manually in Cloudflare Terraform for now).
- Sealed Secrets / External Secrets Operator automation (the raw Secret is created by hand with `kubectl`, matching the sister chart; sealing it with `kubeseal` is an optional manual step).
- WHMCS migration tooling (separate concern).

## Cluster context (assumed prerequisites)

The `us-prod` deployment depends on these already being installed in the K3s cluster — they all are, today, per `infrastructure/kubernetes/`:

- **mariadb-operator** — provides `MariaDB`, `Database`, `User`, `Grant` CRDs (`k8s.mariadb.com/v1alpha1`).
- A replicated `MariaDB` CR named `mariadb` in the `ezscale` namespace, fronted by **MaxScale** for read/write splitting and autofailover, backed by Longhorn PVCs with daily backup CronJobs.
- **cert-manager** with a ClusterIssuer named `letsencrypt`.
- **Traefik** with the `cloudflarewarp` middleware (`kube-system` namespace) for client IP restoration from `CF-Connecting-IP`.
- **Gitea Container Registry** at `git.ezscale.cloud` and an image-pull `Secret` named `gitea-registry` in the target namespace.
- A Storj account with an S3-compatible bucket reserved for the website's user uploads and PDF cache.

For `local`, the developer installs mariadb-operator into their k3d/minikube cluster (one-liner, once the operator's Helm repo has been added: `helm install mariadb-operator -n mariadb-operator --create-namespace mariadb-operator/mariadb-operator`). cert-manager and Traefik are not strictly required locally, because the chart's IngressRoute and Certificate templates are toggleable.

## Repository layout

```
website/
├── docker/                              # existing dev compose stuff (unchanged)
├── docker-compose.yml                   # existing dev stack (unchanged)
├── Dockerfile                           # NEW: production multi-stage
├── helm/
│   └── ezscale-website/
│       ├── Chart.yaml
│       ├── values.yaml                  # safe defaults, no secrets
│       ├── values-local.yaml            # k3d/minikube — everything in-cluster
│       ├── values-us-prod.yaml          # uses existing ezscale-namespace MariaDB + Storj
│       └── templates/
│           ├── _helpers.tpl
│           ├── configmap.yaml           # APP_ENV, non-secret env vars
│           ├── secret.yaml              # placeholder; only renders if values provided
│           ├── deployment-app.yaml      # nginx + php-fpm sidecar
│           ├── deployment-horizon.yaml
│           ├── deployment-scheduler.yaml
│           ├── service.yaml
│           ├── ingressroute.yaml        # Traefik CRD, three hosts → one Service
│           ├── certificate.yaml         # cert-manager Certificate
│           ├── job-migrate.yaml         # Helm hook: pre-install + pre-upgrade
│           ├── hpa-app.yaml             # autoscale web pods on CPU
│           ├── mariadb-database.yaml    # operator custom resources
│           ├── mariadb-user.yaml
│           ├── mariadb-grant.yaml
│           ├── mariadb-instance.yaml    # only renders when mariadb.enabled=true
│           └── statefulset-valkey.yaml  # only renders when valkey.enabled=true
└── .gitea/
    └── workflows/
        └── release.yml                  # NEW: build + push on v* tags
```

Chart name `ezscale-website` mirrors sister chart's `ezscale-api`.

## Production Dockerfile (multi-stage)

A single `Dockerfile` at the repo root with three named build targets (`app`, `horizon`, `scheduler`) that share the intermediate stages listed first:

| Stage | Base | Purpose |
|-------|------|---------|
| `composer-deps` | `composer:2` | `composer install --no-dev --no-scripts --prefer-dist` → `vendor/` |
| `node-build` | `node:24-alpine` | `npm ci && npm run build` → `public/build/` |
| `runtime-base` | `php:8.3-fpm-bookworm` | PHP extensions (pdo_mysql, intl, bcmath, gd, zip, pcntl, posix, exif, sockets, opcache, redis), opcache config, www-data UID, copies vendor + source + built assets |
| `app` (target) | `runtime-base` | CMD: `php-fpm`. Pairs with nginx sidecar in the Deployment. |
| `horizon` (target) | `runtime-base` | CMD: `php artisan horizon`. SIGTERM, 60s grace period. |
| `scheduler` (target) | `runtime-base` | CMD: `php artisan schedule:work`. |

Images are published to `git.ezscale.cloud/ezscale/website:{role}-{version}` and `:{role}-latest`. The chart's `image.tag` value selects the version; the role prefix (`app`/`horizon`/`scheduler`) is prepended in each Deployment template via `_helpers.tpl`.
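As a rough illustration of the tagging scheme, the values below would resolve to one image reference per role. Only `image.tag` and the `gitea-registry` pull secret come from this spec; the other key names and the version number are assumptions made for the sketch.

```yaml
# values.yaml (excerpt, sketch)
image:
  repository: git.ezscale.cloud/ezscale/website   # assumed key name
  tag: v1.2.3                                     # hypothetical version
  pullPolicy: IfNotPresent                        # assumed key name
imagePullSecrets:
  - name: gitea-registry

# Image references the helper would be expected to render:
#   app       -> git.ezscale.cloud/ezscale/website:app-v1.2.3
#   horizon   -> git.ezscale.cloud/ezscale/website:horizon-v1.2.3
#   scheduler -> git.ezscale.cloud/ezscale/website:scheduler-v1.2.3
```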

**Why three targets sharing one Dockerfile, not one image with a parameterized command?** Image immutability and security. The horizon/scheduler images don't need nginx config or a php-fpm pool, and they're long-lived — separate targets let us trim each one to its minimum.
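The 60s grace period listed for the `horizon` target would be set on the pod, not in the image. A minimal sketch of that part of `deployment-horizon.yaml` follows; the image tag is hypothetical, and the container relies on the image's own `CMD`.

```yaml
# deployment-horizon.yaml (pod spec excerpt, sketch)
spec:
  template:
    spec:
      # Kubernetes sends SIGTERM, then waits up to 60s before SIGKILL,
      # giving Horizon time to finish in-flight jobs.
      terminationGracePeriodSeconds: 60
      imagePullSecrets:
        - name: gitea-registry
      containers:
        - name: horizon
          image: git.ezscale.cloud/ezscale/website:horizon-v1.2.3   # hypothetical tag
          envFrom:
            - secretRef:
                name: ezscale-website-secrets
```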

## Web pod shape

One `Deployment` named `ezscale-website-app` with **two containers** in a single pod:

- `nginx` — `nginx:1.30-alpine`, ConfigMap-mounted vhost serving `/var/www/html/public`, fastcgi → `127.0.0.1:9000`. Listens on `:80`.
- `app` — the `app` Dockerfile target. php-fpm on `:9000`.

The two containers share the source via an `emptyDir` populated by an init container that runs `cp -a /var/www/html/. /shared/` from the app image. This pattern is copied verbatim from the sister chart and lets us update nginx config without rebuilding the app image.
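The shared-source pattern described above could look roughly like this. It is a sketch, not the sister chart's actual template: the volume names, the init container name, the nginx ConfigMap name, and the image tag are all assumptions.

```yaml
# deployment-app.yaml (pod spec excerpt, sketch)
spec:
  template:
    spec:
      imagePullSecrets:
        - name: gitea-registry
      initContainers:
        - name: copy-source                     # hypothetical name
          image: git.ezscale.cloud/ezscale/website:app-v1.2.3   # hypothetical tag
          # Copy the baked-in source into the shared volume.
          command: ["sh", "-c", "cp -a /var/www/html/. /shared/"]
          volumeMounts:
            - name: shared-source
              mountPath: /shared
      containers:
        - name: nginx
          image: nginx:1.30-alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: shared-source
              mountPath: /var/www/html
              readOnly: true
            - name: nginx-vhost
              mountPath: /etc/nginx/conf.d
        - name: app
          image: git.ezscale.cloud/ezscale/website:app-v1.2.3   # hypothetical tag
          ports:
            - containerPort: 9000
          volumeMounts:
            - name: shared-source
              mountPath: /var/www/html
      volumes:
        - name: shared-source
          emptyDir: {}
        - name: nginx-vhost
          configMap:
            name: ezscale-website-nginx         # hypothetical ConfigMap name
```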

**Health probes:**
- Liveness: HTTP `GET /up` on nginx (Laravel's built-in health endpoint).
- Readiness: same path, with `failureThreshold: 3`.
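- Startup probe: `GET /up` with a generous `failureThreshold`, so a slow cold start (php-fpm boot, opcache warmup) does not trip the liveness probe. Migrations are not part of this window; they run in the Helm hook before pods roll.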
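The three probes above might be expressed on the `nginx` container as follows. Only the `/up` path and `failureThreshold: 3` on readiness come from this spec; the periods and the startup threshold are placeholder values to be tuned.

```yaml
# nginx container excerpt (sketch)
livenessProbe:
  httpGet:
    path: /up
    port: 80
  periodSeconds: 10          # assumed value
readinessProbe:
  httpGet:
    path: /up
    port: 80
  periodSeconds: 5           # assumed value
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /up
    port: 80
  periodSeconds: 5           # assumed value
  failureThreshold: 30       # assumed value: allows up to 150s before liveness takes over
```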

**HPA:** scales from `1 → 8` replicas at 70% CPU, matching the sister chart's prod values.
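A minimal `hpa-app.yaml` matching those numbers could look like the sketch below. The HPA's resource name is an assumption. Note that a `Utilization` target is computed against the containers' CPU requests, so the Deployment has to set them.

```yaml
# hpa-app.yaml (sketch)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ezscale-website-app        # assumed to match the Deployment name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ezscale-website-app
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```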

## Subdomain routing

Three subdomains → one Service. Laravel's `Route::domain()` in `bootstrap/app.php` handles per-subdomain dispatch in-pod.

```yaml
# ingressroute.yaml (simplified)
spec:
  entryPoints: [websecure]
  routes:
    - match: Host(`ezscale.cloud`) || Host(`account.ezscale.cloud`) || Host(`admin.ezscale.cloud`)
      middlewares:
        - name: cloudflarewarp
          namespace: kube-system
      services:
        - name: ezscale-website
          port: 80
  tls:
    secretName: ezscale-website-tls
```

A second IngressRoute on entryPoint `web` redirects HTTP → HTTPS via the `kube-system/http-to-https` middleware (matches sister pattern).
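A sketch of that redirect route, under a few assumptions: the resource name is hypothetical, the `traefik.io/v1alpha1` API group assumes a recent Traefik release (older installs use `traefik.containo.us/v1alpha1`), and the `http-to-https` middleware is taken to be a redirect-scheme middleware as the name suggests.

```yaml
# ingressroute.yaml, HTTP redirect (sketch)
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ezscale-website-http       # hypothetical name
  namespace: ezscale
spec:
  entryPoints: [web]
  routes:
    - kind: Rule
      match: Host(`ezscale.cloud`) || Host(`account.ezscale.cloud`) || Host(`admin.ezscale.cloud`)
      middlewares:
        - name: http-to-https
          namespace: kube-system
      # A backend is still required by the CRD, even though the
      # middleware answers with a redirect before it is reached.
      services:
        - name: ezscale-website
          port: 80
```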

One `Certificate` resource covers all three SAN names. cert-manager solves HTTP-01 via Traefik on `:80`.
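For reference, a `Certificate` covering the three hosts could be sketched as below. The resource name is an assumption; the `secretName` matches the one the IngressRoute references, and the issuer is the existing `letsencrypt` ClusterIssuer.

```yaml
# certificate.yaml (sketch)
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ezscale-website-tls        # assumed name
  namespace: ezscale
spec:
  secretName: ezscale-website-tls
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
  dnsNames:
    - ezscale.cloud
    - account.ezscale.cloud
    - admin.ezscale.cloud
```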

Cloudflare Zero Trust for the admin host is **deferred**. When ready, layer Access on by adding an annotation to the IngressRoute or splitting the admin host into its own IngressRoute with a Cloudflare Tunnel sidecar.

## File storage

Web/horizon/scheduler pods are stateless. In prod, all application file reads and writes go through Laravel's `s3` disk:

- `values-us-prod.yaml` sets `FILESYSTEM_DISK=s3`.
- Storj credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_BUCKET`, `AWS_ENDPOINT`, `AWS_DEFAULT_REGION`, `AWS_USE_PATH_STYLE_ENDPOINT=true`) live in the chart's `Secret`.
- User uploads (avatars, KB images, ticket attachments) and cached invoice PDFs all go to Storj.

`local` defaults to the standard `local` disk on an `emptyDir` — fine for dev.

No PVCs on the app/horizon/scheduler Deployments.

## Persistent state inventory

| Data | Storage | Persistence guarantee |
|------|---------|-----------------------|
| **Application DB** | Existing `mariadb` CR in `ezscale` ns | Longhorn replicated PVCs + existing backup CronJob to Storj |
| **Sessions** | Valkey StatefulSet (this chart, 1 replica) | Valkey AOF on a Longhorn PVC. AOF survives pod restart. If Valkey is destroyed, users get logged out — acceptable. |
| **Cache** | Same Valkey | Ephemeral by design — anything in `cache:` is regenerable |
| **Queue (Horizon)** | Same Valkey | Important — losing the queue loses pending jobs. Same AOF-backed PVC. |
| **User uploads + cached PDFs** | Storj S3 | Bucket versioning + Storj's intrinsic replication |
| **`APP_KEY`** | k8s Secret `ezscale-website-secrets` | **Bootstrap once, never regenerated.** Decrypts `users.two_factor_secret`, encrypted credentials, encrypted cookies. |
| **Passport keys** (`oauth-private.key`, `oauth-public.key`) | Same Secret | Same constraint — bootstrapped once, never overwritten. Used to sign OAuth access tokens. |

### `APP_KEY` and Passport key bootstrap procedure

The chart's `templates/secret.yaml` only renders if `secret.create=true` AND a value is supplied. Default for prod is `secret.create=false` — the chart assumes a Secret named `ezscale-website-secrets` already exists in the namespace and references it by name.

First-time bootstrap (one-time, manual):
1. Generate `APP_KEY` locally: `php artisan key:generate --show`.
2. Generate Passport keys locally: `php artisan passport:keys` (writes to `storage/oauth-{public,private}.key`).
3. Create the Secret: `kubectl create secret generic ezscale-website-secrets -n ezscale --from-literal=APP_KEY=... --from-file=oauth-private.key=... --from-file=oauth-public.key=... --from-literal=DB_PASSWORD=... --from-literal=AWS_SECRET_ACCESS_KEY=... --from-literal=STRIPE_SECRET=... ...` (one `--from-literal` per remaining secret value).
4. (Optional) Re-run the same command with `--dry-run=client -o yaml`, pipe the resulting manifest through `kubeseal`, and check the `SealedSecret` into `infrastructure/`.

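Subsequent `helm upgrade` invocations never touch this Secret. The Deployments load it as environment variables via `envFrom: secretRef:`, and the entrypoint copies the OAuth keys into `storage/`. Because `envFrom` only exposes environment variables, the two key entries also need to be mounted as files from the same Secret for that copy to work.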
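One way to wire this up is sketched below, assuming the OAuth key entries are mounted as files from the same Secret so the entrypoint has something to copy. The ConfigMap name, volume name, and mount path are hypothetical.

```yaml
# deployment-app.yaml (secret wiring excerpt, sketch)
spec:
  template:
    spec:
      containers:
        - name: app
          envFrom:
            - configMapRef:
                name: ezscale-website-config       # hypothetical ConfigMap name
            - secretRef:
                name: ezscale-website-secrets      # APP_KEY, DB_PASSWORD, ...
          volumeMounts:
            - name: oauth-keys
              mountPath: /run/secrets/passport     # hypothetical path read by the entrypoint
              readOnly: true
      volumes:
        - name: oauth-keys
          secret:
            secretName: ezscale-website-secrets
            # Project only the two key files, not every secret value.
            items:
              - key: oauth-private.key
                path: oauth-private.key
              - key: oauth-public.key
                path: oauth-public.key
```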

### Why this matters

If the chart ever regenerates `APP_KEY`, every encrypted value in the database becomes garbage — 2FA secrets, encrypted gateway credentials, encrypted session payloads. Same for Passport keys: regenerating them invalidates every issued access token at once. The chart's secret-handling MUST treat both values as immutable post-bootstrap.

## Database wiring (operator-managed)

For `us-prod`, the chart creates three custom resources (instances of the operator's CRDs) in the `ezscale` namespace, all referencing the existing `mariadb` instance:

```yaml
# mariadb-database.yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: Database
metadata:
  name: ezscale-billing
  namespace: ezscale
spec:
  mariaDbRef: { name: mariadb }
  characterSet: utf8mb4
  collate: utf8mb4_unicode_ci
  name: ezscale_billing
```

```yaml
# mariadb-user.yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: User
metadata:
  name: ezscale-website-app
  namespace: ezscale
spec:
  mariaDbRef: { name: mariadb }
  passwordSecretKeyRef:
    name: ezscale-website-secrets
    key: DB_PASSWORD
  host: "%"
  maxUserConnections: 50
```

```yaml
# mariadb-grant.yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: Grant
metadata:
  name: ezscale-website-app-grant
  namespace: ezscale
spec:
  mariaDbRef: { name: mariadb }
  username: ezscale-website-app
  host: "%"
  privileges: ["ALL PRIVILEGES"]
  database: ezscale_billing
  table: "*"
```

Pods connect through the MaxScale router Service (read/write split), provisionally `mariadb-maxscale.ezscale.svc.cluster.local:3306`; the exact Service name and port still need to be verified against the existing MaxScale Service.
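On the application side, the non-secret connection settings would then land in the ConfigMap roughly as follows. The ConfigMap name and the `DB_CONNECTION` value are assumptions, and the host and port are the provisional ones from this section; the password comes from `DB_PASSWORD` in the Secret.

```yaml
# configmap.yaml (data excerpt, sketch)
apiVersion: v1
kind: ConfigMap
metadata:
  name: ezscale-website-config     # hypothetical name
  namespace: ezscale
data:
  DB_CONNECTION: mysql             # assumption; depends on the app's database config
  DB_HOST: mariadb-maxscale.ezscale.svc.cluster.local   # provisional, see open questions
  DB_PORT: "3306"                  # provisional
  DB_DATABASE: ezscale_billing
  DB_USERNAME: ezscale-website-app
```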

For `local`, an additional `mariadb-instance.yaml` template renders a 1-replica `MariaDB` CR in the same chart release, plus a root-password Secret. `Database`/`User`/`Grant` reference that local instance instead.
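A minimal local instance could be sketched as below. The root-password Secret name and key and the storage size are assumptions, and the field names should be checked against the mariadb-operator version installed in the local cluster.

```yaml
# mariadb-instance.yaml (local only, sketch)
apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb
spec:
  rootPasswordSecretKeyRef:
    name: mariadb-root             # hypothetical Secret rendered alongside this CR
    key: password                  # hypothetical key
  replicas: 1
  storage:
    size: 1Gi                      # assumed size for local development
```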

## Valkey

`templates/statefulset-valkey.yaml` (toggleable via `valkey.enabled`):

- 1 replica, `valkey/valkey:9-alpine`
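- Command: `valkey-server --appendonly yes --maxmemory 1gb --maxmemory-policy allkeys-lru`. Cache and session keys tolerate eviction, but `allkeys-lru` makes every key an eviction candidate, queue keys included, and Horizon cannot retry a job whose payload was evicted. Either size `maxmemory` so the limit is not reached in practice, or use `volatile-lru` so that only keys with a TTL are evicted.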
- 5Gi PVC on Longhorn (prod) / local-path (local)
- ClusterIP Service on `:6379`
- No password in `local`. In `us-prod`, password from the Secret.

Both envs default to `valkey.enabled=true`. There's no current need for an external Redis in prod — running per-app Valkey matches sister API and infrastructure/petro patterns.
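Put together, the StatefulSet might look like the sketch below. The resource name, labels, and the `longhorn` StorageClass name are assumptions (the StorageClass is still an open question), and the prod password wiring is omitted for brevity.

```yaml
# statefulset-valkey.yaml (sketch)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ezscale-website-valkey           # hypothetical name
spec:
  serviceName: ezscale-website-valkey
  replicas: 1
  selector:
    matchLabels:
      app: ezscale-website-valkey
  template:
    metadata:
      labels:
        app: ezscale-website-valkey
    spec:
      containers:
        - name: valkey
          image: valkey/valkey:9-alpine
          args:
            - valkey-server
            - --appendonly
            - "yes"
            - --maxmemory
            - 1gb
            - --maxmemory-policy
            - allkeys-lru
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: data
              mountPath: /data           # AOF files are written here
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn       # assumed name; local would use local-path
        resources:
          requests:
            storage: 5Gi
```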

## Migrations

`templates/job-migrate.yaml` — Helm hook:

```yaml
metadata:
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation
```

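Runs `php artisan migrate --force --no-interaction`. An optional second step, `php artisan db:seed --class=ProductionSeeder --force`, is toggled by `migrate.seed=true` (`--class` belongs to `db:seed`; the equivalent flags on `migrate` itself are `--seed --seeder=ProductionSeeder`). Image: same as the `app` target.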
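A fuller sketch of the hook Job is shown below. The Job name, image tag, `backoffLimit`, and the inline `DB_*` values are assumptions. The non-secret settings are inlined because a `pre-install` hook runs before Helm creates the chart's regular resources, so a chart-managed ConfigMap would not exist yet on a first install.

```yaml
# job-migrate.yaml (sketch)
apiVersion: batch/v1
kind: Job
metadata:
  name: ezscale-website-migrate          # hypothetical name
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0                        # assumed: fail fast so the upgrade aborts
  template:
    spec:
      restartPolicy: Never
      imagePullSecrets:
        - name: gitea-registry
      containers:
        - name: migrate
          image: git.ezscale.cloud/ezscale/website:app-v1.2.3   # hypothetical tag
          command: ["php", "artisan", "migrate", "--force", "--no-interaction"]
          env:
            - name: DB_HOST
              value: mariadb-maxscale.ezscale.svc.cluster.local   # provisional
            - name: DB_DATABASE
              value: ezscale_billing
            - name: DB_USERNAME
              value: ezscale-website-app
          envFrom:
            - secretRef:
                name: ezscale-website-secrets   # pre-existing, so safe in a hook
```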

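If the Job fails, `helm upgrade` aborts before any pod rolls. The previous ReplicaSet stays serving traffic. One caveat for the first install: `pre-install` hooks run before Helm creates any of the chart's regular resources, so the `Database`, `User`, and `Grant` resources do not exist yet when the Job starts. The first install therefore needs either `--set migrate.enabled=false` followed by an upgrade, or a `post-install` hook for the initial migration.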

For emergency manual deploys: `--set migrate.enabled=false`.

## Scheduler

A `Deployment` (1 replica, no autoscale) running `php artisan schedule:work`. This long-running command checks for due tasks every minute and spawns them as subprocesses. Tasks fire on schedule as long as the pod is up; a task that comes due while the pod is down or restarting is skipped rather than replayed.

We chose this over a `CronJob` running `schedule:run` every minute because:
- Logs land in one place (the Deployment), easier to tail.
- No per-minute pod-creation overhead.
- Matches the dev compose pattern, easier mental model.

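Single replica is intentional: two concurrent `schedule:work` instances would double-fire scheduled tasks. The default rolling update briefly runs the old and new pods side by side, so the Deployment should use `strategy: Recreate`, or overlap-sensitive tasks should be marked `onOneServer()`.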
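To keep exactly one scheduler alive across deploys as well, the Deployment can use the `Recreate` strategy, which stops the old pod before starting the new one. The sketch below assumes hypothetical labels and image tag and relies on the image's own `CMD`.

```yaml
# deployment-scheduler.yaml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ezscale-website-scheduler        # hypothetical name
spec:
  replicas: 1
  strategy:
    type: Recreate                       # never run old and new schedulers together
  selector:
    matchLabels:
      app: ezscale-website-scheduler
  template:
    metadata:
      labels:
        app: ezscale-website-scheduler
    spec:
      imagePullSecrets:
        - name: gitea-registry
      containers:
        - name: scheduler
          image: git.ezscale.cloud/ezscale/website:scheduler-v1.2.3   # hypothetical tag
          envFrom:
            - secretRef:
                name: ezscale-website-secrets
```

The trade-off is a short gap with no scheduler during each deploy, during which due tasks are skipped.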

## Image registry, CI, deploy

`.gitea/workflows/release.yml` mirrors sister API:

- Trigger: `push` of `v*` tags
- Build & push three images (`app`, `horizon`, `scheduler`) tagged `:{role}-{version}` and `:{role}-latest`
- Login: `git.ezscale.cloud` with `${{ secrets.CI_TOKEN }}`
- After build: `helm upgrade --install ezscale-website helm/ezscale-website -n ezscale -f helm/ezscale-website/values-us-prod.yaml --set image.tag=v{X.Y.Z}` (executed against the cluster via a self-hosted runner with kubeconfig).

Pull secret: `gitea-registry` (already exists in the `ezscale` namespace).
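A rough shape for the workflow is sketched below, assuming Gitea Actions runs the GitHub-compatible `docker/*` actions and exposes the `github.ref_name` context. The runner labels, the `CI_USER` secret, and the job names are hypothetical; only `CI_TOKEN`, the registry host, and the Helm command come from this spec.

```yaml
# .gitea/workflows/release.yml (sketch)
name: release
on:
  push:
    tags: ["v*"]

jobs:
  build:
    runs-on: ubuntu-latest                 # assumed runner label
    strategy:
      matrix:
        role: [app, horizon, scheduler]    # one image per Dockerfile target
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: git.ezscale.cloud
          username: ${{ secrets.CI_USER }}   # hypothetical secret name
          password: ${{ secrets.CI_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          target: ${{ matrix.role }}
          push: true
          tags: |
            git.ezscale.cloud/ezscale/website:${{ matrix.role }}-${{ github.ref_name }}
            git.ezscale.cloud/ezscale/website:${{ matrix.role }}-latest

  deploy:
    needs: build
    runs-on: self-hosted                   # runner with kubeconfig for the cluster
    steps:
      - uses: actions/checkout@v4
      - name: Deploy with Helm
        run: |
          helm upgrade --install ezscale-website helm/ezscale-website \
            -n ezscale \
            -f helm/ezscale-website/values-us-prod.yaml \
            --set image.tag=${{ github.ref_name }}
```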

Existing CI (tests, Pint) stays in `.gitea/workflows/ci.yml` if present, or is added separately — out of scope for this spec.

## Open questions / TBD during implementation

- Verify the exact MaxScale Service name and port in `infrastructure/kubernetes/ezscale/mysql/`. The chart's default `DB_HOST` should match what MaxScale exposes.
- Confirm the cluster's StorageClass name for production (Longhorn vs local-path) by inspecting the existing `mariadb` CR's PVCs.
- Confirm the exact Storj bucket name to use in `us-prod` (proposal: `ezscale-website-prod`). Local doesn't need one — it uses the `local` disk on `emptyDir`.

## Out of scope (separate spec needed before adding)

- Cloudflare Zero Trust for the admin host
- EU region deployment + DB replication topology
- Backup verification / restore drills
- Multi-tenancy (Kasm) — see `KASM_AND_MULTITENANCY.md`
- WHMCS migration runbook

## Implementation order (for the plan that follows)

1. Production `Dockerfile` (build the three targets locally, smoke-test via `docker run`).
2. Helm chart skeleton (`Chart.yaml`, `values.yaml`, `_helpers.tpl`).
3. Core templates: `configmap`, `secret` (placeholder), `deployment-app`, `service`.
4. Database custom resources (`mariadb-database`, `-user`, `-grant`, `-instance` for local).
5. `statefulset-valkey`.
6. `deployment-horizon`, `deployment-scheduler`.
7. `job-migrate` (Helm hook).
8. `ingressroute`, `certificate`.
9. `hpa-app`.
10. `values-local.yaml` and `values-us-prod.yaml`.
11. `.gitea/workflows/release.yml`.
12. Local end-to-end test in k3d.
13. Documentation: `helm/ezscale-website/README.md` covering bootstrap procedure for `APP_KEY` / Passport keys.