Co-authored-by: openhands <openhands@all-hands.dev> Reviewed-on: https://codeberg.org/johba/harb/pulls/84

# CI Migration: Composite Integration Service (Option A)

## Overview

The E2E pipeline has been refactored to use a **composite integration service** that bundles the entire Harb stack into a single Docker image. This removes Docker-in-Docker setup from the pipeline steps and cuts total E2E time from roughly 8-10 minutes to 5-6 minutes.

## Architecture

### Before (Docker-in-Docker)
```
Woodpecker Pipeline
├─ Service: docker:dind (privileged)
└─ Step: run-e2e
   ├─ Install docker CLI + docker-compose
   ├─ Run ./scripts/dev.sh start (nested containers)
   │  ├─ anvil
   │  ├─ postgres
   │  ├─ bootstrap
   │  ├─ ponder
   │  ├─ webapp
   │  ├─ landing
   │  ├─ txn-bot
   │  └─ caddy
   └─ Run Playwright tests
```

**Issues**:
- ~3-5 minutes stack startup overhead per run
- Complex nested container management
- Docker-in-Docker reliability issues
- Dependency reinstallation in every step

### After (Composite Service)
```
Woodpecker Pipeline
├─ Service: harb/integration (contains full stack)
│  └─ Manages internal docker-compose lifecycle
├─ Step: wait-for-stack (30-60s)
└─ Step: run-e2e-tests (Playwright only)
```

**Benefits**:
- ✅ **3-5 minutes faster** - Stack starts in parallel with pipeline setup
- ✅ **Simpler** - No DinD setup in pipeline steps; standard service pattern
- ✅ **Reliable** - Single health check, clearer failure modes
- ✅ **Reusable** - Same image for local testing and CI

## Components

### 1. Integration Image (`docker/Dockerfile.integration`)
- Base: `docker:27-dind`
- Bundles: Full project + docker-compose
- Entrypoint: Starts dockerd + Harb stack automatically
- Healthcheck: Validates GraphQL endpoint is responsive

### 2. CI Compose File (`docker-compose.ci.yml`)
- Simplified interface for local testing
- Exposes port 8081 for stack access
- Persists Docker state in named volume

### 3. New E2E Pipeline (`.woodpecker/e2e-new.yml`)
- Service: `harb/integration` (stack)
- Step 1: Wait for stack health
- Step 2: Run Playwright tests
- Step 3: Collect artifacts

### 4. Build Script (`scripts/build-integration-image.sh`)
- Builds integration image
- Pushes to registry
- Includes local testing instructions

## Migration Steps

### 1. Build the Integration Image

```bash
# Build locally
./scripts/build-integration-image.sh

# Or with custom registry
REGISTRY=localhost:5000 ./scripts/build-integration-image.sh
```
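
The build script itself is not reproduced in this guide. As a rough sketch of how a `REGISTRY` override could feed into the image reference, the helper below assumes the script defaults to `registry.sovraigns.network` and tags `harb/integration:latest` (both inferred from the commands in this guide, not read from the script). The function name `image_ref` and the optional `TAG` variable are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT part of scripts/build-integration-image.sh.
# Assumes the default registry and tag that appear elsewhere in this guide.
image_ref() {
  local registry="${REGISTRY:-registry.sovraigns.network}"
  local tag="${TAG:-latest}"   # TAG is an assumed, optional override
  printf '%s/harb/integration:%s\n' "$registry" "$tag"
}

# Default registry
image_ref

# Custom registry, scoped to this single call
REGISTRY=localhost:5000 image_ref
```

Resolving the reference in one place keeps the build and push commands pointed at the same image name.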

### 2. Push to Registry

```bash
# Log in to the registry (if using the sovraigns.network registry)
docker login registry.sovraigns.network -u ciuser

# Push
docker push registry.sovraigns.network/harb/integration:latest
```

### 3. Activate New Pipeline

```bash
# Backup old E2E pipeline
mv .woodpecker/e2e.yml .woodpecker/e2e-old.yml

# Activate new pipeline
mv .woodpecker/e2e-new.yml .woodpecker/e2e.yml

# Commit changes (stage the whole .woodpecker/ directory so the renames,
# including the e2e-old.yml backup needed for rollback, are recorded)
git add .woodpecker/ docker/ scripts/build-integration-image.sh
git commit -m "ci: migrate E2E to composite integration service"
```

### 4. Update CI Image Build Workflow

Add this to the release pipeline, or create a dedicated workflow:

```yaml
# .woodpecker/build-ci-images.yml
kind: pipeline
type: docker
name: build-integration-image

when:
  event:
    - push
    - tag
  branch:
    - main
    - master

steps:
  - name: build-and-push
    image: docker:27-dind
    privileged: true
    environment:
      DOCKER_HOST: tcp://docker:2375
      REGISTRY_USER:
        from_secret: registry_user
      REGISTRY_PASSWORD:
        from_secret: registry_password
    commands:
      - docker login registry.sovraigns.network -u $REGISTRY_USER -p $REGISTRY_PASSWORD
      - ./scripts/build-integration-image.sh
      - docker push registry.sovraigns.network/harb/integration:latest
```

## Local Testing

### Test Integration Image Directly

```bash
# Start the stack container
docker run --rm --privileged -p 8081:8081 \
  registry.sovraigns.network/harb/integration:latest

# Wait for health (in another terminal)
curl http://localhost:8081/api/graphql

# Run E2E tests against it
npm run test:e2e
```
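
The manual `curl` check above can be turned into a polling loop so the tests only start once the stack answers. This is a minimal sketch, assuming a plain GET on the GraphQL endpoint returns a non-error status once the stack is up. The function name `wait_for_stack` and its defaults (30 attempts, 5 seconds apart) are illustrative and not taken from the pipeline.

```bash
#!/usr/bin/env bash
# Poll a URL until it answers successfully or the attempts run out.
# Usage: wait_for_stack URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_stack() {
  local url="$1" attempts="${2:-30}" delay="${3:-5}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors (>= 400) as failure; -s: silent; -o: discard body
    if curl -fs -o /dev/null "$url"; then
      echo "stack healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "stack not healthy after $attempts attempts" >&2
  return 1
}

# Example (not executed here):
#   wait_for_stack http://localhost:8081/api/graphql 30 5 && npm run test:e2e
```

Returning a non-zero status on timeout lets the caller chain the test command with `&&`, so a stack that never comes up fails fast instead of producing misleading test errors.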

### Test via docker-compose.ci.yml

```bash
# Start stack
docker-compose -f docker-compose.ci.yml up -d

# Wait for healthy
docker-compose -f docker-compose.ci.yml ps

# Run tests
npm run test:e2e

# Cleanup
docker-compose -f docker-compose.ci.yml down -v
```

## Rollback Plan

If issues arise, revert to the old pipeline:

```bash
# Restore old pipeline
mv .woodpecker/e2e-old.yml .woodpecker/e2e.yml

# Commit (stage the whole directory so the removal of e2e-old.yml is recorded)
git add .woodpecker/
git commit -m "ci: rollback to DinD E2E pipeline"
git push
```
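
The rollback depends on the `e2e-old.yml` backup still being present. A small guard can make that explicit before anything is overwritten. This is only a sketch: `restore_old_pipeline` is a hypothetical helper name, and the directory argument exists so it can be tried outside the repository.

```bash
#!/usr/bin/env bash
# Hypothetical guard: restore the old pipeline only if the backup exists.
# Usage: restore_old_pipeline [PIPELINE_DIR]   (defaults to .woodpecker)
restore_old_pipeline() {
  local dir="${1:-.woodpecker}"
  if [ ! -f "$dir/e2e-old.yml" ]; then
    echo "no backup found at $dir/e2e-old.yml; nothing restored" >&2
    return 1
  fi
  # Overwrites the current e2e.yml with the backed-up DinD pipeline
  mv -f "$dir/e2e-old.yml" "$dir/e2e.yml"
  echo "restored $dir/e2e.yml from backup"
}
```

Because the current `e2e.yml` is overwritten, the composite pipeline remains recoverable only through the Git history.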

## Performance Comparison

| Metric | Before (DinD) | After (Composite) | Improvement |
|--------|---------------|-------------------|-------------|
| Stack startup | ~180-240s | ~60-90s | **~2-3 min faster** |
| Total E2E time | ~8-10 min | ~5-6 min | **~40% faster** |
| Complexity | High (nested) | Low (standard) | Simpler |
| Reliability | Medium | High | More stable |
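
As a quick check of the "~40% faster" figure, the reduction can be computed from the table's ranges. The sketch below uses integer arithmetic on the midpoints (9 min and 5.5 min, expressed in seconds); `percent_faster` is an illustrative helper, not part of the repository.

```bash
#!/usr/bin/env bash
# Percentage reduction from BEFORE to AFTER (same unit), integer math.
percent_faster() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

# Midpoints of the table's ranges, in seconds: 9 min -> 5.5 min
percent_faster 540 330   # prints 38
```

The upper ends of the ranges (10 min down to 6 min) give exactly 40, so the table's rounded figure is consistent with its own numbers.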

## Troubleshooting

### Image build fails
```bash
# Check kraiken-lib builds successfully
./scripts/build-kraiken-lib.sh

# Build with verbose output
docker build -f docker/Dockerfile.integration --progress=plain .
```

### Stack doesn't start in CI
```bash
# Check service logs in Woodpecker
# Services run detached, logs available via Woodpecker UI

# Test locally first
docker run --rm --privileged -p 8081:8081 \
  registry.sovraigns.network/harb/integration:latest
```

### Healthcheck times out
- Default timeout: 120s start period + 30 retries × 5s = ~270s max
- First run is slower (pulling images, building)
- Subsequent runs use cached layers (~60-90s)
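
The ~270s ceiling follows directly from the healthcheck parameters. The sketch below reproduces that arithmetic so the effect of changing a parameter is easy to see. It is a simplification that ignores per-probe timeouts, and `max_health_wait` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Upper bound, in seconds, before the container is marked unhealthy.
# Simplification: start period plus retries times interval, no probe timeouts.
max_health_wait() {
  local start_period="$1" retries="$2" interval="$3"
  echo $(( start_period + retries * interval ))
}

# Values from this guide: 120s start period, 30 retries, 5s interval
max_health_wait 120 30 5   # prints 270
```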

## Future Improvements

1. **Multi-stage build** - Separate build and runtime images
2. **Layer caching** - Optimize Dockerfile for faster rebuilds
3. **Parallel services** - Start independent services concurrently
4. **Resource limits** - Add memory/CPU constraints for CI
5. **Image variants** - Separate images for different test suites

## Podman to Docker Migration

As part of this work, the Woodpecker agent was migrated from Podman to Docker:

**Changes made**:
- Updated `/etc/woodpecker/agent.env`:
  - `WOODPECKER_BACKEND_DOCKER_HOST=unix:///var/run/docker.sock`
- Added `ci` user to `docker` group
- Restarted `woodpecker-agent` service

**Agent label update** (optional, cosmetic):
```bash
# /etc/woodpecker/agent.env
# Previously podman=true. Keep comments on their own line so they
# cannot end up as part of the value.
WOODPECKER_AGENT_LABELS=docker=true
```
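
After editing the agent configuration, it can help to confirm what the file actually contains before restarting the service. This is a minimal sketch that reads a key from a `KEY=VALUE` file. `env_value` is a hypothetical helper, and it assumes one assignment per line with no quoting.

```bash
#!/usr/bin/env bash
# Print the value of KEY from a KEY=VALUE env file (last assignment wins).
# Usage: env_value FILE KEY
env_value() {
  local file="$1" key="$2"
  # Strip only the leading "KEY=" so values that contain "=" themselves
  # (such as docker=true) are printed unchanged.
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Example (not executed here):
#   env_value /etc/woodpecker/agent.env WOODPECKER_BACKEND_DOCKER_HOST
```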

## Questions?

See `CLAUDE.md` for overall stack architecture and `INTEGRATION_TEST_STATUS.md` for E2E test details.