# kelos **Repository Path**: git.fccfc.com/kelos ## Basic Information - **Project Name**: kelos - **Description**: No description available - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2026-06-11 - **Last Updated**: 2026-06-20 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README

Kelos

Orchestrate autonomous AI coding agents on Kubernetes.

CI Release GitHub Stars Go Version License

Quick Start · Kelos Skill · Examples · Integration · Reference · YAML Manifests

Kelos is a Kubernetes-native framework for AI coding agents. It does two things: 1. **Defines the agent and the environment it runs in as one unit** — the prompt, model, instructions, plugins, MCP servers, git workspace, credentials, and Pod resources all live together as Kubernetes resources you can version-control. 2. **Defines how agents integrate with your workflows** — trigger runs from GitHub issues, PRs, webhooks, Linear, Jira, schedules, or any HTTP source, and chain agents into pipelines. Supports **Claude Code**, **OpenAI Codex**, **Google Gemini**, **OpenCode**, **Cursor**, and [custom agent images](docs/agent-image-interface.md). ## How It Works kelos-resources You define what needs to be done, and Kelos handles the "how" — from cloning the right repo and injecting credentials to running the agent and capturing its outputs (branch names, commit SHAs, PR URLs, and token usage). ### Core Primitives Kelos is built on four resources, grouped by the two concerns above: **Defining the agent and its environment** - **Tasks** — A single agent run: prompt, model, effort, credentials, and Pod-level overrides. - **Workspaces** — The git repository (URL, ref, auth) the agent operates in. - **AgentConfigs** — Reusable bundles of agent instructions (`AGENTS.md`, `CLAUDE.md`), plugins (skills and agents), and MCP servers. **Integrating with workflows** - **TaskSpawners** — React to external triggers (GitHub Issues/PRs, webhooks, Linear, Jira, Cron, Generic Webhooks) and create Tasks automatically. ## Why Kelos? AI coding agents are evolving from interactive CLI tools into autonomous background workers — managed like infrastructure, not invoked like commands. Kelos provides the framework to manage this transition at scale. - **Workflow as YAML** — Define your development workflow declaratively: what triggers agents, what they do, and how they hand off. Version-control it, review it in PRs, and GitOps it like any other infrastructure. - **Orchestration, not just execution** — Don't just run an agent; manage its entire lifecycle. Chain tasks with `dependsOn` and pass results (branch names, PR URLs, token usage) between pipeline stages. Use `TaskSpawner` to build event-driven workers that react to GitHub issues, PRs, or schedules. - **Host-isolated autonomy** — Each task runs in an isolated, ephemeral Pod with a freshly cloned git workspace. Agents have no access to your host machine — use [scoped tokens and branch protection](#security-considerations) to control repository access. - **Standardized interface** — Plug in any agent (Claude, Codex, Gemini, OpenCode, Cursor, or your own) using a simple [container interface](docs/agent-image-interface.md). Kelos handles credential injection, workspace management, and Kubernetes plumbing. - **Scalable parallelism** — Fan out agents across multiple repositories. Kubernetes handles scheduling, resource management, and queueing — scale is limited by your cluster capacity and API provider quotas. - **Observable & CI-native** — Every agent run is a first-class Kubernetes resource with deterministic outputs (branch names, PR URLs, commit SHAs, token usage) captured into status. Monitor via `kubectl`, manage via the `kelos` CLI or declarative YAML (GitOps-ready), and integrate with ArgoCD or GitHub Actions. ## Quick Start Get running in 5 minutes (most of the time is gathering credentials). ### Prerequisites - Kubernetes cluster (1.28+) - cert-manager installed and ready
Don't have a cluster? Create one locally with kind 1. [Install kind](https://kind.sigs.k8s.io/docs/user/quick-start/#installation) (requires Docker) 2. Create a cluster: ```bash kind create cluster ``` This creates a single-node cluster and configures your kubeconfig automatically.
### 1. Install the CLI Install using the script: ```bash curl -fsSL https://raw.githubusercontent.com/kelos-dev/kelos/main/hack/install.sh | bash ``` Or using Homebrew: ```bash brew tap kelos-dev/tap brew install kelos ```
Alternative: install from source ```bash go install github.com/kelos-dev/kelos/cmd/kelos@latest ```
### 2. Install Kelos Kelos uses cert-manager to issue the CRD conversion webhook certificate. Install cert-manager first if your cluster does not already have it. Follow the [cert-manager installation documentation](https://cert-manager.io/docs/installation/) for the current recommended installation method. ```bash kelos install ``` This installs the Kelos controller and CRDs into the `kelos-system` namespace. For chart-native customization, pass Helm values to `kelos install`: ```bash kelos install -f values.yaml kelos install --set webhookServer.sources.github.enabled=true ``` `kelos install` manages CRDs separately, so `crds.install` must be omitted or set to `false`. `kelos install --dry-run` prints controller manifests only; CRDs are staged separately during real installs after certificate and conversion webhook readiness. For the full values schema and advanced examples, see [the Helm chart README](internal/manifests/charts/kelos/README.md). Verify the installation: ```bash kubectl get pods -n kelos-system kubectl get crds | grep kelos.dev ``` ### Helm Install Kelos also publishes a Helm chart as an OCI artifact in GHCR. To install Kelos with Helm: ```bash helm upgrade --install kelos oci://ghcr.io/kelos-dev/charts/kelos \ -n kelos-system \ --create-namespace \ --version \ --set crds.install=true ``` This installs the controller resources and Kelos CRDs. The chart default leaves CRD rendering disabled for clusters where CRDs are managed by `kelos install` or another manifest workflow. Use `kelos install` for the staged controller-plus-CRD install, or see the Helm chart README for fresh-install, upgrade, and adoption flows. For CRD migration, adopting existing CRDs into Helm ownership, and advanced chart usage, see [the Helm chart README](internal/manifests/charts/kelos/README.md). ### 3. Initialize Your Config ```bash kelos init ``` Edit `~/.kelos/config.yaml`: ```yaml oauthToken: workspace: repo: https://github.com/your-org/your-repo.git ref: main token: # optional, for private repos and pushing changes ```
How to get your credentials **Claude OAuth token** (recommended for Claude Code): Run `claude setup-token` locally and follow the prompts. This generates a long-lived token (valid for ~1 year). Copy the token from `~/.claude/credentials.json`. **Anthropic API key** (alternative for Claude Code): Create one at [console.anthropic.com](https://console.anthropic.com). Set `apiKey` instead of `oauthToken` in your config. **Codex OAuth credentials** (for OpenAI Codex): Run `codex auth login` locally, then reference the auth file in your config: ```yaml oauthToken: "@~/.codex/auth.json" type: codex ``` For production installs, you can keep Codex OAuth credentials fresh by labeling each credentials Secret with `kelos.dev/codex-oauth-refresh=true`. Kelos creates one CronJob per labeled Secret; each CronJob uses Codex's file-backed auth refresh and writes the updated `auth.json` back to that Secret. See the [Helm chart README](internal/manifests/charts/kelos/README.md#codex-oauth-token-refresh) for values and setup. Or set `apiKey` with an OpenAI API key instead. **Gemini API key** (for Google Gemini): Create one at [aistudio.google.com/app/apikey](https://aistudio.google.com/app/apikey). Set `apiKey` in your config and use `type: gemini`. **Cursor API key** (for Cursor CLI): Obtain one from the Integrations tab at [cursor.com/dashboard](https://cursor.com/dashboard). Set `apiKey` in your config and use `type: cursor`. **GitHub token** (for pushing branches and creating PRs): Create a [Personal Access Token](https://github.com/settings/tokens) with `repo` scope (and `workflow` if your repo uses GitHub Actions). **GitHub App** (recommended for production/org use): For organizations, [GitHub Apps](https://docs.github.com/en/apps) are preferred over PATs — they offer fine-grained permissions, higher rate limits, and don't depend on a specific user account. Use `githubApp` instead of `token` in your workspace config: ```yaml workspace: repo: https://github.com/your-org/repo.git ref: main githubApp: appID: "12345" installationID: "67890" privateKeyPath: ~/.config/my-app.private-key.pem ``` See the [Workspace reference](docs/reference.md#workspace) for details.
> **Warning:** Without a workspace, the agent runs in an ephemeral pod — any files it creates are lost when the pod terminates. Always set up a workspace to get persistent results. ### 4. Run Your First Task ```bash $ kelos run -p "Add a hello world program in Python" task/task-r8x2q created $ kelos logs task-r8x2q -f ``` The task name (e.g. `task-r8x2q`) is auto-generated. Use `--name` to set a custom name, or `-w` to watch task status after creation. To stream agent logs, run `kelos logs -f`. You can also read the prompt from a file with `--prompt-file`, or pipe it from stdin: ```bash $ kelos run --prompt-file prompt.txt $ echo "Fix the flaky test" | kelos run --prompt-file - ``` The agent clones your repo, makes changes, and can push a branch or open a PR. > **Tip:** If something goes wrong, check the controller logs with > `kubectl logs deployment/kelos-controller-manager -n kelos-system`.
Using kubectl and YAML instead of the CLI Create a `Workspace` resource to define a git repository: ```yaml apiVersion: kelos.dev/v1alpha2 kind: Workspace metadata: name: my-workspace spec: repo: https://github.com/your-org/your-repo.git ref: main ``` Then reference it from a `Task`: ```yaml apiVersion: kelos.dev/v1alpha2 kind: Task metadata: name: hello-world spec: type: claude-code prompt: "Create a hello world program in Python" credentials: type: oauth secretRef: name: claude-oauth-token workspaceRef: name: my-workspace ``` ```bash kubectl apply -f workspace.yaml kubectl apply -f task.yaml kubectl get tasks -w ```
Using an API key instead of OAuth Set `apiKey` instead of `oauthToken` in `~/.kelos/config.yaml`: ```yaml apiKey: ``` Or pass `--secret` to `kelos run` with a pre-created secret (api-key is the default credential type), or set `spec.credentials.type: api-key` in YAML.
## Kelos Skill The [Kelos skill](skills/kelos/) teaches AI coding agents how to author and operate Kelos resources. Install it via [skills.sh](https://skills.sh): ```bash npx skills add kelos-dev/kelos ``` Then ask your agent: ``` Using the /kelos skill, set up a TaskSpawner that watches GitHub issues labeled "bug" and auto-creates Tasks to fix them. ``` The agent will generate the correct manifests, apply them, and troubleshoot any issues on your behalf. ## Examples ### Auto-fix GitHub issues with TaskSpawner Create a TaskSpawner to automatically turn GitHub issues into agent tasks: ```yaml apiVersion: kelos.dev/v1alpha2 kind: TaskSpawner metadata: name: fix-bugs spec: when: githubIssues: labels: [bug] state: open pollInterval: 5m taskTemplate: type: claude-code workspaceRef: name: my-workspace credentials: type: oauth secretRef: name: claude-oauth-token promptTemplate: "Fix: {{.Title}}\n{{.Body}}" ``` ```bash kubectl apply -f taskspawner.yaml ``` TaskSpawner polls for new issues matching your filters and creates a Task for each one. ### Chain tasks into pipelines Use `dependsOn` to chain tasks into pipelines. A task in `Waiting` phase stays paused until all its dependencies succeed: ```bash kelos run -p "Scaffold a new user service" --name scaffold --branch feature/user-service kelos run -p "Write tests for the user service" --depends-on scaffold --branch feature/user-service ``` Tasks sharing the same `branch` are serialized automatically — only one runs at a time.
YAML equivalent ```yaml apiVersion: kelos.dev/v1alpha2 kind: Task metadata: name: scaffold spec: type: claude-code prompt: "Scaffold a new user service with CRUD endpoints" credentials: type: oauth secretRef: name: claude-oauth-token workspaceRef: name: my-workspace branch: feature/user-service --- apiVersion: kelos.dev/v1alpha2 kind: Task metadata: name: write-tests spec: type: claude-code prompt: "Write comprehensive tests for the user service" credentials: type: oauth secretRef: name: claude-oauth-token workspaceRef: name: my-workspace branch: feature/user-service dependsOn: [scaffold] ```
Downstream tasks can reference upstream results in their prompt using `{{.Deps}}`: ```yaml apiVersion: kelos.dev/v1alpha2 kind: Task metadata: name: open-pr spec: type: claude-code prompt: | Open a PR for branch {{index .Deps "write-tests" "Results" "branch"}}. credentials: type: oauth secretRef: name: claude-oauth-token workspaceRef: name: my-workspace branch: feature/user-service dependsOn: [write-tests] ``` The `.Deps` map is keyed by dependency Task name. Each entry has `Results` (key-value map with branch, commit, pr, etc.) and `Outputs` (raw output lines). See [examples/07-task-pipeline](examples/07-task-pipeline/) for a full three-stage pipeline. ### Create PRs automatically Add a `token` to your workspace config: ```yaml workspace: repo: https://github.com/your-org/repo.git ref: main token: ``` ```bash kelos run -p "Fix the bug described in issue #42 and open a PR with the fix" ``` The `gh` CLI and `GITHUB_TOKEN` are available inside the agent container, so the agent can push branches and create PRs autonomously. ### Inject agent instructions and MCP servers Use `AgentConfig` to bundle project-wide instructions, plugins, and MCP servers: ```yaml apiVersion: kelos.dev/v1alpha2 kind: AgentConfig metadata: name: my-config spec: agentsMD: | # Project Rules Follow TDD. Always write tests first. mcpServers: - name: github type: http url: https://api.githubcopilot.com/mcp/ headers: Authorization: "Bearer " ``` ```bash kelos run -p "Fix the bug" --agent-config my-config ``` - `agentsMD` is written to the agent's user-level instructions file (additive with the repo's own instructions). The destination depends on the agent type: `~/.claude/CLAUDE.md` for Claude Code, `~/.gemini/GEMINI.md` for Gemini, `~/.codex/AGENTS.md` for Codex, `~/.config/opencode/AGENTS.md` for OpenCode, and `~/.cursor/AGENTS.md` for Cursor. - `plugins` are mounted as plugin directories and passed via `--plugin-dir`. - `mcpServers` are written to the agent's native MCP configuration. Supports `stdio`, `http`, and `sse` transport types. See the [full AgentConfig spec](docs/reference.md#agentconfig) for plugins, skills, and agents configuration. > Browse all ready-to-apply YAML manifests in the [`examples/`](examples/) directory. ## Integration Kelos integrates with external systems in two ways: **TaskSpawner** — Kelos natively watches external sources and automatically creates Tasks. Supports GitHub Issues, GitHub Pull Requests, GitHub Webhooks, Linear Webhooks, Jira, Cron schedules, and Generic Webhooks (for arbitrary HTTP POST sources like Sentry, Notion, or Slack). No glue code needed. ```yaml spec: when: githubIssues: labels: [bug] state: open ``` **Direct Task creation** — Create Task resources from your own workflows for full control. Any system that can run `kubectl apply` or call the Kubernetes API can trigger agent runs — GitHub Actions, CI/CD pipelines, scripts, Slack bots, or custom automation. ```bash kelos run -p "Fix the flaky test in ci_test.go" --workspace my-workspace ``` See the [Integration guide](docs/integration.md) for examples of both approaches, including GitHub Actions workflows, Jira setup, and programmatic Task creation. ## Orchestration Patterns - **Autonomous Self-Development** — Build a feedback loop where agents pick up issues, write code, self-review, and fix CI flakes until the task is complete. Kelos itself is developed this way — see [Case Study: Kelos Developing Kelos](#case-study-kelos-developing-kelos) below. - **Event-Driven Bug Fixing** — Automatically spawn agents to investigate and fix bugs as soon as they are labeled in GitHub. See [Auto-fix GitHub issues](#auto-fix-github-issues-with-taskspawner). - **Fleet-Wide Refactoring** — Orchestrate a "fan-out" where dozens of agents apply the same refactoring pattern across a fleet of microservices in parallel. - **Hands-Free CI/CD** — Embed agents as first-class steps in your deployment pipelines to generate documentation or perform automated migrations. - **AI Worker Pools** — Maintain a pool of specialized agents (e.g., "The Security Fixer") that developers can trigger via simple Kubernetes resources. ## Case Study: Kelos Developing Kelos Kelos develops Kelos. TaskSpawners run 24/7, each handling a different part of the development lifecycle — triaging issues, planning implementations, fixing bugs, responding to PR feedback, reviewing code, squashing commits, updating agent images, testing DX, brainstorming improvements, and tuning their own prompts and configs. See the [`self-development/` README](self-development/README.md) for the full pipeline: manifests, triggers, models, and setup instructions. The same pattern develops the sibling [`kelos-dev/kanon`](https://github.com/kelos-dev/kanon) project — see [`kanon-development/`](kanon-development/README.md). ## Reference | Resource | Key Fields | Full Spec | |----------|-----------|-----------| | **Task** | `type`, `prompt`, `credentials`, `workspaceRef`, `dependsOn`, `branch` | [Reference](docs/reference.md#task) | | **Workspace** | `repo`, `ref`, `secretRef` (PAT or GitHub App), `files`, `setupCommand` | [Reference](docs/reference.md#workspace) | | **AgentConfig** | `agentsMD`, `plugins`, `mcpServers` | [Reference](docs/reference.md#agentconfig) | | **TaskSpawner** | `when`, `taskTemplate`, per-source `pollInterval`, `maxConcurrency` | [Reference](docs/reference.md#taskspawner) |
CLI Reference | Command | Description | |---------|-------------| | `kelos install` | Install Kelos CRDs and controller into the cluster | | `kelos uninstall` | Uninstall Kelos from the cluster | | `kelos init` | Initialize `~/.kelos/config.yaml` | | `kelos run` | Create and run a new Task | | `kelos get [name]` | List resources or view a specific resource (`tasks`, `taskspawners`, `workspaces`, `agentconfigs`) | | `kelos delete [name]` | Delete a resource (`tasks`, `taskspawners`, `workspaces`, `agentconfigs`); supports `--all` to delete every resource of that type in the namespace | | `kelos logs [-f]` | View or stream logs from a task | | `kelos suspend taskspawner ` | Pause a TaskSpawner | | `kelos resume taskspawner ` | Resume a paused TaskSpawner | | `kelos completion ` | Generate a shell completion script (`bash`, `zsh`, `fish`, `powershell`); also completes resource names dynamically from the cluster | See [full CLI reference](docs/reference.md#cli-reference) and [Shell Completion](docs/reference.md#shell-completion) for all flags, options, and setup instructions.
## Security Considerations Kelos runs agents in isolated, ephemeral Pods with no access to your host machine, SSH keys, or other processes. The risk surface is limited to what the injected credentials allow. **What agents CAN do:** Push branches, create PRs, and call the GitHub API using the injected `GITHUB_TOKEN`. **What agents CANNOT do:** Access your host, read other pods, reach other repositories, or access any credentials beyond what you explicitly inject. Best practices: - **Scope your GitHub tokens.** Use [fine-grained Personal Access Tokens](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#fine-grained-personal-access-tokens) restricted to specific repositories instead of broad `repo`-scoped classic tokens. - **Enable branch protection.** Require PR reviews before merging to `main`. Agents can push branches and open PRs, but protected branches prevent direct pushes to your default branch. - **Use `maxConcurrency` and `maxTotalTasks`.** Limit how many tasks a TaskSpawner can create to prevent runaway agent activity. - **Use `podOverrides.activeDeadlineSeconds`.** Set a timeout to prevent tasks from running indefinitely. - **Audit via Kubernetes.** Every agent run is a first-class Kubernetes resource — use `kubectl get tasks` and cluster audit logs to track what was created and by whom. > **About `--dangerously-skip-permissions`:** Claude Code uses this flag for non-interactive operation. Despite the name, the actual risk is minimal — agents run inside ephemeral containers with no host access. The flag simply disables interactive approval prompts, which is necessary for autonomous execution. Kelos uses standard Kubernetes RBAC — use namespace isolation to separate teams. Each TaskSpawner automatically creates a scoped ServiceAccount and RoleBinding. ## Cost and Limits Running AI agents costs real money. Here's how to stay in control: **Model and effort costs vary significantly.** Use `spec.model` (or `model` in config) to choose the model for each workflow, and use `spec.effort` (or `effort` in config) to lower reasoning effort for routine work or raise it for complex reviews and planning. Check your provider's pricing page for current rates. **Use `maxConcurrency` to cap spend.** Without it, a TaskSpawner can create unlimited concurrent tasks. If 100 issues match your filter on first poll, that's 100 simultaneous agent runs. Always set a limit: ```yaml spec: maxConcurrency: 3 # max 3 tasks running at once maxTotalTasks: 50 # stop after 50 total tasks ``` **Use `podOverrides.activeDeadlineSeconds` to limit runtime.** Set a timeout per task to prevent agents from running indefinitely: ```yaml spec: podOverrides: activeDeadlineSeconds: 3600 # kill after 1 hour ``` Or via the CLI: ```bash kelos run -p "Fix the bug" --timeout 30m ``` **Use `suspend` for emergencies.** If costs are spiraling, pause a spawner immediately: ```bash kelos suspend taskspawner my-spawner # ... investigate ... kelos resume taskspawner my-spawner ``` **Rate limits.** API providers enforce concurrency and token limits. If a task hits a rate limit mid-execution, it will likely fail. Use `maxConcurrency` to stay within your provider's limits. ## FAQ
What agents does Kelos support? Kelos supports **Claude Code**, **OpenAI Codex**, **Google Gemini**, **OpenCode**, and **Cursor** out of the box. You can also bring your own agent image using the [container interface](docs/agent-image-interface.md).
Can I use Kelos without Kubernetes? No. Kelos is built on Kubernetes Custom Resources and requires a Kubernetes cluster. For local development, use [kind](https://kind.sigs.k8s.io/) (`kind create cluster`) to create a single-node cluster on your machine.
Is it safe to give agents repo access? Agents run in isolated, ephemeral Pods with no host access. Their capabilities are limited to what you inject — typically a scoped GitHub token. Use fine-grained PATs, branch protection, and `maxConcurrency` to control the blast radius. See [Security Considerations](#security-considerations).
How much does it cost to run? Costs depend on the model and task complexity. Check the [API pricing](https://docs.anthropic.com/en/docs/about-claude/pricing) page for current rates. Use `maxConcurrency`, timeouts, and model selection to stay in budget. See [Cost and Limits](#cost-and-limits).
## Uninstall ```bash kelos uninstall ``` ## Development Build, test, and iterate with `make`: ```bash make update # generate code, CRDs, fmt, tidy make verify # generate + vet + tidy-diff check make test # unit tests make test-integration # integration tests (envtest) make test-e2e # e2e tests (requires cluster) make build # build binary make image # build docker image ``` ## Contributing 1. Fork the repo and create a feature branch. 2. Make your changes and run `make verify` to ensure everything passes. 3. Open a pull request with a clear description of the change. For significant changes, please open an issue first to discuss the approach. We welcome contributions of all kinds — see [good first issues](https://github.com/kelos-dev/kelos/labels/good%20first%20issue) for places to start. ## License [Apache License 2.0](LICENSE)