# pytest-agent-platform

**Repository Path**: brownz/pytest-agent-platform

## Basic Information

- **Project Name**: pytest-agent-platform
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-07
- **Last Updated**: 2026-05-22

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Pytest Agent Platform

Agent-driven pytest test case generation platform with closed-loop fixing.

## Overview

This platform automatically generates pytest test cases from ICD JSON Schema definitions and AREX traffic recordings. A LangChain-powered Agent generates the tests, and a Harness execution engine closes the feedback loop: when tests fail, the Agent analyzes the failure, fixes the generated test code, and re-executes until all tests pass or the retry limit (3 by default) is reached.

```
ICD Schema  ──→ [SchemaParser] ──→ ContractIR ──┐
                                                  ├──→ [LangChain Agent + Jinja2] ──→ pytest
AREX API  ────→ [ArexClient] ────→ TrafficIR ────┘
                                                       │
                                                       ▼
                                                  [Harness Execute]
                                                       │
                                           ┌── All Pass? ──→ Done
                                           │
                                           No → [ErrorClassifier] → Agent Fix → Loop (max 3)
```

## Requirements

- Python >= 3.10
- Access to an LLM API (OpenAI-compatible)
- AREX recording service (optional, for regression tests)

## Installation

```bash
git clone <repo-url>
cd pytest-agent-platform
pip install -e ".[dev]"
```

## Quick Start

### 1. Set environment variables

```bash
# LLM API configuration (required)
export LLM_API_KEY="sk-..."
export LLM_MODEL="gpt-4o"           # default: gpt-4o
export LLM_BASE_URL="https://..."   # optional, for custom endpoints

# AREX configuration (optional)
export AREX_BASE_URL="http://arex.example.com"
export AREX_API_KEY="your-arex-key"
```

### 2. Create an ICD JSON Schema

```json
{
    "method": "POST",
    "path": "/api/orders",
    "request": {
        "type": "object",
        "required": ["amount", "currency"],
        "properties": {
            "amount": {"type": "number", "minimum": 0.01},
            "currency": {"type": "string", "enum": ["USD", "CNY"]}
        }
    },
    "response": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "status": {"type": "string", "enum": ["pending", "confirmed"]}
        }
    }
}
```
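To give a feel for what parsing such a schema involves, here is a minimal sketch that flattens the `request` section into per-field records. `FieldSpec` and `parse_fields` are hypothetical names used only for illustration; the platform's actual `ContractIR` model may be shaped differently.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FieldSpec:
    """Hypothetical flattened view of one schema property (not the real ContractIR)."""
    name: str
    type: str
    required: bool
    constraints: dict[str, Any] = field(default_factory=dict)


def parse_fields(section: dict[str, Any]) -> list[FieldSpec]:
    """Flatten one object-typed schema section into FieldSpec records."""
    required = set(section.get("required", []))
    specs = []
    for name, prop in section.get("properties", {}).items():
        # Everything except "type" is kept as a constraint (enum, minimum, ...).
        constraints = {k: v for k, v in prop.items() if k != "type"}
        specs.append(FieldSpec(name, prop.get("type", "any"), name in required, constraints))
    return specs


schema = json.loads("""
{
    "method": "POST",
    "path": "/api/orders",
    "request": {
        "type": "object",
        "required": ["amount", "currency"],
        "properties": {
            "amount": {"type": "number", "minimum": 0.01},
            "currency": {"type": "string", "enum": ["USD", "CNY"]}
        }
    }
}
""")

for spec in parse_fields(schema["request"]):
    print(spec)
```

Keeping the constraints in a plain dictionary makes it straightforward for a later step to derive boundary and negative cases from them.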

### 3. Run

```bash
# Generate tests only
pytest-agent generate --schema myschema.json --api-name create_order --base-url http://localhost:8000

# Execute existing tests via Harness
pytest-agent run --test-file tests/generated/test_create_order.py

# Full auto closed loop (generate → execute → fix → retry)
pytest-agent auto --schema myschema.json --api-name create_order --base-url http://localhost:8000
```

## CLI Commands

| Command | Description |
|---------|-------------|
| `generate` | Parse ICD schema (+ AREX recordings), generate pytest files |
| `run` | Execute a pytest file via Harness, display structured results |
| `auto` | Full closed loop — generate, execute, classify failures, fix, retry |

### `generate` options

| Option | Default | Description |
|--------|---------|-------------|
| `--schema` | (required) | Path to JSON Schema file |
| `--api-name` | `""` | API name for AREX recording lookup |
| `--base-url` | `http://localhost:8000` | Target API base URL |
| `--output-dir` | `tests/generated` | Output directory for test files |

### `run` options

| Option | Default | Description |
|--------|---------|-------------|
| `--test-file` | (required) | Path to pytest file to execute |
| `--timeout` | `60` | Execution timeout in seconds |
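The timeout behaviour can be pictured with the sketch below, which runs a command in a child process and reports a timeout instead of hanging. `RunResult`, `run_command`, and `run_pytest` are hypothetical names, and the pytest flags are an assumption; the actual `PytestRunner` may differ.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    """Hypothetical container for the outcome of one child process."""
    returncode: Optional[int]  # None when the run timed out
    stdout: str
    timed_out: bool


def run_command(cmd: list[str], timeout: float = 60) -> RunResult:
    """Run a command, capturing stdout and enforcing a timeout in seconds."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        # Partial output may arrive as bytes even when text=True was requested.
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return RunResult(None, out, True)
    return RunResult(proc.returncode, proc.stdout, False)


def run_pytest(test_file: str, timeout: float = 60) -> RunResult:
    """Run pytest on one file using the current interpreter (flags are an assumption)."""
    return run_command([sys.executable, "-m", "pytest", test_file, "-v", "--tb=short"], timeout)


if __name__ == "__main__":
    print(run_command([sys.executable, "-c", "print('hello')"], timeout=10))
```

Running the tests in a separate process keeps a hanging or crashing test from taking the platform down with it.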

### `auto` options

| Option | Default | Description |
|--------|---------|-------------|
| `--schema` | (required) | Path to JSON Schema file |
| `--api-name` | `""` | API name for AREX recording lookup |
| `--base-url` | `http://localhost:8000` | Target API base URL |
| `--output-dir` | `tests/generated` | Output directory |
| `--max-retries` | `3` | Maximum fix retry count |

## Project Structure

```
src/
├── ir/
│   └── models.py              # ContractIR, TrafficIR, HarnessReport, ErrorType, etc.
├── agent/
│   ├── schema_parser.py       # JSON Schema → ContractIR
│   ├── arex_client.py         # HTTP fetch AREX recordings → TrafficIR
│   ├── tools.py               # LangChain Tool definitions (5 tools)
│   ├── test_generator.py      # Jinja2 rendering + boundary/exception derivation
│   └── agent.py               # ReAct Agent core + fix-and-retry logic
├── harness/
│   ├── runner.py              # PytestRunner (subprocess + timeout)
│   ├── parser.py              # ResultParser (pytest output → HarnessReport)
│   ├── classifier.py          # ErrorClassifier (4 error types)
│   ├── feedback.py            # FeedbackBuilder
│   └── controller.py          # ClosedLoopController (max_retries=3)
└── cli.py                     # CLI entry point

templates/
├── contract_test.py.j2        # Jinja2 template for contract tests
└── regression_test.py.j2      # Jinja2 template for regression tests

tests/
├── test_schema_parser.py      # SchemaParser tests
├── test_arex_client.py        # ArexClient tests (mock HTTP)
├── test_agent.py              # TestGenerator + IR tests
├── test_harness.py            # ErrorClassifier, FeedbackBuilder, ResultParser tests
└── test_integration.py        # End-to-end pipeline tests
```

## Architecture

### Core Data Flow

```
ICD JSON Schema ──→ SchemaParser ──→ ContractIR
                                            │
AREX HTTP API ────→ ArexClient ────→ TrafficIR
                                            │
                                            ▼
                                     LangChain Agent (ReAct)
                                            │
                                     Jinja2 Templates
                                            │
                                            ▼
                                     pytest test files
                                            │
                                            ▼
                                     Harness (pytest runner)
                                            │
                              ┌─────────────┴─────────────┐
                              ▼                           ▼
                          All Pass                      Failures
                              │                           │
                              ▼                           ▼
                            Done                   ErrorClassifier
                                                          │
                                     ┌─────────────┬──────┴──────┬─────────────┐
                                     ▼             ▼             ▼             ▼
                                 Assertion     Schema Drift  Environment   Logic Error
                                 Mismatch                    Error
                                     │             │             │             │
                                     ▼             ▼             ▼             ▼
                                 Update        Re-fetch      Wait 5s,      Rewrite
                                 expected      ICD+AREX      retry 2x      test
```

### Error Classification

| Type | Detection | Fix Strategy |
|------|-----------|--------------|
| `assertion_mismatch` | `assert` failure, expected vs actual | Update expected value |
| `schema_drift` | KeyError, AttributeError, field changes | Re-fetch ICD + AREX, regenerate |
| `environment_error` | ConnectionError, timeout, DNS | Wait 5s, retry 2x |
| `logic_error` | ImportError, SyntaxError, NameError | Re-analyze and rewrite test |
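A simple way to approximate the detection column is ordered pattern matching on the failure text, sketched below. The regular expressions, the rule order, and the fallback to `logic_error` are assumptions made for illustration, not the project's actual `ErrorClassifier`.

```python
import re
from enum import Enum


class ErrorType(str, Enum):
    ASSERTION_MISMATCH = "assertion_mismatch"
    SCHEMA_DRIFT = "schema_drift"
    ENVIRONMENT_ERROR = "environment_error"
    LOGIC_ERROR = "logic_error"


# Rules are checked in order; the first matching pattern wins.
# The patterns and their ordering are illustrative assumptions.
_RULES = [
    (ErrorType.ENVIRONMENT_ERROR,
     re.compile(r"ConnectionError|TimeoutError|timed out|Name or service not known", re.I)),
    (ErrorType.LOGIC_ERROR,
     re.compile(r"ImportError|ModuleNotFoundError|SyntaxError|NameError")),
    (ErrorType.SCHEMA_DRIFT,
     re.compile(r"KeyError|AttributeError")),
    (ErrorType.ASSERTION_MISMATCH,
     re.compile(r"AssertionError|^E\s+assert ", re.M)),
]


def classify(failure_text: str) -> ErrorType:
    """Map raw pytest failure output to one of the four error types."""
    for error_type, pattern in _RULES:
        if pattern.search(failure_text):
            return error_type
    # Assumption: anything unrecognized is treated as needing a rewrite.
    return ErrorType.LOGIC_ERROR


print(classify("E   KeyError: 'order_id'").value)
```

Checking environment errors first avoids spending an LLM call on a failure that a plain retry might resolve.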

### Closed Loop Controller

- Max fix retries: **3** by default (configurable via `--max-retries`); the cap prevents unbounded LLM token consumption
- Environment error retries: **2** (separate from fix retries)
- Each fix tracked in `FixRecord` chain (original → fixed → reasoning)
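The loop described above can be sketched as follows. This is a simplified illustration, not the project's `ClosedLoopController`: the `execute` and `fix` callables stand in for the Harness and the Agent, the `FixRecord` field names are guessed from the chain description, and it assumes `max_retries` counts fix attempts after the initial run.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FixRecord:
    """One link in the fix chain (field names are assumptions)."""
    original: str
    fixed: str
    reasoning: str


@dataclass
class LoopOutcome:
    passed: bool
    attempts: int  # number of executions, including the initial one
    fixes: list[FixRecord] = field(default_factory=list)


def closed_loop(
    code: str,
    execute: Callable[[str], tuple[bool, str]],    # returns (passed, feedback)
    fix: Callable[[str, str], tuple[str, str]],    # returns (fixed_code, reasoning)
    max_retries: int = 3,
) -> LoopOutcome:
    """Execute, and on failure ask for a fix, at most max_retries times."""
    fixes: list[FixRecord] = []
    for attempt in range(max_retries + 1):
        passed, feedback = execute(code)
        if passed:
            return LoopOutcome(True, attempt + 1, fixes)
        if attempt == max_retries:
            break  # retry budget exhausted, do not request another fix
        fixed, reasoning = fix(code, feedback)
        fixes.append(FixRecord(code, fixed, reasoning))
        code = fixed
    return LoopOutcome(False, max_retries + 1, fixes)


# Stub harness and agent, for demonstration only.
def fake_execute(code: str) -> tuple[bool, str]:
    ok = "confirmed" in code
    return ok, "" if ok else "assert 'pending' == 'confirmed'"


def fake_fix(code: str, feedback: str) -> tuple[str, str]:
    return code.replace("pending", "confirmed"), "updated expected value"


outcome = closed_loop("assert status == 'pending'", fake_execute, fake_fix)
print(outcome.passed, outcome.attempts, len(outcome.fixes))  # True 2 1
```

Passing the harness and the agent in as callables keeps the retry budget logic testable without an LLM or a running API.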

## Running Tests

```bash
pytest tests/ -v
```

All 39 unit and integration tests should pass.

## Generated Test Output

The platform generates two types of tests per API:

**Contract tests** (`test_<api_name>.py`):
- Response field existence and type assertions
- Required field validation
- Schema constraint verification (enum, pattern, minLength, maxLength, minimum, maximum)
- Boundary value tests
- Negative scenarios (missing fields, wrong types)
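As an illustration of how boundary and negative values might be derived from schema constraints, consider the sketch below. The function name, the step sizes, and the placeholder for an out-of-enum value are all assumptions; the platform's own derivation logic in `test_generator.py` may work differently.

```python
from typing import Any


def boundary_values(prop: dict[str, Any]) -> dict[str, list[Any]]:
    """Return candidate valid and invalid values for one schema property."""
    valid: list[Any] = []
    invalid: list[Any] = []

    if "enum" in prop:
        # Every enum member is valid; a sentinel outside the enum is invalid.
        # Other constraints are ignored here to keep the sketch simple.
        return {"valid": list(prop["enum"]), "invalid": ["__not_in_enum__"]}

    if prop.get("type") in ("number", "integer"):
        # Assumed step: 1 for integers, 0.01 for other numbers.
        step = 1 if prop["type"] == "integer" else 0.01
        if "minimum" in prop:
            valid.append(prop["minimum"])
            invalid.append(prop["minimum"] - step)
        if "maximum" in prop:
            valid.append(prop["maximum"])
            invalid.append(prop["maximum"] + step)

    if prop.get("type") == "string":
        if "minLength" in prop:
            valid.append("a" * prop["minLength"])
            if prop["minLength"] > 0:
                invalid.append("a" * (prop["minLength"] - 1))
        if "maxLength" in prop:
            valid.append("a" * prop["maxLength"])
            invalid.append("a" * (prop["maxLength"] + 1))

    return {"valid": valid, "invalid": invalid}


print(boundary_values({"type": "string", "minLength": 2, "maxLength": 4}))
```

Each valid value would feed a positive test and each invalid value a negative scenario that expects the API to reject the request.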

**Regression tests** (`test_<api_name>_regression.py`):
- `@pytest.mark.parametrize` driven by AREX recordings
- Replay requests and compare response status
- Response body structure validation
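Response body structure validation can be thought of as comparing keys and value types while ignoring concrete values. The sketch below shows one way to do that; `structure_diff` is a hypothetical helper, and sampling only the first list element is a simplifying assumption.

```python
from typing import Any


def structure_diff(recorded: Any, actual: Any, path: str = "$") -> list[str]:
    """List structural differences between a recorded and an actual JSON body."""
    if type(recorded) is not type(actual):
        return [f"{path}: expected {type(recorded).__name__}, got {type(actual).__name__}"]

    diffs: list[str] = []
    if isinstance(recorded, dict):
        for key in sorted(recorded.keys() - actual.keys()):
            diffs.append(f"{path}.{key}: missing")
        for key in sorted(actual.keys() - recorded.keys()):
            diffs.append(f"{path}.{key}: unexpected")
        for key in sorted(recorded.keys() & actual.keys()):
            diffs.extend(structure_diff(recorded[key], actual[key], f"{path}.{key}"))
    elif isinstance(recorded, list) and recorded and actual:
        # Assumption: list items are homogeneous, so the first item is representative.
        diffs.extend(structure_diff(recorded[0], actual[0], f"{path}[0]"))
    return diffs


recorded = {"order_id": "A1", "status": "pending"}
print(structure_diff(recorded, {"order_id": "B2", "status": "confirmed"}))  # []
print(structure_diff(recorded, {"order_id": 7}))
```

A regression test would then assert that the returned list is empty, and the listed paths make a failure easy to read in the Harness report.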

## Configuration

All configuration is done via environment variables:

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `LLM_API_KEY` | Yes | — | API key for LLM provider |
| `LLM_MODEL` | No | `gpt-4o` | Model name |
| `LLM_BASE_URL` | No | — | Custom LLM endpoint (e.g., local proxy) |
| `AREX_BASE_URL` | No | — | AREX service base URL |
| `AREX_API_KEY` | No | — | AREX authentication key |
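For reference, the table above could be read into a typed settings object along these lines. `Settings` and `load_settings` are hypothetical names, and raising `RuntimeError` on a missing key is an assumption about how the platform reports the problem.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the environment variables listed above."""
    llm_api_key: str
    llm_model: str = "gpt-4o"
    llm_base_url: Optional[str] = None
    arex_base_url: Optional[str] = None
    arex_api_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing early if the required key is absent."""
    api_key = env.get("LLM_API_KEY")
    if not api_key:
        raise RuntimeError("LLM_API_KEY is required")
    return Settings(
        llm_api_key=api_key,
        llm_model=env.get("LLM_MODEL", "gpt-4o"),
        llm_base_url=env.get("LLM_BASE_URL"),
        arex_base_url=env.get("AREX_BASE_URL"),
        arex_api_key=env.get("AREX_API_KEY"),
    )


# A plain dict stands in for the process environment in this demonstration.
print(load_settings({"LLM_API_KEY": "sk-demo", "AREX_BASE_URL": "http://arex.example.com"}))
```

Accepting any mapping rather than reading `os.environ` directly makes the loader easy to exercise in unit tests.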

## License

MIT