Commit Graph

3 Commits

Author SHA1 Message Date
FullStackDev
b9745566c8 feat: custom judge criteria and multi-dimensional scoring
Add configurable judge rubrics and multi-dimensional scoring with
weighted aggregation. New config fields: judge_criteria (free text)
and judge_dimensions (list of {name, weight, description}). CLI
--judge-criteria flag provides quick overrides. The judge adapter
computes weighted aggregate scores and enriches feedback with
per-dimension breakdowns.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-29 15:40:21 +00:00
FullStackDev
c92ca4a2b8 feat: async/parallel execution with configurable concurrency
Parallelize LLM calls across minibatches to reduce wall-clock time.
All domain ports (LLMPort, JudgePort, ProposerPort) are now async.
Adapter implementations wrap synchronous DSPy calls with asyncio.to_thread.
Judge calls run in parallel within a batch using asyncio.gather + semaphore.
Evaluator parallelizes minibatch execution with configurable concurrency.
Evolution loop and use case are fully async. Proposer stays sequential.
Added --max-concurrency CLI flag and max_concurrency YAML config field.
Added async_retry_with_backoff for async error handling.
All 139 unit tests pass.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-29 13:15:34 +00:00
FullStackDev
e2d111ce5b feat: error handling, retry with backoff, and circuit breaker
Add robust error handling to the evolution loop and LLM adapters:
- Retry utility with exponential backoff for transient errors (429, 5xx, timeouts)
- Per-call error isolation in evaluator and judge adapter
- Circuit breaker in EvolutionLoop (trips after N consecutive failures)
- CLI flags: --max-retries, --error-strategy (skip|retry|abort)
- Config fields: max_retries, retry_delay_base, circuit_breaker_threshold, error_strategy
- 16 new unit tests covering all error handling paths

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-29 12:47:55 +00:00