Prompt-optimizer

Author	SHA1	Message	Date
FullStackDev	a5bf2ad59c	feat: v0.2.0 sprint — ground truth eval, crossover/mutation, checkpointing, similarity guards, dataset loader, CLI commands, extended test coverage Aggregates all v0.2.0 sprint work (GARAA-30 through GARAA-40) and fixes 2 integration tests that broke when the codebase went async (DSPyLLMAdapter and full pipeline tests now properly await coroutines). 277 tests pass (260 unit + 17 integration). Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 19:13:50 +00:00
FullStackDev	b9745566c8	feat: custom judge criteria and multi-dimensional scoring Add configurable judge rubrics and multi-dimensional scoring with weighted aggregation. New config fields: judge_criteria (free text) and judge_dimensions (list of {name, weight, description}). CLI --judge-criteria flag provides quick overrides. The judge adapter computes weighted aggregate scores and enriches feedback with per-dimension breakdowns. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 15:40:21 +00:00
FullStackDev	c92ca4a2b8	feat: async/parallel execution with configurable concurrency Parallelize LLM calls across minibatches to reduce wall-clock time. All domain ports (LLMPort, JudgePort, ProposerPort) are now async. Adapter implementations wrap synchronous DSPy calls with asyncio.to_thread. Judge calls run in parallel within a batch using asyncio.gather + semaphore. Evaluator parallelizes minibatch execution with configurable concurrency. Evolution loop and use case are fully async. Proposer stays sequential. Added --max-concurrency CLI flag and max_concurrency YAML config field. Added async_retry_with_backoff for async error handling. All 139 unit tests pass. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 13:15:34 +00:00
FullStackDev	e2d111ce5b	feat: error handling, retry with backoff, and circuit breaker Add robust error handling to the evolution loop and LLM adapters: - Retry utility with exponential backoff for transient errors (429, 5xx, timeouts) - Per-call error isolation in evaluator and judge adapter - Circuit breaker in EvolutionLoop (trips after N consecutive failures) - CLI flags: --max-retries, --error-strategy (skip|retry|abort) - Config fields: max_retries, retry_delay_base, circuit_breaker_threshold, error_strategy - 16 new unit tests covering all error handling paths Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 12:47:55 +00:00
FullStackDev	f516ca4be6	fix: multi-model routing — each adapter uses own dspy.LM instance - DSPyLLMAdapter now accepts dspy.LM instead of model string, uses dspy.context(lm=...) - DSPyJudgeAdapter, DSPyProposerAdapter, DSPySyntheticAdapter each accept and use own LM - OptimizationConfig gains per-model api_base/api_key_env override fields - cli/app.py creates separate dspy.LM per adapter with per-model overrides - New unit tests verify each adapter isolates its LM from global config Fixes Bug #1 (multi-model config not wired) and Bug #2 (DSPyLLMAdapter ignores model param). Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-03-29 12:31:48 +00:00
Gartoid	837a44970f	Initial commit: PROMETHEUS v0.1.0 - Prompt optimizer - Clean architecture (domain/application/infrastructure) - DSPy-based evolution engine with scoring - CLI via pyproject.toml entry point - Unit + integration tests (~300 tests) - Configs for glm-5.1 and glm-4.5-air models - Z.AI endpoint integration	2026-03-29 11:44:03 +00:00

6 Commits
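
Commit c92ca4a2b8 describes wrapping synchronous calls with asyncio.to_thread and running them in parallel with asyncio.gather plus a semaphore. The following is a minimal sketch of that general pattern, not the project's actual code: slow_sync_call, run_bounded, and the sample prompts are hypothetical names invented for illustration, and the real adapters and their signatures are not shown in this log.

```python
import asyncio
import time


def slow_sync_call(prompt: str) -> str:
    """Hypothetical stand-in for a blocking LLM call (not the project's adapter)."""
    time.sleep(0.01)  # simulate network latency
    return prompt.upper()


async def run_bounded(prompts: list[str], max_concurrency: int) -> list[str]:
    """Run all calls concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt: str) -> str:
        # The semaphore caps how many worker threads are busy at once.
        async with semaphore:
            # to_thread moves the blocking call off the event loop.
            return await asyncio.to_thread(slow_sync_call, prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(one(p) for p in prompts))


results = asyncio.run(run_bounded(["a", "b", "c"], max_concurrency=2))
```

Here max_concurrency plays the role that the commit assigns to the --max-concurrency flag and max_concurrency config field; how the project actually threads that value through is an assumption.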