Claude Code Plugin/MCP Harness
Budget: $30.0 - $50.0
HOURLY / PART_TIME
⭐ 0.00 (0)
United Kingdom
python, c, javascript, node.js
We're looking to create a reliable testing harness to be able to test MCP Servers and plugins with Claude Code.
It should let us spin up an ephemeral environment/folder (under /tmp on macOS, or the equivalent on Windows) in which to start Claude Code, with a file-based override that disables all MCP servers and installs only the one we're testing. It should also supply the API key, in order to comply with the rules for automated usage of Claude Code.
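As a rough illustration of the kind of setup we have in mind, here is a minimal sketch in Python. The helper name `make_workspace` and the config file name are hypothetical. The `--mcp-config` / `--strict-mcp-config` flags and the `ANTHROPIC_API_KEY` variable reflect our understanding of Claude Code and should be checked against the installed version.

```python
import json
import os
import tempfile
from pathlib import Path


def make_workspace(server_name, server_command, server_args, api_key):
    """Create a throwaway workspace holding an MCP config for a single server.

    Hypothetical helper: returns (workspace_path, command, env) and does not
    start Claude Code itself.
    """
    # mkdtemp uses the platform temp directory (honours TMPDIR on macOS,
    # TEMP/TMP on Windows), so no path needs to be hard-coded.
    workspace = Path(tempfile.mkdtemp(prefix="cc-harness-"))

    # Only the server under test is listed in the config file.
    config = {
        "mcpServers": {
            server_name: {"command": server_command, "args": list(server_args)}
        }
    }
    config_path = workspace / "mcp-under-test.json"
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")

    # Assumption: these flags make Claude Code ignore every other MCP
    # configuration and load only the file above.
    command = ["claude", "--mcp-config", str(config_path), "--strict-mcp-config"]

    # Pass the API key through the environment rather than writing it to disk.
    env = dict(os.environ)
    env["ANTHROPIC_API_KEY"] = api_key
    return workspace, command, env
```

The harness would launch `command` with `cwd=workspace` and `env=env`, then delete the workspace (for example with `shutil.rmtree`) once the test has finished.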
It should allow us to enter prompts and log the output as it happens, as well as the underlying transcripts. It should be able to observe status line changes. It should also report the current model and calculate token usage: input tokens, output tokens, cache reads and cache writes.
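To make the token accounting concrete, here is a hedged sketch that sums usage from a transcript. It assumes the transcript is JSON Lines and that assistant entries carry a `message` object with `id`, `model` and a `usage` block using the Anthropic API field names. The exact transcript layout is an assumption, and `summarize_usage` is a hypothetical name.

```python
import json

# Report label -> usage field name as used by the Anthropic API.
USAGE_FIELDS = {
    "input": "input_tokens",
    "output": "output_tokens",
    "cache_read": "cache_read_input_tokens",
    "cache_write": "cache_creation_input_tokens",
}


def summarize_usage(transcript_lines):
    """Return (token_totals, last_model_seen) for an iterable of JSONL lines."""
    per_message = {}
    model = None
    for index, line in enumerate(transcript_lines):
        if not line.strip():
            continue
        message = json.loads(line).get("message")
        if not isinstance(message, dict) or not message.get("usage"):
            continue
        # Assumption: one API message may be logged on several lines, so
        # usage is keyed by message id to avoid counting it twice.
        per_message[message.get("id", index)] = message["usage"]
        model = message.get("model") or model

    totals = {label: 0 for label in USAGE_FIELDS}
    for usage in per_message.values():
        for label, field in USAGE_FIELDS.items():
            totals[label] += usage.get(field) or 0
    return totals, model
```

Keying by message id is a design choice worth validating against real transcripts, since double counting would skew any with/without-plugin benchmark.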
This will become part of our CI pipeline for a series of plugins.
We are testing the interactive mode, not the CLI version. We must be able to enter several prompts, not just one, which means waiting for Claude to finish a particular task before sending the next prompt.
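One possible way to decide that Claude has finished a task is a quiet-period heuristic: keep reading terminal output and treat the turn as done once nothing new has arrived for a while. The sketch below is only an illustration of that idea. `read_chunk` is a hypothetical non-blocking reader supplied by the harness (for example, one wrapping a pseudo-terminal) that returns an empty string when no output is waiting. A more reliable completion signal, if one is available, would be preferable.

```python
import time


def wait_until_idle(read_chunk, quiet_seconds=2.0, timeout=300.0,
                    clock=time.monotonic, sleep=time.sleep, poll=0.1):
    """Collect output until none has arrived for quiet_seconds.

    Raises TimeoutError if the turn does not settle within timeout seconds.
    clock and sleep are injectable so the function can be tested without
    real delays.
    """
    start = clock()
    last_output = start
    seen_output = False
    chunks = []
    while True:
        chunk = read_chunk()
        now = clock()
        if chunk:
            chunks.append(chunk)
            last_output = now
            seen_output = True
        elif seen_output and now - last_output >= quiet_seconds:
            # Output has started and then stayed quiet long enough.
            return "".join(chunks)
        if now - start >= timeout:
            raise TimeoutError("no idle period before the timeout elapsed")
        if not chunk:
            sleep(poll)
```

The harness would send a prompt, call `wait_until_idle`, log the returned text, and only then send the next prompt.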
This will be used in our CI pipeline to help test for flow regressions as well as for benchmarking, for example running with and without the plugin.
This will be used on local developer machines (macOS and Windows) as well as in a dedicated CI environment.
Each test on the harness will produce the following outputs:
- Wall clock time
- Total tool usage count
- Counts for each tool usage type (both native Claude tools and tools introduced by the MCP server/plugin)
- Total token usage (input, output, cache read, cache write).
- The raw transcript
- Anything else deemed useful.
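To show how these outputs might fit together, here is a small sketch of a per-test result record plus a tool counter. It assumes a JSON Lines transcript whose assistant messages contain `tool_use` content blocks, and that MCP tools are named with an `mcp__` prefix. Both are assumptions to verify, and `HarnessResult` and `count_tool_uses` are hypothetical names.

```python
import json
from collections import Counter
from dataclasses import dataclass


@dataclass
class HarnessResult:
    """One record per harness test, mirroring the outputs listed above."""
    wall_clock_seconds: float
    tool_counts: dict      # tool name -> number of calls
    token_totals: dict     # input / output / cache_read / cache_write
    raw_transcript: str

    @property
    def total_tool_calls(self):
        return sum(self.tool_counts.values())

    def mcp_tool_counts(self):
        # Assumption: tools added by an MCP server are prefixed "mcp__".
        return {name: n for name, n in self.tool_counts.items()
                if name.startswith("mcp__")}


def count_tool_uses(transcript_lines):
    """Count tool_use blocks per tool name across JSONL transcript lines."""
    counts = Counter()
    for line in transcript_lines:
        if not line.strip():
            continue
        message = json.loads(line).get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "unknown")] += 1
    return dict(counts)
```

Serialising each record to JSON would let CI compare runs, for instance the same prompts with and without the plugin.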