AI Prompt and Workflow Engineer

Budget: $30 - $59 per hour, full-time, United Kingdom

Role: AI Prompt Engineer / AI Workflow Engineer

Role Purpose:
MTRX is looking for an AI Prompt Engineer / AI Workflow Engineer to support the development of structured, testable and reliable AI-enabled analytics workflows. The role is focused on designing prompts, workflow logic and AI-assisted processes that produce consistent, reviewable and high-quality outputs. This person should be comfortable working across prompt design, AI workflow orchestration, backend integration and output validation.

Key Responsibilities:
- Design and refine prompts for complex analytical, research and reporting tasks.
- Translate business, analytical and econometric requirements into structured AI workflows.
- Build prompt sequences that guide AI models through multi-step reasoning and output generation.
- Define input requirements, output formats, validation checks and fallback behaviours.
- Work with econometricians and domain experts to ensure AI outputs follow the intended logic.
- Test prompts across multiple examples to identify hallucinations, inconsistency and weak reasoning.
- Improve prompts for clarity, reliability, token efficiency and repeatability.
- Support the design of backend workflows that connect AI models with data, tools, APIs and validation layers.
- Work with developers to integrate prompts into applications, agents, notebooks, APIs or internal tools.
- Create structured outputs such as JSON, tables, reports, summaries, checklists and review templates.
- Maintain prompt versions, documentation, test cases and known failure modes.
- Help design evaluation processes for testing AI-generated outputs before they are used.
- Identify where deterministic backend logic should be used instead of relying on the AI model.

Required Experience:
- Strong experience using and designing prompts for large language models.
- Good understanding of how LLMs behave, including hallucination, inconsistency, context limitations and sensitivity to prompt wording.
- Experience building structured prompts for professional, analytical or technical workflows.
- Ability to design multi-step AI workflows, not just one-off prompts.
- Understanding of backend concepts such as APIs, databases, data pipelines, authentication, logging and error handling.
- Experience working with AI APIs such as OpenAI, Azure OpenAI, Anthropic, Gemini or similar platforms.
- Ability to define structured outputs and validation rules.
- Comfort working with JSON, Markdown, Python notebooks, API calls or lightweight backend logic.
- Understanding of when to use AI reasoning versus deterministic code.
- Ability to test outputs systematically and improve prompts based on observed failures.
- Strong written communication and documentation skills.

Preferred Background:
- Experience working on AI agents, copilots, internal AI tools, research automation or analytics workflows.
- Experience with Python, FastAPI, Flask, Node.js, SQL or similar backend tools.
- Experience with frameworks such as LangChain, LangGraph, LlamaIndex, Semantic Kernel, CrewAI or similar orchestration tools.
- Experience with prompt management, versioning or observability tools such as Langfuse, LangSmith, PromptLayer, Braintrust, Promptfoo or OpenAI Evals.
- Experience with retrieval-augmented generation, tool calling, function calling or structured generation.
- Exposure to financial services, investment research, risk, analytics, consulting, audit or enterprise workflows would be helpful.
- Experience designing evaluation datasets, test cases or quality assurance processes for AI outputs.
- Understanding of data privacy, prompt injection, model risk and AI governance concepts would be helpful.

Ideal Candidate:
A practical AI workflow engineer who can turn complex analytical requirements into clear, reliable and testable AI-assisted processes. The ideal candidate is not just a prompt writer. They understand how prompts sit inside a wider system involving data inputs, backend logic, model calls, validation checks, user review and auditability. They should be able to decide when a task should be handled by the AI model, when it should be handled by code, and when it should require human review. They should care about reliability, structure and testing as much as they care about prompt quality.
