AI Infrastructure Expert Needed to Eliminate Cold Starts for Image Generation Website
Budget: $18.00 - $120.00
HOURLY / PART_TIME
⭐ 5.00 (1)
United States
cuda, python
I run a website, intelliquarium.ai, that uses AI image generation to create aquarium designs. The image generation pipeline currently runs Flux Schnell LoRA workflows in ComfyUI through a RunPod serverless endpoint, but every inference executes from a cold start. As a result, generation takes 90+ seconds, and in extreme cases as long as 15 minutes, presumably because of low GPU availability.
All workflows use the same base model/CLIP/VAE, but each has its own parameters calibrated to produce optimal results, so no two workflows are the same. I'm looking for someone with experience in GPUs, AI infrastructure, and model deployment to help evaluate and solve this latency problem. At the current latency, the image generation feature is effectively useless to visitors, who won't wait indefinitely for images.
My initial thought is to build a dedicated GPU server at home that would keep all 9 image-generation workflows hot at all times, eliminating cold starts and ongoing GPU rental costs. However, I'm open to alternative architectures if there is a better approach.
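Because every workflow shares the same base model, CLIP, and VAE, keeping all of them "hot" may not require a separate copy of the model per workflow. A minimal sketch of that idea follows, assuming a single long-running worker process: the shared components are loaded once and cached, and each request only looks up its workflow's parameters. The loader is a stand-in that loads nothing real, and the workflow names, field names, and values are hypothetical placeholders rather than actual ComfyUI settings.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class WorkflowParams:
    """Per-workflow settings; field names are illustrative, not ComfyUI's."""
    lora_name: str
    lora_strength: float
    steps: int


@lru_cache(maxsize=1)
def load_shared_components() -> dict:
    """Stand-in for the expensive one-time load of base model, CLIP, and VAE."""
    # A real worker would load weights into GPU memory here.
    return {"base_model": "flux-schnell", "clip": "clip", "vae": "vae"}


# Hypothetical registry: one entry per workflow, all sharing the same components.
WORKFLOWS = {
    "reef": WorkflowParams(lora_name="reef_lora", lora_strength=0.8, steps=4),
    "planted": WorkflowParams(lora_name="planted_lora", lora_strength=1.0, steps=4),
}


def generate(workflow_id: str, prompt: str) -> dict:
    """Describe a job using the cached components plus per-workflow parameters."""
    shared = load_shared_components()  # cached after the first call
    params = WORKFLOWS[workflow_id]
    return {
        "model": shared["base_model"],
        "lora": params.lora_name,
        "strength": params.lora_strength,
        "steps": params.steps,
        "prompt": prompt,
    }
```

Calling `generate("reef", ...)` and then `generate("planted", ...)` triggers the simulated load only once. Whether the real pipeline behaves this way depends on how ComfyUI caches models and on available VRAM, which would need to be verified.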
The first phase of the project would be determining the most practical and cost-effective solution. Questions I need answered include:
• Is a home GPU server a viable option for keeping all 9 workflows (maybe more in the future) hot simultaneously?
• What hardware would be required?
• How would it connect securely to the website?
• Are there better alternatives that provide similar performance at a reasonable cost?
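One way to frame the cost-effectiveness question above is a simple break-even estimate between buying hardware and renting an always-on GPU. The sketch below is only an illustration: the function name and all inputs are assumptions, and real figures (hardware price, electricity, rental rate) would need to be filled in during the evaluation phase.

```python
def break_even_months(
    hardware_cost: float,
    home_monthly_cost: float,
    rental_hourly_rate: float,
    hours_per_month: float = 730.0,
) -> float:
    """Months until a one-time hardware purchase beats an always-on rented GPU.

    home_monthly_cost covers recurring home costs such as electricity.
    Returns float("inf") if the home server saves nothing per month.
    """
    rental_monthly = rental_hourly_rate * hours_per_month
    monthly_saving = rental_monthly - home_monthly_cost
    if monthly_saving <= 0:
        return float("inf")
    return hardware_cost / monthly_saving


# Placeholder inputs for illustration only, not quotes or measurements.
months = break_even_months(
    hardware_cost=3000.0,
    home_monthly_cost=60.0,
    rental_hourly_rate=0.50,
)
print(f"Break-even after about {months:.1f} months")
```

This ignores factors such as hardware depreciation, home internet reliability, and maintenance time, which would also weigh on the decision.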
If we find a suitable solution, I would also be interested in implementation assistance, including hardware selection, server setup, model deployment, and website integration.
Based on your background, it seems like you may have the expertise I'm looking for. Let me know if you're interested in discussing the project.