Building AI features for clients in the DACH region (Germany/Austria/Switzerland) is currently a compliance nightmare. Legal departments often block US hyperscalers (OpenAI/Azure/AWS) due to US Cloud Act exposure and GDPR concerns. The alternative is renting bare metal and managing GPUs/Docker/Drivers yourself, which kills development velocity.
I built a managed inference layer on top of dedicated GPUs (in Europe). It exposes an OpenAI-compatible API endpoint.
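Because the endpoint speaks the OpenAI wire format, any OpenAI-style client works unchanged once the host is swapped. A minimal sketch, assuming a placeholder base URL and API key (not the real SUPA values):

```typescript
// Placeholder base URL -- an assumption for illustration, not the real endpoint.
const BASE_URL = "https://api.example.com/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build an OpenAI-compatible /chat/completions request body.
function buildChatRequest(model: string, messages: ChatMessage[]) {
  return { model, messages };
}

// Send the request; identical to calling OpenAI, only the host differs.
async function chat(apiKey: string, prompt: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(
      buildChatRequest("supa:fast", [{ role: "user", content: prompt }]),
    ),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```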
Tech Stack:
- SvelteKit (app)
- Bun (proxy/router)
- Ollama (inference)
- Dedicated GPU servers
- Models: currently serving supa:instant (Qwen3-0.6B) and supa:fast (Qwen3-8B)
- Privacy: zero-retention logging; no data leaves Germany
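To make the proxy layer concrete, here is a sketch of one job it might do: rewriting the public model aliases to backend Ollama tags before forwarding. The alias table mirrors the models listed above; the exact Ollama tag names are assumptions.

```typescript
// Public alias -> backend Ollama tag. The tag names are assumptions.
const MODEL_ALIASES: Record<string, string> = {
  "supa:instant": "qwen3:0.6b", // Qwen3-0.6B
  "supa:fast": "qwen3:8b",      // Qwen3-8B
};

interface ChatBody {
  model: string;
  [key: string]: unknown;
}

// Map a public alias onto the Ollama model tag, rejecting unknown names.
function rewriteModel(body: ChatBody): ChatBody {
  const target = MODEL_ALIASES[body.model];
  if (!target) throw new Error(`unknown model: ${body.model}`);
  return { ...body, model: target };
}
```

In the real proxy this would sit in front of a forward to Ollama's own OpenAI-compatible /v1 endpoint, so the public API stays stable even if the backend models change.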
We're in beta, and I'm looking for feedback on latency and API compatibility. You can test the current setup without a credit card.
I'm Christian, the creator of SUPA. Happy to answer any questions about the infrastructure or the regulatory landscape in Germany!