Wanted: Claude Code operator, an AI engineer who will help us keep LLMs under control.
About us
White Circle is building the safety, reliability, and optimization layer for AI systems.
At our core are policies: simple natural-language rules that describe what an AI should and should not do, and what users can and cannot do.
We test, enforce, and continuously improve these policies automatically.
Core products:
- Test: automated red teaming that finds jailbreaks, unsafe tool calls, and other failure modes.
- Protect: real-time guardrails that enforce policies and block policy-violating inputs and outputs.
- Observe: advanced observability with topic clustering, failure tracing, and risk scoring.
- Improve: continuous prompt and policy optimization for better accuracy, consistency, and latency.
More info:
- Raised $11M from Hummingbird, Abstract, BoxGroup, SV Angel, Saga VC, Durk Kingma (co-founder of OpenAI), Thomas Wolf (co-founder of Hugging Face), Guillaume Lample (co-founder of Mistral), Romain Huet (head of DevEx at OpenAI), David Cramer (founder of Sentry), Olivier Pomel (founder of Datadog), François Chollet (creator of Keras), and others.
- Processing tens of millions of API requests every month.
The Role
- Salary: $80–150k + equity
- Location: Paris, France (Hybrid)
We're looking for an AI Engineer who can take a cluster of GPUs and turn raw data into production-ready LLMs and VLMs.
You will:
- Train LLMs and VLMs using distributed frameworks (Megatron, DeepSpeed, etc.).
- Design and implement Mixture-of-Experts architectures for efficient scaling.
- Build multimodal training pipelines that handle text, images, and audio.
- Write custom Triton kernels to optimize training bottlenecks (a minimal sketch of the programming model follows this list).
- Experiment with new architectures, hyperparameters, and data mixtures.
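For a taste of the kernel work, here is a minimal Triton sketch: a masked elementwise add that shows the block-based programming model. It is an illustration only; the attention and MLP kernels you would actually write are fused and more involved, and the function names and block size here are placeholders, not code from our stack.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance processes one BLOCK_SIZE-wide chunk.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per block of 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```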
You'll fit right in if you:
- Have hands-on experience fine-tuning LLMs or VLMs.
- Know the ins and outs of distributed training: tensor parallelism, pipeline parallelism, expert parallelism, ZeRO, FSDP (see the sketch after this list).
- Can write Triton kernels to speed up attention, MLP layers, or custom ops.
- Have worked with multimodal architectures (LLaVA-style, Flamingo-style, or similar).
- Care about training stability, convergence, and squeezing maximum performance from hardware.
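And for the distributed side, a minimal sketch of ZeRO-3-style sharding using PyTorch FSDP: a toy model and a single training step. It assumes a `torchrun` launch; the model, dimensions, and hyperparameters are placeholders for illustration, not our actual training setup.

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # Assumes launch via `torchrun --nproc_per_node=<gpus> train.py`,
    # which sets the env vars init_process_group reads.
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = nn.Sequential(
        nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096)
    ).cuda()
    # FSDP shards parameters, gradients, and optimizer state across
    # ranks (the same idea as ZeRO stage 3).
    model = FSDP(model)
    # Build the optimizer after wrapping so it sees the sharded params.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 4096, device="cuda")
    loss = model(x).pow(2).mean()
    loss.backward()
    opt.step()

if __name__ == "__main__":
    main()
```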