Title: Senior AI Engineer — Inference & Agent Systems
Location:
- BLR/Remote India
What We're Building
Arcana is building AI agents that synthesize information across heterogeneous sources and deliver structured, reasoned answers in real time. The product only works if the agents are fast, reliable, and correct, not approximately correct.
Our stack: Go + Temporal for orchestration, a Plan-Execute-Synthesize agent architecture, and an evaluation harness we use to measure every regression. The problems are hard. The latency bar is aggressive. The accuracy requirements are unforgiving.
The Work
Inference Optimization
- Drive TTFT below 400ms for multi-step agent pipelines
- Streaming optimization: first token to user while sub-agents are still running
- KV cache strategy, prompt compr...