Anyone can build an agent that demos well. Almost nobody can build one that still works on the thousandth run, with real data and real edge cases. That gap is the whole opportunity, and it's the job. You'll work directly with the founders on the core product.
What you'll own
• Agentic loops that automate real, end-to-end work, not demos.
• The evals that tell us whether an agent is actually reliable.
• Agents that ship and hold up once real users hit them.
Who you are
• You try every new model, framework, and harness the week it ships, and you have opinions on what's overrated.
• You already run agents in your own life, not as a novelty but as infrastructure.
• You have a compulsion to understand the stack all the way down: context, tools, failure modes, cost.
Instant hire if you've shipped an agent people actually use and can show us the evals that keep it honest.
What you'll need
• Strong fluency with a coding agent (opencode / claude-code / codex / hermes).
• You know what good evals look like because you've built them, broken them, and figured out what makes an agent reliable.
• Solid grounding in agent-engineering fundamentals.
Details
• Stipend: Rs. 25,000 to Rs. 50,000 per month, depending on experience
• Location: Remote / Bengaluru
• Start: Immediate
• Duration: 2 months, with a path to full-time