Location: In-person only, San Francisco (no remote)
We're early-stage and moving fast. We research and build autonomous systems that make ML models run as fast as physically possible: systems that learn to program GPUs, profile and evaluate performance, generate and fuse kernels, and push hardware to its limits.
What we're looking for:
GPU Fundamentals: Deep understanding of GPU architectures, CUDA programming, and parallel computing patterns.
Deep Learning Frameworks: Proficiency in PyTorch, TensorFlow, or JAX, particularly for GPU-accelerated workloads.
LLM/AI Knowledge: Strong grounding in large language models (training, fine-tuning, prompting, evaluation).
Systems Engineering: Skilled in C++ and Python for building tooling around CUDA; Rust or Go is a plus.
ML for Code: Familiarity with program synthesis, reinforcement learning for code generation, and automated testing and verification of generated code.