
Research Engineer, Reinforcement Learning
- On-site
- Palo Alto, California, United States
- Artificial Intelligence (AI)
Job description
Target start date: Immediately. Relocation provided.
Since its founding in 2015, 1X has been at the forefront of developing advanced humanoid robots designed for household use. Our mission is to create an abundant supply of labor via safe, intelligent humanoids. At 1X, you’ll own critical projects, tackle unsolved research problems, deliver great products to customers, and be rewarded based on merit and achievement.
We are looking for a Research Engineer in Reinforcement Learning (RL). In this role, you will teach NEO to learn new capabilities via RL algorithms. This enables our robots to be safe and robust in a variety of conditions.
Tech Stack
Linux
Python / C++ / Bazel
PyTorch
Isaac Sim / MuJoCo
Location
The role is based in Palo Alto, CA. Candidates are expected to be in-person at the office.
Responsibilities
Full-stack engineering, from data engineering to model architecture design to shipping polished products
Train NEO to do a diverse set of manipulation and locomotion tasks.
Work with hardware teams to close the sim2real gap between policies trained in simulation and their performance on real hardware.
Work with controls, QA, and data collection teams to ship RL policies to the production fleet.
Deploy skills trained with RL into home environments
Job requirements
Getting general-purpose robots to work in the home is just about the hardest problem one can work on. We are looking for people with the courage to tackle unsolved technical challenges with an intense work ethic.
4+ years of Python programming experience.
Fundamental knowledge of at least one low-level programming language such as C++.
Good understanding of robotics fundamentals such as computer vision, kinematics & dynamics, and planning.
Strong empirical research abilities and a keen eye for spotting performance bottlenecks in RL training.
Experience with authoring environments and benchmarks in simulators like MuJoCo, PyBullet, or Isaac Sim.
Ideal Experiences
Advanced degree (MS or PhD) in Computer Science or related field
Published RL research in top ML conferences (NeurIPS, CoRL, RSS, ICML, etc.)
Have trained real-world quadruped or biped locomotion with RL
Experience working with large, cross-team codebases.
Control theory and/or signal processing knowledge.
Sample Projects
We encourage you to apply even if you do not meet every single qualification. If you have direct experience in solving one of the “sample projects” listed below, please let us know in your cover letter.
Train locomotion policies with RL that are capable, robust, and safe enough to run on an actual humanoid product deployed in homes.
Speed up the simulator to enable faster training and evaluation.
Reduce the amount of “reward engineering” needed to solve long-horizon tasks by formulating general objectives like energy minimization, self-play, and data-driven reward functions.
Work on data pipelines and model architectures for generating natural, “human-like” motions.
Interview Process
The team reviews your CV and statement of exceptional work
15-minute phone conversation with our talent acquisition team
45-minute virtual interview with a team member asking a coding question in the language of your choice.
On-site interview (in-person or virtual) consisting of 4 technical interviews (a mix of coding, systems design, and open-ended research)
Background reference checks
Offer
Compensation
At 1X, your work and results will be rewarded with a total rewards package consisting of a base salary, stock options, and benefits. The base salary range is $180,000 to $300,000. Your actual salary will be based on your knowledge, skills, and experience.


