Multimodal Research Engineer, AI Companion

    Job description

    Multimodal Research Engineer, AI Companion Team | Speech & Multimodal Interfaces
    Location: Palo Alto, CA (on-site)

    About 1X

    We build humanoid robots that work alongside people to solve labor shortages and create abundance.

    The Role
    As a Multimodal Research Engineer on the AI Companion Team, you will lead the development of a real-time conversational speech model that integrates multiple modalities, including vision, spatial audio, and body language. You will collaborate with cross-functional teams to align NEO’s speech with its physical embodiment and personality. This role is key to shaping how users interact with our humanoid robot in intuitive, engaging ways.

    Job requirements

    You Will

    • Design and implement data pipelines for large-scale speech interactions using internal and external datasets

    • Train speech-to-speech models that incorporate awareness of NEO’s physical form

    • Create dynamic responses for a wide range of user queries

    • Synchronize NEO’s speech with physical gestures and body language

    • Customize NEO’s speech behavior to reflect different personalities

    Must Have

    • 3+ years of experience in speech and audio modeling

    • Experience with multimodal conversational models (language, audio, vision)

    • Ability to take open-ended problems in conversation modeling, develop creative solutions, build proof-of-concepts, and scale them to production

    Benefits & Compensation

    • Salary Range: $150,000 – $250,000 + Equity

    • Health, dental, and vision insurance

    • 401(k) with company match

    • Paid time off and holidays

    Equal Opportunity Employer

    1X is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, ancestry, citizenship, age, marital status, medical condition, genetic information, disability, military or veteran status, or any other characteristic protected under applicable federal, state, or local law.
