Here at Hugging Face, we’re on a journey to advance good Machine Learning and make it more accessible. Along the way, we contribute to the development of technology for the better.

We have built the fastest-growing open-source library of pre-trained models in the world. With more than 1 million models and 320K+ stars on GitHub, over 15,000 companies are using HF technology in production, including leading AI organizations such as Google, Elastic, Salesforce, Grammarly and NASA.

About the Role

As an On-device ML Engineer, you will explore cutting-edge methods to run models on consumer platforms, with a special focus on Apple technologies. Your responsibilities will include optimizing, quantizing, and converting the best models for efficient execution on iPhones and Macs. Additionally, you will design, build, and contribute to open-source software that demonstrates model usage, and develop libraries to minimize friction for developers who may not be deeply familiar with ML. Beyond the technical challenges, your goal will be to disseminate these methods, facilitate their adoption, and create tools for the community.

Day-to-day tasks may include the following:

Evaluate models, considering quality, latency, memory, and storage needs. You understand that the best model for a task may not be the latest SOTA, but the one with the best trade-off.
Strive to make SOTA models work efficiently on Apple platforms by converting them to native formats like Core ML or MLX, enabling execution on GPUs and the Neural Engine.
Dive into large codebases, such as Transformers, to optimize model architectures for Apple Silicon platforms, debug issues, and develop workarounds.
Write Swift code to implement or optimize ML tasks, including pre- and post-processing pipelines.
Produce high-quality technical documentation, including blog posts, tutorials, guides, social media threads, and concise demo apps.
Contribute to open source projects, like coremltools, to improve coverage of PyTorch operations.
Create tools that enable developers to convert, run, and share models easily, making it straightforward for researchers and practitioners to distribute models in device-friendly formats.
Occasionally, write or be ready to understand low-level code such as parallel GPU kernels.

About you

You’ll thrive in this position if you:

Experienced Swift Developer: Have a strong background in Swift development with a practical, builder mindset and a good sense of software and application design.
Passionate About ML: Have a deep understanding of essential model architectures and a passion for machine learning.
Core ML Proficiency: Have experience using Core ML and understand its advantages and limitations.
Open Source Contributor: Are eager to publish and contribute to open-source libraries to help developers adopt ML.
Versatile Engineer: Can move across different levels of abstraction as needed, from UI to Metal kernels.
Readable Code: Write code that is easy to understand, but are also prepared to make the critical path ugly for optimization’s sake. (But just the critical path, please 🙂)
Optimization Techniques: Understand various optimization techniques, from KV caching in transformers to post-training quantization and training-time methods.
System Understanding: Have a strong systems understanding and can identify performance bottlenecks.
Framework Proficiency: Have experience with various frameworks such as llama.cpp, MLX, PyTorch, and CoreNet.
Are a good debugger.
Can write excellent technical documentation.
Engage in discussion forums and communities about these topics.

If you’re interested in joining us but don’t tick every box above, we still encourage you to apply! We’re building a diverse team whose skills, experiences, and backgrounds complement one another. We’re happy to consider where you might be able to make the biggest impact.

More about Hugging Face

We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where you feel respected and supported—regardless of who you are or where you come from. We believe this is foundational to building a great company and community, as well as the future of machine learning more broadly. Hugging Face is an equal opportunity employer, and we do not discriminate based on race, ethnicity, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or ability status.

We value development. You will work with some of the smartest people in our industry. We are an organization that has a bias for impact and is always challenging itself to grow. We provide all employees with reimbursement for relevant conferences, training, and education.

We care about your well-being. We offer flexible working hours and remote options. We offer health, dental, and vision benefits for employees and their dependents. We also offer parental leave and flexible paid time off.

We support our employees wherever they are. While we have office spaces in NYC and Paris, we’re very distributed, and all remote employees have the opportunity to visit our offices. If needed, we’ll also outfit your workstation to ensure you succeed.

We want our teammates to be shareholders. All employees have company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.
