ModelZoo - Senior Software Engineer



Hyderabad, India

Who we are

Kinara is a Bay Area-based, venture-backed company. Our architecture is based on research done at Stanford University by Rehan Hameed and Wajahat Qadeer under the guidance of Prof. Mark Horowitz (http://www.vlsi.stanford.edu/~horowitz/) and Prof. Christos Kozyrakis (http://csl.stanford.edu/~christos/).

What we do

Hot startup delivering on a Generative AI semiconductor play for the edge
Patented technology developed by the founders during their Ph.D. work at Stanford
Best-in-class silicon performance and power; the only Edge AI company shipping in volume
Peerless software tool suite that is a game changer for the company
Well funded, with marquee (GAFA) customers including a top e-tailer and a Tier 1 PC OEM
Two generations of silicon shipping in volume, with a mature software stack
Well capitalized: Series B led by Tiger Global, TSMC, Western Digital, Stanford, Catchlight

Job Summary

We’re looking for a skilled and motivated Machine Learning Software Engineer with 2-5 years of experience to join our team. The ideal candidate will have a solid foundation in deep learning and a strong interest in optimizing and deploying ML models on specialized hardware. This role involves implementing model optimizations, with a particular focus on quantization, to improve the performance of machine learning inference on target platforms.

Key Responsibilities

Model Porting & Deployment: Port and deploy deep learning models from frameworks like PyTorch and TensorFlow to proprietary or commercial ML accelerator hardware platforms.
Performance Optimization: Analyze and improve the performance of ML models for target hardware, focusing on latency and throughput.
Quantization: Contribute to model quantization efforts (e.g., INT8) to reduce model size and accelerate inference while maintaining model accuracy.
Profiling & Debugging: Use profiling tools to identify and fix performance bottlenecks in the ML inference pipeline on the accelerator.
Define and document interface specifications, control/status logic, and pipeline structures.
Lead PPA analysis and trade-off discussions across RTL and architecture.

Necessary Qualifications

Experience: 2-5 years of professional experience in software engineering, with a focus on machine learning model deployment and optimization.

Technical Skills:

Proficiency in deep learning frameworks such as PyTorch and TensorFlow.
Hands-on experience with deploying and optimizing models on GPUs or other specialized accelerators.
Some experience with model quantization (Post-Training Quantization).
Strong proficiency in C++ and Python.
Experience with GPU programming models like CUDA/cuDNN is a plus.
Familiarity with ML inference engines and runtimes (e.g., TensorRT, OpenVINO, TensorFlow Lite).
Foundational understanding of computer architecture principles.

Version Control: Proficient with Git and collaborative development workflows.
Education: Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field.

Preferred Qualifications:

Knowledge of hardware-aware model design.
Familiarity with compiler technologies for deep learning.
Experience with real-time or embedded systems.
Knowledge of cloud platforms (AWS, GCP, Azure).
Experience with CI/CD pipelines for ML models.

Please send your resume and cover letter

Our team is happy to answer any questions. Please fill out the form below and we’ll get back to you soon.
