Senior Software Engineer, LLM Performance

Parasail
Barren, IL

Parasail is redefining AI infrastructure by enabling seamless deployment across a distributed network of GPUs, optimizing for cost, performance, and flexibility. Our mission is to empower AI developers with a fast, cost-efficient, and scalable cloud experience—free from vendor lock-in and designed for the next generation of AI workloads.

Job Description:

The Senior Software Engineer, LLM Performance plays a crucial role in delivering a competitive platform by focusing on efficiently scheduling, executing, and managing AI workloads on distributed compute systems. This role is deeply technical, spanning from low-level GPU kernels to distributed AI orchestration and Kubernetes (K8s) deployments. It is about more than optimization; it’s about pioneering efficient infrastructure that supports AI’s transformative role in reshaping productivity, revolutionizing industries, and addressing some of the world’s most challenging problems. You’ll ensure that generative AI — including large language models (LLMs), multi-modal models, and diffusion models — operates efficiently at enterprise scale while driving continuous improvements in cost, performance, and sustainability.

Responsibilities:



  • Add support for new LLMs, working across the stack from low-level GPU kernels to Kubernetes-based deployments.



  • Contribute to cutting-edge open-source LLM engines such as vLLM or SGLang to extend their capabilities and performance (e.g. use Python technologies to improve API servers or request schedulers).



  • Operate closer to the hardware, focusing on building and integrating solutions to boost performance and hardware utilization. For example, improve attention backends like FlashAttention or FlashInfer by contributing to their development and optimization, or by integrating their solutions into vLLM.



  • Improve LLM performance using advanced algorithmic solutions such as speculative decoding, quantization, or other state-of-the-art techniques. Understand the impact of such techniques on model quality.


Qualifications:



  • Expertise in GPU computing, from low-level platforms such as CUDA and ROCm to compilers and frameworks such as XLA, PyTorch, and JAX.



  • Background in performance analysis and optimization of AI/HPC workloads (e.g. profiling or theoretical analysis of FLOPs and memory bandwidth).



  • Experience writing GPU kernels using technologies such as CUDA, CUTLASS, or Triton.



  • Strength in Python and C++.



  • Demonstrated contributions to open-source projects. Contributions to inference engines such as vLLM are a strong plus.



  • A production-oriented mindset emphasizing robust, scalable code suitable for enterprise-grade applications.



  • A relentless curiosity about cutting-edge AI technologies combined with a passion for solving complex problems.


What You Bring to the Table:

We are looking for people who are eager to learn and master the lower-level compute concepts that are critical for the AI revolution. With us, your skills will not only contribute to coding but will also have a significant impact on the scalability and efficiency of AI applications at large. If you're geared up for the challenge of optimizing AI performance and eager to push our technological prowess to new heights, we're excited to welcome you aboard.

Posted 2026-02-10
