Kevin Galim

Senior AI Research Engineer · FuriosaAI · Seoul, South Korea

I am a Senior AI Research Engineer at FuriosaAI, where I design scalable algorithms and systems that make modern AI models faster and more efficient. My research sits at the intersection of efficient inference, generative modeling, and AI systems.

I have authored more than ten publications at top-tier venues, including ICLR, ICML, ACL, CVPR, ECCV, and WACV. My work spans efficient LLM inference (speculative decoding, KV-cache optimization), parameter-efficient fine-tuning for state space models, and diffusion-based language models.

Before joining FuriosaAI, I worked on applied computer vision at Funzin (autonomous golf cart perception, CES 2021) and GPU-accelerated image processing at ARRI in Munich, and took on freelance AR and web development. I received my M.Sc. in Informatics (Games Engineering) from the Technical University of Munich (grade 1.4), including a research semester in computer graphics at the University of Tokyo.

Research interests:

  • Efficient LLM inference: speculative decoding, KV-cache compression, approximate inference
  • Parameter-efficient fine-tuning: LoRA, state space models (Mamba/SSMs)
  • Diffusion-based language models and generative systems
  • AI accelerator deployment and custom hardware pipelines

Languages: German (native) · English (fluent) · Korean (professional, TOPIK 5)

selected publications

  1. Draft-based Approximate Inference for LLMs
    Kevin Galim*, Ethan Ewer*, Wonjun Kang, and 3 more authors
    In International Conference on Learning Representations (ICLR), 2026
    * Equal contribution
  2. ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs
    Wonjun Kang*, Kevin Galim*, Seunghyuk Oh*, and 8 more authors
    In International Conference on Learning Representations (ICLR), 2026
    * Equal contribution
  3. Parameter-Efficient Fine-Tuning of State Space Models
    Kevin Galim*, Wonjun Kang*, Yuchen Zeng*, and 2 more authors
    In International Conference on Machine Learning (ICML), 2025
    * Equal contribution