Domain-Specific Accelerator Design

Sparsity-Aware PE Architecture

  • Summary:
    - Multi-window dataflow architecture for selective weight update
    - Redundancy Censoring Unit (RCU)-based PE architecture
  • Reference:
    - Jaekang Shin, Seungkyu Choi, Yeongjae Choi, and Lee-Sup Kim, "A Pragmatic Approach to On-device Incremental Learning System with Selective Weight Updates," ACM/IEEE Design Automation Conference (DAC), 2020.
    - Kangkyu Park, Seungkyu Choi, Yeongjae Choi, and Lee-Sup Kim, "Rare Computing: Removing Redundant Multiplications from Sparse and Repetitive Data in Deep Neural Networks," IEEE Transactions on Computers, Apr. 2022.
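As a rough illustration of the principle behind sparsity-aware PEs, the sketch below skips multiplications whose activation operand is zero. This is a hypothetical Python model for intuition only; the function name is ours, and it does not reproduce the RCU microarchitecture or the selective-update dataflow from the papers above.

```python
# Toy model of zero-skipping multiply-accumulate (illustrative only;
# not the RCU design from the cited papers).
def sparse_mac(weights, activations):
    """Accumulate w*a products, gating off multiplies with zero activations."""
    acc = 0
    skipped = 0
    for w, a in zip(weights, activations):
        if a == 0:          # a sparsity-aware PE would gate this multiply
            skipped += 1
            continue
        acc += w * a
    return acc, skipped
```

In hardware, the `skipped` count corresponds to multiplier cycles that a dense PE would waste on producing zeros.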

Mixed-Precision Processing Units

  • Summary:
    - We design unified hardware architectures to eliminate computational and memory inefficiencies across the entire AI lifecycle. Our research centers on flexible, mixed-precision processing units capable of seamlessly supporting heterogeneous data types—from standard formats to advanced block-scaled representations for Large Language Models. By dynamically adapting numerical precision to workload demands, we develop scalable accelerators that maximize throughput and energy efficiency without compromising model accuracy.
  • Reference:
    - Seungkyu Choi, Jaekang Shin, and Lee-Sup Kim, "A Deep Neural Network Training Architecture with Inference-aware Heterogeneous Data-type," IEEE Transactions on Computers, May 2022.
    - Jongwoo Park, Hyeonsung Kim, Jiyun Han, and Seungkyu Choi, "A Mixed-Precision Architecture for Efficient DNN Training with Inference-aware Data-type," ACM/IEEE Design Automation Conference (DAC), 2025.
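The block-scaled representations mentioned in the summary can be illustrated with a toy block floating-point quantizer, where one shared power-of-two scale covers a whole block of values and each element stores only a small integer. This is a minimal sketch with illustrative parameters, not the data type from the cited papers.

```python
import math

# Toy block-scaled quantization, in the spirit of block floating point /
# microscaling-style formats (parameters illustrative, not from the papers).
def quantize_block(values, mantissa_bits=8):
    """Share one power-of-two scale per block; store one integer per element."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 0, [0] * len(values)
    exp = math.ceil(math.log2(max_abs))          # shared block exponent
    scale = 2.0 ** (exp - (mantissa_bits - 1))   # weight of one integer LSB
    lo, hi = -(1 << (mantissa_bits - 1)), (1 << (mantissa_bits - 1)) - 1
    ints = [max(lo, min(hi, round(v / scale))) for v in values]
    return exp, ints

def dequantize_block(exp, ints, mantissa_bits=8):
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    return [i * scale for i in ints]
```

Because the scale is shared, the per-element storage shrinks to `mantissa_bits`, while the multiply-accumulate datapath can operate on plain integers and apply the block exponent once at the end.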

Memory-Efficient Architecture for Vector Search

  • Summary:
    - We specialize in architecting scalable hardware-software solutions to eliminate memory and I/O bottlenecks in massive-scale AI retrieval systems. Our research bridges CPU, custom ASIC, and in-storage processing to accelerate vector similarity search for large language models. By pioneering novel data quantization schemes and speculative execution models, we deliver ultra-fast, high-recall vector databases tailored for modern AI workloads.
  • Reference:
    - Seongjoon Cho, Junyoung Park, Donghyun Kang, Moohyeon Nam, Hongchan Roh, Moo-Kyoung Chung, Se-Hyun Yang, and Seungkyu Choi, "LOHA: A Latency-Optimized CPU-Storage Hybrid Architecture for Billion-Scale Graph-based Vector Similarity Search," ACM/IEEE Design Automation Conference (DAC), 2026.
    - Seongjoon Cho, Junyoung Park, Donghyun Kang, Moohyeon Nam, Hongchan Roh, Moo-Kyoung Chung, Se-Hyun Yang, and Seungkyu Choi, "Q-VESA: Accelerating Quantization-Aware Vector Search for Fast Retrieval in Prompt Engineering," IEEE Transactions on Computers, Feb. 2026.
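Graph-based vector similarity search of the kind these systems target rests on a greedy walk over a proximity graph: start from an entry node and repeatedly hop to whichever neighbor is closest to the query. The sketch below is a toy Python model of that core loop, not the LOHA or Q-VESA implementation.

```python
# Toy greedy walk over a proximity graph, the core idea behind graph-based
# nearest-neighbor indexes (illustrative only; not the cited designs).
def greedy_search(graph, vectors, query, entry):
    """Hop to the neighbor closest to the query until no neighbor improves."""
    def dist(a, b):                      # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cur, cur_d = entry, dist(vectors[entry], query)
    while True:
        nxt = min(graph[cur], key=lambda n: dist(vectors[n], query),
                  default=cur)
        nxt_d = dist(vectors[nxt], query)
        if nxt_d >= cur_d:               # local minimum reached
            return cur
        cur, cur_d = nxt, nxt_d
```

Each hop touches the neighbor lists and vectors of one node, which is exactly the random, latency-bound access pattern that CPU-storage hybrid and in-storage designs try to hide.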

Hardware-Efficient Data Formats

  • Summary:
    - A novel 8-bit post-training quantization (PTQ) data format designed for various DNNs
    - Leverages the dynamically configured exponent and fraction bits of the Posit data format while achieving higher decoding efficiency
  • Reference:
    - Nguyen-Dong Ho, Gyujun Jeong, Cheol-Min Kang, Seungkyu Choi, and Ik-Joon Chang, "MERSIT: A Hardware-Efficient 8-bit Data Format with Enhanced Post-Training Quantization DNN Accuracy," ACM/IEEE Design Automation Conference (DAC), 2024.
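Since the format derives its dynamic exponent/fraction allocation from Posit, a standard 8-bit posit decoder (es = 1) illustrates the regime-dependent bit split involved. This sketch decodes plain posits and is ours for illustration; it is not the MERSIT decoder.

```python
# Decode a standard 8-bit posit with es exponent bits (illustrative;
# MERSIT modifies this scheme, and this is not its decoder).
def decode_posit8(byte, es=1):
    n = 8
    if byte == 0:
        return 0.0
    if byte == 0x80:                      # Not-a-Real (NaR)
        return float("nan")
    sign = -1 if byte & 0x80 else 1
    bits = byte if sign == 1 else ((-byte) & 0xFF)  # negate via two's complement
    # Regime: run of identical bits after the sign bit.
    rbit = (bits >> (n - 2)) & 1
    run, i = 0, n - 2
    while i >= 0 and ((bits >> i) & 1) == rbit:
        run += 1
        i -= 1
    k = run - 1 if rbit else -run
    i -= 1                                # skip the regime terminator bit
    # Exponent: up to es bits, zero-padded if truncated.
    e = 0
    for _ in range(es):
        if i >= 0:
            e = (e << 1) | ((bits >> i) & 1)
            i -= 1
        else:
            e <<= 1
    # Fraction: whatever bits remain.
    frac_bits = i + 1
    frac = bits & ((1 << frac_bits) - 1) if frac_bits > 0 else 0
    f = frac / (1 << frac_bits) if frac_bits > 0 else 0.0
    return sign * float((2 ** (2 ** es)) ** k * 2 ** e * (1 + f))
```

The key property is that a longer regime run buys more dynamic range at the cost of fraction bits, which is the trade-off that a more hardware-friendly encoding can streamline at decode time.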