A Cache Coherence-Aware Runtime Reconfigurable System-on-Chip for Efficient Language Model Inference
Je Yang, Gracen Wallace, Joseph Zuckerman, Gabriele Tombesi, and Luca P. Carloni
Gabriele Tombesi, Joseph Zuckerman, Je Yang, William Baisi, Kevin Lee, Davide Giri, and Luca P. Carloni
Je Yang, Gabriele Tombesi, Joseph Zuckerman, and Luca P. Carloni