Improving Tridiagonalization Performance on GPU Architectures
H Wang, Z Duan, Z Zhao, S Wu, S Zheng, Q Li, X Jiang, S Zhang
PPoPP '25 (CCF-A) https://dl.acm.org/doi/abs/10.1145/3710848.3710894
High Performance Householder QR Factorization on Emerging GPU Architectures Using Tensor Cores
Y Leng, G Zou, H Wang, P Wu, S Zhang
TPDS (CCF-A) https://ieeexplore.ieee.org/abstract/document/10816084
Fast Symmetric Eigenvalue Decomposition via WY Representation on Tensor Core
S Zhang, R Shah, H Ootomo, R Yokota, P Wu
PPoPP '23 (CCF-A) https://dl.acm.org/doi/10.1145/3572848.3577516
Recursion Brings Speedup to Out-of-Core TensorCore-based Linear Algebra Algorithms: A Case Study of Classic Gram-Schmidt QR Factorization
S Zhang, P Wu
ICPP '21 (CCF-B) https://dl.acm.org/doi/10.1145/3472456.3473522
High Accuracy Matrix Computations on Neural Engines: A Study of QR Factorization and its Applications (best paper candidate)
S Zhang, E Baharlouei, P Wu
HPDC '20 (csranking reco) https://dl.acm.org/doi/abs/10.1145/3369583.3392685
TensorSVM: Accelerating Kernel Machines with Tensor Engine
S Zhang, R Shah, P Wu
ICS '20 (csranking reco) https://dl.acm.org/doi/10.1145/3392717.3392770
