Improving Tridiagonalization Performance on GPU Architectures

H Wang, Z Duan, Z Zhao, S Wu, S Zheng, Q Li, X Jiang, S Zhang

PPoPP '25 (CCF-A) https://dl.acm.org/doi/abs/10.1145/3710848.3710894 📄pdf

High Performance Householder QR Factorization on Emerging GPU Architectures Using Tensor Cores

Y Leng, G Zou, H Wang, P Wu, S Zhang

TPDS (CCF-A) https://ieeexplore.ieee.org/abstract/document/10816084 📄pdf

Fast Symmetric Eigenvalue Decomposition via WY Representation on Tensor Core

S Zhang, R Shah, H Ootomo, R Yokota, P Wu

PPoPP '23 (CCF-A) https://dl.acm.org/doi/10.1145/3572848.3577516 📄pdf

Recursion Brings Speedup to Out-of-Core TensorCore-based Linear Algebra Algorithms: A Case Study of Classic Gram-Schmidt QR Factorization

S Zhang, P Wu

ICPP '21 (CCF-B) https://dl.acm.org/doi/10.1145/3472456.3473522 📄pdf

High Accuracy Matrix Computations on Neural Engines: A Study of QR Factorization and its Applications (best paper candidate)

S Zhang, E Baharlouei, P Wu

HPDC '20 (csranking reco) https://dl.acm.org/doi/abs/10.1145/3369583.3392685 📄pdf

TensorSVM: Accelerating Kernel Machines with Tensor Engine

S Zhang, R Shah, P Wu

ICS '20 (csranking reco) https://dl.acm.org/doi/10.1145/3392717.3392770 📄pdf