Publications


2019

A Time-Space Sharing Selected Scheduling Abstraction for Next Generation of Shared Cloud via Vertical Labels
Yuzhao Wang, Lele Li, You Wu, Junqing Yu, Zhibin Yu, Xuehai Qian
ISCA 2019
TIE: Energy-Efficient Tensor Train-Based Inference Engine for Deep Neural Network
Chunhua Deng, Fangxuan Sun, Xuehai Qian, Jun Lin, Zhongfeng Wang, Bo Yuan
ISCA 2019
A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology
Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Yoshikawa Nobuyuki, Yanzhi Wang
ISCA 2019
HOP: Heterogeneity-Aware Decentralized Training
Qinyi Luo, Jinkun Lin, Youwei Zhuo, Xuehai Qian
ASPLOS 2019
SW-Lock: A Fast Lock for Sunway Taihulight
Xiongchao Tang, Jidong Zhai, Xuehai Qian, Wenguang Chen
ASPLOS 2019
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers
Ao Ren, Jiayu Li, Tianyun Zhang, Shaokai Ye, Wenyao Xu, Xuehai Qian, Xue Lin, Yanzhi Wang
ASPLOS 2019
HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
HPCA 2019
A Hybrid Framework for Fast and Accurate GPU Performance Estimation through Source-Level Analysis and Trace-Based Simulation
Xiebing Wang, Kai Huang, Alois Knoll, Xuehai Qian
HPCA 2019
E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs
Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang
HPCA 2019

2018

CLIP: A Disk I/O Focused Parallel Out-of-core Graph Processing System
Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Xuehai Qian, Kang Chen, Weimin Zheng
IEEE Transactions on Parallel and Distributed Systems
CSE: Parallel Finite State Machines with Convergence Set Enumeration
Youwei Zhuo, Jinglei Cheng, Qinyi Luo, Jidong Zhai, Yanzhi Wang, Zhongzhi Luan, Xuehai Qian
MICRO 2018
CounterMiner: Mining Big Performance Data from Hardware Counters
Yirong Lv, Bin Sun, Qinyi Luo, Zhibin Yu, Xuehai Qian
MICRO 2018
PermDNN: Efficient Compressed Deep Neural Network Architecture with Permuted Diagonal Matrices
Chunhua Deng, Siyu Liao, Yi Xie, Keshab K. Parhi, Xuehai Qian, Bo Yuan
MICRO 2018
HEIF: Highly Efficient Stochastic Computing based Inference Framework for Deep Neural Networks
Zhe Li, Ji Li, Ao Ren, Ruizhe Cai, Caiwen Ding, Xuehai Qian, Jeffrey Draper, Bo Yuan, Jian Tang, Qinru Qiu, Yanzhi Wang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2018
ReRAM-based accelerator for deep learning
Bing Li, Linghao Song, Fan Chen, Xuehai Qian, Yiran Chen, Hai Helen Li
DATE 2018
Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems
Xiaoxiao Liu, Wei Wen, Xuehai Qian, Hai Li, Yiran Chen
ASP-DAC 2018
vSensor: Leveraging Fixed-Workload Modules of Programs for Performance Variance Detection
Xiongchao Tang, Jidong Zhai, Xuehai Qian, Bingsheng He, Wei Xue, Wenguang Chen
PPoPP 2018
DudeTx: Durable Transactions Made Decoupled
Mengxing Liu, Mingxing Zhang, Kang Chen, Xuehai Qian, Yongwei Wu, Weimin Zheng, Jinglei Ren
ACM Transactions on Storage 2018
Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System
Mingxing Zhang, Yongwei Wu, Youwei Zhuo, Xuehai Qian, Chenying Huan, Kang Chen
ASPLOS 2018
VIBNN: Hardware Acceleration of Bayesian Neural Networks
Ruizhe Cai, Ao Ren, Ning Liu, Caiwen Ding, Luhao Wang, Xuehai Qian, Massoud Pedram, Yanzhi Wang
ASPLOS 2018
DAC: Data-Aware Auto-Tuning High Dimensional Configurations of In-Memory Cluster Computing
Zhibin Yu, Zhendong Bei, Xuehai Qian
ASPLOS 2018
Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
Yanzhi Wang, Caiwen Ding, Geng Yuan, Siyu Liao, Zhe Li, Xiaolong Ma, Bo Yuan, Xuehai Qian, Jian Tang, Qinru Qiu, Xue Lin
AAAI 2018
GraphR: Accelerating Graph Processing Using ReRAM
Linghao Song, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
HPCA 2018
GraphP: Reducing Communication of PIM-based Graph Processing with Efficient Data Partition
Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, Xuehai Qian
HPCA 2018
G-TSC: Timestamp Based Coherence for GPUs
Abdulaziz Tabbakh, Xuehai Qian, Murali Annavaram
HPCA 2018

2017

CIRCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Caiwen Ding, Yanzhi Wang, Siyu Liao, Zhe Li, Yu Bai, Youwei Zhuo, Chao Wang, Xuehai Qian, Ning Liu, Geng Yuan, Xiaolong Ma, Yipeng Zhang, Xue Lin, Jian Tang, Qinru Qiu, Bo Yuan
MICRO 2017
Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O
Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Xuehai Qian, Kang Chen, Weimin Zheng
ATC 2017
Power Efficient Sharing-Aware GPU Data Management
Abdulaziz Tabbakh, Murali Annavaram and Xuehai Qian
IPDPS 2017
DudeTM: Building Durable Transactions with Decoupling for Persistent Memory
Mengxing Liu, Mingxing Zhang, Kang Chen, Xuehai Qian, Yongwei Wu, Weimin Zheng and Jinglei Ren
ASPLOS 2017
SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing
Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan and Yanzhi Wang
ASPLOS 2017
PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning
Linghao Song, Xuehai Qian, Hai Li and Yiran Chen
HPCA 2017

2016

Exploring the Hidden Dimension in Graph Processing
Mingxing Zhang, Yongwei Wu, Kang Chen, Xuehai Qian, Xue Li and Weimin Zheng
OSDI 2016
SReplay: Deterministic Group Replay for One-Sided Communication
Xuehai Qian, Koushik Sen, Paul Hargrove and Costin Iancu
ICS 2016

Prior to 2015

Pacifier: Record and Replay for Relaxed-Consistency Multiprocessors with Distributed Directory Protocol
Xuehai Qian, Benjamin Sahelices and Depei Qian
ISCA 2014
OmniOrder: Directory-Based Conflict Serialization of Transactions
Xuehai Qian, Benjamin Sahelices and Josep Torrellas
ISCA 2014
BulkCommit: Scalable and Fast Commit of Atomic Blocks in a Lazy Multiprocessor Environment
Xuehai Qian, Benjamin Sahelices, Josep Torrellas and Depei Qian
MICRO 2013
Volition: Precise and Scalable Sequential Consistency Violation Detection
Xuehai Qian, Benjamin Sahelices, Josep Torrellas and Depei Qian
ASPLOS 2013
Rainbow: Efficient Memory Race Recording with High Replay Parallelism for Relaxed Memory Model
Xuehai Qian, He Huang, Benjamin Sahelices and Depei Qian
HPCA 2013
BulkSMT: Designing SMT Processors for Atomic-Block Execution
Xuehai Qian, Wonsun Ahn and Josep Torrellas
HPCA 2012
ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
Xuehai Qian, Wonsun Ahn and Josep Torrellas
MICRO 2010
Optimized Register Renaming Scheme for Stack-Based x86 Operations
Xuehai Qian, He Huang, Zhenzhong Duan, Junchao Zhang, Nan Yuan, Yongbin Zhou, Hao Zhang, Huimin Cui and Dongrui Fan
ARCS 2007
Circuit Implementation of Floating Point Range Reduction for Trigonometric Functions
Xuehai Qian, Hao Zhang, Jingang Yang, He Huang, Junchao Zhang and Dongrui Fan
ISCAS 2007
Design and Implementation of Floating Point Stack on General RISC Architecture
Xuehai Qian, He Huang, Hao Zhang, Guoping Long, Junchao Zhang and Dongrui Fan
PDP 2007