HPCS 2019    |
High Performance Multilevel Graph Partitioning on GPU
Bahareh Goodarzi, Farzad Khorasani, Vivek Sarkar, and Dhrubajyoti Goswami.
The 17th Annual International Conference on High Performance Computing & Simulation, 10 pages,
July 2019.
ASPLOS 2019    |
CORF: Coalescing Operand Register File for GPUs
Hodjat Asghari Esfeden, Farzad Khorasani, Hyeran Jeon, Daniel Wong, and Nael Abu-Ghazaleh.
The 24th International Conference on Architectural Support for Programming Languages and Operating Systems, 14 pages,
April 2019.
[ Lightning Video ] [ Slides ]
MICRO 2018    |
In-Register Parameter Caching for Dynamic Neural Nets with Virtual Persistent Processor Specialization
Farzad Khorasani, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, and Vivek Sarkar.
The 51st Annual IEEE/ACM International Symposium on Microarchitecture, 13 pages,
October 2018.
[ Lightning Video ] [ Slides ]
ISCA 2018    |
RegMutex: Inter-Warp GPU Register Time-Sharing
Farzad Khorasani, Hodjat Asghari Esfeden, Amin Farmahini-Farahani, Nuwan Jayasena, and Vivek Sarkar.
The 45th International Symposium on Computer Architecture, 13 pages,
June 2018.
[ Lightning Video ] [ Slides ]
US Patent    |
Compiler-assisted inter-SIMD-group register sharing
Appl. No.: 15/935,399. Inventors: Farzad Khorasani, Amin Farmahini-Farahani, and Nuwan Jayasena.
Assignee: Advanced Micro Devices Inc. September 2018.
SIN 2018      |
High Performance and Scalable Graph Computation on GPUs
Farzad Khorasani.
Sustainable Interdependent Networks. Studies in Systems, Decision and Control, vol 145. Springer, Cham, pages 67-75,
February 2018.
IA^3 2017      |
Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs
Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan.
In Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, Article No. 11, 4 pages,
November 2017.
[ Code ]
MAPL 2017      |
Dyna: Toward a Self-Optimizing Declarative Language for Machine Learning Applications
Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, and Jason Eisner.
In Proceedings of the First ACM SIGPLAN Workshop on Machine Learning and Programming Languages, 10 pages,
June 2017.
PhD Thesis      |
High Performance Vertex-Centric Graph Analytics on GPUs
Farzad Khorasani.
Department of Computer Science, University of California, Riverside,
ICS 2016      |
CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs
Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, and Rajiv Gupta.
ACM 30th International Conference on Supercomputing, pages 1-12,
June 2016.
IPDPS 2016    |
Eliminating Intra-warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement
Farzad Khorasani, Bryan Rowe, Rajiv Gupta, and Laxmi N. Bhuyan.
The 30th IEEE International Parallel and Distributed Processing Symposium, pages 524-533,
May 2016.
[ Slides ]
MICRO 2015    |
Efficient Warp Execution in Presence of Divergence with Collaborative Context Collection
Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 48th Annual IEEE/ACM International Symposium on Microarchitecture, pages 204-215,
December 2015.
[ Slides ]
PACT 2015     |
Scalable SIMD-Efficient Graph Processing on GPUs
Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 39-50,
October 2015.
[ Slides ] [ Code ]
PACT 2015     |
Stadium Hashing: Scalable and Flexible Hashing on GPUs
Farzad Khorasani, Mehmet E. Belviranli, Rajiv Gupta, and Laxmi N. Bhuyan.
The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 63-74,
October 2015.
[ Slides ]
HPDC 2014     |
CuSha: Vertex-Centric Graph Processing on GPUs
Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan.
The 23rd International ACM Symposium on High Performance Parallel and Distributed Computing, pages 239-251,
June 2014.
[ Slides ] [ Code ]
LCPC 2014     |
LightPlay: Efficient Replay with GPUs
Min Feng, Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 27th International Workshop on Languages and Compilers for Parallel Computing,
September 2014.