I am currently a Member of Technical Staff at OpenAI. Prior to this, I served as an Autopilot Software Engineer and then as an Engineering Manager at Tesla, working on the compilation and deployment of trained neural networks on Tesla's in-house inference hardware.

My earlier academic journey includes Postdoctoral Fellowships at Georgia Institute of Technology and Rice University. In the Spring of 2016, I interned at AMD Research in Sunnyvale, CA. I earned my PhD in Computer Science from the University of California, Riverside, under the supervision of Dr. Rajiv Gupta, specializing in irregular computations and graph processing on GPUs. I received my B.Sc. in Electrical Engineering from Sharif University of Technology in Tehran, Iran.


Publications & Patents

HPCS 2019
High Performance Multilevel Graph Partitioning on GPU
Bahareh Goodarzi, Farzad Khorasani, Vivek Sarkar, and Dhrubajyoti Goswami.
The 17th Annual International Conference on High Performance Computing & Simulation, 10 pages, July 2019.
ASPLOS 2019   
CORF: Coalescing Operand Register File for GPUs
Hodjat Asghari Esfeden, Farzad Khorasani, Hyeran Jeon, Daniel Wong, and Nael Abu-Ghazaleh.
The 24th International Conference on Architectural Support for Programming Languages and Operating Systems, 14 pages, April 2019.
[ Lightning Video ] [ Slides ]
MICRO 2018   
In-Register Parameter Caching for Dynamic Neural Nets with Virtual Persistent Processor Specialization
Farzad Khorasani, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, and Vivek Sarkar.
The 51st Annual IEEE/ACM International Symposium on Microarchitecture, 13 pages, October 2018.
[ Lightning Video ] [ Slides ]
ISCA 2018   
RegMutex: Inter-Warp GPU Register Time-Sharing
Farzad Khorasani, Hodjat Asghari Esfeden, Amin Farmahini-Farahani, Nuwan Jayasena, and Vivek Sarkar.
The 45th International Symposium on Computer Architecture, 13 pages, June 2018.
[ Lightning Video ] [ Slides ]
US Patent   
Compiler-assisted inter-SIMD-group register sharing
Appl. No.: 15/935,399. Inventors: Farzad Khorasani, Amin Farmahini-Farahani, and Nuwan Jayasena.
Assignee: Advanced Micro Devices Inc. September 2018.
SIN 2018     
High Performance and Scalable Graph Computation on GPUs  
Farzad Khorasani.
Sustainable Interdependent Networks. Studies in Systems, Decision and Control, vol. 145. Springer, Cham, pages 67-75, February 2018.
IA^3 2017     
Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs
Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan.
In Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, Article No. 11, 4 pages, November 2017.
[ Code ]
MAPL 2017     
Dyna: Toward a self-optimizing declarative language for machine learning applications
Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, and Jason Eisner.
In Proceedings of the First ACM SIGPLAN Workshop on Machine Learning and Programming Languages, 10 pages, June 2017.
PhD Thesis     
High Performance Vertex-Centric Graph Analytics on GPUs
Farzad Khorasani.
Department of Computer Science, University of California, Riverside, 2016.
ICS 2016     
CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs
Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, and Rajiv Gupta.
ACM 30th International Conference on Supercomputing, pages 1-12, June 2016.
IPDPS 2016   
Eliminating Intra-warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement
Farzad Khorasani, Bryan Rowe, Rajiv Gupta, and Laxmi N. Bhuyan.
The 30th IEEE International Parallel and Distributed Processing Symposium, pages 524-533, May 2016.
[ Slides ]
MICRO 2015   
Efficient Warp Execution in Presence of Divergence with Collaborative Context Collection
Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 48th Annual IEEE/ACM International Symposium on Microarchitecture, pages 204-215, December 2015.
[ Slides ]
PACT 2015    
Scalable SIMD-Efficient Graph Processing on GPUs
Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 39-50, October 2015.
[ Slides ] [ Code ]
PACT 2015    
Stadium Hashing: Scalable and Flexible Hashing on GPUs
Farzad Khorasani, Mehmet E. Belviranli, Rajiv Gupta, and Laxmi N. Bhuyan.
The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 63-74, October 2015.
[ Slides ]
HPDC 2014    
CuSha: Vertex-Centric Graph Processing on GPUs
Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan.
The 23rd International ACM Symposium on High Performance Parallel and Distributed Computing, pages 239-251, June 2014.
[ Slides ] [ Code ]
LCPC 2014    
LightPlay: Efficient Replay with GPUs
Min Feng, Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan.
The 27th International Workshop on Languages and Compilers for Parallel Computing, September 2014.