Farzad Khorasani

I am currently a Member of Technical Staff at OpenAI. Prior to this, I served as an Autopilot Software Engineer and then as an Engineering Manager at Tesla, working on the compilation and deployment of trained neural networks on Tesla's in-house inference hardware.

My earlier academic journey includes Postdoctoral Fellowships at Georgia Institute of Technology and Rice University. In the Spring of 2016, I interned at AMD Research in Sunnyvale, CA. I earned my PhD in Computer Science from the University of California, Riverside, under the supervision of Dr. Rajiv Gupta, specializing in irregular computations and graph processing on GPUs. I received my B.Sc. in Electrical Engineering from Sharif University of Technology in Tehran, Iran.

Publications & Patents

HPCS 2019	High Performance Multilevel Graph Partitioning on GPU. Bahareh Goodarzi, Farzad Khorasani, Vivek Sarkar, and Dhrubajyoti Goswami. The 17th Annual International Conference on High Performance Computing & Simulation, 10 pages, July 2019.
ASPLOS 2019	CORF: Coalescing Operand Register File for GPUs. Hodjat Asghari Esfeden, Farzad Khorasani, Hyeran Jeon, Daniel Wong, and Nael Abu-Ghazaleh. The 24th International Conference on Architectural Support for Programming Languages and Operating Systems, 14 pages, April 2019. [ Lightning Video ] [ Slides ]
MICRO 2018	In-Register Parameter Caching for Dynamic Neural Nets with Virtual Persistent Processor Specialization. Farzad Khorasani, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, and Vivek Sarkar. The 51st Annual IEEE/ACM International Symposium on Microarchitecture, 13 pages, October 2018. [ Lightning Video ] [ Slides ]
ISCA 2018	RegMutex: Inter-Warp GPU Register Time-Sharing. Farzad Khorasani, Hodjat Asghari Esfeden, Amin Farmahini-Farahani, Nuwan Jayasena, and Vivek Sarkar. The 45th International Symposium on Computer Architecture, 13 pages, June 2018. [ Lightning Video ] [ Slides ]
US Patent	Compiler-Assisted Inter-SIMD-Group Register Sharing. Appl. No.: 15/935,399. Inventors: Farzad Khorasani, Amin Farmahini-Farahani, and Nuwan Jayasena. Assignee: Advanced Micro Devices Inc. September 2018.
SIN 2018	High Performance and Scalable Graph Computation on GPUs. Farzad Khorasani. Sustainable Interdependent Networks. Studies in Systems, Decision and Control, vol. 145. Springer, Cham, pages 67-75, February 2018.
IA^3 2017	Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs. Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan. In Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, Article No. 11, 4 pages, November 2017. [ Code ]
MAPL 2017	Dyna: Toward a Self-Optimizing Declarative Language for Machine Learning Applications. Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, and Jason Eisner. In Proceedings of the First ACM SIGPLAN Workshop on Machine Learning and Programming Languages, 10 pages, June 2017.
PhD Thesis	High Performance Vertex-Centric Graph Analytics on GPUs. Farzad Khorasani. Department of Computer Science, University of California, Riverside, 2016.
ICS 2016	CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs. Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, and Rajiv Gupta. ACM 30th International Conference on Supercomputing, pages 1-12, June 2016.
IPDPS 2016	Eliminating Intra-warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement. Farzad Khorasani, Bryan Rowe, Rajiv Gupta, and Laxmi N. Bhuyan. The 30th IEEE International Parallel and Distributed Processing Symposium, pages 524-533, May 2016. [ Slides ]
MICRO 2015	Efficient Warp Execution in Presence of Divergence with Collaborative Context Collection. Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan. The 48th Annual IEEE/ACM International Symposium on Microarchitecture, pages 204-215, December 2015. [ Slides ]
PACT 2015	Scalable SIMD-Efficient Graph Processing on GPUs. Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan. The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 39-50, October 2015. [ Slides ] [ Code ]
PACT 2015	Stadium Hashing: Scalable and Flexible Hashing on GPUs. Farzad Khorasani, Mehmet E. Belviranli, Rajiv Gupta, and Laxmi N. Bhuyan. The 24th International Conference on Parallel Architectures and Compilation Techniques, pages 63-74, October 2015. [ Slides ]
HPDC 2014	CuSha: Vertex-Centric Graph Processing on GPUs. Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N. Bhuyan. The 23rd International ACM Symposium on High Performance Parallel and Distributed Computing, pages 239-251, June 2014. [ Slides ] [ Code ]
LCPC 2014	LightPlay: Efficient Replay with GPUs. Min Feng, Farzad Khorasani, Rajiv Gupta, and Laxmi N. Bhuyan. The 27th International Workshop on Languages and Compilers for Parallel Computing, September 2014.