About Me
Hi! I’m Fabian, a Hardware Compiler Engineer at SiFive in the San Francisco Bay Area. I work at the intersection of hardware and software, building fast and reliable compilers to design the next generation of open RISC-V systems.
Before joining SiFive in 2021, I completed my PhD at ETH Zürich where I explored energy-efficient Computer Architectures based on the RISC-V ISA by leveraging clever instruction set extensions and new memory architectures.
My research, cited over 1000 times, ranges from hardware compiler intermediate representations to many-core compute clusters and energy-efficient operand streaming CPU architectures. Throughout my PhD, I worked on various ASICs, covering RTL design, verification, synthesis, place & route, DRC fixing, and testing of the manufactured silicon on an industry-grade ASIC tester.
I am most skilled in C/C++, Python, and SystemVerilog. During night hours I enjoy coding in Rust. When I’m not deep in compilers or ISA design, I tinker with hardware projects on my YouTube channel – recently building an 8-bit superscalar CPU on breadboards and custom PCBs.
Projects
A foundation for building the next generation of open hardware design tools.
The CIRCT project is developing a collection of MLIR-based intermediate representation dialects for hardware design tools. These dialects separate input languages such as Chisel, SystemVerilog, and VHDL from EDA tools such as simulators, synthesizers, and placers/routers. Separating concerns makes writing such tools easier, allows for richer and more complex HDLs, and does not require vendors to agree upon the implementation of a language.
I am one of the main contributors to the CIRCT project, including its SystemVerilog and Chisel frontends, LLVM-based hardware simulator, and core dialects to represent combinational and sequential logic, simulation and verification constructs, and much more. The LLHD IR presented at PLDI now serves as a core IR in CIRCT.
See the LLHD talk and the PLDI paper.
A dependency management tool for hardware projects.
Bender is the dependency management tool that drives IP development and many ASIC tape-outs at the Integrated Systems Laboratory. Heavily inspired by Rust’s cargo, it implements fully decentralized dependency resolution and offers commands to generate compilation and analysis scripts for common ASIC and FPGA EDA tools.
A key challenge in supporting ASIC tape-outs is the need to provide reproducible builds at all costs while accommodating the hectic nature of last-minute global fixes to the code base. Bender tackles this problem through strict lock files, vendorization, and a hands-off approach to dependency checkouts.
See the Bender talk.
On my YouTube channel I tinker with hardware projects. Most recently I’ve been building an 8-bit superscalar CPU using breadboards and dedicated custom PCBs.
Talks
I presented our paper submission “LLHD: A Multi-level Intermediate Representation for Hardware Description Languages” at the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020).
Chips
Technology demonstrator designed by me and two fellow PhD colleagues, in collaboration with GlobalFoundries.
Baikonur features three eight-core Snitch clusters alongside two Ariane cores and a custom purely-digital DDR link. It contains a total of 24 Snitch cores with 64-bit FPUs, which together with the Xssr and Xfrep extensions are capable of delivering up to 48 Gflop/s (float64) or 96 Gflop/s (float32).
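As a rough back-of-the-envelope check of these peak figures, the sketch below multiplies core count, flop per cycle, and clock frequency. The assumptions are mine and not stated above: each FPU retires one fused multiply-add per cycle (counted as 2 flop), float32 runs as 2-wide SIMD on the 64-bit datapath, and the clock is taken to be 1 GHz purely for illustration.

```python
def peak_gflops(cores: int, flop_per_cycle_per_core: int, clock_ghz: float) -> float:
    """Peak throughput in Gflop/s = cores x flop/cycle/core x clock (GHz)."""
    return cores * flop_per_cycle_per_core * clock_ghz


CORES = 3 * 8              # three eight-core Snitch clusters
FMA_FLOP = 2               # assumption: one FMA per cycle counts as 2 flop
SIMD_F32 = 2               # assumption: two float32 lanes per 64-bit FPU
ASSUMED_CLOCK_GHZ = 1.0    # assumption: illustrative clock, not a measured value

peak_f64 = peak_gflops(CORES, FMA_FLOP, ASSUMED_CLOCK_GHZ)
peak_f32 = peak_gflops(CORES, FMA_FLOP * SIMD_F32, ASSUMED_CLOCK_GHZ)
print(f"float64: {peak_f64} Gflop/s, float32: {peak_f32} Gflop/s")
```

Under these assumptions the 24 cores deliver 48 flop per cycle in float64 and 96 flop per cycle in float32; the absolute Gflop/s value scales linearly with the actual clock frequency.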
Technology demonstrator designed by me and two fellow PhD colleagues, in collaboration with GlobalFoundries.
Kosmodrom features two 64-bit Ariane RISC-V cores (RV64GC) and an NTX cluster for ultra-energy-efficient training of Deep Neural Networks. The Ariane cores are implemented in two different standard cell flavours targeting different operating points. The NTX cluster contains a RISC-V core paired with 8 NTX units for 32-bit floating-point workloads.
Fine-grained power gating demonstrator designed by me and fellow PhD colleagues and students.
Thestral features a total of 10 Snitch cores. One core serves as the system governor, one as the DMA controller, and the remaining 8 each have an individually power-gated 64-bit FPU and a separate Integer Processing Unit attached. The goal of this chip is to showcase that fine-grained power gating for functional units is feasible.
Research chip designed by me and a fellow PhD colleague.
Billywig is a first implementation of a novel breed of processing systems, featuring 4 ultra-small Snitch cores (RV32IMAFD) paired with a large 64-bit FPU capable of vectorized 32-bit operations. The Xssr and Xfrep extensions allow the tiny core to operate in pseudo-dual-issue mode and achieve extreme energy efficiency.
Designed by me and a fellow bachelor student.
Hecate is one of a series of four student chips investigating techniques for sharing FPUs among processor cores. It features 4 OR10N OpenRISC cores, 2 FPUs, and a custom round-robin arbitrated sharing interconnect for the FPUs.
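To illustrate the general idea of round-robin arbitration for shared FPUs, here is a minimal behavioral sketch in Python. It is not the Hecate interconnect itself; the class name `RoundRobinArbiter` and its interface are hypothetical, and the model simply hands out up to two grants per cycle among four requesting cores while rotating priority.

```python
class RoundRobinArbiter:
    """Behavioral model of a round-robin arbiter (hypothetical interface)."""

    def __init__(self, num_requesters: int):
        self.n = num_requesters
        self.next = 0  # requester that has highest priority in the next cycle

    def grant(self, requests: list[bool], num_grants: int = 1) -> list[int]:
        """Return the indices granted this cycle, scanning from the priority pointer."""
        granted = []
        for offset in range(self.n):
            idx = (self.next + offset) % self.n
            if requests[idx]:
                granted.append(idx)
                if len(granted) == num_grants:
                    break
        if granted:
            # Rotate priority so the requester after the last winner goes first.
            self.next = (granted[-1] + 1) % self.n
        return granted


arb = RoundRobinArbiter(num_requesters=4)    # 4 cores
first = arb.grant([True] * 4, num_grants=2)  # 2 shared FPUs
second = arb.grant([True] * 4, num_grants=2)
print(first, second)
```

With all four cores requesting continuously, the grants alternate between cores 0 and 1 and cores 2 and 3, so no core is starved.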
Designed by four bachelor students under my supervision.
Xavier is geared towards acquisition of EEG signals and features a 32-bit RI5CY core (RV32ICMF), 8 SPI ports to attach ADCs, integrated filtering capabilities, and a hardware accelerator for Quantized Neural Networks.
Designed by four bachelor students under my supervision.
Scarabaeus features a 64-bit Ariane RISC-V core (RV64GC), a custom DMA capable of multi-dimensional tensor transfers, and a custom HyperBus peripheral controller for interfacing with a Cypress HyperRAM.
Publications
- LLHD: A Multi-Level Intermediate Representation for Hardware Description Languages
  Fabian Schuiki, Andreas Kurth, Tobias Grosser, and Luca Benini
  41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2020. arXiv:2004.03494.
- Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads
  Florian Zaruba, Fabian Schuiki, Torsten Hoefler, and Luca Benini
  To be published, 2020. arXiv:2002.10143.
- Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOI
  Matheus Cavalcante, Fabian Schuiki, Florian Zaruba, Michael Schaffner, and Luca Benini
  IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019. arXiv:1906.00478.
- Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores
  Fabian Schuiki, Florian Zaruba, Torsten Hoefler, and Luca Benini
- A 0.80 pJ/flop, 1.24 Tflop/sW 8-to-64 bit Transprecision Floating-Point Unit for a 64 bit RISC-V Processor in 22nm FD-SOI
  Stefan Mach, Fabian Schuiki, Florian Zaruba, and Luca Benini
  IFIP/IEEE 27th International Conference on Very Large Scale Integration (VLSI-SoC), 2019.
- NTX: An Energy-Efficient Streaming Accelerator for Floating-Point Generalized Reduction Workloads in 22nm FD-SOI
  Fabian Schuiki, Michael Schaffner, and Luca Benini
  Design, Automation & Test in Europe Conference & Exhibition (DATE), 2019. arXiv:1812.00182.
- The Floating Point Trinity: A Multi-modal Approach to Extreme Energy-Efficiency and Performance
  Florian Zaruba, Fabian Schuiki, Stefan Mach, and Luca Benini
  26th IEEE International Conference on Electronics Circuits and Systems (ICECS), 2019.
- A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets
  Fabian Schuiki, Michael Schaffner, Frank K. Gürkaynak, and Luca Benini
Experience
As the pioneering force behind RISC-V, SiFive continues to lead the charge in innovation and efficiency within computing.
As part of my work at SiFive, I develop novel compiler IR dialects for the CIRCT open-source project. Together with a vibrant community of collaborators from SiFive, academia, and the open-source world at large, we are building the foundation for the next generation of hardware design tools.
My contributions include:
- A SystemVerilog frontend based on the Slang project, which combines my prior work on LLHD and the Moore compiler into a robust and reusable set of MLIR dialects and lowering passes. Based on this work, SiFive is able to compile industry-grade Verilog IP with CIRCT and use it alongside the existing Chisel-based workflow.
- A fast simulator for hardware designs, based on restructuring the input design into register-to-register transfer arcs. These arcs allow very aggressive optimizations, such as stratifying simulation execution on clock edges. By leveraging MLIR’s tight integration with LLVM, the simulator can also generate very fast native machine code to execute the simulation.
- Verification dialects aiming to capture assertions, assumptions, and coverage of boolean and temporal properties. These dialects allow CIRCT to represent SystemVerilog assertions, give Chisel an opportunity to emit temporal properties, and enable CIRCT-based formal verification tools.
- A debug dialect to capture source-language debugging information and carry it through the compiler pipeline as part of the IR. This has allowed us to demonstrate Chisel-level debugging of hardware designs in a collaboration with Synopsys. Furthermore, debug information can be used to translate Verilog-level waveforms back to the source language.
ETH Zürich
ethz.ch
Doctor of Science
November 2016 - March 2021
Master Thesis
February 2016 - August 2016
Established in 1854 with the stated mission to educate engineers and scientists, the school focuses exclusively on science, technology, engineering, and mathematics. More than twenty Nobel laureates, including Albert Einstein, have either studied at ETH or been awarded the Nobel Prize for work carried out at ETH.
I have a PhD in Hardware Engineering. My research in Luca Benini’s group at the Integrated Systems Laboratory focused on energy-efficient Computer Architectures and High-Performance Computing. My contributions include:
- Optimized Deep Neural Network training, improving energy/area efficiency by 3x compared to GPUs by developing a 32-bit floating-point streaming processor that operates directly on memory. (Paper, Paper)
- Improved CPU throughput and resource utilization for single-issue in-order processors, achieving 3x performance and 2x energy efficiency gains by allowing instructions to implicitly encode loads/stores. (Paper)
- Co-authored and formally verified a RISC-V core, and doubled its peak performance by allowing integer and float pipelines to operate in parallel in a pseudo-dual-issue mode. (Paper)
- Developed a novel intermediate representation for Hardware Description Languages. (Paper)
- Designed, manufactured, and tested 6 ASICs in 22nm and 65nm technology nodes.
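The idea of instructions implicitly encoding loads and stores, mentioned in the list above, can be illustrated with a toy software model. This is only a sketch under my own assumptions and not the actual ISA semantics: the hypothetical `StreamRegister` class stands in for a register that is bound to an address pattern, so that every read of the register fetches the next element from memory without a separate load instruction.

```python
class StreamRegister:
    """Toy model: each read implicitly loads the next element of a memory stream."""

    def __init__(self, memory: list[float], base: int, stride: int, length: int):
        self.memory = memory
        self.addr = base          # next address to read
        self.stride = stride      # address increment per read
        self.remaining = length   # elements left in the stream

    def read(self) -> float:
        if self.remaining == 0:
            raise IndexError("stream exhausted")
        value = self.memory[self.addr]
        self.addr += self.stride
        self.remaining -= 1
        return value


def dot_product(a: StreamRegister, b: StreamRegister, n: int) -> float:
    """The loop body is a single multiply-accumulate; no explicit loads appear."""
    acc = 0.0
    for _ in range(n):
        acc += a.read() * b.read()
    return acc


memory = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = dot_product(StreamRegister(memory, 0, 1, 3), StreamRegister(memory, 3, 1, 3), 3)
print(result)
```

In this model the inner loop contains only the arithmetic operation, which is the intuition behind keeping the FPU of a single-issue core busy every cycle.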
During my Master's thesis I developed a generator for Standard Cell Memories. This involved designing and optimizing Custom Cell Layouts, transistor-level simulations in Spice, and Standard Cell Characterization. My work reduced memory access energy by 61% and area by 20%.
ABB is a technology leader that is driving the digital transformation of industries. With a history of innovation spanning more than 130 years, the company focuses on Electrification, Industrial Automation, Motion, and Robotics & Discrete Automation. It operates in more than 100 countries with about 147,000 employees.
During my delightful internship at ABB, I contributed to introducing Ethernet into electrical substation infrastructure. This involved Low-Level System Programming on commercial and open-source Real-Time Operating Systems, implementing commercial network protocols, and Linux Kernel Driver Development for custom networking hardware.
Education
ETH Zürich
ethz.ch
PhD Electrical Engineering
November 2016 - March 2021
MSc Electrical Engineering
September 2014 - November 2016
BSc Electrical Engineering
September 2010 - October 2014
Established in 1854 with the stated mission to educate engineers and scientists, the school focuses exclusively on science, technology, engineering, and mathematics. More than twenty Nobel laureates, including Albert Einstein, have either studied at ETH or been awarded the Nobel Prize for work carried out at ETH.
During my time at ETH I learned most of my key skills, such as teamwork, precision, and working to tight deadlines. The school gave me a thorough knowledge of Computer Architecture, Digital Circuits, Semiconductors, Communication Networks, and Network Security.
During my free time I taught myself complementary skills in Software Engineering and Compiler Design.
Imperial College London
imperial.ac.uk
BSc Electrical Engineering (Exchange Year)
October 2012 - June 2013
Established in 1907 by Royal Charter, the college focuses exclusively on science, technology, medicine and business. Imperial is among the top ten universities in the world, home to fourteen Nobel Laureates, three Fields Medalists, and one Turing Award winner.
I spent one year of my BSc education at Imperial College London in an exchange program. At Imperial I learned essential engineering skills in Communication and Digital Circuits, as well as know-how in Project Management and Entrepreneurship at the Imperial Business School.