I am currently a senior at North Carolina State University majoring in Applied Mathematics and Electrical Engineering. I am interested in building new tools for parallel and distributed computing, primarily for Python and C++. I also contribute to open-source projects in Python’s data science stack, particularly Dask and Numba. After I graduate, I plan to work as a software engineer at NVIDIA, making Dask and the RAPIDS libraries work better on GPUs.
Open-source Work
- Developed a tool called Numba-Inspector, which allows users to visualize and debug Numba-compiled code. Specifically, the tool is a Jupyter magic command for inspecting CPU- or GPU-targeted Python code. When users add the magic command to the top of a notebook cell, they can click on specific lines of their Python code and view the compiled code. For CPU-targeted code, users can view the bytecode and LLVM IR generated by specific Python lines. For CUDA kernels, users can view the bytecode, LLVM IR, PTX, and SASS for specific Python lines. They can also visualize the control flow graph (CFG) of the SASS generated for the CUDA kernel. I’m currently working on improving Numba-Inspector’s code visuals and adding more debugging capabilities for CUDA-targeted kernels.
- Developed a Kubernetes Operator for Dask that follows the operator pattern. The Operator uses Kubernetes Custom Resource Definitions (CRDs). Dask clusters are now created as custom resources on Kubernetes, with the Operator communicating with the Dask Scheduler to scale workers up and down.
- Added support for multiple worker groups to the Dask Helm Chart. Dask users can now configure multiple Dask worker groups on Kubernetes with different hardware configurations. For example, this feature enables running high-memory or GPU workers and annotating them to run specific tasks. Users can also manage and scale their Dask cluster easily from Python. I wrote a blog post about it on the Dask Blog.
Other Projects
A record of my projects and contributions to open source is available on GitHub:
github.com/Matt711.