ECOOP 2024
Mon 16 - Fri 20 September 2024 Vienna, Austria
co-located with ISSTA/ECOOP 2024

Tensors are multi-dimensional data structures that can represent the data processed by machine learning tasks. Tensor programs tend to be short and readable, and they can leverage libraries and frameworks such as TensorFlow and PyTorch, as well as modern hardware such as GPUs and TPUs. However, tensor programs also tend to obscure shape information, which can cause shape errors that are difficult to find. Such shape errors can be avoided by a combination of shape annotations and shape analysis, but such annotations are burdensome to come up with manually.

In this paper, we use gradual typing to reduce the barrier of entry. Gradual typing offers a way to incrementally introduce type annotations into programs. From there, we focus on tool support for type migration, which is a concept that closely models code-annotation tasks and allows us to do shape reasoning and utilize it for different purposes. We develop a comprehensive gradual typing theory to reason about tensor shapes. We then ask three fundamental questions about a gradually typed tensor program. (1) Does the program have a static migration? (2) Given a program and some arithmetic constraints on shapes, can we migrate the program according to the constraints? (3) Can we eliminate branches that depend on shapes? We develop novel tools to address the three problems. For the third problem, there are currently two PyTorch tools that aim to eliminate branches. They do so by eliminating them for just a single input. Our tool is the first to eliminate branches for an infinite class of inputs, using static shape information. Our tools help prevent bugs, alleviate the burden on the programmer of annotating the program, and improves the process of program transformation.