SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Doctoral Showcase Archive

I/O Efficient Machine Learning

Author: Meghana Madhyastha (Johns Hopkins University, Argonne National Laboratory (ANL))

Advisor: Randal Burns (Johns Hopkins University)

Abstract: My research focuses on systems optimizations for machine learning, specifically on I/O efficient model storage and retrieval.

The first part of my work focuses on efficient inference serving of tree ensemble models. Tree structures are inherently cache-unfriendly, and their traversal incurs random I/O. We developed two systems: Blockset (Block-Aligned Serialized Trees) and T-REX (Tree Rectangles).

Blockset improves inference latency when the model doesn’t fit in memory. It introduces selective access for tree ensembles: only the parts of the model needed for a given inference are deserialized and loaded into memory. Using principles from external-memory algorithms, Blockset rearranges tree nodes into a block-aligned format that minimizes the number of I/Os needed for inference. T-REX optimizes inference latency both for in-memory inference and for inference when the model doesn’t fit in memory. Exploiting the fact that decision trees partition the feature space into convex hyperrectangles, T-REX reformulates decision tree traversal as hyperrectangle enclosure queries: each test point is queried for enclosure inside the hyperrectangles. In doing so, random I/O is traded for additional computation.
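The core reformulation behind T-REX can be illustrated in a few lines. The sketch below is a minimal, hypothetical illustration (the tree encoding and function names are mine, not T-REX's actual data layout): each leaf of a decision tree is converted into the axis-aligned hyperrectangle it covers, and inference becomes an enclosure test against those boxes rather than a pointer-chasing traversal.

```python
import math

# Toy decision tree over 2 features, as nested tuples (illustrative format):
# (feature, threshold, left_subtree, right_subtree) for internal nodes;
# any non-tuple value is a leaf label. Left branch means x[feature] <= threshold.
TREE = (0, 0.5,
        (1, 0.3, "A", "B"),
        "C")

def leaves_to_boxes(node, lo, hi, out):
    """Recursively convert each leaf into its covering hyperrectangle (lo, hi]."""
    if not isinstance(node, tuple):
        out.append((tuple(lo), tuple(hi), node))
        return
    feat, thresh, left, right = node
    # Left child tightens the upper bound on this feature.
    l_hi = list(hi); l_hi[feat] = min(hi[feat], thresh)
    leaves_to_boxes(left, list(lo), l_hi, out)
    # Right child tightens the lower bound on this feature.
    r_lo = list(lo); r_lo[feat] = max(lo[feat], thresh)
    leaves_to_boxes(right, r_lo, list(hi), out)

def predict(boxes, x):
    """Traversal-free inference: scan the boxes for the one enclosing x."""
    for lo, hi, label in boxes:
        if all(l < v <= h for v, l, h in zip(x, lo, hi)):
            return label

boxes = []
leaves_to_boxes(TREE, [-math.inf] * 2, [math.inf] * 2, boxes)
print(predict(boxes, (0.2, 0.1)))  # "A": x0 <= 0.5 and x1 <= 0.3
```

The box list is a flat array that can be scanned sequentially, which is the sense in which random I/O is traded for extra computation; the enclosure tests themselves are independent and batch-friendly.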

The second part of my work focuses on efficient deep learning model storage. We implemented a deep learning model repository that provides fine-grained access to individual tensors in models. This is useful in applications such as transfer learning, where individual tensors in layers are transferred from one model to another. We’re currently working on caching and prefetching popular tensors based on application-level hints.
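The key enabler for fine-grained tensor access is an index that maps tensor names to byte ranges, so a single tensor can be read with one seek instead of deserializing the whole checkpoint. The file layout below is a hypothetical sketch (header format, field names, and functions are mine, not the repository's actual design):

```python
import json
import struct

def save_model(path, tensors):
    """Write tensors (name -> serialized bytes) preceded by a JSON index
    of (offset, length) pairs. Hypothetical layout for illustration."""
    body, index, offset = b"", {}, 0
    for name, data in tensors.items():
        index[name] = (offset, len(data))
        body += data
        offset += len(data)
    header = json.dumps(index).encode()
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header)))  # 8-byte header length
        f.write(header)
        f.write(body)

def load_tensor(path, name):
    """Read one tensor's bytes via seek, without touching other tensors."""
    with open(path, "rb") as f:
        (hlen,) = struct.unpack("<Q", f.read(8))
        index = json.loads(f.read(hlen))
        offset, length = index[name]
        f.seek(8 + hlen + offset)
        return f.read(length)
```

With such an index, a transfer-learning workload that copies only a few layers reads only those layers' byte ranges, and the (offset, length) table gives the storage layer the information it needs to cache or prefetch individual tensors.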

