Authors: Yingjin Qian (Data Direct Networks), Wen Cheng and Lingfang Zeng (Zhejiang Lab), Xi Li (Data Direct Networks), Marc-André Vef (Johannes Gutenberg University Mainz), Andreas Dilger and Siyao Lai (Whamcloud Inc), Shuichi Ihara (Data Direct Networks), Yong Fan (Intel Corporation), and André Brinkmann (Johannes Gutenberg University Mainz)
Abstract: Directory tree walks on parallel file systems are costly operations that many storage management tasks require frequently. Even listing the contents of a single directory can take minutes to hours for huge directories, because tree walks on parallel file systems in Linux access the distributed metadata of each file sequentially through the syscall interface, which severely throttles their performance.
We present extreme file attribute stat (Xfast), which scales the performance of directory tree walks by combining techniques that have been developed over 10 years for the Lustre file system. Scalable statahead predicts file access patterns and prefetches the required attributes, while the Size on MDT (SOM) mechanism reduces the number of RPCs needed to collect file attributes. Xfast improves the performance of common directory operations; for example, it reduces the time to list one million files from 11 minutes to less than one minute for a single process.
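To illustrate the access pattern the abstract describes, the following is a minimal sketch of a sequential tree walk that issues one stat call per directory entry through the syscall interface. It is not Xfast or Lustre code; the function name `walk_and_stat` is hypothetical, and the sketch uses only the Python standard library. On a parallel file system, each such per-file attribute lookup may need its own round trips to metadata servers, which is the cost that statahead and SOM are meant to reduce.

```python
import os
import stat


def walk_and_stat(root):
    """Walk a directory tree sequentially, issuing one lstat() per entry.

    Returns (number of non-directory entries, sum of their sizes in bytes).
    Illustrative only: this is the naive pattern, not an optimized walk.
    """
    total_files = 0
    total_bytes = 0
    stack = [root]  # directories still to be listed
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # One attribute lookup per entry, performed strictly in order:
                # the next lookup starts only after this one has returned.
                info = os.lstat(entry.path)
                if stat.S_ISDIR(info.st_mode):
                    stack.append(entry.path)
                else:
                    total_files += 1
                    total_bytes += info.st_size
    return total_files, total_bytes
```

Because every lookup waits for the previous one to finish, the total time of such a walk grows roughly with the number of entries multiplied by the per-lookup latency, which is why prefetching attributes ahead of the walker can help.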