Benchmark 1: Metabots for Metadata Creation

Motivation

This experiment seeks to demonstrate that delegating file system metadata creation to out-of-band metabots can improve the in-band performance of an application. When an application has to create a large file hierarchy that is accessed only at a later time, creating the file system metadata in-band adds considerable overhead. In many such cases, it is reasonable to assume that the application programmer knows the structure of the hierarchy and possibly the number and sizes of the files. When file sizes cannot be determined in advance, it is usually possible to parse the contents of a data stream and mark the logical end of one file and the beginning of the next. In these cases, our proposed framework has the individual processes of the parallel application write to LWFS containers, which may be created ahead of time. Out-of-band metabots then walk through these containers, guided by programmer-supplied information about the hierarchy, create the requisite file system metadata, and store it for future access. We expect this to considerably improve application write performance.
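The division of labor described above can be illustrated with a minimal sketch. It does not use LWFS: an ordinary local file stands in for an LWFS container, and the names `write_raw_container`, `metabot_build_hierarchy`, and the `manifest` format (a list of relative path and size pairs) are hypothetical, chosen only to show the idea of writing raw data first and carving out the hierarchy later.

```python
import os
import tempfile


def write_raw_container(container_path, payloads):
    """In-band step (stand-in): append file contents back to back.

    No per-file metadata is created here; a local file plays the role
    of an LWFS container, which is an assumption of this sketch.
    """
    with open(container_path, "wb") as container:
        for data in payloads:
            container.write(data)


def metabot_build_hierarchy(container_path, manifest, root):
    """Out-of-band step (stand-in): carve the container into files.

    manifest is a hypothetical list of (relative_path, size_in_bytes)
    pairs, in the order the data was written. It represents the
    programmer-supplied description of the hierarchy.
    """
    created = []
    with open(container_path, "rb") as container:
        for rel_path, size in manifest:
            data = container.read(size)
            if len(data) != size:
                raise ValueError(f"container too short for {rel_path}")
            target = os.path.join(root, rel_path)
            # Directory creation is the metadata work moved out-of-band.
            os.makedirs(os.path.dirname(target) or root, exist_ok=True)
            with open(target, "wb") as out:
                out.write(data)
            created.append(target)
    return created
```

In this sketch the application would call only `write_raw_container` on its critical path, and `metabot_build_hierarchy` would run later, once the manifest is known.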

Experiment Design

We designed our micro-benchmark to resemble the fdtree benchmark, which creates a file hierarchy of user-specified depth in which each directory contains a user-specified number of files of a user-defined size. However, since our implementation has to make calls into the LWFS client libraries, it is implemented as a C program rather than a shell script. In our experiments, we compare the performance of file hierarchy creation using the following approaches:

  1. Creating file system metadata in-band using LWFS
  2. Writing only the raw data in-band using LWFS and generating the file system metadata out-of-band
  3. Creating the file hierarchy using NFS (Synchronized)
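As a rough illustration of the workload being timed, the following sketch creates an fdtree-style hierarchy on whatever file system hosts the temporary directory and measures the elapsed time. It is not the C benchmark itself and makes no LWFS calls. The function names, the `breadth` parameter (subdirectories per directory), and the naming scheme `dirN`/`fileN` are assumptions made for this example.

```python
import os
import tempfile
import time


def create_tree(root, depth, breadth, files_per_dir, file_size):
    """Create the hierarchy level by level.

    Returns (directories_created, files_created), not counting root.
    """
    payload = b"\0" * file_size
    dirs_created = 0
    files_created = 0
    os.makedirs(root, exist_ok=True)
    level = [root]
    for _ in range(depth):
        next_level = []
        for parent in level:
            for d in range(breadth):
                path = os.path.join(parent, f"dir{d}")
                os.mkdir(path)
                dirs_created += 1
                for f in range(files_per_dir):
                    with open(os.path.join(path, f"file{f}"), "wb") as fh:
                        fh.write(payload)
                    files_created += 1
                next_level.append(path)
        level = next_level
    return dirs_created, files_created


def timed_create(depth, breadth, files_per_dir, file_size):
    """Time one hierarchy creation in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        counts = create_tree(os.path.join(tmp, "root"), depth, breadth,
                             files_per_dir, file_size)
        elapsed = time.perf_counter() - start
    return counts, elapsed
```

To compare approaches in this style, the scratch directory would have to be placed on the file system under test (for example an NFS mount); the LWFS variants need the LWFS client libraries and are not covered by this sketch.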

We designed three series of experiments. In the first (Test A), we gradually increase the breadth and depth of the hierarchy. In the second (Test B), we restrict the depth to one and gradually increase the number of files; all files have a fixed size of 4 KB. In the third (Test C), we create a single file and gradually increase its size.

Experiment Setup

We ran our experiment on 4 nodes of a 52-node [Machine and interconnect description].

Measurements and Analysis

Critical Items and Time Line

Time: 3-4 days

Time: (3-4 days)

Time: 3-4 days

Benchmark 2: LWFS performance

Motivation

This series of experiments is designed to demonstrate LWFS performance for MPI applications in comparison with other parallel file systems. It involves running a series of standard parallel I/O benchmarks:

Experiment Setup

In the control setup, we run the benchmarks on Lustre. In the experimental setup, we run the same benchmarks on LWFS instead.

Critical Items and Time Line

Time: 7-10 days

Time: 3-4 days

Progress Status

We are currently getting LWFS running on one of the machines here. We expect development of the ADIO/LWFS interface to start as soon as possible.


Errata

Benefits Metabots provide:

  1. Offline/Out-of-band Processing
  2. Closer/Faster Access to datastore
  3. Specialized Knowledge of interface to datastore (in low level, not high level specification)

Ideas

  1. Minimal Penalty for extracting metabot functionality from application
  2. A more efficient translation to parallel transformations
  3. Near Disk Access to “Meta-tized” (metadata annotated) data
  4. Translating between two formats (e.g., netCDF → HDF)
  5. Creating filename containers

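To show benefit 1 above: T(MB) + T(App-MB) = T(App). Here T(MB) is the running time of the metabot, T(App-MB) is the running time of the application with the metabot functionality removed, and T(App) is the running time of the original application. If the equality holds, extracting the metabot adds no penalty, and running it out-of-band reduces the in-band time from T(App) to T(App-MB). This will demonstrate that we can gain performance benefits from metabots.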

Most practical idea: run an application on LWFS and use a metabot to build, from a raw container, the file system that the application expects.