A number of questions were posed prior to the meeting:
1. What kinds of data transformations are being suggested to be moved off the core?
2. What file formats are being used?
3. How much of the code base is implemented, and what is the extent of reimplementation?
4. How do we ascertain which nodes in the I/O graph are replicable, and how are I/O nodes reconfigured at present?
5. What % of the computation can be moved out of the core ....

Matt gave an overview of the two applications we'll be working with.

1. Fusion application:
Two primary datasets:
* The Mesh, in which each coordinate point has multiple vectors (Energy/Velocity) associated with it.
* Particles, with each particle having data associated with it.
File formats used include HDF5 and NetCDF. HDF5 is used predominantly for Particle data and NetCDF for Mesh data.
Data transformations that could be moved off the core include different filters and metadata annotators, extraction, and statistical analysis.
Dataset sizes are in the region of 400 TB, not all of which can be accommodated in memory. Hence sampling is done, on the order of 40 TB once every 1-5 minutes. The Mesh is also saved every 10-15 minutes or so.
The Particles dataset is also a restart file, in the sense that the application can be restarted midway by reading this dataset.

2. Molecular Dynamics application:
Some sort of atomic-level simulator. Each atom has multiple variables associated with it.
The application uses MPI I/O and Fortran writes; no scientific binary formats are used.
Data transformations off the core could include significantly large computations: bounding box operations, data extension using discrete derivations, and other custom extensions.

Action Items:
- Matt to send us/Pat code dumps of the two applications.
- Reference was made to Hasan's work on the I/O graph. There is some confusion about which papers are relevant here.
- Reference was made to Vibhore's work on how the I/O graph is currently being reconfigured.
Mention was made of the SmartPointers/IFlow/Service Augmentation paper. Need to find the relevant papers in this area.
- Reference was also made to some sort of formal methods used in Vibhore's work to ascertain which nodes can be moved around in the I/O graph. Need to find additional details on this.
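
Illustrative note: to make the kind of off-core transformation discussed for the Molecular Dynamics application more concrete, here is a minimal Python sketch of a bounding box operation and a discrete (forward-difference) derivative. This is not taken from either application. The function names, the (x, y, z) tuple layout, and the reading of "discrete derivations" as finite differences between consecutive samples are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: one (x, y, z) tuple per atom. The real application's
# data layout was not described in the meeting.
Point = Tuple[float, float, float]


def bounding_box(positions: Sequence[Point]) -> Tuple[Point, Point]:
    """Return (min_corner, max_corner) of the axis-aligned box enclosing all positions."""
    if not positions:
        raise ValueError("need at least one position")
    xs, ys, zs = zip(*positions)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def discrete_derivative(values: Sequence[float], dt: float) -> List[float]:
    """Forward difference between consecutive samples taken dt apart."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


# Made-up sample data, purely for illustration.
atoms = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 4.0, 2.5)]
lo, hi = bounding_box(atoms)
rates = discrete_derivative([0.0, 1.0, 4.0, 9.0], dt=0.5)
```

Both functions read their input without modifying it, which is one reason operations of this kind are plausible candidates for running on I/O nodes rather than on the compute core.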