Features common to all metabots (probably incomplete; new uses for metabots may arise)

1. starting metabots

  1. when to start metabots: overlap with inline operations, or wait until those have finished?
  2. how do metabots know when to start?

2. each metabot must know how and when to terminate

3. kind of operation a metabot needs to do:

  1. data transformation (ex. filtering, transforms, verification)
  2. writing metadata

Both kinds of operation need to know what to do and where to put the result:

 - what information to write as metadata
 - what kind of transformation to perform (expected input and output)
 - where to place metadata
 - where to place transformed data

Users can specify:

  1. what kind of operation (metadata or transformation)
  2. basic naming information
  3. other app-specific data they might want or need (in either metadata or transformations). At first, we will assume the user knows what kind of information will be needed; later, there may be a way for metabots to figure out what is needed based on workflows.
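
The three user-specified items could be collected in one small record. The sketch below shows one possible shape in Python. The names `OperationKind` and `MetabotSpec`, and every field on them, are hypothetical and not part of any existing metabot interface; the rule that only transformations need an input location is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class OperationKind(Enum):
    """Item 1: the kind of operation the user asks for."""
    METADATA = "metadata"
    TRANSFORMATION = "transformation"


@dataclass
class MetabotSpec:
    """Hypothetical user-supplied description of one metabot job."""
    kind: OperationKind                    # item 1: metadata or transformation
    name: str                              # item 2: basic naming information
    output_location: str                   # where to place metadata or transformed data
    input_location: Optional[str] = None   # expected input (assumed: transformations only)
    app_data: Dict[str, Any] = field(default_factory=dict)  # item 3: app-specific data

    def validate(self) -> None:
        """Reject specs that are missing information the metabot would need."""
        if not self.name:
            raise ValueError("a metabot spec needs a name")
        if self.kind is OperationKind.TRANSFORMATION and not self.input_location:
            raise ValueError("a transformation needs an input location")
```

Validating up front matches the initial assumption that the user knows what information is needed: a spec that lacks it fails before any metabot is started.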

Other questions:

- Do the metabots have to search for the data they need, such as input files for a data transformation? (A previous paper said that metabots “crawled” a storage device.) Are they started where they need to be? Does the user know where the data will be?

- Can there be more metabots running than necessary, if some are on nodes with nothing to do?

- There might be cases where a group of metabots needs to finish an operation (such as metadata writing) before another group of metabots can do something else (such as transformation).
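
One way to express this kind of ordering is as a dependency graph over metabot groups. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the function `group_run_order` and the group names are hypothetical, and how each group would actually be launched and awaited is left open.

```python
from graphlib import TopologicalSorter


def group_run_order(depends_on):
    """Return group names in an order that respects the dependencies.

    depends_on maps a group name to the set of groups that must finish first.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical example: transformation may only start after metadata writing.
order = group_run_order({
    "metadata_writing": set(),
    "transformation": {"metadata_writing"},
})
```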

- Case where an IOGraph is dynamically changed to improve performance: how does this affect a metabot? Possible scenario: a metabot is running on a storage node and has been able to process up to n timesteps when the IOGraph diverts data to another storage node. Steps:

1. shut down the first metabot, keeping information such as its location and the timesteps completed
2. start a new metabot, which may need the information recorded by the first metabot
3. in a case like the chunking metabot, there would be 2 actions: doing preliminary transformations on the data, then the actual chunking; this would require 2 metabots
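
The shutdown and restart steps above could be sketched as a small checkpoint handoff. This is only an illustration: `MetabotCheckpoint`, `shut_down`, and `resume_on` are hypothetical names, and writing the state as a JSON file that both storage nodes can read is an assumption, not something the notes settle.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MetabotCheckpoint:
    """State the first metabot records before it is shut down."""
    storage_node: str          # location of the metabot
    timesteps_completed: int   # how many timesteps it has processed


def shut_down(checkpoint, path):
    """Step 1: persist the first metabot's state before stopping it."""
    with open(path, "w") as f:
        json.dump(asdict(checkpoint), f)


def resume_on(new_node, path):
    """Step 2: start a metabot on the new node from the recorded state."""
    with open(path) as f:
        previous = MetabotCheckpoint(**json.load(f))
    # The new metabot keeps the progress count but changes location.
    return MetabotCheckpoint(storage_node=new_node,
                             timesteps_completed=previous.timesteps_completed)
```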

- First stab at metabot framework in pseudocode:

 if (type of operation == data transformation)
    get user specification of input
    get user specification of output
    ?? distribute the work, coordinate ??
    * how is transformation done?
 else if (type of operation == metadata)
    get user specification for required metadata
    ?? distribute work ??
    write metadata with appropriate naming
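
As a rough, single-process rendering of the pseudocode above in Python: the function `run_metabot` and its parameters are hypothetical, the work is done serially because distribution and coordination are still open questions, and the transformation itself is simply passed in as a callable.

```python
def run_metabot(operation, *, name="", inputs=(), transform=None, metadata_fields=None):
    """Dispatch on the type of operation, mirroring the pseudocode's two branches."""
    if operation == "transformation":
        if transform is None:
            raise ValueError("a transformation needs a transform callable")
        # Open question in the notes: how to distribute and coordinate this work.
        # Here every input item is transformed serially in one process.
        return {"name": name, "output": [transform(item) for item in inputs]}
    if operation == "metadata":
        # The user specifies the required metadata; it is returned under the given name.
        return {"name": name, "metadata": dict(metadata_fields or {})}
    raise ValueError(f"unknown operation type: {operation!r}")


# Hypothetical usage: a filter-style transformation and a metadata write.
doubled = run_metabot("transformation", name="double", inputs=[1, 2, 3],
                      transform=lambda x: x * 2)
described = run_metabot("metadata", name="run42", metadata_fields={"timesteps": 10})
```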