Simplest! METABOT CONTROLLER API (where mailbox is a simple file)
API between MC and Mailbox
Metabot Controller to Mailbox:
- read from mailbox (as much at one time as possible)
(Assuming that, for the simplest case, the MC is executed after apps on
compute nodes are finished; otherwise, the mailbox may need to
signal the MC that enough data is available to begin launching metabots.
"Enough" may depend on the application.)
MC functionality:
- (partially) parse mailbox notifications to know where to launch
metabots; notifications may also carry info about the type of operation
metabots should perform
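A minimal sketch of the partial parse, in Python. The notes do not define a notification format, so the whitespace-separated key=value layout and the field names host, path and op are assumptions made only for illustration. The MC extracts what it needs to pick a launch location (and the operation, if present) and passes the rest through untouched for the metabot to interpret.

```python
from typing import NamedTuple, Optional


class LaunchInfo(NamedTuple):
    host: str                 # where to launch the metabot
    operation: Optional[str]  # requested operation, if the notification names one
    raw: str                  # full notification, forwarded to the metabot as-is


def partial_parse(notification: str) -> LaunchInfo:
    """Extract only the fields the MC needs; leave everything else unparsed.

    Assumed (hypothetical) format: 'host=node07 path=/scratch/out.dat op=chunk'.
    """
    fields = dict(
        token.split("=", 1) for token in notification.split() if "=" in token
    )
    if "host" not in fields:
        raise ValueError(f"notification has no launch location: {notification!r}")
    return LaunchInfo(fields["host"], fields.get("op"), notification)


if __name__ == "__main__":
    print(partial_parse("host=node07 path=/scratch/run1/out.dat op=chunk"))
```

Keeping the raw string in the result means the MC does not have to understand app-specific fields; only the metabot does.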
API between MC and metabots
Metabot Controller to Metabots:
- launch metabot
params: notification from mailbox (gives location of data, possibly
format info), operation to perform?, output format?, where output is stored?
Metabots to MC:
- notify MC when finished with operation, or phase of operation?
- notify MC if there's a problem with data?
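The two directions of the MC/metabot API above could be sketched as follows, in Python. All names here (LaunchParams, Report, Status, MetabotController, demo_metabot) are hypothetical, and the metabot is an in-process callable standing in for whatever remote launch mechanism is eventually chosen; this is an illustration of the message shapes, not a proposed implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Status(Enum):
    FINISHED = "finished"          # whole operation complete
    PHASE_DONE = "phase_done"      # one phase of the operation complete
    DATA_PROBLEM = "data_problem"  # metabot could not use the data


@dataclass
class LaunchParams:
    notification: str                      # raw mailbox notification (data location, format)
    operation: Optional[str] = None        # operation to perform, if known
    output_format: Optional[str] = None
    output_location: Optional[str] = None  # where results should be stored


@dataclass
class Report:
    metabot_id: int
    status: Status
    detail: str = ""


class MetabotController:
    def __init__(self) -> None:
        self.reports: List[Report] = []
        self._next_id = 0

    def launch(self, params: LaunchParams, metabot: Callable) -> int:
        """MC to metabot: start a metabot and hand it its parameters."""
        metabot_id = self._next_id
        self._next_id += 1
        metabot(metabot_id, params, self.notify)
        return metabot_id

    def notify(self, report: Report) -> None:
        """Metabot to MC: record a completion or problem report."""
        self.reports.append(report)


def demo_metabot(metabot_id: int, params: LaunchParams, notify: Callable) -> None:
    # Stand-in metabot: reports a data problem for an empty notification.
    if not params.notification:
        notify(Report(metabot_id, Status.DATA_PROBLEM, "empty notification"))
    else:
        notify(Report(metabot_id, Status.FINISHED))


if __name__ == "__main__":
    mc = MetabotController()
    mc.launch(LaunchParams("host=node01 path=/scratch/a.dat", operation="chunk"), demo_metabot)
    mc.launch(LaunchParams(""), demo_metabot)
    print([r.status.value for r in mc.reports])
```

Because every report passes through notify, the MC has a single place to detect failed metabots if it turns out that it should be aware of them.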
Issues:
How is MC told what kind of tasks need to be performed?
Should the MC be aware if a metabot fails?
How to know where to launch metabots?
How to send metabot/location-specific information to a large # of metabots efficiently?
When to launch metabots?
There may be more than one phase of operations that metabots need to perform; these phases may conflict if some metabots finish phase 1 before others and try to begin phase 2.
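One possible way to avoid the phase conflict described above is a barrier: no metabot begins phase 2 until every metabot has finished phase 1. The sketch below uses Python threads and threading.Barrier as a stand-in for metabots on separate nodes; in a real deployment the MC would likely have to provide the barrier itself (for example by collecting "phase done" reports). The function name run_phases is hypothetical.

```python
import threading
from typing import List, Tuple


def run_phases(n_metabots: int) -> List[Tuple[str, int]]:
    """Run two phases on n_metabots threads with a barrier between the phases."""
    barrier = threading.Barrier(n_metabots)
    log: List[Tuple[str, int]] = []
    lock = threading.Lock()

    def metabot(i: int) -> None:
        with lock:
            log.append(("phase1", i))  # phase 1 work would happen here
        barrier.wait()                 # block until all metabots finish phase 1
        with lock:
            log.append(("phase2", i))  # phase 2 work would happen here

    threads = [threading.Thread(target=metabot, args=(i,)) for i in range(n_metabots)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for entry in run_phases(4):
        print(entry)
```

The order of metabots within a phase is not fixed, but every phase 1 entry precedes every phase 2 entry. A barrier is the simplest option; it costs idle time when some metabots finish phase 1 much earlier than others.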
Possibly 2 types of metadata, depending on apps and metabot operations:
- general metadata to indicate location, size of data, timestep (chunking metabot); fdtree's metadata is also general, used to create/remove large directory/file structures (needs locations, # of levels, sizes)
- more app-specific metadata; e.g., Kniss' app may want to index by parameter settings like tissue types, and uncertainty/risk?
[In implementation, keep reading (getting) mailbox notifications and parsing notifications separate and decoupled, in case of changes to the mailbox]
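A minimal sketch of that separation for the simple-file mailbox, in Python. It assumes one notification per line, which the notes do not specify; the names read_mailbox and parse_all are hypothetical. The reader pulls the file in large chunks (as much at one time as possible) and knows nothing about notification contents; the parser is passed in separately, so a different mailbox only requires a different reader.

```python
import os
import tempfile
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def read_mailbox(path: str, chunk_chars: int = 1 << 20) -> Iterator[str]:
    """Yield raw notifications, reading the mailbox file in large chunks.

    Assumes one notification per line (an assumption, not from the notes).
    """
    leftover = ""
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_chars)
            if not chunk:
                break
            lines = (leftover + chunk).split("\n")
            leftover = lines.pop()  # possibly incomplete line; finish it next chunk
            for line in lines:
                if line:
                    yield line
    if leftover:
        yield leftover


def parse_all(raw: Iterable[str], parse: Callable[[str], T]) -> List[T]:
    """Apply any parser to raw notifications from any source."""
    return [parse(r) for r in raw]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False) as tmp:
        tmp.write("node01 /scratch/a.dat\nnode02 /scratch/b.dat\n")
    try:
        # str.split stands in for a real notification parser.
        print(parse_all(read_mailbox(tmp.name, chunk_chars=8), str.split))
    finally:
        os.remove(tmp.name)
```

Since parse_all accepts any iterable of strings, the parser can be tested against an in-memory list without a mailbox at all.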
How does MC recognize potential scheduling conflicts?