Efficient processing and classification of wave energy spectrum data with a distributed pipeline

Ivan Gankevich, Alexander Degtyarev

Processing large amounts of data often consists of several steps, e.g. pre- and post-processing stages, which are executed sequentially, with data written to disk after each step. However, when the pre-processing stage differs for each task, a more efficient approach is to construct a pipeline that streams data from one stage to another. In the more general case, some processing stages can be factored into several parallel subordinate stages, forming a distributed pipeline in which each stage can have multiple inputs and multiple outputs. Such a processing pattern emerges in the problem of classifying wave energy spectra based on analytic approximations, which can extract different wave systems and their parameters (e.g. wave system type, mean wave direction) from a spectrum. The distributed pipeline approach achieves good performance compared to conventional “sequential-stage” processing.
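
The streaming idea can be illustrated with a minimal single-machine sketch in Python. It is not the authors' implementation: the names `stage`, `preprocess`, `classify` and `run_pipeline` are hypothetical, threads and in-memory queues stand in for distributed nodes, and the classification rule is a placeholder rather than the analytic approximations used in the paper. It shows only how records can flow from one stage to the next without intermediate files, with one stage split into several parallel workers.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker passed between stages


def stage(worker, inbox, outbox, n_workers=1):
    """Apply `worker` to items from `inbox` in n_workers threads, streaming results to `outbox`."""
    def loop():
        while True:
            item = inbox.get()
            if item is SENTINEL:
                inbox.put(SENTINEL)  # re-post so sibling workers also stop
                break
            outbox.put(worker(item))

    workers = [threading.Thread(target=loop) for _ in range(n_workers)]
    for t in workers:
        t.start()

    def close():
        # Signal the next stage only after every worker has finished.
        for t in workers:
            t.join()
        outbox.put(SENTINEL)

    closer = threading.Thread(target=close)
    closer.start()
    return workers + [closer]


def preprocess(record):
    """Hypothetical pre-processing: normalise a spectrum so its values sum to 1."""
    ident, values = record
    total = sum(values) or 1.0
    return ident, [v / total for v in values]


def classify(record):
    """Placeholder rule (an assumption, not the paper's method): label by peak position."""
    ident, values = record
    peak = max(range(len(values)), key=values.__getitem__)
    return ident, "swell" if peak < len(values) // 2 else "wind sea"


def run_pipeline(records, n_workers=4):
    raw, clean, labelled = queue.Queue(), queue.Queue(), queue.Queue()
    threads = stage(preprocess, raw, clean)
    threads += stage(classify, clean, labelled, n_workers)  # parallel subordinate stage
    for record in records:
        raw.put(record)
    raw.put(SENTINEL)

    results = {}
    while True:
        item = labelled.get()
        if item is SENTINEL:
            break
        ident, label = item
        results[ident] = label
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline([("a", [5, 1, 1, 1]), ("b", [1, 1, 1, 5])])` streams both records through the two stages. Results are keyed by identifier because the parallel workers may finish in any order.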

BibTeX
@article{gankevich2015spec,
  title={Efficient processing and classification of wave energy spectrum data with a distributed pipeline},
  author={Ivan Gankevich and Alexander Degtyarev},
  publisher={Institute of Computer Science},
  journal={Computer Research and Modeling},
  url={http://crm-en.ics.org.ru/journal/article/2301/},
  year={2015},
  month={01},
  language={english},
  pages={517--520},
  number={3},
  volume={7},
  type={article}
}

Publication: Computer Research and Modeling
Publisher: Institute of Computer Science