Reprints from my posting to SAN-Tech Mailing List and ...

2011/06/11

[san-tech][02875] Re[2]: NVIDIA GPUDirect, etc. (+ Storage and GPU)

Date: Wed, 29 Dec 2010 15:49:44 +0900
--------------------------------------------------
[san-tech][02873] NVIDIA GPUDirect, etc. (+ Storage and GPU)
I was a bit curious, so I looked into erasure codes and GPUs:

I believe I introduced this paper last time:
"A GPU Accelerated Storage System,"
 Abdullah Gharaibeh, Samer Al-Kiswany, Sathish Gopalakrishnan,
 Matei Ripeanu, The University of British Columbia, CANADA
 IEEE/ACM International Symposium on High Performance Distributed
 Computing (HPDC 2010)
  http://www.ece.ubc.ca/~abdullah/papers/gharaibeh-hpdc10.pdf
  http://www.ece.ubc.ca/~abdullah/talks/gharaibeh-hpdc10.ppt

From the paper (page 3):
  "Similarly, erasure codes' encoding throughput is limited by their
   computational complexity" ....
     We aim to understand the viability of offloading these
   data-processing intensive operations to a GPU to dramatically reduce
   the load on the source CPU and enhance overall system performance.
   In this context, we focus on the use of hashing-based primitives in
   storage systems. This section highlights their computational overheads
   (2.1), and presents a brief overview of NVIDIA GPU’s architecture
   (2.2)"



StoreGPU: Exploiting Graphics Processing Units to Accelerate Distributed
Storage Systems (the project behind the paper above)
  http://www.ece.ubc.ca/~samera/projects/StoreGPU/index.htm

StoreGPU
  http://sourceforge.net/projects/storegpu/
  "StoreGPU is a library that accelerates hashing functions (MD5 and
   SHA1) by offloading the hashing computation to the GPU. StoreGPU
   provides two hashing modes: standard hashing mode, and sliding window
   hashing mode."
Version 1.0, 2010-05-30
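
To illustrate the two modes quoted above, here is a minimal CPU-only sketch in Python. It does not use the StoreGPU library or its API. The function names `standard_hashes` and `sliding_window_hashes`, the block and window sizes, and my reading of the modes (one digest per non-overlapping block versus one digest per overlapping window) are all assumptions made for illustration.

```python
import hashlib


def standard_hashes(data: bytes, block_size: int, algo: str = "sha1") -> list[str]:
    """Hash fixed-size, non-overlapping blocks: one digest per block.

    Hypothetical helper, not part of StoreGPU.
    """
    if block_size < 1:
        raise ValueError("block_size must be positive")
    return [
        hashlib.new(algo, data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def sliding_window_hashes(data: bytes, window: int, step: int = 1,
                          algo: str = "sha1") -> list[str]:
    """Hash every `window`-byte slice, advancing `step` bytes each time.

    Hypothetical helper, not part of StoreGPU.
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    if len(data) < window:
        return []
    return [
        hashlib.new(algo, data[i:i + window]).hexdigest()
        for i in range(0, len(data) - window + 1, step)
    ]


data = b"abcdefgh"
block_digests = standard_hashes(data, block_size=4)    # blocks "abcd", "efgh"
window_digests = sliding_window_hashes(data, window=4)  # one window per offset
```

With `step` set to 1 the sliding-window mode computes one digest per byte offset, so it does far more hashing work per input byte than the standard mode, which is presumably where GPU offloading pays off most.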

Recent Publications
  http://netsyslab.ece.ubc.ca/wiki/index.php/Publications


The Tokyo Tech TSUBAME 2.0 talk that I introduced last time

> "TSUBAME 2.0 Supercomputer"
>  Satoshi Matsuoka, Titech, November 17, 2010
>   http://www.nvidia.com/content/PDF/sc_2010/theater/Matsuoka_SC10.pdf

also touches on this briefly under "Hybrid Diskless Checkpoint" (from slide 71).

Related presentations
Scalable encoding algorithm and efficient group mapping
"Distributed Diskless Checkpoint for Large Scale Systems"
 Leonardo Arturo Bautista Gomez, ..... , Satoshi Matsuoka
 IEEE/ACM International Conference on Cluster, Cloud and Grid Computing,
 2010 (CCGrid10)
  http://www.computer.org/portal/web/csdl/doi/10.1109/CCGRID.2010.40

Fast encoding work on idle resource (CPU/GPU)
"Low-overhead diskless checkpoint for hybrid computing systems"
 Leonardo Bautista, ..... , Satoshi Matsuoka
 International Conference on High Performance Computing (HiPC 2010)
Note: The paper is not online yet (however: Lecture Notes in Computer Science).
As mentioned below, I did find talk slides by the same authors.

Software Framework for GPGPU Memory FT
"A High-Performance Fault-Tolerant Software Framework for Memory on
 Commodity GPUs"
 Naoya Maruyama, Akira Nukada and Satoshi Matsuoka
 IEEE International Parallel and Distributed Processing Symposium
 (IPDPS'10)
  http://matsu-www.is.titech.ac.jp/~naoya/publications/ipdps10.pdf
  http://matsu-www.is.titech.ac.jp/~naoya/publications/ipdps10-talk.pdf

"Transparent low-overhead checkpoint for GPU-accelerated clusters"
 Leonardo Bautista, ..... , Satoshi Matsuoka
 Fourth workshop of the Joint Laboratory for Petascale Computing
 2010 (NCSA & INRIA)
  https://wiki.ncsa.illinois.edu/download/attachments/17630761/INRIA-UIUC-WS4-lbautista.pdf
  https://wiki.ncsa.illinois.edu/display/jointlab/Workshop+Program#WorkshopProgram-GomezA
  "We propose a transparent low-overhead checkpointing technique for
   GPU accelerated clusters that avoid the I/O bottleneck by using
   erasure codes and SSDs on the compute nodes."
Note: Only the slides and the abstract are available; the paper has not been published.
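
As a rough illustration of checkpointing with erasure codes instead of shared-storage I/O, here is a minimal Python sketch that uses single XOR parity, the simplest erasure code. This is a swapped-in technique chosen for brevity, not the encoding used in the talks above, and it tolerates the loss of only one node. The function names and the toy checkpoint buffers are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(checkpoints: list[bytes]) -> bytes:
    """XOR all node checkpoints into one parity block (hypothetical helper)."""
    if len({len(c) for c in checkpoints}) != 1:
        raise ValueError("need at least one checkpoint, all of equal size")
    return reduce(xor_bytes, checkpoints)


def recover_lost(survivors: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing checkpoint from the survivors and the parity."""
    return reduce(xor_bytes, survivors, parity)


# Toy in-memory checkpoints for three compute nodes (illustrative data only).
node_state = [b"node0-state!", b"node1-state!", b"node2-state!"]
parity = encode_parity(node_state)

# Suppose node 1 fails: its checkpoint is rebuilt from nodes 0 and 2 plus parity.
rebuilt = recover_lost([node_state[0], node_state[2]], parity)
```

Tolerating more than one simultaneous failure needs a stronger code such as Reed-Solomon, and that encoding cost is the kind of work the papers above look at running on idle CPU or GPU resources.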

Joint Laboratory for Petascale Computing
  http://jointlab.ncsa.illinois.edu/

The fourth workshop of the Joint Laboratory for Petascale Computing
 November 22-24, 2010
  http://jointlab.ncsa.illinois.edu/events/workshop4/
Workshop Program
  https://wiki.ncsa.illinois.edu/display/jointlab/Workshop+Program
Note: The slides and abstracts are available.


Somewhat related:
"What Comes After RAID? Erasure Codes", December 16, 2010
  http://www.networkcomputing.com/deduplication/what-comes-after-raid---erasure-codes.php
  http://www.networkcomputing.com/deduplication/what-comes-after-raid---erasure-codes.php?p=2

  "Erasure codes get really interesting, however, when we up the ante
   beyond n+2 as several vendors have.  NEC's HydraStor deduplicating
   grid system uses erasure codes to spread each data chunk across
   twelve disk drives in the grid."
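
To make the "beyond n+2" point concrete, here is a small Python sketch of the arithmetic, assuming an ideal code in which any `data_fragments` out of the total are enough to rebuild a chunk. The function name is hypothetical, and the 9+3 split of twelve fragments is an assumed example, since the article excerpt gives only the total of twelve.

```python
def erasure_profile(data_fragments: int, parity_fragments: int) -> dict:
    """Fragment count, tolerated losses and raw storage overhead.

    Assumes an ideal erasure code: any `data_fragments` of the
    total fragments can reconstruct the original chunk.
    """
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need >= 1 data fragment and >= 0 parity fragments")
    total = data_fragments + parity_fragments
    return {
        "total_fragments": total,
        "tolerated_losses": parity_fragments,
        # Raw bytes stored per byte of user data.
        "storage_overhead": total / data_fragments,
    }


raid6_like = erasure_profile(10, 2)   # n+2, as in RAID 6
wide_stripe = erasure_profile(9, 3)   # assumed split of twelve fragments
mirror_3x = erasure_profile(1, 2)     # three-way replication for comparison
```

Under these assumptions the 9+3 layout survives three lost drives at about 1.33 times the raw capacity, whereas three-way replication survives only two at 3 times.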

Note: The article also mentions Cleversafe (an interesting distributed storage system).

I forgot to comment on this one last time:
> "Processing Petabytes per Second with the ATLAS experiment at the
>  Large Hadron Collider at CERN"
>  Philip Clark, Andrew Washbrook - University of Edinburgh
>   http://www.nvidia.com/content/GTC-2010/pdfs/2135_GTC2010.pdf

This talk shows just how staggering the data-processing volume of the CERN accelerator systems is,
although it says almost nothing about storage.
