Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
By joe
TL;DR version: 10GB/s write and 10GB/s read to a single 2U unit over a 100Gb network, to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage and networking very quickly. This is that hardware. These units are available. Now. In volume. And they are very reasonably priced (starting at $1 USD/GB). Contact us for more details. This is with a file system …
root@usn-02:~/burn-in/fio# df -h /mnt/unison
Filesystem Size Used Avail Use% Mounted on
beegfs_nodev 41T 203G 40T 1% /mnt/unison
though the units also come with block (iSCSI, FCo*, rbd) and object (S3 via radosgw) access installed. In the dstat output below, note the dsk/total column named writ. It is in bytes/second, so 10G is 10GB/s. CPU usage is important too … specifically idl and wai, the idle and I/O-wait time as percentages of overall load.
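For reference, what follows is dstat's default view; running it with no arguments is equivalent to

dstat -cdngy

which prints exactly the total-cpu-usage, dsk/total, net/total, paging, and system column groups shown here.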
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 0 100 0 0 0| 0 0 | 420B 502B| 0 0 |4161 742
0 0 99 0 0 0| 0 231M| 360B 916B| 0 0 | 14k 28k
9 9 65 16 0 0| 0 10G| 360B 420B| 0 0 | 319k 996k
9 8 66 17 0 0| 0 10G| 60B 338B| 0 0 | 327k 996k
9 7 66 18 0 0| 0 10G| 60B 338B| 0 0 | 325k 998k
9 9 66 17 0 0| 0 10G| 60B 338B| 0 0 | 329k 998k
9 8 65 17 0 0| 0 10G| 60B 338B| 0 0 | 332k 994k
9 8 65 17 0 0| 0 10G| 60B 338B| 0 0 | 333k 997k
10 8 66 17 0 0| 0 10G| 60B 338B| 0 0 | 327k 996k
10 7 65 17 0 0| 0 10G| 60B 338B| 0 0 | 338k 993k
9 8 66 17 0 0| 0 10G| 60B 338B| 0 0 | 330k 998k
9 7 66 18 0 0| 0 10G| 60B 338B| 0 0 | 332k 995k
10 9 65 17 0 0| 0 10G| 60B 338B| 0 0 | 333k 999k
9 8 65 18 0 0| 0 10G| 60B 338B| 0 0 | 331k 993k
9 8 66 16 0 0| 0 10G| 60B 338B| 0 0 | 332k 993k
9 8 65 19 0 0| 0 10G| 60B 338B| 0 0 | 331k 995k
9 8 65 18 0 0| 0 10G| 60B 338B| 0 0 | 327k 995k
10 8 65 17 0 0| 0 10G| 60B 338B| 0 0 | 332k 996k
10 7 66 17 0 0| 0 10G| 60B 338B| 0 0 | 327k 977k
4 2 84 10 0 0| 0 4159M| 60B 338B| 0 0 | 146k 300k
2 2 87 9 0 0| 0 3238M| 60B 338B| 0 0 | 115k 228k
2 2 87 9 0 0| 0 3236M| 60B 338B| 0 0 | 117k 231k
3 2 88 8 0 0| 0 3236M| 240B 632B| 0 0 | 117k 230k
2 2 87 9 0 0| 0 3228M| 60B 338B| 0 0 | 115k 229k
2 2 87 9 0 0| 0 3215M| 60B 338B| 0 0 | 113k 226k
2 1 87 9 0 0| 0 3253M|1164B 590B| 0 0 | 115k 229k
2 2 89 8 0 0| 0 2907M| 60B 338B| 0 0 | 102k 206k
0 0 100 0 0 0| 0 0 | 60B 338B| 0 0 |4156 3166
0 0 100 0 0 0| 0 0 | 120B 420B| 0 0 |4172 2136
This is a single Forte 2U unit, connected to a Unison client over a single 100Gb link (the Forte has 2 of these links). There are a few write stragglers, as you can see in the dstat output … but … note that the system is roughly 2/3rds idle during this write. In aggregate, even counting the straggler writes:
Run status group 0 (all jobs):
WRITE: io=204723MB, aggrb=8189.6MB/s, minb=8189.6MB/s, maxb=8189.6MB/s, mint=24998msec, maxt=24998msec
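The exact fio job isn't shown above; a representative large-block sequential-write job against the BeeGFS mount might look like the following, where every parameter (job name, block size, engine, queue depth, size) is illustrative rather than the job actually used:

fio --name=seqwrite --directory=/mnt/unison \
    --rw=write --bs=1M --ioengine=libaio --direct=1 \
    --iodepth=64 --size=200G

The --size here is picked to roughly match the run above: at the reported aggrb of 8189.6MB/s, the io=204723MB moved works out to the 24998msec (about 25 seconds) reported as mint/maxt.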
We are rate limited by the speed of the network: 100Gb/s is 12.5GB/s of raw bandwidth, so a sustained 10GB/s is running at about 80% of line rate. We were seeing sustained 20GB/s during the conditioning portion (and yes, we conditioned the system for several hours to hit equilibrium state). Now for reads. Slightly higher utilization, but still tremendous. These systems have much more headroom than a single (even 100Gb) network link will allow for.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 0 100 0 0 0| 0 0 | 120B 380B| 0 0 |4166 491
0 0 100 0 0 0| 0 0 | 60B 338B| 0 0 |4160 465
2 1 94 3 0 0|1197M 0 | 60B 338B| 0 0 | 35k 73k
24 7 43 26 0 0| 11G 0 | 60B 338B| 0 0 | 246k 679k
24 7 42 27 0 0| 11G 0 | 60B 338B| 0 0 | 244k 678k
25 7 42 27 0 0| 11G 0 | 60B 338B| 0 0 | 252k 679k
24 7 43 27 0 0| 11G 0 | 60B 338B| 0 0 | 253k 680k
23 7 43 27 0 0| 11G 0 | 60B 338B| 0 0 | 251k 679k
25 6 43 25 0 0| 11G 0 | 60B 338B| 0 0 | 246k 679k
24 7 43 26 0 0| 11G 0 | 60B 338B| 0 0 | 254k 678k
23 6 43 28 0 0| 11G 0 | 60B 338B| 0 0 | 253k 676k
24 7 43 26 0 0| 11G 0 | 60B 338B| 0 0 | 244k 678k
24 6 42 28 0 0| 11G 0 | 60B 338B| 0 0 | 240k 678k
23 6 44 27 0 0| 11G 0 | 60B 338B| 0 0 | 244k 678k
23 6 43 28 0 0| 11G 0 | 60B 338B| 0 0 | 248k 677k
24 7 42 27 0 0| 11G 0 | 60B 338B| 0 0 | 249k 678k
24 7 42 27 0 0| 11G 12k| 60B 338B| 0 0 | 247k 678k
23 6 41 29 0 0| 11G 0 | 60B 616B| 0 0 | 251k 677k
13 5 56 26 0 0|9713M 0 | 292B 746B| 0 0 | 201k 579k
3 2 79 15 0 0|4167M 0 | 60B 338B| 0 0 | 87k 249k
4 2 81 14 0 0|3956M 0 | 60B 338B| 0 0 | 80k 235k
3 3 80 14 0 0|3953M 0 | 60B 338B| 0 0 | 82k 235k
3 2 79 15 0 0|3960M 0 | 60B 338B| 0 0 | 84k 236k
3 2 80 15 0 0|3970M 0 | 116B 398B| 0 0 | 83k 236k
0 0 98 1 0 0| 394M 0 | 60B 338B| 0 0 | 14k 25k
0 0 100 0 0 0| 0 0 | 240B 632B| 0 0 |4163 496
and
Run status group 0 (all jobs):
READ: io=202320MB, aggrb=9534.5MB/s, minb=9534.5MB/s, maxb=9534.5MB/s, mint=21220msec, maxt=21220msec
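(The read pass would be the same style of job with --rw=read substituted; again, the actual parameters aren't shown. The arithmetic checks out the same way: io=202320MB at 9534.5MB/s works out to the 21220msec reported.)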
100 of these units would put you at 1TB/s (100 × 10GB/s) with the BeeGFS (FastPath Forte’s standard) parallel file system and a single 100Gb link between them. With 2 links per unit, you would be closer to 50 systems for the same 1TB/s. Think about that. What could you do if you could move heretofore unimaginable volumes of data, at data rates you previously could not fathom? What sort of big data analytical problems could you solve? What sort of load could you handle?

We have customers who placed single units of our predecessor technology in front of large clusters to absorb loads that RACKFULLs and ROWFULLs of our competitors’ systems couldn’t handle without crashing. And they did so without breaking much of a sweat. We just upped the bar. Again.

These units are available. Now. In volume. And they are very reasonably priced (starting at $1 USD/GB). Contact us for more details.