Wednesday, August 24, 2016

Distributed File System benchmark

See my updated post here


I'm investigating various distributed file systems (loosely termed here to include SAN-like solutions) for use with Docker, Drupal, etc., and couldn't find recent benchmark stats for some popular solutions, so I figured I'd put one together.

Disclaimer: This is a simple benchmark test with no optimization or advanced configuration, so the results should not be interpreted as authoritative.  Rather, it's a rough ballpark product comparison to augment additional testing and review.


My Requirements:
  • No single-point-of-failure (masterless, multi-master, or automatic near-instantaneous master failover)
  • POSIX-compliant (user-land FUSE)
  • Open source (non-proprietary)
  • Production ready (version 1.0+, self-proclaimed, or widely recognized as production-grade)
  • New GA release within the past 12 months
  • *.deb package and easy enough to set up via CloudFormation (for benchmark testing purposes)

Products Tested:
  • GlusterFS 3.6.2 [2015-01-27]
    • (replicated volume configuration; see the setup sketch after this list) - CloudFormation script   (Ubuntu version)
  • LizardFS 2.5.4 [2014-11-07]
    • CloudFormation script   (Ubuntu version)
  • XtreemFS 1.5.1 [2015-03-12]
    • couldn't get the product to work ("Input/output error") - CloudFormation script   (Ubuntu version)
    Others:
      • Bazil is not production ready
      • BeeGFS server-side components are not open source (EULA, Section 2)
      • Behrooz (BFS) is not production ready
      • CephFS -- will be included in next round of testing
      • Chirp/Parrot does not have a *.deb package
      • Gfarm version 2.6.8 compiled from source kept returning x.xx/x.xx/x.xx from gfhost -H for any non-local filesystem node (and in general the documentation and setup process were poor)
      • GPFS is proprietary
      • Hadoop's HDFS is not POSIX-compliant
      • Lustre does not have a *.deb package
      • MaggieFS has a single point of failure
      • MapR-FS is proprietary
      • MooseFS only provides high availability in their commercial professional edition
      • ObjectiveFS is proprietary
      • OpenAFS's Kerberos requirement is too complex for CloudFormation
      • OrangeFS is not POSIX-compliant
      • Ori latest release Jan 2014
      • QuantcastFS has a single point of failure
      • PlasmaFS latest release Oct 2011
      • Pomegranate (PFS) latest release Feb 2013
      • S3QL does not support concurrent mounts and read/write from multiple machines 
      • SeaweedFS is not POSIX-compliant
      • SheepFS -- will be included in next round of testing
      • SXFS -- will be included in next round of testing
      • Tahoe-LAFS is not recommended for POSIX/fuse use cases
      • TokuFS latest release Feb 2014
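
For reference, the GlusterFS replicated-volume configuration mentioned above boils down to a few commands.  This is a minimal sketch with illustrative host and brick names, not the exact steps from my CloudFormation script:
# gluster peer probe gluster2
# gluster volume create myvolume replica 2 gluster1:/data/brick1 gluster2:/data/brick1
# gluster volume start myvolume
# mount -t glusterfs gluster1:/myvolume /mnt/glusterfs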

AWS Test Instances:
  • Debian 7 (wheezy) paravirtual x86_64 (AMI)
  • m1.medium (1 vCPU, 3.75 GB memory, moderate network performance)
  • 410 GB hard drive (local instance storage)

Test Configuration:

Two master servers were used for each test of 2, 10, and 18 clients.  Results of the three tests were averaged.  Benchmark testing was performed with bonnie++ 1.96 and fio 2.0.8.

Example Run:
$ sudo su -
# apt-get update -y && apt-get install -y bonnie++ fio
# screen
# bonnie++ -d /mnt/glusterfs -u root -n 4:50m:1k:6 -m "GlusterFS with 2 data nodes" -q | bon_csv2html >> /tmp/bonnie.html
# cd /tmp
# wget -O crystaldiskmark.fio http://www.winkey.jp/downloads/visit.php/fio-crystaldiskmark
# sed -i 's|directory=/tmp|directory=/mnt/glusterfs|' crystaldiskmark.fio
# sed -i 's/direct=1/direct=0/' crystaldiskmark.fio
# fio crystaldiskmark.fio
Translation: "Log in as root, update the server, install bonnie++ and fio, then run the bonnie++ benchmark tool in the GlusterFS-synchronized directory as the root user using a test sample of 4,096 files (4*1024) ranging between 1 KB and 50 MB in size spread out across 6 sub-directories.  When finished, send the raw CSV result to the HTML converter and output the result as /tmp/bonnie.html.  Next, run the fio benchmark tool using the CrystalDiskMark script by WinKey referenced here."
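
For reference, the CrystalDiskMark-style fio script runs a mix of sequential and random read/write jobs against the mounted directory.  The job file below is a minimal sketch of that kind of workload, not the actual WinKey script; the file name, section names, sizes, and runtimes are my own illustrative choices:
# cat > /tmp/example.fio <<'EOF'
; shared defaults for every job section below
[global]
ioengine=libaio
directory=/mnt/glusterfs
size=1g
direct=0
runtime=60

; sequential 1 MB reads, then writes (stonewall runs the sections one at a time)
[seq-read]
rw=read
bs=1m
stonewall

[seq-write]
rw=write
bs=1m
stonewall

; random 4 KB reads, then writes
[rand-read-4k]
rw=randread
bs=4k
stonewall

[rand-write-4k]
rw=randwrite
bs=4k
stonewall
EOF
# fio /tmp/example.fio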

Results:

[Benchmark result charts for each test configuration appeared here.]

(Note: raw results can be found here)

_______________________________________________________


Concluding Remarks:

Both GlusterFS and LizardFS had strong showings, with pros and cons for each.  Both should work fine for production use.  While not an endorsement, I will mention that GlusterFS had more consistent results (fewer spikes and outliers) between tests, and I also like the fact that GlusterFS doesn't distinguish between master servers (master-master peers versus LizardFS's master-shadow [slave] configuration).
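
To illustrate that difference: in GlusterFS every node in the trusted pool is an equal peer, while a LizardFS shadow master is explicitly configured to follow the active master.  A minimal sketch, based on my reading of each product's docs (hostnames are illustrative):
On GlusterFS, any node can pull another into the pool as an equal:
# gluster peer probe gluster-node2
On LizardFS, a shadow master declares its subordinate role in /etc/mfs/mfsmaster.cfg:
# grep -E 'PERSONALITY|MASTER_HOST' /etc/mfs/mfsmaster.cfg
PERSONALITY = shadow
MASTER_HOST = lizardfs-master1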

Update: GlusterFS requires your number of bricks to be a multiple of the replica count.  This adds complexity to your scaling solution.  For example, if you want two copies of each file kept in the cluster, you must add/remove bricks in multiples of two.  Similarly, if you want three copies of each file, you must add/remove bricks in multiples of three.  And so on.  Since one brick per server is the recommended best practice, this will also likely add cost to your scaling solution.  For this reason, I'm now preferring LizardFS over GlusterFS, since it does not impose that limitation.
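
To make that concrete, growing a replica-2 volume means adding bricks in pairs; something like the following (volume, host, and brick names are illustrative):
# gluster volume add-brick myvolume gluster3:/data/brick1 gluster4:/data/brick1
# gluster volume rebalance myvolume start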


P.S. Check out this related article by Luis Elizondo for further reading on Docker and distributed file systems.


