27 July 2021

Apparently, GlusterFS No Longer Supports RDMA And You Can't Use It Across Ramdrives Anymore

Back in the day, I used to run CentOS 7.6.1810 with GlusterFS 3.7, and I was able to create ramdrives on those systems and then tie a bunch of them together with GlusterFS.

Apparently, that's not the case anymore, and it hasn't been since version 5.0, when RDMA support was deprecated.

Bummer.

Here's why this was important (and useful):

SSDs, regardless of whether they're consumer-grade or enterprise-grade, are all built on NAND flash memory cells/chips/modules that have a finite number of program/erase cycles.

As a result, ALL SSDs are consumable wear components (like brake pads on a car) that are designed to be replaced after a few years due to said wear. (This is a point that, unfortunately, I don't think the SSD industry as a whole spends enough time focusing on, because a LOT of people were and are using SSDs as boot drives, and because a boot drive has a finite number of program/erase cycles, it is only a matter of time before the system fails. But I'm going to write/rant about that some other time/day.)

But for now, the key takeaway is that SSDs have a finite number of program/erase cycles, and that wear will eventually cause them to fail.

So, in the HPC space, where I am running simulations, I can produce a LOT of data over the course of a run - sometimes into PB-per-run territory.

So if I have a large amount of data that needs to be read and written, but I don't need to keep all of the transient data that the solver produces over the course of a simulation, then I want that scratch storage to be as fast as possible, but also NOT have it be a money pit where I am constantly pouring money into replacing SSDs (again, regardless of whether it's consumer-grade SATA SSDs or enterprise-grade U.2 NVMe SSDs).

So, this was where the idea came from - what if I were to create a bunch of ramdrives, and then tie them together somehow?

Originally, GlusterFS was able to do this with version 3.7.

I would be able to create a tmpfs mount point on each node, use it as a GlusterFS brick, create a GlusterFS volume from those bricks, and then export that volume onto the InfiniBand network as an NFSoRDMA file system.
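For reference, this is roughly what that setup looked like. I'm reconstructing it from memory against the old 3.x-era syntax, and the host names, brick paths, tmpfs size, and stripe count below are made-up placeholders, so treat it as a sketch rather than a copy-and-paste recipe. (I've also left out the NFS-over-RDMA export step, since I don't remember the exact options I used for that part.)

    # On each node: create a ramdrive and a brick directory on it
    mount -t tmpfs -o size=64g tmpfs /mnt/ramdisk
    mkdir -p /mnt/ramdisk/brick

    # From one node: pool the peers and create the volume over the RDMA transport
    gluster peer probe node2
    gluster peer probe node3
    gluster peer probe node4
    gluster volume create ramvol stripe 4 transport rdma \
        node1:/mnt/ramdisk/brick node2:/mnt/ramdisk/brick \
        node3:/mnt/ramdisk/brick node4:/mnt/ramdisk/brick
    gluster volume start ramvol

    # On the clients: mount the volume over the InfiniBand fabric
    mount -t glusterfs -o transport=rdma node1:/ramvol /mnt/scratch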

And it worked ok for the most part.

I think I was getting somewhere around maybe 30 Gbps write speeds on it (for the distributed striped volume).

Lately, I wanted to try and deploy that again, but for creating plots for the chia cryptocurrency.

Apparently, that isn't possible anymore.

And that just makes me sad because it had so much potential.

You can create the tmpfs.

Gluster will make you think that you can create the Gluster bricks and volume.

Gluster lies - you only find out when you attempt to mount the gluster volume that it never really created the bricks (on tmpfs) to begin with.

And then all Gluster hell breaks loose, because it thinks that the bricks are already part of a gluster volume, which locks the bricks and the volume together, and nowhere in the Gluster documentation does it tell you how to dissociate a brick from a volume or vice versa.
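For what it's worth, the workaround I eventually pieced together from mailing list posts (not from the official documentation) is that Gluster stamps each brick directory with a couple of extended attributes and a .glusterfs directory when the brick joins a volume, and clearing those is what actually dissociates the brick again. Something along these lines, where /mnt/ramdisk/brick stands in for whatever your brick path actually is:

    # On each node, for each stuck brick path: clear the volume-id and gfid
    # extended attributes that glusterd checks, and the .glusterfs metadata dir
    setfattr -x trusted.glusterfs.volume-id /mnt/ramdisk/brick
    setfattr -x trusted.gfid /mnt/ramdisk/brick
    rm -rf /mnt/ramdisk/brick/.glusterfs
    systemctl restart glusterd

And since these particular bricks live on tmpfs anyway, simply unmounting the tmpfs and recreating it wipes the slate clean too.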

And that's too bad, because GlusterFS had so much potential.