28 January 2022

How to implement `pixz` for the HiveOS PXE boot server and mining rig clients

 @hiveos
Have you (or anybody else) ever tried moving from pxz to pixz for the parallel compression and parallel decompression of the boot archive (i.e. moving from hiveramfs.tar.xz to hiveramfs.tpxz)?

I tried digging through the scripts, but I can't seem to find the part where the system knows to use tar to extract the hiveramfs.tar.xz file into tmpfs.

I've tried looking in /path/to/pxeserver/tftp and also in /path/to/pxeserver/hiveramfs, and I wasn't able to find where the instruction and/or command to unpack hiveramfs.tar.xz lives.

If you can provide some guidance as to where in the startup scripts the client is instructed to decompress and unpack hiveramfs.tar.xz, that would be greatly appreciated.

Thank you.

*edit*
I've now implemented pixz for both the parallel compression (creation) of the boot archive hiveramfs.tpxz and the parallel decompression of the same.

It replaces the boot archive hiveramfs.tar.xz.

If you are running an Ubuntu PXE boot server, the PXE server host will need to have pixz installed, which you can get by running sudo apt install -y pixz, so it's pretty easy to get and install.

The primary motivation for this is that on your mining rig, depending on the CPU that you have in it, you will usually have excess CPU capacity at boot time; therefore, if you can use parallel decompression for the hiveramfs archive, you can get your mining rig up and running that much quicker.

A side benefit is that in the management of the hiveramfs image on the PXE server, pixz also worked out to be faster than pxz at creating the FS archive.

Tested on my PXE server, which has a Celeron J3455 (4-core, 1.5 GHz base clock): pxz compressed the FS archive (on a fresh install of the HiveOS PXE server) in 11 minutes 2 seconds, whilst pixz completed the same task in 8 minutes 57 seconds. (Sidebar: for reference, when previously using only xz (without parallelisation), my system would take somewhere between 40 and 41 minutes to create the FS archive.)
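(Put another way: 11 minutes 2 seconds is about 662 seconds and 8 minutes 57 seconds is about 537 seconds, so pixz cut the archive creation time by roughly 19% relative to pxz, and by roughly 4.5x relative to the ~40.5 minutes that plain xz took.)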

On my mining rig, which has a Core i5-6500T, it takes about 8.70 seconds to decompress hiveramfs.tpxz to hiveramfs.tar and then it takes about another 1.01 seconds to unpack the tarball file.
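(If you want to reproduce that measurement, something along these lines is all it takes; note that pixz, like xz, deletes its input file by default, so the -k flag tells it to keep hiveramfs.tpxz around:)

time pixz -k -d hiveramfs.tpxz   # writes hiveramfs.tar (~8.7 s on the i5-6500T)
time tar -xf hiveramfs.tar       # unpacks into the current directory (~1.0 s)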

Unfortunately, I don't have the benchmarking data for how long it took my mining rig to decompress and unpack hiveramfs.tar.xz file.

Here are the steps to deploy pixz and use it to replace pxz.

On the PXE server, install pixz:
sudo apt install -y pixz

Run pxe-config.sh to specify your farm hash, server IPv4 address, etc., and also to change the name of the FS archive from hiveramfs.tar.xz to hiveramfs.tpxz.

DO NOT run the HiveOS update/upgrade yet!!!

When it asks if you want to upgrade HiveOS, type n for no.

For safety/security, make a backup copy of the initial hiveramfs.tar.xz file that can be found in /path/to/pxeserver/hiveramfs.

(For me, I just ran sudo cp hiveramfs.tar.xz hiveramfs.tar.xz.backup.)

You will need to manually create the initial hiveramfs.tpxz file that the system will act upon next when you run the hive-upgrade.sh script.

To do that, run the following:

/path/to/pxeserver$ sudo mkdir -p tmp/root
/path/to/pxeserver$ cd tmp/root
/path/to/pxeserver/tmp/root$ cp ../../hiveramfs/hiveramfs.tar.xz .
/path/to/pxeserver/tmp/root$ tar -xf hiveramfs.tar.xz
/path/to/pxeserver/tmp/root$ rm hiveramfs.tar.xz
/path/to/pxeserver/tmp/root$ tar -I pixz -cf ../hiveramfs.tpxz .
/path/to/pxeserver/tmp/root$ cd ..
/path/to/pxeserver/tmp$ cp hiveramfs.tpxz ../hiveramfs
/path/to/pxeserver/tmp$ cd ../hiveramfs
/path/to/pxeserver/hiveramfs$ cp hiveramfs.tpxz hiveramfs.tpxz.backup
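(Note the rm of the copied hiveramfs.tar.xz before the new archive gets created; without it, the old tarball would get packed into the new image, the same issue as with the leftover hiveramfs.tar described further below. Before going any further, you can also sanity-check the new archive by listing its contents, since GNU tar will run pixz -d for you when you hand it -I pixz:)

/path/to/pxeserver/hiveramfs$ tar -I pixz -tf hiveramfs.tpxz | head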


Now, edit pxe-config.sh:
At about line 51, it should say something like:
#adde pxz (typo included in the original)

Copy lines 51-53 and paste them after line 53.

(Basically, add an i so that where it says pxz, it now says pixz instead.) Edit the pasted lines to read:
#adde pixz
dpkg -s pixz > /dev/null 2>&1
[[ $? -ne 0 ]] && need_install="$need_install pixz"
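(The whole section should then end up looking something like this; exact line numbers may drift between HiveOS PXE server versions:)

#adde pxz (typo included)
dpkg -s pxz > /dev/null 2>&1
[[ $? -ne 0 ]] && need_install="$need_install pxz"
#adde pixz
dpkg -s pixz > /dev/null 2>&1
[[ $? -ne 0 ]] && need_install="$need_install pixz"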


save, quit

Run pxe-config.sh again.

DO NOT run the HiveOS update/upgrade yet!!!

Now, your farm hash, IP address, etc. should all have been set previously. Again, when it asks you if you want to upgrade HiveOS, type n for no.

Now, we are going to make a bunch of updates to hive-upgrade.sh.

(For me, I still use vi, but you can use whatever text editor you want.)

/path/to/pxeserver$ sudo vi hive-upgrade.sh
at line 71, add pixz to the end of the line so that the new line 71 would read:
apt install -y pv pixz

I haven't been able to figure out how to decompress the hiveramfs.tpxz archive and unpack it in the same line.

(I was also unable to get pv working properly to show a progress indicator, so if someone smarter than I am can help figure that out, that would be greatly appreciated. In the absence of said progress indicator, you can SSH into your PXE server in another terminal window and run top to make sure that it is still working.)
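(For what it's worth, GNU tar can invoke the decompressor itself when you pass -I pixz during extraction, since it runs pixz -d under the hood, so something like the following might collapse the copy/decompress/unpack steps into a single line. I have not tested this, and pixz may not be able to decompress in parallel when it is reading from a pipe rather than from a seekable file:)

pv $FS | tar -I pixz -xf -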

So the section starting at line 79 echo -e "> Extract Hive FS to tmp dir" now reads:

line80: #pv $FS | tar --lzma -xf -
line81: cp $FS .
line82: pixz -d $ARCH_NAME
line83: tar -xf hiveramfs.tar .
line84: rm hiveramfs.tar


Line 84 is needed because otherwise, when the script goes to create the new archive, it would pack the leftover hiveramfs.tar into it as well, and you don't want that.

Now fast forward to the section where it creates the archive (around line 121) where it says:
line121: echo -e "> Create FS archive"
line122: #tar -C root -I pxz -cpf - . | pv -s $arch_size | cat > $ARCH_NAME
line123: tar -C root -I pixz -cpf - . | pv -s $arch_size | cat > $ARCH_NAME


(in other words, copy that line, paste it, comment out the old line, and add an i to the new line.)

Line 125 is still the old line that used the single-threaded xz compression tool; it should already be commented out for you.

The rest of the hive-upgrade.sh should be fine. You shouldn't have to touch/update the rest of it.

Now you can run hive-upgrade.sh:
/path/to/pxeserver$ sudo ./hive-upgrade.sh

While it runs, check to make sure that it is copying hiveramfs.tpxz from /path/to/pxeserver/hiveramfs to /path/to/pxeserver/tmp/root, decompressing the archive, and unpacking the files properly.

If it does that properly, then the updating portion should run fine, without any issues (or none that I observed).

The next thing to check is that when it repacks and compresses the archive back up, that step also works properly for you.

Again, it is helpful to have a second terminal window open where you've SSH'd into the PXE server, with top running, so that you can confirm that the pixz process is running.

After that is done, you can reboot your mining rig to make sure that it picks up the new hiveramfs.tpxz file and that it is also successful in decompressing and unpacking the archive.

I have NO idea how it is doing that because normally, I would have to issue that as two separate commands, but again, it appears to be working with my mining rig.

*shrug*

It's working.

I don't know/understand why/how.

But I'm not going to mess with it too much to try and figure out why/how it works, because it IS working.

(Again, if there are other people smarter than I am who might be able to explain how it is able to decompress and unpack a .tpxz file, I would be interested in learning, but on the other hand, like I said, my mining rig is up with the new setup, so I'm going to leave it here.)
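*update* A probable explanation, for what it's worth: a .tpxz file is still a standard xz stream (pixz just writes it as many independently-compressed blocks plus an index, which is what lets it parallelise), so any xz-aware tar can decompress and unpack it in a single step, regardless of the file extension. You can check this yourself:

$ file hiveramfs.tpxz
hiveramfs.tpxz: XZ compressed data
$ tar -xf hiveramfs.tpxz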

Feel free to ask questions if you want to implement pixz for faster compression and decompression times.

If your PXE server is fast enough that pxz works well for you and this isn't going to make enough of a difference, then that's fine. That's up to you.

For me, my PXE server running on a Celeron J3455 is quite slow, so anything that I can do to speed things up a little bit is still a speedup.

Thanks.

06 January 2022

Getting the latest and greatest hardware running in Linux is sometimes a bit of a nightmare

Just prior to the holidays, I decided to upgrade my systems and consolidate from three down to two. My old Supermicro Big Twin^2 Pro micro cluster server and two HP Z420 workstations (which I was using in lieu of the Supermicro because the Supermicro was "too loud") were getting replaced by an AMD system, built on the Ryzen 9 5950X CPU, and an Intel system, built on the latest and greatest that Intel had to offer - the Core i9-12900K.

So, I specced out the rest of the hardware, which really consisted of the motherboard, RAM, and the CPU heatsink and fan assembly, whilst I was able to reuse some of my older, existing components as well. (I did have to buy an extra power supply though, because I had originally miscalculated how many power supplies I would need.)

So that's all fine and dandy. All of the hardware arrived just before the start of the Christmas break for me, so I started to set up the AMD system. Install the CPU, the RAM, the CPU HSF, plug everything in, check and double-check all of the connections - everything is good to go. I used Rufus to write the CentOS 7.7.1908 installer onto a USB drive, plugged in the keyboard and mouse, flipped the switch on the power supply, and off I go, right?

[buzzer]

Nope!


Near instant kernel panic. Nice.

 

As you can see from the picture above, less than 3 seconds into the boot sequence from the USB drive - Linux has a kernel panic.

Great.

So now I get the "fun" [/sarcasm] job of trying to sort this kernel panic out. Try it a few more times, the same thing happens.

So, ok. Now I'm thinking that the hardware is too new for this older Linux distro and version (and kernel). So, I pull out my Intel Core i7-3930K system (one of the systems that I use to run my tape backup system), plug the hard drive into that system along with the video card, and run through the boot and installation process (which worked without any issues, of course). Then I power down the 3930K, take the hard drive back out, and plug it into the 5950X system. Power it on. (I set the BIOS to power on after AC loss so that I can turn the system on even when it isn't inside a case and doesn't have a power button connected to it.)

The official CentOS forums state that they only support CentOS 7.9.2009, so I try that as well, still to no avail.

Eventually, I end up using a spare Intel 545 series 512 GB SATA 6 Gbps SSD that I had lying around so that I could try installing and re-installing, and try different drivers, kernel modules, kernels, etc., a LOT faster than I was able to with a 7,200 rpm HDD.

Net result: I filed a bug report with kernel.org because the mainline kernel 5.15.11 kept producing kernel panics with the Mellanox 100 Gbps Infiniband network card installed. And it didn't matter whether I tried the "inbox" CentOS Infiniband drivers or the "official" Mellanox OFED Infiniband drivers.

Yet another Linux kernel panic.

Interestingly enough, the mainline kernel 5.14.15 works with the Infiniband NIC just fine. So that's what I landed on/with. 

The other major problem that I ran into was that the Asus X570 TUF Gaming Pro (WiFi) uses the Intel I225-V 2.5 GbE NIC. Unbeknownst to me when I originally purchased the motherboard, Intel does NOT provide a Linux driver (even on Intel's website) for said Intel I225-V 2.5 GbE NIC. What was weird was that when I was migrating the SSD over during testing, trying to find a configuration that worked, said onboard 2.5 GbE NIC would work initially, but then it would periodically drop out. That was quite the puzzle: if there wasn't a driver for it, then how was it able to work at all when I moved the drive over?
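(If you want to see which kernel module actually bound the NIC, something like this works; the interface name here is a placeholder for whatever your system calls it:)

$ ethtool -i enp5s0
driver: igc
...

(The newer elrepo kernels ship the igc module for the I225-V, which would explain how the NIC could work at all without a separate driver download from Intel.)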

That ate up a couple of days, during which I was trying to clone the disk image from the Intel SSD over onto the HGST HDD using dd, and in the end, that didn't work either.
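(For reference, a typical dd invocation for whole-disk cloning looks like the following; the device names are placeholders, and you need to be absolutely certain which disk is which before you run it:)

$ sudo dd if=/dev/sdX of=/dev/sdY bs=1M conv=fsync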

So, what did I end up with?

This is the hardware specs that I ended up with on the AMD system:

CPU: AMD Ryzen 9 5950X (16-core, 3.4 GHz stock base clock, 4.9 GHz max boost clock, SMT enabled)

Motherboard: Asus X570 TUF Gaming Pro (WiFi)

RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)

CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan

Video card: EVGA GeForce GTX 980

Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD

NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)

NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)

Power Supply: Corsair CX750M

OS: CentOS 7.7.1908 kernel 5.14.15-1-el7.elrepo.x86_64

 

I ended up adding the Intel Gigabit CT Desktop NIC because a) it was an extra Intel GbE NIC AIC that I had lying around, and b) it proved to provide a vastly more reliable connection than the onboard Intel I225-V 2.5 GbE, due to the driver issue.

Now that I have the system set up and running, there is a higher probability that the igc kernel module works more reliably now than it did when I was originally setting up the system, but given that it was not reliable during the initial setup and testing, I am less likely to use said onboard NIC, which is a pity. Brand spankin' new motherboard and I can't even use or trust the reliability of the onboard NIC. And I can't even blame Asus for it, because it is an Intel NIC. (Sidebar: ironically, the Asus Z690 Prime-P D4 motherboard that I also purchased uses a Realtek RTL8125 2.5 GbE NIC, which I WAS able to find a driver for, and it has been working flawlessly.)

That took probably on the order of around 10 days, from beginning to end, to get the AMD system up and running.


The Intel system was a little bit easier to set up.

The kernel panic issue with the mainline 5.15.11 kernel and Infiniband was also present on the Intel platform as well.

Interestingly and ironically enough, the newer kernel kept crashing or had severe stability issues. It turns out that I did NOT install the RAM correctly (i.e. in the DIMM_A2 and DIMM_B2 slots), so since then, I've corrected that.

Keen readers might note that I have stated that I have 4 sticks of RAM; however, one of the sticks arrived DOA and is currently being sent back to Crucial under RMA. When it comes back, I will be able to install both the extra stick that is currently not installed and the stick that is due back from the RMA exchange.

I might try the newer kernels again later, but for now, at least the system is up and running so that I can start making it do the work that I need it to be doing.

Because of the error I made when installing (and uninstalling) the RAM (I was testing the stick that wouldn't POST, which ended up getting RMA'd back to Crucial), I ended up with a RAM configuration that wasn't correct, and the resulting system stability issues ate up a few more days.

So, in the end, it took me almost the entire Christmas holiday to get both of these systems up and running.

(This is also a really good reason why traditionally, I have stuck with workstation and server hardware because on my old Supermicro micro cluster, I can deploy all four nodes in 2 hours or less. It's a pity that the system is too loud.)


This is the hardware that I ended up with on the Intel system:

CPU: Intel Core i9-12900K (16 cores (8P + 8E), 3.2 GHz/2.4 GHz base clock speed, 5.2 GHz/3.9 GHz max boost clock, HTT enabled)

Motherboard: Asus Z690 Prime-P D4

RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)

CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan

Video card: EVGA GeForce GTX 660

Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD

NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)

NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)

Power Supply: Corsair CX750M

OS: CentOS 7.7.1908 kernel 3.10.0-1127.el7.x86_64

AMD Ryzen 9 5950X is faster than the Intel Core i9-12900K for mining Raptoreum

The results speak for themselves.

 

The AMD Ryzen 9 5950X is faster at mining Raptoreum than Intel's latest and greatest 12th gen Core i9-12900K.

 

System/hardware specs:

AMD:

CPU: AMD Ryzen 9 5950X (16-core, 3.4 GHz stock base clock, 4.9 GHz max boost clock, SMT enabled)

Motherboard: Asus X570 TUF Gaming Pro (WiFi)

RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)

CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan

Video card: EVGA GeForce GTX 980

Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD

NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)

NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)

Power Supply: Corsair CX750M

OS: CentOS 7.7.1908 kernel 5.14.15-1-el7.elrepo.x86_64


Intel:

CPU: Intel Core i9-12900K (16 cores (8P + 8E), 3.2 GHz/2.4 GHz base clock speed, 5.2 GHz/3.9 GHz max boost clock, HTT enabled)

Motherboard: Asus Z690 Prime-P D4

RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)

CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan

Video card: EVGA GeForce GTX 660

Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD

NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)

NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)

Power Supply: Corsair CX750M

OS: CentOS 7.7.1908 kernel 3.10.0-1127.el7.x86_64


Configuration notes:

I had about two weeks over the Christmas break 2021 to receive all of the hardware, assemble the systems, and get them set up and running. That was quite the endeavour, because the older version of CentOS (7.7.1908) and its older kernel didn't support all of the features and functions of this level of hardware.

As a result, I had to "jumpstart" both systems by first installing the OS using my Intel Core i7-3930K system (Asus X79 Sabertooth motherboard, 4x Crucial 8 GB DDR3-1600 unbuffered, non-ECC RAM, Mellanox MCX456A-ECAT, GTX 660), and then updating the systems (at least in part) before I could transplant the hard drive with the OS install into their respective systems and finish setting them up. (I will write more about that "journey"/clusterf in another blog post here shortly, because it was quite the journey to jumpstart both of these systems simultaneously, which took pretty much the full two weeks that I had.)

You will find out how and why I ended up with the respective hardware choices in that blog post.

I am using cpuminer-gr-1.2.4.1-x86_64_linux from here (https://github.com/WyvernTKC/cpuminer-gr-avx2/releases/tag/1.2.4.1).

For the Intel system, the combination of (P)erformance cores, (E)fficiency cores, and HyperThreading resulted in more configurations that I had to test in order to find the setting with the highest Raptoreum hash rate, as reported by the miner's benchmarking tool. Each time the CPU configuration changed, I ran a full tune again, which, as you might well imagine, took quite some time to do. In the cases where the efficiency cores were disabled, I also tested and re-ran the full tune for both AVX2 and AVX512.
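(For reference, this is roughly the shape of the benchmark/tune invocation; the flags are per the cpuminer-gr README, and -t 24 here assumes 8P cores with HyperThreading plus 8E cores, so check your version's --help and adjust the thread count for your configuration:)

$ ./cpuminer -a gr --benchmark --tune-full -t 24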

The AVX512 runs (both times I ran it, i.e. with and without HyperThreading) resulted in thermal throttling, with an ambient temperature of about 23-24 C at the time.

For the AMD system, testing was a lot simpler, because the only variable was SMT on or off.

Results:


The results speak for themselves.

The AMD Ryzen 9 5950X with SMT enabled produces the highest hash rate (3953.64 hashes/second). Compared with the run where SMT was disabled, enabling SMT results in about a 10.3% increase in hash rate.
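(That implies the SMT-off run came in at roughly 3953.64 / 1.103, or about 3585 hashes/second.)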

The Intel Core i9-12900K results are an interesting case. Despite the plethora of benchmarks talking about how great and how fast the latest and greatest from Intel is (and there ARE some things that it is great at), Raptoreum mining, unfortunately, is not one of them.

At best, the 5950X with SMT enabled is about 80.9% faster in Raptoreum hash rate than the 12900K with all 16 cores AND HyperThreading enabled.

Comparing like-for-like thread counts: against the 12900K at 8P+8E without HyperThreading, the 16-core/16-thread 5950X is still approximately 64.2% faster. With the efficiency cores off but HyperThreading back on (i.e. 8P+0E with HyperThreading), the 12900K is again bested by the 5950X, this time by a 71.8% margin.

Unfortunately, running this in Linux meant that I didn't have (or didn't know of) a tool like HWiNFO64 to report power consumption figures. Maybe I will get around to re-running this test in Windows, but for now, this might be helpful to those who are looking for guidance if mining Raptoreum is on your mind.