I bought a new Core i7-13700K and installed it into my Asus Z690 Prime P that I bought about a year ago.
Question: How do you update the BIOS for a system that won't POST (so that it would be able to recognise said new processor)?
22 October 2022
Paradoxical conundrum...
24 August 2022
It's amazing how power efficient new systems are compared to (much) older ones
Father-in-law, I think, was originally having some kind of problem with his old, old computer, and as a result, I ended up giving him my old Intel Core 2 Quad Q9550 system.
Recently, said Q9550 system started to have some issues, so I gave him my Intel NUC NUC7i3BNH (Intel Core i3-7100U (2-core, HTT enabled); it originally had 4 GB of RAM, but I upgraded that to 8 GB (2x 4 GB), and it originally came with a 16 GB Intel Optane module and a 1 TB HGST 2.5" 7200 rpm HDD, but I swapped that out for, I think, an Intel 520 Series SSD). Anyways, I digress.
I don't know if I ever took power measurements for the NUC (probably not), so let's instead take, for example, the Beelink GTR5 5900HX system, which, at idle, could be sipping somewhere between 9 and maybe 16 W of power.
Compare and contrast that to the old Q9550 system, which has 4x 2 GB G.Skill DDR2-800 RAM, an Nvidia GTX 260, a 610 W PSU, and a single (I think it's an) Intel 525s Series 240 GB SSD in it. At idle, it's sucking back somewhere between 120-160 W.
That's CRAZY!!!
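To put those idle numbers into perspective, here is a rough sketch that converts an idle wattage into energy used per year. It assumes the machine sits at idle 24 hours a day, 365 days a year, which is an assumption for illustration and not something that was measured; `annual_kwh` is just a helper name made up for this example.

```shell
# Energy used in one year of continuous idling, in kWh.
# ASSUMPTION: the system idles 24 h/day, 365 days/year.
annual_kwh() {
  awk -v w="$1" 'BEGIN { printf "%.1f\n", w * 24 * 365 / 1000 }'
}

annual_kwh 16    # upper end of the GTR5 idle range
annual_kwh 120   # lower end of the Q9550 idle range
annual_kwh 160   # upper end of the Q9550 idle range
```

Under that assumption, the Q9550 system lands at roughly 1,050 to 1,400 kWh per year just sitting there, versus about 140 kWh at the top of the GTR5's idle range.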
I thought that I was going to re-purpose that system to be a server of some kind. But now, I'm not so sure.
Granted, the Q9550 system does have a Gigabyte EP45-UD3P motherboard in it, and as such, sports 8 SATA 3 Gbps ports. And the Intel Core 2 Quad Q9550 does support Intel VT-x and Intel VT-d, which means that, again, in theory, I can run a few virtual machines on it and throw TrueNAS onto that system and make it into a storage server.
I don't know if I'm going to do that for sure yet, but it is a potential option.
But man, that idle power is really making me re-think that plan. (Sadly, I'm not sure if newer servers would really be that much more efficient. Desktop systems and/or mini-PCs, yeah, but towers and/or servers - I don't know about that.)
The BitComet client has gone to shit.
https://www.cometforums.com/topic/12802501-bitcomet-causing-excessive-ping-times/page/2/
"If you are so unhappy using BitComet as your client (you have previously stated that you are using it at the same time as two other clients), I suggest that, at least, you show that you do possess some amount of common courtesy."
Read my initial posts.
I was merely and simply stating "hey, I think there's a problem here".
Rhubarb repeatedly denied that the problem even exists, let alone offer anything that resembles help and/or assistance.
Rhubarb's response is akin to how companies blame independent media outlets when said independent media outlets find issues with said company's products (which Steve from GamersNexus has made references to).
The fact that I cited the old forum posts where Rhubarb SPECIFICALLY and EXPLICITLY asks for ping time data, whilst on here he argues that it's not about ping times, is laughable at the very least.
Interesting how you make no mention of this fact in your reply.
If Rhubarb is going to be belligerent, then you can't be surprised when said belligerence is going to be met with belligerence.
If Rhubarb doesn't know how to help and/or doesn't want to help, then he could've just plainly and pointedly stated that.
But that wasn't the case here.
"I have been a resident of BitComet Forums, assuredly, a lot longer than yourself, and I am always amazed at how imperious some users like to sound and, rather than thanking those who donate their free time to attempt to aid others (not being remunerated, by the way), feel it is their God-given right to insult and try to belittle them, just because they do not see eye-to-eye with what is suggested. Thank Goodness that this is not the case of the vast majority of the more than 100,000 worldwide users of this free application!! "
Once again, if you have actually READ my posts and Rhubarb's responses, it LITERALLY reads:
Me: "Hey, I think there's a problem with the program."
Rhubarb: "No, there isn't."
Me: "Yes, there is. And here is the data to prove it."
Rhubarb: "No, there isn't."
Me: "Well, I googled it and this is how I found this forum because other people reported about the same thing."
Rhubarb: "If this was a problem, there'd be all sorts of 'me too' posts."
Me: "But there are 'me too' posts."
Rhubarb: "No, there isn't."
Me: "Yes, there are. Here are the quotations from those threads, and here are the links to those posts."
Rhubarb: "No, there isn't."
(see a pattern here?)
So, why would I thank someone who is in denial about a problem???
That makes no sense.
Would you ever say "thank you for beating me" to an alcoholic who beats their wife and kids (because they're an alcoholic)? That's absolutely ridiculous.
You can literally conduct your internal review of how Rhubarb could've handled this better cuz right now, he's at the same level as the Enermax issue.
"Would that decrease your infinite rage and please your Magnanimous self? "
Why would I need to do that?
Again, the other thread tells you that it's a problem with how the client tries to establish a connection to the DHT network.
It's a very simple question: on startup, what does the client attempt to do as it is trying to establish a connection to the DHT network?
It would appear that no one here has ever bothered to ask this very simple, basic question as it pertains to this issue which might be the responsible party for both, this thread, and the previous thread that was filed a year and 8 months ago.
"Thank them profusely for fixing my car and go on my merry way. There is a Spanish proverb that says that 'to be thankful is a sign of being well-bred' ("Ser agradecido es de ser bien nacido"). "
And that's the difference between your mechanic and Rhubarb.
Rhubarb never made it to trying to profusely fix the client.
That's the difference between your scenario and this one.
---
The response from their forum is an example of how not to handle a problem when users/clients are reporting a problem.
You can read the rest of the thread to see what I'm talking about there.
BitComet sucks.
Use something else instead.
*edit 2022-09-01*
BWAHAHAHAHA.....
The mods at the BitComet forum have now banned me from said forum because I reported an issue, and they refused to fix it.
LOL...LMAO....
Fuck BitComet. It's LITERAL trash.
22 August 2022
AMD Ryzen 9 5950X may NOT be as fast for CFD as I otherwise thought/hoped
So this test is based on the same testcase, but just testing it with two different CFD applications.
Both are steady-state solutions (which is normally used to initialise the flow field for the transient solution, which I am not testing at the moment).
The AMD Ryzen 9 5950X cluster is two nodes, where each node has an AMD Ryzen 9 5950X (16-cores, SMT disabled), 128 GB of DDR4-3200 unbuffered, non-ECC RAM, and a Mellanox ConnectX-4 MCX456A-ECAT 100 Gbps Infiniband network card whilst the Xeon cluster is two nodes, each with dual Intel Xeon E5-2690 (V1, 8-cores each, HTT disabled for both processors), 128 GB of DDR3-1866 2Rx4 Registered ECC RAM running at DDR3-1600 speeds.
In one of the applications, the AMD Ryzen 9 5950X finishes the solution in 23342.021 seconds whilst the Xeon pair of nodes finishes the same steady state solution in 15834.675 seconds (or about a 32.16% reduction in wall clock time), which is rather significant. This run has about 13.4 million cells and it takes this long because it is running for 1000 iterations.
And then in another, different CFD application, also running the steady-state solution but for 48 iterations, the AMD system finishes the solution in 292.665 seconds whilst the Xeon system finishes it in 264.48 seconds, or about 9.63% faster.
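For anyone who wants to check the percentages, this is a minimal sketch of the arithmetic using the wall clock times quoted above. `pct_reduction` is a helper name made up for this example; it reports how much shorter the faster run is, relative to the slower one.

```shell
# Percentage reduction in wall clock time: (slow - fast) / slow * 100
pct_reduction() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", (slow - fast) / slow * 100 }'
}

pct_reduction 23342.021 15834.675   # first application, 1000 iterations
pct_reduction 292.665 264.48        # second application, 48 iterations
```

Note that both figures are reductions in run time measured against the Ryzen result, which is not quite the same thing as a speed-up ratio.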
That's really interesting that the AMD Ryzen 9 system, despite it being 8 and a half years newer, still isn't able to be as fast as an older Xeon-based cluster.
The only real upside to using the Ryzen-based system over the Xeon based system -- well, two things actually are:
1) The Ryzen based system uses quite a lot less power compared to the Xeon cluster. It isn't unusual for me to see power consumption, under load, of around or upwards of 1 kW for just running two nodes (and running all four nodes pushes that total up to somewhere between 1.6-1.9 kW) whereas the Ryzen based systems, combined, are probably only using about 400 W total.
2) The Ryzen based system is a LOT quieter than the Xeon Supermicro Twin Pro^2 server (6027TR-HTRF).
So, if you're running it in a home lab environment where you don't live by yourself, then despite it being slower, it might still be a better alternative for these two reasons.
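One way to look at the power trade-off is energy per finished solution rather than power draw alone. The sketch below multiplies the run times from the first application by the rough power figures given above (about 1 kW for the two Xeon nodes, about 400 W for the two Ryzen nodes). Those wattages are ballpark estimates, not measurements taken during these runs, so treat the output as an order-of-magnitude illustration only. `run_kwh` is a made-up helper name.

```shell
# Energy for one run in kWh: seconds * watts / 3,600,000
# ASSUMPTION: power draw is constant for the whole run at the quoted rough figure.
run_kwh() {
  awk -v s="$1" -v w="$2" 'BEGIN { printf "%.2f\n", s * w / 3600000 }'
}

run_kwh 15834.675 1000   # Xeon pair at roughly 1 kW
run_kwh 23342.021 400    # Ryzen pair at roughly 400 W
```

With those assumed figures, the slower Ryzen pair would still use noticeably less energy per solution than the Xeon pair.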
And the Ryzen based solution is certainly cheaper than the Threadripper, Threadripper Pro, and/or AMD EPYC solution platforms, where you might be able to get some of that performance back, but I can't say for certain without actually testing it myself because I thought that having the 16 faster clock speed cores on the Ryzen 9 5950X would be faster than the Xeon E5-2690 platform. Based on the data and the results, I stand corrected.
I did not expect that.
15 June 2022
Engineering data consolidation efforts
Since I built my Ryzen 5950X system, and the 12900K system, and then had to completely disassemble the 12900K system, and then built another Ryzen 5950X system whilst arguing with Asus, I was in the middle of a data consolidation effort for all of my engineering data from the various projects that I've worked on over the years.
Today marks the day where the first pass of this data consolidation effort has completed and I ended up saving almost 14 TB of storage space.
It feels nice, and I get a sense of accomplishment as the data is being written to tape right now.
I can't believe that it's taken me like about 6 months to finish this data consolidation effort.
At some points during the process of unpacking, packing, and then re-packing the data, both Ryzen 5950Xs and also the Intel Core i7-4930K that's in the headnode were oversubscribed 3:1 when they were processing the data. That just seems pretty crazy to me because that's also a little bit of an indication as to how much work the CPUs had to do to process and re-process the data.
Not to mention, my poor, poor hard drives, that have been working so hard throughout all of this.
13 June 2022
Welp....this is a problem.
Let me begin with the problem statement:
What you see above are the results from the 100 Gbps Infiniband network bandwidth test between my two AMD Ryzen 5950X systems. Both of them have a discrete GPU in the primary PCIe slot, and then the Mellanox ConnectX-4 dual port, 100 Gbps Infiniband NIC is in the next available PCIe slot.
I can't really tell from the motherboard manual for the Asus ROG Strix X570-E Gaming WiFi II motherboard what speed the second PCIe slot is supposed to be when there is a discrete GPU plugged into the primary PCIe slot.
The Mellanox ConnectX-4 card is a PCIe 3.0 x16 card, which means that the slot itself is supposed to support up to 128 Gbps raw (and the ports themselves are supposed to go up to a maximum of 100 Gbps out of the 128 Gbps that's theoretically available). If the slots were running as PCIe 3.0 x4, they should be capable of 32 Gbps.
As the results show, clearly, that is not the case.
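The slot bandwidth figures can be sketched as follows. PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding, so the usable line rate is slightly below the raw number; packet/protocol overhead is ignored here, which means real throughput will be a bit lower again. `pcie3_gbps` is a helper name made up for this example.

```shell
# Line rate of a PCIe 3.0 link in Gbps after 128b/130b encoding.
# 8 GT/s per lane; protocol (TLP/DLLP) overhead is NOT accounted for.
pcie3_gbps() {
  awk -v lanes="$1" 'BEGIN { printf "%.1f\n", lanes * 8 * 128 / 130 }'
}

pcie3_gbps 16   # full x16 slot
pcie3_gbps 4    # slot electrically wired as x4
```

On a Linux host, the negotiated link width can usually be read from the `LnkSta` line of `sudo lspci -vv -d 15b3:` (15b3 is the Mellanox PCI vendor ID), which is one way to tell whether a slot has dropped to x4.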
I'll have to see if I can run both of those systems without the discrete GPU, so that I can plug the Mellanox cards into the primary PCIe slot.
*Update 2022-06-14*:
So I took out the discrete GPUs from both systems and put the Mellanox card into the primary PCIe slot and this is what I get from the bandwidth test results:
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x0c QPN 0x008c PSN 0x5ccdd5
remote address: LID 0x05 QPN 0x010a PSN 0x178491
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
2 100000 0.000000 0.066552 4.159479
4 100000 0.00 0.11 3.529205
8 100000 0.00 0.27 4.225857
16 100000 0.00 0.54 4.254547
32 100000 0.00 1.09 4.254549
64 100000 0.00 2.19 4.276291
128 100000 0.00 4.51 4.408332
256 100000 0.00 9.21 4.498839
512 100000 0.00 18.60 4.540925
1024 100000 0.00 36.74 4.485289
2048 100000 0.00 75.76 4.623960
4096 100000 0.00 96.55 2.946372
8192 100000 0.00 96.57 1.473530
16384 100000 0.00 96.58 0.736823
32768 100000 0.00 96.58 0.368421
65536 100000 0.00 96.58 0.184218
131072 100000 0.00 96.58 0.092109
262144 100000 0.00 96.58 0.046055
524288 100000 0.00 96.58 0.023027
1048576 100000 0.00 96.58 0.011514
2097152 100000 0.00 96.58 0.005757
4194304 100000 0.00 96.58 0.002878
8388608 100000 0.00 96.58 0.001439
---------------------------------------------------------------------------------------
Ahhhh.....much better. That's more like it.
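As a sanity check on the table above, the average bandwidth column should simply be message size times message rate. A minimal sketch, with `bw_gbps` being a made-up helper name:

```shell
# Bandwidth in Gb/s from message size (bytes) and message rate (Mpps):
# bytes * 8 bits * Mpps * 1e6 / 1e9  ==  bytes * 8 * Mpps / 1000
bw_gbps() {
  awk -v bytes="$1" -v mpps="$2" 'BEGIN { printf "%.2f\n", bytes * 8 * mpps / 1000 }'
}

bw_gbps 2048 4.623960   # row for 2048-byte messages
bw_gbps 4096 2.946372   # row for 4096-byte messages
```

Both values agree with the BW average column (75.76 and 96.55 Gb/s), and the plateau at 96.58 Gb/s is about 96.6% of the 100 Gbps that the port is rated for.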
05 April 2022
Moral of the story: Do NOT buy from Asus. Intel is willing to offer a refund. Asus is not.
As a follow-up to my previous blog post about the data corruption issue that I was experiencing with the Intel Core i9-12900K processor that was running on the Asus Z690 Prime-P D4 motherboard, Intel has offered a full refund on the defective unit whilst Asus has not.
So, moral of the story:
Don't buy from Asus.
I mean, clearly, if the interaction between the Intel Core i9-12900K and the Asus Z690 Prime-P D4 motherboard is causing the system to spontaneously reset itself when I attempted to run memtest86 a second time, using the memory that was from my AMD Ryzen 9 5950X (which was also using an Asus motherboard), which PASSED memtest86 on said Ryzen platform, and by putting those four DIMMs into the Asus Z690 Prime-P D4 motherboard, it results in the system spontaneously resetting itself; that's NOT a good sign of a reliable motherboard.
Asus was ONLY willing to offer a RMA repair, and I told them that the CPU is in the process of being sent back, so even if they attempted to repair it, I would have no way of verifying whether the issue is still there or not because the CPU would've already been sent back and I'm not buying another Alder Lake CPU from Intel only to give it the chance for this problem to repeat itself.
So, moral of the story:
Don't buy from Asus.
02 April 2022
Data corruption and system stability issues with Intel Core i9-12900K and Asus Z690 Prime-P D4 motherboard
Welp, this happened:
https://youtu.be/VF19o2OMzbU
The data corruption issue with the Intel Core i9-12900K and the Asus Z690 Prime-P D4 saga continues.
01 April 2022
memtest86 self-aborted "due to too many errors" -- My Intel Core i9-12900K on an Asus Z690 Prime-P D4 motherboard regularly corrupting data
So this happened:
I am currently using an Intel Core i9-12900K processor (purchased November 10th, 2021) on an Asus Z690 Prime-P D4 motherboard (purchased November 18th, 2021) and the system was finally assembled around Christmas time, 2021. So the system has only been running for about 3 months and within that 3 months of normal, un-overclocked usage, this happens.
(I don't even use XMP.)
(I am using Crucial 32 GB DDR4-3200 unbuffered, non-ECC memory (Crucial part number: CT2K32G4DFD832A) - four sticks in total, for a total of 128 GB).)
(Memtest) has aborted "due to too many errors".
Wow.
I've NEVER seen that message before.
"Too many errors."
10035 errors to be precise (before the test self-aborted).
Think about how bad the problem must be for the CPU and/or the motherboard to cause memtest86 to self-abort the test "due to too many errors".
I am in the process of trying to see if I can get a refund from Intel via a RMA because the processor has to be so royally screwed up to be able to produce 10035 errors in memtest86 before memtest86 self-aborted and also a refund on the motherboard from Asus (also under a RMA as well).
(Sidenote: I tested the same four sticks of memory in my AMD Ryzen 9 5950X system with an Asus TUF X570 Gaming Pro WiFi motherboard and it passed memtest86 with zero (0) cumulative errors, which is how I know that the problem is NOT with the memory.)
Asus so far, has issued a RMA number for a board repair, but the problem is that if I send the CPU back to Intel and Intel issues the refund, then I won't have a CPU to be able to test the Asus motherboard once it comes back to see whether the issue has been resolved or not.
And I don't want to play the game where I am just making the parcel delivery companies rich by constantly sending stuff back and forth in order to try and get this taken care of.
Stay tuned for this saga.
27 March 2022
Minisforum HX90 Conclusion
In order to conclude the saga that was the Minisforum HX90, I ended up trying out Pop! OS 21.04 from System76. At first, the results looked promising because I was able to install Steam, VirtualBox, and import all of my VMs, and got them up and running. Never got around to testing the games in Steam though.
Unfortunately though, what appeared, initially, to be a success eventually still ended up as a failure.
The system did freeze, eventually, at least once; at which point, it was clear and obvious that there is something either wrong with the system, the hardware, the engineering, compatibility issues, and/or a problem with software running on it.
I don't have the tools to be able to diagnose the root cause of the issue, even when I had Pop! OS installed on the NVMe SSD. As such, I have sent the SODIMM RAM back for a RMA already, and I am currently in the process of trying to do the same with the HX90 itself as well.
This is a bummer/shame because I was really hoping that said HX90 would have been able to take the place of my former Intel NUC, be more performant, and not have the same kind of thermal throttling issues that are wayyy too common in my Intel NUCs.
Sadly, that just didn't turn out to be the case.
So now I have my old Intel Core i7-6700K taking on the duties that were originally designated for the HX90 and I have bought two sets of 2x 16 GB DDR4-3200 Kingston HyperX Fury RAM modules (4 DIMMs total, 64 GB total), in the hopes that I would be able to upgrade the RAM in the 6700K system, and make that take on those duties instead.
We shall see how that goes.
24 March 2022
Still working on the Minisforum HX90
About two weeks ago, my Minisforum HX90 finally arrived and I was able to get the system up and going.
So far, it's been a bit of a mixed bag.
The system is actually VERY performant and I don't have any real complaints in regards to that. However, the way that I had it set up, where the system was hosting 9 VMs, it started freezing daily, which necessitated a hard power cycle before it would freeze again the next day, and the next, etc.
So between last night and this morning, I was trying out alternative operating systems to see if I would be able to get said Minisforum HX90 to be stable.
Proxmox VE 7.1.2 would install and it would pick up on the onboard Intel I225-V 2.5 GbE NIC, but then after the system has rebooted, post-install; said NIC WASN'T available and I couldn't quickly discern why nor the root cause of that issue. Tried installing it again. Same problem.
So Proxmox was a bust.
Next I tried Ubuntu 20.04 LTS. It installed, but then I wasn't able to install Oracle VirtualBox 5.2 in a way where said Oracle VirtualBox 5.2 was working the way that it is supposed to, so that failed.
Then I tried downgrading to Ubuntu 18.04 LTS, figuring "okay, at least I should be able to get Oracle VirtualBox 5.2 installed." Well, that part was true, except that Ubuntu 18.04 was too old and didn't recognise the integrated Radeon GPU that is on the AMD Ryzen 9 5900HX processor that is in the HX90. The maximum resolution that it would display was 800x600. So, then after getting Oracle VirtualBox 5.2 installed, I figured "okay, maybe I can upgrade the system from here and that should give me the proper resolution back".
Nope.
I updated and upgraded to Ubuntu 20.04 from 18.04 and not only did I NOT get the proper screen resolution back, I also lost connectivity to the 2.5 GbE NIC which was, ironically, working in 18.04 before.
So, let's just say - trying to get and make the system stable has been a complete and utter nightmare.
I've got a fresh install of Windows 10 21H2 now (well...I think that the installer was actually 20H2, but then I was able to run Windows update to update it to 21H2), so hopefully, that will be able to help stabilise the system, but we shall see. I'm in the middle of re-installing all of my Windows applications along with re-importing the Oracle VirtualBox VMs back into VirtualBox.
And if that doesn't work, it would be such a pity because the system has a LOT of potential, but if it doesn't work, I'll likely end up RMAing the system back to Minisforum, and then just buying 4x 16 GB of DDR4-whatever RAM (whatever is the most cost efficient, which, perhaps ironically, might be DDR4-3200), install that back into my Intel Core i7-6700K system, and use that system to host all of the VMs once again instead.
It won't be as fast as the AMD Ryzen 9 5900HX, but hopefully, at least it'll work and it won't freeze on me daily.
Hopefully.
23 February 2022
Vastly differing results in WSL2 between 5950X and 12900K in Windows 10 21H2
A little while ago, I came across this video which was talking about how you can run Linux graphical applications natively in Windows (more specifically, in Windows 11).
However, at the time when I watched said original video, I didn't have any hardware that could actually really run that properly. My "newest" system that I had was an Intel Core i7-6700K and as far as I know, it didn't have the Trusted Platform Module (TPM) anywhere (whether as an external add-on dongle or integrated into the motherboard firmware/BIOS).
So, I didn't really make much of it back then.
But since then, I've built both my AMD Ryzen 9 5950X system and also my Intel Core i9-12900K system, and I figured that with some of the work that I needed the systems to be doing now over with, I had a little bit of time with the systems to do some more testing with them.
So I grabbed two extra HGST 1 TB SATA 6 Gbps 7200 rpm HDDs (one per system), threw Windows 10 21H2 on them, and proceeded with the instructions on how to install and configure Windows Subsystem for Linux 2 (WSL2). I installed Ubuntu 20.04 LTS (which really, turned out to be 20.04.4 LTS), and proceeded to try and install the graphical layer/elements to it.
So that's all fine and dandy. (Well, not really because in both instances, neither of the systems was able to start the display and I can't tell if it is because I have older video cards in the system (Nvidia GeForce GTX 980 and a GTX 660 respectively - because as a CentOS 7.7.1908 system, it didn't really matter what I had in there since I was going to remote in over VNC anyways).)
But, since I had it installed, AND by some miracle, Windows 10 picked up on the Mellanox ConnectX-4 dual port VPI 100 Gbps Infiniband cards automatically, I just had to manually give each card in each system an IPv4 address so that it can talk to my cluster headnode (which was still running CentOS along with the OpenSM), and connect up to the network shares that I had set up. (SELinux is a PITA. But I got Samba going on said CentOS system so that on the Linux side, it can connect up to the RAID arrays using NFS-over-RDMA whilst in Windows, it's just through "normal" Samba (i.e. NOT SMB Direct).)
So, I might as well benchmark the systems to see how fast it would be able to write and read a 1024*1024*10240 byte file.
And for fun, I also installed Cygwin on both of the systems as well, so that I can compare the two together.
Being that both systems were able to pick up the Mellanox ConnectX-4 card right away (I didn't have to do anything special, install the Mellanox drivers, etc.), I was able to connect up to my cluster headnode and the Samba shares were visible immediately. As a result of that, I was able to right-click on both of those shared folders and map them to network drives directly and automatically.
Now, in WSL2, I had to mount the mapped network drive using the command:
$ sudo mount -t drvfs V: /mnt/V
(Source: https://superuser.com/questions/1128634/how-to-access-mounted-network-drive-on-windows-linux-subsystem)
And then once that was done, I was able to run the following commands in both Ubuntu on WSL2 and also in Cygwin:
Write test:
$ time -p dd if=/dev/zero of=10Gfile bs=1024k count=10240
Read test:
$ time -p dd if=10Gfile of=/dev/null bs=1024k
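To turn the `real` time that `time -p` reports into a throughput figure, divide the amount of data by the elapsed seconds. The test file is 1024*1024*10240 bytes, i.e. 10240 MiB. In the sketch below, `throughput_mibs` is a made-up helper name and the 20 second elapsed time is a placeholder for illustration, not one of my measured results.

```shell
# Throughput in MiB/s from the amount of data (MiB) and elapsed time (seconds).
throughput_mibs() {
  awk -v mib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", mib / secs }'
}

# PLACEHOLDER: 20 s is a hypothetical "real" time, not a measurement.
throughput_mibs 10240 20
```

Keep in mind that `dd` also prints its own throughput summary, but it uses decimal units (MB/s), so the two numbers will differ slightly.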
Here are the results:
Huh. Interrresting.
But what is interesting though is that the speeds are close, with the 5950X being a little bit faster under Cygwin than the 12900K, also under Cygwin.
I decided to blog about this because there is a potential possibility that for those that might be working with WSL2, the hardware that you pick MAY have an adverse performance impact.
I'm not sure who, if anybody, has done a cross-platform comparison like this before but to be honest, I haven't really bothered to look for it either because you might have reasonably expected that this significant performance difference wouldn't/doesn't exist, but the results clearly show that there's a difference. And a rather significant difference in performance at that.
Please be aware and you should do your own testing for your workload/case/circumstance if you get a chance to be able to do so.
22 February 2022
Why is Intel keeping the overall physical dimensions of their Intel 670p Series 2 TB SSD a secret?
I recently submitted my order for a Minis Forum HX90 (specs). I am looking to use it to replace my very hot Intel NUC that I had previously written about (it's back up to 100 C nominal now), and I might also offload all of the virtualisation duties from my Intel Core i7-6700K system onto this new system instead. I didn't know if said new system would support RAID0 with my two existing Samsung EVO 850 1 TB SATA 6 Gbps SSDs that are no longer currently deployed in a system, so I figured that I was going to get a 2 TB NVMe SSD just to be safe and I landed on this - an Intel 670p Series 2 TB NVMe 3.0 x4 SSD (specs).
Whilst browsing through YouTube, I stumbled my way upon a video where they were talking about NVMe SSD and putting heatsinks on them and how they would thermal throttle the performance if said NVMe SSD got too hot whilst it was being used/under load.
So, that got me thinking - should I start looking and seeing if I should be getting a NVMe SSD heatsink of my own for this drive?
So, I reached out to the customer support at Minis Forum (based out of Hong Kong, which is interesting because their first email back to me was written entirely in Traditional Chinese) and asked them about a SSD heatsink (because some of the review units that they've sent to other tech YouTubers included a NVMe SSD with a heatsink pre-installed in the system), and they told me that the total height that the HX90 can take, INCLUDING the NVMe SSD, is 7 mm.
So, ok. No problems, right? If I can find out what's the overall height of the Intel 670p Series 2 TB NVMe 3.0 x4 SSD, then I can figure out what's the maximum height of a heatsink the HX90 can accept, and then I can start to look into what are my purchasing options.
So, then I reached out to Intel's customer support, because of course, lo and behold, the overall height of the Intel 670p Series 2 TB NVMe 3.0 x4 SSD isn't listed on their spec page.
Huh. No overall physical dimensions listed on Intel's website.
So I reached out to Intel's customer service and asked them this basic question and also told them that it was because the manufacturer of the computer has told me what the maximum height of the combined SSD and heatsink can be so that I can properly size and purchase said heatsink. Their customer service rep said that they understand why I was asking for this information and would need to do further research on this topic/matter and that they would get back to me. Okay. Not a big deal.
Well earlier today, I got an email from said customer service rep stating quote:
Why would Intel keep the overall physical dimensions of their product under a NDA?
So, at this point, it seemed awfully suspicious.
I told them that I am not asking on behalf of the company where I work, and therefore, I have no idea if they have a signed NDA with Intel or not. (And frankly, that shouldn't matter because a customer should be able to ask for the overall physical dimensions of their product (and not the overall dimensions of the box/packaging that their product gets shipped in either).)
I then told them that I will just measure my drive when it arrives and that as such, I will not be signing a NDA in regards to this.
Well, about 3 hours later, my drive arrived.
So, for those that are interested in knowing, the overall physical dimensions of the Intel 670p Series 2 TB NVMe 3.0 x4 SSD are:
Overall length: 80.12 mm
Overall width: 22.05 mm
Overall height: 2.0525 mm (average of 2.09 mm, 2.06 mm, 1.97 mm, and 2.09 mm)
So, in case you're out trying to shop for a NVMe heatsink, and you're trying to use it for a small form factor (SFF) or ultra compact form factor (UCFF) build, now you know the height of the NVMe heatsink you can get.
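Putting the numbers together, here is a rough sketch of the height budget. It averages my four height measurements and subtracts the result from the 7 mm total that Minis Forum quoted for the HX90. The helper names are made up for this example, and it assumes that the 7 mm limit is measured the same way as my SSD height; any thermal pad between the SSD and the heatsink also has to come out of whatever is left.

```shell
# Average of the four height measurements of the SSD, in mm.
ssd_avg_height() {
  awk 'BEGIN { printf "%.4f\n", (2.09 + 2.06 + 1.97 + 2.09) / 4 }'
}

# Height left over for a heatsink (plus thermal pad), in mm.
heatsink_headroom() {
  awk -v max="$1" -v ssd="$2" 'BEGIN { printf "%.2f\n", max - ssd }'
}

ssd_avg_height
heatsink_headroom 7 "$(ssd_avg_height)"   # 7 mm is the limit quoted for the HX90
```

So for the HX90, that leaves a little under 5 mm for the heatsink and thermal pad combined.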
08 February 2022
A friendly reminder to periodically clean your NUC
I have an Intel BOXNUC8i7BEH (specs) and I have been using it to run a VM and also as a host system/unit.
Lately, it's been having issues where even when I tried to run it without the chassis (i.e. running it in an "open case" configuration), the temps were still hitting a peak of 100 C whilst downloading something in the VM and also with 12 Firefox tabs open on the host itself.
So, given that it was still running so hot, even with it running out of the case/in the "open case" configuration, I figured that I would shut the unit down, wait for it to cool off a bit, and proceed with further disassembling the unit.
Once I took the fan off, there was a LOT of dust that had been trapped where the inlet to the copper heatsink was, so I was able to clean that off with damp tissue paper.
And I also figured that since I had some Thermal Grizzly Kryonaut sitting around, I might as well also remove the plate that the heatpipes are connected to, clean off the old thermal paste that's on the CPU, and give it some new thermal paste whilst I'm at it.
Lo and behold, the current system, still doing exactly what it was doing before (picking up from where it left off when I powered down the system), is now sitting at a cooler 85 C or so.
Yay!
Moral of the story: remember to periodically clean your NUC!
28 January 2022
How to implement `pixz` for the HiveOS PXE boot server and mining rig clients
@hiveos
Have you or anybody ever tried moving to using pixz instead of using pxz for the parallel compression and parallel decompression of the boot archive (i.e. moving from hiveramfs.tar.xz to hiveramfs.tpxz)?
I tried dissecting through the scripts and I can't seem to find the part where the system knows to use tar to extract the hiveramfs.tar.xz file into tmpfs.
I've tried looking in /path/to/pxeserver/tftp and also in /path/to/pxeserver/hiveramfs and I wasn't able to find where it codifies the instruction and/or the command to unpack the hiveramfs.tar.xz.
If you can provide some guidance as to where I would find that in the startup script, where it would instruct the client to decompress and unpack the hiveramfs.tar.xz, that would be greatly appreciated.
Thank you.
*edit*
I've implemented pixz now for both parallel compression and the creation of the boot archive hiveramfs.tpxz and the decompression of the same.
It replaces the boot archive hiveramfs.tar.xz.
The PXE server host, if you are running an Ubuntu PXE boot server, will need to have pixz installed (which you can get by running sudo apt install -y pixz, so it's pretty easy to get and install).
The primary motivation for this is that on your mining rig, depending on the CPU that you have in it, you will usually have excess CPU capacity at boot time; therefore, if you can use parallel decompression for the hiveramfs archive, then you can get your mining rig up and running that much quicker.
The side benefit that this has also produced is that in the management of the hiveramfs image on the PXE server, pixz worked out to be faster in the creation of the FS archive compared to pxz.
Tested on my PXE server which has a Celeron J3455 (4-core, 1.5 GHz base clock), it compressed the FS archive using pxz in 11 minutes 2 seconds whilst pixz was able to complete the same task (on a fresh install of the HiveOS PXE server) in 8 minutes 57 seconds. (Sidebar: For reference, previously, when using only xz (without the parallelisation), on my system, it would take somewhere between 40-41 minutes to create the FS archive.)
On my mining rig, which has a Core i5-6500T, it takes about 8.70 seconds to decompress hiveramfs.tpxz to hiveramfs.tar and then it takes about another 1.01 seconds to unpack the tarball file.
Unfortunately, I don't have the benchmarking data for how long it took my mining rig to decompress and unpack the hiveramfs.tar.xz file.
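To put the PXE server timings quoted above into relative terms, here is a minimal sketch of the arithmetic in shell. The helper name to_seconds is made up for this illustration; the two timings are the ones reported above for the Celeron J3455.

```shell
#!/usr/bin/env bash
# Sketch only: to_seconds is a hypothetical helper for this illustration.
# Convert "minutes seconds" into total seconds.
to_seconds() {
    echo $(( $1 * 60 + $2 ))
}

pxz_s=$(to_seconds 11 2)    # pxz:  11 minutes 2 seconds
pixz_s=$(to_seconds 8 57)   # pixz: 8 minutes 57 seconds

# Relative time saved and speed-up factor of pixz over pxz.
awk -v a="$pxz_s" -v b="$pixz_s" 'BEGIN {
    printf "pxz: %d s, pixz: %d s\n", a, b
    printf "time saved: %.1f%%, speed-up: %.2fx\n", (a - b) / a * 100, a / b
}'
```

The same two lines of awk can be reused with your own timings if you want to see whether the switch is worth it on your PXE server.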
Here are the steps to deploying pixz, and using that to replace pxz.
On the PXE server, install pixz:
sudo apt install -y pixz
Run pxe-config.sh to specify your farm hash, server IPv4 address, etc. and also change the name of the FS archive from hiveramfs.tar.xz to hiveramfs.tpxz.
DO NOT RUN HiveOS update/upgrade yet still!!!
When it asks if you want to upgrade HiveOS, type n for no.
For safety/security, make a backup copy of the initial hiveramfs.tar.xz file that can be found in /path/to/pxeserver/hiveramfs.
(For me, I just ran sudo cp hiveramfs.tar.xz hiveramfs.tar.xz.backup.)
You will need to manually create the initial hiveramfs.tpxz file that the system will act upon next when you run the hive-upgrade.sh script.
To do that, run the following:
/path/to/pxeserver$ sudo mkdir -p tmp/root
/path/to/pxeserver$ cd tmp/root
/path/to/pxeserver/tmp/root$ cp ../../hiveramfs/hiveramfs.tar.xz .
/path/to/pxeserver/tmp/root$ tar --lzma -xf hiveramfs.tar.xz
/path/to/pxeserver/tmp/root$ tar -I pixz -cf ../hiveramfs.tpxz .
/path/to/pxeserver/tmp/root$ cd ..
/path/to/pxeserver/tmp$ cp hiveramfs.tpxz ../hiveramfs
/path/to/pxeserver/tmp$ cd ../hiveramfs
/path/to/pxeserver/hiveramfs$ cp hiveramfs.tpxz hiveramfs.tpxz.backup
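The manual conversion above can also be sketched as one small function. This is an illustration under assumptions, not part of the HiveOS scripts: repack_archive and its arguments are hypothetical names, and it relies on GNU tar detecting the compression of the source archive on its own when reading from a file. It unpacks into a scratch directory created by mktemp, so the source archive itself never sits inside the directory that gets archived (in the manual steps, the copied hiveramfs.tar.xz appears to stay in tmp/root and would be packed into the new archive along with everything else).

```shell
#!/usr/bin/env bash
# Sketch only: repack_archive is a hypothetical helper, not part of HiveOS.
# Usage: repack_archive SRC DST [COMPRESSOR]   (COMPRESSOR defaults to pixz)
repack_archive() {
    local src=$1 dst=$2 compressor=${3:-pixz}
    local workdir rc
    workdir=$(mktemp -d) || return 1
    # Unpack into a scratch directory that does not contain the source
    # archive, then re-create the archive through the chosen compressor.
    tar -xf "$src" -C "$workdir" &&
        tar -C "$workdir" -I "$compressor" -cpf - . > "$dst"
    rc=$?
    rm -rf "$workdir"
    return "$rc"
}
```

Usage would be along the lines of repack_archive hiveramfs/hiveramfs.tar.xz hiveramfs/hiveramfs.tpxz, run as root (as in the steps above) so that the ownership and permissions of the unpacked files survive the round trip. Keep the backup copy of the original archive either way.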
Now, edit the pxe-config.sh:
at about line 51, it should say something like:
#adde pxz (typo included)
copy lines 51-53 and paste them after line 53
(basically, add an i so that where it says pxz now says pixz instead)
edit the lines to read:
#adde pixz
dpkg -s pixz > /dev/null 2>&1
[[ $? -ne 0 ]] && need_install="$need_install pixz"
save, quit
Run pxe-config.sh again.
DO NOT RUN HiveOS update/upgrade yet still!!!
Now, your farm hash, IP address, etc. should all have been set previously. Again, when it asks you if you want to upgrade HiveOS, type n for no.
Now, we are going to make a bunch of updates to hive-upgrade.sh.
(For me, I still use vi, but you can use whatever text editor you want.)
/path/to/pxeserver$ sudo vi hive-upgrade.sh
at line 71, add pixz to the end of the line so that the new line 71 would read:
apt install -y pv pixz
I haven't been able to figure out how to decompress the hiveramfs.tpxz archive and unpack it in the same line.
(I also was unable to get pv working properly so that it would show the progress indicator, so if someone else who is smarter than I am can help figure that out, that would be greatly appreciated, but you can also remote into your PXE server again in another terminal window and run top to monitor your PXE server to make sure that it is working in the absence of said progress indicator.)
So the section starting at line 79 echo -e "> Extract Hive FS to tmp dir" now reads:
line80: #pv $FS | tar --lzma -xf -
line81: cp $FS .
line82: pixz -d $ARCH_NAME
line83: tar -xf hiveramfs.tar .
line84: rm hiveramfs.tar
Line84 is needed because otherwise, without it, when you go to create the archive, it will try to compress the old hiveramfs.tar in as well, and you don't need that.
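For anyone who wants to try doing the decompress and unpack in a single step, here is a minimal sketch. It assumes that pixz -d reads from stdin and writes to stdout when no file names are given (as its documentation describes), and extract_stream is a hypothetical helper name, not something from hive-upgrade.sh. Because the archive is streamed rather than copied into the target directory, there is no leftover hiveramfs.tar to remove afterwards, and pv gets a pipe to measure if it is installed.

```shell
#!/usr/bin/env bash
# Sketch only: extract_stream is a hypothetical helper, not part of HiveOS.
# Usage: extract_stream ARCHIVE DEST_DIR [DECOMPRESSOR]   (defaults to pixz)
extract_stream() {
    local archive=$1 dest=$2 decompressor=${3:-pixz}
    mkdir -p "$dest" || return 1
    if command -v pv > /dev/null 2>&1; then
        # pv reports progress on stderr while passing the data through.
        pv "$archive" | "$decompressor" -d | tar -xf - -C "$dest"
    else
        "$decompressor" -d < "$archive" | tar -xf - -C "$dest"
    fi
}
```

The pixz documentation also lists tar -I pixz -xf hiveramfs.tpxz as a way to extract, which does the same thing in one command without the progress indicator. Test either variant on a copy of the archive before changing hive-upgrade.sh.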
Now fast forward to the section where it creates the archive (around line 121) where it says:
line121: echo -e "> Create FS archive"
line122: #tar -C root -I pxz -cpf - . | pv -s $arch_size | cat > $ARCH_NAME
line123: tar -C root -I pixz -cpf - . | pv -s $arch_size | cat > $ARCH_NAME
(in other words, copy that line, paste it, comment out the old line, and add an i to the new line.)
line125 is still the old line where it used the single threaded xz compression algorithm/tool, which should be already commented out for you.
The rest of the hive-upgrade.sh should be fine. You shouldn't have to touch/update the rest of it.
Now you can run hive-upgrade.sh:
/path/to/pxeserver$ sudo ./hive-upgrade.sh
and you can run it to check and make sure that it is copying the hiveramfs.tpxz from /path/to/pxeserver/hiveramfs to /path/to/pxeserver/tmp/root, decompressing the archive, and unpacking the files properly.
If it does that properly, then the updating portion of it should be running fine, without any issues (or none that I observed).
Then the next section that you want to check is the part where it repacks and compresses the archive back up, to make sure that it is working properly for you.
Again, it is useful/helpful to have a second terminal window open where you've ssh'd into the PXE server again, with top running so that you can make sure that the pixz process is working/running.
After that is done, you can reboot your mining rig to make sure that it is picking up the new hiveramfs.tpxz file ok and that it is also successful in decompressing and unpacking the archive.
I have NO idea how it is doing that because normally, I would have to issue that as two separate commands, but again, it appears to be working with my mining rig.
*shrug*
It's working.
I don't know/understand why/how.
But I'm not going to mess with it too much to try and figure out why/how it works, because it IS working.
(Again, if there are other people who are smarter than I am that might be able to explain how it is able to decompress and unpack a .tpxz file, I would be interested in learning, but on the other hand, like I said, my mining rig is up with the new setup, so I'm going to leave it here.)
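One possible explanation, offered as an assumption rather than something confirmed from the HiveOS client scripts: pixz describes its output as ordinary xz-format data (split into blocks, with an index added), so an xz-aware extractor on the client may simply treat hiveramfs.tpxz like any other .tar.xz. A quick way to check that part is to look at the first six bytes of the file, which for any xz stream are FD 37 7A 58 5A 00. The helper name is_xz_stream below is made up for this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: is_xz_stream is a hypothetical helper for this illustration.
# Succeeds if FILE starts with the xz magic bytes FD 37 7A 58 5A 00.
is_xz_stream() {
    local magic
    magic=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "fd377a585a00" ]
}
```

For example, is_xz_stream hiveramfs.tpxz && echo "looks like an xz stream" would print the message if the archive carries the xz header. That would only show that the container format is the same; it does not show which command the client actually runs.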
Feel free to ask questions if you would want to implement pixz so that you would have faster compression and decompression times.
If your PXE server is fast enough for you such that pxz is fast enough for you and this isn't going to make enough of a difference for you, then that's fine. That's up to you.
For me, my PXE server, running on a Celeron J3455 is quite slow, so anything that I can do to speed things up a little bit is still a speed up.
Thanks.
06 January 2022
Getting the latest and greatest hardware running in Linux is sometimes a bit of a nightmare
Just prior to the holidays, I decided to upgrade three of my systems and consolidate them down to two. My old Supermicro Big Twin^2 Pro micro cluster server and two HP Z420 workstations (that I was using in lieu of the Supermicro because the Supermicro was "too loud") were getting replaced by an AMD system, built on the Ryzen 9 5950X CPU and an Intel system, built on the latest and greatest that Intel had to offer - the Core i9-12900K.
So, I spec'd out all of the rest of the hardware, which really, consisted of the motherboard, RAM, and the CPU heatsink and fan assembly whilst I was able to reuse some of my older, existing components as well. (I did have to buy an extra power supply though because I had originally miscalculated how many power supplies I would need.)
So that's all fine and dandy. All of the hardware arrived just before the start of the Christmas break for me, so I started to set up the AMD system. Install the CPU, the RAM, the CPU HSF, plug everything in, check and double check all of the connections - everything is good to go. I used Rufus to write the CentOS 7.7.1908 installer onto a USB drive, plug in the keyboard, mouse, and flip the switch on the power supply and off I go, right?
[buzzer]
Nope!
[Picture: Near instant kernel panic. Nice.]
As you can see from the picture above, less than 3 seconds into the boot sequence from the USB drive - Linux has a kernel panic.
Great.
So now I get the "fun" [/sarcasm] job of trying to sort this kernel panic out. Try it a few more times, the same thing happens.
So, ok. Now I'm thinking that the hardware is too new for this older Linux distro and version (and kernel). So, I take out my Intel Core i7-3930K system (one of them that I use to run my tape backup system), and I plug the hard drive into that system, along with the video card, and run through the boot and installation process (which worked without any issues of course), power down the 3930K, take the hard drive back out, and plug it into the 5950X system. Power it on. (I set the BIOS to power on after AC loss so that I can turn on the system even when it isn't inside a case and I don't have a power button connected to it.)
The official CentOS forums state that they only support CentOS 7.9.2009, so I try that as well, still to no avail.
Eventually, I end up using a spare Intel 545 series 512 GB SATA 6 Gbps SSD that I had laying around so that I could try installing and re-installing, trying different drivers, kernel modules, kernels, etc. a LOT faster than I was able to with a 7,200 rpm HDD.
End net result: I filed a bug report with kernel.org because the mainline kernel 5.15.11 kept producing kernel panics with the Mellanox 100 Gbps Infiniband network card installed. And it didn't matter whether I tried to use the "inbox" CentOS Infiniband drivers or the "official" Mellanox OFED Infiniband drivers.
[Picture: Yet another Linux kernel panic.]
Interestingly enough, the mainline kernel 5.14.15 works with the Infiniband NIC just fine. So that's what I landed on/with.
The other major problem that I ran into was that the Asus X570 TUF Gaming Pro (WiFi) uses the Intel I225-V 2.5 GbE NIC. When I originally purchased the motherboard, I didn't realise that Intel does NOT have a Linux driver (even on Intel's website) for said Intel I225-V 2.5 GbE NIC. And what was weird was that when I was migrating the SSD over during the testing and trying to find/figure out a configuration that worked, said Intel onboard 2.5 GbE NIC would work initially, but then it would eventually and periodically drop out. That was quite the puzzle, because if there wasn't a driver for it, then how was it that it was able to work when I moved the drive over?
As a result of that, I spent a couple of days trying to clone the disk image from the Intel SSD onto the HGST HDD using dd and in the end, that didn't work either.
So, what did I end up with?
These are the hardware specs that I ended up with on the AMD system:
CPU: AMD Ryzen 9 5950X (16-core, 3.4 GHz stock base clock, 4.9 GHz max boost clock, SMT enabled)
Motherboard: Asus X570 TUF Gaming Pro (WiFi)
RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)
CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan
Video card: EVGA GeForce GTX 980
Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD
NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)
NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)
Power Supply: Corsair CX750M
OS: CentOS 7.7.1908 kernel 5.14.15-1-el7.elrepo.x86_64
I ended up adding the Intel Gigabit CT Desktop NIC because a) it was an extra Intel GbE NIC AIC that I had also laying around, and b) it proved to be able to provide a vastly more reliable connection than the onboard Intel I225-V 2.5 GbE due to the driver issue.
Now that I have the system set up and running, there is a higher probability that the igc kernel module works more reliably now than it did when I was originally setting up the system, but being that it was not reliable when I was doing the initial setup and testing, I am less likely to use said onboard NIC, which is a pity. Brand spankin' new motherboard and I can't even use nor trust the reliability of the onboard NIC. And I can't even blame Asus for it because it is an Intel NIC. (Sidebar: Ironically, the Asus Z690 Prime-P D4 motherboard that I also purchased uses a Realtek RTL8125 2.5 GbE NIC, which I WAS able to find a driver for, and it has been working flawlessly.)
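As a side note on the "how was it working without a driver" puzzle: on Linux, sysfs shows which kernel driver (if any) is bound to a network interface, which is one way to tell whether an in-kernel module such as igc picked up the onboard NIC after a kernel change. The sketch below is an illustration only; nic_driver is a hypothetical helper name and the interface name in the usage note is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: nic_driver is a hypothetical helper for this illustration.
# Print the kernel driver bound to a network interface, or "none".
nic_driver() {
    local link="/sys/class/net/$1/device/driver"
    if [ -e "$link" ]; then
        # The sysfs entry is a symlink whose target is named after the driver.
        basename "$(readlink -f "$link")"
    else
        echo "none"
    fi
}
```

Running nic_driver enp5s0 (substitute your own interface name from ip link) would print the driver name, for example igc or e1000e, or none if the interface does not exist or has no driver bound.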
That took probably on the order of around 10 days, from beginning to end, to get the AMD system up and running.
The Intel system was a little bit easier to set up.
The kernel panic issue with the mainline 5.15.11 kernel and Infiniband was also present on the Intel platform as well.
Interestingly and ironically enough, the newer kernel kept crashing or had severe stability issues. It turns out that I did NOT install the RAM correctly (i.e. in the DIMM_A2 and DIMM_B2 slots), so since then, I've corrected that.
Keen readers might note that I have stated that I have 4 sticks of RAM, except that one of the sticks arrived DOA, and is currently being sent back to Crucial under RMA, so when it comes back, then I will be able to install the extra stick that is currently not installed and the stick that is due back from the RMA exchange.
I might try the newer kernels again later, but for now, at least the system is up and running so that I can start making it do the work that I need it to be doing.
The system stability issues came from an error that I made when installing (and uninstalling) the RAM: because I was testing the stick of RAM that wouldn't POST (the one that ended up getting RMA'd back to Crucial), I ended up with a RAM installation configuration that wasn't correct, and the resulting system stability issues ate up a few more days.
So, in the end, it took me almost the entire Christmas holiday to get both of these systems up and running.
(This is also a really good reason why traditionally, I have stuck with workstation and server hardware because on my old Supermicro micro cluster, I can deploy all four nodes in 2 hours or less. It's a pity that the system is too loud.)
This is the hardware that I ended up with on the Intel system:
CPU: Intel Core i9-12900K (16 cores (8P + 8E), 3.2 GHz/2.4 GHz base clock speed, 5.2 GHz/3.9 GHz max boost clock, HTT enabled)
Motherboard: Asus Z690 Prime-P D4
RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)
CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan
Video card: EVGA GeForce GTX 660
Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD
NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)
NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)
Power Supply: Corsair CX750M
OS: CentOS 7.7.1908 kernel 3.10.0-1127.el7.x86_64
AMD Ryzen 9 5950X is faster than the Intel Core i9-12900K for mining Raptoreum
The results speak for themselves.
The AMD Ryzen 9 5950X is faster at mining Raptoreum than Intel's latest and greatest 12th gen Core i9-12900K.
System/hardware specs:
AMD:
CPU: AMD Ryzen 9 5950X (16-core, 3.4 GHz stock base clock, 4.9 GHz max boost clock, SMT enabled)
Motherboard: Asus X570 TUF Gaming Pro (WiFi)
RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)
CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan
Video card: EVGA GeForce GTX 980
Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD
NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)
NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)
Power Supply: Corsair CX750M
OS: CentOS 7.7.1908 kernel 5.14.15-1-el7.elrepo.x86_64
Intel:
CPU: Intel Core i9-12900K (16 cores (8P + 8E), 3.2 GHz/2.4 GHz base clock speed, 5.2 GHz/3.9 GHz max boost clock, HTT enabled)
Motherboard: Asus Z690 Prime-P D4
RAM: 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM CL22 (128 GB total)
CPU HSF: Noctua NH-D15 with one stock 140 mm fan, and one NF-A14 industrialPPC 3000 PWM fan
Video card: EVGA GeForce GTX 660
Hard drive: 1x HGST 1 TB SATA 6 Gbps 7200 rpm HDD
NIC: Mellanox ConnectX-4 dual port 100 Gbps 4x EDR Infiniband (MCX456A-ECAT)
NIC: Intel Gigabit CT Desktop 1 GbE NIC (Intel 82574L chipset)
Power Supply: Corsair CX750M
OS: CentOS 7.7.1908 kernel 3.10.0-1127.el7.x86_64
Configuration notes:
I had about two weeks over the Christmas break 2021 to receive all of the hardware, assemble the systems, and get the systems set up and up and running. And that was quite the endeavour because with the latest and greatest hardware, the older versions of CentOS (7.7.1908) and the older kernel didn't work with all of the features and functions of this level of hardware.
As a result, I had to "jumpstart" both systems by first installing the OS using my Intel Core i7-3930K system (Asus X79 Sabertooth motherboard, 4x Crucial 8 GB DDR3-1600 unbuffered, non-ECC RAM, Mellanox MCX456A-ECAT, GTX 660), and then updating the systems (at least in part) before I could transplant the hard drive with the OS install into their respective systems and finish setting the systems up. (I will write more about that "journey"/clusterf in another blog post here shortly because it was quite the journey to jumpstart both of these systems simultaneously, which took pretty much the full two weeks that I had.)
You will find out how and why I ended up with the respective hardware choices in that blog post.
I am using cpuminer-gr-1.2.4.1-x86_64_linux from here (https://github.com/WyvernTKC/cpuminer-gr-avx2/releases/tag/1.2.4.1).
For the Intel system, because of the existence of the combination of (P)erformance Cores and (E)fficiency Cores and HyperThreading, this resulted in more combinations that I had to test in order to find the setting that had the highest Raptoreum hash rate as reported from their benchmarking tool. Each time the CPU configuration changed, I ran a full tune again, which you might well imagine, took quite some time to do. In the cases where the efficient cores were disabled, I also tested and re-ran the full tune for both AVX2 and AVX512.
The AVX512 runs (both times I ran it, i.e. with and without HyperThreading), resulted in thermal throttling with about a 23-24 C ambient at the time.
For the AMD system, testing it was a lot simpler because it was either only SMT on or off.
Results:
The results speak for themselves.
The AMD Ryzen 9 5950X with SMT enabled produces the highest hash rate (3953.64 hashes/second). Compared with the run where SMT was disabled, enabling SMT results in about a 10.3% increase in the hash rate.
The Intel Core i9-12900K results are an interesting case. Despite the plethora of benchmarks talking about how great and how fast the latest and greatest from Intel is (and there ARE some things that said latest and greatest from Intel are great at), unfortunately, for Raptoreum mining, this is not one of them.
At best, the 5950X with SMT enabled compared to the 12900K with all 16 cores AND HyperThreading enabled puts the 5950X at about 80.9% faster in Raptoreum hash rate performance vs. the 12900K.
Comparing like-for-like thread counts (i.e. the 12900K at 8P+8E without HyperThreading vs. the 5950X at 16 cores/16 threads), the 5950X is still approximately 64.2% faster than the 12900K. Without the efficiency cores, but turning HyperThreading back on (i.e. 8P+0E with HyperThreading), the 12900K again is bested by the 5950X by a 71.8% margin.
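For readers who want to redo the percentage comparisons with their own numbers, here is a minimal sketch of the arithmetic behind an "X% faster" figure, plus the reverse calculation. The function names are made up for this illustration. Any hash rate obtained by back-calculating from the percentages quoted above is only an implied estimate, not a measured result.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical names for this illustration.
# pct_faster A B: how much faster (in percent) rate A is than rate B.
pct_faster() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a / b - 1) * 100 }'
}

# baseline_from A PCT: the rate B for which A is PCT percent faster than B.
baseline_from() {
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.2f\n", a / (1 + p / 100) }'
}
```

For example, baseline_from 3953.64 10.3 gives the SMT-disabled hash rate implied by the 10.3% figure, and pct_faster takes two measured hash rates (in hashes/second) and returns the margin between them.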
Unfortunately, running this in Linux meant that I didn't have or didn't know of a tool like Hardware Info 64 to be able to report power consumption figures/values. Maybe I might get around to re-running this test again in Windows, but for now, this might be helpful to those who might be interested in looking for guidance if mining Raptoreum is on your mind.