23 February 2022

Vastly differing results in WSL2 between 5950X and 12900K in Windows 10 21H2

A little while ago, I came across this video which was talking about how you can run Linux graphical applications natively in Windows (more specifically, in Windows 11).

However, at the time I watched that video, I didn't have any hardware that could really run it properly. The "newest" system that I had was an Intel Core i7-6700K and, as far as I know, it didn't have a Trusted Platform Module (TPM) anywhere, whether as an external add-on module or integrated into the motherboard firmware/BIOS.

So, I didn't really make much of it back then.

But since then, I've built both my AMD Ryzen 9 5950X system and my Intel Core i9-12900K system, and with some of the work that I needed the systems to do now over with, I had a little bit of time to do some more testing with them.

So I grabbed two extra HGST 1 TB SATA 6 Gbps 7200 rpm HDDs (one per system), threw Windows 10 21H2 on them, and proceeded with the instructions on how to install and configure Windows Subsystem for Linux 2 (WSL2). I installed Ubuntu 20.04 LTS (which turned out to be 20.04.4 LTS), and proceeded to try and install the graphical layer/elements on it.

So that's all fine and dandy. (Well, not really: on both systems, the display failed to start, and I can't tell whether that is because I have older video cards in them (an Nvidia GeForce GTX 980 and a GTX 660 respectively). As CentOS 7.7.1908 systems, it didn't really matter what video cards I had in there, since I was going to remote in over VNC anyway.)

But since I had it installed, AND by some miracle Windows 10 picked up the Mellanox ConnectX-4 dual-port VPI 100 Gbps InfiniBand cards automatically, I just had to manually give each card in each system an IPv4 address so that it could talk to my cluster headnode (which was still running CentOS along with OpenSM) and connect to the network shares that I had set up. (SELinux is a PITA. But I got Samba going on said CentOS system, so the Linux side connects to the RAID arrays using NFS-over-RDMA, whilst Windows goes through "normal" Samba (i.e. NOT SMB Direct).)

So, I figured I might as well benchmark the systems to see how fast each would be able to write and read a 1024*1024*10240 byte (10 GiB) file.
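As a quick sanity check on that file size, here is the arithmetic as a small shell sketch. The variable names are just for illustration; the numbers match the bs=1024k and count=10240 values passed to dd.

```shell
# Size of the test file: 1 MiB blocks (bs=1024k) times 10240 blocks (count=10240).
block_bytes=$((1024 * 1024))
count=10240
total_bytes=$((block_bytes * count))

echo "$total_bytes bytes"
echo "$((total_bytes / 1024 / 1024 / 1024)) GiB"
```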

And for fun, I also installed Cygwin on both systems, so that I could compare the two environments.

Since both systems were able to pick up the Mellanox ConnectX-4 card right away (I didn't have to do anything special, install the Mellanox drivers, etc.), I was able to connect to my cluster headnode and the Samba shares were visible immediately. As a result, I was able to right-click on each of those shared folders and map it to a network drive directly.

Now, in WSL2, I had to mount the mapped network drive using the command:

$ sudo mount -t drvfs V: /mnt/V

(Source: https://superuser.com/questions/1128634/how-to-access-mounted-network-drive-on-windows-linux-subsystem)
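One detail worth spelling out: mount expects the mount point directory to already exist. The sketch below uses a hypothetical helper, drvfs_mount_cmds (my name, not part of WSL), that only prints the two commands for a given drive letter, so you can review them before running anything with sudo.

```shell
# Hypothetical helper: print the commands needed to mount a mapped Windows
# drive letter inside WSL2. It only prints them; nothing is executed.
drvfs_mount_cmds() {
    letter="$1"              # drive letter as mapped in Windows, e.g. V
    mnt="/mnt/$letter"       # mount point, following the /mnt/V pattern above
    printf 'sudo mkdir -p %s\n' "$mnt"
    printf 'sudo mount -t drvfs %s: %s\n' "$letter" "$mnt"
}

drvfs_mount_cmds V
```

If the printed commands look right, they can be pasted into the Ubuntu shell as-is.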

Once that was done, I was able to run the following commands in both Ubuntu on WSL2 and in Cygwin:

Write test:
$ time -p dd if=/dev/zero of=10Gfile bs=1024k count=10240

Read test:
$ time -p dd if=10Gfile of=/dev/null bs=1024k
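To turn the "real" seconds reported by time -p into a throughput figure, a small helper like the one below can be used. This is only a sketch: throughput_mib_s is a hypothetical name, and the 20 seconds in the example call is an arbitrary placeholder, not one of my measured results.

```shell
# Hypothetical helper: convert a byte count and an elapsed time in seconds
# into MiB/s (1 MiB = 1048576 bytes). awk does the floating-point division.
throughput_mib_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1048576 }'
}

# Placeholder example: the 10 GiB file with an elapsed time of 20 seconds.
throughput_mib_s 10737418240 20
```

Substitute the "real" value from your own run for the second argument.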

Here are the results:

Huh. Interrresting.

I have absolutely NO clue why WSL2 on the 5950X is so much slower compared to WSL2 on the 12900K.

What is interesting, though, is that under Cygwin the speeds are close, with the 5950X being a little bit faster than the 12900K.

I decided to blog about this because, for those who might be working with WSL2, there is a possibility that the hardware you pick MAY have an adverse performance impact.

I'm not sure who, if anybody, has done a cross-platform comparison like this before, and to be honest, I haven't really bothered to look for one either. You might reasonably have expected that a performance difference like this wouldn't exist, but the results clearly show that it does. And a rather significant difference at that.

Please be aware of this, and do your own testing for your workload/case/circumstance if you get the chance.