20-year-old chipset solutions have been hurting modern AMD Linux systems
AMD engineer K Prateek Nayak recently discovered that a 20-year-old chipset workaround in the Linux kernel is still used in modern AMD systems, and in some cases hurts the performance of modern Zen hardware.
He has proposed a patch that limits this workaround to older systems, which should improve performance on modern ones.
According to the patch's introduction, ever since ACPI support was added to the Linux kernel in 2002, the idle code has carried a “dummy wait op” to cope with chipsets that could not guarantee the STPCLK# signal was asserted in time.
This dummy I/O read delays further instruction processing until the CPU stops completely.
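To make the sequence concrete, here is a minimal Python model of an I/O-port C-state entry with and without the dummy read. It is a sketch, not kernel code: the function name `enter_io_cstate`, the `IdleEntry` type, and all timings are hypothetical, and the only behavior it assumes is what the text describes (a port read that requests the stop, followed by a useless read that stalls the CPU).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IdleEntry:
    steps: List[str]          # ordered operations performed on idle entry
    accounted_idle_us: float  # time booked as C-state residency


def enter_io_cstate(sleep_us: float, dummy_read_us: float,
                    apply_dummy_wait: bool) -> IdleEntry:
    """Model one I/O-port C-state entry (all timings are hypothetical)."""
    steps = ["read C-state I/O port"]  # asks the chipset to stop the CPU
    accounted = 0.0
    if apply_dummy_wait:
        # Useless read whose only purpose is to stall the CPU until the
        # chipset has actually stopped it.
        steps.append("dummy I/O read")
        accounted += dummy_read_us
    steps.append("CPU stopped until wakeup")
    accounted += sleep_us
    return IdleEntry(steps, accounted)
```

In this model, any time spent in the dummy read is added to the residency the idle code reports, which is the accounting problem the AMD analysis points to.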
“At least in some AMD Athlon era systems with VIA chipsets this was an issue… but not for newer chipsets from the last two decades or so.”
K Prateek Nayak stated:
Sampling some workloads with IBS on AMD Zen3 systems shows that a significant amount of time is spent in dummy ops, which are erroneously seen as C-State residency. A large C-State residency value can cause the cpuidle governor to recommend a deeper C-State during subsequent idle instances, starting a vicious cycle that results in performance degradation for workloads that rapidly switch between busy and idle phases.
One of the workloads is tbench, where a large performance drop can be observed during some runs.
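The feedback loop described above can be illustrated with a toy simulation. This is a deliberately simplified sketch and not the kernel's menu or teo governor: the two idle states, their target residencies and exit latencies, the moving-average predictor, and every number are assumptions chosen only to show the mechanism.

```python
from typing import List, Tuple

# Hypothetical idle states: (name, target_residency_us, exit_latency_us).
STATES = [("shallow", 0.0, 1.0), ("deep", 25.0, 50.0)]


def pick_state(predicted_idle_us: float) -> Tuple[str, float, float]:
    """Pick the deepest state whose target residency fits the prediction."""
    chosen = STATES[0]
    for state in STATES:
        if state[1] <= predicted_idle_us:
            chosen = state
    return chosen


def simulate(true_idle_us: float, dummy_wait_us: float,
             rounds: int = 8) -> Tuple[List[str], float]:
    """Return the states chosen and the total exit latency paid."""
    predicted = true_idle_us  # start from an accurate prediction
    history: List[str] = []
    wakeup_cost_us = 0.0
    for _ in range(rounds):
        name, _target, exit_latency = pick_state(predicted)
        history.append(name)
        wakeup_cost_us += exit_latency
        # Time spent in the dummy op is booked as residency,
        # so the sample fed back to the predictor is inflated.
        measured = true_idle_us + dummy_wait_us
        # Moving average as a stand-in for a real governor's predictor.
        predicted = 0.5 * predicted + 0.5 * measured
    return history, wakeup_cost_us
```

With these made-up numbers, `simulate(10.0, 0.0)` stays in the shallow state for all eight rounds, while `simulate(10.0, 40.0)` switches to the deep state after the first round and pays its much larger exit latency on every wakeup, even though the real idle periods are just as short.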
So, at least for tbench, this long-standing, unconditional workaround in the Linux kernel has been hurting performance on AMD Ryzen, Threadripper, and EPYC systems.
It does not affect modern Intel systems though, as newer Intel platforms use the MWAIT-based intel_idle driver code path instead.
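To see which idle driver a given machine is using, the cpuidle sysfs interface can be read directly. The sketch below assumes a Linux system that exposes `/sys/devices/system/cpu/cpuidle/current_driver`; the helper name `current_cpuidle_driver` is made up for illustration, and it returns `None` when the file is unavailable.

```python
from pathlib import Path
from typing import Optional

# Standard cpuidle sysfs entry; it may be absent on some systems or kernels.
CPUIDLE_DRIVER = Path("/sys/devices/system/cpu/cpuidle/current_driver")


def current_cpuidle_driver(path: Path = CPUIDLE_DRIVER) -> Optional[str]:
    """Return the active cpuidle driver name, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(current_cpuidle_driver() or "cpuidle driver not exposed")
```

A result of `intel_idle` indicates the MWAIT-based path mentioned above, while `acpi_idle` indicates the ACPI processor idle driver.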
Intel Linux engineer Dave Hansen then simplified K Prateek Nayak’s patch further.
With the patch applied, the kernel no longer performs the “dummy wait” workaround on AMD systems, removing the step that degrades performance on modern hardware.
The patch has now been merged into Linux 6.0 as part of the x86/urgent fixes.