As someone who has written embedded firmware for many years (not for PCs), I can only dream of an end user being capable enough to discover a bug like this. I want to live in the world where Asus immediately send an e-mail offering some kind of short-term contracting work to fly in and talk to their firmware people for a few days and get $FIVE_FIGURES or something, and leave with an updated laptop running their new production BIOS.
Obviously this bug has gone un-fixed for four years so that is not the world we're in. That makes me sad. :|
Edit: s/fix/fix proposal/.
- this sounds ubiquitous and reproducible. How did this not get fed back through tech support/RMA channels? Was there so little evidence that it wasn't correlatable, or did ASUS look and arrive at an incorrect conclusion, e.g. a batch of bad silicon? Could it be that they had plentiful evidence and were negligent or incompetent?
- it sounds like this is plainly evident when using the machine. What is the QA process? This should not have been possible to miss?
- now that they know, what will they do?
Imo, the ceo calculus here is clear. If you're a luxury good with elastic demand, you fix the issue and fix the perception (two separate things). Multi-year, multifaceted issues like this have the potential to ruin a brand. I've bought ROG in the past, and I'm inclined to never do so again.
EDIT: on further reflection, the firmware bug itself is pretty troubling. the other bugs i get - hardware assumptions were changed, or good code was reused that didn't know or support the gpu mux, i see how those errors come about. the method sleeping in an interrupt... is awful? how did that get reviewed? what is that firmware test suite?
Every one of the affected ASUS laptops probably got a glowing 5/5 review from the usual suspects, and consumers have little hope of getting a fair deal
There are vendors which do better generally, or have less aggressive 5-star robots. I got an MSI board now which came at a fraction of the cost of an ASUS board. It has worse specifications, but in all honesty, it works. It does what it says on the box without any grief. Maybe it was a lucky shipment, but I am not going back to ASUS or ASRock. I'd rather have 2 FPS less but a device that stays operational and can do its basic features...
A classic example of not giving a toss about performance is the _horrible_ integration done for the Windows Hello protocol on many platforms. The protocol is really good, yet there are bypasses possible on a lot of devices due to bad/incorrect implementations, completely breaking a for-once-actually-good thing that MS designed.
Buying consumer hardware, especially for gaming, is like a lottery these days, and shops / vendors often (not always..) give a lot of grief, declining refunds or blaming bad user practices for clear device defects.
Ultimately I think this problem will fix itself. ASUS will eventually burn through enough customers that they will have to exit certain segments I guess?
Have you used consumer goods [or virtually anything] from the last couple decades? By and large, nobody cares. Look at the timeline here; clearly nobody cares.
It's also just a terrible disaster of a programming environment, with a very large (terrifyingly so, given the limited capability) interpreter that needs to live at the highest privilege level of the kernel.
And it's generally used like a hatchet by system integrators for tricks like this, with pretty much exactly the code quality you'd expect. Almost always the path to writing a Linux driver for some oddball laptop subsystem starts with "throw away the ACPI stuff".
Windows laptops are dead on arrival for me, all windows laptops are physical shovelware
Mixed blessing, but still more blessing than curse.
I managed to reverse engineer a lot of my laptop's features but hit a wall when it came to this ACPI stuff. I dumped the tables and decompiled the code but all I got was stub code. I wanted to be the guy who wrote the Linux drivers for his own laptop but I just didn't manage it. Massive respect for anyone who can do this.
A quote from one of the linked reddit threads. I wonder if the warranty trip is part of their scheme.
"I did everything you suggested , but nothing changed. I send it back via garante. I am curious what they do whit it."
"what was it at the end? did they respond?"
"They have claimed that the plato works perfectly. So basically i just got use to it. I am using bluetooth earbuds all the time so i cant notice the problems."
One was the first gen Alienware M17 with two GTX 270M GPUs (yes two) and an onboard nvidia GPU whose specific model I can't remember. That one suffered from stutters and audio crackling, etc. It was sort of fixed by disabling SLI and the onboard GPU and sticking to a specific driver that was modded, the driver was by someone on the notebookcheck forums IIRC. Later on I think it got somewhat patched with a bios update that let you use SLI without the stutters, but I think the laptop reached EOL without it being fully fixed.
The second was an ROG ASUS laptop with a GTX 460m (I can't recall the laptop model). Pretty much the same story as the OP but I didn't have the knowhow to go deep into the ACPI code. The only change from the story is that latencymon kept attributing the latency spikes to multiple dlls, sometimes it was some wifi driver, other times it was an nvidia one. I don't remember the full fix for that one, but it involved me changing the wifi card and disabling the dGPU (not the onboard one) when I was not gaming so I could watch videos and such without it crackling. Funnily enough it didn't crackle much when actually playing games (it still happened, just very rarely).
I stopped buying gaming laptops after that. Seeing this story makes me think things haven't changed one bit.
What use is a "good price", when what you get is a quality and support minefield?
First one was a Clevo (rebranded as Medion) with a GTX 970m that I bought in 2017. An absolute beast, I lugged it in a backpack around the world for 4 years, including to places you really shouldn't bring a laptop like beaches and rainforests. I passed it on to my girlfriend's nephew and it is still going strong and being used every day. I repasted once in that time.
My current laptop is an MSI GE66 with an RTX 3070m bought in 2022. It's loud, I've repasted recently because it started overheating. It had some problems with the screen connector which they fixed under warranty fairly quickly. But aside from that it's solid.
One thing about both of these laptops - they are very easy to open and it looks like I could repair/replace pretty much every removable component easily. No glue.
The only thing I consider a real problem is the MSI fan noise. Well, that and the power brick which is the size of a literal brick.
Never again. A laptop with a dGPU runs counter to the things a laptop should be. Keeping gaming activities on a desktop is the best option in my experience.
A few months ago, I started working at an e-waste recycling company, and discovered that used Microsoft Surface tablets are what I've been looking for. My work "laptop" is a Surface Pro 5 with Debian (my work desktop is an Optiplex micro). I'm typing this on a Surface Go (with BlissOS) that I bought for myself. The cameras don't work on either and the work Surface never knows its battery status, but I don't care (it lasts an entire afternoon with a barcode scanner, good enough for me).
If you don't like a dGPU in a laptop, that's fine. But people have different needs. I travel a lot and do 3d content creation work.
I've had 2 now from different manufacturers and the firmware seems alright due to the integrated nature of the API making them all fairly homogenous.
Also, comparing a steamdeck to a modern gaming laptop is like comparing a $1 water pistol to a super soaker.
All the complexity of a PC, in a package the size of a book, with the engineering quality of a Happy Meal toy.
And booting something that isn't a funny variant of a locked down OS is relatively hard.
To go off your McDonald's analogy, you can get a lot of kCals without necessarily getting a lot of nutrients.
Edit: GP comment wasn't yours, but I think my point still stands.
That’s not a complaint I have heard before. My needs aren’t huge and it has a lot more of everything than I need.
> And booting something that isn't a funny variant of a locked down OS is relatively hard.
I wouldn’t want anything else in it, but with a Mac mini I really wish it would run something Linux more easily. They are a great headless server, but the OS is really limiting.
Apple Silicon Macs are a 180. Fantastically fast and efficient hardware stuck with an increasingly locked down OS, zero upgrade path and still a premium price.
If you’re holding on to the memory of Intel Macs I can certainly agree, they were not great.
Skill issue.
In any other industry everyone would be returning their acquisitions day one.
About 35 years ago, I had a teacher asserting computers are like buying shoes that randomly explode when tying them.
Thankfully consumer laws are finally happening.
How come people have become so obedient ?! That's crazy.
Most people have always been that way, will always be that way. IME the vast majority of people who don't stand for this shit are autistic.
ASUS at the time had an exclusive deal with AMD to ship their Ryzen 4xxxHS line.
Initially it worked fine, but two years later performance was already much worse and dominated by thermal throttling. Repasting, though necessary due to the state of the paste, only helped partially.
I still don't know the root cause of the issue, but I investigated declining battery performance and it turned out that the iGPU was going full throttle at all times. Setting the dGPU as the preferred device actually improved battery life somewhat.
When mechanical failures started accumulating I switched to a FW16 and never looked back. I don't care what gaming laptop manufacturers have on offer and for how little if I can't buy having them give a shit about their products and customers.
It will thermal throttle itself to uselessness within seconds of a load being placed on it. The dGPU idles at about 15 W, the entire power budget of a single board computer, and it's one of those problematic nvidia GPUs that will never be properly supported on Linux. The Windows app that controlled things like fans and keyboard LEDs was so obnoxiously bad it required over one minute to show a window on the screen. Reverse engineering that thing was one of the best things I've ever done. Mercifully the firmware wasn't broken by default, but I still didn't manage to reverse engineer the ACPI nonsense: I dumped the tables and decompiled the code but there was nothing useful.
Looks like Apple has a monopoly on good taste and giving half a shit about the quality of the products they sell. I wish the Apple silicon macbooks existed at the time.
With the ASUS I had a setup with a cooling pad where the metal grid cover was removed and the sides were sealed with foam to enhance flow from the pad's fan and only with that I could maybe get 30-45min of gameplay until throttling started.
Meanwhile the Framework has overall much higher power consumption, but still manages to whoosh all that hot air out. I can't take these companies seriously if a much smaller business that is not focused squarely on gaming is running circles around them.
My mother rocks an M1 Air which she got for pennies and it's a great all around home computer.
The laptop works fine in Optimus mode even with external displays, you just lose a bit of performance and you're missing out on some display features like G-Sync. So it is highly likely that most users always use the laptop in Optimus mode. If you primarily use the laptop as a laptop you probably wouldn't even know the mux feature existed.
The problem is Asus shipping extra features in their hardware that are not properly QA tested. It looks like they only thoroughly tested the golden path.
The problem isn't limited to some units, there was plenty of discussion online of this issue at the time of release. [1]
Asus never recalled, fixed, or even responded to the issue. Indeed, even the marketing page [2] still talks about how you can use HDMI 2.0 to connect 4K TVs at 60Hz.
It was also an interesting showcase of laptop reviewer incompetence. All the reviews just regurgitated Asus marketing material on how it has HDMI 2.0, but apparently nobody actually tested it.
--
[1] https://rog-forum.asus.com/t5/rog-zephyrus-series/gx501-zeph...
[2] https://rog.asus.com/laptops/rog-zephyrus/rog-zephyrus-gx501...
For this specific case, getting no fix since the issues have been reported in 2021 is tough to brush over.
Asus already has a spotty reputation regarding customer repairs and business practices, so this issue piling on top of that is unfortunate.
Apple is just as guilty for shipping laptops with hardware issues that you just have to work around. And unlike this Asus issue the Macbook mux was on by default. You had to turn it off in the settings if you wanted to entirely avoid the issue and then you would have no way of using the discrete GPU.
They had a special Lenovo driver that would occasionally become overridden by Windows updates but could be reinstalled manually. I dual-booted Debian though, and getting the system to work properly under that was a nightmare. There were a couple of years when I simply gave up: I got it to work with the iGPU and I wasn't running anything more graphics intensive than a browser, so I simply left the discrete GPU idle while running Linux.
Incredibly frustrating.
I think all these manufacturers are desperate to get their published specs for battery life estimates to double-digit hours that can't be reached while running the discrete GPU at full speed all day. Heck, they can't be reached while running the CPU at full speed all day, you're not going to run a 35W processor and a 55W graphics card and a 20W display (10W when you arbitrarily reduce the max brightness when on battery power) all day.
You've got like 90 watt-hours available in the battery, at 100% usage on everything the real capacity is gone in under an hour...which is unacceptable. So Asus and Apple and Lenovo and everyone else have to come up with some hack to turn it off whenever it's on so that the spec sheet says you can get 8, 10, 12 hours of runtime.
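To put rough numbers on that, here is a quick sketch of the arithmetic using only the round figures quoted above (90 Wh battery, 35 W CPU, 55 W dGPU, 20 W display). The function name and the 10-hour target are illustrative choices, and it ignores conversion losses, so treat the result as an upper bound.

```python
# Back-of-the-envelope battery runtime from the round numbers in the comment above.
BATTERY_WH = 90.0  # "like 90 watt-hours"


def runtime_hours(battery_wh: float, loads_w: dict[str, float]) -> float:
    """Hours of runtime at a constant total draw (ignores conversion losses)."""
    total_w = sum(loads_w.values())
    return battery_wh / total_w


# Everything running flat out: 35 W CPU + 55 W dGPU + 20 W display = 110 W.
full_load = {"cpu": 35.0, "dgpu": 55.0, "display": 20.0}
full_load_hours = runtime_hours(BATTERY_WH, full_load)

# Average whole-system draw that a 10-hour spec-sheet claim would allow.
budget_w_for_10h = BATTERY_WH / 10.0

print(f"full load: {full_load_hours:.2f} h")      # full load: 0.82 h
print(f"10 h budget: {budget_w_for_10h:.1f} W")   # 10 h budget: 9.0 W
```

At full tilt the pack lasts well under an hour, and a 10-hour claim only works if the whole machine averages about 9 W, which is less than the 20 W quoted for the display alone. That gap is why the dGPU has to be powered down most of the time.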
>Even installing Linux, only to find the problem persists. [...] >The problem is far deeper, embedded in the machine's firmware, the BIOS.
Anyway it's not as if the Linux laptop user experience in general were much better.
It is possible on Linux to override some of the firmware, most notably the DSDT (see e.g. https://wiki.archlinux.org/title/DSDT), because so much hardware is broken. So, if you can make or get a fixed version, you should be good. A wholesale replacement of all the ACPI assets, though, seems unlikely. I could well be wrong, though.
Anyway, in this case, I suspect the poster was advocating for Macs.
A percentage of users were "unlucky" to hit a bug where Installer.app would end up in an infinite loop trying to unpack a pkg file when updating OSX. My personal record is a minor update that ultimately took a week.
However, this is not the only problem with Asus bioses. My daughter has one and it randomly locks up if you add an extra SSD, sooner or later depending on the SSD. You'll blame the SSD's firmware, but the most locking one was one that I have in two desktops with no problems...
They're like four-seater off-road motorcycles. You have to NOT understand how sketchy that concept is to consider one. The engineers have to know that they're guilty to be involved in it.
What's sad is that a lot of buyers are falling for it from the presumption that laptops are the most standard and regular type of computers. But I guess there's little we could do about it.
The laptop was absolutely useless at playing games, because it would throttle itself thermally after about 30 seconds. Which was ironic given that I used to work at a games development company and the ability to play games was actually a core feature of the product. I then used to have a Razer Blade 15 which wasn't as bad but would also eventually start throttling hard - just inadequate cooling imho.
Funnily enough I have a much cheaper MSI gaming laptop now with an i7 and a 3070Ti and that never throttles, I can run games without it slowing down. But clearly the cooling system in it is massively overbuilt, which is great.
Maybe they learned their lesson. I had an MSI gaming laptop a while back, and it ran horribly, I never realised until long after it was possible for me to return it, that it was just poorly designed, and could never run beyond ~50% of its gaming performance. Within minutes of starting a game it would be thermally throttled and that was that; it also sounded like it was about to take off, to the point you could barely drown it out with headphones.
My main drive in choosing the MSI I did back then was the thinness and lightness, which was counter-productive to good cooling performance, mine had a GTX970M but was about 1cm thick; the bottom of the case got so hot it would burn you if you touched it after a while of gaming.
It's a gaming laptop. If you're playing any game released in the past 5 years, odds are you're getting constant stutters anyway due to Unreal Engine 5. And Windows 11 is a slow, bloated mess, too, so that covers stutters outside of games.
For most end users, and especially gamers, stuttering and overall bad performance is just the new normal that they've come to accept and even embrace. The recent success of Borderlands 4, a game that struggles to run smoothly on the best and most expensive hardware available today, is just the latest and best proof of this. If you complain about it, you'll be called poor for not owning a $3000 GPU and/or a luddite for not wanting to play at 720p 30fps AI-upscaled to 4K 300fps.
You buy product after stellar review, encounter problem, search for solution, find reddit thread where everyone is "yeah, it is always like that, why do you act surprised?"
Why indeed?
(I'm not sure if it lags in igpu pass-through mode)
notebookcheck does report latencymon numbers and even remarks when it says the laptop is not recommended for real time audio:
https://www.notebookcheck.net/The-RTX-5090-Laptop-and-mini-L...
They don't see these extreme values though.
Yet widely known (to enthusiasts) problems, like stutters from the OP, are often not mentioned at all.
LinusTechTips doesn't depend on ASUS money in any meaningful way, but still failed to mention stutters in their Zephyrus G16 review. Some might say LTT is not a reviewer, he is an entertainer, but he undeniably strives to be accurate while doing so.
Asus has what looks like some great offerings and they are just losing customers over the simple stuff. Dell has the same kind of problems. I have had an XPS with dual graphics since 2017 and only in 2023 did I finally get the magical combination of drivers and "old" firmware such that it's perfectly stable. Of course its thunderbolt dock has a firmware issue where it detects external TB drives wrong about 50% of the time when it wakes from sleep. I just know it's some boneheaded firmware code similar to this issue.
Message to the CEOs of these companies: Stop outsourcing your firmware dev to the lowest bidder! One can argue it's kind of the easiest part to get right! And then have a support channel that recognizes technically gifted people and fast tracks their issues to the top of your backlog! It will pay for itself in sales, I promise!
I am not surprised by this story.
they are treating software as important as they need to in order to turn a profit. it is of course disappointing that you can sell such trash to willing buyers, but the market is what it is.
I don't think I ever put any of my laptops into dGPU-only mode via MUX, it's stupidly power hungry with little upside.
Unless you specifically bought that model because you want low-latency output to an external display with G-SYNC.
I've used Asus motherboards in my gaming PCs for years, and their BIOS/UEFI firmwares there are equally awful, their Ryzen AGESA stuff has been a complete mess.
If the iGPU, then the dGPU is basically sending its frame buffer through the iGPU and is limited by it.
Here's a link to MSI forums also discovering the ACPI hiccups with Latencymon: https://forum-en.msi.com/index.php?threads/constant-micro-st...
Just google "nvidia gaming laptop stutter latencymon acpi"
But it never ceases to amuse me watching brands that position themselves as 'premium' spending pennies on firmware development team somewhere down in a basement compared to millions they spend on shiny marketing.
clever hardware suffers the burden of having lots of edge cases spread across a small customer base. going for dumb hardware that has lots of buyers is less exciting but statistically more likely to "just work".
for me: from now on, it's either lenovo business laptops (which probably sell to 10s of millions of enterprise users), or macs (which sell the same.)
volume > being clever
p.s. in fact, even the lenovo TB docks are a bit shaky. ugh, more cleverness! i think sticking to their proprietary docks would have been the better play. with their volume, it would have made a better product.
It's incredibly obvious. I'm not doubting the actual information here, it's clearly thorough and well researched. The issue is that I cannot _stand_ the hyper-homogenized cadence and style that all LLMs use. It's "Corporate Memphis" all over again. I don't understand why everyone is so violently afraid of something looking like a human being made it?
It's unbelievable that something this bad has been shipping for four years. I guess I know what I'm not buying, at least...
The ASUS machine would be in the trash long before this Apple laptop.
And let's not even start digging on Apple Radar's backlog, which always makes for some entertainment on Apple related podcasts.
Have you missed the OS X releases that were all about bug fixing, and still there are plenty of left overs?
Let's see how many Tahoe is bringing.
a) one time charging stopped working... thankfully I had a pretty full charge when I noticed and was able to migrate my data to a spare machine and not have to deal with it... removable storage would have been super handy.
b) for a whole year, there was about a 25% chance of loud static instead of music when I started playing a stream in iTunes; pause and play again would fix it most of the time. It started when I installed a named OS revision, and it stopped when I installed the next version. Did not have issues with sounds from other apps. Of course, there was no information to be found anywhere about this, because 'macs just work'
Less big, but if Outlook was running when I put the laptop to sleep, there was a good chance it would continue to eat battery and generate heat in my bag. Outlook is a travesty, but when the corp runs Exchange, Outlook is less effort to make work compared to fighting to make auth work with anything else and then still having to use Outlook from time to time.
Would you push gamers/VR people to macs ??
My Lenovo X1E regularly burns 20% of its CPU cycles on some high frequency recurring interrupt. I did get pretty far with debugging it, but eventually gave up since I can't justify spending so much time on fixing a 'professional' laptop that I paid top dollar for.
It also has a multi-GPU setup that has never worked reliably under Linux, which is ironic as I opted for Lenovo due to its supposedly good Linux compatibility.
Switching between GPU modes is hit or miss, waking up from stand-by often results in a blank screen, screen flickering, sporadic high fan speeds, etc. And then there's the coil whine, which seems to be fixed in some BIOS versions, then returns in the next. Supposedly it has something to do with power-saving measures.
Since I owned it there have been at least 20 BIOS version releases for 'improved performance and security', but none seem to actually fix anything. It's such a mess.
/rant
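For anyone chasing a recurring interrupt like this on Linux, one low-effort first step is to diff two snapshots of /proc/interrupts and see which line is growing fastest. This is a minimal sketch, not what the poster above did. The function names are made up for illustration, and it assumes the usual layout: a header row of CPU columns, then one row per IRQ.

```python
import time


def parse_interrupts(text: str) -> dict[str, int]:
    """Sum the per-CPU counters for each row of /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals: dict[str, int] = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Leading numeric fields are the per-CPU counts (rows like ERR have fewer).
        counts: list[int] = []
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        key = f"{label.strip()} {description}".strip()
        totals[key] = sum(counts)
    return totals


def top_deltas(before: dict[str, int], after: dict[str, int], n: int = 5):
    """Return the n rows whose counters grew the most between two snapshots."""
    deltas = {key: after[key] - before.get(key, 0) for key in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:n]


def sample_top_irqs(interval_s: float = 5.0, path: str = "/proc/interrupts"):
    """Take two snapshots interval_s apart and report the busiest rows (Linux only)."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return top_deltas(before, after)
```

Calling `sample_top_irqs()` on an otherwise idle machine should surface whichever row is firing constantly. If the top entry turns out to be the `acpi` row, that at least points at firmware rather than a regular device driver, though it will not say which ACPI method is responsible.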
pm-suspend --quirk-vbe-post
I have a 1st-gen Lenovo X1 Yoga, and it is the only laptop series that has everything I want: an OLED display, a stylus, and a TrackPoint input. ACPI worked fine when I first bought it, and I can't remember if it's had one BIOS update or two, but after the most recent update, it sometimes won't wake from sleep. When that happens, the power LED fades on and off, no matter how many times I press the power button or close and open the lid. I have to hold the power button to force it off, then power it back on from a fresh boot. Also, the Linux kernel added support for adjusting the OLED's brightness through ACPI years ago, but it's never been supported on my laptop, and I have to resort to using xrandr to output dimmer images, which reduces bit depth.
My fan sometimes gets stuck on full speed, too.
Which I'm ready to believe, knowing the state of most laptops... but this entire thing is pretty clearly generated by Gemini with its over-the-top dramatic style, italics emphasis, and -isms like "It's not just X, it's Y", which was unable to handle the article of this size and started looping over. Not sure I should believe any of it, or at least be sure that it didn't mess up the specifics. Why would one do this in a technical writeup?
It feels a bit of a shame to wrap it all up in an AI-written summary, but I guess if that was the only way to get the info out, so be it.
I'm sure dell does the same terrible handling of DGPU power and badly written ISRs that pointlessly raise system latency. I had shoddy crashes for months that would cause my dell laptop to BSOD and burn up in my backpack because the DGPU got stuck in a loop during some ridiculous windows modern standby wakeup.
I use mine on a train that has Wi-Fi with a captive portal and attempting to join it makes the whole UI unresponsive. Using the overlay with a guide in a game always resets the scroll location. These are the kind of things I can live with, but also things I don't expect ever to be fixed unless Valve come out with some wholesale replacement for some overt new strategic goal.
I also ran into weird Wi-Fi issues that required a reboot, and getting that thing to recognize external displays without corrupting video is some kind of dark art while my Lenovo and Steam Deck work just fine with the same USB C plug.
Apple beats some brands for sure (especially the cheap "consumer" lines with a starting price lower than Apple's headphones) but their computers are hardly flawless.
I have yet to run into issues with my Steam Deck, which is very impressive, but I'm sure I'll run into an issue at some point. No computer works flawlessly.
Steam Deck is just... extensively tested and debugged in ways that I don't think Apple did unless they got an egg on the face in national media (remember "you're holding it wrong?")
Replacing the battery costs like $200-500 because the screen likes to explode when removing it.
Lenovo docks of a specific gen will have the USB hub/billboard just crash and stop doing display port.
older Dell docks would pollute arp tables and crash switches.
Computers have always had some wild flaws, some worse than others. They are built to a price point typically and by humans under politics so the best design or parts usually don't make it -- cost and profit.
> I used an LLM for wording. The research, traces, and AML decomp are mine. Every claim is verified and reproducible if you follow the steps in the article; logs and commands are in the repo. If you think something's wrong, cite the exact timestamp/method/line. "AI wrote it" is not an argument.
Maybe the author is ESL or just not very good at writing...
If it's clearer and the information is still all correct - then isn't that great? More people can engage in clear communication with each other
That's a pretty big if in technical writeups like this, all you do by rewriting those is obfuscate the actual inputs you had. Was it generated from scattered notes? Entirely vibe-written? How many details are actually verified to be correct by a human? Seeing how even the structure seems generated, it's clear that there was little input, and I'm not sure about any of the above.
I can deal with poor writing, and in case of ESL it's enough to tell the LLM to proofread/rephrase the piece (and check it yourself afterwards). But lazy generations just make you trust the article less.
Why does that matter? Maybe the person hates writing. You need people to suffer and put effort for the end result to be worth your time?
> How many details are actually verified to be correct by a human?
I mean.. assume the best..? The author could also have written it by hand and just lied. Or he's a paid troll from an Asus competitor - it's all just made up and it's a work of fiction. You implicitly have to assume the author tried their best and be okay there might be some errors.
If the writing is more clear thanks to an LLM then you're likely to catch errors more easily.
If you feel the thing has errors, then engage with the material and point out the errors.
You're not judging the end result on its merits
It's like someone took a technical report from a bug tracker and ran a linguistic obfuscator on it.
I’m also curious if the debugging of the timing and ACPI events could have been done under Linux…
I wrote some embedded firmware for simpler ARM CPUs, but I have very little understanding of this technical explanation.
I feel like these kinds of issues are guarded from ordinary developers by being too complex to understand, although it might just be me. Maybe if the technical stack was more approachable, we would have more quality reports and workarounds? If there are only 100 people in the world who could understand the problem, it's a bad situation.
It's great if you have working software which shows the possible culprit (as in this case), but often you have only a vague guess what could be the reason, and you need to test everything one by one, write your own code, or sometimes use a hardware probe.
I've had an ROG Zeph collecting dust for a couple years now, specifically for the reasons you described, which I now have a good reason to dig out and poke around in. Got my weekend sorted :)
Bad ACPI is the #1 reason I don't buy gaming laptops. Mac for mobile, parts-built PCs for desktop/gaming.
It drastically reduced my perception of Asus as a brand - I wanted something I could game with, it promised the moon of portability and performance but they couldn't pull it off.
As I write this, the fans are either completely stopped or so slow that I cannot hear them. The fans can get a little "whooshy" under load, but nothing out of line with other Windows laptops, as far as I can tell.
I'm running under Windows profile (not one of the ASUS-specific ones in Armory Crate). I also limit the battery to 60%, since I'm mostly using it plugged-in, with external monitor and keyboard. A month ago, I upgraded RAM to 40 GB and added another 4 TB SSD (and cleaned the fans in the process, so it's even quieter). I think I'll keep this machine for a few more years...
That being said, I did experience occasional cursor stuttering, but mostly when the machine is under load (typically during Visual Studio build).
I got a good 2.5 years out of it, but I was hoping to use it for at least 6 or 7.
Now that I think of it, I did disable sleep and just use hibernate instead. I no longer remember why - perhaps I too had some issues. Maybe I was just annoyed by Windows restarting in the middle of a night and (c)losing my open applications whenever it wishes to update? Or did I disable Wake Timers for that? Sorry, my memory is a bit fuzzy.
As it is, hibernate works perfectly for me, is fast enough, and it never closes my applications behind my back.
OTOH, I had sleep issues with pretty much all PCs that I ever had (be it laptop or desktop), so maybe it was just inertia to always use hibernation instead. :)
And also surprised that Windows actually allows the ACPI driver to sleep in an interrupt handler - on Linux that'll immediately BUG()... Unless Windows doesn't and the above ACPI blocking is what they're measuring here.
People blame Windows for being slow and so on, but most of the time hardware manufacturers don't even go to this level to get the best out of their hardware. This is the reason why Apple is so successful: they control both the hardware and the software, while in the open world, software like Linux/Windows is written by one party and the hardware is designed by another.
On the (these days) very rare occasion I game, I have a desktop.
I have a Mac now, and have yet to experience similar hardware bugs.
My only phantom issue is that Safari sometimes doesn't save a Google search page to history and ignores it when hitting the back arrow, though I suspect that might be Google being fancy with their loading.
I do not have the same technical depth to dig this far as the author, but this kind of problem seems pretty common on laptops, especially those with "switchable" iGPU/dGPU setups.
I had an Acer laptop about 7-8 years ago with almost the exact same latency symptoms. In the end I just disabled the dGPU in the BIOS (since I only used it for office work), and that instantly solved the issue.
This kind of thing is very infuriating because not only is it hard to track down the root cause (which I am very grateful the author did), but it is also even harder to get the vendor to actually acknowledge or fix it.
I've had weird issues with this setup since the core2duo days upgrading once a year.
when it works it's amazing, however.
AMD iGPU and dGPU work super well together, but I feel like over time, since I use this configuration the most, things either improve or go to hell with driver updates. Depends on the laptop OEM really.
This all said, where the hell are the Strix Point iGPUs that rival desktop cards (yes, the lower mid end), in laptops where stuff just works without compromise... if there is power and cooling.
Side note - I have a ROG G14 that, until I loaded a beta BIOS for Thunderbolt over USB4, would reboot when shut down and shut down randomly. (AMD CPU and GPU)
Those do very much exist! My go-to is System76. There are others, e.g. arguably Framework.
> and are rewarded for that...
Oh well, one can dream anyway.
The freedom Linux gives you also gives you the freedom to slap Linux on some random bit of Windows kit and then blame Linux for failing to work around the broken firmware. Apparently this is preferable to buying hardware that works with Linux.
I won't defend Asus, they have a crowd of skeletons in the closet and need to pay more attention to feedback, I'm just saying there will be a set of use cases where issues won't be plaguing the experience.
LatencyMon does show elevated ISR and DPC max exec times (i.e. reports that for realtime tasks my system is busted), and they're also primarily carried out by CPU0. But it's on the order of 2 ms, not 30 ms.
Still a long time though for interrupts I guess. ACPI.sys taking the lead too, with NDIS.sys being close second, and then pretty much nothing else.
It does other stupid things with power management, too:
- There seems to be some "cooldown" logic that keeps it awake with the fan running for a while (sometimes minutes) after closing the lid. If I just unplug the laptop and stick it straight in a backpack, it'll keep doing this (getting hotter and hotter, and burning half of the battery capacity) until it hits the critical high temp shutdown. It's great fun taking it out at the start of a plane flight and finding out it's on low battery and has bbq'd itself.
- Even if I do wait for the fan to turn off before stashing the laptop, when I open the lid and wake it up, it immediately goes into hibernate mode, and I have to wait for it to finish hibernating, turn it back on, and wait for it to boot up, which is really frustrating.
The solution to both of these (for me) is to reassign the power button to be 'hibernate' instead of 'sleep', and to explicitly hibernate it every time I'm packing it up. It's still stupid and annoying, and a damn shame because it's otherwise a really nice laptop. The OLED screen is beautiful and the build quality feels great. I just wish it wasn't crippled.
This isn't just throttling, it's unusable. And it instantly goes away for a while when you disconnect and reconnect the USB-C power supply, even when gaming etc.
https://eshop.macsales.com/blog/61253-power-your-macbook-pro...
(Why did I get another ASUS? Well, after the throttling issue was fixed, the 17" was a beast, it survived dozens of mine site commissioning trips, tons of abrasive iron ore dust, and having a 2" ring spanner dropped on the keyboard (which left a nasty dent but the keyboard still works!). It's still going as my kid's gaming laptop, battery life is now only a few minutes but while plugged in it's fast enough for most modern games. And my partner had just bought a 13" Zephyrus and it was really nice and we hadn't noticed any issues with it.)
Sometimes you get hilarious errors, like Intel not having any way to verify whether their driver is actually loading a dumped memory image (Intel Rapid Start), so if you forgot to disable Rapid Start and installed anything on a drive in the bay that was designated for Rapid Start, on boot the Intel driver would just... blit it into RAM and carry on, happy and dumb.
Open source devicetree + u-boot can be maintained independently of any manufacturer's support.
Does anyone know if Windows can do the same?
Unfortunately you need to disable signing for that, which will trigger anticheat in online games and makes Windows nag you about it.
Someone wrote a bootloader specifically to patch ACPI on boot in CSM mode if you're okay with reinstalling Windows on an older system that can't play certain games: https://github.com/MovAX0xDEAD/ACPI-Patcher
There is nothing physically forcing it to run the code that’s stored in the motherboard flash, though; it could, say, use a patched version instead. The equivalent function is well-supported on Linux, because Linux uses a different interpreter (the reference one from Intel, in fact) and in general manages the hardware differently enough to regularly expose bugs in the ACPI code of manufacturers whose QC pass condition is “boots Windows” (all of them) and who can’t be bothered to fix bugs not affecting Windows (almost all of them).
The laptop was a nice deal on fire sale, but I guess I get what I pay for.
In the beginning I had a sticky keyboard issue, with a key repeatedly triggering for no reason. This got fixed after one of the BIOS updates.
Later, for a while I had the stuttering issue when the GPU was in standby (when loaded in hybrid mode) [1]. I then switched from PopOS to Manjaro in the hope that a newer kernel might fix it, and it did.
Today I'm struggling with a new issue - the GPU sometimes gets disconnected, and requires a system reboot to turn it on again. I generally use the GPU for compute, not for graphics - so you don't see it until you do. New Nvidia drivers might have fixed this [2].
[1] https://forums.developer.nvidia.com/t/random-freezes-rtx-407...
[2] https://forums.developer.nvidia.com/t/bug-sporadic-hang-on-s...
AML is an interpreted language. Interrupts need to be handled quickly, because while a CPU is handling an interrupt it can't handle any further interrupts. There's an obvious and immediate conflict there, and the way this is handled on every other OS is that upon receipt of an ACPI interrupt, the work is dispatched to something that can be scheduled rather than handled directly in the interrupt handler. ACPI events are not intended to be performance critical. They're also not supposed to be serialised as such - you should be able to have several ACPI events in flight at once, and the language even includes mutex support to prevent them stepping on each other[1]. Importantly, "Sleep()" is intended to be a "Wake me when at least this much time has passed" event, not a "Spin the CPU until this much time has passed" event. Calling Sleep() should let the CPU go off and handle other ACPI events or, well, anything else. So there's a legitimate discussion to be had about whether this is a sensible implementation or not, but in itself the Sleep() stuff is absolutely not the cause of all the latency.
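To make the "dispatch, don't handle inline" point concrete, here is a minimal Python sketch. It is only an analogy, not how any real ACPI interpreter is written: the event names, durations and function names are made up, and asyncio tasks stand in for whatever schedulable context the OS uses. The "interrupt handler" only schedules work, and the Sleep() stand-in yields instead of spinning, so two events can be in flight at once.

```python
import asyncio

async def handle_event(name, sleep_s, log):
    """Hypothetical stand-in for an ACPI event method that calls Sleep()."""
    log.append(f"{name}:start")
    # Yielding sleep: "wake me when at least this much time has passed".
    # Other scheduled handlers are free to run in the meantime.
    await asyncio.sleep(sleep_s)
    log.append(f"{name}:end")

async def run_events():
    log = []
    # The "interrupt handler" part: only create schedulable work, return fast.
    tasks = [
        asyncio.create_task(handle_event("GPE_A", 0.05, log)),
        asyncio.create_task(handle_event("GPE_B", 0.01, log)),
    ]
    await asyncio.gather(*tasks)
    return log

if __name__ == "__main__":
    for entry in asyncio.run(run_events()):
        print(entry)
```

In the log, both handlers start before either finishes, and the shorter one completes first. If the sleep were a busy-wait inside the handler instead, GPE_B could not even start until GPE_A was done.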
What's causing these events in the first place? I thought I'd be able to work this out because the low GPE numbers are generally assigned to fixed hardware functions, but my back's been turned on this for about a decade and Intel's gone and made GPE 2 the "Software GPE" bit. What triggers a software GPE? Fucked if I can figure it out - it's not described in the chipset docs. Based on everything that's happening here it seems like it could be any number of things, the handler touches a lot of stuff.
But ok we have something that's executing a bunch of code. Is that in itself sufficient to explain video and audio drops? No. All of this is being run on CPU 0, and this is a multi-core laptop. If CPU 0 is busy, do it all on other cores. The problem here is that all cores are suddenly not executing the user code, and the most likely explanation for that is System Management Mode.
SMM is a CPU mode present in basically all Intel CPUs since the 386SL back in 1989 or so. Code accesses a specific IO port, the CPU stops executing the OS, and instead starts executing firmware-supplied code in a memory area the OS can't touch. The ACPI decompilation only includes the DSDT (the main ACPI table) and not any of the SSDTs (additional ACPI tables that typically contain code for additional components such as GPU-specific methods), so I can't look for sure, but what I suspect is happening here is that one of the _PS0 or _PS3 methods is triggering into SMM and the entire system[2] is halting while that code is run, which would explain why the latency is introduced at the system level rather than it just being "CPU 0 isn't doing stuff".
And, well, the root cause here is probably correctly identified, which is that the _L02 event keeps firing and when it does it's triggering a notification to the GPU driver that is then calling an ACPI method that generates latency. The rest of the conclusions are just not important in comparison. Sleep() is not an unreasonable thing to use in an ACPI method, it's unclear whether clearing the event bits is enough to immediately trigger another event, it's unclear whether sending events that trigger the _PS0/_PS3 dance makes sense under any circumstances here rather than worrying about the MUX state. There's not enough public information to really understand why _L02 is firing, nor what is trying to be achieved by powering up the GPU, calling _DOS, and then powering it down again.
[1] This is absolutely necessary for some hardware - we hit issues back in 2005 where an HP laptop just wouldn't work if you couldn't handle multiple ACPI events at once
[2] Why the entire system? SMM is able to access various bits of hardware that the OS isn't able to, and figuring out which core is trying to touch hardware is not an easy thing to work out, so there's a single "We are in SMM" bit and all cores are pushed into SMM and stop executing OS code before access is permitted, avoiding the case where going into SMM on one CPU would let OS code on another CPU access the forbidden hardware. This is all fucking ludicrous but here we are.
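A toy model of the CPU 0 versus SMM argument above, in Python. All of it is assumption for illustration: the 1 ms tick, the core count, and the idea that a latency-sensitive task just needs any one free core per tick. It ignores affinity, scheduler latency and everything else a real system does.

```python
def missed_ticks(num_cores, timeline):
    """Count ticks in which no core is free to run OS code.

    timeline is a list of (isr_on_cpu0, in_smm) booleans, one pair per
    hypothetical 1 ms tick.
    """
    missed = 0
    for isr_on_cpu0, in_smm in timeline:
        if in_smm:
            free = 0  # every core is pulled into SMM together
        else:
            free = num_cores - (1 if isr_on_cpu0 else 0)
        if free == 0:
            missed += 1
    return missed

if __name__ == "__main__":
    # 30 ticks of a long handler pinned to CPU 0 on an 8-core machine.
    print(missed_ticks(8, [(True, False)] * 30))
    # 30 ticks spent in SMM on the same machine.
    print(missed_ticks(8, [(False, True)] * 30))
```

With only CPU 0 tied up, seven cores remain and nothing is missed in this model. With the same 30 ticks spent in SMM, every tick is missed, which is the shape of stall that would show up as system-wide audio and video drops.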
Unfortunately I can't remember the details now.
I suspect it may be in other products too; I have seen similar issues elsewhere in the Asus product line.