• w3dd1e@lemmy.zip · 16 points · 28 days ago

    Firefox kept crashing on me a few days ago. Decided to run MemTest86 and sure enough. Bad RAM.

  • flamingo_pinyata@sopuli.xyz · 8 points · 28 days ago

    This is what dev humblebrag sounds like.
    Our app is so stable only random hardware events like bitflips can crash it.

    • grue@lemmy.world · 1 point · 28 days ago

      LOL, nah, Firefox isn’t that stable. If 10% of crashes were caused by bad RAM, it means 90% were still caused by something else.

      (My install regularly gets a memory leak that eventually makes my system unusable, BTW. I don’t think it’s necessarily the fault of Firefox itself – more likely JavaScript running in tabs, maybe interacting with an extension or something, and some of the blame goes to the kernel’s poor handling of low memory conditions – but it’s definitely not “dev humblebrag stable” for me.)

      • SkyeStarfall@lemmy.blahaj.zone · 5 points · 28 days ago

        10% of all crashes is definitely a brag. Crashes due to faulty hardware/bit flips are rare; generally I would expect that percentage to be less than 1% in any complex app.

      • Liketearsinrain@lemmy.ml · 2 points · 27 days ago

        A lot of these crashes were caused by third-party security software injecting code into Firefox. There was also some malware, and utilities like driver helpers.

        I don’t have precise numbers, but you may be able to search for it.

  • Toes♀@ani.social · 7 points · 27 days ago

    I used to be a part of an anticheat dev team and we discovered that this was a common problem back in the Windows XP era.

    We added a routine to check the memory addresses used after a crash and notified the user if we suspected hardware failure.

    At the time we suspected unstable overclocks because the metrics showed us the computers affected were typically overclocked as well.

    • llii@discuss.tchncs.de · 4 points · 28 days ago

      When I upgrade my home server I would like a low-power system with ECC RAM. I hope it will be financially viable in the future.

      • tal@lemmy.today · 3 points · edited · 28 days ago

        The problem is that ECC is one of the things used to permit price discrimination between server (less price sensitive) and PC (more price sensitive) users. Like, there’s a significant price difference, more than cost-of-manufacture would warrant. There are only a few companies that make motherboard chipsets, like Intel, and they have enough price control over the industry that they can do that. You’re going to be paying a fair bit more to get into the “server” ecosystem, as a result of that.

        Also…I’m not sure that ECC is the right fix. I kind of wonder whether the fact is actually that the memory is broken, or that people are manually overclocking and running memory that would be stable at a lower rate at too high of a rate, which will cause that. Or whether BIOSes, which can automatically detect a viable rate by testing memory, are simply being too aggressive in choosing high memory bandwidth rates.

        EDIT: If it is actually broken memory and only a region of memory is affected, both Linux and Windows have the ability to map around detected bad regions in memory, if you have the bootloader tell the kernel about them and enough of your memory is working to actually get your kernel up and running during initial boot. So it is viable to run systems that actually do have broken memory, if one can localize the problem.

        https://www.gnu.org/software/grub/manual/grub/html_node/badram.html

        Something like MemTest86 is a more effective way to do this, because it can touch all the memory. However, you can even do runtime detection with Linux up and running using something like memtester, so hypothetically someone could write a software package to detect this, update GRUB to be aware of the bad memory location, and after a reboot, just work correctly (well, with a small amount less memory available to the system…)
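        For concreteness, the GRUB side of that is just one config line. This is a sketch only: the address/mask pair below is a made-up placeholder, not a real bad region - substitute the addresses your memory test actually reports.

```sh
# /etc/default/grub - have GRUB tell the kernel to map around a bad region.
# GRUB_BADRAM takes comma-separated address,mask pairs.
# This pair is a fabricated example, NOT a real measurement.
GRUB_BADRAM="0x7ddf0000,0xffffc000"
```

        After editing, regenerate the GRUB config (e.g. `update-grub` on Debian-family systems) and reboot; the kernel then skips the masked region at the cost of a little usable RAM.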

        • AA5B@lemmy.world · 1 point · 27 days ago

          I wonder if AI can actually help here. As the industry abandons consumer hardware in favor of datacenter equipment to profit from the AI bubble, perhaps ECC memory will become cheaper.

        • grue@lemmy.world · 1 point · 28 days ago

          There’s no real good reason that all RAM shouldn’t have been ECC decades ago. It doesn’t actually cost much more to implement. The only reason it isn’t, as tal’s reply mentioned, is artificial price discrimination.

    • Jarix@lemmy.world · 1 point · 28 days ago

      How so?

      Didn’t it just highlight how stable the software is?

      I assume bit flips crash most software. If your software is so stable that hardware errors, which affect everyone equally (which may be my erroneous assumption, I’ll admit), make up a big share of its crashes, then it’s saying that if Firefox is crashing on you, it might be time to run some diagnostics on your hardware.

      A browser as a litmus test.

      • xxce2AAb@feddit.dk · 2 points · edited · 28 days ago

        Fair question. I find it unnerving, because there’s very little a software developer can meaningfully do if they cannot rely on the integrity of the hardware upon which their software is running, at least not without significant costs, and ultimately, if the problem is bad enough, even those would fail. This finding seems to indicate that a lot of hardware is much, much less reliable than I would have thought. I’ve written software for almost thirty years and across numerous platforms at this point, and the thought that I cannot assume a value stored in RAM will reliably retain its value fills me with the kind of dread I wouldn’t be able to explain to someone uninitiated without a major digression. Almost everything you do on any computing device - whether a server or a smartphone - relies on the assumption of that kind of trust. And this seems to show that assumption is not merely flawed, but badly flawed.

        Suppose you were a car mechanic confronted with a survey showing that 10 percent of cars were leaking brake fluid - or fuel. That might illustrate how this makes me feel.

        • Jarix@lemmy.world · 3 points · edited · 28 days ago

          Hmm thanks, also please massively digress if you would like to.

          I interpreted it like this: 10% is a lot if it’s 10% of a million. That’s 100,000. So if there are a million things that crash Firefox, that’s a high number.

          If Firefox only crashes 10 times a year because it runs that well, then 10% - that 1 crash from a bit flip - takes up such a high share of total crashes precisely because Firefox just doesn’t crash very often, which is impressive.

          If your dread is found to be justified, that won’t be too surprising to me, if hardware is getting made less reliable these days. Enshittification being the norm, and tech being in everything nowadays.

          We obviously need more context from Mozilla, but this could be a canary-in-the-coal-mine type situation.

          But it would be kind of neat if Firefox unintentionally became something of a reliable test for bit flips.

          • xxce2AAb@feddit.dk · 2 points · 27 days ago

            I agree, and there are a number of other biases to consider. Here’s some I can think of:

            • Firefox will mainly be running on desktops, laptops and smartphones. I would expect QA to be significantly better for this type of device than for, say, consumer-grade routers or TV boxes. But more concerning to me is stuff like cheap ATMs, industrial control systems (although Siemens have great QA) and elevator control systems etc. Infrastructure, not consumer toys - and Mozilla obviously aren’t the right people to say anything about the state of any of that.
            • While Mozilla is currently estimating approximately 200 million installs, some of those - especially on Linux - will have disabled telemetry. I know I do. With that said, I can’t recall the last time I had a FF CTD (crash to desktop), but I suspect when I did, it wasn’t even a bug but an OOM (out-of-memory) kill, because I was browsing on something like a 2 GB RAM micro-portable with insufficient swap. FF is one impressively stable piece of software these days.
            • Firefox usage is not evenly globally distributed, and I have no way to reliably assess whether FF has a larger or smaller proportional usage in regions that may rely more on older or refurbished hardware, which I would expect to have higher HW error rates (although I cannot prove that either - I can’t find any good public aggregate data for RAM MTBF trends over time, but I’d be very interested if somebody else knows where to find authoritative answers on that).

            (Un)fortunately, this may be the most Mozilla can provide in terms of insight. Their users tend to be particularly sensitive to perceived or practical privacy violations, so I understand - and appreciate - their caution in gathering data.

  • GreenBeanMachine@lemmy.world · 0 points · edited · 28 days ago

    What makes Firefox more susceptible to bit flips than any other software? Wouldn’t that mean that 10% of all software crashes are caused by bit flips, and it just depends on what software you are running when that happens?

    • spizzat2@lemmy.zip · 2 points · 28 days ago

      I don’t think they’re arguing that Firefox is more susceptible to bit flips. They’re trying to say that their software is “solid” enough that a significant number of the reported crashes are due to faulty hardware, which is essentially out of their control.

      If other software used the same methodology, you could probably use the numbers to statistically compare how “solid” the code base is between the two programs. For example, if the other software found that 20% of their crashes were caused by bit flips, you could reasonably assume that the other software is built better because a smaller portion of their crashes is within their control.

      • GreenBeanMachine@lemmy.world · 1 point · edited · 28 days ago

        Interesting metric to measure, but since I have no reference for how many crashes are caused by bit flips in any other software, it’s really hard to say whether Firefox is super stable or super flaky.

    • xthexder@l.sw0.com · 2 points · 28 days ago

      This checks out with Linus Torvalds saying most OS crashes across Linux AND Windows are caused by hardware issues, and it’s also why he uses ECC RAM.

      • douglasg14b@lemmy.world · 1 point · 28 days ago

        Honestly, yeah, it 100% checks out.

        I have a device with ECC RAM, and I can keep it online with applications running for well over 18 months with no stability issues.

        However, both my work computers and my personal computer start to become unstable after about 15 to 20 days, and degrade over the course of 1 to 2 years (with a considerable increase in the number of corrupt system files).

        Firefox and Chrome usually start to become unstable after a week if they have really high memory usage.

    • Buddahriffic@lemmy.world · 1 point · 28 days ago

      No, the exact % depends on how stable everything else is.

      As a trivial example, say you have 3 programs: one that sets a pointer to a random address and tries to dereference it, one that does the same but only if the last two digits of a timer it checks are “69”, and one that never sets a pointer to an invalid address. Based on the programs themselves, the first one will crash almost all the time, the second one will crash about 1% of the time, and the third one won’t crash at all.

      If you had a mechanism to perfectly detect bit flips (honestly, that part has me the most curious about the OP), and you ran each program until you had detected 5 bit flip crashes (let’s say they happen 1 out of each 10k runs), then the first program will have something like a 0.01% chance of any given crash being due to bit flip, about 1% for the 2nd one, and 100% for the 3rd one (assuming no other issues like OS stability causing other crashes).

      Going with those numbers I made up, every 10k “runs” you’d see 1 crash from bit flips and 9 crashes from other reasons. Or, for every crash report they receive, 1 of 10 is a bit flip and 9 of 10 are “other”. Well, more accurately, 1 of 20 for bit flips and 19 of 20 for other, since the 10% figure assumes the detector only catches half of them - what they actually measured was 5%.
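      The share-of-crashes arithmetic above can be sketched in a few lines (the rates are the comment’s made-up numbers, not measurements from Mozilla or anyone else):

```python
# Sketch of the point above: the bit-flip share of a program's crashes
# depends entirely on how often it crashes for other reasons.
# All rates here are illustrative assumptions.
BITFLIP_CRASH_RATE = 1 / 10_000  # assumed: 1 bit-flip crash per 10k runs


def bitflip_share(other_crash_rate: float) -> float:
    """Fraction of all crashes attributable to bit flips."""
    return BITFLIP_CRASH_RATE / (BITFLIP_CRASH_RATE + other_crash_rate)


for name, rate in [("crashes almost every run", 0.99),
                   ("crashes ~1% of runs", 0.01),
                   ("never crashes otherwise", 0.0)]:
    print(f"{name}: {bitflip_share(rate):.2%} of crashes are bit flips")
```

      Same bit-flip rate in every case, yet the reported share ranges from roughly 0.01% to 100% - which is why a high bit-flip percentage says more about the rest of the code than about the hardware.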

    • toddestan@lemmy.world · 1 point · 28 days ago

      Programs that use more memory could be slightly more susceptible to this sort of thing, because if a bit gets randomly flipped somewhere in a computer’s memory, the flip is more likely to land in an application with a larger RAM footprint than in one with a small footprint.

      I’m still surprised the percentage is this high.
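      A toy version of that footprint argument (the apps and sizes are invented for illustration): if a flip lands uniformly at random in occupied RAM, the chance it hits a given process is just that process’s share of the total footprint.

```python
# Illustrative footprints only - not measured data.
footprints_mb = {"browser": 4000, "editor": 300, "daemon": 50}
total_mb = sum(footprints_mb.values())

for app, mb in footprints_mb.items():
    share = mb / total_mb
    print(f"{app}: {share:.1%} chance a uniform random flip lands in it")
```

      Under this model a browser holding most of the occupied RAM also absorbs most of the flips, which fits the intuition above - though real susceptibility also depends on which pages are actually touched and how the program reacts to corruption.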