Because they pick what they run at random, and they drive your temps and volts differently depending on which test is running.
You can run Prime95 in three or four short 4-hour stints, get different tests each time, and still not run all the possible tests. Same with Memtest86+ 4.20.
I've had Prime95 fail 16 hours in on one core. 24 hours is considered stable; 36-48 is the gold standard (and you might have tested all the patterns by then).
I'm not going to stare at the Prime95 screen for 48 hours, but it doesn't seem random at all: it starts with the same tests every time.
I'm not bothering with the long tests anymore. Prime95/LinX are good for quick stability tests, but I've run multiple overnight Prime95 tests, and every time I still had to increase vcore multiple steps afterwards to get real stability, because it was nowhere near.
I'd say it's better to use an intentionally heavy multithreaded scenario with lots of regular stuff running at the same time: video, music, P2P, a game, a browser, messaging, all while doing a HandBrake encode.
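If you want something scriptable along the same lines, here is a rough Python sketch of a mixed load. It is not a replacement for Prime95, LinX or a real HandBrake encode. It just keeps several cores busy with different kinds of work (hashing, floating point, integer math) and flags any run whose result differs from the first one, which on an unstable overclock can show up as a mismatch. All the function names and parameters here are made up for illustration, and the reference values are computed on the same machine, so this only catches inconsistency, not a consistently wrong result.

```python
import hashlib
import math
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def hash_work(rounds):
    # Repeated SHA-256 over its own output (integer/ALU heavy).
    data = b"stability"
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()


def float_work(rounds):
    # Floating-point accumulation; repr() keeps every bit of the result.
    total = 0.0
    for i in range(1, rounds + 1):
        total += math.sqrt(i) * math.sin(i)
    return repr(total)


def int_work(rounds):
    # Simple modular integer recurrence.
    acc = 0
    for i in range(rounds):
        acc = (acc * 31 + i * i) % 1_000_003
    return str(acc)


WORKLOADS = [hash_work, float_work, int_work]


def run_mixed_load(passes=3, rounds=200_000, workers=4,
                   executor_cls=ProcessPoolExecutor):
    # Reference results, computed once up front on this same machine.
    reference = [fn(rounds) for fn in WORKLOADS]
    mismatches = 0
    with executor_cls(max_workers=workers) as pool:
        for _ in range(passes):
            # Queue every workload once per worker so the kinds of work overlap.
            jobs = [(idx, pool.submit(fn, rounds))
                    for idx, fn in enumerate(WORKLOADS)
                    for _ in range(workers)]
            for idx, job in jobs:
                if job.result() != reference[idx]:
                    mismatches += 1
    return mismatches


if __name__ == "__main__":
    # Raise passes/rounds for a longer run; set workers to your core count.
    print("mismatches:", run_mixed_load())
```

Anything other than `mismatches: 0` would be a red flag. Passing `executor_cls=ThreadPoolExecutor` gives a quick dry run, but threads won't load all cores in Python, so keep the default process pool for actual testing.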