hmmm, this may be trickier than i thought then
Not really. I was just addressing one specific question/requirement you had. The takeaway is that if you want to preserve easy random access, then 1) don't use compressed tarballs (or similar stream-compressed formats), and 2) if using RAR, make sure solid archiving is unchecked, and if using 7-Zip, make sure you're non-solid or have a small solid block size. For other compression programs the options may vary, but once you know the gist of what you're looking for, it should be easy to identify.
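To see the random-access point in a toy sketch: a zip archive compresses each member independently, so one member can be pulled out without decompressing anything else. This uses Python's standard `zipfile` module; the filenames and payload are made up for illustration.

```python
import io
import zipfile

# Build an in-memory zip: each member is deflated independently,
# so any single member can be decompressed without touching the rest.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    for i in range(100):
        z.writestr(f"file{i}.txt", b"payload " * 100)

# Random access: the central directory lets us jump straight to one member.
with zipfile.ZipFile(buf) as z:
    data = z.read("file42.txt")

print(len(data))  # 800
```

A .tar.gz gives you no such shortcut: the whole archive is one gzip stream, so reaching the 42nd file means decompressing everything before it.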
I often use WinRAR with "Fastest" compression and solid turned off--it's pretty fast, preserves random access, and while the compression isn't great compared to best/solid, it does take care of the "low-hanging fruit" as far as compression is concerned and saves enough space that it's worthwhile.
That is pretty interesting, did not know this. So the "word sizes" are essentially blocks of bytes right?
I quickly looked at 7-Zip and didn't see an option for the dictionary reset settings.
No, word size is a different setting. 7-Zip exposes three relevant settings here.
Dictionary size: Bigger is better, but bigger is slower and increases the RAM requirements for both compression and decompression. (Zip's biggest weakness is a tiny dictionary--just 4KB for the default "implode" setting--but keep in mind that Zip was developed back in the days of DOS, when memory was measured in hundreds of kilobytes. This is also why solid [no-reset] compression doesn't make much sense in Zip: if you're only discarding a few KB of dictionary per file, that's not going to make a huge difference with modern file sizes.)
The dictionary reset in 7-Zip is controlled by the "Solid Block size" option, with "non-solid" being always-reset-per-file, "solid" being never-reset, and the various sizes in between being reset-after-X-bytes.
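A rough way to see why solid (no-reset) compression wins when files share content is with Python's `zlib` standing in for 7-Zip's compressor; the "files" here are made-up redundant blobs.

```python
import zlib

# Ten hypothetical files with lots of shared content.
files = [b"The quick brown fox jumps over the lazy dog. " * 20
         for _ in range(10)]

# "Non-solid": compress each file independently (dictionary reset per file).
non_solid = sum(len(zlib.compress(f)) for f in files)

# "Solid": one stream over the concatenation (dictionary shared across files).
solid = len(zlib.compress(b"".join(files)))

print(non_solid, solid)  # solid comes out smaller on redundant inputs
```

The solid stream can reference earlier files as dictionary material, so cross-file duplication nearly vanishes; the per-file version pays full price for each copy. (On 7-Zip's command line, I believe this corresponds to the `-ms` switch, e.g. `-ms=off` for non-solid or `-ms=64m` for 64 MB solid blocks.)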
I'm not 100% sure what Word size controls, but I think it controls how 7-Zip chunks the data for finding patterns.
Say you have the following data:
0123456789abcdef0123456789abcdef0123456789abcdef
With a word size of 8 bits, it'll look like this:
01 23 45 67 89 ab cd ef 01 23 45 67 89 ab cd ef 01 23 45 67 89 ab cd ef
And with a word size of 64 bits, it'll look like:
0123456789abcdef 0123456789abcdef 0123456789abcdef
In this example, the pattern repeats every 64 bits, so a 64-bit word size would be more efficient for finding patterns. Larger isn't necessarily better with word size--the optimal word size depends on the data you're compressing. In practice, though, different word sizes rarely make a huge difference (in part because most files have repeating patterns of different lengths), so it's fine to just leave it at the default and not bother with it.
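The chunking idea above can be sketched in a few lines of Python (a toy model of the granularity effect, not 7-Zip's actual matcher):

```python
data = "0123456789abcdef" * 3

def chunks(s, n):
    """Split s into fixed-size pieces of n characters."""
    return [s[i:i + n] for i in range(0, len(s), n)]

# "8-bit words": 2 hex digits each -> 8 distinct values, each repeated often.
print(len(set(chunks(data, 2))))   # 8

# "64-bit words": 16 hex digits each -> one value, repeated 3 times.
print(len(set(chunks(data, 16))))  # 1
```

With the larger word, the whole repeating unit collapses into a single recurring symbol, which is the pattern-finding win described above. (My understanding is that on 7-Zip's command line, the GUI's Word size maps to the `-mfb=N` "fast bytes" parameter, but I wouldn't swear to it.)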