|
|
 |
|
11-13-2010, 03:32 PM
|
#1
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
Will somebody benchmark WHOLE DISK COMPRESSION (HD,SSD,RAMDISK) please!!
Please use the windows 7 or 2008R2 "compress the whole darn drive" then benchmark.
I'd love to see bottleneck scenarios on a modern multi-core (4 core or 2c4t) cpu.
you'd have to create a disk image.
loop:
benchmark
compress the whole drive
benchmark
restore the image since disabling compression doesn't decompress the objects.
switch from HD to SSD to RAID (HD/SSD) and maybe even ramdisk.
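For anyone who wants to try the loop, here is a minimal sketch of the "benchmark" step, assuming a plain sequential read of one test file is an acceptable stand-in for a real disk benchmark. The function names, the 1 MB block size and the run count are illustrative choices, not taken from any particular tool.

```python
import os
import statistics
import tempfile
import time

BLOCK = 1024 * 1024  # read in 1 MB blocks (arbitrary choice for this sketch)

def time_read(path):
    """Read the whole file once and return throughput in MB/s."""
    size = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                break
            size += len(chunk)
    elapsed = time.perf_counter() - start
    return (size / 1e6) / max(elapsed, 1e-9)

def benchmark(path, runs=5):
    """Repeat the read several times and return the median MB/s."""
    return statistics.median(time_read(path) for _ in range(runs))

if __name__ == "__main__":
    # Hypothetical test file; point this at a large file on the drive under test.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"sample data " * 500_000)
    try:
        print(f"median read: {benchmark(tmp.name):.1f} MB/s")
    finally:
        os.remove(tmp.name)
```

Run it once before and once after compressing the drive and compare the medians. Repeated reads of the same file will mostly come from the OS file cache, so a real test needs a file larger than RAM or a reboot between runs.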
Maybe ANAND is too old for this fun - but it makes me wonder. I have a dual 6-core westmere with ESXi - 6Gb SATA and SAS - and i don't mind carving off a couple of cores for compression. I already do that for sql server 2008 (compress *.*) at a program level, and why yes it is faster when used correctly (!!).
Also note: NTFS lets you leave certain objects uncompressed by directory. so if you have swap or pagefile or JPEG/PR0N VIDS you could add another dimension to the test.
I think it would be a worthy ANAND article - today's compression.
1. ESXi compresses pages (ramdoubler!) before it has to swap them out since 4.1
2. many modern SAN will compress objects and deduplicate them. heck vmware does ram (ramdoubler?) and vmfs (clustered storage filesystem) deduplication at a low level already.
3. Backup Exec System Restore 2010 has the ability to send to a datastore for dedupe so those 50 pc's at work that were imaged all at once contain 90% of the same code - can be deduped.
4. SQL server since 2008 has TCP/TABLE/BACKUP compression because we have more CPU than DISK I/O. SQL server is really an os and is called SQLOS by some old-timers - soft-numa - etc - It must have been a good idea. This is one product microsoft can proudly call their own (best acquisition lol).
Anyone think it's worth the time?
CPU usage would go way up but load time on constrained i/o devices (laptop 4200rpm drive) might yield more usability? or not?
What about dedupe? Ramdoubler? Is it time we can afford to use those again in consumer land? I'd rather have a GC process compress stale objects than page them, to reduce pagefile wear or prevent paging completely. I can't afford more than 8GB for a personal pc right yet.
Inquiring minds would like to know..
__________________
-------------------------
NAS: Dell 530 Q6600 8gb 4tb headless VHP
KID PC1: Mac Pro Dual nehalem - 6gb - GF120 - HP ZR30W
Browser: Dell 530 Q6600 4GB - Kingston 96gb -gt240- hp LP3065 IPS - 7ult
Tabs: IPAD 1,2,3 IPOD3,HTC flyer, Galaxy Tab - all rooted/jb
Couch1: Macbook Air/Macbook White
Couch2: Macbook Pro 17 2.66 Matte screen - 8GB - SSD
HTPC: Asus C2Q8300/X25-V - Geforce 430- 7ult - Antec MicroFusion 350
|
|
|
11-15-2010, 11:07 AM
|
#2
|
|
Senior Member
Join Date: May 2002
Location: Sunny Los Angeles
Posts: 861
|
You've thought about this quite a bit since you got a bit of a plan and have all the questions laid out. You probably got enough knowledge and resources as well. It's always the last 10% that's the hard part. You got 90% to your answer, why not do that last 10% yourself?
|
|
|
11-15-2010, 07:29 PM
|
#3
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
man i got a kid - i have to go read some stories and put to bed - I was hoping someone like Anand would analyze compression specific entities to increase disk i/o.
|
|
|
11-15-2010, 11:19 PM
|
#4
|
|
Diamond Member
Join Date: Jul 2009
Location: Equestria
Posts: 6,520
|
Just for fun I'm compressing my Program Files directory under XP to see what kind of compression level we're talking about. 15GB mostly in Resident Evil 4 and Demigod. Not gonna benchmark anything, though.
Earlier I did my GIMP directory and found 18 seconds for the program to load uncompressed vs. 17 compressed with the directory as a whole compressed to 0.65; but I only did one run and my timing method wasn't the most precise, either.
Chrome compresses to 0.68 original, which is more compression than I would've thought.
I'd like to see some benchmarks, but it's gonna be so highly variable depending on the compression level of what's on the disk, layout compared to usage, with superfetch throwing a huge curve into things.
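A rough way to estimate ratios like these for a whole directory without actually compressing it. This is a sketch under loud assumptions: it uses zlib from the Python standard library, which is not the algorithm NTFS uses, it compresses in independent 64 KB chunks, and it ignores rounding up to whole clusters. Treat the result as a ballpark only.

```python
import os
import zlib

UNIT = 64 * 1024  # compress in independent 64 KB chunks (assumption of this sketch)

def compressed_ratio(data, unit=UNIT):
    """Compressed size / original size, compressing each chunk independently."""
    if not data:
        return 1.0
    packed = 0
    for i in range(0, len(data), unit):
        chunk = data[i:i + unit]
        # A chunk that does not shrink is counted at its original size.
        packed += min(len(chunk), len(zlib.compress(chunk, 1)))
    return packed / len(data)

def directory_ratio(root):
    """Walk a directory tree and return the overall estimated ratio."""
    total = packed = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    data = f.read()  # whole file in memory: fine for a sketch
            except OSError:
                continue  # skip locked or unreadable files
            total += len(data)
            packed += compressed_ratio(data) * len(data)
    return packed / total if total else 1.0
```

Calling `directory_ratio(r"C:\Program Files\GIMP")` (a hypothetical path) would give a number comparable in spirit to the 0.65 and 0.68 figures above, though not identical to what NTFS achieves.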
|
|
|
11-16-2010, 04:42 AM
|
#5
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
using the compact command? google it. you can tell it to find all dll and exe's on your drive in one swoop
|
|
|
11-17-2010, 01:27 AM
|
#6
|
|
Diamond Member
Join Date: Jun 2000
Location: Denver, CO
Posts: 4,120
|
For people with SSDs in laptops, compression can be an important factor to gain more space. I have an 80GB SSD in my laptop, and have compressed a lot of folders that I don't use often, but still might need. I haven't compressed my program files yet, though.
Edit: I wonder if turning on drive or folder compression negatively affects SandForce SSDs?
__________________
Main Rig - MSI Z77-DS3H
i5-3570K
Sapphire HD6850
Soyo Topaz 24" MVA LCD
Corsair TX650W PSU / Antec P182
256GB M4 SSD
|
|
|
11-17-2010, 05:55 AM
|
#7
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
you can turn on compression (compress items) then turn it off (but the stuff remains compressed) iirc from some old training i did.
question is? why do binaries compress? I thought most dll's are encrypted and compressed already? Why wouldn't the companies (microsoft,etc) run MAD zip compression level 11 on every bit of their files (compression is a form of encryption)
|
|
|
11-17-2010, 08:19 AM
|
#8
|
|
Elite Member
Join Date: Sep 2001
Posts: 30,636
|
Quote:
Originally Posted by Emulex
you can turn on compression (compress items) then turn it off (but the stuff remains compressed) iirc from some old training i did.
question is? why do binaries compress? I thought most dll's are encrypted and compressed already? Why wouldn't the companies (microsoft,etc) run MAD zip compression level 11 on every bit of their files (compression is a form of encryption)
|
No, AFAIK MS has never encrypted or compressed their binaries.
But in theory compression should help with load times since there will be less I/O to do and CPUs should be fast enough to decompress the data in memory with little to no noticeable latency. But the real world effects of that are very hard to benchmark.
|
|
|
11-17-2010, 11:11 AM
|
#9
|
|
Senior Member
Join Date: Mar 2009
Posts: 648
|
From what I can tell, compression will most likely increase latency (slightly) and increase bandwidth (slightly to dramatically).
However some people have shown that 4kb random writes are impacted quite dramatically by compression. (we're talking 50 to 5 MB/s). Can someone validate?
|
|
|
11-17-2010, 11:14 AM
|
#10
|
|
Lifer
Join Date: Mar 2004
Posts: 13,331
|
Quote:
Originally Posted by jimhsu
From what I can tell, compression will most likely increase latency (slightly) and increase bandwidth (slightly to dramatically).
However some people have shown that 4kb random writes are impacted quite dramatically by compression. (we're talking 50 to 5 MB/s). Can someone validate?
|
just to be clear, you are saying 4k random writes drop an order of magnitude when you turn on compression?
__________________
I do not have a superman complex; for I am God, not superman!
The internet is a source of infinite information; the vast majority of which happens to be wrong.
How to protect your data guide
AA Naming Guide
main: Win7x64, i5-3570K, 16GB DDR3-1600, XFX HD6950, Gigabyte GA-Z77MX-D3H. 240GB Intel 520 SSD
fileserver: Solaris 11, Athlon2 X4 @ 3ghz, 4GB DDR2, 160GB samsung OS drive, 5x750GB WD CaviarGP drives in raidz2 (ZFS raid6).
|
|
|
11-17-2010, 11:26 AM
|
#11
|
|
Senior Member
Join Date: Mar 2009
Posts: 648
|
I have the evidence here actually. Did an iometer run without compression and with compression. The results are ... interesting to say the least.
Test conditions:
1. Small test file
2. 4 Outstanding IOs
3. Compression enabled using Win7 x64 "right click > properties"
4. Processor - E8400
5. Test interval - 1 sec warm up time, 5 sec test. I found out that longer test times actually introduce more drift (i.e. disk accesses, TRIM, etc)
Run                             Normal MB/s   Normal IOPS    Compression MB/s   Compression IOPS
1MB; 100% Read; 0% random (1)   153.458158    153.458158     4686.168947        4686.168947
1MB; 0% Read; 0% random         81.301776     81.301776      22.910449          22.910449
4K; 100% Read; 100% random      74.333424     19029.35662    496.191356         127024.987
4K; 0% Read; 100% random        72.17019      18475.56867    1.460037           373.769533
CPU utilization for compression cases peaked at 56% and remained there basically constantly. This supports my theory that compression is singlethreaded (at least while reading/writing to a single file).
Note on graph below, the y-axis (MB/s) is in LOG scale (on a semilog graph). My 1MB sequential read run for uncompressed was a little screwed up (warm up time) so ignore that.
Also realize IOmeter uses highly compressible data for benchmarks.
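To make the table easier to read, here is a small sketch that turns each row into a "times faster / times slower" factor. The numbers are copied from the table above; the row labels and the `speedup` helper are just names chosen for this example.

```python
# (normal MB/s, compressed MB/s), copied from the IOmeter table above
results = {
    "1MB sequential read": (153.458158, 4686.168947),
    "1MB sequential write": (81.301776, 22.910449),
    "4K random read": (74.333424, 496.191356),
    "4K random write": (72.17019, 1.460037),
}

def speedup(normal, compressed):
    """Ratio of compressed to uncompressed throughput (>1 means faster)."""
    return compressed / normal

for name, (normal, compressed) in results.items():
    s = speedup(normal, compressed)
    label = f"{s:.1f}x faster" if s >= 1 else f"{1 / s:.1f}x slower"
    print(f"{name:22s} {label}")
```

Keep in mind the caveats already given: the uncompressed 1MB read run was affected by warm-up, and IOmeter's test data is highly compressible.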
Last edited by jimhsu; 11-17-2010 at 02:02 PM.
|
|
|
11-17-2010, 11:32 AM
|
#12
|
|
Senior Member
Join Date: Mar 2009
Posts: 648
|
What I can conclude:
Compression is good for read-heavy databases that are sparse (i.e. easily compressible). You can experience substantial to extreme boosts provided CPU resources are not taxed.
Compression is HORRIBLE for write heavy databases, especially random writes.
For everything in between, it depends on your application. I'd say for databases with a 90/10 read/write ratio or greater, you might consider compression.
|
|
|
11-17-2010, 11:40 AM
|
#13
|
|
Elite Member
Join Date: Sep 2001
Posts: 30,636
|
Quote:
Originally Posted by jimhsu
What I can conclude:
Compression is good for read-heavy databases that are sparse (i.e. easily compressible). You can experience substantial to extreme boosts provided CPU resources are not taxed.
Compression is HORRIBLE for write heavy databases, especially random writes.
For everything in between, it depends on your application. I'd say for databases with a 90/10 read/write ratio or greater, you might consider compression.
|
Well SQL Server won't let you mount a compressed mdf/ldf so that's not an option there. I doubt other databases like MySQL or PostgreSQL would care though.
|
|
|
11-17-2010, 12:19 PM
|
#14
|
|
Lifer
Join Date: Mar 2004
Posts: 13,331
|
jimhsu, is this mislabeled or did you make it a semilog scale graph? if so then it is an extreme difference that isn't quite conveyed to someone not used to reading semilog graphs
I assume "100% read" is "read" and "0% read" is "write"?
you clearly show huge improvements in read, and huge losses in write performance. thank you for sharing the data.
Last edited by taltamir; 11-17-2010 at 01:34 PM.
|
|
|
11-17-2010, 01:23 PM
|
#15
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
sql server 2008/R2(10)/DENALI (11) has built in table level compression. IT IS AWESOME.
Imagine you have your databases spliced over iscsi - files 1 lun, tmpdb 1 lun, log 1 lun.
peak speed is gigabit right? But everyone stores stuff in databases that is highly compressible. you can even turn it on and off (mixed) or batch run it.
ie. compress always table A
and/or compress on/off when you determine load is too high disable compression
and/or compress every night.
Compress backups (yay!) and compress TCP/IP connects (YAY!) as well!
-------
I'd rather have a table fit in a page than split too.
Keep in mind i consider SQL server an OS. It sits on top of windows but has numa, resource governor, i/o controller (logs are less lazy than main table writes).
It's become so smart the old adage that you must separate log from core storage is not really a must any more. Light years more advanced than mysql.
anyhoo.
My idea was to compress items that aren't updated frequently on batch using the COMPACT command (dos command). things that are updated a lot - just don't.
Things that do not compress skip.
NTFS allows you to set compression on/off at a file level , dir , or whole drive. you can also turn it off but the objects remain compressed.
Anyone care to try that?
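One way to check the "turn it off but the objects remain compressed" behaviour is to list which files still carry the compressed attribute after the setting is cleared. A minimal sketch, assuming Python on Windows: `st_file_attributes` is only populated there, so on other systems the helper simply reports False. The function names are made up for this example.

```python
import os
import stat

def is_ntfs_compressed(path):
    """True if the file carries the 'compressed' file attribute.
    st_file_attributes only exists on Windows; elsewhere this returns False."""
    attrs = getattr(os.stat(path), "st_file_attributes", 0)
    return bool(attrs & stat.FILE_ATTRIBUTE_COMPRESSED)

def compressed_files(root):
    """Yield paths under root that are still stored compressed."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if is_ntfs_compressed(path):
                    yield path
            except OSError:
                continue  # locked or vanished file
```

Running `list(compressed_files(folder))` before and after unticking compression on a folder would show whether the existing files were left compressed.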
|
|
|
11-17-2010, 02:00 PM
|
#16
|
|
Senior Member
Join Date: Mar 2009
Posts: 648
|
Yes, it's semilog. The data is unreadable as a linear scale graph.
Using highly compressible data (IOmeter), you get up to a 30x performance increase on reads, and up to a 49x performance DECREASE on writes. You can see why your data workload matters so much - even a 1-2% increase in writes can affect performance drastically.
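To put a number on the "1-2% writes" point, here is a sketch using a very simple model: the time to move one MB of mixed traffic is the weighted sum of the read time and the write time, so the blended throughput is a harmonic mean. That model is an assumption of this example (it ignores queueing, caching and CPU contention), and the inputs are the 4K random rows from the IOmeter run posted earlier in the thread.

```python
def mixed_throughput(read_fraction, read_mbps, write_mbps):
    """Harmonic blend: time per MB is the weighted sum of read and write time."""
    time_per_mb = read_fraction / read_mbps + (1 - read_fraction) / write_mbps
    return 1 / time_per_mb

def break_even_read_fraction(normal, compressed):
    """Read fraction at which compressed and uncompressed mixes tie.
    Each argument is a (read MB/s, write MB/s) pair."""
    a, b = 1 / compressed[0], 1 / compressed[1]
    c, d = 1 / normal[0], 1 / normal[1]
    return (d - b) / (a - b - c + d)

NORMAL_4K = (74.333424, 72.17019)       # uncompressed read, write
COMPRESSED_4K = (496.191356, 1.460037)  # compressed read, write

for writes in (0.0, 0.01, 0.02):
    mbps = mixed_throughput(1 - writes, *COMPRESSED_4K)
    print(f"{writes:.0%} writes: {mbps:.0f} MB/s with compression")

tie = break_even_read_fraction(NORMAL_4K, COMPRESSED_4K)
print(f"break-even at {tie:.1%} reads")
```

Under this model, 1% writes already pulls the compressed 4K figure from about 496 MB/s down to roughly 113 MB/s, and compression only wins for 4K random I/O when reads are above roughly 98% of the traffic. Real data is less compressible than IOmeter's, so the actual crossover will differ.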
|
|
|
11-17-2010, 02:53 PM
|
#17
|
|
Lifer
Join Date: Oct 1999
Posts: 22,329
|
Quote:
Originally Posted by Emulex
Will somebody benchmark WHOLE DISK COMPRESSION (HD,SSD,RAMDISK) please!!
|
Tell tweakboy to do it... He'll benchmark anything.
__________________
...whenever any Form of Government becomes destructive...
it is the Right of the People to alter or to abolish it
|
|
|
11-17-2010, 08:34 PM
|
#18
|
|
Lifer
Join Date: Jul 2000
Posts: 14,061
|
There are articles on Phoronix.com benchmarking the BTRFS filesystem using compression vs not using it. It's under Linux, but the results may give you some insight into the performance boost that is possible. AFAIK BTRFS was faster across the board when compression was enabled. From what I remember, the boost was around 20%.
|
|
|
11-18-2010, 12:12 AM
|
#19
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
1. Bring up the command prompt: Start => Accessories => Command Prompt. You'll have a small black screen as per the MS-DOS days.
2. In Vista and 7, change directory to Users with the command CD \Users. In XP, change directory to Documents and Settings with the command CD \Documents and Settings.
3. Enter the compression command as follows: compact /c /s /i. Your system will go through and compress all the files in this directory and its subdirectories. It could take 10 to 20 minutes depending on the amount of data.
4. When complete, do the same with Program Files. Issue the command CD \Program Files (on 64-bit systems, do the same with Program Files (x86)).
5. Enter the same compression command: compact /c /s /i. Again, it could take 10 to 20 minutes depending on the amount of data.
6. When complete, do the same with the Windows files. Issue the command CD \Windows.
7. Enter the same compression command: compact /c /s /i. It could take 20 minutes or so.
8. When complete, we are going to compress all other .exe and .dll files across the whole drive. Issue the command CD \. You should only have the C:\> prompt showing.
9. Enter the compression command as follows: compact /c /s /i *.exe. This will compress all other .exe files across your drive. When complete, issue the command compact /c /s /i *.dll. This will compress all .dll files across the entire drive.
10. When you complete steps 1 through 9, restart your system into safe mode. To do this, restart your system and as it is starting tap the F8 key a few times. When the safe mode dialogue comes up, select the top option, Safe Mode, and your system will boot into safe mode.
11. Repeat steps 1 to 9. This will compress some of the files that were locked in normal mode.
12. When you have completed steps 1 to 9 again, restart your system normally.
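The same sequence can be scripted instead of typed by hand. A minimal sketch in Python, assuming a Windows box with `compact` on the PATH and the system on C:. The flags are exactly the ones from the steps above; the function names, the `xp` switch and the dry-run default are choices made for this example.

```python
import os
import subprocess
import sys

COMPACT = ["compact", "/c", "/s", "/i"]  # flags exactly as given in the steps

def compact_plan(drive="C:\\", xp=False):
    """Return (working directory, argv) pairs in the order the steps use."""
    profiles = "Documents and Settings" if xp else "Users"
    dirs = [profiles, "Program Files", "Program Files (x86)", "Windows"]
    plan = [(drive + d, list(COMPACT)) for d in dirs]
    # Finally, every remaining .exe and .dll, starting from the drive root.
    plan.append((drive, COMPACT + ["*.exe"]))
    plan.append((drive, COMPACT + ["*.dll"]))
    return plan

def run_plan(plan, dry_run=True):
    """Print each command; only execute when dry_run is False on Windows."""
    for cwd, argv in plan:
        print(f"[{cwd}] {' '.join(argv)}")
        if dry_run or sys.platform != "win32":
            continue
        if not os.path.isdir(cwd):
            continue  # e.g. no "Program Files (x86)" on a 32-bit install
        # check=False: do not raise if compact returns a non-zero exit code
        subprocess.run(argv, cwd=cwd, check=False)

if __name__ == "__main__":
    run_plan(compact_plan())  # dry run: only prints the commands
```

Pass `dry_run=False` from an elevated prompt to actually run it, and run it a second time from safe mode to pick up the files that were locked.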
What you have just done is applied NTFS compression to all Program Files, User files and Windows files. These now occupy about 2/3 the space on your hard drive they normally did. Your hard drive will spend 1/3 less time reading these files when and as your computer accesses them because there is 1/3 less data to read from the hard drive. You won’t see much of a performance increase at this stage since many of the files will be fragmented. In a moment we’ll move onto the defragmentation, optimal file placement and confinement of those files to the outer tracks of your hard drive where transfer performance for those files will be increased by an average of 50%.
from ultimate defrag (defrag pointless in ssd i spose?)
|
|
|
11-20-2010, 08:36 PM
|
#20
|
|
Senior Member
Join Date: Mar 2009
Posts: 648
|
Can someone explain though why write performance (specifically RANDOM write) drops so dramatically with compression enabled? Are writes not being parallelized correctly in this scenario? Is my CPU (3.6 GHz E8400) holding me back?
|
|
|
11-21-2010, 08:11 AM
|
#21
|
|
Diamond Member
Join Date: Jan 2001
Location: ATL
Posts: 9,419
|
depends. is the app writing with threads? Are your drives thread safe? do you actually have NCQ?
i see all 4 of my cores (q6600) pushing up to 30-40% so i'd have to disagree.
|
|
|
11-21-2010, 10:20 AM
|
#22
|
|
Elite Member
Join Date: Sep 2001
Posts: 30,636
|
Quote:
Originally Posted by jimhsu
Can someone explain though why write performance (specifically RANDOM write) drops so dramatically with compression enabled? Are writes not being parallelized correctly in this scenario? Is my CPU (3.6 GHz E8400) holding me back?
|
Because the filesystem compresses data in chunks larger than the cluster size, so whenever a write is issued the system has to read in that chunk, decompress it, make the update, recompress it and write it back.
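That read, decompress, update, recompress, write cycle can be sketched in a few lines. This is a toy model, not the filesystem's actual code: zlib stands in for the real compressor, and the 64 KB chunk and 4 KB write sizes are assumptions picked for the example.

```python
import zlib

UNIT = 64 * 1024   # size of one compressed chunk (assumed for this sketch)
WRITE = 4 * 1024   # size of one small random write

class CompressedUnit:
    """One chunk of file data held in compressed form (zlib as a stand-in)."""

    def __init__(self, data):
        assert len(data) == UNIT
        self.blob = zlib.compress(data)
        self.bytes_processed = 0  # bytes run through the (de)compressor

    def write(self, offset, payload):
        # 1. read the stored chunk and decompress all of it
        data = bytearray(zlib.decompress(self.blob))
        # 2. apply the small update in memory
        data[offset:offset + len(payload)] = payload
        # 3. recompress the whole chunk and store it again
        self.blob = zlib.compress(bytes(data))
        self.bytes_processed += 2 * UNIT

    def read(self):
        return zlib.decompress(self.blob)

unit = CompressedUnit(bytes(UNIT))
unit.write(8192, b"\x01" * WRITE)
print(unit.bytes_processed // WRITE, "bytes processed per byte written")
```

With these sizes, every 4 KB write pushes 128 KB through the compressor (64 KB out, 64 KB back in), which is where the large penalty on small random writes comes from.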
|
|
|
11-21-2010, 10:52 AM
|
#23
|
|
Member
Join Date: Dec 2009
Location: Volgograd, Russia
Posts: 88
|
Typically, a compression unit on NTFS would be 16 clusters, 64KB. So the system has to recompress 64K even if you write one byte. There is another problem, more important, that compression coupled with random writes induces bad fragmentation.
The compression unit is 16 clusters. If the original data was compressed into 10 clusters and after the write it is now 11 clusters, there is no way to contiguously store that block. It is not uncommon for a compressed file to have upwards of 10,000 fragments after some use. In practice this affects e.g. email software databases.
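The "10 clusters becomes 11" effect is easy to reproduce in miniature. A sketch under stated assumptions: zlib stands in for the NTFS compressor, and the cluster is 4 KB (64 KB / 16, as above). It overwrites one cluster of a highly compressible unit with incompressible bytes and counts how many clusters the compressed unit needs before and after.

```python
import random
import zlib

CLUSTER = 4 * 1024
UNIT = 16 * CLUSTER  # 16 clusters = 64 KB, as stated above

def clusters_needed(data):
    """Clusters occupied by one unit after compression (zlib as a stand-in)."""
    packed = len(zlib.compress(data))
    # Round up to whole clusters; a unit never takes more than 16.
    return min(16, -(-packed // CLUSTER))

rng = random.Random(0)
unit = bytearray(b"A" * UNIT)            # highly compressible unit
before = clusters_needed(bytes(unit))

# Overwrite the first cluster with pseudo-random (incompressible) bytes.
patch = bytes(rng.getrandbits(8) for _ in range(CLUSTER))
unit[0:CLUSTER] = patch
after = clusters_needed(bytes(unit))

print(f"clusters before: {before}, after: {after}")
```

When the unit grows, the extra cluster has to be placed wherever free space is available, which is how a file that sees many small updates ends up scattered across the disk.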
|
|
|
11-21-2010, 02:05 PM
|
#24
|
|
Elite Member
Join Date: Sep 2001
Posts: 30,636
|
Quote:
Originally Posted by ElenaP
Typically, a compression unit on NTFS would be 16 clusters, 64KB. So the system has to recompress 64K even if you write one byte. There is another problem, more important, that compression coupled with random writes induces bad fragmentation.
The compression unit is 16 clusters. If the original data was compressed into 10 clusters and after the write it is now 11 clusters, there is no way to contiguously store that block. It is not uncommon for a compressed file to have upwards of 10,000 fragments after some use. In practice this affects e.g. email software databases.
|
I wouldn't call that more important; the effects of file fragmentation are hugely overblown by developers of software to fix it.
|
|
|
11-21-2010, 02:09 PM
|
#25
|
|
Lifer
Join Date: Mar 2004
Posts: 13,331
|
fragmentation not that significant.
but if a compression unit is 16 clusters that means random writes suffer from read-modify-write cycles just like an SSD without trim (but for different reasons of course).
so to write 4kb you need to read 64 kb, decompress it, modify the data, recompress it, then write it back down as 64kb or more (more if it doesn't compress as well).
The smallest 10,000-fragment file you can have is a 5,120,000 byte file (4.88MB using 1024 base conversion) where every single one of its 10,000 fragments is non-contiguous... or a SIGNIFICANTLY larger file which is still extremely fragmented. Both are patently ridiculous...
a 64kb file has a MAX of 128 fragments (because it takes up exactly 128 sectors, since each sector is 512 bytes)
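The arithmetic above, checked in a few lines. The sector-based figures are the ones from this post. The cluster-based variant is an added assumption: NTFS allocates space in clusters, and it uses the 4 KB cluster implied by "16 clusters, 64KB" earlier in the thread.

```python
SECTOR = 512
CLUSTER = 4 * 1024  # 64 KB unit / 16 clusters, per the earlier post

fragments = 10_000

# Smallest file with 10,000 fragments if a fragment can be a single sector.
min_bytes_by_sector = fragments * SECTOR
print(min_bytes_by_sector, round(min_bytes_by_sector / 1024**2, 2))

# Assumption: if a fragment is at least one cluster instead,
# the smallest such file is eight times larger.
min_bytes_by_cluster = fragments * CLUSTER
print(min_bytes_by_cluster, round(min_bytes_by_cluster / 1024**2, 2))

unit = 64 * 1024
print(unit // SECTOR, unit // CLUSTER)  # max pieces per 64 KB
```

Counted in clusters, a 64 KB unit can split into at most 16 pieces, and a file needs to be at least about 39 MB to reach 10,000 fragments.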
Last edited by taltamir; 11-21-2010 at 04:09 PM.
|
|
|