I'm working on a PVM program that I'm afraid is going to run into some network bandwidth issues.
In the initialization phase, the program has to generate about 1GB of data via a relatively expensive operation. So I want to split this up over the cluster. But all of the nodes need all of the data.
So, if I have 16 nodes, each node will generate 1000MB/16 = 62.5MB (data generated by each node)
It then has to send that data to the other 15 nodes: 62.5MB * 15 = 937.5MB (network traffic generated by each node)
Each node generates this much traffic, so: 937.5MB * 16 = 15000MB = 15GB (total traffic on the network)
Looking at it another way ... each node has to receive 937.5MB of data. Ideally this would take at least 75 seconds on a 100baseT net (yeah, I know ... but the myrinet is down and not likely to get fixed soon). In practice, with this huge multicast going on, I think it will be considerably slower.
Ok, so finally to the question ...
Does anybody know of a C library that I could use to compress the data array before packing it into the PVM message? I think it may be worth trading the CPU time for the network bandwidth if I could get maybe 2-to-1 compression. Would take some benchmarking to know for sure.
