Wow. JPEG 2000 (OpenJPEG) compression needs a lot of RAM.

Mark R

Diamond Member
Oct 9, 1999
I was adding JP2 support to a server app I'd written, so I'd put together a .NET wrapper class for the OpenJPEG JPEG 2000 codec.

I was testing it on some images from my Canon 5D2, and it worked great processing one image at a time. However, in production it was going to have to parallelize the work, because it would be doing thousands of images in a batch.

So I converted the code from:
Code:
foreach (BatchInfo bi in BatchList) {
    using (Bitmap bmp = (Bitmap)Bitmap.FromFile(bi.sourceFile)) {
        JP2.CompressToFile(cParams, bmp, bi.destFile);
    }
}

To

Code:
ParallelOptions pOpt = new ParallelOptions();
pOpt.MaxDegreeOfParallelism = 4;
Parallel.ForEach(BatchList, pOpt, (bi) => {
    using (Bitmap bmp = (Bitmap)Bitmap.FromFile(bi.sourceFile)) {
        OPJWrapper.CompressToFile(cParams, bmp, bi.destFile, OPJCODECType.JP2);
    }
});

The first worked great. The second imploded every time, and if I changed it to two threads it imploded some of the time.

It took me most of a day of tinkering to work out what had happened: I'd run out of 32-bit address space, the library was failing with an out-of-memory condition, and on occasion it was taking the .NET framework down with it.
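
If I'd thought to check up front, a guard along these lines would have saved me most of that day. This is just a sketch I'd add now, not code from the app: a 32-bit process only gets roughly 2 GB of user address space, so the parallelism needs capping (or the batch refusing) when not running as 64-bit.
Code:
// Sketch only, not the app's actual code: cap parallelism when running
// as a 32-bit process, where the ~2 GB user address space can't hold
// several large JPEG 2000 compressions at once.
int maxParallel = Environment.ProcessorCount;
if (!Environment.Is64BitProcess)
{
    Console.Error.WriteLine("32-bit process: compressing one image at a time.");
    maxParallel = 1;
}
ParallelOptions pOpt = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };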

I'd essentially wrapped the whole library, and I hadn't even considered the implications: holding a 20 Mpix image in 24 bpp raw format in RAM, then copying the data to a 96 bpp buffer for compression, the codec's need for internal working space equal to the size of the frame buffer, plus memory sufficient to hold the compressed codestream. For reference, the working set needed to compress one image is about 700 MB.
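
To put rough numbers on that (back-of-the-envelope figures for a ~21 Mpix 5D2 frame, not measured allocations from OpenJPEG):
Code:
// Rough per-image budget; OpenJPEG's exact internal allocations will differ,
// but the order of magnitude matches what I saw.
const long pixels      = 5616L * 3744;     // ~21 Mpix
long srcBitmap24bpp    = pixels * 3;       //  ~63 MB  managed Bitmap
long codecInput96bpp   = pixels * 4 * 3;   // ~252 MB  32 bits/sample x 3 planes
long codecWorkingSpace = codecInput96bpp;  // ~252 MB  roughly another frame's worth
long codestream        = pixels * 3;       //  ~63 MB  generous allowance for the output
long totalBytes        = srcBitmap24bpp + codecInput96bpp + codecWorkingSpace + codestream;
// => roughly 600-700 MB, in line with the ~700 MB working set above.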

The other issue was that .NET really discourages you from writing to "unmanaged" memory. My first go at the wrapper allocated a 96 bpp managed buffer, and then copied it to the codec's 96 bpp input buffer. That way round, the working set was over 1 GB per image!
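
The fix was to expand the 24 bpp Bitmap data straight into the codec's unmanaged input planes. Something along these lines, as a sketch only: the real wrapper interface isn't posted here, so the int*[] planes parameter is just a stand-in for however the unmanaged component buffers get exposed.
Code:
using System;
using System.Drawing;
using System.Drawing.Imaging;

// Sketch: copy a 24 bpp GDI+ Bitmap directly into three unmanaged
// 32-bit-per-sample component planes, skipping the intermediate
// managed 96 bpp buffer. Requires compiling with /unsafe.
static unsafe void CopyBitmapToPlanes(Bitmap bmp, int*[] planes)
{
    BitmapData data = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
                                   ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
    try
    {
        for (int y = 0; y < bmp.Height; y++)
        {
            byte* row = (byte*)data.Scan0 + y * data.Stride;
            for (int x = 0; x < bmp.Width; x++)
            {
                int i = y * bmp.Width + x;
                // GDI+ stores 24 bpp pixels in B, G, R order.
                planes[0][i] = row[x * 3 + 2];   // R
                planes[1][i] = row[x * 3 + 1];   // G
                planes[2][i] = row[x * 3 + 0];   // B
            }
        }
    }
    finally
    {
        bmp.UnlockBits(data);
    }
}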
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
I'm curious: Why JPEG2000, instead of, say, WebP?
 

Cogman

Lifer
Sep 19, 2000
I'm curious: Why JPEG2000, instead of, say, WebP?

JPEG2000 is more standard, really. It isn't great, but it isn't changing any time soon. WebP has a good chance of changing completely in the near future (Google has a habit of breaking backwards compatibility).
 

Mark R

Diamond Member
Oct 9, 1999
Because the industry standard in the particular field I am targeting is JPEG2000, and has been for the last 5 years.

Support for high-resolution, HDR, and hyperspectral images is mandatory in scientific imaging, and JP2k handles all of this.

WebP, while lean and relatively simple, is limited to low resolutions, 8-bit depth, and 3+1 channels. That is satisfactory for web browsing, but it is unsuitable for "professional" or "scientific" use, where containers like TIFF and advanced, complex specifications like JPEG2k rule.
 

BrightCandle

Diamond Member
Mar 15, 2007
Interesting. I have used a custom JPEG 2000 algorithm in an embedded environment with very high resolutions before, so I know it's certainly possible to get the algorithm down to small amounts of RAM if you need to, as well as ensuring it runs within hard real-time constraints. Of course you'll need to write your own and tweak it appropriately, and that took me quite a few months, but it's certainly possible.
 

Cerb

Elite Member
Aug 26, 2000
Assuming you're on a modern system (64-bit Windows, >4GB RAM), why not do it the Unix way, and parallelize by running several program instances? :)
 

Mark R

Diamond Member
Oct 9, 1999
Assuming you're on a modern system (64-bit Windows, >4GB RAM), why not do it the Unix way, and parallelize by running several program instances? :)

It was something I was considering and is something I've done before. I'd always considered forking another process as a rather heavyweight way of parallelizing something, but when your process has a 1GB working set, it's rather moot.

It's funny, because on Unix everything uses pipes and redirection to executables, whereas on Windows people traditionally dynamically link to libraries. Of course, you can use pipes and redirection on Windows and it works great, but it's something I've only done as a last resort, for no real good reason.
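
For what it's worth, the per-process approach on Windows would look something like this. It's only a sketch: "jp2encode.exe" is a made-up single-image worker, and error handling is omitted.
Code:
// Sketch: compress each image in its own worker process, so every
// compression gets its own address space. Requires System.Diagnostics
// and System.Threading.Tasks. "jp2encode.exe" is hypothetical.
ParallelOptions pOpt = new ParallelOptions();
pOpt.MaxDegreeOfParallelism = 4;
Parallel.ForEach(BatchList, pOpt, (bi) => {
    ProcessStartInfo psi = new ProcessStartInfo();
    psi.FileName = "jp2encode.exe";
    psi.Arguments = "\"" + bi.sourceFile + "\" \"" + bi.destFile + "\"";
    psi.UseShellExecute = false;
    psi.RedirectStandardOutput = true;
    psi.CreateNoWindow = true;
    using (Process p = Process.Start(psi)) {
        string log = p.StandardOutput.ReadToEnd();   // drain the pipe before waiting
        p.WaitForExit();
    }
});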

I have used a custom JPEG 2000 algorithm in an embedded environment with very high resolutions before, so I know it's certainly possible to get the algorithm down to small amounts of RAM if you need to, as well as ensuring it runs within hard real-time constraints. Of course you'll need to write your own and tweak it appropriately, and that took me quite a few months, but it's certainly possible.
Interesting. I'm pretty sure that OpenJPEG suffers by being a reference, general-purpose implementation. Everything is done internally as 32 bit, which is unnecessary for most purposes. And indeed, JPEG2000 can be heavily tuned through the use of image tiles and precincts to permit piecewise encoding/decoding.
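
In wrapper terms that tuning would come down to something like the following. The property names are hypothetical (my wrapper's parameter type isn't posted here); they're meant to mirror the tile fields in OpenJPEG's opj_cparameters_t (tile_size_on, cp_tdx, cp_tdy), which let the encoder work on one tile at a time instead of the whole frame.
Code:
// Hypothetical wrapper properties mirroring OpenJPEG's tile parameters.
cParams.TileSizeOn = true;
cParams.TileWidth  = 1024;   // encode in 1024x1024 tiles
cParams.TileHeight = 1024;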
 

Cerb

Elite Member
Aug 26, 2000
It was something I was considering and is something I've done before. I'd always considered forking another process as a rather heavyweight way of parallelizing something
In all fairness, it is, on Windows. But it's fine for big batches of work. On Linux or FreeBSD (not sure about the big Unixes), forking, dying, and joining are heavily optimized, and programs that are made to be communicated with through pipes tend to also be optimized to start up quickly.
 

Markbnj

Elite Member, Moderator Emeritus
Moderator
Sep 16, 2005
www.markbetz.net
It was something I was considering and is something I've done before. I'd always considered forking another process as a rather heavyweight way of parallelizing something, but when your process has a 1GB working set, it's rather moot.

You pay mostly up front in start-up costs. After that, it could actually have some nice benefits given the amount of memory you're using. Individual processes get more RAM, and can scale proportionately to the amount available in the system. It's also easier to measure and understand what's going on with memory utilization at the process level. Given that your problem doesn't involve shared access to resources, i.e. each job gets an image to work on and doesn't need to know anything else past that point, I'd naturally lean toward processes.