
Binary data in Python...

Armitage
Anybody know how to do this??

I'm reading a binary file and getting the byte values out using the array module - works beautifully. Most of the data in these files is interpreted as unsigned char, but a few chunks are unsigned longs. I can get the 4 bytes that go into the unsigned long, but I don't know how to squash them together into one value. In C I'd use a union, I guess.

Any ideas?
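One way to do this without any manual shifting is the standard-library `struct` module; a minimal sketch with made-up byte values, assuming the file stores the long little-endian (use `'>L'` if it's big-endian):

```python
import struct

# Four raw bytes as they might come out of the file (example values).
raw = bytes([0x78, 0x56, 0x34, 0x12])

# '<L' = little-endian unsigned 4-byte long; the format string, not the
# host CPU, decides the byte order.
(value,) = struct.unpack('<L', raw)
print(hex(value))  # 0x12345678
```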
 
Originally posted by: Crusty
bit shifting 🙂

Yea, I'm considering that as we speak - taking it apart bit by bit and then putting it back together again. But again - that's something I know how to do in C but not in Python.
 
Originally posted by: Armitage
Excellent 😀

a_long = (a_byte[0] << 24) | (a_byte[1] << 16) | (a_byte[2] << 8) | a_byte[3]

Which is exactly how you would have done it in C 🙂. And don't try to run it on a mac.
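As a runnable sketch of that snippet, with example byte values, assuming the file stores the long big-endian (first byte most significant):

```python
from array import array

# Unsigned chars pulled from the file with the array module (example values).
a_byte = array('B', [0x12, 0x34, 0x56, 0x78])

# First byte is most significant, i.e. the file is big-endian.
a_long = (a_byte[0] << 24) | (a_byte[1] << 16) | (a_byte[2] << 8) | a_byte[3]
print(hex(a_long))  # 0x12345678
```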
 
Originally posted by: notfred
Originally posted by: Armitage
Excellent 😀

a_long = (a_byte[0] << 24) | (a_byte[1] << 16) | (a_byte[2] << 8) | a_byte[3]

Which is exactly how you would have done it in C 🙂. And don't try to run it on a mac.

Which makes perfect sense given the rest of Python's syntax; I've just never gotten down into the weeds like this in Python before.

I probably should put in a byte-order test of some kind in case somebody ever tries to run it on another arch.
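A sketch of such a test: `sys.byteorder` reports the host's native order, though if you use explicit `struct` format characters (`'>'`/`'<'`) the result is pinned to the file's byte order and the host no longer matters:

```python
import sys
import struct

# The host CPU's native byte order: 'little' or 'big'.
print(sys.byteorder)

raw = b'\x12\x34\x56\x78'

# Explicit format characters decouple the result from the host:
big = struct.unpack('>L', raw)[0]     # file stored big-endian
little = struct.unpack('<L', raw)[0]  # file stored little-endian
print(hex(big), hex(little))  # 0x12345678 0x78563412
```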
 
Originally posted by: notfred
And don't try to run it on a mac.
This may be a dumb question, but why don't they abstract that in the interpreter? Seems kind of silly to have such a simple way to shoot yourself in the foot.
 
Originally posted by: kamper
Originally posted by: notfred
And don't try to run it on a mac.
This may be a dumb question, but why don't they abstract that in the interpreter? Seems kind of silly to have such a simple way to shoot yourself in the foot.

Java does that, doesn't it? Always uses the same byte order internally regardless of architecture.
 
Originally posted by: Armitage
Originally posted by: kamper
Originally posted by: notfred
And don't try to run it on a mac.
This may be a dumb question, but why don't they abstract that in the interpreter? Seems kind of silly to have such a simple way to shoot yourself in the foot.

Java does that, don't they? Always uses the same byte order internally regardless of architecture.
Believe so, so why not Python? Well, I doubt it uses the same byte order. It'd be much simpler to rewrite the operation at runtime, depending on the platform. I've never really thought about this before though. Does C suffer from the same problem? Obviously it would have to be fixed at compile time there.
 
Originally posted by: kamper
Originally posted by: Armitage
Originally posted by: kamper
Originally posted by: notfred
And don't try to run it on a mac.
This may be a dumb question, but why don't they abstract that in the interpreter? Seems kind of silly to have such a simple way to shoot yourself in the foot.

Java does that, don't they? Always uses the same byte order internally regardless of architecture.
Believe so, so why not Python? Well, I doubt it uses the same byte order. It'd be much simpler to rewrite the operation at runtime, depending on the platform. I've never really thought about this before though. Does C suffer from the same problem? Obviously it would have to be fixed at compile time there.

Java is the only language I know of that does this. It's a performance hit if your language byte order doesn't match your architecture byte order, but in my experience, Java isn't too concerned about performance :evil:
 
Originally posted by: Armitage
Originally posted by: kamper
Well, I doubt it uses the same byte order. It'd be much simpler to rewrite the operation at runtime, depending on the platform. I've never really thought about this before though. Does C suffer from the same problem? Obviously it would have to be fixed at compile time there.

Java is the only language I know of that does this. It's a performance hit if your language byte order doesn't match your architecture byte order, but in my experience, Java isn't too concerned about performance :evil:
Well of course it is, but that's a whole other can of worms 😛

But like I said, you don't need to use a different byte order than the platform you're running on. All you have to do is make the language guarantee that "<<" works like so on little endian machines and then change the interpreter (or the jit compiler, in the case of a virtual machine) on a big endian system to change the "<<" to ">>" before executing the code (or compiling it, in the case of a virtual machine...).

So far as I understand it, a c compiler could do the same thing, except at compile time. Just guarantee that "<<" will always work the same way and then switch the instruction used under the covers, depending on your target platform.
 
Originally posted by: kamper
Originally posted by: Armitage
Originally posted by: kamper
Well, I doubt it uses the same byte order. It'd be much simpler to rewrite the operation at runtime, depending on the platform. I've never really thought about this before though. Does C suffer from the same problem? Obviously it would have to be fixed at compile time there.

Java is the only language I know of that does this. It's a performance hit if your language byte order doesn't match your architecture byte order, but in my experience, Java isn't too concerned about performance :evil:
Well of course it is, but that's a whole other can of worms 😛

But like I said, you don't need to use a different byte order than the platform you're running on. All you have to do is make the language guarantee that "<<" works like so on little endian machines and then change the interpreter (or the jit compiler, in the case of a virtual machine) on a big endian system to change the "<<" to ">>" before executing the code (or compiling it, in the case of a virtual machine...).

So far as I understand it, a c compiler could do the same thing, except at compile time. Just guarantee that "<<" will always work the same way and then switch the instruction used under the covers, depending on your target platform.

It's not just the bitshift operators though. Reading & writing binary files and other stuff along those lines has the same sort of problem.
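A sketch of the file side of that: pinning the on-disk order with an explicit `struct` format makes reads and writes agree on any host (an in-memory buffer stands in for a real file here):

```python
import io
import struct

value = 0x12345678

# Write the long big-endian regardless of the host CPU.
buf = io.BytesIO()
buf.write(struct.pack('>L', value))

# Read it back with the same explicit order; the round trip is exact.
buf.seek(0)
assert struct.unpack('>L', buf.read(4))[0] == value
```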
 
It should still handle it under the covers, shouldn't it? Alright, at that point, yeah, I guess I'd fully expect C to allow you to shoot yourself in the foot. I wrote a Huffman code compressor in java a few years back. I'll see if I can dig it up and move compressed files between my mac and pc. If I can find it, rewriting it in Python would be a fun way to learn a bit 🙂
 