Question about position encoding #52

@javagl

This is going to be embarrassing. Maybe for me, but that's a risk that I'm willing to take.

I encountered some issues with SPZ-encoded data versus the original PLY data. The following is an archive that contains the original data as ASCII- and BinaryLE PLY data, as well as in SPZ format, once created with my own tools, and once created with the SPZ library:

spzPositions-2025-08-23.zip

It's not tremendously complex. In fact, this is the ASCII version of the PLY data:

ply
format ascii 1.0
element vertex 8
property float x
property float y
property float z
property float f_dc_0
property float f_dc_1
property float f_dc_2
property float opacity
property float scale_0
property float scale_1
property float scale_2
property float rot_0
property float rot_1
property float rot_2
property float rot_3
end_header
-2050.0 -0.0 -0.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2000.0 -0.0 -0.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2050.0 -50.0 -0.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2000.0 -50.0 -0.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2050.0 -0.0 -50.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2000.0 -0.0 -50.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2050.0 -50.0 -50.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0
-2000.0 -50.0 -50.0 1.7724539 1.7724539 1.7724539 20.0 1.0 1.0 1.0 1.0 0.0 -0.0 -0.0

Yeah, it's a "unit cube" (or rather, a cube with an edge length of 50), somewhere at x=-2000.

Dragging-and-dropping the BinaryLE-PLY file into the BabylonJS sandbox shows the expected result:

Image

Converting this data into SPZ using the SPZ library and dragging-and-dropping the resulting file into BabylonJS gives this result:

Image

That doesn't look right.

I've inserted some debug statements after this line in packGaussians:

    // XXX TEST
    {
      // Decode 24-bit fixed point coordinates
      float scaled = 1.0 / (1 << packed.fractionalBits);
      int32_t fixed32d = packed.positions[i * 3 + 0];
      fixed32d |= packed.positions[i * 3 + 1] << 8;
      fixed32d |= packed.positions[i * 3 + 2] << 16;
      fixed32d |= (fixed32d & 0x800000) ? 0xff000000 : 0;  // sign extension
      float resultPosition = static_cast<float>(fixed32d) * scaled;
      // Print the input, the three packed bytes, and the decoded value
      std::cout << "Position " << i << " was " << g.positions[i]
                << " fixed32 is " << fixed32
                << " converted to " << (int)packed.positions[i * 3 + 0]
                << " " << (int)packed.positions[i * 3 + 1]
                << " " << (int)packed.positions[i * 3 + 2]
                << " back to " << fixed32d
                << " sign check " << (fixed32d & 0x800000)
                << " back to result " << resultPosition << std::endl;
    }

This does the same thing that the decoder does, so that it prints what the encoded bytes will actually be decoded to.

The output of this debug part is

Position 0 was -2050 fixed32 is -8396800 converted to 0 224 127 back to 8380416 sign check 0 back to result 2046
Position 1 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 2 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 3 was -2000 fixed32 is -8192000 converted to 0 0 131 back to -8192000 sign check 8388608 back to result -2000
Position 4 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 5 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 6 was -2050 fixed32 is -8396800 converted to 0 224 127 back to 8380416 sign check 0 back to result 2046
Position 7 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 8 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 9 was -2000 fixed32 is -8192000 converted to 0 0 131 back to -8192000 sign check 8388608 back to result -2000
Position 10 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 11 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 12 was -2050 fixed32 is -8396800 converted to 0 224 127 back to 8380416 sign check 0 back to result 2046
Position 13 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 14 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 15 was -2000 fixed32 is -8192000 converted to 0 0 131 back to -8192000 sign check 8388608 back to result -2000
Position 16 was -0 fixed32 is 0 converted to 0 0 0 back to 0 sign check 0 back to result 0
Position 17 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 18 was -2050 fixed32 is -8396800 converted to 0 224 127 back to 8380416 sign check 0 back to result 2046
Position 19 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 20 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 21 was -2000 fixed32 is -8192000 converted to 0 0 131 back to -8192000 sign check 8388608 back to result -2000
Position 22 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50
Position 23 was -50 fixed32 is -204800 converted to 0 224 252 back to -204800 sign check 8388608 back to result -50

The negative x-coordinate of the first splat (-2050) is encoded in such a way that it becomes positive (2046) when it is decoded. Given that this seems like a pretty fundamental issue, I wanted to confirm that I didn't do something embarrassingly wrong here. I'm prepared for a "🤦‍♂️", but right now, I'm pretty stumped.
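To make this reproducible without the library, here is a minimal standalone sketch of the same round trip. It is not the SPZ library code: the function names `encodeFixed24`, `decodeFixed24` and `fitsInFixed24` are made up for this example, and the value of 12 fractional bits is an assumption that I derived from the debug output (-2050 * 4096 = -8396800). The sketch keeps only the low 24 bits of the fixed-point value when encoding, and sign-extends from bit 23 when decoding, as in the debug block above.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the SPZ API): convert a coordinate to
// fixed point and keep only the low 24 bits as three little-endian bytes.
std::array<uint8_t, 3> encodeFixed24(float value, int fractionalBits) {
  assert(fractionalBits >= 0 && fractionalBits < 24);
  const int32_t fixed32 =
      static_cast<int32_t>(std::round(value * (1 << fractionalBits)));
  const uint32_t bits = static_cast<uint32_t>(fixed32);
  return {static_cast<uint8_t>(bits & 0xff),
          static_cast<uint8_t>((bits >> 8) & 0xff),
          static_cast<uint8_t>((bits >> 16) & 0xff)};
}

// Hypothetical helper: reassemble the three bytes, sign-extend from
// bit 23, and scale back to a floating point coordinate.
float decodeFixed24(const std::array<uint8_t, 3>& bytes, int fractionalBits) {
  uint32_t bits = static_cast<uint32_t>(bytes[0]) |
                  (static_cast<uint32_t>(bytes[1]) << 8) |
                  (static_cast<uint32_t>(bytes[2]) << 16);
  if (bits & 0x800000u) {
    bits |= 0xff000000u;  // sign extension
  }
  // Assumes two's complement conversion (guaranteed since C++20).
  const int32_t fixed32 = static_cast<int32_t>(bits);
  return static_cast<float>(fixed32) / static_cast<float>(1 << fractionalBits);
}

// Hypothetical helper: check whether the fixed-point value fits into a
// signed 24-bit integer, i.e. into [-8388608, 8388607].
bool fitsInFixed24(float value, int fractionalBits) {
  const float scaled = std::round(value * (1 << fractionalBits));
  return scaled >= -8388608.0f && scaled <= 8388607.0f;
}
```

With 12 fractional bits, `decodeFixed24(encodeFixed24(-2050.0f, 12), 12)` yields 2046, while -2000 and -50 survive the round trip, which matches the output above. Under that assumption, `fitsInFixed24(-2050.0f, 12)` is false, because -8396800 is below -8388608, the smallest signed 24-bit value. So a signed 24-bit value with 12 fractional bits can only cover roughly [-2048, 2048), and -2050 falls just outside of that.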
