What are the limits for publishing large splats?

I was hoping to use the new SOG format to test whether a really large scene could run smoothly. Maybe my ambitions were a bit high! :sweat_smile:
The dataset is 26 million splats. The original file was 7 GB, but after converting to SOG format it’s only 600 MB, which is pretty incredible compression.

However, after trying to publish this file, 11 hours later it was still showing “Super Splat SOG upload busy processing.” Is that normal? Should I wait longer, or does this mean it’s stuck?

Also, are there any hard limits we should be aware of to avoid pushing things beyond what’s supported? For example:

  • Max Gaussian splat count (e.g. 1, 5, 10, or 20 million?)
  • Max published file size (either .ply or SOG)?
  • Max boundary size (if any)

Just trying to understand the boundaries to plan better :slight_smile:

@slimbuck

I’d just add that 26M splats might run a bit slow … and you might want to consider the streaming format, which we’re busy adding publishing support for.

Thanks, I’m definitely hoping to take advantage of LOD streaming. That was the original idea behind testing a 26-million-splat scene with the SOG format!
I didn’t realise it wasn’t working yet in publish mode.
I’ll follow the instructions and try to build my own viewer to test it out.

The reason you likely ran into a time bottleneck is that converting from PLY to SOG format requires running k-means clustering to create codebooks. In a bit more detail, its near-lossless compression comes from essentially reusing near-identical shapes or patterns of values (there are codebooks for colour, scales, and rotation; however, rotation doesn’t use k-means, iirc). For example, PLY row 1 might hold 222, 301, 444, and row 120390 might also hold 222, 301, 444. With a codebook, you cluster the rows and store a small index per row that points at a shared entry, so the values are only stored once. The clustering also merges nearly identical values: if row 6 holds 222, 301, 445, it would likely be assigned to the same entry as 222, 301, 444.
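To make the codebook idea concrete, here is a minimal pure-Python sketch of k-means vector quantization on rows like the ones above. It is only an illustration, not the splat-transform implementation: the function names (`build_codebook`, `nearest`), the tiny dataset, and k=2 are all made up for the example, and a real encoder would use much larger codebooks and an optimized k-means.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, row):
    """Index of the codebook entry closest to `row`."""
    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i], row))


def build_codebook(rows, k, iterations=10, seed=0):
    """Plain k-means: returns (codebook, indices), one index per input row."""
    rng = random.Random(seed)
    # Start from k distinct rows picked at random (hypothetical init strategy).
    codebook = rng.sample(sorted(set(rows)), k)
    for _ in range(iterations):
        # Assignment step: group rows by their nearest codebook entry.
        buckets = [[] for _ in range(k)]
        for row in rows:
            buckets[nearest(codebook, row)].append(row)
        # Update step: move each entry to the mean of its assigned rows.
        for i, bucket in enumerate(buckets):
            if bucket:
                codebook[i] = tuple(sum(col) / len(bucket) for col in zip(*bucket))
    indices = [nearest(codebook, row) for row in rows]
    return codebook, indices


# Toy data: three near-identical rows and two rows from a different cluster.
rows = [
    (222, 301, 444),
    (10, 20, 30),
    (222, 301, 444),
    (222, 301, 445),
    (11, 20, 30),
]

codebook, indices = build_codebook(rows, k=2)

# Decoding is just a lookup: each row is replaced by its shared entry.
decoded = [codebook[i] for i in indices]

print("codebook:", codebook)
print("indices: ", indices)
```

The file then only needs the codebook plus one small index per row. The cost is the clustering pass itself, which has to touch every row on every iteration, and that is where the time goes on a scene with tens of millions of splats.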
This codebook compression and optimization is applied to the spherical harmonics, which are the largest and most data-intensive component of any Gaussian splatting format. However, if you do not need super precise colour (it’s comparable to SDR vs HDR, imo), I would recommend running your PLY through the splat-transform CLI with the -sh_bands flag set to 0, avoiding the spherical harmonics altogether. That will make the file smaller, quicker to process, and smoother to run, at the cost of colour fidelity.
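For a rough sense of how much of an uncompressed PLY the spherical harmonics take up, here is a back-of-the-envelope calculation. It assumes the property layout commonly written by the reference 3D Gaussian Splatting trainer with degree-3 SH (62 float32 values per splat, 45 of them higher-order SH coefficients). Your file may use a different layout, so treat the numbers as an estimate only.

```python
FLOAT_BYTES = 4  # assumption: every property is stored as float32

# Assumed per-splat layout (number of float properties per group).
LAYOUT = {
    "position": 3,
    "normal": 3,
    "sh_dc": 3,      # base colour (band 0)
    "sh_rest": 45,   # higher-order SH coefficients (bands 1-3)
    "opacity": 1,
    "scale": 3,
    "rotation": 4,
}


def bytes_per_splat(layout):
    """Uncompressed size of one splat in bytes."""
    return sum(layout.values()) * FLOAT_BYTES


def sh_rest_share(layout):
    """Fraction of each splat taken up by the higher-order SH coefficients."""
    return layout["sh_rest"] / sum(layout.values())


SPLATS = 26_000_000  # the scene size from this thread

full_size = bytes_per_splat(LAYOUT) * SPLATS

# Same layout with the higher-order SH removed (base colour is kept).
no_sh_layout = {name: n for name, n in LAYOUT.items() if name != "sh_rest"}
no_sh_size = bytes_per_splat(no_sh_layout) * SPLATS

print(f"bytes per splat:      {bytes_per_splat(LAYOUT)}")
print(f"higher-order SH share: {sh_rest_share(LAYOUT):.1%}")
print(f"full PLY estimate:    {full_size / 1e9:.2f} GB")
print(f"without SH estimate:  {no_sh_size / 1e9:.2f} GB")
```

Under those assumptions, the higher-order SH coefficients account for most of each splat, which is why dropping them has such a large effect on both file size and processing time.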

LOD streaming will definitely help with performance and runtime. If your concern is max size, count, etc., then from what I can tell from looking through the open-source code there should be no issue, but processing can and will take much longer the bigger the scene gets.
If you are interested in the specifics of the SOG format I would suggest looking at their documentation here - The SOG Format | PlayCanvas Developer Site