Depth issues when enabling a post process effect

Hi all,
We noticed some depth issues that appear when enabling a LUT post process effect.
As soon as the effect is enabled, UI elements start to get rendered above the terrain, as shown in the picture below:


At the same time, a water material that uses the depth grab to implement some depth-based effects gets broken, and it looks like the depth doesn't get updated anymore. Please see the video below (I'm sorry for the low quality, I'm only allowed to upload 8 MB at max):

You can also notice in the video that as soon as I enable the post process effect, the console logs a lot of WebGL errors: GL_INVALID_OPERATION: Depth/stencil buffer format combination not allowed for blit.
I found a couple of places where gl.blitFramebuffer is called, but I couldn't get the code to break at either of the two call sites (webgl-render-target.js and scene-grab.js).
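
One way I might be able to break on those calls is to wrap blitFramebuffer on the raw context (just a sketch, assuming the device is reachable as app.graphicsDevice):

const gl = app.graphicsDevice.gl;                 // the underlying WebGL2 context
const originalBlit = gl.blitFramebuffer.bind(gl);
gl.blitFramebuffer = (...args) => {
    console.log('blitFramebuffer', args);         // log every blit with its arguments
    // debugger;                                  // uncomment to break on each call
    return originalBlit(...args);
};
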
I’ve also tried setting

this.needsDepthBuffer = true;

but it doesn't seem to do anything other than increment the reference count that tells the camera it should grab depth.
I'm not exactly sure why this is happening and why it affects the depth. I'm assuming the post process effect is applied after all layers are rendered, so I don't see why it should affect what is rendered before it. Or maybe my assumptions are wrong?!

Any hint on where to look, how to debug with more info or what might be happening is appreciated!
Thank you!

Enable pc.Tracing.set(pc.TRACEID_RENDER_ACTION, true); to see in what order things get rendered. You will also see where the post-effects get called. Rendering after the post-effects does not have the depth buffer available. Typically post-effects run before the UI, so the UI rendering no longer has depth. You could either move the post-effects to render after the UI (but then the UI gets post-processed as well), or render your world UI on a layer before the post-effects, but it would still be post-processed. Let us know what you find out.
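
For reference, as a snippet (note that the trace output only appears when running a debug build of the engine):

// logs each render action: which layer and camera render to which target, and where post-effects run
pc.Tracing.set(pc.TRACEID_RENDER_ACTION, true);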

Thanks @mvaligursky, I'll let you know how it goes after I figure out how to build PC in debug mode within our monorepo :smiley: (by default it builds prod).

@mvaligursky this is what I get when I enable that tracing channel with the post process effects off:


but I don’t get another trace when I enable the LUT effect because the layer composition doesn’t change.
Should I create a separate layer for the post process effects to better handle the layer composition?

Is it possible to keep the depth buffer or the implementation makes it impossible/impractical?
I noticed that post process effects like bokeh use the this.needsDepthBuffer = true; to work.
Is the depth buffer cleared after the post process effects run?

So the way you have it set up, it seems the 3DUI renders before post-effects, and so it should have depth - so it should be occluded by 3D geometry. Is that what you're seeing? But it just gets affected by post-effects, which is what you don't want? Your capture also seems to show the state without any post-effects enabled, so a render target (RT) is not used and we render directly to the backbuffer.

I assume you use the 3DUI layer when setting up disablePostEffectsLayer on the camera, so the post-processing kicks in after that layer has been rendered.
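
Just to illustrate what I mean (a sketch only; the '3DUI' layer name and the camera entity name are assumptions):

const ui3d = app.scene.layers.getLayerByName('3DUI');   // your world-space UI layer
const cameraEntity = app.root.findByName('Camera');     // hypothetical camera entity name
cameraEntity.camera.disablePostEffectsLayer = ui3d.id;  // controls where post-processing cuts off in the layer stack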

When you enable post effects, all layers up to the post-effect render to a texture. That render target has depth, so all those layers can do depth testing. But then the post-effects render to the backbuffer (that's where the final result needs to go for the browser to use it on the page), and that buffer never had any data written to its depth buffer - so rendering on the following layers (UI) does not have depth.

That was my expectation, but it seems that as soon as I enable the post process effect the UI element renders in front of the terrain, as if there were no depth tests going on anymore (see the first picture). Something strange happens with the water as well: it seems to just hold the last frame's depth from before activating the post process effect. The video shows that if I move the camera around the depth doesn't update anymore; it's like having the same depth "picture" in the background.
I believe it is all connected to the error the browser is spitting out right after enabling the effect: GL_INVALID_OPERATION: Depth/stencil buffer format combination not allowed for blit. Maybe it's trying to blit the depth but failing.

I've gathered two Spector.js captures, one with the effect on and one with the effect off, that I've uploaded here:

With the debug build of PC I'm getting a lot more useful information in the captures (I don't know how I lived without this for so long :smiley: ).
The one with the effect off has a single blitFramebuffer call that blits the depth buffer

and a few draw calls later, when the water starts to get rendered, all looks fine. Here all rendering happens directly on the canvas framebuffer, no RT.

In the other capture, with the effect on, we have two blitFramebuffer calls in the place where there was only the depth one before: one for the depth and one for the color. The color one I don't understand; it shows a RESOLVE-RT for the post process.

These all happen on a render target, and then we have a third one at the end, before rendering to the canvas, that does the blit and renders the actual post process effect:

Maybe some of these give you a hint. Meanwhile I'll have a deeper look; with the debug build I'm getting a lot more useful information that might help spot the issue.
Thanks!

Try in Firefox, that often gives more detailed WebGL errors - it could tell us more about the blit issue.

Also enable

pc.Tracing.set(pc.TRACEID_RENDER_FRAME, true);
pc.Tracing.set(pc.TRACEID_RENDER_PASS, true);
pc.Tracing.set(pc.TRACEID_RENDER_PASS_DETAIL, true);

This might give you further info on what is happening (but not about post-effects yet). This logs every frame … so maybe set it to true only once every 100 frames or so.
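
Something along these lines (a rough sketch, assuming app is your pc.Application instance):

let frameCount = 0;
app.on('update', () => {
    // only trace one frame out of every ~100 to keep the console readable
    const enable = (frameCount++ % 100) === 0;
    pc.Tracing.set(pc.TRACEID_RENDER_FRAME, enable);
    pc.Tracing.set(pc.TRACEID_RENDER_PASS, enable);
    pc.Tracing.set(pc.TRACEID_RENDER_PASS_DETAIL, enable);
});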

Also try to disable antialiasing (I'm assuming that is used, given the RESOLVE blits) - those resolve a multisampled buffer into a single-sampled one.

Rendering without the post effect seems OK - it executes a single blit where it grabs the multisampled depth and copies it to a single-sampled texture.

With post effects there are more blits. Can you post a screenshot with the render actions for that? It should have render target info. And also the render pass info for one frame.

Also, give your camera entities names, to have those in the captures. It seems you might have some cameras created in code without a name (see those Cam: Untitled) - it could also be some internal engine camera, but I think those all have names.

And also see if you have stencil enabled, and disable it if you don't need it, to see if that helps.

I’m back with some findings:

  • I tried running on Firefox; to my surprise the water is broken from the start, and I'm getting similar blitFramebuffer warnings from app start, without having to enable our LUT effect:

    but unfortunately in this case Firefox doesn’t provide additional info.
    Looking inside the scene-grab.js file I noticed that the texture for the depth grab render target is created with the PIXELFORMAT_DEPTHSTENCIL format, which is 24 bits for depth and 8 for stencil.
    The interesting thing is that when I tried forcing the format to PIXELFORMAT_DEPTH, the water material gets broken from the start in Chrome/Edge too, and I get the same blitFramebuffer warnings even before enabling the LUT.
    So it seems that the blitting is the problem, but I can't figure out how the two buffers differ, as I don't know how to get the canvas framebuffer's format for the depth attachment; that one is bound with null in WebGLGraphicsDevice.copyRenderTarget (see the sketch right after this list for what I'm thinking of querying).
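
The best I can think of is querying the default (canvas) framebuffer directly (a sketch only; device is our pc.WebglGraphicsDevice instance), though I'm not sure it reflects the exact internal formats:

const gl = device.gl;
gl.bindFramebuffer(gl.FRAMEBUFFER, null);            // bind the default (canvas) framebuffer
console.log(gl.getContextAttributes());              // the attributes the context was created with
console.log('depth bits:', gl.getParameter(gl.DEPTH_BITS));
console.log('stencil bits:', gl.getParameter(gl.STENCIL_BITS));
console.log('samples:', gl.getParameter(gl.SAMPLES));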

I still haven't enabled the other tracing channels that you suggested; that's next on my list, after figuring out how to get the canvas framebuffer format.
I also tried disabling the stencil (I hope I was doing it right: device.setStencilState(undefined, undefined);), but that didn't seem to make any difference.

Just checked in the SceneGrab - the depth grab texture is PIXELFORMAT_DEPTHSTENCIL, and by default the device has stencil as well (options.stencil defaults to true if not specified), so that's probably not a problem, unless you pass false for it somewhere during creation. The Editor does not expose it, so if you use an Editor project, you get stencil. So that should blit fine - it does in our examples, and I assume those run fine on your machine? Try the PlayCanvas Examples, it should show a little depth map in the bottom right. Make sure to be on WebGL.
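
In other words, the stencil would only be missing from the backbuffer if it was disabled explicitly at device creation, something like this (just an illustration):

const device = new pc.WebglGraphicsDevice(canvas, {
    antialias: false,
    stencil: false   // without this, options.stencil defaults to true
});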

Do you maybe have a depth layer added to some other camera rendering to texture, and that tries to do a depth grab as well?

I tested the example you linked and it displays fine in both Edge and Firefox, with no WebGL blitFramebuffer warnings.

Honestly, I'll have to check; we do have a "debug" camera that shows how our culling works, but that should be disabled by default.
Thank you so much Martin for taking the time to help us with this and in such a timely manner.
I’m going to keep chipping away at this and get back once I have more info.

Do UI elements even use depth testing? I ask because I previously had to manually force depth testing on for them in order to make use of order-independent transparency (which is the only way I could get that to work).

@Adriaaaaan I'm not sure I understand the question, but in our case they do, as they're world-space UI elements (billboards), not screen-space.

@mvaligursky I got to spend more time on this today and I would like to share my findings with you, hoping that you can correct me or give me some hints (especially on how I could disable antialiasing properly). I've shared this with our team as well to keep them in the loop, so please ignore the parts where I explain things too much :smiley:

So I'm trying to describe what's happening based on what I've observed. I'm describing the case where we're testing in Edge/Chrome and we enable the LUT. The same problem happens from application start on Firefox, with no need to enable the LUT, but it has been harder to debug there.

The TL;DR is that the issue is almost certainly due to antialiasing being enabled, or partially enabled, even though the app creation options explicitly disable it, and PC doesn't properly handle the case where we have a scene depth grab while multisampling is enabled. At least that's what I think is happening.

The long story follows:

When the application starts we’re calling

PlayCanvasApp.createApp

that in turn calls

createPlaycanvasAppOptions

That will create a WebglGraphicsDevice, which is the core of the engine and the layer sitting directly on top of WebGL. Now, even when this is created with antialiasing set to false via the graphics device creation options, I've noticed that WebglGraphicsDevice.samples gets set to 4 inside:

WebglGraphicsDevice.initializeCapabilities() {
    // ...
    this.samples = gl.getParameter(gl.SAMPLES); // ---> this returns 4
    // ...
}

So somehow MSAA gets enabled even though I hardcoded the antialias option passed to the graphics device creation to false. I was wondering if this is somehow a property of the canvas we're creating, because this happens right after we get the WebGL context from the canvas through canvas.getContext("webgl"); but I haven't seen any additional attributes set on it for our client when comparing it with the PlayCanvas Post-Process Example.

For all the PC examples, antialiasing is set to off and the depth grab works as expected, and during initialization of the graphics device gl.getParameter(gl.SAMPLES); returns 0.

Where does the problem come from?

A bit of background:

  • when we use depth effects we ask PC to give us a texture with depth information. PC does this by creating a render target called renderTargetSceneGrab, with a texture used to store the depth information called uSceneDepthMap. In order to keep the depth information up to date, after all opaque layers are rendered to the main framebuffer (the default canvas or another RT), it blits/copies the depth information from the main framebuffer to that uSceneDepthMap render target using a gl.blitFramebuffer call. In order for this blit to be successful, the attachments we want to copy (color, depth or stencil) need to have the same format in both render targets/framebuffers.

  • Important to note here is that the render target used for updating the renderTargetSceneGrab buffer is initialized with

    this._samples = Math.min(options.samples ?? 1, maxSamples);
    

    set to 1, so no multisampling is enabled on it.

  • the interesting thing is that on Edge, even though the graphics device is initialized with this.samples set to 4, suggesting a multisampled default framebuffer, that blitting seems to work; but it doesn't work on Firefox, which starts emitting WebGL warning: blitFramebuffer: Depth buffer formats must match if selected. as soon as the renderTargetSceneGrab render target is created: RenderTargetAlloc | Alloc: Id 28 renderTargetSceneGrab: 640x1094 [samples: 1][Depth][Stencil][Face:0]

  • also notice from the above trace that renderTargetSceneGrab has depth and stencil, being set up with the pc.PIXELFORMAT_DEPTHSTENCIL format (24 bits depth, 8 bits stencil mask)

Now what happens in our case when enabling the LUT:

  • another render target is created to be used as the main framebuffer, and it is supposed to be similar to the default one from the canvas. But what it does there is create a multisampled depth buffer attachment with no stencil (RenderTargetAlloc | Alloc: Id 31 MainCamera-posteffect-0: 640x1055 [samples: 4][MRT: 1][Color][Depth][Face:0]) by calling these lines of code:

    gl.renderbufferStorageMultisample(gl.RENDERBUFFER, target._samples, gl.DEPTH_COMPONENT32F, target.width, target.height);
    gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT, gl.RENDERBUFFER, this._glMsaaDepthBuffer);
    

    Notice the format is gl.DEPTH_COMPONENT32F (32 bit depth, no stencil)

  • Now, when it is time to do the scene-grab pass, the blit that copies the depth information from the main framebuffer to our depth render target fails. This happens inside SceneGrab.initMainPath >> onPreRenderOpaque >> device.copyRenderTarget(device.renderTarget, this.depthRenderTarget, false, true); because the gl.READ_FRAMEBUFFER is multisampled with a 32-bit depth, while the gl.DRAW_FRAMEBUFFER is not multisampled and has a 24/8-bit depth/stencil (a standalone sketch of that mismatch follows right after this list)
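
As far as I understand, the same combination can be reproduced standalone with plain WebGL2 (not engine code; sizes and sample count are made up), and it should trigger the same GL_INVALID_OPERATION:

const gl = document.createElement('canvas').getContext('webgl2');

// source: multisampled, 32F depth (like MainCamera-posteffect-0)
const srcFb = gl.createFramebuffer();
const srcDepth = gl.createRenderbuffer();
gl.bindRenderbuffer(gl.RENDERBUFFER, srcDepth);
gl.renderbufferStorageMultisample(gl.RENDERBUFFER, 4, gl.DEPTH_COMPONENT32F, 256, 256);
gl.bindFramebuffer(gl.FRAMEBUFFER, srcFb);
gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT, gl.RENDERBUFFER, srcDepth);

// destination: single-sampled, 24/8 depth-stencil (like renderTargetSceneGrab)
const dstFb = gl.createFramebuffer();
const dstDepth = gl.createRenderbuffer();
gl.bindRenderbuffer(gl.RENDERBUFFER, dstDepth);
gl.renderbufferStorage(gl.RENDERBUFFER, gl.DEPTH24_STENCIL8, 256, 256);
gl.bindFramebuffer(gl.FRAMEBUFFER, dstFb);
gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_STENCIL_ATTACHMENT, gl.RENDERBUFFER, dstDepth);

// a depth blit requires matching depth formats on both framebuffers
gl.bindFramebuffer(gl.READ_FRAMEBUFFER, srcFb);
gl.bindFramebuffer(gl.DRAW_FRAMEBUFFER, dstFb);
gl.blitFramebuffer(0, 0, 256, 256, 0, 0, 256, 256, gl.DEPTH_BUFFER_BIT, gl.NEAREST);
// expected: GL_INVALID_OPERATION, since DEPTH_COMPONENT32F does not match DEPTH24_STENCIL8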

If you do this

new WebglGraphicsDevice(canvas, {
    antialias: false
});

That should definitely not have antialiasing. Is this what you use?

Yes, that is what I use; our creation code looks like this:

      const appOptions = createPlaycanvasAppOptions(this.canvasManager.canvas, {
        autoTick: false,
        elementInput: devices.elementInput,
        keyboard: devices.keyboard,
        mouse: devices.mouse,
        gamepads: devices.gamepads,
        touch: devices.touch,
        graphicsDeviceOptions: Object.assign({}, window.CONTEXT_OPTIONS, {
          antialias: false,
          alpha: false,
          preserveDrawingBuffer: false,
          powerPreference: "default",
          maxPixelRatio: perfOptions.maxPixelRatio,
        }),
        assetPrefix: window.ASSET_PREFIX || "",
        scriptPrefix: window.SCRIPT_PREFIX || "",
        scriptsOrder: window.SCRIPTS || [],
      });
   .......

Notice the antialias: false. The createPlaycanvasAppOptions function looks like this:

export function createPlaycanvasAppOptions(canvas: HTMLCanvasElement, options: ApplicationOptions) {
  const appOptions = new pc.AppOptions();

  appOptions.graphicsDevice = createDevice(canvas, options) as unknown as pc.GraphicsDevice;
  addComponentSystems(appOptions);
  addResourceHandles(appOptions);

  options.autoTick !== undefined && (appOptions.autoTick = options.autoTick);
  options.elementInput && (appOptions.elementInput = options.elementInput);
  options.keyboard && (appOptions.keyboard = options.keyboard);
  options.mouse && (appOptions.mouse = options.mouse);
  options.touch && (appOptions.touch = options.touch);
  options.gamepads && (appOptions.gamepads = options.gamepads);

  options.scriptPrefix && (appOptions.scriptPrefix = options.scriptPrefix);
  options.assetPrefix && (appOptions.assetPrefix = options.assetPrefix);
  options.scriptsOrder && (appOptions.scriptsOrder = options.scriptsOrder);

  appOptions.soundManager = new pc.SoundManager();
  // appOptions.lightmapper = pc.Lightmapper as unknown as pc.Lightmapper;
  appOptions.batchManager = pc.BatchManager as unknown as pc.BatchManager;
  return appOptions;
}

function createDevice(canvas: HTMLCanvasElement, options: ApplicationOptions): pc.WebglGraphicsDevice {
  if (!options.graphicsDeviceOptions) {
    options.graphicsDeviceOptions = {};
  }
  options.graphicsDeviceOptions.alpha = options.graphicsDeviceOptions.alpha || false;

  return new pc.WebglGraphicsDevice(canvas, options.graphicsDeviceOptions);
}

Notice that in createDevice we're doing

return new pc.WebglGraphicsDevice(canvas, options.graphicsDeviceOptions);

and inside WebglGraphicsDevice.initializeCapabilities I'm getting:

I tried in the example here, which runs the latest engine: PlayCanvas Examples

I added antialias: false and also tried with the default true, and device.samples is 0 and 4 respectively, which is correct. How come this does not work on your side … can you try the same steps to confirm it's not some platform / browser issue?

See the first and last lines of the selection; those are the lines I added.
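
If it helps, the check boils down to something like this (just a sketch, canvas being your canvas element):

const device = new pc.WebglGraphicsDevice(canvas, { antialias: false });
console.log(device.samples);   // expected 0 here, and 4 with the default antialias: true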