Positional Audio for Video

I have a scene with a video texture; it’s shown as a TV screen in the scene. The video has audio, but it really needs to have positional audio so that you only hear it when you’re next to the TV.

My current plan (the only option I can think of at the moment) is to store the video’s audio in a separate file and play that audio track on the video’s play event. My only concern with keeping them as totally separate entities is potential de-sync if the video has buffering issues or something like that.

Before implementing any of that, though, I wanted to make sure there’s not a simpler way of doing all this :-). Basically I’m just wondering: is there any other simpler way to make video audio positional?

I’m not aware of a way to do this other than how you suggest (namely storing the audio in a separate MP3 file and playing that back via the positional sound component). Presumably, you can read the video time and the audio time and adjust either one to bring them back in sync if they drift apart.

I tried that, but had issues. I did it this way (without success):

  • Play audio on ‘canplay’ video event
  • Sync the audio to the video every 100 frames, based on the video’s currentTime and the audio’s currentTime

Unfortunately they didn’t line up for some reason (even though I was using an audio track directly exported from the video). I learned that if I offset the audio by 0.3 seconds it lined up with the video in Chrome and Edge, but in Firefox it had to be offset more like 0.5 seconds to line up. After hours of experimentation I dropped the idea.
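For reference, the periodic re-sync described above might look roughly like the sketch below. It is only an illustration under assumptions: `resyncAudio`, `offsetSeconds` and `toleranceSeconds` are hypothetical names, and the audio object is assumed to expose a writable `currentTime` the way an `HTMLMediaElement` does (an engine’s positional sound component may expose seeking differently). The fixed offset stands in for the empirically found 0.3 s / 0.5 s values; it does not explain why they differ between browsers.

```typescript
// Anything with a readable/writable playback position in seconds.
// HTMLVideoElement and HTMLAudioElement both satisfy this shape.
interface Timed {
  currentTime: number;
}

// Snap the audio to the video's position (plus a fixed offset) when the
// two have drifted further apart than the tolerance.
// Returns true if the audio position was corrected.
function resyncAudio(
  video: Timed,
  audio: Timed,
  offsetSeconds = 0,       // hypothetical per-browser offset, found by experiment
  toleranceSeconds = 0.05  // hypothetical: ignore drift smaller than this
): boolean {
  const target = Math.max(0, video.currentTime + offsetSeconds);
  const drift = audio.currentTime - target;
  if (Math.abs(drift) <= toleranceSeconds) {
    return false; // close enough; seeking too often can itself cause glitches
  }
  audio.currentTime = target;
  return true;
}
```

Calling this every N frames (or on a timer) with the video and audio elements keeps the correction out of the per-frame path; the tolerance avoids seeking on every check.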

What I ended up going with was simply faking positional audio on the video. That is, I used the original video audio (so no need to load extra audio tracks), and every frame (though it could be done less) I calculated the distance between my audio listener and the video textured object. Then I set the video volume based on a custom fall-off curve and max fall-off distance.

It works great as far as volume, but obviously it’s missing some other positional components (e.g. panning). So…it’s really just a “faked” approach, but it works fine in some situations and is definitely lighter since it doesn’t need the extra audio track(s).
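A minimal sketch of that distance-based volume, assuming a simple power curve as the fall-off (the actual custom curve isn’t given here). `Vec3`, `distanceBetween`, `falloffVolume` and the `exponent` parameter are hypothetical names for illustration only.

```typescript
type Vec3 = { x: number; y: number; z: number };

// Straight-line distance between two points in world space.
function distanceBetween(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Map a distance to a volume in [0, 1]:
// 1 at the source, 0 at or beyond maxDistance.
// exponent = 1 gives a linear fall-off; larger values drop off faster.
function falloffVolume(distance: number, maxDistance: number, exponent = 2): number {
  if (maxDistance <= 0) {
    return 0;
  }
  // Normalise to [0, 1] and clamp so the result never leaves the valid range.
  const t = Math.min(Math.max(distance / maxDistance, 0), 1);
  return Math.pow(1 - t, exponent);
}
```

Each update you would then do something like `video.volume = falloffVolume(distanceBetween(listenerPos, tvPos), 10)`, where `listenerPos` and `tvPos` come from your engine’s entities.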

I’d like to look into it more as time allows.

Could it be due to video buffering behaviour in different browsers?

I don’t think so, since I was syncing regularly (every 100 frames), not just at the beginning.

With the Web Audio API you can create an audio source node from a media stream or, in your case, from the video element. That should allow you to attach a positional effect node to it.
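A rough sketch of that idea using standard Web Audio calls (`createMediaElementSource`, `createPanner`). The function name `attachPositionalAudio`, the `Point3` type and the chosen panner settings are assumptions for illustration, not something from an engine API. Note that once the element is routed through the audio graph, its sound no longer plays directly; cross-origin video without CORS headers produces silence; and browsers usually require a user gesture before the `AudioContext` will run.

```typescript
type Point3 = { x: number; y: number; z: number };

// Route a video element's audio through a PannerNode placed at `position`.
// Returns the panner so its position can be updated later if the TV moves.
function attachPositionalAudio(
  ctx: AudioContext,
  video: HTMLMediaElement,
  position: Point3,
  maxDistance = 10 // hypothetical default; tune for your scene scale
): PannerNode {
  // From now on the element's audio is only audible via the graph below.
  const source = ctx.createMediaElementSource(video);

  const panner = ctx.createPanner();
  panner.panningModel = "HRTF";    // adds left/right panning cues
  panner.distanceModel = "linear"; // volume reaches 0 at maxDistance
  panner.refDistance = 1;
  panner.maxDistance = maxDistance;
  panner.rolloffFactor = 1;

  panner.positionX.value = position.x;
  panner.positionY.value = position.y;
  panner.positionZ.value = position.z;

  source.connect(panner);
  panner.connect(ctx.destination);
  return panner;
}
```

The listener side (`ctx.listener` position and orientation) still has to follow the camera each frame for the effect to track the player. For a `MediaStream` rather than an element, `ctx.createMediaStreamSource(stream)` can take the place of `createMediaElementSource`.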

It’s worth asking @vaios whether he could come up with an example where a sound slot’s source is taken from a video. That way you would only need to point the sound slot on the sound component at the video’s audio source, and the rest might just work. It might not be as simple as that, though.

Sweet; I wondered if that was possible, but I’ve only had a chance to briefly look into it and didn’t find anything in that time. I’d love to hear @vaios’s thoughts.

I’m facing this exact issue: a video is streamed at run time onto a TV (video texture), and I need the audio to be positional (3D). Is this possible, or is there an alternative?

In my case I cannot download the audio and place it in the editor, as it is a live stream.

This can be achieved by using the Web Audio API directly. Made an example project: