Morphing through Audio

So I’m working on a rudimentary facial animation script based on sound.

https://launch.playcanvas.com/579047?debug=true

The premise is simple:

  • using the audio visualizer tutorial, I’m taking the freqData from the audio to get values from 0 to 1.
  • these values are passed via .setWeight() to the MouthOpen blendshape of my model.
  • I’m using the tween.js library to make the movements less jagged: update() calls renderData(), which starts a 0.5 sec animation from the current weight to the new weight and prevents any other animation from starting until the previous one has completed (a sketch of this pipeline follows the list).
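Roughly, the pipeline looks like this (a minimal sketch, not my actual code: the slot name “track”, morph target index 0 for MouthOpen, and the /255 normalization are stand-ins for my setup):

var LipSync = pc.createScript('lipSync');

LipSync.prototype.initialize = function () {
    var context = this.app.systems.sound.context;    // Web Audio context
    this.analyser = context.createAnalyser();
    this.analyser.fftSize = 32;
    this.freqData = new Uint8Array(this.analyser.frequencyBinCount);

    // route the slot's output through the analyser
    var slot = this.entity.sound.slot('track');
    slot.setExternalNodes(this.analyser);

    this.tweening = false;                           // only one tween at a time
};

LipSync.prototype.update = function (dt) {
    TWEEN.update();                                  // advance tween.js
    if (this.tweening) return;                       // wait for the current tween

    this.analyser.getByteFrequencyData(this.freqData);
    var sum = 0;
    for (var i = 0; i < this.freqData.length; i++) sum += this.freqData[i];
    var target = (sum / this.freqData.length) / 255; // normalize to 0..1

    // assuming the model exposes a morph instance with MouthOpen at index 0
    var morph = this.entity.model.meshInstances[0].morphInstance;
    var state = { w: morph.getWeight(0) };
    this.tweening = true;
    new TWEEN.Tween(state)
        .to({ w: target }, 500)                      // the 0.5 sec animation
        .onUpdate(function () { morph.setWeight(0, state.w); })
        .onComplete(function () { this.tweening = false; }.bind(this))
        .start();
};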

Although I have movement, I’m looking for a bit of help on how to filter the audio or tweak the values that are coming out of the freqData so that the mouth movements are more precise.

For example, there are moments of silence in the track where the model still has her mouth open. Technically those should produce a 0 value and close the mouth.

Questions:

  • Is there a way to smooth out the data? Maybe I can rely less on tween and just setWeight per value.
  • Is there a way of only getting the more relevant peaks (imagine having a CONTRAST slider for the sound wave)?
  • EXTRA POINTS: Are there any libraries out there that might help me to better capture data from the audio which might help to move into more detailed animation (ex: Can we detect an “O” sound and morph accordingly?)

I tried working with Tone.js but got stuck on how to pass the audio data to that library.

They do:

var player = new Tone.Player({
	"url" : "./audio/FWDL.[mp3|ogg]",
	"loop" : true
}).connect(meter).toMaster();

but I grab my audio as:

var slot = this.entity.sound.slot("track");
slot.setExternalNodes(this.analyser);

I could probably pass an asset value to that URL, but I’m not sure I want to transfer all audio control to that library. Thoughts?


Your work sounds very cool and interesting, but I was so disappointed when I got this on my screen:

[screenshot: an error page, the launch link wasn’t publicly accessible]

To pass data to Tone.js, I guess you should use this method:

[screenshot: a Tone.js API method for loading buffer data]

I hope you won’t have any problems obtaining the sound buffer data. You can prepare it however you want.
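Something like this might work (just a sketch on my part: I’m assuming your sound asset is called “track” and that pc.Sound exposes its decoded AudioBuffer as .buffer; I think Tone.Player accepts an AudioBuffer in place of a URL):

var meter = new Tone.Meter();

// reuse the buffer PlayCanvas has already decoded for the asset
var asset = this.app.assets.find("track", "audio");
var audioBuffer = asset.resource.buffer;   // pc.Sound's underlying AudioBuffer

// hand the raw buffer to Tone.js instead of a URL
var player = new Tone.Player(audioBuffer).connect(meter).toMaster();
player.loop = true;
player.start();

This still makes Tone.js the one playing the sound, though, so it only half-answers your concern about handing over control.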

To smooth the data you probably just need some math.
If you can show your data, I can probably suggest something more specific…
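For example (a generic idea, not specific to your data): an exponential moving average gives you a single “smoothness” knob and removes the need for overlapping tweens:

var smoothed = 0;

// blend each raw sample toward the previous output; call once per frame
function smooth(raw, dt) {
    var rate = 10;                        // higher = snappier, lower = smoother
    var k = 1 - Math.exp(-rate * dt);     // frame-rate independent blend factor
    smoothed += (raw - smoothed) * k;
    return smoothed;
}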

Oh man, I just made the project public. Could you give that another try?
I greatly appreciate your input.

Definitely math will do the trick.

Okay, but in 8 hours 🙂


Ok, so now I see that you wanna make this girl sing your song, right?

So for sure, you have to isolate the vocal frequencies and use them to control the mouth.

There is a nice demo of how to isolate the voice and mute it, but you need the opposite: mute everything except the voice.
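A crude stopgap you could try in the meantime (my assumption, not real vocal isolation): put a band-pass filter around the typical vocal range in front of the analyser:

var context = this.app.systems.sound.context;

// attenuate everything outside a rough "vocal" band
var bandpass = context.createBiquadFilter();
bandpass.type = "bandpass";
bandpass.frequency.value = 1000;   // centre of a rough vocal range, in Hz
bandpass.Q.value = 0.7;            // controls how wide the band is

// slot output -> bandpass -> analyser -> speakers
bandpass.connect(this.analyser);
var slot = this.entity.sound.slot("track");
slot.setExternalNodes(bandpass, this.analyser);

Note that this also filters what you hear, because setExternalNodes inserts the chain into the actual playback path.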

I can’t fork your project, since I can only launch it, so you’ll have to do it yourself.

After that, when you have an audio node with only the vocals in it, you can get frequency bands from it.
Create an analyser, connect it to your node, and get the data like this:

this.bands = new Uint8Array(this.analyser.frequencyBinCount);
this.analyser.getByteFrequencyData(this.bands);

Oh, don’t forget about AnalyserNode.fftSize to set your data size (it must be a power of 2 between 32 and 32768; frequencyBinCount is half of it).
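Putting those pieces together (vocalNode is a placeholder for whatever node ends up carrying the isolated vocals):

var context = this.app.systems.sound.context;

this.analyser = context.createAnalyser();
this.analyser.fftSize = 64;          // power of 2, giving 32 frequency bins

vocalNode.connect(this.analyser);    // feed the analyser from the vocals-only node

this.bands = new Uint8Array(this.analyser.frequencyBinCount);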

Now you have an array of values:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …]

Probably you can even set fftSize to its minimum (32; the spec doesn’t allow 1) to get close to a single clear value, but I’m not sure that will work.

Afterwards, just use pc.math.smoothstep(0, your_max_vocal_power, value) (arguments are min, max, then the value to map) and you will get a value from 0 to 1; then animate your mouth with that value.

Also, you can add me to your project and I’ll try it myself. Or just follow these steps:

A working sample for getting how loud your sound is:

var value = 0;

// assuming freqData was filled by analyser.getFloatFrequencyData(),
// each band is a level in dB (around -100 for silence)
for (var i = 0; i < analyser.freqData.length; i++) {
    value += analyser.freqData[i];
}

value = value / analyser.freqData.length;        // average level in dB
value = pc.math.smootherstep(-100, -50, value);  // map roughly -100..-50 dB to 0..1
console.log(value.toFixed(1));

Okay, so for now I’m stuck on isolating the vocals.

I did it, but the result is really dirty, so we need another way to isolate them, because the karaoke (phase inversion) method doesn’t really allow it:

https://www.soundonsound.com/sound-advice/q-can-remove-vocals-track-using-phase

Unfortunately, removing everything else to just leave the solo vocal cannot be achieved with any kind of variation of this technique, simply because of the variation in the stereo panning of every other source. In order for all of the (unwanted) backing to cancel out when the polarity of one channel is reversed, it would need to be identical in both channels, with the vocal panned to one side.
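For reference, here is the phase-inversion (“karaoke”) trick in Web Audio terms (a sketch; source stands in for whatever node plays the track). It cancels centre-panned content, and, as the quote explains, you can’t turn it around to keep only the vocals:

// split the stereo signal, flip the right channel's polarity, and sum:
// centre-panned material (usually the vocal) cancels out
var splitter = context.createChannelSplitter(2);
var inverter = context.createGain();
inverter.gain.value = -1;              // polarity flip
var mix = context.createGain();

source.connect(splitter);
splitter.connect(mix, 0);              // left channel, untouched
splitter.connect(inverter, 1);         // right channel, inverted
inverter.connect(mix);                 // inverted right summed with left
mix.connect(context.destination);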

Today I met a DJ guy who is absolutely sure he knows how to do that.

My project: https://playcanvas.com/project/536009/overview/audio-fork