This reminds me of how people are unaware that pixel art for consoles was designed around the limitations of the medium, which is why it looks like shit without scanlines.
Not just the CRT scanlines, but also the NTSC distortion of the signal, which blended in the dithered pixels.
Yes, they knew the audio recording devices had limitations, and they were making voices that sounded best within those limitations, not something to be cleaned up later.
Except when we're training an AI voice clone, we aren't making it for a fake 1930s movie - we're making it for a mod for a contemporary video game, where high-quality audio is expected.
But it still sounds wrong.
Would you rather have a mixture of voices (in the same game or same mod), where some have heavy static (and zero bass) while others sound ultra-HD?
I forgot to unsubscribe from ElevenLabs, so they charged my credit card $5 yesterday. So I decided to use up all the monthly credits as soon as possible, and then unsubscribe.
So, here is some stuff from my fictional world's lore:
Yes, I used the Stronghold Crusader narrator voices and Stronghold Crusader soundtrack.
Probably a lot. And more varied input in some cases. Sometimes, when you don't have enough samples, you can train an ElevenLabs voice on what you have, have it read a book chapter or something, and use the output as further input for RVC.
If only I had known this earlier.
(I generated a ton of conversations between Baldur's Gate 1-2 characters for the lulz using ElevenLabs. I still have them saved on my HDD.)
Ah, but I think you can continue training an already trained model.
And that is exactly what I am doing.
Turns out it's taking longer than I expected, though. I can guarantee that I will post the 500-epoch retrained versions this week, just not today.