xvasynth

Interim batch (14 voices); Second community voice actor model; xVATrainer nearing completion; Hardware upgrade

Added 2022-02-25 22:28:08 +0000 UTC

A fair bit to update on! Let's get into it.

First of all, I've been going through more v1->v2 voice re-training. The 13 re-trained voices are (all Skyrim):

- Serana
- FemaleCondescending
- MaleBrute
- MaleYoungEager
- Frea
- MaleEvenTonedAccented
- MaleDrunk
- Galmar
- Esbern
- Tullius
- Ulfric
- Kodlak Whitemane
- Arngeir

Over the next few days, I will update the FuzRoBork integration plugin with pre-cache files for these voices.

I'll finish off a few more voices, and then I'll get the next poll going, for some brand new voices.

Next, we have our second community voice model, from a voice actress named Ellie Mars. As mentioned last time, please keep in mind fair use of the voice, to avoid any trouble. The voice model is licensed as CC BY-NC, which just means don't use it for commercial products.

If you've been active on the Discord server, you may be aware that I've recently made a lot of progress on xVATrainer, and it is now nearing completion. All the initial v1.0 features and components are in, save for some smaller non-critical bits and bobs that need polishing/finalizing. The app is usable and, teething issues aside, pretty much good to go. In fact, the first ever voice trained through xVATrainer is done - the second community voice!

The first round of early testing and feedback has started, and I will be spending a bit more time polishing things up, along with implementing any feedback that I get. I'll post more about this soon, after which a slightly wider beta test period will start.

One annoying issue with the app, compared with the scripts, is that HiFi-GAN training is currently stuck on num_workers=0, because any higher value makes it inexplicably quit() without errors after a deterministic number of training iterations. This means that this small stage of the training is running at about half the speed it could be. If you're experienced with PyTorch and want to help me find a fix for this, let me know!
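For anyone less familiar with the setting: num_workers is the PyTorch DataLoader argument that controls how many background processes load batches, and 0 means all loading happens in the training process itself. Below is a minimal sketch of how the two configurations might be kept side by side. It is not xVATrainer's actual code: the helper name loader_kwargs, the batch size of 16, and the worker count of 4 are all made up for illustration.

```python
from typing import Any, Dict


def loader_kwargs(num_workers: int, batch_size: int = 16) -> Dict[str, Any]:
    """Build keyword arguments for torch.utils.data.DataLoader.

    Hypothetical helper; batch_size=16 is an arbitrary placeholder.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be >= 0")
    kwargs: Dict[str, Any] = {
        "batch_size": batch_size,
        "shuffle": True,
        "num_workers": num_workers,  # 0 = load batches in the main process
    }
    if num_workers > 0:
        # Only valid with worker processes: DataLoader raises a ValueError
        # if persistent_workers=True is combined with num_workers=0.
        kwargs["persistent_workers"] = True
    return kwargs


# Current workaround: everything runs in the training process.
safe = loader_kwargs(num_workers=0)
# What one would want once the silent exit is understood (4 is arbitrary).
fast = loader_kwargs(num_workers=4)

# With PyTorch installed, either dict can be passed straight through:
#   from torch.utils.data import DataLoader
#   loader = DataLoader(my_dataset, **safe)  # my_dataset is hypothetical
```

Keeping the worker count in one place like this makes it easy to flip between the safe and fast settings while hunting for the cause of the exit.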

Finally, I have some good news on the hardware side of things. I've been saving up the donations from Patreon, proceeds from a temporary second job (other than the PhD), and some money of my own, and I recently significantly upgraded the hardware in my workstation, boosting CPU and GPU compute further, along with a crap-ton more RAM. Voice training has been very noticeably faster since I finished setting things up!

This would not have been possible were it not for the amazing support I've had from everyone here, and I can't thank everyone enough! Not only does the upgrade mean faster voice training, but it also means faster research, as I'll begin to set my sights on v3 models.

In other updates, we're at ~4500 Steam activations and ~1000 unique users!