xvasynth

More v2.0 models; Public v2.0 release date; xVATrainer

Added 2021-10-21 13:10:11 +0000 UTC

The training scripts have been whirring away, and with some further work I've gotten them to be faster and higher quality still.

I've been working through some base voices to use for pre-training future voices, as well as some voices for use in the v2.0 showcase video (which is now finished). The v2.0 voices are as follows:

- Skyrim: FemaleEvenToned
- Skyrim: MaleEvenToned
- Oblivion: FemaleAltmerBosmerDunmer, and Morrowind: FemaleDunmer
- Fallout 3: FemaleGroupRaider
- Overwatch: Widowmaker

There's also Fallout 4: Nate, but I'm not yet happy with this version (it was mostly trained before my most recent quality improvements). I'll likely re-attempt it, but you can still try it out in the meantime, if you'd like.

Speaking of which, the planned date for public release is this Saturday! I'll have a new poll ready very shortly, and we can go back to training new voices again, this time using v2.0 models.

Finally, now that the v2.0.0 build is finished, I am switching my main development efforts/attention to xVATrainer, the secondary app which will be used for training voices. That way, anyone can train voices (without having to get the environment and dependencies installed right, write code, or anything like that).

The (rough) steps are as follows (not necessarily in this order):

1. I quickly finish adding in some new data pre-processing tools that I've written since originally adding the tools
2. I design and draft up the UI/flow for model and batch model training
3. [the hard part] I set up a model training backend, with all the necessary groundwork for managing model training instances, and inter-process communication, for multiple model types
4. (done together with 5 or 6) Implement a TensorBoard-style graph for losses and maybe some metrics, alongside text log feedback
5. (or 6.) Integration of modified FastPitch v2.0 model training into the framework
6. (or 5.) Integration of HiFi-GAN model training into the framework
7. Harmonize everything into a good batch training flow, to allow automated training of a list of voices
8. Optimizations, and other "glue" features
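
To give a rough idea of what the batch training flow in the steps above could look like, here is a minimal Python sketch. It is not xVATrainer code: the names `ProgressMessage`, `train_stage` and `run_batch` are hypothetical, the training itself is a placeholder that does nothing, and running FastPitch before HiFi-GAN for each voice is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical stage names, based on the two model types named in the steps.
# The order (FastPitch first, then HiFi-GAN) is an assumption.
STAGES = ("fastpitch", "hifigan")


@dataclass
class ProgressMessage:
    """One status update that a backend could send to the UI."""
    voice: str
    stage: str
    status: str  # "started" or "finished"


def train_stage(voice: str, stage: str) -> None:
    """Placeholder for real model training; does nothing in this sketch."""


def run_batch(
    voices: Iterable[str],
    report: Callable[[ProgressMessage], None],
) -> List[str]:
    """Train each voice in order, stage by stage, reporting progress."""
    finished = []
    for voice in voices:
        for stage in STAGES:
            report(ProgressMessage(voice, stage, "started"))
            train_stage(voice, stage)
            report(ProgressMessage(voice, stage, "finished"))
        finished.append(voice)
    return finished


if __name__ == "__main__":
    # Print each update; a real app would forward these to its UI instead.
    run_batch(["Nate", "Widowmaker"], print)
```

Here `report` is just a callback. In a setup where training runs in its own process, the same messages could instead be put on a queue and read by the UI, which is roughly where the inter-process communication part would come in.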

I'll have more updates on this as I finish off steps.