NokiMo
xvasynth

patreon


Technical (?) post #3 - xVA modding support for other games

xVASynth is built to be very easily extendable to other games. It is mainly just a framework to serve voice models for games, so nothing in the code is tied to any particular game. I have already done everything necessary to support the games listed on the description page, for which I train voice models. However, if people wish to provide support for other games, I fully support and actually encourage this (I would prefer to take care of the existing xVA base games myself, though). The process of adding support for a new game is really easy.

I will eventually release a separate app which people can use for training their own voice models. However, I still haven't actually finalised the process myself! I am still changing things, so I don't yet have a final step-by-step procedure that I can package up. Rest assured though, this will be coming.

In the meantime, if you know how to train voices yourself, or if there are voices I've trained which are also used in other games (e.g. same voice actor), I am happy for people to re-distribute these to the other Nexus pages. There are a few quick steps to take to include support for other games.

  1. Create the game skin data for the xVA app
  2. Prepare the model(s) for use with the new game
  3. Let me know about this, so that I know to keep track of it, and link to it from the xVA page. You should also include a link back to one of the xVA pages, so that people know where to download the actual tool from. If the mod is JUST a voice (eg not an actual mod as well), I'd also ask that you include "- xVASynth" at the end of the title, so it's searchable via xVASynth - but this is not a hard requirement.

Important: Complete support for other games is functional starting with v1.2.2 of xVASynth.

-----

Step 1.

For this step, please refer to the images in the /resources/app/assets folder. Here you will see the background images used in the app for a specific game (among some other files, ignore those). You will notice that the file naming is a bit strange, but these file names contain ALL the information needed for supporting that game. To add your new game, pick a nice (do make it a good one) background image representative of the game, and simply follow the same naming convention, namely:

<gamename>-<hex colour for the skin>-<prefix>-<presentable game name>.jpg  (or .png)

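Let's go through each of these:

  1. <gamename>: a short, one-word ID for the game (e.g. "falloutnv"). The same value appears as the "gameId" in a voice's .json file (see Step 2).
  2. <hex colour for the skin>: the hex colour used for the game's skin in the app (e.g. "8197ec" for a light blue).
  3. <prefix>: the short app prefix for the game (e.g. "SK" or "NV").
  4. <presentable game name>: the full, readable name of the game (e.g. "Fallout New Vegas").

You can use either .jpg or .png images. So to recap, let's look at a couple of examples: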

The Skyrim asset file has "skyrim" for the <gamename>, "8197ec" for the light blue <hex colour for the skin>, "SK" as the app <prefix>, and "Skyrim" for the <presentable game name>.

The Fallout New Vegas asset file has "falloutnv" for the short (one-word) <gamename>, "8c2119" for the dark red <hex colour for the skin>, "NV" for the <prefix>, and "Fallout New Vegas" for the <presentable game name>.
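To make the naming convention concrete, here is a rough Python sketch that splits an asset file name into its four fields. This is not xVASynth's own code, just an illustration. The two file names used below are put together from the field values described above, and the .jpg extension on them is an assumption. The names GameSkin and parse_asset_name are made up for this example.

```python
from pathlib import Path
from typing import NamedTuple


class GameSkin(NamedTuple):
    game_id: str       # <gamename>, e.g. "skyrim"
    hex_colour: str    # <hex colour for the skin>, e.g. "8197ec"
    prefix: str        # <prefix>, e.g. "SK"
    display_name: str  # <presentable game name>, e.g. "Skyrim"


def parse_asset_name(filename: str) -> GameSkin:
    """Split an asset file name into its four hyphen-separated fields."""
    path = Path(filename)
    if path.suffix.lower() not in {".jpg", ".png"}:
        raise ValueError(f"expected a .jpg or .png file, got {filename!r}")
    # Split on the first three hyphens only, so the presentable
    # game name (which may contain spaces) is kept in one piece.
    parts = path.stem.split("-", 3)
    if len(parts) != 4:
        raise ValueError(f"expected four hyphen-separated fields in {filename!r}")
    game_id, hex_colour, prefix, display_name = parts
    is_hex = len(hex_colour) == 6 and all(
        c in "0123456789abcdefABCDEF" for c in hex_colour
    )
    if not is_hex:
        raise ValueError(f"{hex_colour!r} is not a six-digit hex colour")
    return GameSkin(game_id, hex_colour, prefix, display_name)


# Hypothetical file names built from the two examples above.
print(parse_asset_name("skyrim-8197ec-SK-Skyrim.jpg"))
print(parse_asset_name("falloutnv-8c2119-NV-Fallout New Vegas.jpg"))
```

Running your new file name through something like this before publishing is a quick way to catch a missing field or a mistyped colour.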

I am happy to do this myself if someone requests it, and include the asset file with an xVA update. If you don't want to wait for that, you can include it with your mod (or as a separate mod file, whatever you wish). You can also do both, if you want me to eventually integrate the new asset file into the main xVA download (credit given, of course).

That's all that's needed for a new game to be supported. However, the game won't show up in the dropdown, unless there is a model available for it. As an end-to-end example of a new game being added, I have now added support for Fallout 76 to xVASynth! This is now a base game supported by xVA (though it's currently impossible to get the dialogue data from the game files...). I will create the mod page for this soon, when I have more voices for it (by seeing if there are any common voice actors with other Bethesda games, like the robot).

I will include this in a future xVA update, but I've attached the asset file to this post, until then. I will describe in Step 2 how I ported over the Assaultron voice from Fallout 4 to Fallout 76. 


Step 2.

This step is for preparing the models. If you explore the data from an existing voice model which I have published, you will notice that there are three files (or four, if the voice has a bespoke HiFi-GAN vocoder): a .pt file for the voice model, a .wav file for the audio preview (played when right-clicking a voice in the voice selection panel), a .hg.pt file if there is a bespoke vocoder, and lastly, a .json file. This .json file contains the metadata that tells the app how to load the voice model. The contents of this file are as follows:

{
    "version": ____,
    "modelVersion": ____,
    "modelType": "FastPitch",
    "emb_size": ____,
    "games": [
        {
            "gameId": ____,
            "voiceId": ____,
            "voiceName": ____,
            "gender": ____,
            "emb_i": ____
        }
    ]
}

tl;dr, you just need to change the "gameId" value, if you are using an existing model. Optionally (recommended) "voiceId" as well. Otherwise, let's go through all of these:

So looking at an example, let's take Cass from New Vegas. The files for this currently are:
nv_cass.json
nv_cass.pt
nv_cass.wav

And the contents of the .json file are:

{
    "version": "1.0",
    "modelVersion": "1.0",
    "modelType": "FastPitch",
    "emb_size": 6,
    "games": [
        {
            "gameId": "falloutnv",
            "voiceId": "nv_cass",
            "voiceName": "Cass",
            "gender": "female",
            "emb_i": 0
        }
    ]
}
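If you are editing one of these .json files by hand, a quick sanity check like the Python sketch below can catch a dropped or misspelled key. It only checks that the keys shown in the template above are present. Whether the app strictly requires every one of them is an assumption on the part of this example, and the function name missing_keys is made up.

```python
import json
from typing import List

# Keys taken from the template above. Treating all of them as
# required is an assumption made for this example.
TOP_LEVEL_KEYS = ("version", "modelVersion", "modelType", "emb_size", "games")
GAME_KEYS = ("gameId", "voiceId", "voiceName", "gender", "emb_i")


def missing_keys(metadata: dict) -> List[str]:
    """Return the template keys that are absent from a voice's metadata."""
    missing = [key for key in TOP_LEVEL_KEYS if key not in metadata]
    for i, game in enumerate(metadata.get("games", [])):
        missing += [f"games[{i}].{key}" for key in GAME_KEYS if key not in game]
    return missing


# The Cass example from above.
cass = json.loads("""
{
    "version": "1.0",
    "modelVersion": "1.0",
    "modelType": "FastPitch",
    "emb_size": 6,
    "games": [
        {
            "gameId": "falloutnv",
            "voiceId": "nv_cass",
            "voiceName": "Cass",
            "gender": "female",
            "emb_i": 0
        }
    ]
}
""")

print(missing_keys(cass))  # prints [] because every template key is present
```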

This is not a necessary step, but to keep things well organised, I would recommend adding a prefix to the "voiceId" value you use, formed as the lower case version of the <prefix> value from Step 1, with an underscore after it. So for New Vegas, where the <prefix> is NV, the prefix is "nv_". Again, this is not necessary, but it does keep things cleaner. Also, keep this same name across all the files for a voice (the .wav, the .pt, the .json, and if there's a vocoder, the .hg.pt as well - so nv_cass.hg.pt if this example had a vocoder too).

For the Assaultron example above, I re-named all the files from f4_robot_assaultron to f76_robot_assaultron, and I adjusted the .json file. I have attached the .json file as well.
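The renaming and .json adjustment can also be scripted. Below is a minimal Python sketch of that idea, not the procedure I actually used: it copies a voice's files under a new base name and rewrites "gameId" and "voiceId" in the copied .json. The function name port_voice and its parameters are made up for this example, and the folder layout (all of a voice's files sitting together in one folder) is an assumption.

```python
import json
import shutil
from pathlib import Path


def port_voice(src_dir, dst_dir, old_id, new_id, new_game_id):
    """Copy a voice's files under a new name and point its .json at a new game."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    # Copy the model, the audio preview and (if present) the bespoke vocoder,
    # keeping the same base name across all of the files.
    for ext in (".pt", ".wav", ".hg.pt"):
        src = src_dir / f"{old_id}{ext}"
        if src.exists():
            shutil.copy2(src, dst_dir / f"{new_id}{ext}")

    # Load the metadata and update only the entry for the voice being ported.
    text = (src_dir / f"{old_id}.json").read_text(encoding="utf-8")
    metadata = json.loads(text)
    for game in metadata["games"]:
        if game["voiceId"] == old_id:
            game["gameId"] = new_game_id
            game["voiceId"] = new_id

    out_path = dst_dir / f"{new_id}.json"
    out_path.write_text(json.dumps(metadata, indent=4), encoding="utf-8")
    return out_path
```

For the Assaultron, a call would look something like port_voice("f4_voices", "f76_voices", "f4_robot_assaultron", "f76_robot_assaultron", "fallout76"). The two folder names and the "fallout76" game ID here are placeholders. Use the <gamename> from your own asset file as the game ID.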


Step 3.

You are done, and ready to release. I give permission for people to re-distribute models I've published (on the Nexus) onto other NEXUS pages. I do ask that you link back to one of my xVA pages, mainly so people know where to download the xVA app from (which I'd rather didn't get re-published, for multiple reasons).

Please do let me know as well, if you are making a release. Just because I want to know :) . But also, I'm happy to link to it from the xVA description page, if you are too.

Finally, though it's fine if you'd rather not, it would make things easier to search for if you include " - xVASynth" at the end of the mod title when the mod is only a voice re-release (or a new voice). Don't worry about it if there's more to it than just that.

If you do re-distribute a voice model that I have already trained, I suggest keeping an eye on it (track the mod page or something), because I may release updates to improve the quality of the voice (especially early on). You'll likely then want to also update your release, to make sure you're publishing the best version.

---

Good luck! As always, the best way to contact me, whether to notify me of a release or to ask for help with this, is on Discord (though comments here, or PMs on the Nexus, are also fine - just slower).

I will also release a short version of this post on the Nexus, as an article at some point in the future. But as it's likely a first-come first-served basis, I would of course want to give that to people here :) . I would STRONGLY recommend waiting until I have Tacotron2 versions of the voices, before re-releasing, just because the quality is much much better (your release will go down much better when the quality of the voice is good).


Comments

Ah yeah, I remember that repo. I think even the author said not to use it, due to how bad it was to set up. Thank you for the support, yes, you got that right. I'll leave the transcription process to people to do themselves, but the app (I'm imagining) will have you put the .wav|text data in a folder, and you have different tabs for the different models that need training. Perhaps another electron app just like xVASynth, with buttons as input, and a terminal feed/graph to show the training process. Maybe a couple of inputs for some tuning. I'll get started on this in earnest when I'm done finalising what the training procedure should be.

"I will eventually release a separate app which people can use for training their own voice models." This is what I'm here for. Last year I had JCorentin's Realtime Voice Cloner set up with Tensorflow - but 1: It was a huge PITA to get set up, 2: The state of the art has clearly moved quite a bit in 2 years - if you look at his commercial Resemble.ai company, it's pretty shocking how far it's come. 3: The quality and editability of the TTS input in RVC to stress or draw out sounds was nowhere near what you've displayed here. I'm so stoked to hear that there might be a relatively straightforward app that can be used to train specific voices (For me, among them are the World of Warcraft voices - unfortunately there aren't *that* many samples available, so RVC falls very, very flat on them) I'm hoping that the process is more or less transcribing the dialog in the input audio files (so there's a transcription of the input audio for the supervised learning part), then ...maybe?...train the model by pointing it at a folder full of wav/MP3/whatever files and identically named txt files that contain the transcription. A guy can hope, right?

Jonathon Barton

