NokiMo
Bambi Sleep

Bambi behind the scenes

By the time a new Bambi Sleep track makes its way to you, it’s gone through many different iterations, and every line of the script has been obsessed over and tweaked probably hundreds of times. This is a quick behind-the-scenes look at the work that goes into producing the tracks, which I hope you’ll find interesting!

Preparation

Usually by the time I start properly working on a new track I already have a rough outline of how I want it to turn out in the form of page upon page of jumbled notes: programming concepts, specific phrases to use, what sorts of backing audio elements I want, whole segments of script, and whatever other random ideas I’ve scribbled down.

It’s a mess and not usable as-is, so I start by coming up with a rough structure for the track - usually 4 or 5 main sections, each with a particular purpose. Then I go through all the notes and sort and organise them by section, and try to get everything into an order where the progression makes sense and the ideas flow naturally into one another.

If there are some specific sound effects I know I’m going to need, I might also spend some time searching sample libraries for audio material and editing things together into usable backing tracks at this point.

Script-writing

Next it’s time to turn the rough notes into an actual hypnosis script. That means going through the whole thing and writing out line-by-line what the main vocal track is actually going to say.

I usually end up reorganizing as I go: reordering things to fit better, coming up with new ideas, and expanding on old ones. I’ll add side-notes with any ideas I have for other audio elements or background effects that would be appropriate in each section.

4 or 5 pages of rough notes might become 7-10 pages of written script.

Keeping the length of the script from spiralling out of control is the most difficult part. Good scripts need to be both concise and effective. I spend a lot of time deliberating over phrasing and word choice, cutting things down and deciding which concepts to dedicate more time to.

I always have some idea in mind of how long I want the finished track to turn out, and know roughly how many minutes of audio a given number of pages of script will translate into. I try to pack the most bang-for-the-buck into every minute of it, but the finished script pretty much always ends up longer than initially planned.

TTS conversion

At this point it’s time to fire up the TTS software and turn the raw script into spoken audio files that flow properly and sound right.

This is a laborious process. Even high-quality TTS produces a lot of jarring and unnatural speech if just fed plain English. So I go through the whole script again line-by-line and make the TTS engine play each line back over and over, making small changes one at a time until it sounds natural.

TTS voices are fickle, and the timing and pronunciation of the spoken output can be changed by garbling the text in certain ways. This means things like misspelling words or running them together, adding hyphens in weird places, and so on.

The effects are never quite predictable, and fixing one part of a sentence will often suddenly make another part before or after sound all wrong. I’ve gotten a lot better at guessing what sort of massaging might make a particular phrase come out right, but it’s still very time-consuming. A correctly-pronounced line often doesn’t look all that much like it did to begin with!

While I’m doing this, I’ll also reword the script where necessary to make it flow better, and make any improvements that jump out at me now that I have a script for the whole track: reordering or reworking things, adding triggers, and adding foreshadowing or references to other parts of the script or to the scripts of other tracks.

With the script prepared for TTS, I render it out as audio.

For older tracks, I used to convert each section of the script into one long continuous audio file, and rely on special tags in the text itself to insert pauses of varying lengths. This wasn’t flexible enough to achieve the kind of tight timing and synchronization I wanted for the newer material. (It also made it more awkward to make changes to the script at a later stage, once the whole track was assembled, since changing one line in a segment would alter the timing of everything after and require a lot of work to sync everything back up again.) Starting with the Mental Makeover session, I render every line of the script out from the TTS engine into a separate audio file.
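To illustrate why per-line rendering makes later changes cheaper, here is a minimal Python sketch. It is not the actual tooling: the `Clip` type, the durations, and the pause lengths are all made up for the example. It compares a continuous layout, where each line's start depends on everything before it, with a layout where each line is pinned to its own position on the timeline.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    duration: float  # length of the rendered speech, in seconds (made-up values)


def sequential_starts(clips, pauses):
    """Continuous-file layout: each line starts where the previous line plus its pause ends."""
    starts, t = [], 0.0
    for clip, pause in zip(clips, pauses):
        starts.append(t)
        t += clip.duration + pause
    return starts


def shifted_lines(before, after, tol=1e-9):
    """Indices of lines whose start time moved between two layouts."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if abs(a - b) > tol]


clips = [Clip("line 1", 2.0), Clip("line 2", 3.0), Clip("line 3", 2.5)]
pauses = [1.0, 0.5, 1.0]

# Re-render line 1 so that it comes out 0.4 s longer.
edited = [Clip("line 1", 2.4)] + clips[1:]

old_layout = sequential_starts(clips, pauses)
new_layout = sequential_starts(edited, pauses)
moved = shifted_lines(old_layout, new_layout)  # every line after the edit moves

# Per-line layout: each clip is pinned to an absolute position on the timeline,
# so swapping in the re-rendered line 1 leaves the other start times alone.
pinned = {"line 1": 0.0, "line 2": 3.0, "line 3": 6.5}
pinned_after_edit = dict(pinned)  # nothing needs to change
```

In the continuous layout, a single re-rendered line moves every later line, and anything synchronized to those lines has to be lined up again. With separate files, only the edited clip has to be checked against its neighbours.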

Audio production

With the speech rendered, everything’s finally ready to load into audio production software and put together.

For the first pass, I focus on the main vocals and the elements that need to be synchronized with them. I start with some standard background tracks and add in the TTS files one sentence at a time, inserting finger snaps, vocal responses, and other foreground sound effects as necessary.

I spend a lot of time here lining everything up right so it flows nicely when listened to. The makeover tracks especially use audio effects to keep a steady rhythm at all times (metronome, wipers, more to come…). I’m not sure if people notice this just listening to it, but it’s much more hypnotic when everything, up to and including the syllables and emphasis of the main voice, is timed to the beat. It’d sound off otherwise.
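The idea of timing things to the beat can be sketched in a few lines of Python. This is only an illustration of snapping rough positions to a steady grid, not a description of how the tracks are actually assembled; the tempo, the onset times, and the function names are assumptions made up for the example.

```python
def beat_period(bpm):
    """Seconds between beats at the given tempo."""
    return 60.0 / bpm


def snap_to_beat(t, bpm, offset=0.0):
    """Move a time (in seconds) to the nearest beat of a grid that starts at `offset`."""
    period = beat_period(bpm)
    return offset + round((t - offset) / period) * period


def snap_all(onsets, bpm, offset=0.0):
    """Snap a list of rough onset times to the beat grid."""
    return [snap_to_beat(t, bpm, offset) for t in onsets]


# Hypothetical rough positions of four spoken lines, in seconds.
rough_onsets = [0.1, 1.9, 4.2, 5.6]

# At an assumed 60 beats per minute the grid falls on whole seconds.
snapped = snap_all(rough_onsets, bpm=60)
```

Snapping whole lines like this only gets the starts onto the grid. Getting syllables and emphasis to land on the beat still takes the manual tuning described here.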

I often have to go back to tune the TTS pronunciation or reword the script if I can’t get things to flow naturally. In extreme cases I’ll resort to chopping up the rendered TTS audio samples and altering the timing of syllables manually.

The main vocals and foreground effects that go along with them are the driving force of a track, but while I’m putting together these core elements piece by piece I’m also working on the overall structure. That structure is created by the way rhythmic effects, mantras, and layers of background textures evolve and fade into one another to match each part of the script.

I generally know what the most important audio elements for each part of the script will be already and will arrange those as I go, grabbing things I’ve used already or creating new ones if necessary.

When the whole thing is done, I go back for another pass-through to tweak and fill in the subtler bits of background audio and make sure everything feels consistent from start to finish.

Just for fun, here's a gigantic screenshot of the project file for one of the newer tracks (open in a new tab for more detail):

QA

Even though I’ve listened to each line and every part of the track to the point of insanity by now, there are still a lot of little details I might not notice until I experience the whole thing through as intended, without touching the audio software: too much or too little going on in a particular section; awkwardly timed effects or sudden unexpected changes; tricky bits of pronunciation that stick out unnaturally in context; parts where the script seems to be building up to something (maybe a particular trigger) that never comes; and so on.

So now the track is close to finished, but I need to try it out, see if anything feels off, and hopefully remember what it was to go back and fix it (not always easy). Changes at this stage tend to be time-consuming: all the details are in place, and altering the timing of something deep in the middle can mean a lot of fiddling to sync everything else up with it.

Usually it takes more than one round of QA before I’m happy. When I can listen to the whole track through without getting jolted out by something I want to change, it’s ready to give to all of you!

Parting thoughts

The methods I use to produce this stuff have come a long way since I started out, but I hope you agree the results are worth it. Each of the different stages above can take 5-10 evenings or more of solid work for a 10-15 minute track, so each finished product represents easily hundreds of hours of work, not even counting other distractions (I spent probably a month or two just sorting through samples before starting work on the Mental Makeover series in order to have a library of things to use and fewer interruptions).

I just mention this to offer some explanation as to why it takes so long to produce new material. I also have a full-time job and a lot of other stuff going on in my life (and just need breaks sometimes!), so I never have as much time as I’d like to work on things. It might sound over-the-top for weird internet kink stuff, but it’s important to me to do something unique with each release and create the highest-quality material I can. And I like to think I’m doing a bit to push the boundaries of what can be done with the format.

Thank you all for being so patient and sticking with me!

Comments

This is so incredible! People don't really realize the heart it requires to be this passionate about something to make it such high quality and organized. Truly beautiful work

Dolly Crash

Thank you for all your hard work! :)

chris hackett

i don't have the money to buy myself into the patreon, if you can, support me please. here over patreon or i have linked up my paypal over the contact information, the name is my official public/societal name.

i'm the creator of this program. please upload it in high quality. i want it in 32-bit/384 kHz at least. my women are crying over the 44 kHz 221 kbit/s, at least here on patreon. FLAC at 1411 kbit/s, 44 kHz is 80s tech. many DACs support even 768 kHz and they are not even that expensive. the FiiO Q3 or Shanling UA5 (the UA2 does that too) do it. Also for my women listening to this wanting to experience it better: I've made an audio equipment set for under $300 and it will be better than everything else you get for the price. It's the KZ ZEX Pro Electric, but you need to buy balanced output cables for it. i have the Linsoul Tripowin 8 Core 2.5mm QDC Adapter one. they fit with the KZ in-ears. They are very comfortable, you can listen with them in your bed and roll onto your side, the form is adapted to the physiology of the human ear. And for the audio processing you get yourself the FiiO Q3, it's a newer-gen mobile DAC, it's microchips rather than capacitors, which is way better, we don't live in the 80s-90s anymore. if you get yourself a DAC with 4.4mm balanced output, buy yourself the cable for it, but i advise the 2.5mm since the Shanling UA5 will be the best for mobile use and it has 3.5mm unbalanced and 2.5mm balanced, which is what you want. the FiiO Q3 supports 2.5mm and 4.4mm, so you can use the same cable and in-ears for both amplifiers. If you have the money get the FiiO Q3 and the Shanling UA5 and test them out. but trust me, get the balanced audio cable, especially the one i advised, the distortions get filtered out and the in-ears get out of the low frequencies. they get a bit more aggressive, but they will fit daytime consciousness way better, which is important for trigger induction. the sound characteristic and precision is nearly perfect for what you would expect from $40 in-ears.

