Dovetail will be using an AI voice generator for TSW5. It will hopefully make it easier for them to incorporate train announcements in the future as it should cut down on production time.
Indeed. To allow for voiceovers, Matt will be talking, but his voice will be overwritten by an AI voice. The people who supplied those voices get paid, the voices have an element of DTG Matt's tone to them, and, since AI voices are much easier to work with spontaneously, we get voiceovers while still getting good scenarios. I also suspect the AI voices are used to radio the driver in conductor mode on the San Bernardino Line, but I'm probably wrong.
Something it should help with is changes to the tutorials. The previous voiceovers were made quite early in the development cycle, meaning that alterations weren't possible as the actors had been and gone. You do occasionally see minor alterations in the text, possibly a small elaboration - never a major change. With this new system, updates would be feasible. It may also mean more extensive tutorial voiceovers with more content, but that's up to DTG.
Matt is flexible, but has limited vocal range. Others are less flexible, plus... We get to hear female Matt P!
I wonder if they'll graduate to AI audio/video for streams and the Roadmap. That would be kind of interesting, all kinds of possibilities.
So in other words, this does not improve matters regarding changes to the tutorials, their extent, etc.
Replace Emily Turner with a DTG Matt Character, and then he can tell us about the disruption on the route!
Correct. Although, it opens up the possibility of voice-overs for scenarios again. As for their scope, nothing new.
TTS works pretty much fine now (especially in "big" languages); our transport service now uses it for temporary announcements about disruptions etc. as well. That said, German content will never be complete without Heiko Grauel's ACHTUNG.
Just to pick up on what one of those screenshots states. Most railways actually don't use TTS. It's typically a real person recording separate sentences or phrases that the computer strings together, hence why the tone keeps shifting. For example: The next train at/ platform 4/ is the/ twelve/ twenty-seven/ LNER Azuma/ service to/ London King's Cross.
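To make the "strings together" bit concrete, here's a rough Python sketch of how a concatenative announcement could be assembled from pre-recorded fragments. It is only an illustration: the `Departure` class, the `NUMBER_WORDS` table and the `announcement_fragments` function are all made up for this example and are not taken from any real station system.

```python
from dataclasses import dataclass


@dataclass
class Departure:
    # Hypothetical record describing one departure.
    platform: int
    hour: int
    minute: int
    operator: str
    destination: str


# Hypothetical lookup of spoken forms; a real system would have a recorded
# clip for every number it needs. Only the example's values are listed here.
NUMBER_WORDS = {12: "twelve", 27: "twenty-seven"}


def announcement_fragments(dep: Departure) -> list[str]:
    """Return the ordered fragments a player would play back to back.

    Each entry stands in for a separately recorded clip, which is why the
    tone can shift at every boundary.
    """
    return [
        "The next train at",
        f"platform {dep.platform}",
        "is the",
        NUMBER_WORDS[dep.hour],
        NUMBER_WORDS[dep.minute],
        dep.operator,
        "service to",
        dep.destination,
    ]


dep = Departure(4, 12, 27, "LNER Azuma", "London King's Cross")
print("/ ".join(announcement_fragments(dep)))
# The next train at/ platform 4/ is the/ twelve/ twenty-seven/ LNER Azuma/ service to/ London King's Cross
```

Every fragment boundary is a join between two separate recordings, which matches the shifting tone you hear on real platforms.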
Rather contradictory as to whether they're using text to speech (which would give the claimed benefits) or speech to speech (which wouldn't)...
We use a combination of the two. And both give the benefits we need. If I need to re-record some speech at the last minute, I can just pull a microphone out and do it. I recorded all the tutorials in about three hours one evening at the end of the dev cycle once all the scripts were finalised, and then Adam worked on the speech renderings over the next three or four days - that's for the entire 17 tutorials (which is about double last year). Previously it's been a multi-week exercise that required timing and scheduling from multiple voice actors and studio time along with onsite production at the studio - a huge endeavour and one that we didn't see a way forward on with this, alongside the desire to allow the tutorials to receive more testing and feedback prior to getting their VO added. The US and German VO is primarily done with TTS; all the British stuff is done with STS (because otherwise it tried to apply my British accent to all the US voices and the result was comically bad) - and given that's gone well, we might just move fully to TTS going forwards, but we'll see what people think. Yes, the San Bernardino radio and engineer dialogue was done using the same mechanic, a literal last-minute "ooh, we could get some voice in for a radio message here if you want?", which would have been an unheard-of comment prior to using this tech, as it would have required production voice recording to be set up again and planning months in advance. I think there's a half dozen slight variations. The tutorials this year have benefited from it; like I said, 17 tutorials was a mammoth undertaking, all fully rewritten from scratch, rescripted, reviewed numerous times, played through and then VO added. I'm sure they're not perfect, but I'm comfortable they are in a significantly better place than they would have been - we would likely have lost VO this year to be honest otherwise, because I'd rather lose the VO than continue with tutorials that just don't quite hit the mark.
We are looking at passenger and train announcements as I've said in the past, and yes, the defaults for routes would use a set of generated voices like this, and then where we can get appropriate local ones licensed we're building the system to allow those to be slotted in and used instead. Matt.
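For anyone wondering what "slotted in and used instead" might look like, here's a tiny Python sketch of a fallback lookup. To be clear, this is a guess at the general idea only, not how DTG's system works: the `pick_voice` function, the dictionary layout and the voice pack names are all invented for illustration.

```python
def pick_voice(route: dict, licensed_voices: dict, default_voices: dict) -> str:
    """Prefer a licensed local voice for the route, else use the generated default.

    All names here are hypothetical: `licensed_voices` maps route name to a
    licensed voice pack, `default_voices` maps region to a generated voice pack.
    """
    local = licensed_voices.get(route["name"])
    if local is not None:
        return local
    return default_voices[route["region"]]


# Hypothetical data: one route has a licensed local voice, the other does not.
licensed = {"Route A": "licensed_local_pack"}
defaults = {"UK": "generated_uk_pack", "DE": "generated_de_pack"}

print(pick_voice({"name": "Route A", "region": "UK"}, licensed, defaults))
# licensed_local_pack
print(pick_voice({"name": "Route B", "region": "DE"}, licensed, defaults))
# generated_de_pack
```

The point of the sketch is just that a route never ends up without announcements: the generated voice is always there, and a licensed one overrides it when available.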
I have never done the core VO (or at least not for a few years anyway), that's always been professionally done in a studio. I've only ever done DLC. So, for core, that's a massive difference. Plus, there is a big difference between a "final" recording and one that is going to be re-mastered via speech to speech. If there's a background noise (common) such as cleaners banging around, or someone swearing outside in the dockyard, I have to stop and start again until I have a clean recording on each phrase, as cleaning it up in post serves to reduce the quality massively. With the speech to speech, it doesn't matter; that stuff is all naturally gone because it's focusing on the voice and recreating new audio. In short, even just for the DLCs it's much simpler, quicker and easier for me to record than if my recordings are going to be the final voice. Plus, this way, it can be a variety of different voices and not just me all the time, I'm sure you're all fed up of that! Matt.
So will that mean that in scenarios where you go and talk to someone on the platform to start a mission, they will speak to you instead of you having to read text?
Potentially - though the text will never go away as it's needed for accessibility reasons, but being able to get voices in the scenarios I think will be a nice step up. That said, I just want to reiterate that while it's a desire, there's no plan to do that right now.
It's a great topic around immersion boosting. Signaller contact, driver handovers, announcements of real-time delays when you're in stations behind schedule etc etc. It could elevate the product IMHO.
I’d rather have AI-converted announcements than none at all. If this is what DTG have to do to bring announcements to the game then I’m all for it.
The AI voices are good. A shame there is no German-accented AI voice though. Definitely time to start implementing more voices, like in scenarios. I would add a radio effect to the AI voices in tutorials.
The voices as usual are extremely irritating. We should be able to turn them off, especially the guy with the UK
This is a cool change; many of the advantages have already been mentioned! Should make getting voices into the game much less of a hassle.