It does use AI for the voices, but god dang if it doesn’t sound like a genuine talkie version, with Gary Owens in particular sounding like they genuinely voiced it.

  • @[email protected]
    10 days ago

    It sounds really impressive to me. Looks like a lot of hours went into this, although the 100 hours on generating and selecting the voice lines would be more interesting if it was split between human labour and GPU run time.

    To me this is a great use case: it’s a free mod for an old game, and I can’t imagine anyone could/would pay actual voice actors to do this.

    • Synthetic Voice Generation: Tortoise-TTS can take hours to generate hundreds of vocal lines on a GPU. Additionally, you may try generating multiple samples at once, as the quality and delivery may vary with the random seed.
    • Quality Control: Filter out bad audio samples and select the best ones. If a vocal line isn’t working, try generating more, correcting any issues with the text, or rewriting it more phonetically. For example, rewriting “EVA” as “E.V.A.” or “II” as “Two”.
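
The phonetic rewriting step could be scripted rather than done by hand. Here is a rough sketch in Python, not the mod author's actual tooling: the function name `rewrite_for_tts` and the table `PHONETIC_REWRITES` are made up for illustration, and only the two entries in the table come from the quoted text above.

```python
import re

# Hypothetical replacement table. Only these two entries come from the
# quoted notes; anything else would have to be added per game script.
PHONETIC_REWRITES = {
    "EVA": "E.V.A.",
    "II": "Two",
}


def rewrite_for_tts(line, rewrites=PHONETIC_REWRITES):
    """Return `line` with whole-word matches swapped for phonetic spellings."""
    # Try longer keys first so a short key cannot shadow a longer one.
    keys = sorted(rewrites, key=len, reverse=True)
    # The lookarounds keep "II" from matching inside "III" or "EVA" inside "EVAN".
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: rewrites[m.group(1)], line)


if __name__ == "__main__":
    # Made-up sample lines, not taken from the game.
    print(rewrite_for_tts("Put on the EVA suit"))
    print(rewrite_for_tts("Welcome to Chapter II"))
```

The match is deliberately case-sensitive and whole-word only, so ordinary words that happen to contain those letters are left alone. Lines that still sound wrong would need a listen and a manual rewrite anyway.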

    Rough estimates.

    • 100 Hours of work generating, selecting, splicing and cleaning audio files.
    • 30 Hours of programming with the SCI Companion.
    • 1/2 Hour on SCIProgramming.com.