[Generative AI] DeepVoice - Text To Voice

Discussion in 'Assets and Asset Store' started by AiKodex, Jul 9, 2023.

  1. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    DeepVoice is an ultra-realistic Text To Voice AI solution. This tool can create voices from text, trim, combine and equalize audio files. Choose from 80+ voices.

    No sign-up, No API Keys, no recurring payments, no subscription fees, no additional costs, just one-click easy to use inferences on our voice model.

    ABOUT

    DeepVoice is a LAM (Large Audio Model): a set of networks and libraries, made for Unity, capable of lifelike voice generation from text using AI and deep learning.



    INVOICE NUMBER
    You can find the Invoice Number here : https://assetstore.unity.com/orders
    Enter this invoice number to gain access to the Voice Generator. Please contact us at info@aikodex.com if you need any assistance.

    QUOTA

    30,000 characters per month (refreshed every 15 days -> 15,000 characters) of voice over and narration takes with DeepVoice. 15,000 characters translates to 5 pages of 12-point text in Calibri. This quota is issued on the 15th and the 1st of every month.

    LINKS
    Works in real time in both Edit Mode and Play Mode inside the Unity Editor. This asset has a one-click, beginner-friendly GUI and does not require any coding to use.

    Website and Support | Documentation

    Pipelines Supported: Standard, HDRP, URP and SRP. (All)

    FEATURES
    Text to Voice Converter: The main function of the asset is to provide you with ready for production voices. Simply enter the text to be voiced out and click on generate.

    Examples for prompting:

    Narration / Dialogues / Voice over / Dubbing
    "In the darkest of nights, hope shines like a single star, reminding us that heroes are born from adversity."

    "Had to be me. Someone else might have gotten it wrong."

    "I think it was called Ueno Station, but I'm not sure. I've never been to Tokyo before, so everything is unfamiliar to me."

    Pauses
    "So I think - I should take this route if I want to reach on time"

    Or
    "But well... I'm not entirely convinced"

    Emotions
    Note: The dialogue tag ("he said confused", "he shouted angrily") has been cut out using the audio trimmer within the asset.
    "I have had enough!" he shouted angrily.

    "I wish you were right, I truly do, but you're not" he said, assertively.

    Famous Personalities
    "I don’t hire a lot of number-crunchers, and I don’t trust fancy marketing surveys. I do my own surveys and draw my own conclusions."

    "Nothing can stand in the way of the power of millions of voices calling for change."

    More examples are given in the description of the asset page.

    Language and Accent Support: The DeepVoice_Multi model supports different languages such as English, Japanese, German, Hindi, French, Korean, Portuguese, Italian, Spanish, Indonesian, Dutch, Turkish, Filipino, Polish, Swedish, Bulgarian, Romanian, Arabic, Czech, Greek, Finnish, Croatian, Malay, Slovak, Danish, Tamil, Ukrainian.

    Voice Modulation controls: These controls allow users to adjust parameters such as speech clarity and variability in voices, as well as add emotions through text prompting. By manipulating these parameters, users can customize the generated speech to better suit their needs and preferences.

    〰️ Preview waveform: Play sound clips right inside the editor without going into Play Mode. Scrub the playhead to play any part of the clip. Timestamps and a simple graphic of the waveform are shown for better clarity inside the editor.

    ✂️ Trim audio: A user-friendly GUI in the Editor to trim the ends of an audio clip in case part of the clip is not required or is empty.

    Combine clips: Multiple audio clips can be combined into one using an intuitive user friendly feature in the editor. Simply select clips, rearrange their order with ease and merge them into one.

    ⚙️ Equalize tracks: Mastering audio clips involves equalization of clips which can easily be done within the editor itself. Simply select the clip, adjust gain, pitch and frequency band sliders. A 6 band equalization is offered in the editor.

    Editor Script: The Editor Script displays all the options neatly in one panel. The editor has an in-built preview audio player. Simple design for trimming, combining and equalizing or mastering audio tracks.

    EDITOR
    Keeping it all in the editor:
    Keeping all assets in one workspace inside the Editor and having to switch between fewer services can have several benefits, such as:

    - Improved Efficiency: When all assets are located in one workspace, it becomes easier to access and manage them. Users do not have to spend time switching between different services or applications, which can be time-consuming and lead to a loss of productivity.

    - Streamlined Workflow: Having all assets in one workspace can help create a more streamlined workflow. This is because users can easily move between different assets, such as code files, images, and documents, without having to navigate between different services. This can help to speed up the development process and make it more efficient.

    - Reduced Complexity: Using fewer services can help to reduce the complexity of the development process.

    In the pack, you will find a demo scene and an editor window which help you to access the TTS models. There are other useful audio settings like trimming, combining and mastering the audio track that can be accessed through the DeepVoice Editor Window.

    DEPENDENCIES
    This tool requires the Editor Coroutines package from the package manager and an active internet connection.

    LIMITATIONS
    Since this tool is still under development, there are a few limitations:
    - For now, the text that can be processed is set to a limit of 200 characters or 30 to 50 words or 5 to 6 sentences or one paragraph.
    - There are around 80 voices to choose from, out of which Mono/Multi have 15. We are working on adding more.
    - Audio generation time is ~8-15 seconds per clip. This may increase with an increased number of tokens and user base.
    - Character count per fortnight is limited to 15000. Per month, this translates to a limitation of 30000 characters.
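
    To work within the per-request character limit above, longer scripts can be split before they are submitted. The following is a minimal sketch, not part of the asset: `split_for_tts` is a hypothetical helper, and the default of 200 simply mirrors the limit quoted in this post.

```python
import re


def split_for_tts(text: str, limit: int = 200) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is wrapped on word boundaries.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no space to break on: hard cut
            piece, sentence = sentence[:cut].rstrip(), sentence[cut:].lstrip()
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece)
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_for_tts("Hello there. How are you? I am fine.", limit=20):
        print(len(chunk), chunk)
```

    Each chunk still counts against the quota, and splitting a line can change its tone, so chunks are best kept as whole sentences where possible.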

    Please check out the documentation for an in-depth explanation and working of the asset. If you have any questions, suggestions, inquiries for private servers or would like to share your thoughts, please send us an email at info@aikodex.com
     
    Last edited: Sep 9, 2023
  2. Darkcrash2007

    Darkcrash2007

    Joined:
    Jun 6, 2021
    Posts:
    3
    What about copyright? Can this product be used commercially? For example, Steam may remove the game due to AI content!
     
    pumpkinszwan and devramyun like this.
  3. DigitalIPete2

    DigitalIPete2

    Joined:
    Aug 28, 2013
    Posts:
    44
    Hi guys,

    I'm having issues. I bought this asset today and it seemed to work for the first hour. I have only used 3882 characters so far, but I have had some fun.

    Now I have settled down to do some work with the asset and it refuses to generate voices anymore. My invoice number has been accepted and I hit save. When I select any voice and hit generate, it starts and then stops immediately with no new voice being generated. I'm using different voices, so I'm not getting filename issues.

    I get this error:

    "There was an error in generating the voice. Please check your invoice/order number and try again or check the documentation for more information."


    There's nothing in the docs regarding this particular error. I'm using short sentences; in fact, I have 125 characters left for the message which fails (they all fail at the moment, actually).

    Any help would be gratefully received

    thanks,

    P.
     
  4. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Thank you for reaching out to us. We received your email and have offered solutions that should hopefully resolve these issues. We have also updated the back end to process requests with special formatting.

    For developers browsing this forum thread:
    For a new line please use \n instead of the Enter key.
    To use quotes, please use \" instead of ".

    We will be bringing out an update which allows the user to use special formatting without the above exceptions.
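
    For anyone preparing lines in bulk, the two substitutions above can be applied automatically. This is only a sketch: `escape_prompt` is a hypothetical helper name, and it assumes the raw text has not already been escaped (running it twice would escape the backslashes' quotes again).

```python
def escape_prompt(text: str) -> str:
    """Replace real line breaks with \\n and double quotes with \\" as described above."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A real newline becomes the two characters backslash + n.
    text = text.replace("\n", "\\n")
    # A double quote becomes backslash + double quote.
    return text.replace('"', '\\"')


if __name__ == "__main__":
    print(escape_prompt('"I have had enough!" he shouted angrily.\nHe left.'))
```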

    Thank you.
     
  5. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Yes, this product can be used commercially. All the voices offered in the asset are in the public domain or are based on fictitious characters. We've included terms of use and service within the asset for the developers' perusal. The asset has models such as Neural and Standard, based on text-to-speech software by Amazon that is thoroughly licensed, has been on the market since November 2016, and includes 50 voices with many different accents. Steam states that the legal ownership of AI-generated art is unclear. The use of AI-generated voices, on the other hand, has been widespread, and some were licensed a decade ago (Google Text to Speech - 13 November 2013).
     
    DragonCoder likes this.
  6. DigitalIPete2

    DigitalIPete2

    Joined:
    Aug 28, 2013
    Posts:
    44
    I can confirm within an hour the team had resolved my issue and I was back developing new lines of speech for my game.

    Thanks AiKodex for an excellent asset and excellent service! Great job!
     
    blueivy and AiKodex like this.
  7. yung_beezy93

    yung_beezy93

    Joined:
    Oct 23, 2019
    Posts:
    1
    Hello! Really interested in this app, but I wanted to ask some questions to see if this might work with my current project. Without going into details, I am building an iOS/Android app experience where it would be cool to have AI-generated voices narrate random moments that occur within the game. So my questions are as follows:
    1. Will this run on a phone app?
    2. If it can run on an app, would each instance running (say 100 users playing) eat up the 30,000 character limit? In other words would this be scalable to multiple instances.
    Thanks in advance, and major kudos! This is a really cool asset!
     
  8. tt10977

    tt10977

    Joined:
    Sep 11, 2021
    Posts:
    1
    Will other language models be added in the future? Such as Chinese, Japanese, etc.
     
  9. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello yung_beezy93,

    Yes, the application can run on a mobile app. Since it is server-based, the mobile devices will have the same generation time as on a PC. The app can send out requests and download files, and it can use the Application.persistentDataPath to write the bytes.
    Regarding the character limit, it is difficult to calculate how many players will generate how many characters. However, to give you an idea, 30,000 characters can fill approximately 10 pages of a document with 12 point Calibri font. If the sole functionality of the game is to generate voices, then 30,000 characters may not be sufficient for a large number of users.
    If you have 100 users playing simultaneously and each instance consumes characters for voice generation, it is possible that the 30,000 character limit could be reached quickly. This may impact the scalability of the application for multiple instances.
    It is worth noting that there is an option to purchase another license of the asset to avail a new invoice number and double the character quota. However, it may not be the most economical solution.
    Nevertheless, the developers behind the asset are working on finding ways to increase the character quota, so there may be future improvements to address scalability concerns.
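
    As a rough illustration of the quota arithmetic above, the sketch below estimates demand against the 30,000-character monthly quota. The function names and the per-player figures (100 characters per line, 10 lines per player per month) are made-up assumptions for the example, not measurements.

```python
def monthly_demand(players: int, chars_per_line: int, lines_per_player: int) -> int:
    """Total characters all players would generate in a month."""
    return players * chars_per_line * lines_per_player


def players_supported(quota_chars: int, chars_per_line: int, lines_per_player: int) -> int:
    """How many players fit inside the quota at the given usage."""
    per_player = chars_per_line * lines_per_player
    if per_player <= 0:
        raise ValueError("usage per player must be positive")
    return quota_chars // per_player


if __name__ == "__main__":
    quota = 30_000  # characters per month, as stated in the first post
    print(monthly_demand(100, 100, 10))       # demand from 100 players
    print(players_supported(quota, 100, 10))  # players the quota can cover
```

    Under these assumed figures, 100 players would need 100,000 characters a month, while the quota covers about 30 of them.
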
    Thank you for your interest in the app, and I'm glad you find it cool! If you have any more questions, feel free to ask.
     
    yung_beezy93 likes this.
  10. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello tt10977,
    As of now, Chinese and Japanese are not supported by the asset. We would like to support them in the future, but we are uncertain if we will be able to do so.
     
  11. bvonline

    bvonline

    Joined:
    Feb 27, 2021
    Posts:
    85
    Is anything known about a future subscription plan from the company that created these AI voices? I assume they might not keep it free forever... that is my only concern.
     
  12. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    We are in the process of developing an offline AI voice model using Barracuda instances with a Python framework in Unity. Using ONNX (Open Neural Network Exchange), conversion of these NN models is possible. We already offer Ai.Fy, which includes two offline super-resolution AI models. We hope to bring this into DeepVoice as well, at least for a few voices initially if not all.
     
    yunus_unity9 and mgsvevo like this.
  13. Ghosthowl

    Ghosthowl

    Joined:
    Feb 2, 2014
    Posts:
    231
    How long does it take for Unity to assign an invoice number?

    EDIT: It took 2.5 hours to populate.
     
    Last edited: Jul 14, 2023
  14. Ghosthowl

    Ghosthowl

    Joined:
    Feb 2, 2014
    Posts:
    231
    This asset is absolutely amazing - perfect for our project!

    I think this asset definitely needs some sort of packs to buy more characters though. I have already found myself redoing my inputs so often while trying to get the perfect emotion that I quickly burn through my character allotment, even though it is all for just one generation.

    I also think the character limit needs to be much higher. I have so few characters left when using the tagging feature that it becomes quite a nuisance, because I can only get one sentence done. If I break it up into smaller submits, I lose the tone and the flow of the dialogue, and I also burn through more characters having to redo the same tags each submit. I think the best way to go about this is to make tags exempt from the character limit, marked up like *he said nervously* so the tool knows where they are in the sentence. It would be even greater to have an option to automatically omit these tags in the final generation if a toggle is on, though this may be asking too much programmatically.
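
    The tag-exemption idea could be sketched roughly as follows. This is purely hypothetical: how DeepVoice actually counts characters is not described in this thread, and `billable_length` and the `*...*` tag syntax are assumptions taken from the suggestion above.

```python
import re

# Anything between a pair of asterisks is treated as a dialogue tag.
TAG_PATTERN = re.compile(r"\*[^*]*\*")


def billable_length(prompt: str) -> int:
    """Count the characters of a prompt with *tags* removed."""
    without_tags = TAG_PATTERN.sub("", prompt)
    # Collapse the whitespace left behind by removed tags.
    return len(" ".join(without_tags.split()))


if __name__ == "__main__":
    prompt = "Stop! *he shouted angrily*"
    print(len(prompt), billable_length(prompt))
```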

    Either way, absolutely amazing asset especially at the current cost!
     
    AiKodex likes this.
  15. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello Ghosthowl,

    Thank you for your positive feedback!

    We understand that there may be a character crunch. Our margins are slim when we talk about the server support in terms of longevity. However, we have seen an exponential growth in the customer base in these short days. Feedback and reviews motivate us a lot to do better, and we'd be grateful if you could write a few words on the store.

    As for increasing the quota, we will actively work on expanding our offerings. We acknowledge the inconvenience it may cause when utilizing the tagging feature and will explore options to either increase the limit or exempt tags from counting towards it.

    Your valuable feedback contributes to our ongoing efforts to enhance the asset's functionality and user experience. We appreciate your support and encourage you to share any further suggestions or questions you may have :)
     
    Ghosthowl likes this.
  16. sael-you

    sael-you

    Joined:
    Dec 5, 2020
    Posts:
    6
    I'm thinking about buying this amazing plugin, but I need to know if it actually does runtime text to speech. If yes, what's the response delay?
     
    teawa_ likes this.
  17. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello sael-you,

    Yes, you can perform Text to Speech during runtime. We offer a demo scene that performs runtime generations. The delay in generations is ~5 to 10 seconds depending upon the number of characters. Hope this answers your question :)
     
    sael-you likes this.
  18. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    [Announcement]

    Automatic Quota Reset to 15,000 characters.

    15,000 characters allotted for the period 15-07-2023 to 31-07-2023.
     
  19. KenzoGames38

    KenzoGames38

    Joined:
    Oct 24, 2019
    Posts:
    19
    Add Japanese please
     
  20. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    [Announcement]

    We will have a short server maintenance check on UTC 17:00 to UTC 17:30. The servers are expected to return to normal functionality on UTC 17:30.

    Your patience is appreciated.
     
    Last edited: Jul 24, 2023
  21. Graham-B

    Graham-B

    Joined:
    Feb 27, 2013
    Posts:
    331
    Is this uncensored?
     
  22. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello Graham-B,

    Yes, these voices also support NSFW generations.
     
    Graham-B likes this.
  23. The42ChickensDilemma

    The42ChickensDilemma

    Joined:
    May 22, 2016
    Posts:
    2
    Hello! Love the asset. Could you please provide more details about the model used for data inference and the data's origin? It would be great if the data could be made available to the public. Alternatively, a better solution would be to allow developers to train their own models using custom voice datasets that they provide. This approach is important because Steam's guidelines prohibit the publication of games that contain "Content you don’t own or have adequate rights to." The usage of Obama, Trump, and Biden "characters" without their consent or permission is a significant concern, since I strongly believe they did not give permission for me to use their voices. If there is no option to train our own models, then this asset is very much useless.
     
  24. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello The42ChickensDilemma,

    Thank you for your kind words. Sure, we will reach out to the developers of the model that helped us make DeepVoice AI for Unity and let you know. Regarding the usage of the voices of famous personalities, these voices are in the public domain.
     
  25. andyz

    andyz

    Joined:
    Jan 5, 2010
    Posts:
    2,285
    Hi you don't seem to show documentation on what API you are using for voice generation - I think this is important. Are you basically selling a Unity interface to some openly available API? Will this API have costs after a while or perhaps in the future?
     
  26. TomLeeLive

    TomLeeLive

    Joined:
    May 2, 2018
    Posts:
    5
    I just bought the DeepVoice asset to use TTS in my project.

    I think this quota thing is no good for anyone who uses this asset in their game.

    I am really looking forward to an offline model version in the future!

    Thank you for creating this nice asset.
     
    mgsvevo likes this.
  27. lLcrowe

    lLcrowe

    Joined:
    Sep 12, 2017
    Posts:
    7
    some Find it.
    Standard Voices => Ola
     

    Arnold_2013 likes this.
  28. MightyAnubis

    MightyAnubis

    Joined:
    Jan 29, 2018
    Posts:
    67
    Just to mark it:

    1. - The equalizer does not work inside the project.
    Only "Crazzz" and that's it. No changes possible.

    2. - I told you: watch your prices.

    3. - "Upgrade packages for bigger studios": I see it coming in big steps:
    unlimited paying in
    for less delivery, like everywhere.

    But this is another topic.

    **Main criticisms**
    1. - Price too expensive.
    2. - Too many voices sound so similar that even the extreme limit of only 15,000 characters per 2 weeks
    is not enough just to try out every voice with 1 default phrase.
    3. - Prices.

    Around (including tax) 100$ for roughly 20 voices of middling to good quality overall.
    All the others are just similar voices that sound nearly the same;
    a lot of "AI voices" are just there to be there, not really useful, and consume characters senselessly
    when trying them out.

    It was the best idea to throw 50$ out of the window.
    Nothing more.

    Okay.
    So:
    please be so kind as to fix the EQ for the people who want to use it.
    I myself will go and hide it now.

    After I read "We do an upgrade pack for big studios", I know the roadmap.
    No thank you! Indies need to work hard for their money; most are hobbyists who never really release a game
    and mostly only play around privately.

    A lot of projects never see the light of day but cost a lot of money.
    This is a good way to put some money where you will never find it again.

    And lastly, to everyone here,
    think about this:
    Let's say you want a dialogue with ~6 statements.
    To get the first 1 or 2 "points" you will need up to 10,000 or 15,000 characters to make it sound the way you need it.
    Here is an error, there something does not sound good... it's the case every time, because you need to regenerate
    many times to reach the quality you really want.

    Are you in the middle of your dialogue? Wait 2 weeks, or wait a month.
    If you want to do a quest giver with this? No chance of ever finishing him.

    If you have in mind to voice more than 4 or 5 characters?
    Not doable with this limit.

    Free solutions give 80,000 characters plus voices usable at any time.
    Others sell no character limits for 50$.

    This one costs 100$ if not on sale.
    *AND* if you want more input characters, as one of the reviewers wrote, you will get "Upgrade Packs"
    later.

    It will be funny to watch what this costs in the end.

    Why am I so "bad" about it?
    Really?
    Free solutions give the same!
    This one wants 50$ even on sale.
    The EQ for it is not working.
    Questions about the limit are answered with "We bring expensive updates ..."

    This is not a good idea, and not the solution AI should drive!
    So I stand on the side of TomLeeLive, who says:

    "I wait for an offline solution".
    He is right.
    Absolutely right.

    This was one of the most useless purchases I ever made, through my own fault:
    I did not read the limit, I thought: "Oh, a cool asset with TTS".
    I got another one.
    Unlimited characters,
    good results,
    half the price, if you take the end price here.

    This?
    It was really a waste of money, as it is today.
    Absolutely 100% not useful.
    Just: burned money.

    From an answer in the "Reviews" here:
    [As the other reviews suggest, we are considering releasing an upgrade pack for studios and businesses that increases the number of characters per generation. This is still work in progress. ]

    It will be fascinating to watch.
    An overpriced offer - now -
    will get more expensive if you want more than the little you get.

    o_O

    Nice - I need popcorn. It will be great entertainment.

    Lastly I see "Unity AI Solutions, we make it possible".
    Yes. Always, to subscribe to expensive monthly solutions, bound to online portal services
    that bind your work, which is 10,000x more than just "one asset", to an online solution
    you pay for for a lifetime.

    Nice roadmap.
    Perfect.
    :rolleyes:

    But okay, this is another topic.

    Oh,
    one thing I forgot to tell you:

    Show the character limit in the editor! No one knows how many characters are left
    before the limit!

    (And fix the EQ for people who want to use it.)

    Thank you.
    Okay for now. Since my monthly "robbing my family" budget is gone on this,
    I need to double it and now buy an asset that is ~helpful,

    after (as I see it coming) I rework it in Blender, with 50% of the time the creator needed,
    to make it usable.

    Let's buy.
     
    Last edited: Jul 20, 2023
  29. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Thank you for making this issue known to us. We will fix this in the next update as soon as possible.
     
  30. Boy132

    Boy132

    Joined:
    Jun 30, 2017
    Posts:
    1
    Does this support "non-word sounds" too? Like laughing, crying, coughing etc.
     
  31. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello Boy132,

    Non-word sounds such as laughing, crying or coughing are not properly generated by DeepVoice. In rare cases, you can hear laughter if you input “haha”. Most of the time it’s an awkward laughter. Sobbing can be heard faintly too when you hint at crying in the text prompt. We have yet to encounter a proper cough in generated audio when coughing is indicated in the text prompt.
     
  32. KenzoGames38

    KenzoGames38

    Joined:
    Oct 24, 2019
    Posts:
    19
    Hi, I don't have the invoice number for my order.
     
  33. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello johncenakenzo38,

    Unity typically takes around 2-3 hours to generate an invoice number after which you will be able to use the asset.
     
  34. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    [Update]

    DeepVoice v1.2
    - Quota Increased to 50,000 characters
    - Process up to 2.5x characters at a time now (500 chars)
    - Server Improvements
    - Bug fixes

    [The UI still shows v1.1. We will fix this. Rest assured, after updating the asset you will be transferred to the new quota and be able to use v1.2 features]
     
  35. KenzoGames38

    KenzoGames38

    Joined:
    Oct 24, 2019
    Posts:
    19
    OK thanks, I will wait then; it's been 2 hours so far. Also, will you add more languages in the future, like Japanese? (EDIT: I still didn't get the invoice number; it's been more than 3 hours by now.)
     
    Last edited: Jul 24, 2023
  36. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Please could you try using your order number instead? Please tell us if this worked for you.
    Currently, Japanese is not supported by the asset. We would like to support it in the future, but as of now, we are uncertain if we will be able to do so.
     
  37. Kozaki2

    Kozaki2

    Joined:
    Apr 8, 2019
    Posts:
    47
    Hi, at the beginning I will say that the generated files are good and meet my expectations, good job. Also, I encountered some errors and warnings. I'm using Unity 2022.3.0.
    1, when refreshing the DeepVoice window GUI, I get tons of warnings
    2, the "Save" button for "Invoice number" doesn't seem to work. Have to give it every time.
    3, after generating the file I get an error, nothing critical, the generated file is ok
    4, Equalizer doesn't work at all. When I try to play the file, I get the errors
    GUI Error: Invalid GUILayout state in DeepVoiceEditor view. Verify that all layout Begin/End calls match
    and NullReferenceException
     
  38. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello Kozaki2,

    Thank you for reporting this issue to us.

    We’ve taken a look at the error you have put in quotes and we suspect the error could be caused by placing the DeepVoice folder in the “Vendor” folder in your project.

    Could you please try placing the DeepVoice directly under the Assets folder and see if the error is resolved?

    DeepVoice currently uses absolute and hard coded paths to work. This means that the asset assumes a project structure and operates within that frame of reference. Any deviation from this project structure may break the paths and hence affect the functioning.

    For optimal functionality, we recommend you to keep the default project structure and not move files internally within the asset either. You can add assets to your projects, they usually come with their separate folder and you can add files inside the DeepVoice folder as well. But please keep the DeepVoice folder directly under the Assets directory.

    Again, we apologise if the reliance on default project paths for the DeepVoice asset is an inconvenience.

    If you want, we could look into changing the default paths in the editor and controller scripts for your specific use case. Please send us an email at info@aikodex.com and we can modify the script accordingly.
     
    Kozaki2 likes this.
  39. LegendaryTaler

    LegendaryTaler

    Joined:
    Nov 8, 2014
    Posts:
    8
    Hi, AI Kodex :)
    I bought DeepVoice yesterday, and I'm testing it. However, Emotion does not appear to be working.

    Even if I write that the actor is angry or that he is screaming, the voice always comes out in the same tone.
    Perhaps, given that the emotion description that follows is also read out as narration, it is not being properly recognized.

    Also, there is no delay with "-". A delay is generated only with "...".
    Could there be a problem in the string parsing?

    I'm using Unity Engine 2022.3.4f1, and I'm not getting any errors / warning in the editor.
    I need your help :)

    Best regards,
    L.Taler
     

    Last edited: Jul 25, 2023
  40. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello L.Taler,

    Thank you for attaching the screenshot.
    The setting you are using to generate the voice, Variability 0.8, may be causing the audio clip to not show as much character and emotion. Variability set at 1 means that the generations will be very consistent, which comes at the cost of the voice not displaying much emotion. Variability set to lower values makes it likelier for the voice to demonstrate a higher level of emotion. Note, however, that the dialogue tag will be followed only to a varying degree, even at lower variability levels. The training of the LAM (large audio model) was done on a large dataset including many different expressions. The AI model replicates the mentioned expression to an uncertain degree. You can also adjust the clarity parameter to a slightly higher value. More about its functioning is mentioned in the documentation.

    As for the delay, you can use "..." for delays. The sentence - \"Hey...are you okay...what happened?\", he asked worriedly gets parsed without any issues.

    Please let us know if these suggestions helped you :)
     
  41. LegendaryTaler

    LegendaryTaler

    Joined:
    Nov 8, 2014
    Posts:
    8
    Thank you for your reply.
    I must have misunderstood the variability item.

    Buuuuut... I don't know, nevertheless :s

    I've tried the item [\"Don’t test me!\" he shouted angrily.] in the example of the document several times with various values using the Andrew - Deep Voice Mono option, but there are only small differences in emotion. They never reveal as distinct an emotion as the example.

    Deep Voice is enough to use now, but I hope it will be much more effective if the actor can express his feelings more clearly :)
     
    Last edited: Jul 25, 2023
  42. KenzoGames38

    KenzoGames38

    Joined:
    Oct 24, 2019
    Posts:
    19
    I just got my invoice number, and when I entered it I got this message: "Invoice/Order number verification unsuccessful. Please check your invoice/order number and try again or contact the publisher on the email given in the documentation." (Edit: now it works, thanks for the help.)
     
    Last edited: Jul 25, 2023
  43. Arnold_2013

    Arnold_2013

    Joined:
    Nov 24, 2013
    Posts:
    287
    I also misunderstood this. If something has more (higher value) variation, one would expect the results to differ more per 'generate', not less. (What you say is correct according to the documentation.)

    I've been using the Deep Voice Multi version, even for English, because I 'felt' the results were better. Could you comment on the difference between the Mono and Multi versions? It's clear the Multi version can create multiple languages, but what's the difference when using both for English? Would you expect the same results from Mono and Multi when using only English?

    Just as a personal experience: I typically need to run a voice line 5 times with Variability 0.2 and Clarity 0.8 to get 1 that I am 'OK' with. I do some volume adjustment afterwards if needed to stitch multiple sentences together. This is what I expect from an AI solution; it's workable.
    Would it be possible to get a bigger set of examples? Currently every 'voice' has a short phrase, which helps with selecting a voice. But it would be nice to have several examples with different values for variability & clarity of one voice.
    To get an idea of what these values do and get a sense of the range of results to expect from a specific value.
     
  44. vice39

    vice39

    Joined:
    Nov 11, 2016
    Posts:
    108
    Is this local or cloud? Will it work without internet connection?
     
  45. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello vice39,
    The voice generation feature for DeepVoice is cloud based. Generation will not work without an internet connection, but you can still access other functionality like joining, trimming and equalisation inside the editor offline.
     
  46. LegendaryTaler

    LegendaryTaler

    Joined:
    Nov 8, 2014
    Posts:
    8
    Is there any setting that can intentionally cause a tone like [\"Don’t test me!\" he shouted angrily.] in the example of the manual?

    I've tested different lines over and over again with the keyword "he shouted angrily." but under no circumstances can emotion be created so dramatically.
     
  47. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Hello LegendaryTaler,

    We generated that audio with the variability and clarity set to very low values. We have also replaced the voice of Andrew in the new update. The older version of Andrew was based on a popular public figure. The audio clips that the model was trained on were also a volatile mix of emotions. We will update the documentation to reflect this change.
     
    Last edited: Jul 27, 2023
  48. Arnold_2013

    Arnold_2013

    Joined:
    Nov 24, 2013
    Posts:
    287
    Could you add to the documentation which voices are from public figures? I don't want to accidentally have a voice in my game resembling a person with controversial views or 'drama' that I am not aware of.
    Thanks.
     
  49. LegendaryTaler

    LegendaryTaler

    Joined:
    Nov 8, 2014
    Posts:
    8
    I don't know if it's thanks to the lower Clarity or the update, but the actor's emotions are starting to show incredibly well!

    Thank you for your help!
    It's a really cool asset :)
     
    AiKodex likes this.
  50. AiKodex

    AiKodex

    Joined:
    Jan 21, 2021
    Posts:
    363
    Sure, yes, we will add this in.
     
    Arnold_2013 likes this.