Official Subtitle System for SFX

Discussion in 'Open Projects' started by Harsh-NJ, Jan 14, 2021.

  1. Harsh-NJ

    Harsh-NJ

    Joined:
    May 1, 2020
    Posts:
    315
Hello all, I want to work on this system. A few questions first:
Are subtitles needed only for SFX, or for spoken dialogue as well?
     
  2. ChemaDmk

    ChemaDmk

    Unity Technologies

    Joined:
    Jan 14, 2020
    Posts:
    65
Hey @Harsh-099, the dialogue is already written, and the sound FX for dialogue is just blabber, so I don't think we need to add subtitles for it. What do you think?

    Do you think we could use the Localization and TMPro packages when creating this system?
     
  3. Harsh-NJ

    Harsh-NJ

    Joined:
    May 1, 2020
    Posts:
    315
I saw the card on the roadmap, so I created this thread; the final decision will be up to the community, of course.

    Yes, we can use TMPro (for a bit of styling), but I see no use for the Localization package, since
    "Chooo Chooo!"
    in English is the same as
    "Chooo Chooo!"
    in Spanish, German, Hindi, and all other languages.
     
  4. Smurjo

    Smurjo

    Joined:
    Dec 25, 2019
    Posts:
    296
Except where a different script is used - for example in Greece, Russia, Arabic-speaking countries or China.
     
    ChemaDmk likes this.
  5. ChemaDmk

    ChemaDmk

    Unity Technologies

    Joined:
    Jan 14, 2020
    Posts:
    65
Many onomatopoeias differ between languages. For example, "Yummy" in English is "Miam" in French, and "Woof-woof" is "Ouaf-ouaf".

    Also, different letters may be needed for certain languages, as @Smurjo said.
     
  6. cirocontinisio

    cirocontinisio

    Joined:
    Jun 20, 2016
    Posts:
    884
    Hey Harsh, thanks for opening the thread. I have added more info on the card, I'll replicate it here too:

    Hope this clears up what the idea behind the task is. In any case, keep in mind it's a lower priority than other things we're doing at the moment.

    We don't say "choo choo" in Italian! :D We say "ciuf ciuf" (unmistakable proof)
     
  7. davejrodriguez

    davejrodriguez

    Joined:
    Feb 5, 2013
    Posts:
    69
    I have been working on this task on and off over the past couple of months and I'd like to share some thoughts and possible insights I've had while developing it. My work focused mostly on the system itself rather than the UI, so I did not explore any use of TMPro.
    1. We should absolutely use LocalizedString for caption descriptions.
2. We should do our best to match the goals of Closed Captioning as described here. (Note: these are US guidelines. The only other list of guidelines I found was from the EU, but most of the differences concerned formatting rather than general goals.) These goals include:
      • Accuracy. Describe as much of the sound stage as possible at any given moment.
  • Synchronous. Ensure that descriptions occur at the same time as the sound.
      • Complete. Describe sounds consistently throughout the application.
      • Placement. Place the captions so that they do not block important visual elements of the game.
    3. We should use LocalizedAsset to switch between caption formatting for different regions.
4. The balance between accuracy and placement will be the hardest goal to manage. There will likely be several sounds playing at once, and we will need to decide programmatically which sounds are most important and show only those we can fit in the UI. Which leads me to what I consider the hardest programming challenge of this task...
5. Unity does not have an API to retrieve a list of AudioSources sorted by audibility (except in the Profiler), so we will need to figure out programmatically which audio sources are most audible to the listener at any given moment. This is difficult because a sound goes through several transformations in Unity before it is output: it has a base average AudioClip loudness, AudioSource volume, spatialization, rolloff, and AudioMixerGroup attenuation.
My approach was to calculate each clip's RMS at AssetPostprocessor time and save it in a ScriptableObject database, because RMS calculations are very expensive and, in our case, clip loudness should not change at runtime. The database would also have an editable field for the LocalizedString description. Then, as AudioSources play, you look up the clip's base average loudness and run it through all the runtime calculations (AudioSource volume, rolloff, etc.), early-outing if it ever touches 0. Then put each AudioSource into a list, sort by the calculated audibility, look up the clip once more for the description, and voila: you have the most important captions!

    In my opinion, this is necessary to address the accuracy goal of captions. But I understand if we want to avoid all this in favor of a rougher approximation based on some sort of priority. Anyway, I'd be happy to be of any help with this task.
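    The precompute-then-attenuate pipeline described above could be sketched roughly like this. To be clear, this is only an illustration: the method names, the linear rolloff, and the caching strategy are my own assumptions, not a settled design.

    Code (CSharp):
        // Sketch only: precompute a clip's average loudness (RMS) once,
        // then estimate runtime audibility per AudioSource.
        using UnityEngine;

        public static class AudibilityUtil
        {
            // RMS of the raw samples; expensive, so run it at import time
            // (e.g. from an AssetPostprocessor) and cache the result in a
            // ScriptableObject database alongside the LocalizedString description.
            public static float ComputeClipRms(AudioClip clip)
            {
                var samples = new float[clip.samples * clip.channels];
                clip.GetData(samples, 0);

                double sumOfSquares = 0;
                foreach (var s in samples)
                    sumOfSquares += s * s;

                return Mathf.Sqrt((float)(sumOfSquares / samples.Length));
            }

            // Rough runtime estimate: base loudness scaled by source volume and a
            // simple linear distance rolloff. Real code would also evaluate the
            // AudioSource's actual rolloff curve and AudioMixerGroup attenuation.
            public static float EstimateAudibility(float clipBaseRms, AudioSource source, Vector3 listenerPos)
            {
                float distance = Vector3.Distance(source.transform.position, listenerPos);
                float rolloff = Mathf.Clamp01(1f - (distance - source.minDistance) /
                                                   (source.maxDistance - source.minDistance));
                return clipBaseRms * source.volume * rolloff; // early-out if this hits 0
            }
        }

    Sorting the playing AudioSources by this estimate and taking the top few would then give the captions to display.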
     
  8. Harsh-NJ

    Harsh-NJ

    Joined:
    May 1, 2020
    Posts:
    315
Hey @davejrodriguez, thanks for your explanation. But how will we make the SFX-indicating sprite appear on screen so that it points to where the sound is coming from in the 3D world (the position of the AudioEmitter)?
     
  9. davejrodriguez

    davejrodriguez

    Joined:
    Feb 5, 2013
    Posts:
    69
    Hey @Harsh-099! Ciro left you a hint in the card info ;) Just google "offscreen indicator unity". There are many tutorials. Here's just one that I found:


    I don't know if that particular tutorial is the best approach, but most are going to boil down to converting the object's world position to screen space via Camera.WorldToScreenPoint and doing the math to figure out if/where an indicator is necessary.
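    For reference, the core idea those tutorials share can be sketched like this. The indicator field and the clamping/rotation details here are illustrative assumptions, not the project's actual code.

    Code (CSharp):
        // Sketch: project the emitter's world position to screen space and
        // decide whether an off-screen indicator (a UI arrow) is needed.
        using UnityEngine;

        public class OffscreenIndicatorSketch : MonoBehaviour
        {
            public Transform soundEmitter;   // where the sound is coming from
            public RectTransform indicator;  // arrow image on a screen-space canvas

            void Update()
            {
                Vector3 screenPos = Camera.main.WorldToScreenPoint(soundEmitter.position);

                // Behind the camera z is negative; flip the point so the arrow
                // stays on the correct side of the screen.
                if (screenPos.z < 0) screenPos *= -1;

                bool onScreen = screenPos.z > 0 &&
                                screenPos.x > 0 && screenPos.x < Screen.width &&
                                screenPos.y > 0 && screenPos.y < Screen.height;

                indicator.gameObject.SetActive(!onScreen);
                if (!onScreen)
                {
                    // Clamp to the screen edges and rotate the arrow toward the target.
                    indicator.position = new Vector3(
                        Mathf.Clamp(screenPos.x, 0, Screen.width),
                        Mathf.Clamp(screenPos.y, 0, Screen.height), 0);

                    Vector3 dir = screenPos - new Vector3(Screen.width / 2f, Screen.height / 2f, 0);
                    indicator.rotation = Quaternion.Euler(0, 0,
                        Mathf.Atan2(dir.y, dir.x) * Mathf.Rad2Deg);
                }
            }
        }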
     
  10. GoodBoySK

    GoodBoySK

    Joined:
    Oct 2, 2020
    Posts:
    2
    Hi,
I am making my own implementation and I ran into a problem. I made a system that cooperates with the AudioCueEventChannel, but it seems the Finish event is never raised. I thought the Finish event was raised when an audio clip finished playing. Is this a mistake in the system, or am I misunderstanding the meaning of that event?
    PS: Sorry if my question is stupid, but I am new to this open project stuff and still figuring out what to do :D
     
  11. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
    Hi
I just picked this up on Friday too :D I did some prototyping and have a few things to discuss.
    First of all, I split this task into two parts: displaying the text on the screen, and the off-screen indicator.

    1. Should we display SFX that are looping?
    2. What should we do if one object has two sounds, like the campfire (BoilingWater & Campfire)? Maybe display them together as multiline?
    3. Ciro suggested that the "script that is in charge of playing the sound will notify the ClosedCaptioningSystem and pass the necessary parameters". Wouldn't it be better to register for the AudioCueEventChannelSO in the ClosedCaptioningSystem and decouple it completely from the AudioManager?
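    A rough sketch of that listener approach might look like the following. Note that the event name and delegate signature on AudioCueEventChannelSO are my assumptions; check them against the actual asset in the project before copying anything.

    Code (CSharp):
        // Sketch of option 3: the captioning system subscribes to the same event
        // channel the AudioManager uses, instead of being called by it directly.
        using UnityEngine;

        public class ClosedCaptioningListenerSketch : MonoBehaviour
        {
            [SerializeField] private AudioCueEventChannelSO _sfxEventChannel;

            // Hypothetical event name; the real channel may differ.
            private void OnEnable()  => _sfxEventChannel.OnAudioCueRequested += OnAudioCue;
            private void OnDisable() => _sfxEventChannel.OnAudioCueRequested -= OnAudioCue;

            private void OnAudioCue(AudioCueSO cue, AudioConfigurationSO config, Vector3 position)
            {
                // Show the caption at the emitter's position
                // without the AudioManager knowing we exist.
                Debug.Log($"Caption for {cue.name} at {position}");
            }
        }

    The benefit is that the AudioManager stays untouched, and the captioning system can be added or removed from a scene freely.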

    Game preview:

    upload_2021-8-29_14-53-35.png
    upload_2021-8-29_14-53-56.png

In my configuration the campfire (a loop) is not displayed, because I didn't know whether we would want it on screen all the time. The bard has two singing sounds - long and short - so you can observe the different states: 'La en' & 'Lalala en'.

    Implementation preview:
    upload_2021-8-29_14-58-8.png

    Onomatopoeia prefab to instantiate in the sound location:
    upload_2021-8-29_15-1-18.png

    Code (CSharp):
        using System.Collections;
        using TMPro;
        using UnityEngine;

        namespace Assets.Scripts.Audio
        {
            public class ClosedCaptioningSystem : MonoBehaviour
            {
                // Prefab with a TextMeshPro child used to render the caption
                public GameObject OnomatopoeiaPrefab;

                public void VisualiseAudioClip(Onomatopoeia onomatopoeia, Vector3 position = default)
                {
                    var newOnomatopoeia = Instantiate(OnomatopoeiaPrefab, position, Quaternion.identity);
                    var onomatopoeiaTextComponent = newOnomatopoeia.GetComponentInChildren<TextMeshPro>();

                    // Only set the text if the LocalizedString points to a table
                    if (!string.IsNullOrEmpty(onomatopoeia.SoundText.TableReference))
                    {
                        onomatopoeiaTextComponent.text = onomatopoeia.SoundText.GetLocalizedString();
                    }
                    StartCoroutine(DestroyNewOnomatopoeia(newOnomatopoeia, onomatopoeia.Duration));
                }

                // Removes the caption once its duration has elapsed
                IEnumerator DestroyNewOnomatopoeia(GameObject newOnomatopoeia, float duration)
                {
                    yield return new WaitForSeconds(duration);
                    Destroy(newOnomatopoeia);
                }
            }
        }

    Lines added in the AudioManager:
    upload_2021-8-29_15-4-6.png

    What do you think? I will be adding object pooling and refactoring a bit later, but first wanted to ask about the approach.
     
  12. Harsh-NJ

    Harsh-NJ

    Joined:
    May 1, 2020
    Posts:
    315
That's more interesting than my idea. I tried to do the same on the Canvas UI - a pointer pointing towards the audio direction (relative to the centre of the screen) - whereas you did it in 3D space.

    What about support for sprites?

    Anyway, great work.
    Thanks.
     
  13. GoodBoySK

    GoodBoySK

    Joined:
    Oct 2, 2020
    Posts:
    2
I think it would be better, because it makes the system more modular. That was my thinking when I was creating my script.

    It would be the best and simplest solution.

    And by the way, thank you. Your post helped me figure out some stuff. Thanks a lot <3 Great work :D
     
  14. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
@Harsh-099 I think a canvas will be required for the off-screen sounds (when the sound is behind you, for example). I haven't looked into that part yet, though. The first part shows the text in the 3D world; the second will have to notify the player that a sound exists outside of the screen.

    Sprite support can easily be added as an image reference on the Onomatopoeia object, next to the Duration and SoundText. However, I wasn't planning to add it; it should be easy to extend later.

    @GoodBoySK Agreed, it will be more modular in a separate listener class. Good to hear that my post was useful to you! :)

    I am planning to refactor the code a little and add object pooling, then I'll share it with you guys. Hopefully I'll have some time soon.
     
  15. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
  16. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
I added an option to display an AudioCueSO with many AudioClips attached. Previously they were displayed one on top of another. Initially I thought about just concatenating them, but then realised that two AudioClips can have different durations. Eventually I just added a simple shift in the position of each caption.

This is just an example to present the case easily (the campfire has two sounds in its loop, and they won't be displayed - at least for now):
    upload_2021-9-4_20-26-27.png

    upload_2021-9-4_20-31-29.png

Also, settings for the captions were added to the Language section in the settings:
    upload_2021-9-4_20-32-32.png
     
  17. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
    I added the off-screen indicator to this solution. All scripts were taken from the following video (big thanks to the author of this tutorial!):


I modified them for our purposes and it looks quite cool. There is an arrow pointing to each caption that is out of the camera view; once you look at the caption, the arrow disappears. I hope someone from the community would like to improve the visual aspect of the caption and the arrow indicator - I took the image from the tutorial just to have something to visualise it. Maybe a sound icon attached to the arrow would do? A few screenshots below:

    upload_2021-9-5_19-39-41.png

    upload_2021-9-5_19-39-16.png

Brief description of the off-screen solution:
    The Caption prefab has an OffscreenTargetIndicator child object (with the arrow image and a TargetIndicator script).
    Once a Caption is displayed, I also add it to the captionEmitters list, which keeps track of all the captions currently displayed in the game. This list is needed by the Update method of the ClosedCaptioningManager, which takes care of the off-screen indicators (showing/hiding them and updating their position/rotation). When the caption's duration ends and the script is about to return the caption object to the pool, the object is also removed from the captionEmitters list, so the Update method no longer tracks it. The arrow disappears as well, because it is a child of the caption prefab (which has already been returned to the pool).
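    The register/unregister lifecycle described above could be sketched like so. CaptionEmitter and its UpdateIndicator method are illustrative stand-ins for the real caption script, not the names used in the branch.

    Code (CSharp):
        // Sketch: captions register themselves while alive, the manager updates
        // their indicators every frame, and captions are removed on pool return.
        using System.Collections.Generic;
        using UnityEngine;

        // Stand-in for the real caption component on the prefab.
        public class CaptionEmitter : MonoBehaviour
        {
            public void UpdateIndicator(Camera cam) { /* project, clamp, rotate arrow */ }
        }

        public class ClosedCaptioningManagerSketch : MonoBehaviour
        {
            private readonly List<CaptionEmitter> _captionEmitters = new List<CaptionEmitter>();

            public void OnCaptionShown(CaptionEmitter caption) => _captionEmitters.Add(caption);

            // Called just before the caption object goes back to the pool, so the
            // Update loop stops tracking it and its child arrow disappears with it.
            public void OnCaptionReturnedToPool(CaptionEmitter caption) => _captionEmitters.Remove(caption);

            private void Update()
            {
                foreach (var caption in _captionEmitters)
                    caption.UpdateIndicator(Camera.main); // show/hide + reposition the arrow
            }
        }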

I am open to discussing this solution. The link to the git branch is below. What do you think, guys?
    https://github.com/codeedward/open-project-1/tree/feature/CaptioningSystem
     


  18. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
    Here is the video of the functionality:
     
    itsLevi0sa and cirocontinisio like this.
  19. codeedward

    codeedward

    Joined:
    Dec 19, 2014
    Posts:
    93
    cirocontinisio likes this.
  20. cirocontinisio

    cirocontinisio

    Joined:
    Jun 20, 2016
    Posts:
    884
    Oh wow, great implementation! It's exactly as I imagined it. Obviously the text and the markers can be made more readable, but the functionality is there.

Thanks! I'll take a look at the PR as soon as we can. Sorry if it won't be immediate - this feature was a bit lower on the priority list.
     
    codeedward likes this.