Bad Word Filter PRO - Solution against profanity and obscenity

Discussion in 'Assets and Asset Store' started by Stefan-Laubenberger, Jan 9, 2015.

  1. Foreman_Dev

    Foreman_Dev

    Joined:
    Feb 4, 2018
    Posts:
    82
    Hey @Stefan-Laubenberger, thanks for this asset! I have a question regarding the "global" bad words.

    Is the "global" list of bad words always included by default, or must I explicitly define that it should be used? Below is the code that I'm using to check for bad words (with the "_global" source name included but not sure that's correct).

    Code (CSharp):
    List<string> detectedBadWords = BWFManager.Instance.GetAll(chatMessage, Crosstales.BWF.Model.Enum.ManagerMask.BadWord, "english_modified", "_global");

    Note: I created my own "english_modified" source that I'm loading via URL. Would you recommend I also load the "global" source via URL?

    EDIT: After understanding this further, it appears I do need to explicitly reference the "_global" source for it to work. I've made my own copy of this global source hosted via URL.

    I do have a few more questions:

    1. What happens if the plugin is unable to load the source via URL? Maybe there is an intermittent internet outage or the file is no longer reachable due to a firewall issue or someone deleting the file by mistake. Does the plugin just skip filtering for that source via 'GetAll' or 'ReplaceAll'? Is there some way to do a "fallback" if it fails? (for example, the direct file reference gets used instead if the URL fails?) EDIT: After testing, it appears that if it can't load the source via URL it skips filtering that source (makes sense) BUT if a local source is also defined, then the local source gets used instead of the URL source no matter what, regardless of whether the URL source can or cannot be accessed. Could this be updated to use the local source as a fallback only if the URL source can't be accessed?

    2. I've read the documentation but I'm not very familiar with crafting RegEx. What is the "correct" way to whitelist a word that the filter is improperly identifying as a bad word? (Haven't run into this yet but I expect to at some point so I'd just like to know how to do this)

    3. If I add a new word to the bad word list (just the word on its own with no fancy RegEx stuff) how will the filter be applied to that? Will that word be filtered out even if it's part of another word, or only that word explicitly on its own gets filtered out? Example: I add the word "ass" to the list (only that word and nothing else on the line). Does "asshat" or "massive" get filtered out also, or just "ass"? I probably need to get more comfortable with RegEx but would love it if you could steer me in the right direction here, thanks.
     
    Last edited: Jan 3, 2023
  2. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    Here are the answers to your questions:

    0. If you don't add the source names at the end of the method calls, all defined sources will be used. You could just configure the desired sources in the BadWordProvider and skip the names.

    1. This has been added for the next update. You could send me your invoice and I can give you access to an early beta.

    2. You can just add the whitelisted words like this: \b(?!\bmassive\b|\bassassin\b)\w*ass\w*\b

    3. It depends on the mode: if "SimpleCheck" is enabled, it will match any occurrence of "ass", e.g. in "massive", as you mentioned.
    If it's disabled, only "ass" will be matched.
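
    For anyone reading along, here is a minimal sketch of how the whitelist pattern from answer 2 behaves when run through plain .NET Regex, outside of BWF. The class and method names are hypothetical, and RegexOptions.IgnoreCase is an assumption made for the demo; how BWF itself applies the pattern may differ.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for trying the pattern outside of BWF.
public static class WhitelistDemo
{
    // Pattern copied from answer 2. IgnoreCase is an assumption for this demo.
    private static readonly Regex Pattern = new Regex(
        @"\b(?!\bmassive\b|\bassassin\b)\w*ass\w*\b",
        RegexOptions.IgnoreCase);

    // True if the pattern flags anything in the given text.
    public static bool IsFlagged(string text) => Pattern.IsMatch(text);
}
```

    With this, IsFlagged("asshat") and IsFlagged("ass") return true, while IsFlagged("massive") and IsFlagged("assassin") return false. Note that the lookahead only excludes the exact whitelisted words, so a variation such as "massively" is still flagged unless it is added to the list as well.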

    I hope this helps you further.


    Cheers
    Stefan
     
    Last edited: Jan 4, 2023
  3. Foreman_Dev

    Foreman_Dev

    Joined:
    Feb 4, 2018
    Posts:
    82
    Very helpful - thank you, Stefan! :)

    0. I'll look into further configuring the BWF prefab instead of adding the names at the end of the methods. However, if I want the greatest amount of flexibility for which sources to use, then is it best to explicitly name the sources in the methods (like in my example code above) or is there a better recommended way? For example, I have the "forbidden names" source that I'd only want to use when someone is choosing a username, otherwise use only the English and Global filters for text chat. Or maybe I want less restrictive filtering for private chat vs public chat. OR, maybe I choose the sources based on the player's language.

    1. Sounds good, I'll follow up with you about that via email.

    2. Thank you for providing this! To be clear, is it the ?! assertion that is excluding the words, and does the | character act as an "or" operator within the parentheses? And to whitelist certain words, can the bold part of your example be added to any word at the beginning, just after the \b?

    3. SimpleCheck appears to be disabled by default (and I have not adjusted that setting). So for avoidance of doubt, if things are left at the default settings (SimpleCheck disabled) and only the word "ass" is added to the list (without any RegEx), then occurrences of only the word "ass" will be matched and not the word "massive"? Would the RegEx expression \bass\b give the same exact result?
     
    Last edited: Jan 4, 2023
  4. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hello again

    0. You could either use the approach with the strings or use different BadWordProviders and swap them in the BadWordManager.

    1. The new version will be sent to you shortly.

    2. \b is the RegEx word boundary, (?!...) is a negative lookahead that excludes the listed words, and | is the divider between those words.

    3. Yes, \bass\b will only match the exact occurrence
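
    As a rough illustration of the "approach with the strings" from answer 0, the source names could be picked per context in one place. Everything here is a hypothetical sketch: the FilterContext enum and SourceSelector class are not part of BWF, "forbidden_names" is a placeholder source name, and which sources belong to which context is only an example.

```csharp
// Hypothetical contexts in which text gets filtered; not part of BWF.
public enum FilterContext { PublicChat, PrivateChat, Username }

public static class SourceSelector
{
    // Returns the source names to use for a given context.
    // "english_modified" and "_global" are the names used earlier in the thread;
    // "forbidden_names" is a placeholder for a custom username source.
    public static string[] SourcesFor(FilterContext context)
    {
        switch (context)
        {
            case FilterContext.Username:
                return new[] { "english_modified", "_global", "forbidden_names" };
            case FilterContext.PrivateChat:
                return new[] { "_global" };
            default:
                return new[] { "english_modified", "_global" };
        }
    }
}
```

    The returned array could then be handed to GetAll or ReplaceAll in place of the hard-coded names, assuming the trailing source names are accepted as a string array, as the calls shown in this thread suggest.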
     
  5. Foreman_Dev

    Foreman_Dev

    Joined:
    Feb 4, 2018
    Posts:
    82
    Thank you! The only thing I still need clarification for is #3: If no RegEx is used at all, and default settings are used (SimpleCheck disabled) is the word treated as if it were using \b (matching only the exact occurrence)?
     
  6. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    The effect is the same, you can use it with or without boundary: it will only match "ass" if "SimpleCheck" is disabled, but we recommend using boundaries.
    It makes a big difference if you ever decide to give "SimpleCheck" a go: without boundary it then would match "massive" while with boundary, it would still only match "ass".
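
    The difference the boundaries make can be seen with plain .NET Regex. This is only a sketch of the regex behaviour, not of how "SimpleCheck" is implemented inside BWF; the class and method names are hypothetical and RegexOptions.IgnoreCase is an assumption.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; it does not use BWF at all.
public static class BoundaryDemo
{
    // No boundaries: finds "ass" anywhere, including inside "massive".
    public static bool MatchesAnywhere(string text) =>
        Regex.IsMatch(text, "ass", RegexOptions.IgnoreCase);

    // With \b on both sides: only the standalone word "ass" matches.
    public static bool MatchesWholeWord(string text) =>
        Regex.IsMatch(text, @"\bass\b", RegexOptions.IgnoreCase);
}
```

    Here MatchesAnywhere("massive") is true, while MatchesWholeWord("massive") is false and MatchesWholeWord("what an ass!") is true.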
     
    Last edited: Jan 5, 2023
    Foreman_Dev likes this.
  7. Foreman_Dev

    Foreman_Dev

    Joined:
    Feb 4, 2018
    Posts:
    82
    Sounds good, thank you for the explanation! It might be helpful to add this note to the documentation if it isn't mentioned already.
     
  8. hashiiiii1

    hashiiiii1

    Joined:
    Oct 6, 2020
    Posts:
    1
    Hi @Stefan-Laubenberger , thanks for this asset. I have a request.

    In our project we are using Assembly Definition.

    So we need to manually add the asmdef file under "Assets > Plugins > crosstales" in this library.

    Can you please define the asmdef on the library side?
     
  9. Gordon_G

    Gordon_G

    Joined:
    Jun 4, 2013
    Posts:
    372
    Hi @Stefan-Laubenberger! Bad Word Filter PRO looks like a great asset, and I am about to purchase it. I have a couple of quick questions I hope you can respond to:

    1. I am making games for children and I don't even want bad words to be replaced with **** or any other symbols. I want to omit them completely in the filtered result. Will this asset allow me to replace bad words with an empty string?

    2. I've had a quick look at your documentation and tutorial video, and it is not clear to me how your system is set up. I think you have a default list of bad words I could use, but then I see there are also regular expressions and whitelists. I'm trying to get a sense of how much time it is going to take me to get up to speed on using your system.

    Out of the box, without any modification, for a simple example, will your system filter "Hell" and "Hellonyou" but not "Hello"? I'm thinking "Hello" would have to be added to a whitelist. In general, how much setup would have to be done to get that to work?

    Just looking for broad guidance on this last question - no need for detail.

    [edit] Actually I purchased it

    I hope my questions make sense, and thanks for your time and any advice you can give!
     
    Last edited: Jun 1, 2023
  10. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    Thank you for purchasing our asset!

    To quickly answer your questions:
    1. You can set an empty string as replace character
    2. BWF has defaults for 25 languages and it covers a lot of actual "bad words". It depends on "how strict" you are, but I guess you could modify it to your liking with a few hours of tweaking.
    The defaults for English won't cover the example - the current regex will only catch "hell", but you could modify it to catch your examples. For more on regexes, please see the documentation.


    Cheers
    Stefan
     
    Gordon_G likes this.
  11. Gordon_G

    Gordon_G

    Joined:
    Jun 4, 2013
    Posts:
    372
    Hey, thanks so much for your reply. I got it working in very little time, though your video tutorial is out of date and no longer works, so that threw me off for a few minutes. The sample scenes were very helpful! Yeah, I figured out that I could just delete the replacement character field and leave it empty.

    Actually, accepting the default providers and sources that are in the given BWF prefab seems to cover everything in my example. I may have added an extra RegEx from the existing ones on the dropdown menus in one of the providers.

    I haven't completely assimilated your terminology and how your system is arranged just yet. So I was just fiddling around and I didn't really keep track of what I did.

    If I can ask you a few follow-up questions (if these should be obvious, in my defense sometimes I am dense):

    1. To avoid building every language into my iOS app, do I just remove the sources from the provider child instance within the BWF prefab I have in my scene?

    2. Related to that, I am not using any of the default languages in the RTL provider. I think I understand that I can just remove the provider from the RTL list in the BWF manager inspector. Is that all I need to do, or should I also disable the RTL provider component itself?

    thanks for helping out a newbie if you can! You have a really comprehensive solution to this problem and so much better than the very limited asset I was trying out previously!

    Cheers!
     
  12. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hello again

    No problem, I'm here to help :)

    1. You can remove all unused sources - just add the ones you need.
    2. Simply remove it in the "BadWordManager" under "Bad Word Provider RTL" and disable/remove "RTL"
    I hope this gets you further.


    Cheers
    Stefan
     
    Gordon_G likes this.
  13. Gordon_G

    Gordon_G

    Joined:
    Jun 4, 2013
    Posts:
    372
    Hey thank you, Stefan! I have a need to understand and I'm getting there with your asset.

    I took a look at one of the language files and saw all the regex patterns in there - very interesting.

    And I think I was wrong about hellonyou being filtered. So if I wanted to filter something like that, I would need to alter the regex in the language file, on the line associated with that word - is that correct?

    What about a case for any arbitrary bad word that comes in the form h-e-l-l or a-s-s or a---s---s or any arbitrary punctuation in between, do I similarly need to alter the regex in the language for every word I want to apply that to, or is there some global way of handling cases like that?

    [edit - hey, it occurred to me that unless there is some built-in global way of handling cases like that, it is simple to just do my own: use a RegEx to filter out characters like that between letters in the input before passing the input to BWF. If I am not missing the feature, I guess it would be neater if your asset had a "pre-RegEx filter" parameter we could pass in, and/or if it was available in the asset UI]
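
    A minimal sketch of such a pre-filter in plain .NET, assuming "characters like that" means any run of non-letter, non-digit, non-whitespace characters sitting directly between two letters. The PreFilter class and StripSeparators method are hypothetical names and are not part of BWF.

```csharp
using System.Text.RegularExpressions;

// Hypothetical pre-filter to run on the input before handing it to the bad word check.
public static class PreFilter
{
    // One or more characters that are neither letters, digits nor whitespace,
    // with a letter directly before and directly after them.
    private static readonly Regex Separators =
        new Regex(@"(?<=\p{L})[^\p{L}\p{N}\s]+(?=\p{L})");

    // Returns a copy of the input with those separator runs removed.
    public static string StripSeparators(string input) =>
        Separators.Replace(input, string.Empty);
}
```

    With this, "h-e-l-l" becomes "hell" and "a---s---s" becomes "ass". It also turns "don't" into "dont" and "well-known" into "wellknown", so it is probably safer to use the cleaned string only for detection and keep the original text for display. Spaces are left untouched.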
     
    Last edited: Jun 3, 2023
  14. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hello again

    You may have to create your own filter for word variations.
    About the arbitrary punctuation - we have something similar for "spaces" and it would make sense to have a possibility to add your own punctuation characters for the detection. We will consider this for a future update.


    Cheers
    Stefan
     
    Gordon_G likes this.
  15. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
  16. vARDAmir88

    vARDAmir88

    Joined:
    Sep 9, 2015
    Posts:
    36
    Hi there!

    Today I bought your plugin, but it turned out not to be what I expected.
    The filter can only catch whole words, words separated by spaces, and leet speak.

    For example, I can easily bypass it using symbols between letters (for example it will catch word "hell" but not "h-ell").
    Also, the filter completely ignores stretched words (for example it will catch the word "hell" but not "hhell")
     
  17. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    Sad to hear it's not what you expected. :(
    Have you tried our demos before you bought it?

    However, have you tried different options in BadWordManager, especially "Remove Spaces"?
    Our built-in filters are serving just as a "default configuration" - our customers can adjust them as they like and the regexes can be written to catch most if not all of your given examples. Unfortunately, more complicated regexes lead to more performance impact and have to be rigorously tested against false-positives.
    If you're unhappy with our product, feel free to request a refund.


    Regards,
    Stefan
     
  18. vARDAmir88

    vARDAmir88

    Joined:
    Sep 9, 2015
    Posts:
    36
    It was my mistake, I only looked at promo materials, but did not try the demo version.

    I tried the "Remove Spaces" option and it works well. It would be nice if there was an option to "remove custom characters" so that the user can specify the characters he needs.

    And if you manage to solve the problem with "stretched words", then it will be a very good product for me.

    Thanks for the offer but I wrote this post to help you improve the product, not to get a refund :)
     
  19. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We will consider your request for a "remove custom characters" option.
    To fix the stretched-words, just edit the regex to your liking, e.g. this would match your example:

    \bh{1,}e{1,}l{2,}\b
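
    To avoid writing such patterns by hand for every word, the same shape can be generated. This is only a sketch in plain .NET: the StretchedPattern class is hypothetical and not part of BWF, and RegexOptions.IgnoreCase is an assumption.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper that builds "stretched word" patterns like the one above.
public static class StretchedPattern
{
    // Turns "hell" into \bh{1,}e{1,}l{2,}\b: each run of a repeated letter becomes
    // "that letter, at least as many times as in the original word".
    public static string Build(string word)
    {
        var sb = new StringBuilder(@"\b");
        int i = 0;
        while (i < word.Length)
        {
            // Count how many times the current character repeats in a row.
            int run = 1;
            while (i + run < word.Length && word[i + run] == word[i])
                run++;

            sb.Append(Regex.Escape(word[i].ToString()))
              .Append('{').Append(run).Append(",}");
            i += run;
        }
        return sb.Append(@"\b").ToString();
    }

    // True if the text contains the word or a stretched variant of it.
    public static bool IsMatch(string word, string text) =>
        Regex.IsMatch(text, Build(word), RegexOptions.IgnoreCase);
}
```

    Build("hell") produces exactly the pattern above. IsMatch("hell", "hhell") and IsMatch("hell", "heeelll") are true, while IsMatch("hell", "hello") is false because of the trailing \b.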


    Update:
    We added a new property called "RemoveChars" to the BadWordManager, which should solve the "h--e-l-l" issue. If you'd like to test the beta, please send us your invoice.
     
    Last edited: Jul 19, 2023
  20. vARDAmir88

    vARDAmir88

    Joined:
    Sep 9, 2015
    Posts:
    36
    Hi,

    Thanks for providing an example of a solution to this problem. Just what I need.

    Thank you for inviting me to the beta test, but unfortunately I don't have free time right now, so I'll wait for the new version in the Asset Store.
     
    Stefan-Laubenberger likes this.
  21. vARDAmir88

    vARDAmir88

    Joined:
    Sep 9, 2015
    Posts:
    36
    Hi,

    Just tested the new version and the RemoveChars feature works very well.

    Now I'm much more satisfied with the work of the plugin :)

    Thanks for the update!
     
    Stefan-Laubenberger likes this.
  22. Timmy-Hsu

    Timmy-Hsu

    Joined:
    Aug 27, 2015
    Posts:
    51
    Hi,

    An error occurs when switching between these languages, could you please take a look?

    BadWordFilter Error.jpg

    Error log:
    Exception
    CultureNotFoundException: Culture name sh is not supported.
    Parameter name: name
    System.Globalization.CultureInfo..ctor (System.String name, System.Boolean useUserOverride, System.Boolean read_only) (at <00000000000000000000000000000000>:0)
    Crosstales.Common.Util.BaseHelper..cctor () (at <00000000000000000000000000000000>:0)
    Crosstales.Common.Util.CTHelper.initialize () (at <00000000000000000000000000000000>:0)
    Rethrow as TypeInitializationException: The type initializer for 'Crosstales.Common.Util.BaseHelper' threw an exception.
    Crosstales.Common.Util.CTHelper.initialize () (at <00000000000000000000000000000000>:0)
     
  23. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    Hi

    Please update BWF to the latest version - this issue has been fixed since March 2021.


    Cheers
    Stefan
     
    Timmy-Hsu likes this.
  24. kingsleyfs

    kingsleyfs

    Joined:
    Jul 5, 2012
    Posts:
    14
    I have purchased your product and am following the demo video, but in the test script: ... it looks like "ManagerMask" requires "using Crosstales.BWF.Model.Enum;", and then "inText.text = BWFManager.ReplaceAll(inText.text, Mask, Sources);" gives an "object reference is required for non-static field" error in the editor. Any thoughts or updates?
     
  25. kingsleyfs

    kingsleyfs

    Joined:
    Jul 5, 2012
    Posts:
    14
    Thank you for the very prompt reply to my email; glad you pointed out that the video is outdated.

    The new way is: add "using Crosstales.BWF.Model.Enum;" and call "inText.text = BWFManager.Instance.ReplaceAll(inText.text, Mask, Sources);" (note the word "Instance").

    Thanks for making this product, it's (now) easy to use.
     
    Stefan-Laubenberger likes this.
  26. Gordon_G

    Gordon_G

    Joined:
    Jun 4, 2013
    Posts:
    372
    Hey, I've got a git branch where the BWF prefab's ready status is displaying "no" when the app runs, and I cannot find what might be the cause.

    On our main "published" branch it works fine and the ready status is "yes", but the branch with the problem is our app's Apple TestFlight branch, where we stage our app changes for testing on iOS devices, so I need to get this working.

    All the settings and the BWF component and child component inspectors look identical between the two branches.

    Specifically, this call returns null: BWFManager.Instance.ReplaceAll

    Any idea what might be going on and how I can get this working on our Test Flight branch?

    Thanks for any help in advance!
     
  27. Stefan-Laubenberger

    Stefan-Laubenberger

    Joined:
    May 25, 2014
    Posts:
    1,981
    We've just released version 2024.1.1 of Bad Word Filter.
    Main changes:
    • Support for Unity Cloud Build improved
    • domains updated
    • Updated to Common 2024.1.1