
il2cpp and global-metadata.dat

Discussion in 'General Discussion' started by JustAnotherName, Jun 10, 2020.

  1. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    A couple years ago, I was ready to jump ship for UE4 because of the practically open-source state of any published Unity3D game. Copy and paste away. At that crossroads, Unity added il2cpp and, without actually digging into the implementation, I was back in the Unity fold with the ignorant assumption that they couldn't possibly have published this feature in its current state. I've reverse engineered x86 games as far back as two decades ago and fully understand the reality that "everything is hackable", but the current state of il2cpp is an atrocity. There is no good reason that I should be able to automate dumping named function pointers into an API with a one-liner, out of the box and script-kiddie friendly, as I currently can thanks to global-metadata.dat and other inner workings. I fully expect Unity to provide no extra helping hand to reverse engineers in an il2cpp build. Now isn't the time to suggest obfuscation in Unity's current state either, since the current use of reflection hugely weakens that approach, requiring you to hard-code everything UI related and just accept that Update, Start, Awake etc. are probably untouchable.
    You've built a fence in front of a house in a field with no back, or side fences. You have the capability to build the other fences, but you just haven't. It is lazy and quite frankly telling of the direction and coordination of the company. Obviously my allegiance to the engine is gone and without a shred of doubt, I will be moving on to UE4/5 for the next title. This is just sloppy.

    If you want to go down this rabbit hole, particularly if you are using il2cpp on the assumption that it would produce binaries similar to what you'd get from writing your own engine in C++, and not a gaping public GitHub trove of enumerable, plaintext pointers, then let Google guide you: https://lmgtfy.com/?q=unity+il2cpp+dump Note: you will be targeting global-metadata.dat and GameAssembly.dll in the CLI.

    If you want to actually allay my concerns with Unity, you'll:
    1) Shore up il2cpp builds to not contain any metadata if configured as such; if debugging support is enabled, any metadata should be no more descriptive of the content than a lookup hash, where only the developer has access to the sourcemap file
    2) Address weaknesses introduced by reflection to enable end to end code obfuscation for all builds
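
    A minimal sketch of the lookup-hash idea in point 1, written in Python for brevity. It is not how IL2CPP works or is configured; the method names, the salt, and the helpers `opaque_id`, `build_sourcemap` and `symbolicate` are all hypothetical. The shipped build would carry only opaque ids, while the sourcemap that maps them back to names stays with the developer.

```python
import hashlib


def opaque_id(qualified_name: str, salt: str) -> str:
    """Return a short, stable id that does not reveal the original name."""
    digest = hashlib.sha256((salt + qualified_name).encode("utf-8")).hexdigest()
    return "m_" + digest[:12]


def build_sourcemap(names, salt):
    """Map opaque ids back to readable names; this mapping never ships."""
    sourcemap = {}
    for name in names:
        key = opaque_id(name, salt)
        if key in sourcemap:
            raise ValueError(f"hash collision between {sourcemap[key]} and {name}")
        sourcemap[key] = name
    return sourcemap


def symbolicate(trace, sourcemap):
    """Translate a stack trace from a shipped build back into readable names."""
    return [sourcemap.get(frame, frame) for frame in trace]


# Hypothetical game methods, purely for illustration.
SALT = "per-build-secret"
names = ["Player.TakeDamage", "Inventory.AddItem", "Shop.Purchase"]
sourcemap = build_sourcemap(names, SALT)

# What a crash report from the shipped build would contain.
shipped_trace = [opaque_id("Shop.Purchase", SALT), opaque_id("Inventory.AddItem", SALT)]
print(shipped_trace)
print(symbolicate(shipped_trace, sourcemap))
```

    Salting per build is an assumption made here so that ids cannot be recovered by hashing a dictionary of likely method names.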
     
    Last edited: Jun 10, 2020
    DungDajHjep and MrLucid72 like this.
  2. Good luck with your endeavor.

    But no one ever stated (that I know of) or implied that IL2CPP is for code obfuscation or hiding your code. They actually stated that it is not. So I don't know why you project your false expectations onto them?
     
    Joe-Censored likes this.
  3. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    I have an expectation, as a C++ developer, to not see plaintext function names in a build if symbols are disabled and to be given the option to disable packaging of said metadata. x86/x64 doesn't need to work like that. At the absolute bare minimum, I'd expect metadata plaintext to be replaced with reference hashes that only the developers and Unity cloud systems have access to for debugging issues. I'm not shipping my C++ projects with .pdb files so why is Unity doing it on a release/master build? When we need to address something like reflection, it should be done by building a map of references in pre-build then hard linking them at generation such that it also has a step where Obfuscators can do their thing on the entire stack. Am I missing something? Why is GameAssembly.dll separate from <game>.exe anyway?
     
    Last edited: Jun 10, 2020
    Infinite-3D likes this.
  4. When you're working with Unity, you're not a C++ developer.

    And you're missing something, you expect things which were never offered. Your application's code isn't Unity's main focus. This is why we don't recommend to anyone to develop e-sport games in Unity.
    But again, they never offered such a thing, so I personally have no problem with it.

    You could write down exactly your thoughts on this and post under the Feedback tag in one of the scripting forums (maybe without the threats that you're leaving because they don't offer something they never promised), and maybe they will evaluate the said feedback.


    Edit: okay, they may visit the General forums for this, no need to go anywhere :D
     
    Joe-Censored likes this.
  5. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,936
    Thanks for sharing your concerns. We've never intended IL2CPP to provide any sort of obfuscation. Really it is a .NET ahead-of-time compiler and runtime that happens to use C++ as an intermediate language. So we don't expect it to look or work like hand-written C++ code.

    With that said, obfuscation is certainly possible from a technical perspective, but it is not on the road map for IL2CPP development now.

    There are a number of IL obfuscation utilities that I've seen used before, so maybe that is a route you want to try.
     
  6. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,680
    Uhhh it actually does need to work like that. When you have code in separate modules (like two .DLLs), those DLLs have to agree on function names otherwise they won't be able to call into one another. That's exactly what is happening here: you cannot change method names of "Start", "Update" etc because Unity needs to call into them, so they must be visible.

    You cannot statically figure out what reflection code will end up calling. Method names can come virtually from anywhere.
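
    The two points above can be sketched in a few lines of Python. This is only an illustration of name-based dispatch, not Unity's actual mechanism; `PlayerController`, `RenamedController` and `send_message` are hypothetical names. A caller that only knows a method by its name stops working once the method is renamed, and the name itself can be assembled from data at runtime, which static analysis of the code cannot see.

```python
class PlayerController:
    """Stand-in for a user script whose methods are found by name."""

    def __init__(self):
        self.calls = []

    def Start(self):
        self.calls.append("Start")

    def Update(self):
        self.calls.append("Update")


class RenamedController:
    """The same script after a hypothetical obfuscator renamed its methods."""

    def __init__(self):
        self.calls = []

    def a(self):
        self.calls.append("Start")

    def b(self):
        self.calls.append("Update")


def send_message(target, method_name):
    """Call a method by name, as a separately compiled module would have to."""
    method = getattr(target, method_name, None)
    if not callable(method):
        return False
    method()
    return True


player = PlayerController()
renamed = RenamedController()

print(send_message(player, "Start"))   # the name exists, so the call goes through
print(send_message(renamed, "Start"))  # the name was renamed away, so nothing is called

# The name can also arrive as data rather than as a literal in the code.
config = {"on_tick": "Up" + "date"}
print(send_message(player, config["on_tick"]))
```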

    It's separate because engine code is precompiled. DLLs are how you ship precompiled code on Windows.

    Anyway, the source code for metadata loading is shipped with Unity and compiled on your machine. There's nothing preventing you from using it to strip out unused methods from metadata. It's just not something we offer out of the box because IL2CPP is not meant to be an obfuscation tool. Lastly, it will be far more effective if you do something unique because if we do something, it will affect all Unity games so there will be more effort put in by other people to reverse engineer it (plus they could just download Unity and see how it works).
     
    ippdev likes this.
  7. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,680
    Also to add: this is an already solved issue in "Project Tiny". C# used in Project Tiny doesn't support reflection so we generate no metadata at all.
     
    karl_jones likes this.
  8. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    Regarding linking the dll & plaintext method names re: "why is GameAssembly.dll separate from <game>.exe anyway" generating a monolithic PE from a stub is one option, albeit more difficult. Statically linking to create a single PE absolutely should be a configuration option. Is Unity just legally bound to not allow building of the game.exe from the stack or is it just build speed? Another could be import reference modifications directly on the built PEs.

    On the issue of reflection: it seems that Unity's reliance on reflection is the Achilles heel for getting traction on the subject, and I get that it is nuanced. If I'm using the UI Button component and choose a method to invoke on click, though, it is painfully obvious that the component doesn't actually need reflection for that invocation. You have scenarios where reflection is being used for the convenience of the Unity engine developers, but I'd suggest that Unity has options to de-escalate from defaulting to the kitchen sink of reflection. Offer a setting, or do it automatically if it makes sense, to replace reflection invocations in scenarios where variables are never modified (the UI Button example). At that point, if no user or plugin code has variable reflection, then Unity could allow symbol stripping for everything. Alternatively, Unity could expose a system to enable obfuscators to rename Unity-specific methods like Start etc. It is unfortunate that Unity Technologies' initial response is so averse to this topic, which comes up ad nauseam over the history of the forums.

    Edit* Unfortunately Project Tiny doesn't look like a solution for the masses in regards to this topic.
     
    Last edited: Jun 10, 2020
  9. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,680
    Unfortunately, in order for static linking to work correctly, your compiler version and linker version have to match exactly. That means that if we shipped the engine as a static library, you would have to install the exact same compiler we built Unity with (for instance, Visual Studio 2019 16.4). And we're upgrading the compiler used to build Unity all the time. You upgrade Unity and forget to upgrade the compiler? You get some random linker crash when building your project. You accidentally upgraded Visual Studio? Now you can't build to IL2CPP either. Static libraries on Windows were not meant to be shipped to a broad audience.

    Another issue with this is that we would lose the ability to read crash dumps for crashes that happen in the engine, because debug symbols get generated at link time and we would have no way of obtaining them if linking is done on your machine.

    There are no legal issues here, it's purely technical.

    I'm not exactly sure what you mean here.

    How would you implement it? Keep in mind that while you hook it up in the editor, that method has to be called at runtime later so you can't just subscribe a delegate.
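
    One possible reading of the proposal being questioned here, sketched in Python under stated assumptions. Nothing below reflects how Unity serializes button handlers; the scene layout, `prebuild` and `click` are hypothetical. The idea is that a build step resolves each handler name once, writes an integer id into the shipped scene data, and emits a direct-call table, so the runtime never looks a method up by name. It only works when every handler reference is visible to the build step.

```python
def on_play_clicked():
    return "starting game"


def on_quit_clicked():
    return "quitting"


# Hypothetical serialized scene data as an editor might save it: handler names as strings.
scene = [
    {"button": "Play", "handler": "on_play_clicked"},
    {"button": "Quit", "handler": "on_quit_clicked"},
]


def prebuild(scene, handlers):
    """Replace handler names with integer ids and emit a direct-call table."""
    table = []
    index_of = {}
    stripped = []
    for entry in scene:
        name = entry["handler"]
        if name not in index_of:
            index_of[name] = len(table)
            table.append(handlers[name])  # resolved once, at build time
        stripped.append({"button": entry["button"], "handler_id": index_of[name]})
    return stripped, table


def click(stripped_scene, table, button):
    """Runtime path: index into the table, no name lookup involved."""
    for entry in stripped_scene:
        if entry["button"] == button:
            return table[entry["handler_id"]]()
    raise KeyError(button)


handlers = {"on_play_clicked": on_play_clicked, "on_quit_clicked": on_quit_clicked}
stripped_scene, table = prebuild(scene, handlers)

print(stripped_scene)
print(click(stripped_scene, table, "Play"))
```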

    Unfortunately yes, these cases exist too. I'd love nothing more than to get rid of reflection altogether, but unfortunately it's a tremendous task. There are plans on addressing this, partly thanks to Project Tiny. Since reflection isn't available there, we had to come up with ways to do it without it. And we hope to eventually bring those techniques to "big" Unity.

    We actually looked at this in the past. You'd be surprised how rare this is. Almost every project uses some kind of reflection coming from user code or plugins.
     
  10. unit_dev123

    unit_dev123

    Joined:
    Feb 10, 2020
    Posts:
    989
    Anything that is client side is hackable and open to misuse.
     
  11. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    Tldr; for my initial post. Unity does a lot of extra work to make it easier for hacking that doesn't need to be there.
     
  12. unit_dev123

    unit_dev123

    Joined:
    Feb 10, 2020
    Posts:
    989
    Yes, but from a purely abstract POV something is either hackable or it is not (i.e. handled server side).
     
  13. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    Maybe I'm missing something but why not package the C++ compiler with Unity (or at least the il2cpp build extension)? I'm assuming you're using C++ Build Tools and they can be installed alongside the Unity version that requires them or they can be packaged into a portable installation.

    In the case of a statically linked build, an artifact would need to be uploaded for Unity cloud services to work, I assume. That might be doable by uploading the produced Program DataBase/PDB OR mangling symbols like I originally suggested, such that they are useful for Unity's platforms but not useful for reverse engineering.

    Legally: just wasn't sure if Unity needed to protect their prebuilt game .exe (ironically) or if the source was redistributable.

    Implementation of reflection replacement? Code replacement in a pre-build step for variables which are not modified explicitly and not in the presence of other reflection which *could* modify it. Warn the user of reflection code which is indeterminate and cite that it prevents optimization of other reflection code. Another would be following Unreal's footsteps: https://www.unrealengine.com/en-US/blog/unreal-property-system-reflection which is tldr; annotations to explicitly define discovery to the reflection system. Any not annotated remain invisible to it.
    Thanks for asking, I know it is more work but I also know there are dozens of us customers that would appreciate this.
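
    A minimal Python sketch of the opt-in annotation idea described above. It does not represent Unreal's property system or any Unity feature; the `reflectable` decorator, `invoke_by_name` and `emitted_metadata` are hypothetical. Only explicitly annotated functions are registered for name-based lookup, so only their names would need to appear in shipped metadata.

```python
REFLECTABLE = {}


def reflectable(func):
    """Opt a function in to name-based lookup; everything else stays invisible."""
    REFLECTABLE[func.__name__] = func
    return func


@reflectable
def on_level_loaded():
    return "level loaded"


def compute_damage(base, armor):
    """Game-specific logic that was never annotated."""
    return max(base - armor, 0)


def invoke_by_name(name, *args):
    """The only reflection entry point; it sees annotated functions only."""
    if name not in REFLECTABLE:
        raise LookupError(f"{name!r} is not exposed to reflection")
    return REFLECTABLE[name](*args)


def emitted_metadata():
    """Names a build step would have to ship to keep reflection working."""
    return sorted(REFLECTABLE)


print(invoke_by_name("on_level_loaded"))
print(emitted_metadata())
```

    The trade-off is the one raised in the replies: existing .NET libraries assume any member can be found by name, so an opt-in model only helps for code written with it in mind.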
     
    Last edited: Jun 10, 2020
  14. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    Supporting a product that has been ventilated with mods like swiss cheese by an onslaught of unskilled script kiddies on forums versus a bare minimum of deep understanding of machine code and reverse engineering is just not even comparable.
     
    dannyNK likes this.
  15. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,680
    C++ compiler is not enough. You also need Windows SDK. I don't think we're allowed to distribute that.

    I don't think that's practical. Thousands of games are built using Unity every day. If we started to upload 1 GB worth of PDBs for every one of them, we would soon run out of disk space...

    I might be misunderstanding what you said but how would mangling symbol names help with the issue of us not having the original symbol files?

    Source code for the Windows executable is shipped in Editor\Data\PlaybackEngines\WindowsStandaloneSupport\Source. But that's just an entry point. You're probably talking about UnityPlayer.dll. Yes, source code is not something we have available. However, you can download symbol files from https://symbolserver.unity3d.com and debug it away.

    The thing is, that's not how C# works. Take any random .NET assembly off NuGet and it will probably use reflection somewhere. It's easy to say "oh just don't use reflection". It's actually very hard to do in practice. When we were first developing IL2CPP, slight reflection issues would break various projects badly. The C# ecosystem just expects these things to work.

    Lastly, let me reiterate this to you: any kind of anti-hacking measure we do will be undone very quickly because the gain for the investment is huge: a single hack will apply to thousands of games. If you do something specific to your game, it will have much higher chances of being unhacked for a while because they will have to develop tools specific to hacking your game.
     
  16. JustAnotherName

    JustAnotherName

    Joined:
    Sep 27, 2017
    Posts:
    11
    To be blunt, and upfront, this isn't about anti-hacking. You're literally preaching to the choir there. It is about Unity leaving the front door open for a stupid amount of things. Enumerating project structure automatically like we're able to do now is, like I've said before, frankly atrocious.
    In a sterile environment, it sounds like il2cpp builds without reflection were behaving properly. If I'm not pulling in any 3rd party libraries that reference the reflection namespace, then can't we just fall back to a state where global-metadata isn't published?

    Symbol mangling? Bare minimum but at least Unity isn't leaving doors wide open out of the box.

    PDBs being fat: yeah they are. If I'm a customer that wants to do an il2cpp build with metadata disabled and it costs me a nominal fee to have it hosted by Unity to enable some cloud features then at least it is my choice because right now the choice is "go use UE4/5 if you don't like global-metadata.dat". Unity is storing the stacktraces already, so being able to download them and parse them locally could also be an option.

    If Unity had an attribute system for il2cpp which could either act as a whitelist or a blacklist for classes, methods etc, then in code generation, you could clearly determine if something needed to be exposed to global-metadata.dat to support reflection. Maybe I'm missing something. Are you guys shipping global-metadata.dat for reflection support and error reporting support? I guarantee you zero 3rd party libraries need to reflect my game specific code. I don't have error reporting enabled either so again, why are we leaving the front door open on game code? I'm not talking about Unity's engine code. It is a complicated issue obviously but as a customer I'm just not seeing anything to indicate that Unity is interested in the topic. You guys might want to ctrl+f "Security" on your il2cpp manual and do a little housekeeping.
     
    Infinite-3D, Trigve and MrLucid72 like this.
  17. emrys90

    emrys90

    Joined:
    Oct 14, 2013
    Posts:
    755
    I'm interested in this too. My assumption was always that an IL2CPP build was more secure than a C# build just for the fact that it's harder to decompile. My assumption was shattered today.
     
    daxiongmao likes this.
  18. MrLucid72

    MrLucid72

    Joined:
    Jan 12, 2016
    Posts:
    994
    Extremely relevant and at the top of Google in 2021:

    I understand that devs are currently responsible for locking the door (Psst, I'd pay for features that help with this), but Unity is currently leaving the door wide open -- if anything, without a door, at all, to lock. "Come on in!".

    Is there really nothing in 2021 that can natively protect this with even the most basic form of protection? There's only so much one can do server-side: At some point, the server needs to send and collect data to/from the client.
     
  19. HerpDerpinstine

    HerpDerpinstine

    Joined:
    Jun 8, 2020
    Posts:
    2
    Not really no.
    Not much else you can do besides implementing an AC or something along those lines yourself.
     
  20. MrLucid72

    MrLucid72

    Joined:
    Jan 12, 2016
    Posts:
    994
    Didn't take too much Googling to see that Obfuscation will block out your average dude, alongside looking for particular processes that are known troublemakers. Obfs gave us hell, but it's definitely enough to at least give em enough trouble for the average person to not feel it's worth it.

    Obf can be taken further if I can port button events to code so I can obfs Mono classes too.

    I tried wrapping the game in a virtual exe and while it helped a ton and broke all auto decompilers, it also triggered anti virus. Pretty much obfuscation combined with IL2CPP and watchdogs.
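
    The "looking for particular processes" part of this approach can be sketched as a simple blocklist check, shown here in Python. The process names are made up, and obtaining the list of running processes is platform specific and left out; `find_flagged` is a hypothetical helper. A tool whose executable has been renamed would not match, so this only deters casual use.

```python
def find_flagged(process_names, blocklist):
    """Return running process names that match the blocklist, ignoring case."""
    blocked = {name.lower() for name in blocklist}
    return sorted({p for p in process_names if p.lower() in blocked})


# Hypothetical tool names, purely for illustration.
BLOCKLIST = ["memory_editor.exe", "metadata_dumper.exe"]

# In a real watchdog this list would come from the operating system.
running = ["explorer.exe", "Memory_Editor.exe", "game.exe"]

flagged = find_flagged(running, BLOCKLIST)
if flagged:
    print("flagged processes:", flagged)
```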