
Could Unity be a lot faster if it used larger files and buffers?

Discussion in 'General Discussion' started by Arowx, Jun 13, 2018.

  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    6,271
    Here are some benchmark graphs of drive read/write performance at various buffer/cache sizes:

    [Images: drive benchmark graphs]
    The main thing to notice is that the best transfer rates for these drives usually occur around the 128 KB buffer/cache size.

    Now, I found out that by running a program called Process Monitor you can inspect the file, network and registry activity of your Windows apps; initially I was looking into why the WebGL build process is so slow.

    Then I got to thinking about all those tiny files Unity uses when you work on a project (*.o, metadata, *.bin, *.info).

    If you run Process Monitor, you too can see that not only does Unity load lots of small files, from DLLs to meta files, it often loads even the larger files in very small chunks.

    And we are talking very small buffer sizes here; some DLLs are literally being loaded a few bytes at a time, e.g.

    Editor\Plugsin.Serives.Broker.dll is loaded in 2 and 4 byte chunks at a time (reported by Process Monitor).

    As you can see from these graphs, loading with a 128 KB buffer is many times faster than loading with a <= 1 KB buffer.

    For example, I have a small Unity project that weighs in at 265 MB yet contains 7,619 files, for an average file size of 34.4 KB. (WinDirStat is a good tool for analysing file sizes in directories.)

    Therefore I think that if Unity adopted:
    • A standard 64 or 128 KB buffer size throughout its engine.
    • Packing multiple files into larger, lightly compressed packages.
    Unity could get a massive boost in file I/O performance, e.g. going from tens of MB/s to hundreds of MB/s on modern hardware.

    Note: Even older HDDs get a boost in file I/O with larger buffer reads/writes.

    ATTO Drive Benchmarking Tool https://www.atto.com/disk-benchmark/
    Process Monitor https://docs.microsoft.com/en-gb/sysinternals/
    WinDirStat https://windirstat.net/
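    The buffer-size effect described above can be reproduced with a short script. This is a minimal sketch, not Unity's I/O code: the test file name and sizes are arbitrary, and absolute numbers depend heavily on your drive and OS caching.

```python
import os
import time

def read_with_buffer(path, bufsize):
    """Read a whole file in chunks of `bufsize` bytes; return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 opens the file unbuffered, so each read() goes to the OS
    # with exactly the chunk size we ask for.
    with open(path, "rb", buffering=0) as f:
        while f.read(bufsize):
            pass
    return time.perf_counter() - start

SIZE_MB = 16
path = "testfile.bin"
with open(path, "wb") as f:
    f.write(os.urandom(SIZE_MB * 1024 * 1024))

# Larger buffers should show markedly higher throughput.
for bufsize in (1024, 4096, 65536, 131072):
    elapsed = read_with_buffer(path, bufsize)
    print(f"{bufsize // 1024:>4} KB buffer: {SIZE_MB / elapsed:,.0f} MB/s")

os.remove(path)
```

    Throughput at 1 KB buffers is dominated by per-call overhead (syscalls, copies), which is the same effect the ATTO graphs show.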
     
    Last edited: Jun 13, 2018
  2. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    10,595
    Have you used version control before?
     
  3. iamthwee

    iamthwee

    Joined:
    Nov 27, 2015
    Posts:
    1,478
    I use github.
     
  4. Player7

    Player7

    Joined:
    Oct 21, 2015
    Posts:
    956
    Unity's Library folder of metadata crap sure is annoying in many ways: syncing backups, and updating Unity only to find it regenerates everything again even when no actual improvements have been made to the backend systems in those areas (texture formats etc.). I imagine a lot of temp files just pile up and aren't even used if you take a project through multiple Unity updates. And then there's the bloat of hundreds of thousands of tiny files per project... yet you want to sync them, at least for backups, because the textures take ages to convert to temp data again; that way, should you roll back, you can reuse the same Unity version without watching progress bars.

    Frankly, a global repository of library textures per Unity version would cut down on the temp file duplicates and save users from watching progress bars just to open a project in a new version while it wastes time converting the same textures it already converted in the last version, per project.

    It's almost as bad as installing the poor documentation locally... which I don't anymore; I think I symlinked the docs folder to an older one, since it never really improved. At least it opens the help file when I click, mainly because I have my system browser set to IE, which is still fine for local HTML docs, but I have it firewall-blocked, so any programs and uninstallers that like to open some random website I didn't ask to visit now just load up in a blocked browser I barely ever touch.

    Someday Unity might offer a preference option letting users set a custom browser path, like other applications have managed to do in the decades they've existed.

    I'm sure Unity could be faster; they just do things in ways that fail to achieve those goals, or at least not directly: two steps forward, one step back. Like the Asset Store and its way of managing downloaded assets: through Unity, through a single-tab Chromium browser that needs an internet connection just to list the files you downloaded locally through it yesterday.
     
  5. AndersMalmgren

    AndersMalmgren

    Joined:
    Aug 31, 2014
    Posts:
    2,096
  6. Joe-Censored

    Joe-Censored

    Joined:
    Mar 26, 2013
    Posts:
    2,575
    6-year-old PCI-E SSDs are pretty unusual in a developer rig today. I'd retest with a modern SATA 3 SSD before jumping to any conclusions, assuming your old motherboard even supports SATA 3. Your old hardware and suboptimal amount of RAM are probably doing more to hamper Unity's performance than any file structure change Unity could make.
     
  7. Player7

    Player7

    Joined:
    Oct 21, 2015
    Posts:
    956
    Those original i7s and i5s (with 4 cores; how Intel got away with selling customers just 2 logical cores post-2010 is some grade-B nonsense) are still pretty good if you overclock them to 3.8 GHz or more, easily done on air with a good heatsink. Heck, they've practically outlived every other meaningless update Intel made to those CPUs in the last 8 years. Cheaper than playing the Intel game of upgrading a motherboard just to get some pitiful improvement (or a regression in some cases).

    It's only now that AMD is kicking their asses that they've finally realized they have to start adding more cores at actually reasonable prices.
     
    bobisgod234 and AndersMalmgren like this.
  8. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    11,537
    All of those benchmarks are questionable, and not just because they were run on positively ancient hardware. At least one of them was run on an OS (Windows XP) that is known to lack features out of the box (example linked below) that are vital to the health and performance of an SSD.

    https://en.wikipedia.org/wiki/Trim_(computing)
     
  9. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    6,271

    [Image: drive benchmark graph]

    Note the peak around 64-128 KB, then look at this CPU bandwidth comparison chart...

    [Image: CPU/memory bandwidth comparison chart]

    There is a similar memory/CPU bandwidth sweet spot around the 64-128 KB block sizes, so not only could this approach enhance file I/O, it could also potentially improve memory I/O performance.

    Please try the ATTO Drive Benchmarking tool or similar on your drives; I think you will find similar trends in bandwidth and performance.

    Even old HDDs have much higher bandwidth when files are read/written in larger chunks.
     
    Last edited: Jun 14, 2018
  10. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    10,595
    There's also the issue that, sure, you can read data at a faster average speed if you read more of it at once, but that's useless unless you actually have larger amounts of data that need to be read at once.

    Just arbitrarily clumping bits of data together so that the clump can be read at a faster average speed won't speed up applications. Instead, applications will have to read more data just to pull out the smaller bits they need, and have some system to keep track of what bits of data are stored in which aggregate blob.

    And, again, this would likely be murder on version control.
     
  11. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    1,893
    While it is true that sequentially reading one large file is usually very fast, that is not always the right answer. Frequently rewriting a giant file will be tedious and performance robbing. If Unity did store everything in one massive file, each small change in the project would trigger a lot of maintenance on that large file. It would also prevent versioning systems from working properly.

    As for hardware, I would prefer to see more optimizations for modern hardware instead of optimizations for very outdated hardware. If you are still using an Intel i7-920 with 6GB of RAM, it would be far easier for you to simply get modern PC hardware instead of asking Unity to rewrite their storage. A 10 year old CPU will never be ideal in a dev rig.

    For a Unity development rig, I would recommend 32GB of RAM and a modern CPU. For example, look at an Intel i7-8700K with 32GB of RAM and a modern SSD like a Samsung 860 or 960. For the OS, use Windows 10. One of your screenshots looks like Windows XP. Don't use Windows XP anymore.

    Also, consider using modern SSD drives. The OCZ Technology 120 GB RevoDrive should be able to deliver 65,000 IOPS, and the SandForce SF1200 should be able to deliver up to 30,000 IOPS. The Samsung 960 Pro can deliver 330,000 to 440,000 IOPS. (depending on the drive size) Even the Samsung 860 Pro can deliver 100,000 IOPS. And remember that old SSD drives are probably not performing optimally anymore, especially if you are using them with Windows XP.
     
    Last edited: Jun 15, 2018
    angrypenguin likes this.
  12. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    1,893
    I opened one of my Unity projects in Unity 2018. With that project open and this browser open, I am using a total of 7.4GB of RAM on Windows 10. If the OP only has 6GB of RAM, then much of the storage performance problem would be thrashing caused by a lack of RAM. At a minimum, I would suggest using 8GB of RAM instead of 6GB.
     
  13. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    6,271
    The graphs provided were for illustration purposes and were not examples of my current hardware.

    Please run ATTO or a similar benchmark on your rigs and see what bandwidth you get at different chunk/buffer sizes.

    I'm betting you all see similar results as I'm guessing the IO bus is a limiting factor in maximum throughput for hard drives and SSDs.

    For reference, here are my ATTO results for a RAM drive (RD):

    [Image: ATTO benchmark results for a RAM drive]

    Even with the most up-to-date SSD/CPU/motherboard you will probably struggle to get these kinds of speeds.

    And again we see a peak in throughput around the 64-128 KB buffer sizes.

    Even on a RAM drive, 1 KB reads and writes only reach about 120-130 MB/s, versus 5.5-6.9 GB/s at 64 KB.

    Maybe different hardware/OSes (e.g. Mac) will have different best buffer/cache sizes, but it looks like there is definitely a sweet spot that Unity's I/O processes could take better advantage of.

    And imagine if not only the Unity editor but also the Unity engine and your games got this boost in I/O throughput. OK, different hardware platforms might have different sweet spots, but if Unity tuned its I/O functions to take advantage of them, we could see improvements both in Unity and in our games.
     
    Last edited: Jun 15, 2018
  14. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    10,595
    So what? If you only* need to read a few hundred bytes of data then putting it in a file with megabytes of other data will not make it "faster". The average read speed may well increase, but you're still forcing far more work to achieve the required result.

    It's kind of like saying that shopping is more efficient when you buy lots of stuff at once, so from now on you're only going to buy milk by the pallet.

    * On the flip side, if you are in a situation where you know in advance that you'll need to load a bunch of related data then this approach can actually work really well. I think this used to be a common optimisation in game builds, packaging all required assets in a combined scene/level file to be loaded in one shot as a direct sequential copy, reducing load time at the expense of some extra storage. But builds behave very differently to an Editor...
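    The packaging approach described above, many small assets concatenated into one blob with an offset table so the whole set can be loaded in one sequential read, can be sketched like this. This is a hypothetical toy format for illustration, not Unity's actual asset bundle layout.

```python
import json
import struct

def pack(blobs):
    """Concatenate named byte blobs into one archive:
    [4-byte index length][JSON index of name -> (offset, length)][payload]."""
    index, payload, offset = {}, bytearray(), 0
    for name, data in blobs.items():
        index[name] = (offset, len(data))
        payload += data
        offset += len(data)
    header = json.dumps(index).encode()
    return struct.pack("<I", len(header)) + header + bytes(payload)

def unpack_one(archive, name):
    """Pull one named blob back out of the archive."""
    header_len = struct.unpack_from("<I", archive)[0]
    index = json.loads(archive[4:4 + header_len])
    offset, length = index[name]
    start = 4 + header_len + offset
    return archive[start:start + length]

archive = pack({"mesh.bin": b"\x01" * 32, "tex.bin": b"\x02" * 64})
assert unpack_one(archive, "tex.bin") == b"\x02" * 64
```

    The trade-off is exactly the one raised in this thread: a build can stream the whole archive in one sequential read, but changing a single asset means rewriting the blob and its index, which is what hurts incremental editor workflows and version control.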
     
    Last edited: Jun 15, 2018
  15. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    6,271
    To use your shopping analogy, Unity is currently buying its eggs one at a time, taking each one home and putting it in the fridge before going out to get the next one.

    My solution to the slow egg problem is to buy them by the dozen, and while we're at it, let's get the bread a loaf at a time instead of a slice at a time!

    Please run Unity and use a tool like Process Monitor to watch Unity fetch an egg or a slice of bread at a time (< 64 KB file reads, often < 1 KB) as it loads the files it needs.

    Hint: Process Monitor has filters you can set so you can focus on only what Unity.exe does, and only its file reading and writing activity.

    Hopefully the @UT team will read this and actually test my theory out!
     
  16. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    10,595
    How 'bout you respond to the first paragraph of what you quoted? ;)

    So let's load a bunch of stuff we don't need, that isn't even related, and that, in a computer, potentially makes other things far less efficient?
     
  17. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    1,893
    I definitely understand the impact of chunk sizes during various types of storage benchmarks. Nobody is trying to debate that. We all agree that large files and sequential reads offer great performance for situations where you actually need to sequentially read a large blob of data.

    What I am saying is that switching to a large file creates a different performance penalty for use cases that are not a good fit for that pattern. If you need to modify a small chunk of data in the middle of a large file, there is a high risk of creating a lot of performance penalties during the writes.

    You cannot assume that large files and sequential reads are the ticket for every use case. There is often a need to read and write very small chunks of data. If you stuff all of those small files into one large file, it will help performance in some cases and hurt it in others.
     
  18. Eric5h5

    Eric5h5

    Volunteer Moderator Moderator

    Joined:
    Jul 19, 2006
    Posts:
    31,723
    To clearly answer the topic question: no, plus it messes up version control. If you want faster WebGL builds, get a better computer or fix your current one. Granted the WebGL build process is rather Rube Goldbergian, but your computer is seriously borked if it's taking an hour to make a build of a cube; mine takes a little under 2 minutes, and by now it's a few generations away from being high-end.

    Physical cores. Logical cores are "fake" cores, achieved with SMT.

    --Eric
     
    Kiwasi, Ryiah and zombiegorilla like this.
  19. zombiegorilla

    zombiegorilla

    Moderator

    Joined:
    May 8, 2012
    Posts:
    6,977
    FTW.

    That is a critical element of good (and effective) engineering. Certainly any discrete process can be done in an optimal way, but that way may not have a positive impact once it's part of the larger system. Micro-optimizations in a large, complex system are a fool's game. Clearly things like build times and VCS take a hit in this case. It's an overall loss.
     
  20. Dustin-Horne

    Dustin-Horne

    Joined:
    Apr 4, 2013
    Posts:
    4,523
    Not to mention that you also destroy the ability to do async reads and writes. It's trivial with several smaller files, but once you have a gigantic file, one thread writing to that file destroys the integrity of another thread reading from it, since byte offsets change.

    Speaking of offsets, a good example of another issue can be seen in common RDBMS systems. Indexes on tables create tremendous read performance benefits, but they have a cost: since they're often ordered, if data needs to be inserted in the middle or even at the beginning of an index, the index has to be rebuilt from that point forward. That's why ordered indexes are bad on GUID columns; the randomness means a GUID may need to be placed anywhere within the index, which makes insert (write) times suffer.
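    The index cost described above is easy to demonstrate with a toy ordered index: inserting randomly ordered keys (GUID-like) forces repeated mid-array shifts, while sequential keys (auto-increment-like) always append at the end. A rough sketch, not a real RDBMS index:

```python
import bisect
import random
import time

def timed_inserts(keys):
    """Insert keys one at a time into a sorted list, like a naive ordered index."""
    index = []
    start = time.perf_counter()
    for k in keys:
        bisect.insort(index, k)  # O(n) element shift when k lands mid-array
    return time.perf_counter() - start

n = 50_000
sequential = list(range(n))            # auto-increment style: always appends
shuffled = random.sample(range(n), n)  # GUID style: lands anywhere in the index

print(f"sequential keys: {timed_inserts(sequential):.3f}s")
print(f"random keys:     {timed_inserts(shuffled):.3f}s")
```

    The random-order run should be dramatically slower, for the same reason random-key inserts hurt a clustered database index.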
     
    angrypenguin likes this.
  21. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    1,893
    Exactly. You would need to practically build another file system on top of it to handle all of the IO edge cases properly.
     
    Dustin-Horne likes this.