Matrix multiplication optimization

Discussion in 'Scripting' started by Nihil688, Apr 11, 2020.

  1. Nihil688

    Joined:
    Mar 12, 2013
    Posts:
    503
    The code below results in 14000 calls, so I am trying to optimize the matrix multiplications. Because the matrices are structs, I lose quite a few milliseconds to:

    String.memcpy
    Buffer.Memcpy
    Buffer.memcpy4


    Code (CSharp):
    SkinnedMeshRenderer renderer = target.GetComponent<SkinnedMeshRenderer>();
    if( renderer != null )
    {
        NeedsScaleAdjust = false;

        Mesh mesh = new Mesh();
        renderer.BakeMesh( mesh );
        mesh.boneWeights = renderer.sharedMesh.boneWeights;
        outMesh = mesh;

        Matrix4x4 scale = Matrix4x4.Scale( _target.transform.localScale ).inverse;
        outBindposes = new Matrix4x4[ renderer.bones.Length ];
        for( int i = 0; i < renderer.bones.Length; i++ )
        {
            outBindposes[ i ] = renderer.bones[ i ].worldToLocalMatrix * target.transform.localToWorldMatrix * scale;
        }

        return;
    }
    I am not sure whether using float4 from Unity Mathematics would help at all, since float4 is also a struct, but any ideas on how to improve this would be great!
     
  2. SpookyCat

    Joined:
    Jan 25, 2010
    Posts:
    3,768
    Cache the matrices first, and if you are doing this every frame, keep a pool of matrices. You can also move target.transform.localToWorldMatrix * scale outside the loop. I have done things like this before: I wrote my own matrix multiplication methods and used multithreading, but I guess Burst etc. would do a good job in Unity now.
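
The two suggestions above (read each value once, and hoist the loop-invariant product) might look roughly like the sketch below. It is not code from the thread: `BindposeUtil` and `ComputeBindposes` are hypothetical names, and it assumes that `target` and `_target` in the original snippet refer to the same object. As far as I know, `renderer.bones` hands back a new array copy on each access, which is why it is read into a local first. Because matrix multiplication is associative, grouping the last two factors should give the same result up to floating-point rounding.

```csharp
using UnityEngine;

// Hypothetical helper, not part of the original post.
public static class BindposeUtil
{
    public static Matrix4x4[] ComputeBindposes( SkinnedMeshRenderer renderer, Transform target )
    {
        // Read the bones array once instead of once per loop iteration.
        Transform[] bones = renderer.bones;

        // Loop-invariant part of the product, computed once instead of per bone.
        Matrix4x4 inverseScale = Matrix4x4.Scale( target.localScale ).inverse;
        Matrix4x4 targetToWorld = target.localToWorldMatrix * inverseScale;

        Matrix4x4[] bindposes = new Matrix4x4[ bones.Length ];
        for( int i = 0; i < bones.Length; i++ )
        {
            // One multiplication per bone instead of two.
            bindposes[ i ] = bones[ i ].worldToLocalMatrix * targetToWorld;
        }

        return bindposes;
    }
}
```

If this runs every frame, the `bindposes` array could also be allocated once and reused rather than created on each call.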
     
  3. Nihil688

    Joined:
    Mar 12, 2013
    Posts:
    503
    @SpookyCat I actually found some of your questions from years ago on the forums and was reading through them :D
     
  4. Nihil688

    Joined:
    Mar 12, 2013
    Posts:
    503
    I was wondering whether I could store the matrices in hash sets, so that if the same three had been multiplied before I could just return the cached result. Is that what you meant by caching the matrices, @SpookyCat?

    PS: My solution is already multithreaded, so it will get very complex very quickly if I also move the matrix work into threads, although it's not a bad idea.
     
  5. Nihil688

    Joined:
    Mar 12, 2013
    Posts:
    503
    This makes it a bit faster, for anyone who might need it. I'm also using Unity Mathematics. I must have a bug somewhere, as the result isn't the same, but it could be in another part of the code.

    Code (CSharp):
    public static float4x4 Multiply( ref float[ , ] matrix1, ref float[ , ] matrix2 )
    {
        // cache the matrix dimensions for better performance
        int matrix1Rows = matrix1.GetLength( 0 );
        int matrix1Cols = matrix1.GetLength( 1 );
        int matrix2Rows = matrix2.GetLength( 0 );
        int matrix2Cols = matrix2.GetLength( 1 );
        // the product is only defined if the inner dimensions match
        if( matrix1Cols != matrix2Rows )
        {
            return default;
        }

        // create the product matrix
        float[ , ] product = new float[ matrix1Rows, matrix2Cols ];
        // loop through matrix 1 rows
        for( int matrix1Row = 0; matrix1Row < matrix1Rows; matrix1Row++ )
        {
            // for each matrix 1 row, loop through matrix 2 columns
            for( int matrix2Col = 0; matrix2Col < matrix2Cols; matrix2Col++ )
            {
                // accumulate the dot product of this row and column in a local
                float sum = 0f;
                for( int matrix1Col = 0; matrix1Col < matrix1Cols; matrix1Col++ )
                {
                    sum += matrix1[ matrix1Row, matrix1Col ] * matrix2[ matrix1Col, matrix2Col ];
                }
                product[ matrix1Row, matrix2Col ] = sum;
            }
        }

        // Convert is defined elsewhere and is not shown in this post
        return Convert( product );
    }
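
For comparison, here is a minimal sketch of the same three-matrix product using Unity Mathematics directly, which avoids allocating a `float[ , ]` on every call. `MatrixChain`, `Mul3` and `ToFloat4x4` are hypothetical names, and `ToFloat4x4` is only a stand-in for the `Convert` helper that the post does not show. One thing that may be worth checking when results differ: `float4x4` stores columns (`c0` to `c3`), its four-`float4` constructor takes columns, and its 16-scalar constructor takes values in row-major order, so a conversion that mixes the two conventions would produce a transposed matrix.

```csharp
using Unity.Mathematics;

// Hypothetical helpers, not part of the original post.
public static class MatrixChain
{
    // Computes a * b * c with fixed-size values: no heap allocation
    // and no multi-dimensional array indexing.
    public static float4x4 Mul3( float4x4 a, float4x4 b, float4x4 c )
    {
        return math.mul( a, math.mul( b, c ) );
    }

    // Stand-in for the unseen Convert: assumes m is a 4x4 array indexed
    // [row, column]. The 16-scalar constructor takes row-major arguments.
    public static float4x4 ToFloat4x4( float[ , ] m )
    {
        return new float4x4(
            m[ 0, 0 ], m[ 0, 1 ], m[ 0, 2 ], m[ 0, 3 ],
            m[ 1, 0 ], m[ 1, 1 ], m[ 1, 2 ], m[ 1, 3 ],
            m[ 2, 0 ], m[ 2, 1 ], m[ 2, 2 ], m[ 2, 3 ],
            m[ 3, 0 ], m[ 3, 1 ], m[ 3, 2 ], m[ 3, 3 ] );
    }
}
```

Inside a Unity project, `Matrix4x4` values such as `worldToLocalMatrix` should convert to `float4x4` implicitly, so they can be passed to `Mul3` without an intermediate array.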