OpenCV and SolvePNP problem

Discussion in 'AR/VR (XR) Discussion' started by aapo_p, Aug 6, 2020.

  1. aapo_p

    aapo_p

    Joined:
    Jan 15, 2014
    Posts:
    3
    Hey everyone, I'm having a bit of a 'not quite understanding how this works' situation here.

I'm supposed to build a simple app where you first take a picture of your wall with the camera (or add one from your library), and then mark four points that correspond to the wall's dimensions (for the sake of simplicity, let's say you mark all four corners of the wall). From these four points I would calculate the camera pose relative to the wall (I will also ask the user to input the width and height of the marked area, in this example the whole wall). At least currently I'm forcing the left and right edges of the selection to be completely vertical.

After this is done, I take the given information, feed it into OpenCV's SolvePnP() function (currently I use the free OpenCV plus Unity plugin) and use the output to rotate and translate a plane mesh or the camera in the world, so that the camera pose is correct.

    So currently I do all this, looking something like this:

First I take the world positions of a plane's vertices and the screen positions of the points the user has given, where op is objectPoints and ip is imagePoints.

    Code (CSharp):
    MatOfPoint3f op = new MatOfPoint3f();
    // Cache the vertex array: the mesh.vertices getter copies it on every access.
    Vector3[] verts = m_meshObject.mesh.vertices;
    for (int i = 0; i < verts.Length; i++) {
        op.Add(new Point3f(verts[i].x, verts[i].y, verts[i].z));
    }

    MatOfPoint2f ip = new MatOfPoint2f();
    for (int i = 0; i < guideLinePoints.Count; i++) {
        Vector2 screenPos = m_camera.WorldToScreenPoint(guideLinePoints[i]);
        // OpenCV expects (x, y) with y growing downwards, while Unity's screen
        // space has y growing upwards: flip y rather than swapping the axes.
        ip.Add(new Point2f(screenPos.x, Screen.height - screenPos.y));
    }
    I also build a camera matrix based on some examples I've found online, and I ignore the distortion coefficients. These two don't have to be exact, since the result doesn't need to be perfect like in an AR app.
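    (For reference, such a camera matrix can also be sketched from the Unity camera's field of view instead of copied from an example. This is only a sketch, assuming the OpenCvSharp-style API that OpenCV plus Unity exposes, i.e. Mat.Set<T> and a Scalar-initialized Mat constructor; camMat and distCoeffs then plug into SolvePnP as below.)

    Code (CSharp):
    ```csharp
    // Approximate pinhole intrinsics from the Unity camera:
    // fx = fy = 0.5 * imageHeight / tan(0.5 * verticalFov),
    // principal point at the image centre (square pixels assumed).
    double fovY = m_camera.fieldOfView * Mathf.Deg2Rad;
    double fy = 0.5 * Screen.height / System.Math.Tan(0.5 * fovY);
    double fx = fy;
    double cx = Screen.width * 0.5;
    double cy = Screen.height * 0.5;

    Mat camMat = new Mat(3, 3, MatType.CV_64FC1, Scalar.All(0));
    camMat.Set<double>(0, 0, fx);
    camMat.Set<double>(0, 2, cx);
    camMat.Set<double>(1, 1, fy);
    camMat.Set<double>(1, 2, cy);
    camMat.Set<double>(2, 2, 1.0);

    // Ignoring lens distortion: pass zeros, not an uninitialized Mat.
    Mat distCoeffs = new Mat(4, 1, MatType.CV_64FC1, Scalar.All(0));
    ```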

    So with these I run SolvePnP(). After this, I do a few conversions.
    Code (CSharp):
    Mat Rvec = new Mat();
    Mat Tvec = new Mat();
    Mat rvec = new Mat(1, 3, MatType.CV_64FC1);
    Mat tvec = new Mat(1, 3, MatType.CV_64FC1);

    Cv2.SolvePnP(op, ip, camMat, distCoeffs, rvec, tvec);

    rvec.ConvertTo(Rvec, MatType.CV_32F);
    tvec.ConvertTo(Tvec, MatType.CV_32F);

    // Expand the Rodrigues rotation vector into a full 3x3 rotation matrix.
    Mat rotMat = new Mat(3, 3, MatType.CV_64FC1);
    Cv2.Rodrigues(Rvec, rotMat);
    After this I don't really know what to do anymore. With these I should be able to rotate and translate my camera, but I don't know how. I've looked at all sorts of examples and tutorials, but most of them are for OpenCV in Python or C++, and those don't help with the Unity transform side of things.
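    (For reference, one common recipe for this step, as a sketch rather than a verified fix: solvePnP returns the object-to-camera transform, so the camera's pose in object space is its inverse, and OpenCV's right-handed, y-down camera convention has to be mapped into Unity's left-handed, y-up one. This assumes the rotMat and Tvec from the snippet above, both CV_32F after the ConvertTo calls, and OpenCvSharp-style At<T> element access.)

    Code (CSharp):
    ```csharp
    // solvePnP gives R, t with  p_cam = R * p_obj + t  (object -> camera).
    // Pack them into a Unity 4x4 matrix first.
    Matrix4x4 m = Matrix4x4.identity;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            m[r, c] = rotMat.At<float>(r, c);
        }
        m[r, 3] = Tvec.At<float>(r);  // single-index access dodges 1x3 vs 3x1
    }

    // Change of basis between OpenCV (y down) and Unity (y up):
    // conjugate by S = diag(1, -1, 1).
    Matrix4x4 S = Matrix4x4.Scale(new Vector3(1, -1, 1));
    Matrix4x4 objToCam = S * m * S;

    // Invert to get the camera's pose in the object's (wall's) space.
    Matrix4x4 camPose = objToCam.inverse;

    m_camera.transform.position = camPose.GetColumn(3);
    m_camera.transform.rotation = Quaternion.LookRotation(
        camPose.GetColumn(2),   // forward (local z axis)
        camPose.GetColumn(1));  // up (local y axis)
    ```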

    So with what I have now, screen positions like these:

    solvepnp_screen_points.JPG

    I get values like these:
    I have no idea whether these even look like the values I should be getting, but I'm adding them in case they're of any help.


    TL;DR: I'm trying to use OpenCV's SolvePnP() to get the camera pose from a picture of a wall, with four manually added points indicating the corners of the wall in the picture. I get values out of SolvePnP(), but I'm at a loss on how to convert those matrices into Unity's transform.position and transform.rotation.


    If anyone could help me it would be super great, I'm losing my mind here...
     
  2. phil-R

    phil-R

    Joined:
    Nov 20, 2020
    Posts:
    9
  3. aapo_p

    aapo_p

    Joined:
    Jan 15, 2014
    Posts:
    3
    Hey, I never solved the problem; I seemed to be pretty close, but never got there. I ended up doing things much more simply, without using OpenCV at all. I'm also not working on the project anymore, since it was completed and released, so I most likely won't be going back to this problem either (at least for a while).

    The biggest problem I had (on top of not getting stuff to actually work, d'uh) was that at no point was I quite sure whether I was doing things correctly. It's hard to know whether the preparation is wrong or the handling of the output from SolvePnP. This was also so long ago that I've forgotten most of the details that aren't in the OP.

    That OpenCV for Unity utility might actually have worked for me, by the looks of it. I was working with the free version though, which is just a C# conversion of OpenCV, so it didn't have any Unity utilities in it.

    I'm sorry I can't be of help to you, hope you get your things working!
     
  4. phil-R

    phil-R

    Joined:
    Nov 20, 2020
    Posts:
    9
    No worries, and thank you!

    My problem is actually very similar to the original app you posted. I have a plane of known dimensions in camera space, and I have the user move four corners in screen space to match the plane. I then need to roughly align a Unity camera to this pose. I'd definitely be curious to hear about the simpler method you mentioned. Would you be open to a quick DM?
     
  5. bfig

    bfig

    Joined:
    Jul 18, 2019
    Posts:
    2
    To get the intrinsic parameters of the camera, you have some options. You can get the information from the camera through Unity using XRCameraIntrinsics (if you're using an XR device like a mobile phone). I looked at the OpenCV solvePnP documentation and filled out a 3x3 matrix with the necessary info, which I could get from the XR camera manager.
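    (A sketch of that, assuming AR Foundation's ARCameraManager.TryGetIntrinsics and an OpenCVForUnity-style Mat with a put(row, col, values...) method; the helper name is made up for illustration.)

    Code (CSharp):
    ```csharp
    using UnityEngine.XR.ARFoundation;
    using UnityEngine.XR.ARSubsystems;
    using OpenCVForUnity.CoreModule;

    public static Mat IntrinsicsToMat(ARCameraManager cameraManager)
    {
        if (!cameraManager.TryGetIntrinsics(out XRCameraIntrinsics k))
            return null; // not available yet, e.g. the AR session is still starting

        // focalLength = (fx, fy) and principalPoint = (cx, cy), both in pixels.
        Mat cameraMatrix = new Mat(3, 3, CvType.CV_64FC1);
        cameraMatrix.put(0, 0,
            k.focalLength.x, 0, k.principalPoint.x,
            0, k.focalLength.y, k.principalPoint.y,
            0, 0, 1);
        return cameraMatrix;
    }
    ```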

    Or you can calibrate the camera manually. There's lots of information available online about that. You can use OpenCV to calibrate it, or MATLAB has a Camera Calibration app that does it for you. This method will give you very precise intrinsic parameters, and even computes the distortion.
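    (An offline calibration sketch, assuming an OpenCvSharp-style binding, as in OpenCV plus Unity; the exact overloads vary between versions, and imagePaths and the 9x6 inner-corner board size are illustrative assumptions.)

    Code (CSharp):
    ```csharp
    var objectPoints = new List<IEnumerable<Point3f>>();
    var imagePoints = new List<IEnumerable<Point2f>>();
    Size boardSize = new Size(9, 6);  // inner corners of the chessboard

    // Ideal corner grid in board space (z = 0 plane), one square = 1 unit.
    var board = new List<Point3f>();
    for (int y = 0; y < boardSize.Height; y++)
        for (int x = 0; x < boardSize.Width; x++)
            board.Add(new Point3f(x, y, 0));

    Size imageSize = new Size();
    foreach (string path in imagePaths) {
        using (Mat gray = Cv2.ImRead(path, ImreadModes.Grayscale)) {
            imageSize = gray.Size();
            if (Cv2.FindChessboardCorners(gray, boardSize, out Point2f[] corners)) {
                objectPoints.Add(board);
                imagePoints.Add(corners);
            }
        }
    }

    var cameraMatrix = new double[3, 3];
    var distCoeffs = new double[5];
    double rms = Cv2.CalibrateCamera(
        objectPoints, imagePoints, imageSize,
        cameraMatrix, distCoeffs, out Vec3d[] rvecs, out Vec3d[] tvecs);
    // rms is the reprojection error in pixels; well under a pixel means a good fit.
    ```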

    I've used both, and I get very similar results.
     
  6. triduzak

    triduzak

    Joined:
    Jan 24, 2021
    Posts:
    3
    Hey, did you resolve the problem? :)
     
  7. mattycorbett

    mattycorbett

    Joined:
    May 9, 2021
    Posts:
    13
    Did anyone get anywhere with this? I'm trying to estimate head pose from MediaPipe's face landmarks. I'm using the code below.

    Code (CSharp):
    // using OpenCVForUnity.CoreModule; (Mat, CvType, Point, Point3, MatOfDouble, ...)
    var camera_matrix = new Mat(3, 3, CvType.CV_64FC1);
    Mat rvec = new Mat(1, 3, CvType.CV_64FC1);
    Mat tvec = new Mat(1, 3, CvType.CV_64FC1);

    // Intrinsics laid out as fx, 0, cx / 0, fy, cy / 0, 0, 1. (The original
    // filled this row-by-row with mismatched values and reused intrinsics as
    // distortion coefficients, which alone is enough to produce erratic poses.)
    camera_matrix.put(0, 0,
        600, 0, 400,
        0, 600, 300,
        0, 0, 1);
    // Unknown lens distortion: pass zeros, not a copy of the intrinsics.
    var dist_coeffs = new MatOfDouble(0.0, 0.0, 0.0, 0.0);

    // Image points: the six MediaPipe face landmarks used for head pose.
    int[] ids = { 1, 52, 226, 446, 57, 287 };
    var imagePoints = new Point[ids.Length];
    for (int i = 0; i < ids.Length; i++) {
        var lm = landmarks.Landmark[ids[i]];
        imagePoints[i] = new Point(lm.X, lm.Y);
    }
    var image_points = new MatOfPoint2f(imagePoints);

    // Object points: cache the vertex array instead of re-reading it per access.
    Vector3[] verts = rearFaceCube.GetComponent<MeshFilter>().mesh.vertices;
    var objectPoints = new Point3[ids.Length];
    for (int i = 0; i < ids.Length; i++) {
        objectPoints[i] = new Point3(verts[i].x, verts[i].y, verts[i].z);
    }
    var object_points = new MatOfPoint3f(objectPoints);

    Calib3d.solvePnP(object_points, image_points, camera_matrix, dist_coeffs, rvec, tvec);

    // Convert to Unity pose data.
    double[] rvecArr = new double[3];
    rvec.get(0, 0, rvecArr);
    double[] tvecArr = new double[3];
    tvec.get(0, 0, tvecArr);
    PoseData poseData = ARUtils.ConvertRvecTvecToPoseData(rvecArr, tvecArr);

    var outQuat = poseData.rot;
    It runs just fine; however, the output rotations are very wrong. They are erratic, and almost never in the correct direction. Any ideas?