Fixed camera setup for object localisation and measurement

A look at fixed camera setup for object localisation and measurement


A common task in Computer Vision is to use a camera to localise and measure certain objects in the scene. In industry it is common to take images of objects on a high contrast background and use Computer Vision algorithms to extract useful information.

There's a lot of literature about the computer vision algorithms we can use to extract the information, but something that's usually neglected is how to correctly set up the camera in order to properly address the problem. This post aims to shed light on this subject.

The problem

The problem we aim to solve with Computer Vision is to measure (in mm) objects of unknown shape, but with known thickness $T_o$ and max height $H_o$ and width $W_o$ values, while satisfying the constraint on the required minimum accuracy / error tolerance.

The camera setup for this kind of problem consists of:

  • Finding the correct working distance (the distance between the object surface and the lens)
  • Choosing the right focal length.

In the following I'm going to show a possible 3-step approach that can be used to correctly set up the camera.

Step 1: camera calibration & px/mm ratio calculation

Without going into the details of camera calibration, all we need to know is that the calibration process allows us to represent the camera intrinsic parameters as a $3 \times 3$ matrix. What the calibration does is estimate the parameters of a pinhole camera model that approximates the camera that produced the set of photos given as input to the process.

$$A = \begin{pmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}$$

where $f_x$ and $f_y$ are the focal distances in px and $(c_x, c_y)$ is the optical center in px.

In case of square pixels $f_x$ and $f_y$ are equal, but in general we can consider $f_x \approx f_y$ and use a single focal length in px

$$f_{xy} = \frac{f_x + f_y}{2} \quad [px]$$

The theory of camera resectioning gives us the relation between the estimated focal lengths (in px) and the real focal length (in mm).

$$f_x = m_x \cdot f \quad , \quad f_y = m_y \cdot f$$

Since we're using $f_{xy}$, we can collapse these into a single equation

$$f_{xy} = m \cdot f$$

In short, the estimated focal length in pixels is the real focal length $f$ (mm) times a scaling factor $m$ (px/mm).

$$m = \frac{f_{xy}}{f} \quad \left[\frac{px}{mm}\right]$$

This scaling factor is extremely important, because it measures the number of pixels in a millimeter of sensor.
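
To make Step 1 concrete, here is a minimal sketch using OpenCV's chessboard calibration. The pattern size, square size, image folder and the real focal length `F_MM` (taken from the lens datasheet) are assumptions for the sake of the example, not values from this article:

```python
import glob

import cv2
import numpy as np

# Assumptions (not from the article): a 9x6 inner-corner chessboard with
# 25 mm squares, calibration shots in ./calibration, and a 16 mm lens.
PATTERN = (9, 6)
SQUARE_MM = 25.0
F_MM = 16.0  # real focal length f (mm), from the lens datasheet

# 3D coordinates of the chessboard corners in the board reference frame.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, img_points, gray = [], [], None
for path in glob.glob("calibration/*.jpg"):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# A is the 3x3 intrinsic matrix described above (fx, fy, cx, cy in px).
rms, A, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None
)

f_xy = (A[0, 0] + A[1, 1]) / 2.0  # single focal length in px
m = f_xy / F_MM                   # scaling factor m in px/mm (pixels per mm of sensor)
print(f"f_xy = {f_xy:.1f} px, m = {m:.2f} px/mm")
```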

Step 2: relationship between distance, object on sensor and object in scene

There's a relation between the size of an object in the scene and the size of the object on the image plane. This relation comes from the thin lens equation.

Given $X$ the real size of the object (mm) and $x$ the size of the object in pixels, we know that

$$\text{WD} = \frac{X \cdot f}{\frac{x}{m}} \quad \left[\frac{mm^2}{\frac{px}{\frac{px}{mm}}} = mm\right]$$

In English it can be read as "the working distance in millimeters is the object's real size in millimeters times the focal length in millimeters, divided by the object size on the sensor in millimeters".

Hence it's pretty easy to measure the size of the object in millimeters, when every other variable is known:

$$X = \frac{\text{WD} \cdot \frac{x}{m}}{f}$$
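
These two formulas translate directly into a couple of helpers. The function names below are made up for illustration; `f_mm` and `m_px_per_mm` are the focal length and the scaling factor from Step 1:

```python
def working_distance_mm(X_mm, x_px, f_mm, m_px_per_mm):
    """WD = X * f / (x / m): distance at which an object of real size X_mm
    spans x_px pixels on the image."""
    return (X_mm * f_mm) / (x_px / m_px_per_mm)


def object_size_mm(wd_mm, x_px, f_mm, m_px_per_mm):
    """X = WD * (x / m) / f: real size of an object spanning x_px pixels,
    observed from a working distance of wd_mm."""
    return (wd_mm * (x_px / m_px_per_mm)) / f_mm
```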

Step 3: satisfy constraints

There are 2 constraints that have to be satisfied when designing an object measurement system:

  1. Being able to measure the whole object
  2. Minimum accuracy

Step 3.1: FOV constraint

The constraint on the ability to measure the whole object can be satisfied by analyzing the Field of View (FOV) of the camera.

Let $M_o = \max(W_o, H_o) + \delta$, where $\delta$ is a "safety margin" used to compensate for the camera calibration distortion removal and the need for some background around the object (usual values for $\delta$ are in the range $[50, 100]$ mm). Let $h$ and $w$ be the height and width of the sensor respectively (these values are available on the camera datasheet), then

$$\alpha_H = FOV_H = 2\,\text{atan} \frac{h}{2f} \quad ,\quad \alpha_W = FOV_W = 2\,\text{atan} \frac{w}{2f}$$

Since the object can be in any possible orientation, we consider only the smaller FOV when finding the right distance for the camera (because it is the most constraining one):

$$\alpha = \min(\alpha_H, \alpha_W)$$

It's obvious that $\alpha$ is the angle (in radians) between the working distance $\text{WD}$ and the "last ray of light" (in the sense of farther from the center) captured by the sensor. It's also clear that the length of this ray of light changes according to the working distance.

The following image will make everything clear:

(Figure: fov.png)

On the $y$ axis the position of $M_o$ is highlighted, because we have to find the distance $d$ that makes the whole object (and the safety margin) visible. Hence:

$$\begin{cases} \text{WD} = L \cos \alpha \\ M_o = L \sin \alpha \end{cases} \quad \Rightarrow \quad \text{WD} = \frac{M_o}{\tan \alpha}$$

This means that our working distance (denoted as $d$ in the picture) can be found exactly.

Please note that we're creating an object measurement application, hence we can exploit other information about the object in order to improve the precision. In fact, if we know in advance the set of thicknesses (in mm) $T = \left\{T_1, T_2, \dots, T_n\right\}$ that our objects could have, we can place our camera at a smaller distance and hence increase the accuracy (see next section).

In practice, the real working distance (the one we're really interested in) can be found as:

$$\text{WD}_r = \text{WD} - \min\{T\} - \text{offset}$$

The offset is an optional term, usually found on the camera datasheet, that accounts for the relative position of the sensor with respect to the measurement point (usually in the order of $[0, 5]$ mm).
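
Putting Step 3.1 together, a possible sketch of the computation is shown below. It follows the derivation above (which, as the warning that follows explains, only gives a rough estimate), and the sensor size, focal length and object dimensions in the example are made-up values:

```python
import math


def required_working_distance_mm(W_o, H_o, delta, sensor_w_mm, sensor_h_mm,
                                 f_mm, thicknesses_mm=(0.0,), offset_mm=0.0):
    """Theoretical working distance WD_r, following the derivation above."""
    M_o = max(W_o, H_o) + delta                      # object size + safety margin
    fov_h = 2 * math.atan(sensor_h_mm / (2 * f_mm))  # vertical FOV (rad)
    fov_w = 2 * math.atan(sensor_w_mm / (2 * f_mm))  # horizontal FOV (rad)
    alpha = min(fov_h, fov_w)                        # most constraining FOV
    wd = M_o / math.tan(alpha)                       # WD = M_o / tan(alpha)
    return wd - min(thicknesses_mm) - offset_mm      # WD_r = WD - min(T) - offset


# Example: 200x150 mm objects, 80 mm margin, a ~7.2x5.4 mm sensor,
# a 16 mm lens, thicknesses {10, 20} mm and a 2 mm sensor offset.
print(required_working_distance_mm(200, 150, 80, 7.2, 5.4, 16, (10, 20), 2))
```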

WARNING: The working distance computed this way is a theoretical estimate of the real working distance, since the camera model we're using is the pinhole model, hence we're using the thin lens equation as the foundation of our reasoning. In practice, the working distance to use in a real-world application must be computed via a software solution (exploiting the information about the size of a known object and the measured object size in pixels), since the thin lens equation can't model a complex lens system in a precise way. Hence, you can use the content of this article to get a rough estimate of the working distance in order to properly set up the camera physically.

Step 3.2: minimum accuracy constraint

The constraint on the accuracy can be formalised as follows:

$$\frac{\#px}{\Delta} \geq 1$$

where $\Delta$ is the required accuracy and the 1 represents a lower bound (we can't have less than one pixel for the specified tolerance). In English: the number of image pixels per $\Delta$ millimeters of the scene must be at least 1.

If, for instance, the requirement is to have an accuracy of 3mm, the inequality becomes:

$$\frac{\#px}{3} \geq 1$$

From the relation between the object in the scene and the object on the sensor (now with the real working distance) we can compute the number of pixels per millimeter, in fact

$$X = \frac{\text{WD}_r \cdot \frac{x}{m}}{f} \Leftrightarrow x = \frac{X f m}{\text{WD}_r} = \frac{X f_{xy}}{\text{WD}_r}$$

So now it's extremely easy to calculate the number of pixels per millimeter of the scene and check whether the previous relation holds:

$$\frac{\Delta f m}{\text{WD}_r} \ge 1$$

If the relation holds, we have correctly set up our system (for an additional safety margin you can require more pixels per unit of accuracy, i.e. replace that 1 with something bigger).

If, instead, this relation does not hold, we have to change the moving parts of our system in order to satisfy every requirement:

  1. Check if the thickness of the object you're measuring can help you move the camera closer to the object
  2. Change the focal length (and repeat every calculation, but only after a new calibration!)
  3. Evaluate the usage of more cameras and stitch the images together
  4. Last resort: change the camera(s)

One last tip: the relation $x = \frac{X f m}{\text{WD}_r}$ also allows you to measure the system accuracy (in px/mm), i.e. the number of pixels per single millimeter of the scene: just set $\Delta = 1$ and you're done!
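
To close, the accuracy check and the px/mm measurement can be wrapped in two small helpers; the values in the example are hypothetical and only illustrate the formulas above:

```python
def pixels_per_mm(f_mm, m_px_per_mm, wd_r_mm):
    """System accuracy in px/mm of scene: x = X * f * m / WD_r with X = 1 mm."""
    return f_mm * m_px_per_mm / wd_r_mm


def accuracy_satisfied(delta_mm, f_mm, m_px_per_mm, wd_r_mm, min_px=1.0):
    """Check delta * f * m / WD_r >= min_px (use min_px > 1 for extra margin)."""
    return delta_mm * pixels_per_mm(f_mm, m_px_per_mm, wd_r_mm) >= min_px


# Hypothetical values: f = 16 mm, m = 180 px/mm, WD_r = 794 mm, accuracy 3 mm.
print(pixels_per_mm(16, 180, 794))          # ~3.6 px per mm of scene
print(accuracy_satisfied(3, 16, 180, 794))  # True: ~10.9 px per 3 mm
```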
