

Geometric analysis of reality-based indoor 3D mapping



Abstract

Vision-based mapping has become an important way to provide geospatial information for vision-based navigation, especially when satellite signals are not available. When it acts as an independent source for navigation, its quality directly affects the quality of navigation. Geometry is one key component that affects the quality of vision-based mapping, including reliability, separability and accuracy. Analysing the geometry gives users a reference for designing and judging a mapping strategy that meets the quality requirement. This paper explores the influence of geometry on accuracy, reliability and separability in reality-based indoor 3D mapping. Firstly, an analytical analysis based on the global redundancy number is conducted. Secondly, the geometric strength between the camera and ground control points (GCPs), quantified by Dilution of Precision (DoP), is analysed under different indoor mapping scenarios. Thirdly, the relationship between two geometric components (overlapping percentage and intersection angle) and two quality measures (reliability and separability) is analysed in a simulation environment. The geometric analysis shows that three images are able to provide enough global redundancy for reality-based 3D mapping. GCPs that cover the image well and a shorter distance between the camera and the object contribute to good geometry. In addition, mapping simulation in the indoor environment based on two selected functional models shows that the number of images is the key factor that influences the Minimum Detectable Bias (MDB) and Minimum Separable Bias (MSB).


Introduction

Vision-based indoor mapping has received increased attention due to the growing demand for indoor navigation, as satellite signals are not strong enough to be tracked in indoor environments (Taylor, 2009; Milford and Wyeth, 2008; Konolige and Agrawal, 2008). Recently, reality-based 3D mapping using a single camera has attracted researchers’ attention for its advantages such as low cost, passive sensing, information richness and high accuracy (Davison et al., 2007).

As a new type of map, reality-based 3D mapping provides an approach for a single camera to georeference the surrounding environment (Li, 2013). A reality-based 3D map is generated as follows. Firstly, the indoor environment is surveyed to obtain the real-world coordinates of GCPs. After overlapping images are captured by a single camera, the GCP coordinates are known in both the world reference frame and the image reference frame. With a calibrated camera, its poses at different times can be initially determined based on the collinearity equation. Then, the keypoints in the overlapping images can be detected, described and matched. With the determined poses of the calibrated camera, the real-world coordinates of the keypoints can be initially obtained. Finally, bundle adjustment using the initial values obtained above is conducted to minimize the re-projection errors. This process can be regarded as “space resection-intersection-bundle adjustment”.

The essence of reality-based 3D mapping is to georeference keypoints using geospatial information transferred from GCPs. Geometry is one key factor that affects mapping quality. This is analogous to satellite navigation, where the geometry is the spatial relationship between satellites and receivers. Geometry in reality-based 3D mapping is related to four aspects: the distribution and number of GCPs, the distance between the camera and GCPs, the relative poses of cameras, which can be quantified by overlapping percentage and intersection angle, and the number of images taken by the camera.

On the other hand, quality control theory in the Global Navigation Satellite System (GNSS) community has been well developed to quantify the magnitude of error propagation, detect possible outliers, and evaluate the performance of outlier detection and separation. More specifically, DoP values quantify the magnitude of error propagation. Outlier statistical tests such as the w-test (Baarda, 1968) can be utilized to detect and exclude outliers. The performance of these tests is indicated by reliability and separability. Reliability quantifies the minimum magnitude of an outlier that can be detected (Baarda, 1968), while separability determines the minimum bias that can be separated for every pair of observations (Wang and Knight, 2012).

It is well known in the GNSS community that better geometry is generally beneficial for navigation quality. However, in reality-based 3D mapping, the relationship between geometry and quality has not been comprehensively analysed. More specifically, a number of practical issues have not been analysed in detail: the relationship between global redundancy and the number of images and keypoints, the appropriate number and distribution of GCPs, the appropriate distance between GCPs and the camera, and the relationship between geometry and reliability as well as separability.

The earliest work on analysing the geometric aspects of photogrammetry can be traced back to Gruen (1978). The accuracy and reliability of a self-calibrating bundle adjustment system for mapping were analysed by statistical tests, which included significance tests for additional parameters, residuals at check points and control points, as well as coordinate differences in deformation measurements. The analysis demonstrated the feasibility and importance of quality control in vision-based mapping systems. Then, Förstner (1987) further evaluated precision, controllability and robustness for planning purposes in vision-based measurement problems, which included template matching, absolute orientation and relative orientation. More recently, Alsadik et al. (2014) put forward two automatic filtering methods, namely an object accuracy based method and a fuzzy function based method, for camera network design considering coverage and accuracy, and demonstrated that the optimal network design was accurate and efficient in completeness and time complexity. Nocerino et al. (2014) evaluated the accuracy of image-based 3D model reconstruction with ground truth data, and showed that convergent images could ensure the accuracy. However, all the aforementioned papers lack a comprehensive analysis of the relationship between geometry and quality, including global redundancy number, DoP, MDB and MSB, in reality-based 3D mapping.

Therefore, as its major contribution, this paper explores the influence of geometry on global redundancy number, DoP, reliability and separability in various reality-based 3D indoor mapping scenarios. The Functional model section gives a brief introduction to functional modelling using the collinearity equation for reality-based 3D mapping. The DoP values in space resection and Reliability and separability sections discuss the concepts of DoP, MDB and MSB. The Analysis and experiments section conducts an empirical analysis covering global redundancy number, DoP values, accuracy, reliability and separability. The Concluding remarks section summarizes the influence of geometry on quality in reality-based 3D indoor mapping and points out the problems that need to be explored further.

Functional model

The fundamental components in reality-based 3D mapping come from the collinearity equation, which can be expressed in the following form:

$$ \begin{array}{l} F_x = x - x_0 = -f\,\dfrac{a_1\left(X-X_c\right)+b_1\left(Y-Y_c\right)+c_1\left(Z-Z_c\right)}{a_3\left(X-X_c\right)+b_3\left(Y-Y_c\right)+c_3\left(Z-Z_c\right)} \\ F_y = y - y_0 = -f\,\dfrac{a_2\left(X-X_c\right)+b_2\left(Y-Y_c\right)+c_2\left(Z-Z_c\right)}{a_3\left(X-X_c\right)+b_3\left(Y-Y_c\right)+c_3\left(Z-Z_c\right)} \end{array} \tag{1} $$

where x and y are the image coordinates, and X, Y and Z are the corresponding coordinates in the world frame. x0, y0 and f are the interior parameters (principal point and focal length). Xc, Yc and Zc are the position of the camera. ai, bi and ci (i = 1, 2, 3) are the elements of the rotation matrix formed from the angles ω, φ and K.
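As a minimal sketch of Eq. (1), the following Python function projects a world point into image coordinates. It assumes that the rotation matrix is supplied directly as its three rows (a1, b1, c1), (a2, b2, c2) and (a3, b3, c3), because the angle convention for ω, φ and K is not specified here. The function name, the argument layout and the sample numbers are illustrative only.

```python
from typing import Sequence, Tuple

Vec3 = Sequence[float]


def collinearity_project(world: Vec3, camera: Vec3, rotation: Sequence[Vec3],
                         f: float, x0: float = 0.0, y0: float = 0.0) -> Tuple[float, float]:
    """Image coordinates (x, y) of a world point according to Eq. (1).

    rotation holds the rows (a1, b1, c1), (a2, b2, c2), (a3, b3, c3).
    """
    dX, dY, dZ = (world[i] - camera[i] for i in range(3))
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = rotation
    denom = a3 * dX + b3 * dY + c3 * dZ
    if denom == 0.0:
        raise ValueError("denominator of Eq. (1) is zero for this point")
    x = x0 - f * (a1 * dX + b1 * dY + c1 * dZ) / denom
    y = y0 - f * (a2 * dX + b2 * dY + c2 * dZ) / denom
    return x, y


# Illustrative call (assumed values): unrotated camera at the origin, f = 50 mm.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
x, y = collinearity_project((1.0, 2.0, -10.0), (0.0, 0.0, 0.0), IDENTITY, f=0.05)
```

Passing the rotation matrix rather than the three angles keeps the sketch independent of any particular rotation order.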

Four elements exist in Eq. (1), namely the objects’ image coordinates, the objects’ corresponding world coordinates, the exterior parameters and the interior parameters. As mentioned, the initial position and orientation of the camera for each mapping image can be obtained by space resection using GCPs. For each GCP, the collinearity equation can be linearized as illustrated in Eqs. (2), (3), (4) and (5), where \( A_{EO} \) and \( \delta_{EO} \) are the design matrix and the correction for the pose of the camera respectively, and \( L_{GCPi} \) is the observation from the GCPs’ image coordinates.

$$ A_{EO}=\left(\begin{array}{cccccc}\dfrac{\partial F_x}{\partial X_c} & \dfrac{\partial F_x}{\partial Y_c} & \dfrac{\partial F_x}{\partial Z_c} & \dfrac{\partial F_x}{\partial \omega} & \dfrac{\partial F_x}{\partial \varphi} & \dfrac{\partial F_x}{\partial \mathrm{K}} \\ \dfrac{\partial F_y}{\partial X_c} & \dfrac{\partial F_y}{\partial Y_c} & \dfrac{\partial F_y}{\partial Z_c} & \dfrac{\partial F_y}{\partial \omega} & \dfrac{\partial F_y}{\partial \varphi} & \dfrac{\partial F_y}{\partial \mathrm{K}} \end{array}\right) \tag{2} $$
$$ \delta_{EO}={\left(dX_c\ \ dY_c\ \ dZ_c\ \ d\omega\ \ d\varphi\ \ d\mathrm{K}\right)}^T \tag{3} $$
$$ L_{GCPi}={\left(-F_x,\ -F_y\right)}^T \tag{4} $$
$$ A_{EO}\,\delta_{EO}=L_{GCPi} \tag{5} $$

Usually one image contains a number of GCPs as well as keypoints that need to be georeferenced. The initial values of the keypoints’ world coordinates are obtained using space intersection with the cameras’ initial poses from space resection. For each keypoint, its corresponding components in bundle adjustment can be constructed as in Eqs. (6), (7) and (8), where \( A_S \) and \( \delta_S \) are the design matrix and the correction for the objects’ world coordinates respectively, and \( L_{KPi} \) is the observation from the keypoints’ image coordinates.

$$ A_S=\left(\begin{array}{ccc}\dfrac{\partial F_x}{\partial X} & \dfrac{\partial F_x}{\partial Y} & \dfrac{\partial F_x}{\partial Z} \\ \dfrac{\partial F_y}{\partial X} & \dfrac{\partial F_y}{\partial Y} & \dfrac{\partial F_y}{\partial Z} \end{array}\right) \tag{6} $$
$$ \delta_S={\left(dX\ \ dY\ \ dZ\right)}^T \tag{7} $$
$$ \left(A_{EO}\ \ A_S\right)\left(\begin{array}{c}\delta_{EO} \\ \delta_S\end{array}\right)=L_{KPi} \tag{8} $$

Finally, as illustrated in Eq. (9) (Case I), Eqs. (5) and (8) for all the images are combined, and then bundle adjustment is conducted to refine the cameras’ poses and the objects’ world coordinates. In this case, the GCPs’ world coordinates are fixed as error-free values.

$$ \left(\begin{array}{cc} A_{EOc} & 0 \\ A_{EOk} & A_{Sk} \end{array}\right)\left(\begin{array}{c}\delta_{EO} \\ \delta_S\end{array}\right)=\left(\begin{array}{c} L_{GCPi} \\ L_{KPi}\end{array}\right) \tag{9} $$

When some GCPs lie in the overlapping areas of multiple images, they can be treated as observed points, with their image coordinates being so-called “pseudo-observations”. Therefore another type of bundle adjustment can be formulated as in Eq. (10) (Case II), where \( L_{KPi} \), \( L_{GCPi} \) and \( L_{GCPw} \) are the observations for the keypoints’ image coordinates, the GCPs’ image coordinates and the GCPs’ world coordinates respectively, and \( \delta_G \) is the correction for the GCPs’ world coordinates. In this case, the GCPs’ world coordinates are estimated in the presence of errors.

$$ \left(\begin{array}{ccc} A_{EOk} & A_{Sk} & 0 \\ A_{EOc} & 0 & A_s \\ 0 & 0 & I \end{array}\right)\left(\begin{array}{c}\delta_{EO} \\ \delta_S \\ \delta_G\end{array}\right)=\left(\begin{array}{c} L_{KPi} \\ L_{GCPi} \\ L_{GCPw}\end{array}\right) \tag{10} $$

To sum up, the observations and unknown parameters in mapping for both cases are summarized in Table 1.

Table 1 Observations and unknowns for reality-based 3D mapping

DoP values in space resection

DoP (Dilution of Precision), also referred to as geometric strength, is the indicator of error propagation from observations to estimated parameters. In space resection for determining the initial values of the camera pose, DoP is the coefficient for error propagation from the image coordinates of GCPs to the camera pose (position and orientation). A lower DoP means stronger geometric strength. DoP can be calculated from the design matrix as illustrated in Eq. (11).

$$ {\left(A^TA\right)}_{EO}^{-1}=\left(\begin{array}{cccccc} D_{X_c}^2 & & & & & \\ & D_{Y_c}^2 & & & & \\ & & D_{Z_c}^2 & & & \\ & & & D_{\omega}^2 & & \\ & & & & D_{\varphi}^2 & \\ & & & & & D_{\mathrm{K}}^2 \end{array}\right) \tag{11} $$

Among them, DoP for position and orientation can be calculated as follows (Li and Wang, 2012):

$$ XDOP=D_{X_c}\quad YDOP=D_{Y_c}\quad ZDOP=D_{Z_c} \tag{12} $$
$$ PDOP=\sqrt{D_{X_c}^2+D_{Y_c}^2+D_{Z_c}^2} \tag{13} $$
$$ \omega DOP=D_{\omega}\quad \varphi DOP=D_{\varphi}\quad \mathrm{K}DOP=D_{\mathrm{K}} \tag{14} $$
$$ ADOP=\sqrt{D_{\omega}^2+D_{\varphi}^2+D_{\mathrm{K}}^2} \tag{15} $$

where PDOP is the DoP value for camera position, and ADOP is the DoP value for camera orientation.
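The step from the diagonal of Eq. (11) to the DoP values of Eqs. (12) to (15) can be sketched in a few lines of Python. The sketch assumes that the six diagonal entries have already been computed and are ordered as Xc, Yc, Zc, ω, φ, K; the function name and the returned dictionary layout are illustrative choices.

```python
import math
from typing import Dict, Sequence


def dop_values(diag: Sequence[float]) -> Dict[str, float]:
    """DoP values from the six diagonal entries of (A^T A)^-1 in Eq. (11).

    diag is assumed to be ordered as Xc, Yc, Zc, omega, phi, kappa.
    """
    if len(diag) != 6 or any(d < 0.0 for d in diag):
        raise ValueError("expected six non-negative diagonal entries")
    names = ("XDOP", "YDOP", "ZDOP", "omegaDOP", "phiDOP", "kappaDOP")
    result = {name: math.sqrt(d) for name, d in zip(names, diag)}
    result["PDOP"] = math.sqrt(sum(diag[:3]))  # Eq. (13): position
    result["ADOP"] = math.sqrt(sum(diag[3:]))  # Eq. (15): orientation
    return result
```

Because PDOP and ADOP are root sums of the diagonal entries, each is at least as large as the largest of its three component DoP values.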

Reliability and separability

As mentioned above, the MDB (minimum detectable bias) is the lower bound of the outliers that can be detected in an observation, which can be expressed according to Baarda (1968) as:

$$ MDB=\frac{\delta_0\,\sigma_i}{\sqrt{r_i}} \tag{16} $$

where \( \delta_0 \) is the non-centrality parameter defined by the Type I and II errors, \( \sigma_i \) is the prior standard deviation of the observation, and \( r_i \) is the redundancy number of the ith observation, which is given by the ith diagonal element of the matrix in Eq. (17).

$$ R=I-A{\left(A^TPA\right)}^{-1}A^TP \tag{17} $$
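Eq. (16) is simple enough to evaluate directly. The sketch below assumes a default non-centrality parameter of 4.13, the value commonly associated with a 0.1 % significance level and 80 % power; this default and the function name are assumptions made for illustration, not settings taken from the experiments reported here.

```python
import math


def mdb(r_i: float, sigma_i: float, delta0: float = 4.13) -> float:
    """Minimum detectable bias of Eq. (16).

    r_i     : redundancy number of the observation, 0 < r_i <= 1
    sigma_i : prior standard deviation of the observation
    delta0  : non-centrality parameter (assumed default, see lead-in)
    """
    if not 0.0 < r_i <= 1.0:
        raise ValueError("redundancy number must lie in (0, 1]")
    return delta0 * sigma_i / math.sqrt(r_i)


# With sigma_i = 1 pixel, halving r_i from 0.5 to 0.25 raises the MDB by sqrt(2).
ratio = mdb(0.25, 1.0) / mdb(0.5, 1.0)
```

The inverse square-root dependence on the redundancy number is why observations with very small redundancy can have MDBs that are orders of magnitude larger than the others.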

Further, the MDB is related to a variety of factors including the stochastic model, the functional model, the geometry and the testing parameters (Salzmann, 1991). Nevertheless, it does not depend on the actual measurements, so it can be analysed in the planning stage of mapping.

In Eq. (16), \( \delta_0 \) is a function of the Type I and II errors, which depend on the user’s predefined probabilities, and \( \sigma_i \) is the precision of the observation, which needs to reflect the practical situation. \( r_i \) can be analysed by users as it is related to the mapping configuration. A higher redundancy number leads to a lower MDB if \( \delta_0 \) and \( \sigma_i \) are unchanged. According to Förstner (1985), the redundancy number can be rated as in Eq. (18).

$$ r_i:\ \left\{\begin{array}{ll} r_i>0.5 & \mathrm{Good} \\ 0.1\le r_i\le 0.5 & \mathrm{Acceptable} \\ 0.04<r_i<0.1 & \mathrm{Bad} \\ r_i\le 0.04 & \mathrm{Not\ acceptable} \end{array}\right. \tag{18} $$
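The rating in Eq. (18) maps directly onto a small lookup function. The Python sketch below uses an illustrative function name and lower-case labels; the thresholds are those of Eq. (18).

```python
def rate_redundancy(r_i: float) -> str:
    """Rating of a redundancy number according to Eq. (18)."""
    if not 0.0 <= r_i <= 1.0:
        raise ValueError("redundancy number must lie in [0, 1]")
    if r_i > 0.5:
        return "good"
    if r_i >= 0.1:
        return "acceptable"
    if r_i > 0.04:
        return "bad"
    return "not acceptable"
```

The boundaries follow the inequalities as printed: 0.5 and 0.1 are rated acceptable, while 0.04 is rated not acceptable.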

There are correlations between every two outlier detection statistics, indicating that the outlier statistic of one observation is affected by that of another. Therefore, to eliminate the influence of outliers, they should not only be detected but also be separated. This introduces the minimum separable bias (MSB), which quantifies the minimum bias that can be separated for every two observations according to Wang and Knight (2012), as shown in Eq. (19).

$$ MSB_{ij}=\frac{\delta_s\,\sigma_0\sqrt{2}}{\sqrt{e_i^TPQ_vPe_i\left(1-\left|\rho_{ij}\right|\right)}} \tag{19} $$

where \( e_i \) is a vector of zeros with the ith element equal to one, and \( Q_v \) is the cofactor matrix of the estimated residuals. \( \sigma_0 \) is the prior standard deviation of the observation. \( \delta_s \) is the mean shift according to the Type I and II errors. \( \rho_{ij} \) is the correlation coefficient between the outlier detection statistics for the ith and jth measurements.
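To make the role of the correlation coefficient in Eq. (19) concrete, the following Python sketch evaluates the MSB for a given value of the scalar \( e_i^TPQ_vPe_i \). The function and argument names are hypothetical, and returning infinity when the denominator vanishes is a design choice of this sketch rather than something prescribed by the equation.

```python
import math


def msb(pqvp_ii: float, rho_ij: float, sigma0: float, delta_s: float) -> float:
    """Minimum separable bias of Eq. (19).

    pqvp_ii : the scalar e_i^T P Q_v P e_i for observation i
    rho_ij  : correlation between the outlier statistics of i and j
    """
    if abs(rho_ij) > 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    denom = pqvp_ii * (1.0 - abs(rho_ij))
    if denom <= 0.0:
        return math.inf  # fully correlated statistics cannot be separated
    return delta_s * sigma0 * math.sqrt(2.0) / math.sqrt(denom)


# Same observation, weakly versus strongly correlated test statistics.
weak = msb(0.5, 0.1, sigma0=1.0, delta_s=4.0)
strong = msb(0.5, 0.99, sigma0=1.0, delta_s=4.0)
```

As the correlation approaches one, the factor \( 1-\left|\rho_{ij}\right| \) approaches zero and the MSB grows without bound, so `strong` is larger than `weak`.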

Analysis and experiments

Global redundancy analysis

Assume that there are m images and n tracked keypoints in total in the bundle adjustment. The number of unknowns for the camera poses is 6m, and the number of unknowns for the keypoints’ positions is 3n. Therefore the total number of unknowns \( M_{unknown} \) is 6m + 3n. In Case I, the observations of the bundle adjustment consist of two parts: those from GCPs and those from keypoints. Each GCP or keypoint contributes 2 observations for each image in which it appears. In general, each image contains more than 4 GCPs so that the position and orientation of the camera can be obtained reliably, as three GCPs do not guarantee a unique solution (Thompson, 1966). The keypoints’ contribution to the observations depends on the number of images and on the keypoints in each image. Each keypoint is linked with at least 2 images. In the worst scenario, where every keypoint is linked with only two images, the corresponding number of observations is 4n. In the best scenario, where every keypoint is linked with all the images, the number of observations is 2mn. In practice, the number of images that a keypoint is linked with varies from 2 to m. Therefore the number of observations contributed by keypoints, \( N_{KPi} \), is:

$$ N_{KPi}=2\sum_{j=2}^m j\,n_j \tag{20} $$

where \( n_j \) is the number of keypoints that are tracked in j images, and \( n=\sum_{j=2}^m n_j \), as the sum of the keypoints tracked in j images over all j equals the total number of keypoints to be georeferenced.

Assume that the functional model in Eq. (9) is used (Case I) and that the average number of GCPs in each image is 5. Then the number of observations contributed by GCPs, \( N_{GCPi} \), is 10m. Therefore the total redundancy number will be:

$$ \begin{array}{rl} N_{KPi}+N_{GCPi}-M_{unknown} & =2\sum\limits_{j=2}^m j\,n_j+10m-\left(6m+3n\right) \\ & =4m+2\sum\limits_{j=2}^m j\,n_j-3n \end{array} \tag{21} $$

The minimum of the second component in Eq. (21), i.e. the observations contributed by keypoints, is 4n. In this worst case the total redundancy number is therefore 4m + 4n − 3n = 4m + n. This is consistent with the well-known principle that more images and keypoints provide more total redundancy.
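The counting in Eqs. (20) and (21) can be checked with a short Python sketch. It represents the tracking result as a hypothetical mapping from j (the number of images in which a keypoint is tracked) to \( n_j \), and assumes 5 GCPs per image by default, as in the text; the function names are illustrative.

```python
from typing import Mapping


def keypoint_observations(tracks: Mapping[int, int]) -> int:
    """N_KPi of Eq. (20); tracks maps j (images per keypoint) to n_j."""
    return 2 * sum(j * n_j for j, n_j in tracks.items())


def total_redundancy_case1(m: int, tracks: Mapping[int, int],
                           gcps_per_image: int = 5) -> int:
    """Total redundancy of Eq. (21) for m images (Case I)."""
    if any(j < 2 or j > m for j in tracks):
        raise ValueError("each keypoint must be tracked in 2..m images")
    n = sum(tracks.values())                # total number of keypoints
    n_gcp_obs = 2 * gcps_per_image * m      # 10m for 5 GCPs per image
    unknowns = 6 * m + 3 * n                # camera poses + keypoint positions
    return keypoint_observations(tracks) + n_gcp_obs - unknowns


# Worst case for m = 3, n = 20: every keypoint is tracked in only two images.
worst = total_redundancy_case1(3, {2: 20})
# Best case for m = 3, n = 20: every keypoint is tracked in all three images.
best = total_redundancy_case1(3, {3: 20})
```

For these inputs the worst case reduces to 4m + n and the best case to 4m + 2mn − 3n, matching the closed forms used in the text.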

Similarly, assume that the functional model in Eq. (10) is applied (Case II) and that the number of GCPs in the overlapping areas is 5, so that the numbers of measurements contributed by the GCPs’ world coordinates and image coordinates are 15 and 10m respectively. The total redundancy number is then:

$$ \begin{array}{rl} N_{KPi}+N_{GCPi}+N_{GCPw}-M_{unknown} & =2\sum\limits_{j=2}^m j\,n_j+15+10m-\left(6m+3n\right) \\ & =2\sum\limits_{j=2}^m j\,n_j+4m+15-3n \end{array} \tag{22} $$

The global redundancy for each observation is defined as the ratio between the total redundancy number and the total number of observations. In Case I, the total redundancy number for the worst scenario is 4m + n, and the number of observations is 10m + 4n. Therefore the global redundancy \( G{R}_W^{\mathrm{I}} \) in the worst scenario is:

$$ GR_W^{\mathrm{I}}=\frac{4m+n}{10m+4n} \tag{23} $$

Similarly, \( G{R}_B^{\mathrm{I}} \) in the best scenario is:

$$ GR_B^{\mathrm{I}}=\frac{4m+2mn-3n}{10m+2mn} \tag{24} $$

Similarly, in Case II, the total redundancy numbers for the worst scenario and the best scenario are n + 4m + 15 and 2mn + 4m − 3n + 15 respectively. The numbers of observations for the two scenarios are 6m + 4n + 15 and 6m + 2mn + 15 respectively. Therefore their global redundancy numbers \( G{R}_W^{\mathrm{II}} \) and \( G{R}_B^{\mathrm{II}} \) are formulated in Eqs. (25) and (26) respectively:

$$ GR_W^{\mathrm{II}}=\frac{n+4m+15}{6m+4n+15} \tag{25} $$
$$ GR_B^{\mathrm{II}}=\frac{2mn+4m-3n+15}{6m+2mn+15} \tag{26} $$

The variation of the global redundancy number for the four scenarios is illustrated in Fig. 1. The range of m was set from 3 to 8, and the range of n was set from 20 to 200 with an interval of 20.
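For Case I, Eqs. (23) and (24) can be evaluated over the same grid of m and n with a few lines of Python. This is only a sketch of the closed-form expressions and not a reproduction of Fig. 1; the function names are illustrative, and Case II is left out for brevity.

```python
from typing import Dict, Tuple


def gr_case1_worst(m: int, n: int) -> float:
    """Eq. (23): every keypoint is tracked in only two images."""
    return (4 * m + n) / (10 * m + 4 * n)


def gr_case1_best(m: int, n: int) -> float:
    """Eq. (24): every keypoint is tracked in all m images."""
    return (4 * m + 2 * m * n - 3 * n) / (10 * m + 2 * m * n)


def sweep_case1() -> Dict[Tuple[int, int], Tuple[float, float]]:
    """(worst, best) global redundancy for m = 3..8 and n = 20, 40, ..., 200."""
    return {
        (m, n): (gr_case1_worst(m, n), gr_case1_best(m, n))
        for m in range(3, 9)
        for n in range(20, 201, 20)
    }


grid = sweep_case1()
worst_values = [w for w, _ in grid.values()]
print(f"worst scenario: {min(worst_values):.3f} to {max(worst_values):.3f}")
```

In the worst scenario the ratio tends towards 0.25 as n grows for a fixed m, which corresponds to the slight decrease with the number of keypoints discussed for Fig. 1a.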

Fig. 1 Variation of the global redundancy number with respect to the number of images and keypoints

As shown in Fig. 1a for the worst scenario of Case I, the global redundancy number generally lay between 0.25 and 0.32. Figure 1c for Case II showed a tendency similar to that of Case I, and both its minimum and its maximum global redundancy numbers were higher than those of Case I. Interestingly, in both cases the global redundancy number decreased slightly rather than increased as the number of keypoints increased. Besides, when the number of images increased but each keypoint existed in only two images, meaning that the keypoints were tracked in only two frames, the global redundancy number did not increase significantly. Nevertheless, the lowest global redundancy numbers for Cases I and II were larger than 0.1, which was acceptable according to Eq. (18).

The best scenarios for Cases I and II were also similar, as shown in Fig. 1b and d. With the increase of the number of images, the global redundancy number increased from approximately 0.5 to 0.8. Contrary to the worst scenario, there was a slight increasing trend for Case II when the number of keypoints increased, whereas for Case I the global redundancy number remained almost unchanged. The global redundancy numbers were about 0.5 or larger even when only three images were used and all keypoints were tracked in those three images. Therefore three images were able to provide enough global redundancy for good reliability according to Eq. (18).

According to the analytical analysis, the number of images in reality-based 3D mapping plays an important role in improving global redundancy. Correct tracking of keypoints across all the collected frames also contributes to improving global redundancy. However, increasing the number of keypoints is less significant than the aforementioned two factors. The redundancy number of each observation is related to the MDB in Eq. (16). Consequently, more mapping images and a better keypoint tracking algorithm result in lower MDBs and better internal reliability if the stochastic model and non-centrality parameter are unchanged.

Geometric analysis of space resection

GCPs in reality-based 3D indoor mapping were utilized as the input to obtain the initial value of the camera pose through space resection for both Case I and II. DoP values in space resection, based on the principle of error propagation, were evaluated as the accuracy indicator of pose determination. Real data contain random noise that cannot be well controlled and might be contaminated by outliers. To eliminate these effects, a simulation environment without noise and outliers was set up, comprising a single wall and two types of corner with different patterns (convex type and concave type), as shown in Fig. 2a, 2b and 2c. The arranged red dots simulated GCPs with known image and world coordinates, and the blue cube represented the pose of the camera.

Fig. 2 Simulated three scenarios and corresponding variation of PDoP and ADoP

For each scenario, three factors that affected geometry were involved: the GCPs’ distribution, the GCPs’ number and the distance between the camera and the GCPs. The distribution of GCPs was changed from a centralized style to a distributed style with an interval of 0.5 meter, and the number of GCPs was set as 4, 8 and 12 respectively. The distance between the camera and the object was set from 0.5 meter to 5 meters with an interval of 0.5 meter.

Comparing the three scenarios in Fig. 2, PDoP and ADoP in the convex corner and concave corner scenarios were similar to each other, but they were much smaller than those of the wall scenario when the GCPs were centralized. What the three scenarios had in common was that the DoP decreased significantly when the GCPs became more scattered. In contrast, as the number of GCPs increased, the DoP values including ADoP and PDoP decreased only slightly, so the contribution of the number of GCPs was not significant.

As shown in Fig. 3, which presents the DoP variation with respect to distance in the three scenarios, an increased distance from the camera to the object enlarged the DoP values for position and orientation. Moreover, the difference between the wall scenario and the other two scenarios could be one order of magnitude if the distance between the camera’s position and the center of the GCPs was large. PDoP and ADoP in the convex corner and concave corner scenarios were similar, but were much lower than those of the wall scenario.

Fig. 3 DoP variation with respect to distance

In a normal indoor environment, there could be 4 GCPs with a dispersed distribution, and the distance between the camera and the object would normally be around 3 meters. In this case, the DoP values in space resection lay at the \( 10^3 \) to \( 10^4 \) level. Therefore, if the pixel size was 5.2 µm, a variation of one pixel in the observation would cause a variation of 0.0052 to 0.052 meter in camera position and 0.298 to 2.98 degrees in camera orientation. In the extreme case, DoP values could reach the \( 10^6 \) level, causing much higher error propagation to the position and orientation of the camera, so that a tiny change in the observation would result in a large variation in the estimated position and orientation. Based on the simulation results, the dispersion of the GCPs and the distance played a more important role for DoP values than the number of GCPs.
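The figures quoted above follow from multiplying the DoP value by the observation error expressed in meters. The Python sketch below reproduces this arithmetic under two assumptions: image coordinates are expressed in meters, and the same DoP level is applied to both position (result in meters) and orientation (result in radians, converted to degrees). The function name is illustrative.

```python
import math
from typing import Tuple

PIXEL_SIZE_M = 5.2e-6  # pixel size of 5.2 micrometers, as stated in the text


def propagated_error(dop: float, pixel_error: float = 1.0,
                     pixel_size_m: float = PIXEL_SIZE_M) -> Tuple[float, float]:
    """Return (position error in meters, orientation error in degrees)."""
    obs_error_m = pixel_error * pixel_size_m   # image-plane error in meters
    position_m = dop * obs_error_m             # error propagated to position
    orientation_deg = math.degrees(dop * obs_error_m)  # radians to degrees
    return position_m, orientation_deg


for level in (1e3, 1e4, 1e6):
    pos, ang = propagated_error(level)
    print(f"DoP {level:.0e}: {pos:.4f} m, {ang:.3f} deg")
```

At the \( 10^6 \) level the same one-pixel change corresponds to about 5.2 m in position, which illustrates why such geometry is unusable in practice.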

Geometric analysis of bundle adjustment

The final mapping solution was generated by the bundle adjustment described in the Functional model section. However, the noise on the GCPs’ image and world coordinates and on the keypoints’ image coordinates in the real world could not be well controlled, and real data might contain outliers, causing bias in the geometric analysis. Therefore a simulation environment without noise and outliers was created to analyse the relationship between geometry and reliability as well as separability for both Case I and Case II.

Three mapping images were used as the input for mapping, as they were able to provide enough global redundancy according to the Global redundancy analysis section. Overlapping percentage and intersection angle were represented by two geometric components: the distance from the object to the camera and the baseline between every two images. This was because a shorter distance would lead to a larger intersection angle, and a shorter baseline would lead to a higher overlapping percentage. The focal length of the camera was set as 50 mm.
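The link between distance, baseline, intersection angle and overlap can be illustrated with a simplified two-image model. The Python sketch below assumes parallel optical axes, a planar object facing the cameras, and a hypothetical sensor width of 36 mm, none of which is stated in the text; only the 50 mm focal length is taken from it.

```python
import math


def intersection_angle_deg(baseline_m: float, distance_m: float) -> float:
    """Angle between the two rays meeting at a point midway between the cameras."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))


def overlap_fraction(baseline_m: float, distance_m: float,
                     focal_length_m: float = 0.05,
                     sensor_width_m: float = 0.036) -> float:
    """Fraction of the image footprint shared by two parallel-axis images.

    sensor_width_m is an assumed value used only for illustration.
    """
    footprint_m = distance_m * sensor_width_m / focal_length_m
    return max(0.0, 1.0 - baseline_m / footprint_m)


# Sample values within the tested ranges of distance and baseline.
for distance in (3.25, 4.0):
    for baseline in (0.3, 0.6):
        angle = intersection_angle_deg(baseline, distance)
        overlap = overlap_fraction(baseline, distance)
        print(f"D={distance} m, B={baseline} m: {angle:.2f} deg, {overlap:.1%}")
```

Under these assumptions a shorter distance increases the intersection angle and a shorter baseline increases the overlap, in line with the reasoning above.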

The observations of Case I could be divided into two parts: the image coordinates of GCPs and those of keypoints. The latter could be further classified according to the number of overlapping images in which they were matched. For example, some keypoints might only be matched in two images, while others were matched in three images. The MDBs of the different types had different characteristics. The MDBs could be divided into three groups: MDBs for the image coordinates of GCPs (Type A), MDBs for the image coordinates of the keypoints that were matched in three images (Type B), and MDBs for the image coordinates of the keypoints that were matched in two images (Type C). Mapping solutions were generated for different distances and baseline lengths, and the corresponding MDBs and MSBs of the different types were compared for both Case I and II.

The MDB values for Type A in Case I were stable, as Table 2 shows. The range of variation was less than 1 pixel when the distance changed from 3.25 m to 4 m and the baseline changed from 0.3 m to 0.6 m. Similarly, the MDB values contributed by keypoints that were matched in three images, shown in Table 3, did not change noticeably when the overlapping percentage and intersection angle changed. However, these MDB values were all larger than those contributed by GCPs, indicating that outliers in the corresponding measurements were harder to detect than outliers in the GCP measurements.

Table 2 The average values of MDBs for Type A
Table 3 The average values of MDBs for Type B

However, the MDBs of the keypoints that were matched in two images were much larger than those of the first two types, as shown in Table 4, which was caused by their lower redundancy numbers. Besides, their MDB variations with respect to intersection angle and overlapping percentage were larger than those of the other two types.

Table 4 The average values of MDBs for Type C

MSB was defined to quantify the bias that could be separated for every two measurements. According to the classification used for MDB, the corresponding MSBs could be divided into MSB (A, A), MSB (A, B), MSB (A, C), MSB (B, B), MSB (B, C) and MSB (C, C), each of which represented the average value of the MSBs in that group. For example, MSB (A, C) meant the average MSB between the image measurements of GCPs and those of the keypoints that were matched in two images.

As illustrated in Fig. 4, MSB (A, A), MSB (A, B) and MSB (A, C) were around 7 to 9 pixels. These MSB values were larger than their corresponding MDBs, as an MSB is always larger than the corresponding MDB if they have the same Type I and II errors according to Wang and Knight (2012).

Fig. 4 The variation of average MSB with respect to distance and baseline in Case I

However, MSB (B, B), MSB (B, C) and MSB (C, C) were much higher than the other three groups, at the \( 10^5 \) or \( 10^9 \) pixel level. This meant that if one of the measurements in these groups was contaminated by an outlier, the size of the outlier would have to be at the \( 10^5 \) or \( 10^9 \) pixel level to be separated, so that in practice it was impossible to separate such outliers. It was interesting to find that changes in the intersection angle and overlapping percentage in the indoor environment did not have a significant influence on MDB and MSB. The main reason was that, compared with aerial mapping, the variation of distance and baseline was much smaller in the indoor environment, reducing their influence on MDB and MSB. Therefore the number of images became the main factor that affected the quality of mapping.

Similarly, the MDBs in Case II could be divided into four types: the MDBs for the GCPs’ image coordinates (Type D), the MDBs for the GCPs’ world coordinates (Type E), and two types contributed by keypoints, namely the MDBs for the image coordinates of keypoints that were matched in three images (Type F) and the MDBs for the image coordinates of keypoints that were matched in two images (Type G).

As illustrated in Table 5, the majority of the average MDB values for Type D lay between 6.919 and 8.602 pixels, and their variation was around 2 pixels, which meant that a bias on the GCPs’ image coordinates needed to be at least 6.919 pixels to be detected.

Table 5 The average values of MDB values for Type D
Table 6 The average MDB values for Type F

The average MDB values for Type E were similar across the tested baselines and distances, all around 0.067 m. Since the observations of Type E were world coordinates, this meant that a bias on the GCPs’ world coordinates should be larger than 0.067 m to be detected.

The overall size of the MDBs for Type F, shown in Table 6, was similar to that of Type D. Interestingly, the MDBs showed a slight increasing trend when the lengths of the baselines increased and the distances between the object and the camera decreased, for both Type D and F. The values of the MDBs for Type G ranged from \( 10^8 \) pixels to \( 10^9 \) pixels. This meant that it was nearly impossible to detect a bias on the observations of Type G.

As shown in Fig. 5, MSB (D, D), MSB (D, E), MSB (D, F) and MSB (D, G) lay approximately between 9 and 18 pixels. However, MSB (E, E), MSB (E, F) and MSB (E, G) were at the \( 10^4 \) to \( 10^6 \) pixel level, which meant that if one observation of the GCPs’ world coordinates was contaminated by an outlier, the bias on the world coordinates needed to be at least 0.052 m to 5.20 m in order to be separated, as the size of one pixel in this simulation was 5.2 µm. If the outlier existed in the observations of the keypoints’ image coordinates, the large MSB indicated that it was almost impossible to separate the contaminated observation from the GCPs’ world coordinates.

Fig. 5

The variation of average MSB with regard to distance and baseline in Case II

MSB values among the keypoints were high, except for the MSBs between Type F and Type G. MSB (F, F) and MSB (G, G) were at the 10^4 and 10^9 pixel level respectively, indicating that it was difficult to separate the corresponding measurements if an outlier existed within their own group. However, the MSB between Type F and Type G was much lower than MSB (F, F) and MSB (G, G), at around 12 pixels, so an outlier was far easier to separate between these two groups than within either of them.

The MDB values for image coordinates of keypoints and GCPs in Case I and Case II were similar. MDB values for GCPs were around 10 pixels, and MDB values for keypoints that were matched in three images were at around the same level, indicating that they had similar performance in outlier detectability. However, MDB values for keypoints that were matched in two images were much higher, and MDB values for GCPs’ world coordinates in Case II were expressed in metres (around 0.067 m). The MSB values for Case I and Case II were more complex, as the distribution of MSB values inside each group was not uniform. In some cases, the correlations among the measurements contributed by keypoints were high (e.g. larger than 0.9), causing higher MSB values; where the correlations among the measurements were lower, the MSB values decreased. For both cases, the measurements contributed by GCPs’ image coordinates were easier to separate from other measurements, as their MSBs were lower than the others.
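The role of correlation in separability can be illustrated with a small sketch, which is not the paper's simulation. It assumes uncorrelated observations and a linear(ised) model, and computes the correlation coefficients between the outlier test statistics of every pair of observations; the function name `w_test_correlation` and the toy design matrices are hypothetical.

```python
import numpy as np

def w_test_correlation(A, sigma):
    """Correlation coefficients between the outlier test statistics of all
    pairs of observations in the model l = A x + e (uncorrelated observations).

    Assumes every observation has non-zero redundancy.
    """
    A = np.asarray(A, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    P = np.diag(1.0 / sigma**2)                            # weight matrix
    Qv = np.diag(sigma**2) - A @ np.linalg.solve(A.T @ P @ A, A.T)
    M = P @ Qv @ P                                         # M[i, j] = e_i^T P Qv P e_j
    d = np.sqrt(np.diag(M))
    return M / np.outer(d, d)

# Toy case: one unknown observed two, three or five times.
for n in (2, 3, 5):
    rho = w_test_correlation(np.ones((n, 1)), np.ones(n))
    print(n, abs(rho[0, 1]))
```

With only two observations the absolute correlation reaches 1, so an outlier can be detected but not attributed to either observation; it drops as more observations are added. This mirrors the trend described above in direction only.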


The geometry in reality-based 3D mapping is an important factor that affects mapping quality. The geometric components are controllable, and the functional model can be chosen when planning how to conduct indoor mapping. Geometric analysis therefore gives users a reference for setting up the geometric components and the functional model appropriately to meet the requirements for mapping quality.

In general, better geometry will lead to better DoP, reliability and separability. More specifically, to meet the requirements for mapping quality in an indoor environment, the distribution of GCPs and the distance between the camera and the GCPs need to be considered first. The second factor is the number of GCPs. In addition, the number of images is one key factor that affects reliability and separability. The simulation of the indoor environment shows that intersection angle and overlapping percentage do not have a significant influence on reliability and separability. Finally, the analysis of reliability and separability for the two functional models shows that they have similar performance in outlier detection and separation.

In practice, the matching performance of keypoints affects the available number of overlapping images. However, matching performance is closely related to the texture of the surrounding environment. In texture-less areas, keypoints are more difficult to detect, describe and match. How to improve keypoint matching performance in such regions remains a question that needs further exploration.


  1. Alsadik B, Gerke M, Vosselman G, Daham A, Jasim L (2014) Minimal camera networks for 3d image based modeling of cultural heritage objects. Sensors 14:5785–5804

  2. Baarda W (1968) A testing procedure for use in geodetic networks. Netherlands Geodetic Commission, New Series, vol. 2, No. 4

  3. Davison AJ, Reid ID, Molton ND, Stasse O (2007) MonoSLAM: Real-time single camera SLAM. IEEE Trans Pattern Anal Mach Intell 29:1052–1067

  4. Förstner W (1985) The reliability of block triangulation. Photogramm Eng Remote Sens 51:1137–1149

  5. Förstner W (1987) Reliability analysis of parameter estimation in linear models with applications to mensuration problems in computer vision. Computer Vision, Graphics, and Image Processing 40:273–310

  6. Gruen A (1978) Accuracy, reliability and statistics in close range photogrammetry. Inter-Congress Symposium of ISP Commission V

  7. Li X (2013) Vision-based navigation with reality-based 3D maps. Dissertation, University of New South Wales

  8. Li X, Wang J (2012) Image matching techniques for vision-based indoor navigation systems: Performance analysis for 3D map based approach. 2012 International Conference on Indoor Positioning and Indoor Navigation

  9. Milford MJ, Wyeth GF (2008) Mapping a suburb with a single camera using a biologically inspired SLAM system. IEEE Trans Robot 24(5):1038–1053

  10. Nocerino E, Menna F, Remondino F (2014) Accuracy of typical photogrammetric networks in cultural heritage 3D modeling projects. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 1:465–472

  11. Konolige K, Agrawal M (2008) FrameSLAM: From bundle adjustment to real-time visual mapping. IEEE Trans Robot 24(5):1066–1077

  12. Salzmann M (1991) MDB: a design tool for integrated navigation systems. Bulletin Géodésique 65(2):109–115

  13. Taylor T (2009) Mapping of indoor environments by robots using low-cost vision sensors. Dissertation, Queensland University of Technology

  14. Thompson E (1966) Space resection: Failure cases. Photogramm Rec 5:201–207

  15. Wang J, Knight NL (2012) New outlier separability test and its application in GNSS positioning. Journal of Global Positioning Systems 11:46–57



The author wishes to thank China Scholarship Council (CSC) for supporting his studies at UNSW Australia.

Authors’ contributions

In this paper, ZL designed the experiments, wrote the program code, analysed the results and wrote the manuscript. JW proposed the concepts of geometric analysis in reality-based 3D mapping, supervised the corresponding experiments and revised the draft manuscript. MA extended the functional model of reality-based 3D mapping from Case I to Case II. KC and SZ participated in the experiment design, analysed the experimental results and proofread the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Author information

Correspondence to Zeyu Li.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article



  • Geometric analysis
  • Indoor environment
  • 3D map
  • MDB
  • MSB