Quality Assurance of Image Registration Using Combinatorial Rigid Registration Optimization (CORRO)

Purpose: Expert-selected landmark points on clinical image pairs provide a basis for validating rigid registration. Using combinatorial rigid registration optimization (CORRO), we provide a statistically characterized reference data set for image registration of the pelvis by estimating the optimal registration. Materials and Methods: Landmarks were identified for each CT/CBCT image pair in 58 cases. From the landmark pairs, subsets of k landmark pairs were generated without repeat, forming the k-set for k=4, 8, and 12. A rigid registration between the image pairs was computed for each k-combination set (2,000-8,000,000 per k-set). The mean and standard deviation of the registrations were used as the final registration for each image pair. Joint entropy was used to validate the output results. Results: An average of 154 (range: 91-212) landmark pairs were selected for each CT/CBCT image pair. The mean standard deviation of the registration output decreased as k increased for all cases. In general, the joint entropy was found to be lower than that obtained with commercially available software. Of all 58 cases, CORRO gave the best registration in 58.3% with k=4, 15% with k=8 and 18.3% with k=12, compared to 8.3% for a commercial registration software. The minimum joint entropy was determined for one case and found to lie at the estimated registration mean, in agreement with the CORRO algorithm. Conclusion: The results demonstrate that CORRO works even in the extreme case of the pelvic anatomy, where the CBCT suffers from reduced quality due to increased noise levels. The estimated optimal registration using CORRO was found to be better than that of commercially available software for all k-sets tested. Additionally, k=4 gave the best overall outcomes when compared to k=8 and 12, which is anticipated because k=8 and 12 are more likely to include combinations that degrade the accuracy of the registration.


Introduction
The central aim of radiation therapy is to maximize the dose to the tumor volume while minimizing the dose to the surrounding healthy tissues [1,2]. During a session of radiation treatment it is standard practice to verify the patient's position with respect to the planning CT images by registering a cone beam CT image (CBCT), taken while the patient is in the setup position, to the CT planning image, taken much earlier at the start of the treatment planning. The registered CBCT/CT image pair can then indicate to the therapist the proper table positioning shifts necessary to best align the patient with the desired alignment of the treatment plan.
In spite of this alignment verification process, it is difficult to ensure that each radiation port is delivered as planned. This is due to several complicating factors such as patient movement, positioning, and the tumor size and shape, which can change during the course of treatment. The result is inter- and intra-fractional motion, which can lead to over- or under-dosing the intended target and the organs at risk (OAR) [2,3,4]. For these reasons, several methods have been developed to validate and characterize the quality of image registration [5].
Traditionally, physicians have qualitatively validated registered images by visually inspecting portal images and diagnostic quality images alongside a planning digitally reconstructed radiograph (DRR) [6]. This method of assessing image registration quality typically involves the use of an infield metric (a graticule mounted to the MV treatment head) to locate anatomical structures shared between the portal image and the DRR; in most cases this involves measuring a point relative to unique features of the bony anatomy. The accuracy of this method has been reported to be between 5 and 10 mm [7]. This type of registration is subjective and is not feasible for large quantities of data [8].
Fiducial markers strategically placed in or on internal targets or OARs have also been used to find the ground truth of a rigid image registration [5]. Because of the poor contrast of the prostate relative to the surrounding soft tissue and the high visibility of gold fiducial markers in kV or MV X-ray imaging, intra-prostatic gold markers are used to improve targeting for surgical procedures and radiotherapy treatment of the prostate and to estimate registration error [9][10][11][12]. Though fixed to the target anatomy, the fiducials are known to drift from their original position, either because of anatomical changes over time or because the fiducial detaches. Fiducial relocation may lead to inter-observer error in the registration [13][14][15]. Also, the number of fiducials that can be implanted at any time is limited. O'Neill et al. [16] reported that in a study of 427 patients undergoing intensity modulated radiation therapy using fiducial markers for image guided radiation therapy (IGRT), the intra-fraction motion was greater than 2 mm for about 66% of patients.
Expert-positioned landmark point pairs have also been used to quantitatively validate registration quality. These landmark points are used to find the ground truth in correlated images, which allows for validation based on the accuracy of the manual or automated selection of the points [17][18][19]. However, the quality of the landmark points must itself be validated. One metric for validating the quality of landmark points is the inter- and intra-observer variation. This measurement is performed by asking multiple experts to repeat the manual selection of landmark points on the same data set multiple times [20]. For example, Shearer et al. [21] evaluated causes of error in landmark-based data collection using microCT and surface scanners. In their study, they measured the precision, accuracy and repeatability of craniodental landmarks. Their results showed that inter-observer error is of greater concern than intra-observer error. Also, Fagertum et al. [22] showed the inter-observer error of 73 selected facial landmarks to be between 0.11 and 5.75 pixels. This validation method depends on how well the expert identified the points within each paired image set and is limited by the number of unique anatomical features from which corresponding sets of points can be selected with confidence.
Mutual information is a metric that has been successful in registering clinical images and can also be used to characterize registration quality, especially for images from different modalities [23,24]. Mutual information is a concept from information theory, applied to image registration to measure the amount of information one image contains about the other. The registered images are validated using the following comparison metrics: the root mean squared (RMS) difference of the intensities of the two images, the median absolute deviation of the intensity difference, and the maximum intensity difference [25].
One disadvantage of CBCT imaging is that large amounts of scattered X-rays may raise the noise level in the reconstructed image and ultimately degrade the contrast of the image. This noise increases for thicker anatomical regions such as the pelvis, making CT/CBCT image registration a very difficult problem.
In this study, we present an offline quality assurance technique for image registration using a curated and statistically characterized reference data set of pelvic cases, each consisting of expert-placed landmark points, CORRO registration values, and the original CT/CBCT image pair from radiation therapy (RT) patient set-up. We use joint entropy to quantitatively evaluate CORRO, as discussed in [26].

Image Data
Fifty-eight patients treated in the pelvic region at the Beaumont radiation oncology center were selected for a Beaumont Research Institutional Review Board approved retrospective study (2014-326). Each patient received a planning CT on a 16-slice Philips Brilliance Big Bore CT scanner (Philips NA Corp, Andover, MA) covering the entire anatomic region and utilizing immobilization devices. Each patient had CBCT images acquired for daily image guidance on the on-board imager of an Elekta Synergy® linear accelerator (Elekta Oncology Systems Ltd., Crawley, UK). The CBCT images ranged from 512 × 512 × 88 to 512 × 512 × 110 pixels, with pixel size ranging from (1 × 1) mm² with 3 mm slice thickness to (0.64 × 0.64) mm² with 2.5 mm slice thickness. The planning CT was resampled to the same in-plane dimensions as the CBCT, and the image content was shifted to place the anatomic isocenter at the center of the image volume, as shown in Figure 1. The machine isocenter is located at the center of the CBCT reconstruction image volume. Figure 1a shows the planning CT image with the machine isocenter in red and the image isocenter in blue; Figure 1b shows the planning CT at machine isocenter and Figure 1c shows the CBCT at machine isocenter.

Rigid-body Registration
Given the planning CT (P) and the CBCT (Q), we calculated the transformation Q = T(P) such that corresponding coordinates in the two images map to the same physical location. Let $P = \{p_1, \dots, p_k\}$ and $Q = \{q_1, \dots, q_k\}$ denote collections of points in 3D space ($\mathbb{R}^3$) of the same size, with P representing landmark points in the planning CT image and Q representing the corresponding landmark points in the CBCT image. The registration problem in three dimensions (3D) consists of finding the transformation that achieves the best match between the corresponding features in P and Q, such that the root mean square (RMS) distance between corresponding points is minimized [11,27]. The appropriate translation vector is simply the mean displacement between the two sets of points [10]. The image registration problem is thereby reduced to a shape analysis problem, namely the orthogonal Procrustes problem [10][11][28][29]. The Procrustes problem is a least-squares fitting problem, and studies have shown that the calculation of the rotation matrix R is more involved because of the non-linear condition that a rotation matrix be orthogonal. If P and Q are replaced with their "centered" values, i.e. their values less the mean values, $p_i \to p_i - \bar{p}$ and $q_i \to q_i - \bar{q}$, the problem reduces to the orthogonal Procrustes problem, in which we seek the rotation R. The RMS distance to be minimized is termed the fiducial registration error (FRE) [10][11]. Therefore, given four non-coplanar points for a 3D volume, the problem of rigid-body registration is to find a rotation R and translation t that minimize the FRE, which is represented mathematically as $\mathrm{FRE}^2 = \frac{1}{k}\sum_{i=1}^{k} \lVert R p_i + t - q_i \rVert^2$. An FRE value of zero means the rigid-body registration is perfect.
However, the fit will be approximate because of variations in the location of the landmark points, and we aimed to find the error associated with locating these landmark points. The translation is given by $t = \bar{q} - R\bar{p}$, where the bar indicates a mean over i = 1, …, k.
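As an illustration of the least-squares fit described above, the following is a minimal sketch using the well-known SVD (Kabsch) solution of the orthogonal Procrustes problem. It is not the implementation used in this study; the function name rigid_register, the NumPy dependency, and the synthetic test points are assumptions made for the example.

```python
import numpy as np

def rigid_register(P, Q):
    """Least-squares rigid fit q_i ~ R p_i + t via SVD (Kabsch solution).

    P, Q: (k, 3) arrays of corresponding landmark points.
    Returns the rotation R, the translation t, and the RMS residual (FRE).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance matrix of the centered point sets
    H = (P - p_bar).T @ (Q - q_bar)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so that R is a proper rotation, not a reflection
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_bar - R @ p_bar
    residuals = P @ R.T + t - Q
    fre = float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))
    return R, t, fre

# Synthetic check (hypothetical data): four non-coplanar points moved by
# a known 10 degree rotation about z and a known translation.
P_demo = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
theta = np.deg2rad(10.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([2.0, -1.0, 0.5])
Q_demo = P_demo @ R_true.T + t_true

R_est, t_est, fre_est = rigid_register(P_demo, Q_demo)
print(np.round(t_est, 3), fre_est)
```

With noise-free synthetic points the recovered FRE is zero to numerical precision; with real landmark pairs it would reflect the landmark localization error discussed above.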

Landmark selection and generation of large k-Sets
An in-house MATLAB-based software interface named ASEMPA (Assisted Expert Manual Point Selection Application) was developed to aid experts in manually selecting landmark feature pairs between the images. An average of approximately 150 landmark pairs were selected for each CT/CBCT image pair. The anatomical landmark pairs selected were used to generate the k-set. A k-combination set was generated as a subset of the landmark pairs by combining k landmark pairs without repeat. We call the set of all possible distinct k-combinations, which form the set of discrete independent trials, the k-set. Three sizes of k-combination set were used in this study: k=4, 8 and 12.
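To make the k-set construction concrete, the sketch below counts the k-combinations available from a given number of landmark pairs and draws a random subset of distinct k-combinations. The helper names k_set_size and sample_k_set are hypothetical, and drawing a random subset is an assumption for illustration; this section does not state how the k-sets in the study were enumerated or capped.

```python
import math
import random

def k_set_size(n_landmarks, k):
    """Number of distinct k-combinations of n landmark pairs."""
    return math.comb(n_landmarks, k)

def sample_k_set(n_landmarks, k, n_samples, seed=0):
    """Draw n_samples distinct k-combinations of landmark indices.

    Each combination contains k different indices (no repeat within a
    combination), and no combination is returned twice.
    """
    if n_samples > math.comb(n_landmarks, k):
        raise ValueError("more samples requested than combinations exist")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n_samples:
        combo = tuple(sorted(rng.sample(range(n_landmarks), k)))
        chosen.add(combo)
    return sorted(chosen)

# With 154 landmark pairs (the reported average) and k = 4:
print(k_set_size(154, 4))  # 22533126

# A small illustrative draw from 20 landmark pairs
demo_combos = sample_k_set(20, 4, 50)
print(demo_combos[:3])
```

The count grows very quickly with k, which is one reason a sampled subset rather than the full enumeration may be preferable for k=8 and k=12.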

Combinatorial Rigid Registration Optimization (CORRO)
CORRO, as discussed earlier, is a recently developed method that uses the rigid registrations generated by the members of the k-set to calculate average registration values for the image pair as a whole. The average registration values calculated by CORRO closely approximate the lowest-entropy registration possible for the two images.
To improve the quality of the final registration output from CORRO, a screening process was developed to remove landmark points from the original landmark set that would result in poor registration quality. This was performed by first generating the k-set for k=4. A rigid registration is then computed for each k-combination set in the k-set, and an FRE is calculated for each rigid registration. The registration transformation matrix associated with the minimum FRE, Tmin, is then used to create the boundary condition Tmin ± 2 mm (± 2 pixel shifts for 1 pixel/mm cases). This boundary condition is then applied to all calculated registration values to extract the registrations that are within tolerance, together with their respective k-combination sets. The union of all landmark point pairs from these filtered k-combination sets forms a new filtered landmark point set, which is then used for CORRO. Effectively, this removes landmark points of poor quality (1-30 points, depending on the case and the original quality of landmark placement) from the original data set, which is similar to point filters used in other fields such as remote sensing [30]. The filtered landmark points are used to generate the k-combination sets for k=8 and k=12, which are then used for CORRO.
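The screening step can be sketched as follows. This is an illustrative reading of the description above, not the study's code: the function name screen_landmarks is hypothetical, the inputs are assumed to be precomputed per-combination translations and FREs, and applying the ± 2 mm tolerance to each translation component separately is an assumption, since the text does not say how the tolerance is applied to the full transformation.

```python
def screen_landmarks(combos, translations, fres, tol=2.0):
    """Return the filtered landmark indices after Tmin +/- tol screening.

    combos:       list of tuples of landmark indices (one per registration)
    translations: list of (tx, ty, tz) from each combination's registration
    fres:         list of FRE values, one per registration
    tol:          tolerance in mm, applied per translation component
                  (assumption made for this sketch)
    """
    # Registration with the minimum FRE defines the reference Tmin
    best = min(range(len(fres)), key=fres.__getitem__)
    t_min = translations[best]
    kept = set()
    for combo, t in zip(combos, translations):
        if all(abs(a - b) <= tol for a, b in zip(t, t_min)):
            # Union of landmark indices from in-tolerance combinations
            kept.update(combo)
    return sorted(kept)

# Hypothetical toy input: four k=4 combinations over seven landmarks
demo_combos = [(0, 1, 2, 3), (0, 1, 2, 4), (1, 2, 3, 5), (2, 4, 5, 6)]
demo_translations = [(1.0, 2.0, 0.5), (1.5, 2.2, 0.4),
                     (6.0, 2.0, 0.5), (1.2, -1.0, 0.6)]
demo_fres = [0.3, 0.5, 0.9, 1.1]

filtered = screen_landmarks(demo_combos, demo_translations, demo_fres)
print(filtered)
```

In this toy input, landmarks 5 and 6 appear only in combinations whose translations fall outside the tolerance, so they are dropped from the filtered set.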
In generating the k-combination sets, we were able to create a large population of rigid registration values capable of mapping between the CT/CBCT image pair. Using k-sets with such a large population of k-combination sets, we estimated the true mean of the rigid-registration values (the optimal registration) and used the central limit theorem to validate our results. The results were also validated using joint histogram and joint entropy calculations. The joint histogram of two registered images is less dispersed when the images are well aligned and the corresponding anatomical features overlap. Near-perfect alignment in the joint histogram results in a low joint entropy value.
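The relation between alignment and joint entropy can be illustrated with a small numerical sketch. The function joint_entropy, the choice of 32 intensity bins, and the synthetic test image are assumptions for the example; the actual computation used in the study is described in [26].

```python
import numpy as np

def joint_entropy(img_a, img_b, bins=32):
    """Shannon joint entropy (in bits) of the joint intensity histogram."""
    hist, _, _ = np.histogram2d(np.ravel(img_a), np.ravel(img_b), bins=bins)
    p = hist / hist.sum()   # joint probability of intensity pairs
    p = p[p > 0]            # empty cells contribute nothing to the sum
    return float(-np.sum(p * np.log2(p)))

# Synthetic image (hypothetical data) compared with itself and with a
# copy shifted by 5 pixels to mimic a misregistration.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))
shifted = np.roll(image, 5, axis=1)

aligned = joint_entropy(image, image)
misaligned = joint_entropy(image, shifted)
print(aligned, misaligned)
```

For the aligned pair the joint histogram collapses onto its diagonal and the joint entropy is lower than for the shifted pair, whose histogram is dispersed.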

Image Data, Landmark pairs and k-combination sets
An average of 154 (range: 91-212) bony landmark pairs were selected for each CT/CBCT image pair. Large k-sets for k = 4, 8 and 12 were generated and used to solve for the rigid-body registrations. The size of each k-set ranged from 2,000 to 8,000,000 k-combination sets. Table 1 shows 10 of the 58 cases included with our reference data set.

Rigid-Body Registration and Joint Entropy
Applying the mean translation for k=4 to one of the CBCT cases, the output registration is demonstrated by fusing the planning CT and CBCT to show how well the rigid registration performed. Figure 2 shows the joint histogram of the registration from the X-ray volumetric imaging software (XVI). The mean registration from CORRO was applied to the CT/CBCT pair, and the resulting joint histograms are shown in Figure 2 for k=4, k=8, k=12 and XVI. The joint entropy of the planning CT and CBCT, computed over the tumor regions only, was used to validate the results for all cases. These results are shown in Table 2, the pie chart in Figure 2 and the line plot in Figure 3. Joint entropy has been discussed fully in our previous publication [26].

Estimating and Validating the Optimal Registration Using Central Limit Theorem
The mean standard deviation of the registration output decreased as k increased for all cases. The final results were validated by calculating the joint entropy for all 58 cases; a sampling of 10 cases is shown in Table 2. In general, the joint entropy evaluated using CORRO was found to be lower than that from XVI. Of all 58 cases, CORRO gave the best registration in 58.3% using k=4, 15% using k=8 and 18.3% using k=12, compared to 8.3% of cases for the commercial registration software, as shown in Figure 3. The joint entropy of the registration output from the k-set of 4 was compared with that from the k-sets of 8 and 12 and from the commercial software, giving correlation coefficients of 0.9989 for k=4 versus k=8, 0.9988 for k=4 versus k=12 and 0.9947 for k=4 versus the commercial software.
The minimum joint entropy for one of the cases was calculated and found to lie at the estimated registration mean, as shown in Figure 6. The k-set gives a large population of paired sets of points (k-combination sets). When we draw a large sample from this population, the distribution of the sample mean approaches a normal distribution, and the standard deviation of the sample mean decreases as the sample size increases.
The normal distribution for one case for the k-set of 4 is shown in Figure 4, and the standard deviation, which is proportional to $1/\sqrt{n}$, where n is the sample size, is demonstrated in Figure 5. It should be noted that for this particular case the estimated registration mean (population mean) is -2.3608 pixels for x-translations, and the mean of the sample means is also -2.3608, as expected from the central limit theorem. Figure 6a shows the joint entropy distribution for over 400,000 individual transformations for a sample case. The minimum joint entropy of 6.4004 occurs at an RMS value of 32.11, which is close to the estimated mean for this sample, 31.7397. Figure 6b shows the joint entropy distribution of a random sample of 5000 individual transformations for the same case. The minimum joint entropy of 6.4646 occurs at an RMS value of 31.4, which is close to the estimated mean for this sample, 31.6369.
To further show that the minimum joint entropy consistently lies close to the estimated mean, we randomly sampled 1000 individual transformations for the same case, shown in Figure 6c. The minimum joint entropy of 6.4651 was again found close to the estimated mean, but it occurred at three RMS values, which leads us to conclude that the sample has to be large in order to obtain the minimum joint entropy at a single RMS value.
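The dependence of the standard deviation of the sample mean on the sample size n can be checked numerically. The sketch below uses a synthetic population as a stand-in for the x-translations of a k-set; the population parameters, the helper std_of_sample_means, and the number of trials are all assumptions for illustration and are not taken from the study data.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population standing in for the x-translations of a k-set
population = [rng.gauss(-2.36, 1.5) for _ in range(20000)]
sigma = statistics.pstdev(population)

def std_of_sample_means(n, trials=2000):
    """Empirical standard deviation of the mean of n draws (with replacement)."""
    means = [statistics.fmean(rng.choices(population, k=n))
             for _ in range(trials)]
    return statistics.stdev(means)

# Compare the empirical value with the theoretical sigma / sqrt(n)
results = {n: std_of_sample_means(n) for n in (25, 100, 400)}
for n, empirical in results.items():
    print(n, round(empirical, 4), round(sigma / n ** 0.5, 4))
```

Each fourfold increase in n should roughly halve the standard deviation of the sample mean, which is the behavior the central limit theorem argument above relies on.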

Discussion
Rigid-body registration algorithms have historically been evaluated using geometrical features. Points and surfaces are most commonly used in the evaluation process. In this study, we created a data set of pelvic cases and statistically characterized it using a methodology we developed called CORRO. Using landmark points identified by experts and the statistical method of combination without replacement, we generated thousands to millions of landmark point sets for each case and estimated the true mean of the rigid registration and the registration error.
The results were validated using the central limit theorem.
The results for k=4 were the best overall when compared to k=8 and 12, which is anticipated because k=8 and 12 are more likely to include mismatched points that would affect the accuracy of the registration [31]. If d is the number of dimensions, the number of points needed to perform a registration is $n \ge d+1$, and an affine registration has $d^2+d$ degrees of freedom. Hence for 3D the number of points needed is $n \ge 4$; a rigid registration has 6 degrees of freedom (3 rotations and 3 translations), and an affine registration has 12 (3 rotations, 3 translations, 3 scalings and 3 shears). In 3D any four points can be mapped to another four points [32]. However, more landmarks result in more transformations, which weight areas of the image differently. Coste [33] shows that with many landmarks, some combinations of points can produce odd transformations of the image grid. This could be what happens in the case of k=8 and 12.
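For reference, the degree-of-freedom counts quoted above follow from standard results for transformations in d dimensions; the short derivation below is added for illustration and is not taken from the cited works.

```latex
\begin{align*}
\text{rigid:}\quad  & \underbrace{\tfrac{d(d-1)}{2}}_{\text{rotations}}
                      + \underbrace{d}_{\text{translations}}
                      = \tfrac{d(d+1)}{2}
                      && \Rightarrow\ 3 + 3 = 6 \quad (d = 3) \\
\text{affine:}\quad & \underbrace{d^2}_{\text{linear part}}
                      + \underbrace{d}_{\text{translations}}
                      = d^2 + d
                      && \Rightarrow\ 9 + 3 = 12 \quad (d = 3)
\end{align*}
```

Each point correspondence supplies d equations, so d + 1 = 4 points in general position provide the 12 equations needed to determine a 3D affine map.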
This work found that when large samples were drawn from the k-set, the estimated error associated with the registration decreased. This means that, for a given k-set, a large random sample can be drawn to perform the registration with similar results, rather than computing registrations for the entire k-set.

Conclusion
The results demonstrate that CORRO works even in the extreme case of the pelvic anatomy, where the CBCT suffers from reduced quality due to increased noise levels. The estimated optimal registration was found to be better than results from XVI. The data created in this work will be made available to the scientific community for assessing image registration algorithms and to aid in the development of future image comparison and validation metrics. The results obtained in this study revealed that for ≥ 500 samples there are no significant differences in the registration errors, as the deviation is less than 0.01 mm, which is below most clinical cutoffs.
CORRO can also be an excellent tool for radiotherapy centers in lower-middle-income countries, or radiotherapy centers without in-room kV imaging, for retrospective quality assurance of the set-up process using the MV electronic portal imaging system and digitally reconstructed radiographs. Also, the reference set can be used in future studies to test image registration algorithms. The data has been made available at https://wiki.cancerimagingarchive.net/display/Public/Pelvic+Reference+Data
Auctores Publishing, Volume 5(2)-076, www.auctoresonline.org, ISSN: 2640-1053