Open Science Article Submitted: August 5, 2005https://doi.org/10.54294/jd5eba

Cumulative Gaussian Curve Fitter for Boundary Parameterization

David Wang^1*

,Robert Tamburo

,George Stetten

1.University of Pittsburgh

Abstract

We have previously developed an algorithm for locating boundaries in an image with sub-pixel resolution, as well as estimating boundary width and image intensity within the adjoining objects. The algorithm operates by finding the parameters of a cumulative Gaussian curve that best approximates an intensity profile taken across a boundary. If intensity is sampled along the image gradient across a boundary, it is reasonable to assume the profile approximates a finite portion of a cumulative Gaussian. Given that assumption, the first derivative of the profile should be the corresponding portion of a Gaussian, completely described by its mean, standard deviation, and amplitude. We present here a simple and rapid method to find those parameters, given that we only have a potentially skewed sample of the Gaussian. The parameters are approximated first for the finite sample, and then both ends of the Gaussian are extrapolated using the resulting parameters. New parameters are then calculated and the procedure is repeated. The optimization rapidly converges, yielding boundary location (mean) with sub-pixel accuracy as well boundary width (standard deviation). Integration then reproduces the cumulative Gaussian, and a least-squares fit is applied to estimate the constant of integration, from which intensity of the adjoining regions can be estimated.

Keywords

Cumulative Gaussian

Optimization

Image Analysis

Boundary Detection

Manuscript

Source Code and Data

Loading file tree...

File

Image

Select a file to preview

Reviews

Gavin Baker

Friday, 16 September 2005

Summary:

The stated aim is to determine the location of the boundary between two different tissue types in a medical image at a sub-pixel resolution. The authors present an optimisation technique that fits a cumulative Gaussian curve to a set of sample points (intended to represent such a curve). The parameters calculated should then match the data, such that the mean (mu) of the Gaussian can be used to determine the boundary point. This is demonstrated by generating a curve (possibly adding noise) and sampling it, then passing this to the optimiser to fit.

Hypothesis:

The hyopthesis is that a cumulative Gaussian can be fitted to an intensity profile to determine the location of a boundary between two tissue types. It assumes that the intensity profile of this boundary can be modelled by a cumulative Gaussian function, and furthermore that there exists a unique boundary that can be found by the profile.

Evidence:

The authors present results from testing on several generated curves. It shows how the RMS error (between the fitted and known parameters) for the curves tested goes rapidly to almost zero within a small number of iterations. This supports the claim that their optimiser can fit a CG curve to a set of points derived from a 1D profile.

No evidence is presented (citation or data) to support the claim of the intensity profile being modelled by a Gaussian, or that this can be applied to images.

Open Science:

The paper follows the tenets of Open Science; the source is provided, and the paper describes the algorithm as implemented. The technique has not yet been applied to images, so there is no data to accompany it. All the parameters were provided to enable the reader to replicate the results presented in the paper.

Reproducibility:

I downloaded the code as supplied, and built it under Debian GNU/Linux on a Pentium 4 machine, using g++ 3.3.5, ITK CVS, VTK 4.2, FLTK 1.1. Apart from a few warnings, it built and ran fine.

I ran the application with the default parameters, and it worked fine. Testing with a range of values seemed to produce the expected results, and a good fit. One combination managed to cause a segfault (didn’t manage to record the settings).

Test Results

Running with mu=5, sigma=2, I1=0, I2=100, samples=10, noise=[-1,1]: output information not displayed in GUI, but plot is displayed and fit looks very close. Mean fit error=0.0718168. Over 10 repeated runs, this would usually improve to RMS fit error of 0.6, but occasionally the RMS error would increase and stabilise higher than the initial value.

Running with mu=10, sigma=5, I1=120, I2=900, samples=10, noise=[-10,10]: Fit error converges to about 6 after around 40 iterations. Mean error 2.78059. However Fitted mean=7.21941, fitted sigma=3.68412, fitted I1=121.483, fitted I2=589.68. The Upper asymptote is significantly out, despite fairly low overall fit error. These values of intensity are common for tissue-bone boundary, and are thus representative of real images.

Use of Open Source Software:

The authors use the well-known combination of FLTK for the GUI and VTK for the charts. They do not discuss the implementation or use of these libraries.

Open Source Contributions:

The source was provided for the algorithm itself, in the style of an ITK algorithm. The primary class has already been submitted to ITK. This was accompanied by an FLTK/VTK application front-end to exercise the filter and plot the results. It was immediately usable for testing once built.

Code Quality:

The code was generally readable and well-laid out. It tends to follow the conventions used in ITK and in the ITKApplications suite, and was easily built on a Linux box with appropriate libraries installed. It seems to be fully portable.

Applicability to other problems:

The solution is in a sense a very general one, in that it provides a particular type of curve-fitting algorithm for a cumulative Gaussian. It could potentially be used in many statistical analysis problems. It may need to be developed further to see just where else it could be applied in image analysis.

Suggestions for future work:

The conclusion mentions future work in segmentation based on boundary profiles. In terms of ITK, clearly the most important step is demonstrating this approach on real image data. Also, perhaps some of the work of Pizer et al with cores could be relevant here.

Requests for additional information from authors:

Given the results of this fitting (sigma, mu, etc) how do you calculate the actual boundary location?

How do you know the result is sub-pixel? What about scale issues and partial volume effect?

How are p1 and p2 (the sampling region) determined? How does this effect the optimization process?

Error function? How is the erf table generated? It would be useful to include at least the formula showing how they were calculated.

Additional Comments:

Only a single paper is cited as related work (written by one of the coauthors). It would help to have the work placed in the context of other related boundary finding techniques, showing how this approach differs.

This technique as described seems to only be applicable to a 1-Dimensional image intensity profile. The issue of first determining the orientation of the profile is not addressed, and thus cannot be directly applied to boundary analysis in 2D or 3D images. As it stands it is assuming that there is a unique boundary point to be found, which may be in 1D but is only possible for a given orientation at a given point in 2D or 3D.

The Cumulative Gaussian appears to well describe the blurred intensity profile at the implicit boundary of two tissue types. However no mention is made of the Partial Volume Effect. It is not clear that this approach can afford to ignore the PVE. Implicit in the selection of I_1 and I_2 is the fact that it must be chosen at sufficient distance from the boundary itself as to prove representative of the plateau intensity.

It would be interesting to explore the issue of scale also; say the ratio of the intensity delta (between I1 and I2) to the voxel size. Perhaps skewness and kurtosis play a part here. You mention skew in the abstract but don’t address it directly in the body. Is the boundary always symmetric? Given there will always be a finite sampling frequency for any acquisition device, the tyranny of Nyquist’s theorem places a limit on the smallest feature we can resolve. Size of point-spread function?

There is minimal analysis of the results and their implications. Section 5 talks about mu in terms of “skew”, which one would ordinarily associate with asymmetry of the curve tails. This is also confusing since mu is supposed to represent the tissue boundary position.

The technique is not tested on either synthetic or real images. It would be great to see this applied to some real image data. Or even a synthesised boundary (say between tissue and bone for CT) with noise and blurring added for good measure.

One of the stated assumptions is that the implicit boundary is essentially a step-function; that the actual tissue observed transitions at one point from I1 to I2 (or asymptotically close). Sigma is equated with the boundary “width”, but really 1 standard deviation is only going to be 37% (or whatever the figure is) of the way to 1 (either plateau).

The algorithm attempts to estimate I1 and I2, the intensities of the two regions. But this is a given, surely? It is in the image data, only determined by sigma, or the “width” of the boundary.

Not sure about about discrete sampling - check Insight into Images.

I think it would be a much stronger demonstration of the technique if it were applied to a simulated edge profile for validation. Since the only experiment is fitting a G curve to a G curve, it doesn’t directly address the hypothesis that this can find the boundary edge. Beginning with a 1D profile, two intensity levels could be chosen, and a random point chosen for the boundary. Then to simulate the mixture of the two intensities about the boundary, apply a smoothing filter of an appropriate kernel size/scale. Finally the boundary position could be calculated from mu and compared with the original known position.