Skip to content Skip to sidebar Skip to footer

Python Gaussian Fit On Simulated Gaussian Noisy Data

I need to interpolate data coming from an instrument using a gaussian fit. To this end I thought about using the curve_fit function from scipy. Since I'd like to test this function

Solution 1:

I agree with Olaf in so far as it is a question of scale. The optimal parameters differ by many orders of magnitude. However, scaling the parameters with which you generated your toy data does not seem to solve the problem for your actual application. curve_fit uses lestsq, which numerically approximates the Jacobian, where numerical problems arise because of the differences in scale (try to use the full_output keyword in curve_fit).

In my experience it is often best to use fmin which does not rely on approximated derivatives but uses only function values. You now have to write your own least-squares function that is to be optimized.

Starting values are still important. In your case you can make sufficiently good guesses by taking the maximum amplitude for a and the corresponding x-values for band c.

In code, it looks like this:

from scipy.optimize import curve_fit,fmin
import numpy
import pylab

# Create a gaussian functiondefgaussian(x, a, b, c):
    val = a * numpy.exp(-(x - b)**2 / (2*c**2))
    return val

# Generate fake data.
zMinEntry = 80.0*1E-06
zMaxEntry = 180.0*1E-06
zStepEntry = 0.2*1E-06

x = numpy.arange(zMinEntry, 
                 zMaxEntry, 
                 zStepEntry, 
                 dtype = numpy.float64)    

n = len(x)                 

meanY = zMinEntry + (zMaxEntry - zMinEntry)/2
sigmaY = 10.0E-06

a = 1.0/(sigmaY*numpy.sqrt(2*numpy.pi))
y = gaussian(x, a, meanY, sigmaY) + a*0.1*numpy.random.normal(0, 1, size=len(x))

print a, meanY, sigmaY
# estimate starting values from the data
a = y.max()
b = x[numpy.argmax(a)]
c = b

# define a least squares function to optimizedefminfunc(params):
    returnsum((y-gaussian(x,params[0],params[1],params[2]))**2)

# fit
popt = fmin(minfunc,[a,b,c])

# Print resultsprint("Scale =  %.3f" % (popt[0]))
print("Offset = %.3f" % (popt[1]))
print("Sigma =  %.3f" % (popt[2]))


pylab.plot(x, y, 'ro')
pylab.plot(x, gaussian(x, popt[0], popt[1], popt[2]),lw = 2)
pylab.xlim(x.min(),x.max())
pylab.grid(True)
pylab.show()

enter image description here

Solution 2:

Looks like some numerical instabilities are creeping into the optimizer. Try scaling the data. With the following data:

zMinEntry = 80.0*1E-06 * 1000zMaxEntry = 180.0*1E-06 * 1000zStepEntry = 0.2*1E-06 * 1000sigmaY = 10.0E-06 * 1000

I get estimates of

Scale =  39.697 +/- 0.526Offset = 0.130 +/- 0.000Sigma =  -0.010 +/- 0.000

Compare that to the true values:

Scale = 39.894228Offset = 0.13Sigma = 0.01

The minus sign of sigma can of course be ignored.

This gives the following plot

enter image description here

Solution 3:

As I said in a comment, if you provide a reasonable initial guess, the fit succeeds, i.e. call curve_fit like that:

popt, pcov = curve_fit(gaussian, x, y, [50000,0.00012,0.00002])

graph

Post a Comment for "Python Gaussian Fit On Simulated Gaussian Noisy Data"