Does anyone know how to fit a Gaussian curve to data in SPSS? It seems it must involve a least squares procedure, but I cannot see how this can be done in SPSS. What I have is a spectrum of elements. Some exhibit emission lines, which peak above the baseline of the data, and some exhibit absorption lines, which peak below the baseline of the data. I need to be able to fit a Gaussian curve to both situations and, once fitted, get the wavelength (the x-axis) at the peak of the estimated Gaussian curve.
Stan
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
What is the problem? You fit a Gaussian (Normal) by computing the mean and standard deviation.
The difference between the Least Squares solution and the Maximum Likelihood solution is that ML uses N instead of (N-1) in computing the average squared deviation.
Do you have something else in mind?
-- Rich Ulrich
(In reply to the message of Sun, 8 Dec 2013 10:58:43 -0700, subject "Fitting a Gaussian".) |
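For reference, the estimates being contrasted here can be written out as below. This is only a sketch, assuming the data are a sample of N independent observations x_1, ..., x_N from a single normal distribution; the symbols are introduced for illustration and do not come from the thread.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2, \qquad
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2
```

Both variance estimates use the same mean; only the divisor of the summed squared deviations differs, so they agree closely when N is large.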
Rich,
You may be correct, but although the mean is a parameter in the Gaussian pdf (probability density function), I do not think that just any computed mean is necessarily the mean of the best fitted Gaussian curve. I think you need to first get the best fitted curve and then the mean of that curve would result. How skewness would be handled is another problem because then I do not think a mean would correspond to the maximum in the y-axis, but I could be wrong and this is the reason I am asking these questions.
Stan
(In reply to Rich Ulrich's message of 12/8/2013 1:32 PM.) |
To expand some more, not all sample curves are going to conform to the shape of a Gaussian curve (and some could deviate significantly), which will create a problem in the mean approach you suggest. I think what one needs to do is go through an iterative process in fitting a Gaussian curve, selecting the curve that has the lowest error. Then the point on the x-axis that corresponds to the maximum of the estimated curve can be determined.
Stan
(Follow-up to Stan Gorodenski's message of 12/8/2013 1:48 PM.) |
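What Stan describes can be stated as a least-squares curve fit rather than a distribution fit. A minimal sketch, assuming a constant baseline b near the line (the thread does not say what the baseline looks like) and using symbols A, \mu, \sigma that are introduced here only for illustration:

```latex
f(x;\, b, A, \mu, \sigma) = b + A \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad
(\hat{b}, \hat{A}, \hat{\mu}, \hat{\sigma}) = \arg\min_{b, A, \mu, \sigma} \sum_{i=1}^{n} \bigl(y_i - f(x_i;\, b, A, \mu, \sigma)\bigr)^2
```

Here (x_i, y_i) are the wavelength and measured intensity of each point. A > 0 corresponds to an emission line and A < 0 to an absorption line; in both cases the extremum of the fitted curve lies at x = \hat{\mu}, which is the wavelength being asked for. The minimization has no closed-form solution in \mu and \sigma, so it is normally carried out iteratively.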
In reply to this post by Stan Gorodenski
Please describe your data in more detail. What is a case? What are your variables? Do you have repeated measures over time? Along some spectrum?
(In reply to Stan Gorodenski's message of 12/8/2013 12:59 PM.)
Art Kendall
Social Research Consultants |
In reply to this post by Stan Gorodenski
Sorry to bring it up, but "selecting the curve that has the lowest error" is what you get from LS fitting or from ML fitting; and those both result in the mean and SD. Those *are* the best-fitted curves, by the usual criteria.
If you want something less sensitive to outliers, you might try some "robust" fitting methods - trimming 1%... 5%... 25% outliers from each end to find the mean; or using someone's weighting method, especially if you find one that has been used for data similar to yours.
If you want to use ranks, you could report median and inter-quartile range. The median corresponds to minimizing absolute differences instead of minimizing squared differences. There's a standard multiplier often used to convert IQR to SD, but I don't recall what that is. Converting might be less appropriate if your curves are really weird.
-- Rich Ulrich
(In reply to the message of Sun, 8 Dec 2013 13:58:09 -0700, subject "Re: Fitting a Gaussian".) |
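The robust summaries Rich mentions can be requested in SPSS along these lines. This is a sketch only: it assumes a variable named x (a hypothetical name) holding one observation per case, which is the distribution-fitting setting Rich has in mind and not a spectrum of (wavelength, intensity) pairs.

```spss
* Sketch: robust location and spread for a variable x (hypothetical name).
* The Descriptives table lists the median, 5% trimmed mean and interquartile range.
EXAMINE VARIABLES=x
  /PLOT NONE
  /STATISTICS DESCRIPTIVES.
```

For normally distributed data the interquartile range is about 1.349 standard deviations, so the usual conversion is SD ≈ IQR / 1.349.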
Rich,
I just do not see it. If we knew that the curve displayed by the actual data is a sample from a Gaussian and if we knew the beginning and ending points of the true Gaussian curve (if it really is a Gaussian) then I think what you say may be correct. However, we do not know the sample curve is from a true Gaussian, and we do not know the beginning and ending points of the true curve. Hence, your rationale does not work. Part of the iterative process, as I envision it, would also include shifting the estimated curve along the x-axis to get a curve that has minimum error, but this would be a pretty tedious process, hence the reason for my initial question. You may be right, but unless you or others can make me see the light, if there is a light to be seen, I will have to go on the assumption I am correct.
Stan
(In reply to Rich Ulrich's message of 12/8/2013 9:52 PM.) |
In reply to this post by Stan Gorodenski
What about the K-S command in the nonparametric tests command? It seems that one element of the discussion is whether the data actually fit any normal curve. Wouldn't K-S do that, and doesn't it also show the estimated mean and SD? Also, doesn't GPL offer the option of putting a normal curve on a frequency distribution (and doesn't the frequencies command also offer this, although frequencies might not work so well with continuous spectra data)?
Gene Maguin
(In reply to Stan Gorodenski's message of Sunday, December 08, 2013 12:59 PM, subject "Fitting a Gaussian".) |
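The two checks Gene mentions might look like the following in syntax. It is a sketch under the assumption that a variable named x (hypothetical) holds one observation per case; both commands describe the distribution of x itself and do not fit a curve to (x, y) pairs.

```spss
* Sketch: one-sample Kolmogorov-Smirnov test of x against a normal distribution.
* With no parameters given, the sample mean and SD are used and reported.
NPAR TESTS
  /K-S(NORMAL)=x.

* Histogram of x with a normal curve superimposed.
FREQUENCIES VARIABLES=x
  /FORMAT=NOTABLE
  /HISTOGRAM NORMAL.
```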
In reply to this post by Stan Gorodenski
Stan,
The subject line does not seem to match what you are asking in the post.
If you want to assume that the data arise from a normal distribution, then the sample (unconditional) mean will be the predicted value from OLS regression.
If, on the other hand, you want to empirically evaluate which of several continuous distributions optimally fits your data, then you could plot your data and test the fit of each distribution.
Ryan
(In reply to Stan Gorodenski's message of Dec 9, 2013, 1:02 AM.) |
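Ryan's point about the sample mean can be checked with a one-line derivation. As a sketch, assume a regression with an intercept \beta_0 only and no predictors (the notation is introduced here for illustration):

```latex
\frac{d}{d\beta_0}\sum_{i=1}^{N}(y_i-\beta_0)^2 = -2\sum_{i=1}^{N}(y_i-\beta_0) = 0
\quad\Longrightarrow\quad
\hat{\beta}_0 = \frac{1}{N}\sum_{i=1}^{N} y_i = \bar{y}
```

So the least-squares predicted value for every case is the sample mean.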
> The subject line does not seem to match what you are asking in the post.

Thank you, Ryan, and everyone else who attempted to help. I think the problem is that I am asking a question to a group that is oriented toward thinking in a certain way. The astronomy discussion groups and the astronomy colleagues I know, know what I am referring to when I say "Fitting a Gaussian." I will retract my question and assume what I would like to do in SPSS cannot be done (without onerous programming), and pursue this in the astronomy discussion groups. I believe there is astronomy freeware that does what I want, but since I own SPSS and do some spectral analysis with it, it would have been nice to have SPSS also fit a Gaussian to get the Angstrom location of the peak of the fitted curve.
Stan |
Is this an example of what you mean?
http://www.eg.bucknell.edu/physics/ASTR201/IDLTutorial/tutorial_05.html
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by Stan Gorodenski
A quick internet search suggests this MIGHT be possible with NLR?
I don't have time to fiddle with any details! --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Bruce Weaver
Bruce,
Yes, this is exactly what I mean. Thanks for the link. I guess I should have been able to find it myself by doing a search on the internet. I will study it, but I noticed right off that I was correct in my reasoning that it would involve minimizing some kind of error measure, and it would require iterations (if not by the researcher, then inherent in the program).
Stan |
Looking briefly at Bruce's link, this appears very much like the sort of problem that can be solved using NLR, or CNLR if there are constraints on the permissible values of the parameters.
|
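A minimal sketch of what an NLR fit of a single Gaussian line might look like is given below. Everything specific in it is an assumption: the variable names x (wavelength) and y (intensity), the parameter names base, amp, mu and sigma, the constant baseline, and above all the starting values, which are placeholders that would have to be replaced with rough values read off a scatterplot of the data.

```spss
* Sketch: least-squares fit of one Gaussian line plus a constant baseline.
* Assumes variables x and y in the active dataset (hypothetical names).
* Starting values in MODEL PROGRAM are placeholders and must be replaced.
* Use a negative starting amp for an absorption line.
MODEL PROGRAM base=0 amp=1 mu=0 sigma=1.
COMPUTE PRED_ = base + amp * EXP(-((x - mu)**2) / (2 * sigma**2)).
NLR y
  /PRED PRED_
  /SAVE PRED RESID
  /CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8.
```

If the fit converges, the estimate of mu in the parameter table is the wavelength of the peak (or trough) of the fitted curve. Poor starting values can keep NLR from converging, so it is worth restricting the cases to the neighborhood of one line before fitting.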
In reply to this post by Stan Gorodenski
Using the noisygaussian.txt data file available on that page, one can reproduce the scatter-plot shown there. Using WEIGHT, it is also possible to generate a histogram with superimposed normal curve, but you run into some problems because some of the Y-values are negative, and weights must be positive. Below, I shift the Y-values by centering on the minimum value of Y and then adding a small increment. This probably doesn't give exactly what you want, but it might give you some ideas.
new file.
dataset close all.

GET DATA /TYPE=TXT
 /FILE="C:\Temp\noisygaussian.txt"
 /ENCODING='Locale'
 /FIXCASE=1
 /ARRANGEMENT=FIXED
 /FIRSTCASE=2
 /IMPORTCASE=ALL
 /VARIABLES=
 /1 x 0-12 F13.4
 y 13-25 F13.9
 unc 26-38 F13.9.
CACHE.
EXECUTE.

GRAPH
 /SCATTERPLOT(BIVAR)=x WITH y
 /MISSING=LISTWISE.

DESCRIPTIVES X Y.

WEIGHT by Y.
GRAPH /HISTOGRAM(normal)=x.
WEIGHT off.

* Weighting by Y does not work perfectly, because some
* Y-values are negative, and weights must be positive.

* Get the minimum value of Y and center Y on it.
* Then weight by min-centered Y + .000001.
* The extra bit added so no weights = 0.

AGGREGATE /Ymin=MIN(y).

COMPUTE Ymincent = Y - Ymin + .000001.
DESCRIPTIVES X Y Ymincent.

WEIGHT by Ymincent.
GRAPH /HISTOGRAM(normal)=x.
WEIGHT off.
--
Bruce Weaver |
Thanks very much, Bruce. You are way ahead of me.
Stan
(In reply to Bruce Weaver's message of 12/9/2013 11:43 AM.)
To leave the list, send the command >> SIGNOFF SPSSX-L >> For a list of commands to manage subscriptions, send the command >> INFO REFCARD >> > > > > > ----- > -- > Bruce Weaver > [hidden email] > http://sites.google.com/a/lakeheadu.ca/bweaver/ > > "When all else fails, RTFM." > > NOTE: My Hotmail account is not monitored regularly. > To send me an e-mail, please use the address shown above. > > -- > View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Fitting-a-Gaussian-tp5723505p5723538.html > Sent from the SPSSX Discussion mailing list archive at Nabble.com. > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD > > > > ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by Bruce Weaver
Weights should be based on the uncertainties in that dataset (i.e. you give a higher weight to things that have less uncertainty). When I think in terms of weights related to statistical procedures, weights are related to the variance, and it only makes sense to have a positive variance. See the bubble plot in the code below, where points with smaller uncertainties are plotted as larger circles.
After the plot is my best guess at replicating the example on the Bucknell page, given its description. I don't have the license needed to run the CNLR command (and I'm too lazy to figure out the derivatives off-the-cuff), so this is untested. The LOSS function uses the given uncertainties instead of the usual mean squared error.

*******************************************.
FILE HANDLE data /name = "C:\Documents and Settings\andrew.wheeler\Desktop\SPSS_NonLin".
data list free file = "data\noisygaussian.txt" skip = 1 / x y unc.
dataset name ngauss.
formats x y (F2.1).

*Weights should be from the "unc" column - not from Y.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=x y unc
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: x=col(source(s), name("x"))
  DATA: y=col(source(s), name("y"))
  DATA: unc=col(source(s), name("unc"))
  GUIDE: axis(dim(1), label("x"))
  GUIDE: axis(dim(2), label("y"))
  SCALE: linear(aesthetic(aesthetic.size), reverse())
  ELEMENT: point(position(x*y), size(unc))
END GPL.

*Nonlinear function - taken from http://www.eg.bucknell.edu/physics/ASTR201/IDLTutorial/tutorial_05.html.
*The exponent must be negative for a Gaussian; with the factor 1/2, Wid_ is the standard deviation.
MODEL PROGRAM Amp_ = 1 Cent_ = 10 Wid_ = 10.
COMPUTE ARG_ = (x - Cent_)/Wid_.
COMPUTE PRED_ = Amp_ * EXP(-(ARG_**2)/2).
COMPUTE LOSS_ = ((PRED_ - y)/unc)**2.
*DERIVATIVES.
CNLR y
  /PRED PRED_
  /BOUNDS Wid_ > 0
  /LOSS = LOSS_.
*******************************************.
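For anyone who wants to sanity-check the model outside SPSS, here is a minimal Python sketch of the same kind of fit: a Gaussian fitted by minimizing the uncertainty-weighted squared error. It rests on several assumptions that are not in the thread: the profile is Amp * exp(-((x - Cent)/Wid)**2 / 2), the baseline has already been subtracted (so it is zero), the function names are made up for illustration, and the optimizer is a plain Levenberg-Marquardt loop, which is not necessarily what CNLR does internally. The three partial derivatives in the comments are the ones this form of the model implies, so they may be a starting point for a DERIVATIVES block.

```python
import math

def gaussian(x, amp, cent, wid):
    """Gaussian line profile on a zero baseline; amp < 0 gives an absorption line."""
    z = (x - cent) / wid
    return amp * math.exp(-0.5 * z * z)

def chi_square(params, xs, ys, uncs):
    """Sum of squared residuals, each scaled by its uncertainty."""
    return sum(((gaussian(x, *params) - y) / u) ** 2
               for x, y, u in zip(xs, ys, uncs))

def initial_guess(xs, ys):
    """Crude start: the most extreme point gives amp and cent; wid is a tenth of the x range."""
    i = max(range(len(ys)), key=lambda k: abs(ys[k]))
    return [ys[i], xs[i], (max(xs) - min(xs)) / 10.0]

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    sol = [0.0] * 3
    for r in reversed(range(3)):
        s = m[r][3] - sum(m[r][k] * sol[k] for k in range(r + 1, 3))
        sol[r] = s / m[r][r]
    return sol

def fit_gaussian(xs, ys, uncs, start, iterations=200):
    """Levenberg-Marquardt fit; returns [amp, cent, wid]."""
    params = list(start)
    best = chi_square(params, xs, ys, uncs)
    lam = 1e-3  # damping factor
    for _ in range(iterations):
        amp, cent, wid = params
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for x, y, u in zip(xs, ys, uncs):
            z = (x - cent) / wid
            e = math.exp(-0.5 * z * z)
            # Partial derivatives of the model, each divided by the uncertainty:
            # d/d amp = e,  d/d cent = amp*e*z/wid,  d/d wid = amp*e*z**2/wid
            grad = [e / u, amp * e * z / (wid * u), amp * e * z * z / (wid * u)]
            resid = (amp * e - y) / u
            for i in range(3):
                jtr[i] += grad[i] * resid
                for j in range(3):
                    jtj[i][j] += grad[i] * grad[j]
        for i in range(3):
            jtj[i][i] += lam * (jtj[i][i] + 1e-12)  # damp the diagonal
        step = solve3(jtj, [-g for g in jtr])
        if max(abs(s) for s in step) < 1e-12:
            break
        trial = [p + s for p, s in zip(params, step)]
        # Reject any step that would make the width non-positive.
        trial_loss = chi_square(trial, xs, ys, uncs) if trial[2] > 0 else math.inf
        if trial_loss < best:
            params, best, lam = trial, trial_loss, lam / 10.0
        else:
            lam *= 10.0
    return params

if __name__ == "__main__":
    # Synthetic, noise-free emission line, used only to exercise the code.
    xs = [i * 0.5 for i in range(81)]
    ys = [gaussian(x, 5.0, 20.3, 3.0) for x in xs]
    uncs = [0.1] * len(xs)
    amp, cent, wid = fit_gaussian(xs, ys, uncs, initial_guess(xs, ys))
    print("peak position:", cent)
```

The fitted `cent` is the x value (wavelength) at the peak of the curve, which is the number Stan is after. The same call handles an absorption line, because the amplitude is simply allowed to come out negative. With real spectra the starting values matter, so the crude `initial_guess` above may need replacing with values read off the plot.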
Thanks for running with that one Andy, I'm **WAY TOO LAZY** to dive into the details.
Hopefully Stan can sort the derivatives and roll with a new tool. Does require Advanced Stats module.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Thanks, David, Andy, and Bruce. In fact, I do have the advanced stats module.
Stan
Hi Stan,
When you come up with a working solution please post it back to the list for archival purposes. Thanks, David