Hi
I have data in the format below:

X Y
1 6
2 8
3 ?
4 ?
5 12

The data have 5 rows. Some values of X have a value of Y and some do not; here, Y is missing for X = 3 and X = 4.

I want to fill in the missing values using a linear regression through the values at X = 2 and X = 5.

How do I do this in SPSS?

Thanks

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
You can do this simply by saving the predicted values from the regression.

data list list/x y.
begin data
1 6
2 8
3 .
4 .
5 12
end data.
dataset name xy.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE PRED.

On Mon, Feb 25, 2019 at 6:59 PM albert_sun <[hidden email]> wrote:
> Hi
|
I should have finished by adding the command to copy the predicted values over the missing values:

if missing(y) y = PRE_1.

If PRE_1 already existed before the regression was run, the predicted values variable would get a different name.
Note that treating the imputed values as if they were real values might give misleading statistical results.

On Tue, Feb 26, 2019 at 7:44 AM Jon Peck <[hidden email]> wrote:
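For readers checking what PRE_1 will hold on the five-row example, here is a worked calculation (arithmetic added for illustration, not taken from the thread). It suggests that the regression uses all three complete cases, so its predictions at X = 3 and X = 4 are close to, but not the same as, the values on the straight line joining X = 2 and X = 5.

```latex
% Least-squares line through the three complete cases (1,6), (2,8), (5,12):
% n = 3, sum x = 8, sum y = 26, sum xy = 82, sum x^2 = 30
\begin{align*}
b_1 &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
     = \frac{3(82) - (8)(26)}{3(30) - 8^2} = \frac{38}{26} \approx 1.4615 \\
b_0 &= \bar{y} - b_1\bar{x}
     = \frac{26}{3} - \frac{19}{13}\cdot\frac{8}{3} = \frac{62}{13} \approx 4.7692 \\
\hat{y}(3) &= \frac{62 + 57}{13} \approx 9.154, \qquad
\hat{y}(4) = \frac{62 + 76}{13} \approx 10.615
\end{align*}

% Straight line through (2,8) and (5,12) only (linear interpolation):
\begin{align*}
y(3) &= 8 + \frac{12 - 8}{5 - 2}\,(3 - 2) \approx 9.333, \qquad
y(4) = 8 + \frac{12 - 8}{5 - 2}\,(4 - 2) \approx 10.667
\end{align*}
```

The two approaches differ because the point at X = 1 pulls the fitted line away from the segment between X = 2 and X = 5.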
|
In reply to this post by albert_sun
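On 26.02.2019 at 4:54, albert_sun wrote:
> What I want is to fill up the missing value by using the linear regression of values at X=2 and 5.

Did you mean linear interpolation?
In the Transform menu, choose Replace Missing Values.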
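The menu route corresponds to the RMV command in syntax. A minimal sketch, assuming the variables are named x and y as elsewhere in the thread; the new variable name y_lint is made up for illustration:

```spss
* Assumes variables x and y are in the active dataset.
* LINT interpolates in case order, so sort by x first.
SORT CASES BY x (A).

* y_lint is a hypothetical name for the filled-in copy of y.
RMV /y_lint=LINT(y).

LIST x y y_lint.
```

On the five-row example this should give roughly 9.33 and 10.67 at X = 3 and X = 4. Two caveats: the interpolation works on case positions rather than on the values of x, so it only matches interpolation in x when the x values are equally spaced, and a missing value in the first or last case is, as far as I know, left missing.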
|
In reply to this post by Jon Peck
Thanks for your reply, I should explain my issue further.
See the example data below. The plot of x against y looks roughly like a logistic curve. Because I am not sure how to get the regression coefficients from SPSS, I thought that linear interpolation around the missing values might give a good approximation (a smoother curve). I tried all the methods under Replace Missing Values (RMV), and I think the closest one is "mean of nearby points". The problem with this method is that two consecutive missing values receive the same estimate.

data list list/x y.
begin data
0 0.09
1 0.24
2 0.63
3 1.04
4 1.64
5 2.38
6 3.92
7 6.37
8 .
9 .
10 19.46
11 .
12 31.2
13 37.28
14 42.91
15 .
16 52.79
17 56.47
18 .
19 64.5
20 67.38
21 70.5
22 .
23 75.35
24 77.64
25 .
26 82.05
27 84.04
28 .
29 88.94
30 91.11
31 92.94
32 .
33 96.5
34 98.13
35 .
36 99.7
37 99.86
38 .
39 100
40 .
end data.

Plot of x and y: <http://spssx-discussion.1045642.n5.nabble.com/file/t339934/1.jpg>
|
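On getting coefficients for a logistic-looking curve like the one in the post above: one option is the LGSTIC model of CURVEFIT, which prints the coefficients and can save fitted values. This is a sketch only. The upper bound of 101 is an illustrative guess (it just has to exceed the largest observed y), y_filled is a made-up name, and whether SPSS's logistic form describes these data well is untested.

```spss
* Assumes the data from the post above are the active dataset (variables x and y).
* UPPERBOUND=101 is an illustrative guess, not a value from the thread.
CURVEFIT
  /VARIABLES=y WITH x
  /CONSTANT
  /MODEL=LGSTIC
  /UPPERBOUND=101
  /PLOT FIT
  /SAVE=PRED.

* Fill gaps in a copy so the original y is preserved.
* FIT_1 is the expected name of the saved fit when no FIT_ variable exists yet.
COMPUTE y_filled = y.
IF MISSING(y_filled) y_filled = FIT_1.
EXECUTE.
```

Unlike "mean of nearby points", fitted values from a curve differ from case to case, so two consecutive missing values would not receive the same estimate.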
In reply to this post by Kirill Orlov
I thought Linear Interpolation was for time series data. I didn't get a solution when I tried it. I think Jon's simple syntax approach is the correct one. Maybe Joost Van Ginkel has a solution.
Brian
From: SPSSX(r) Discussion <[hidden email]> on behalf of Kirill Orlov <[hidden email]>
Sent: Tuesday, February 26, 2019 10:53:47 AM
To: [hidden email]
Subject: Re: How to fill in the missing values?
|
Administrator
|
In reply to this post by Jon Peck
I'll jump in here before Art K does and suggest that the OP make a new variable to hold the imputed values so that the original variable is preserved.

COMPUTE Y2 = Y.
IF MISSING(Y2) Y2 = PRE_1.

;-)
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Art must be lying on the beach.

The same tactic (not lying on the beach) could be used with the CURVEFIT command if you want a more flexible fit.

On Tue, Feb 26, 2019 at 4:36 PM Bruce Weaver <[hidden email]> wrote:
> I'll jump in here before Art K does and suggest that the OP make a new
|
Administrator
|
Good point re CURVEFIT. It had also occurred to me that one could use UNIANOVA to save the unstandardized fitted values. But when I tried it (see below), I found that it did not save the fitted values for cases where Y was missing. I did not expect that! (I'm using 64-bit SPSS 25.0.0.2 for Windows, by the way.)

Any thoughts on why UNIANOVA behaves differently than CURVEFIT and REGRESSION, Jon?

SHOW MXWARNS.
PRESERVE.
SET MXWARNS=0.
DATA LIST FREE / X Y.
begin data
0 0.09
1 0.24
2 0.63
3 1.04
4 1.64
5 2.38
6 3.92
7 6.37
8 .
9 .
10 19.46
11 .
12 31.2
13 37.28
14 42.91
15 .
16 52.79
17 56.47
18 .
19 64.5
20 67.38
21 70.5
22 .
23 75.35
24 77.64
25 .
26 82.05
27 84.04
28 .
29 88.94
30 91.11
31 92.94
32 .
33 96.5
34 98.13
35 .
36 99.7
37 99.86
38 .
39 100
40 .
END DATA.
RESTORE.
SHOW MXWARNS.

* Mean-center X to prevent REGRESSION from excluding X from the model later.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /Xmean=MEAN(x)
.
COMPUTE X = x-Xmean.
* COMPUTE X^2 and X^3 for use with REGRESSION later.
COMPUTE Xsq = X**2.
COMPUTE Xcu = X**3.

GRAPH
  /SCATTERPLOT(BIVAR)=X WITH Y
  /MISSING=LISTWISE.

* At first glance, it looks like a cubic fit might not be too bad.
* It may not fit as well at the extremes of the X-axis, but let's go with it for now.

* [1] Use CURVEFIT to estimate the model and save the fitted values of Y.
* Curve Estimation.
TSET MXNEWVAR=1.
CURVEFIT
  /VARIABLES=Y WITH X
  /CONSTANT
  /MODEL=CUBIC
  /PLOT FIT
  /PRINT ANOVA
  /SAVE=PRED
.

* [2] Now use REGRESSION.
REGRESSION
  /DEPENDENT Y
  /METHOD=ENTER X Xsq Xcu
  /SAVE PRED.

* [3] Finally, use UNIANOVA.
* Note that UNIANOVA does not require computation of the polynomial terms,
* as they can be specified on the DESIGN sub-command as x*x and x*x*x.
UNIANOVA Y WITH X
  /SAVE=PRED
  /CRITERIA=ALPHA(0.05)
  /DESIGN=X X*X X*X*X.

VARIABLE LABELS
  X "X (mean-centered)"
  Y "Y (observed)"
  FIT_1 "Y-hat from CURVEFIT"
  PRE_1 "Y-hat from REGRESSION"
  PRE_2 "Y-hat from UNIANOVA"
.
DESCRIPTIVES x y FIT_1 PRE_1 PRE_2.

TEMPORARY.
SELECT IF MISSING(y).
LIST x y FIT_1 PRE_1 PRE_2.
OUTPUT from that final LIST command:

     X     Y     FIT_1     PRE_1  PRE_2
-12.00     .  16.63361  16.63361      .
-11.00     .  20.23010  20.23010      .
 -9.00     .  27.78512  27.78512      .
 -5.00     .  43.77652  43.77652      .
 -2.00     .  55.90363  55.90363      .
  2.00     .  71.21205  71.21205      .
  5.00     .  81.33909  81.33909      .
  8.00     .  89.69805  89.69805      .
 12.00     .  97.14831  97.14831      .
 15.00     .  99.25626  99.25626      .
 18.00     .  97.77430  97.77430      .
 20.00     .  94.52202  94.52202      .

* How about that--UNIANOVA does not generate fitted values for
* cases where Y is missing. I did not know that.
--
Bruce Weaver
|
In reply to this post by Jon Peck
It is still a little cool here in 33772. Only 71. So I am sitting at my desk looking across the lake at the park on the other side. Actually, I'm working on a Statistics Without Borders project.
Art Kendall
Social Research Consultants |
In reply to this post by Bruce Weaver
"Any thoughts on why UNIANOVA behaves differently than CURVEFIT and REGRESSION, Jon?"

I don't know, but I suspect that it was convenient (and sometimes useful) to save predicted values for missing-value cases in REGRESSION, because it has its own filtering/selection process while UNIANOVA does not. The CURVEFIT code was probably based on REGRESSION. I don't see any documentation for any of these procedures that specifies the intended behavior.

On Wed, Feb 27, 2019 at 7:06 AM Bruce Weaver <[hidden email]> wrote:
> Good point re CURVEFIT. It had also occurred to me that one could use
|