Problem with scatterplot confidence interval via GPL...again!

Problem with scatterplot confidence interval via GPL...again!

Bruce Weaver
Administrator
Hello folks.  When I was searching the archives before posting, I discovered that I have asked a similar question in the past:  

http://spssx-discussion.1045642.n5.nabble.com/Add-prediction-intervals-to-scatterplot-via-GPL-td1092345.html#a1092347

But despite looking at ViAnn's response there, I can't see what I'm doing wrong this time.  Here is my syntax:

* The following Chart Builder syntax adds the regression
* line via syntax (see the last ELEMENT line).

GGRAPH
  /GRAPHDATASET NAME="graphdataset"
   VARIABLES=paeduc[LEVEL=SCALE] educ[LEVEL=SCALE]
   MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: paeduc=col(source(s), name("paeduc"), unit.category())
  DATA: educ=col(source(s), name("educ"), unit.category())
  GUIDE: axis(dim(1), label("highest year school completed, father"))
  GUIDE: axis(dim(2), label("highest year of school completed"))
  ELEMENT: point(position(paeduc*educ))
  ELEMENT: line(position(smooth.linear(paeduc*educ)))
END GPL.

* Try adding a 95% CI.
GGRAPH
  /GRAPHDATASET NAME="graphdataset"
   VARIABLES=paeduc[LEVEL=SCALE] educ[LEVEL=SCALE]
   MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: paeduc=col(source(s), name("paeduc"), unit.category())
  DATA: educ=col(source(s), name("educ"), unit.category())
  GUIDE: axis(dim(1), label("highest year school completed, father"))
  GUIDE: axis(dim(2), label("highest year of school completed"))
  ELEMENT: point(position(paeduc*educ))
  ELEMENT: line(position(region.confi.smooth.linear(paeduc*educ)))
END GPL.

* This does not work.  I get the following error/warning message:
* "Specification requires scales of two different types to be merged,
*  which is not possible."

* Try changing variable level for educ & paeduc in the file meta-data.
VARIABLE LEVEL educ paeduc(SCALE).
* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset"
   VARIABLES=paeduc[LEVEL=SCALE] educ[LEVEL=SCALE]
   MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: paeduc=col(source(s), name("paeduc"), unit.category())
  DATA: educ=col(source(s), name("educ"), unit.category())
  GUIDE: axis(dim(1), label("highest year school completed, father"))
  GUIDE: axis(dim(2), label("highest year of school completed"))
  ELEMENT: point(position(paeduc*educ))
  ELEMENT: line(position(region.confi.smooth.linear(paeduc*educ)))
END GPL.

* This still doesn't work--I get the same error message.

* Sample ELEMENT line copied from the GPL Reference Manual:
* ELEMENT: line(position(region.confi.smooth.linear(salbegin*salary)))
.

As the inserted comments indicate, both attempts to add the confidence interval for the regression line are generating the same error message.  Those reading via Nabble can view the output in the attached Excel file.  

Can anyone spot the problem?  Andy?  

GGRAPH_scatterplot_with_regression_line_&_CI.xls


p.s. - Happy Canada Day to my fellow Canadians tomorrow, and Happy 4th of July to our American friends on Monday.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Problem with scatterplot confidence interval via GPL...again!

ViAnn Beadle
The unit.category function on your DATA statements for educ and paeduc is the culprit.

On Thu, Jun 30, 2016 at 4:11 PM Bruce Weaver <[hidden email]> wrote:
--- snip ---

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Problem with scatterplot confidence interval via GPL...again!

Robert L
In reply to this post by Bruce Weaver
This could be an answer to a question that has already been answered, but I had a look at your original question, and it seemed as if a key would be to save the prediction limits in the regression modelling. But this might be too simple a solution?

As far as I can see, there is no corresponding way to save the prediction intervals in UNIANOVA, and in GENLIN only the intervals for the mean can be saved.

*****************************************
DATA LIST LIST/ x(F2) y(F2).
BEGIN DATA
13 7
15 6
16 5
17 4
18 3
END DATA.
 
DATASET NAME pred_ci.

VARIABLE LEVEL y (SCALE).

* The /SAVE subcommand saves the individual prediction limits (LICI_1 and UICI_1).
REGRESSION
  /MISSING LISTWISE
  /STATISTICS DEFAULTS
  /CRITERIA=PIN(.05) POUT(.10) CIN(95)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE ICIN.

* Slightly modified from the GGRAPH syntax pasted from the menus: the interval elements were changed from "point" to "line", and "color" was changed to "shape".
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=x y LICI_1 UICI_1 MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: x=col(source(s), name("x"))
  DATA: y=col(source(s), name("y"))
  DATA: LICI_1=col(source(s), name("LICI_1"))
  DATA: UICI_1=col(source(s), name("UICI_1"))
  GUIDE: axis(dim(1), label("x"))
  GUIDE: axis(dim(2), label("y"))
  TRANS: x_y=eval("x - y")
  TRANS: x_LICI_1=eval("x - 95% L CI for y individual")
  TRANS: x_UICI_1=eval("x - 95% U CI for y individual")
  ELEMENT: point(position(x*y), color.exterior(x_y))
  ELEMENT: line(position(x*LICI_1), shape(shape.dash))
  ELEMENT: line(position(x*UICI_1), shape(shape.dash))
END GPL.
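
The syntax above draws prediction limits for individual cases (ICIN). For the confidence interval of the regression line itself, which is what the original question asked about, the same approach should work with MCIN in place of ICIN. The following is only a sketch under some assumptions: it reuses the same toy data, the dataset name mean_ci is made up for illustration, and the saved variables are assumed to be named PRE_1, LMCI_1 and UMCI_1 because this is the first REGRESSION run that saves them in this dataset.

```spss
* Toy data, same values as in the example above.
DATA LIST LIST / x (F2) y (F2).
BEGIN DATA
13 7
15 6
16 5
17 4
18 3
END DATA.
DATASET NAME mean_ci.
VARIABLE LEVEL x y (SCALE).

* PRED saves the fitted values and MCIN saves the 95% limits for the mean.
* Assumed names on a first run are PRE_1, LMCI_1 and UMCI_1.
REGRESSION
  /CRITERIA=CIN(95)
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE PRED MCIN.

* No unit.category() on the DATA statements, so all variables stay continuous.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=x y PRE_1 LMCI_1 UMCI_1
   MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: x=col(source(s), name("x"))
  DATA: y=col(source(s), name("y"))
  DATA: PRE_1=col(source(s), name("PRE_1"))
  DATA: LMCI_1=col(source(s), name("LMCI_1"))
  DATA: UMCI_1=col(source(s), name("UMCI_1"))
  GUIDE: axis(dim(1), label("x"))
  GUIDE: axis(dim(2), label("y"))
  ELEMENT: point(position(x*y))
  ELEMENT: line(position(x*PRE_1))
  ELEMENT: line(position(x*LMCI_1), shape(shape.dash))
  ELEMENT: line(position(x*UMCI_1), shape(shape.dash))
END GPL.
```

If REGRESSION has already saved these variables once in the same dataset, the suffix will be higher (_2 and so on), and the names in the GGRAPH block need to be adjusted to match.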

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bruce Weaver
Sent: July 1, 2016 00:11
To: [hidden email]
Subject: Problem with scatterplot confidence interval via GPL...again!

Hello folks.  When I was searching the archives before posting, I discovered that I have asked a similar question in the past:  

http://spssx-discussion.1045642.n5.nabble.com/Add-prediction-intervals-to-scatterplot-via-GPL-td1092345.html#a1092347.  

.
.
.

Robert Lundqvist

Re: Problem with scatterplot confidence interval via GPL...again!

Andy W
In reply to this post by Bruce Weaver
I imagine:

  DATA: educ=col(source(s), name("educ"), unit.category())

needs to be sans "unit.category()":

  DATA: educ=col(source(s), name("educ"))

then it will work.

For linear regression to make sense, you should do the same for the paeduc variable, although SPSS will still draw the line even if the X axis is categorical. (I presume it then treats the arbitrary order of the X-axis categories like equally spaced numeric values; the categories are drawn in alphabetical order by default.)

------

What is going on is that the VARIABLES part of the GRAPHDATASET subcommand specifies the two variables as scale, but the inline GPL specifies them as categorical. Inline GPL wins, so the [LEVEL=?] settings don't do anything whenever the source is inline. (I'm not sure exactly when those LEVEL settings are used.)
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Problem with scatterplot confidence interval via GPL...again!

Bruce Weaver
Administrator
Thanks Andy & ViAnn.  I'll give that a try when I get back to the office next week.

Andy W wrote
--- snip ---

Re: Problem with scatterplot confidence interval via GPL...again!

Bruce Weaver
Administrator
In reply to this post by Robert L
Thanks Robert.  I think that if I get rid of unit.category(), as suggested by ViAnn & Andy, it should work without saving CI limits via REGRESSION.  I'll confirm after I get back to the office and give it a try.  


Robert Lundqvist-3 wrote
This could be an answer to a question that has already been answered, but I had a look at your original question, and it seemed as if a key would be to save the prediction limits in the regression modelling. But this might be too simple a solution?

--- snip ---

Re: Problem with scatterplot confidence interval via GPL...again!

Bruce Weaver
Administrator
In reply to this post by Bruce Weaver
I can now confirm that getting rid of 'unit.category()' solved the problem.  For the record, here is the final version of the working GGRAPH syntax.  I added 'shape(shape.dash)' to the ELEMENT line for the 95% CI to draw the CI using dashed lines.

* Try adding a 95% CI.
GGRAPH
  /GRAPHDATASET NAME="graphdataset"
   VARIABLES=paeduc[LEVEL=SCALE] educ[LEVEL=SCALE]
   MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: paeduc=col(source(s), name("paeduc"))
  DATA: educ=col(source(s), name("educ"))
  GUIDE: axis(dim(1), label("highest year school completed, father"))
  GUIDE: axis(dim(2), label("highest year of school completed"))
  ELEMENT: point(position(paeduc*educ))
  ELEMENT: line(position(smooth.linear(paeduc*educ)))
  ELEMENT: line(position(region.confi.smooth.linear(paeduc*educ)), shape(shape.dash))
END GPL.




Bruce Weaver wrote
--- snip ---