Compute a counting variable

Compute a counting variable

Carlos Renato (www.estatistico.org)
Dear friends of this important list,

     I want to compute a variable that counts the rows of the data file. I want
to reproduce the Excel procedure (A2)=(A1)+1, (A3)=(A2)+1, etc.
For example:

Variable_Count
1
2
3
4
5
6
7
8
.
.
.
n

Thanks to all, and good work.

Carlos Renato
Statistician
Recife - PE - Brazil

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Compute a counting variable

vlad simion
Hi Carlos,

try the "lag" command, something like this:
compute id=lag(id)+1.

hth,
Vlad


Re: Compute a counting variable

ViAnn Beadle
In reply to this post by Carlos Renato (www.estatistico.org)
If you want a simple case counter which goes from 1 to the number of cases,
use the built-in system variable to create a new variable:

COMPUTE Variable_Count=$CASENUM.
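
For anyone who wants to try this on a throwaway file first, here is a minimal sketch. The variable name score and the four data values are made up for illustration; only $CASENUM and Variable_Count come from this thread.

```spss
* Hypothetical four-case file, used only to demonstrate the counter.
DATA LIST FREE / score.
BEGIN DATA
12 15 9 20
END DATA.

* Copy the system case number into a permanent variable.
COMPUTE Variable_Count = $CASENUM.
FORMATS Variable_Count (F8.0).
LIST.
```

Because the counter is stored as an ordinary variable, its values stay with their cases if the file is sorted later, whereas $CASENUM itself always reflects the current case order.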



Re: Compute a counting variable

Carlos Renato (www.estatistico.org)
In reply to this post by vlad simion
Dear friend Vlad Simion

Unfortunately, this did not solve my problem. The variable was created, but
nothing was computed in it.

Thanks.

Carlos Renato
Statistician
Recife - PE - Brazil


Re: Compute a counting variable

Carlos Renato (www.estatistico.org)
In reply to this post by ViAnn Beadle
Thanks, ViAnn Beadle.

    This procedure solves the problem.

Carlos Renato
Statistician
Recife - PE - Brazil


Re: Compute a counting variable

Richard Ristow
In reply to this post by vlad simion
ViAnn Beadle's solution is much the simplest for the problem as
posed. But it's worth looking at the LAG solution, because LAG logic
has many uses.

At 07:28 AM 8/26/2008, vlad simion wrote:

>try the "lag" command, something like this:
>compute id=lag(id)+1.

And at 07:47 AM 8/26/2008, Carlos Renato replied:

>Unfortunately, this did not solve my problem. The variable was created, but
>nothing was computed in it.

Right. The problem is in computing the first value. In the first
case, there is no earlier case to 'lag' to, so "lag(id)" is missing.
("Missing" is a special category for SPSS numbers; it's interpreted
as "there's no notion of the value, so we can't use it for
computing.")  Then you have

compute id={missing}+1.

and *that* is missing. So in the second case, "lag(id)" is the
missing value just computed, and again you have

compute id={missing}+1.

which is also missing. And so on, so indeed "id" is computed with all
its values missing.

There are two solutions. Both have been tested.

A. Use function SUM to add, since it ignores missing values. (This is
closest to how Excel does it):

COMPUTE id2 = sum(lag(id2),1).

B. Compute the first case differently. (Most SPSS programmers do it
this way; but that's a matter of style, not of necessity):

DO IF   $CASENUM EQ 1.
.  COMPUTE id3 = 1.
ELSE.
.  COMPUTE id3 = lag(id3)+1.
END IF.
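
To see the difference side by side, the failing version and the two fixes can be run on a small made-up file. This is only a sketch: the variable score and its values are invented for the demonstration, and id, id2, and id3 are the names used above.

```spss
* Hypothetical four-case file, used only for the demonstration.
DATA LIST FREE / score.
BEGIN DATA
12 15 9 20
END DATA.

* Original attempt: missing + 1 is missing in every case.
COMPUTE id = lag(id)+1.

* Solution A: SUM ignores the missing lag value in the first case.
COMPUTE id2 = sum(lag(id2),1).

* Solution B: set the first case explicitly, then lag from there.
DO IF   $CASENUM EQ 1.
.  COMPUTE id3 = 1.
ELSE.
.  COMPUTE id3 = lag(id3)+1.
END IF.

FORMATS id2 id3 (F8.0).
LIST.
```

LIST should show id as system-missing (a period) in every row, while id2 and id3 both count 1, 2, 3, 4.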


Re: Outliers

ViAnn Beadle
In reply to this post by ViAnn Beadle
If I understand this correctly, you want to show the actual values of
SA_STARTRATE and SA_EXITRATE rather than a case id for those cases having
the outlier values?

-----Original Message-----
From: Genevieve Thompson [mailto:[hidden email]]
Sent: Tuesday, October 07, 2008 12:54 PM
To: ViAnn Beadle
Subject: Outliers

Hi,
I am trying to create a clustered box plot. The boxes represent start
and exit rates for various job classifications. I am trying to show the
outliers for both start and exit rates on the boxplot but in SPSS, I
only have the option of selecting one point ID label (through the chart
builder). Therefore, my graph is showing the outliers for start rates on
both the start and exit boxes. It should show an outlier for the start
rate box and a separate one for the exit rate box. This is the current
syntax being run. Would anyone know what syntax I should run to show
this?

EXAMINE VARIABLES=SA_STARTRATE SA_EXITRATE BY classcollapse
  /COMPARE VARIABLE
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL
  /ID=OUTLIERST
  /MISSING=LISTWISE.

Thanks


Re: Outliers

ViAnn Beadle
I'm not sure how specific values really help the interpretation of a boxplot
beyond the y axis scale.

Nevertheless, here are two different solutions. To demonstrate them,
I'm using cars.sav to compare horse (horsepower) and mpg (miles
per gallon) for different categories of origin.

**************************************************************************
Reorganize the data:

By and large, most graphics within SPSS are designed to compare values of a
variable within groups of cases rather than to directly compare different
variables. So one approach is to use the VARSTOCASES command to spread your
two variables to separate cases. You can then specify your new single
observation as the ID variable.

VARSTOCASES
  /ID=id
  /MAKE measure FROM horse mpg
  /INDEX=Index1(2)
  /KEEP=origin
  /NULL=KEEP.

variable labels measure "Measure" Index1 "Variables".
* Index values follow the order of the MAKE list: 1 = horse, 2 = mpg.
value labels Index1 1 "Horsepower" 2 "Miles Per Gallon".


EXAMINE VARIABLES=measure BY origin BY Index1
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL
  /ID=measure.
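
If the restructuring step is unfamiliar, it may help to run VARSTOCASES on a tiny file and list the result before drawing anything. The sketch below uses three invented cases that only borrow the variable names origin, horse, and mpg from cars.sav; the values are not taken from that file.

```spss
* Hypothetical three-case file standing in for cars.sav.
DATA LIST FREE / origin horse mpg.
BEGIN DATA
1 130 18
1 165 15
2 95 24
END DATA.

* Stack horse and mpg into one variable, two rows per original case.
VARSTOCASES
  /ID=id
  /MAKE measure FROM horse mpg
  /INDEX=Index1(2)
  /KEEP=origin.

LIST.
```

Each original case should come out as two rows sharing the same id: the row with Index1 = 1 carries the horse value and the row with Index1 = 2 carries the mpg value, following the order of the MAKE list.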
****************************************************************************
Use GGRAPH and GPL to produce the plot:

The blend operator within the GPL grammar can be used to combine separate
variables in the boxplot. Things get a bit tricky since the blend is being
specified along with clustering. Here's an example that uses the Cars.sav
sample file. The two variables being plotted are mpg and horse (horsepower)
for different categories of origin.

GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=origin mpg horse
MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: origin=col(source(s), name("origin"), unit.category())
DATA: mpg=col(source(s), name("mpg"))
DATA: horse=col(source(s), name("horse"))
COORD: rect(dim(1,2), cluster(3))
SCALE: linear(dim(2), include(0))
GUIDE: axis(dim(3), label("Country of Origin"))
ELEMENT: schema(position(bin.quantile.letter(("Horsepower"*horse+"Miles Per Gallon"*mpg)*origin)),
      color("Horsepower"+"Miles per Gallon"), label(horse+mpg))
END GPL.

Details:

1. The blend operator signified by the + is used to combine the variables:
   ...("Horsepower"*horse+"Miles Per Gallon"*mpg)*origin...
2. The blend operator is used to color the two boxes:
   ...color("Horsepower"+"Miles per Gallon")...
3. The blend operator is used to label the outlier values by the values of
horse and mpg:
   ...label(horse+mpg)...
4. Clustering introduces some complications. The cluster function on the
COORD statement reorders the components of the algebra so origin is listed
in the third dimension but actually defines labels on the X axis.

GPL provides a lot more control over the format of the chart including the
ability to specify axes and legend titles.



-----Original Message-----
From: Genevieve Thompson [mailto:[hidden email]]
Sent: Tuesday, October 07, 2008 8:18 PM
To: ViAnn Beadle
Subject: RE: Outliers


Yes, that's correct.
