creating new variables


creating new variables

abdelrhman elmubarak

  

Hi

my data looks as below

 

id   Repair Date   Repair Type   money
16   20090430      115           23
30   20090325      209           45
30   20090124      103           62
47   20090409      209           78
47   20090409      101           69


 

I want to create two more variables for those who did repair type 209. The purpose is to know whether they had more jobs done on the same day as their visit, and how much money those jobs generated. In the end my data should look as below

 

id   Repair Date   Repair Type   money   more jobs plus   more money plus
16   20090430      115           23
30   20090325      209           45      0
30   20090124      103           62
47   20090409      209           78      1                69
47   20090409      101           69


 

I can explain more if it is required

Thanks in advance

 

Abdulrahman

 



Re: creating new variables

Melissa Ives
One way would be to sort the cases so that the 209 record comes last within each id and date. (It may help to create a dichotomous flag, rep209 = 1 for type 209 and 0 otherwise; something like this would be needed if there are more than 2 repairs in a day.)
Then use the LAG function to create morejobs and moremoney.
 
THE FOLLOWING ASSUMES that no client has >2 repairs in a day AND 209 is the highest repair value.
 

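* Sort so that, within each id and date, the type-209 record comes last.
sort cases by id repdate reptype.

* Type 209 preceded by another record for the same id and date: one extra job.
if (id=lag(id) and repdate=lag(repdate) and reptype=209) morejobs=1.

* Type 209 that is the first record for its id and date: no extra job.
* $casenum=1 covers the first case, where lag() is system-missing.
if (($casenum=1 or id ne lag(id) or repdate ne lag(repdate)) and reptype=209) morejobs=0.

* Money from the extra job on the same day.
if (id=lag(id) and repdate=lag(repdate) and reptype=209) moremoney=lag(money).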

 

 

The actual syntax will differ if there can be more than one extra repair on the same date (i.e. one id could have 3+ per day), but similar logic should work.
 
HTH,
Melissa 
 


PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.

Re: creating new variables

abdelrhman elmubarak
Thanks, Melissa.
I am having difficulty when there is more than one possible extra repair on the same day. Your syntax works fine when there is only one extra repair on the same day. Any help is highly appreciated.
Thanks,
Abdulrahman
 


Re: creating new variables

ariel barak
Hi Abdulrahman,
 
The syntax below works with the data you posted and will take care of multiple repairs. If there are multiple repairs of type 209 on a given day, the syntax would not work properly. Basically, I wrote code to create a total cost for all repairs on a given day for a given ID number, excluding the cost of repair type 209. I then added those variables to the original dataset, matched by ID and repair date. If the repair type was not 209, I set the new variables to zero.
 
Let me know if this works for your dataset...if not, maybe I can tweak it a little.
 
Thanks,
Ari
 
 
DATA LIST LIST /ID (F8) RepairDate (ADATE10) RepairType (F8) Money (F8).
BEGIN DATA
16 04/30/2009 115 23
30 03/25/2009 209 45
30 01/24/2009 103 62
47 04/09/2009 101 69
47 04/09/2009 209 78
END DATA.
 
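DATASET NAME Original.

* Declare the dataset that will hold the per-day totals.
DATASET DECLARE ExtraRepairs.

* Total the jobs and money for each ID and date, leaving out type 209.
SORT CASES BY ID RepairDate RepairType.
TEMPORARY.
SELECT IF RepairType<>209.
AGGREGATE
  /OUTFILE='ExtraRepairs'
  /BREAK=ID RepairDate
  /MoreJobsPlus=N
  /MoreMoneyPlus=SUM(Money).

* Attach the totals to every record with the same ID and date.
MATCH FILES /FILE=*
  /TABLE='ExtraRepairs'
  /BY ID RepairDate.

* Records that are not type 209 get zero.
IF RepairType<>209 MoreJobsPlus=0.
IF RepairType<>209 MoreMoneyPlus=0.

* Type-209 records with no other repair that day also get zero.
RECODE MoreJobsPlus MoreMoneyPlus (SYSMIS=0).
EXECUTE.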


Re: creating new variables

Richard Ristow
In reply to this post by abdelrhman elmubarak
At 04:36 AM 5/7/2009, abdelrhman elmubarak wrote:

my data looks as below
The posted data, extended for this reply:
|-----------------------------|---------------------------|
|Output Created               |12-MAY-2009 00:39:31       |
|-----------------------------|---------------------------|
[Input]
 id Repair_Date Repair_Type   money

 16  2009/04/30     115      $23.00
 30  2009/03/25     209      $45.00
 30  2009/01/24     103      $62.00
 47  2009/04/09     209      $78.00
 47  2009/04/09     101      $69.00
 50  2009/05/01     117      $41.00
 50  2009/05/01     209      $75.00
 50  2009/05/01     105      $15.00
 51  2009/05/15     110      $62.00
 51  2009/05/15     209      $51.00
 51  2009/05/15     103      $67.00
 51  2009/05/15     209      $50.00
 51  2009/05/15     111      $84.00


Number of cases read:  13    Number of cases listed:  13

I want to create two more variables for those who did repair type 209. The purpose is to know whether they had more jobs done on the same day as their visit, and how much money those jobs generated. In the end my data should look as below
|-----------------------------|---------------------------|
|Output Created               |12-MAY-2009 00:39:31       |
|-----------------------------|---------------------------|
[Desired]
 id Repair_Date Repair_Type   money more_jobs_plus more_money_plus

 16  2009/04/30     115      $23.00         .              .
 30  2009/03/25     209      $45.00         0              .
 30  2009/01/24     103      $62.00         .              .
 47  2009/04/09     209      $78.00         1           $69.00
 47  2009/04/09     101      $69.00         .              .
 50  2009/05/01     117      $41.00         .              .
 50  2009/05/01     209      $75.00         2           $56.00
 50  2009/05/01     105      $15.00         .              .
 51  2009/05/15     110      $62.00         3          $198.00
 51  2009/05/15     209      $51.00         .              .
 51  2009/05/15     103      $67.00         .              .
 51  2009/05/15     209      $50.00         3          $198.00
 51  2009/05/15     111      $84.00         .              .


Number of cases read:  13    Number of cases listed:  13


At 11:02 AM 5/11/2009, Ariel Barak wrote:
 
The following syntax works with the data you posted and will take care of multiple repairs. If there are multiple repair types of 209 on a given day, the syntax would not work properly.

Here's a variation. Like Ariel's, it's based on AGGREGATE. If there are multiple repairs of type 209 on the same day, it puts the same values of more_jobs_plus and more_money_plus on all type-209 lines: the number of all services not type 209, and total money from all such services.

IF Repair_Type NE 209 more_jobs_plus  = 1.
IF Repair_type NE 209 more_money_plus = money.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
      /BREAK           = id Repair_Date
      /more_jobs_plus  'Number of services not type 209'  =
   SUM(more_jobs_plus)
      /more_money_plus 'Money from services not type 209' =
   SUM(more_money_plus).

FORMATS more_jobs_plus  (F3)
       /more_money_plus (DOLLAR7.2).
 
IF Repair_Type NE 209 more_jobs_plus  = $SYSMIS.
IF Repair_type NE 209 more_money_plus = $SYSMIS.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |12-MAY-2009 00:39:32       |
|-----------------------------|---------------------------|
 id Repair_Date Repair_Type   money more_jobs_plus more_money_plus

 16  2009/04/30     115      $23.00         .              .
 30  2009/03/25     209      $45.00         .              .
 30  2009/01/24     103      $62.00         .              .
 47  2009/04/09     209      $78.00         1           $69.00
 47  2009/04/09     101      $69.00         .              .
 50  2009/05/01     117      $41.00         .              .
 50  2009/05/01     209      $75.00         2           $56.00
 50  2009/05/01     105      $15.00         .              .
 51  2009/05/15     110      $62.00         .              .
 51  2009/05/15     209      $51.00         3          $213.00
 51  2009/05/15     103      $67.00         .              .
 51  2009/05/15     209      $50.00         3          $213.00
 51  2009/05/15     111      $84.00         .              .

Number of cases read:  13    Number of cases listed:  13
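
A possible follow-up, offered as a sketch and not as part of the posted solution: in the listing above, id 30 on 2009/03/25 keeps a missing more_jobs_plus, because SUM over a group with no non-209 services is system-missing, whereas the requested layout shows 0 there. Assuming that a type-209 visit with no other service that day should count as 0 extra jobs, one more IF statement after the AGGREGATE step would fill it in:

```spss
* Assumption: a type-209 visit with no other service that day counts as 0 extra jobs.
IF (Repair_Type EQ 209 AND MISSING(more_jobs_plus)) more_jobs_plus = 0.
LIST.
```

more_money_plus is left missing on such rows, as in the requested layout.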
=============================
APPENDIX: Test data, and code
=============================
DATA LIST LIST/
     id     Repair_Date  Repair_Type     money
    (F3,    F10,         F3,    F4). 
BEGIN DATA
     16     20090430     115     23
     30     20090325     209     45
     30     20090124     103     62
     47     20090409     209     78
     47     20090409     101     69
     50     20090501     117     41
     50     20090501     209     75
     50     20090501     105     15
     51     20090515     110     62
     51     20090515     209     51
     51     20090515     103     67
     51     20090515     209     50
     51     20090515     111     84
END DATA.
.  /**/  LIST  /*-*/.

*  Convert Repair_Date to an SPSS date variable: .... .

COMPUTE    #Day      = MOD(Repair_Date,100).
COMPUTE    #Month    = MOD((Repair_Date-#Day)/100,100).
COMPUTE    #Year     = (Repair_Date-100*#Month-#Day)/1E4.
COMPUTE    #SPSSdate = DATE.DMY(#Day,#Month,#Year).

FORMATS    #Day #Month #Year (F4)
          /#SPSSdate         (SDATE10).


COMPUTE    Repair_Date = #SPSSdate.
FORMATS    Repair_Date (SDATE10).

FORMATS    money       (DOLLAR7.2).
DATASET NAME Input.
LIST.

DATA LIST LIST/
     id  Repair_Date  Repair_Type money  more_jobs_plus more_money_plus
    (F3, F10,         F3,         F4,    F3,            F3). 
BEGIN DATA
     16  20090430      115      23             
     30  20090325      209      45                 0
     30  20090124      103      62
     47  20090409      209      78                  1      69
     47  20090409      101      69     
     50     20090501     117     41
     50     20090501     209     75    2 56
     50     20090501     105     15
     51     20090515     110     62    3 198
     51     20090515     209     51
     51     20090515     103     67
     51     20090515     209     50   3 198
     51     20090515     111     84
END DATA.


*  Convert Repair_Date to an SPSS date variable: .... .

COMPUTE    #Day      = MOD(Repair_Date,100).
COMPUTE    #Month    = MOD((Repair_Date-#Day)/100,100).
COMPUTE    #Year     = (Repair_Date-100*#Month-#Day)/1E4.
COMPUTE    #SPSSdate = DATE.DMY(#Day,#Month,#Year).

FORMATS    #Day #Month #Year (F4)
          /#SPSSdate         (SDATE10).

COMPUTE    Repair_Date = #SPSSdate.
FORMATS    Repair_Date  (SDATE10).

FORMATS    money more_money_plus     
                        (DOLLAR7.2).
DATASET NAME Desired.
LIST.


*  ....   Calculate the desired new variables                    .... .

NEW FILE.
ADD FILES
  /FILE=Input.

*  A. Compute the 'extra' cost individually for those services   .... .
*     that are *not* type 209.                                   .... .

IF Repair_Type NE 209 more_jobs_plus  = 1.
IF Repair_type NE 209 more_money_plus = money.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
      /BREAK           = id Repair_Date
      /more_jobs_plus  'Number of services not type 209'  = 
   SUM(more_jobs_plus)
      /more_money_plus 'Money from services not type 209' =
   SUM(more_money_plus).

FORMATS more_jobs_plus  (F3)
       /more_money_plus (DOLLAR7.2).

.  /**/  LIST  /*-*/.


IF Repair_Type NE 209 more_jobs_plus  = $SYSMIS.
IF Repair_type NE 209 more_money_plus = $SYSMIS.

LIST.


Re: creating new variables

abdelrhman elmubarak

Thank you very much, Richard! This is exactly what I want. Thanks also to Ariel and Melissa.

Regards

Abdulrahman


 


recode time variable

abdelrhman elmubarak
In reply to this post by Richard Ristow

Dear Listers

How can I recode a time variable such as the one below

7:12   7:29   8:30

into a 30-minute time interval variable as below?

7:01 - 7:30 = 1

7:31 - 8:30 = 2

 

Thanks in advance

 

Abdulrahman Yousef Elmubar

 



Re: recode time variable

Lemon, John S.

As the time is recorded as the number of seconds from midnight you could use:

 

RECODE t_of_day (Lowest thru 1800=1) (1801 thru 3600=2) (3601 thru 5400=3) INTO t_of_day_recoded.

 

Or with Visual binning to generate this:

 

* Visual Binning.

*t_of_day.

RECODE  t_of_day (MISSING=COPY) (LO THRU 1=1) (LO THRU 25200=2) (LO THRU 27000=3) (LO THRU 28800=4)

    (LO THRU HI=5) (ELSE=SYSMIS) INTO t_of_day_binned.

VARIABLE LABELS  t_of_day_binned 't_of_day (Binned)'.

FORMATS  t_of_day_binned (F5.0).

VALUE LABELS  t_of_day_binned 1 '<= 0:00:01' 2 '0:00:02 - 7:00:00' 3 '7:00:01 - 7:30:00' 4

    '7:30:01 - 8:00:00' 5 '8:00:01+'.

VARIABLE LEVEL  t_of_day_binned (ORDINAL).

EXECUTE.

 

The advantage of visual binning is that you can put in times like 07:30:00 and it converts it to the required number of seconds !!
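
If the bins are meant to be a regular run of 30-minute intervals, the interval number can also be computed instead of listing one range at a time. A minimal sketch, assuming t_of_day holds whole seconds since midnight (as described above), that the intervals start just after 7:00, and that t_interval is a new, hypothetical variable name:

```spss
* Seconds elapsed since 7:00:00 (TIME.HMS returns seconds since midnight).
COMPUTE #offset = t_of_day - TIME.HMS(7,0,0).
* Interval 1 is 7:00:01 to 7:30:00, interval 2 is 7:30:01 to 8:00:00, and so on.
IF (#offset > 0) t_interval = TRUNC((#offset - 1) / 1800) + 1.
FORMATS t_interval (F3.0).
EXECUTE.
```

Under these assumptions, times at or before 7:00:00 are left system-missing; 7:12 and 7:29 fall in interval 1, and 8:30 falls in interval 3.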

 

 

Best Wishes

 

John S. Lemon

Student Liaison Officer

Directorate of Information Technology (DIT) - University of Aberdeen

Edward Wright Building: Room G51

 

Tel:  +44 1224 273350

Fax: +44 1224 273372

 




The University of Aberdeen is a charity registered in Scotland, No SC013683.

percent increase vs regression line as a predictor

mpirritano
In reply to this post by abdelrhman elmubarak

Pawsers,

 

Stats question. I’ve not done any forecasting before other than with multiple regression analyses. What is the difference between multiple regression and using average past percent change to make predictions? Based on one scenario I’m dealing with it looks like percent change results in a positive curvilinear (possibly logistic?) relationship.

 

Thanks,

matt

 

 

 

Matthew Pirritano, Ph.D.

Research Analyst IV

Medical Services Initiative (MSI)

Orange County Health Care Agency

(714) 568-5648


Re: percent increase vs regression line as a predictor

Hector Maletta

For a pair of dichotomous variables, the percent difference in Y between the two values of X is mathematically equivalent to a regression coefficient.
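
One way to check this equivalence in SPSS, sketched with two hypothetical 0/1 variables x and y: the difference between the two column percentages for y = 1 in the crosstab, divided by 100, should equal the unstandardized coefficient B for x in the regression.

```spss
* Assumption: x and y are hypothetical variables, both coded 0/1.
CROSSTABS
  /TABLES=y BY x
  /CELLS=COUNT COLUMN.

* The coefficient B for x equals P(y=1 | x=1) - P(y=1 | x=0).
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x.
```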

Average crude percentage change in the past (independently of predictors) may be a very poor forecasting tool. To mention just a famous example, remember the (in)famous Fisher blunder in 1929, predicting continuous growth in the stock exchange by simply projecting average past increases, even after the initial crash. The future does not always repeat the past.

 

Hector



Re: percent increase vs regression line as a predictor

mpirritano

The variables are not dichotomous in this case. I’m looking at number of prescriptions per month. Number of prescriptions is the DV and time is the IV.

 



Re: percent increase vs regression line as a predictor

Hector Maletta

Then my second paragraph applies: the crude average past percentage rate of growth is usually a poor predictor, unless you have some objective grounds to expect a constant rate of growth.

Why should the number of prescriptions per month be an increasing (or decreasing) function of time? Time starting when? Are you talking about time counted since some condition was diagnosed, or something similar, or the mere passing of time? If time is the only predictor of the NUMBER of prescriptions, the PERCENTAGE GROWTH of prescriptions has little to do with it. If the number of prescriptions is a function of time, then the proportional rate of increase is (by definition) the rate of change of the logarithm of that number over time, so a constant percentage growth rate means that the logarithm of the number of prescriptions, not the number itself, is linear in time.

Hector
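To make the contrast in this exchange concrete, here is a rough sketch that projects a short series two ways: along a fitted straight line, and by compounding the average past percentage change. The monthly counts, the 24-month horizon, and all function names are hypothetical, chosen only for illustration. Compounding a constant rate produces an exponential path, which would account for the upward curve mentioned in the original question, and the logarithm of that path rises by the same amount each month.

```python
import math


def linear_fit(t, y):
    """Return (intercept, slope) of the least squares line of y on t."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


def mean_growth_rate(y):
    """Average month-to-month proportional change (0.04 means 4%)."""
    changes = [y[i] / y[i - 1] - 1 for i in range(1, len(y))]
    return sum(changes) / len(changes)


# Hypothetical monthly prescription counts for months 0..5.
counts = [100, 104, 109, 113, 118, 122]
months = list(range(len(counts)))
horizon = 24  # months to project beyond the last observation (assumption)

intercept, slope = linear_fit(months, counts)
rate = mean_growth_rate(counts)

# Straight-line projection versus compounded average percentage change.
linear_path = [intercept + slope * (months[-1] + k) for k in range(1, horizon + 1)]
growth_path = [counts[-1] * (1 + rate) ** k for k in range(1, horizon + 1)]

# Under a constant growth rate, log(count) increases by log(1 + rate) each month.
log_steps = [math.log(growth_path[k + 1]) - math.log(growth_path[k])
             for k in range(horizon - 1)]

print("Linear projection at the horizon:", round(linear_path[-1], 1))
print("Compounded projection at the horizon:", round(growth_path[-1], 1))
```

Neither projection is a recommendation; the sketch only shows that the two methods encode different assumptions (constant absolute change versus constant proportional change) and so diverge as the horizon grows.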

 




Logistic Regression

E. Bernardo
I am building a binary logistic regression model with some continuous and categorical predictors.  I will enter the predictors into the model manually, one at a time, rather than using stepwise regression.  Any suggestions on the criteria for deciding which predictor to enter at each subsequent step?
 
Eins 

