File restucture

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

File restucture

Anton-24
Hello fellow Listers~

I have a multi-record data set that I'd like to restructure into a flat
file, one row per id.  Each record has an id, date, and several scores.
Ultimately, I need a data set with the first record followed by the date and
scores of the next record *that's at least 90 days* from the first.  This is
followed by the next record that's at least 90 days from this new "second"
record, and so on.

In other words, I need to go from this:

id date q1 q2 q3

1 08/08/2004 12 15 23
1 10/08/2004 10 14 24
1 12/22/2004 13 13 29
1 02/27/2004 20 16 20
1 05/08/2005 18 17 21
1 12/24/2006 20 11 19
1 01/08/2007 18 14 21

To this:

id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
q1.4 q2.4 q3.4

1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
11 19

Thanks in advance for any help,

Hunter

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

Bruce Weaver
Administrator
Anton-24 wrote
Hello fellow Listers~

I have a multi-record data set that I'd like to restructure into a flat
file, one row per id.  Each record has an id, date, and several scores.
Ultimately, I need a data set with the first record followed by the date and
scores of the next record *that's at least 90 days* from the first.  This is
followed by the next record that's at least 90 days from this new "second"
record, and so on.

In other words, I need to go from this:

id date q1 q2 q3

1 08/08/2004 12 15 23
1 10/08/2004 10 14 24
1 12/22/2004 13 13 29
1 02/27/2004 20 16 20
1 05/08/2005 18 17 21
1 12/24/2006 20 11 19
1 01/08/2007 18 14 21

To this:

id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
q1.4 q2.4 q3.4

1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
11 19

Thanks in advance for any help,

Hunter
Hunter, look up CASESTOVARS in the Help.  Pay attention to the GROUPBY sub-command, because you do not want the default setting for it.

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

Bruce Weaver
Administrator
Bruce Weaver wrote
Anton-24 wrote
Hello fellow Listers~

I have a multi-record data set that I'd like to restructure into a flat
file, one row per id.  Each record has an id, date, and several scores.
Ultimately, I need a data set with the first record followed by the date and
scores of the next record *that's at least 90 days* from the first.  This is
followed by the next record that's at least 90 days from this new "second"
record, and so on.

In other words, I need to go from this:

id date q1 q2 q3

1 08/08/2004 12 15 23
1 10/08/2004 10 14 24
1 12/22/2004 13 13 29
1 02/27/2004 20 16 20
1 05/08/2005 18 17 21
1 12/24/2006 20 11 19
1 01/08/2007 18 14 21

To this:

id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
q1.4 q2.4 q3.4

1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
11 19

Thanks in advance for any help,

Hunter
Hunter, look up CASESTOVARS in the Help.  Pay attention to the GROUPBY sub-command, because you do not want the default setting for it.

HTH.
Sorry...I was in a hurry (late on a Friday), and completely over-looked the bit about the 90 day gaps.  So my CASESTOVARS suggestion is incomplete.  

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

David Marso
Administrator
LAG and DATEDIFF (or CTIME)

Bruce Weaver wrote
Bruce Weaver wrote
Anton-24 wrote
Hello fellow Listers~

I have a multi-record data set that I'd like to restructure into a flat
file, one row per id.  Each record has an id, date, and several scores.
Ultimately, I need a data set with the first record followed by the date and
scores of the next record *that's at least 90 days* from the first.  This is
followed by the next record that's at least 90 days from this new "second"
record, and so on.

In other words, I need to go from this:

id date q1 q2 q3

1 08/08/2004 12 15 23
1 10/08/2004 10 14 24
1 12/22/2004 13 13 29
1 02/27/2004 20 16 20
1 05/08/2005 18 17 21
1 12/24/2006 20 11 19
1 01/08/2007 18 14 21

To this:

id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
q1.4 q2.4 q3.4

1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
11 19

Thanks in advance for any help,

Hunter
Hunter, look up CASESTOVARS in the Help.  Pay attention to the GROUPBY sub-command, because you do not want the default setting for it.

HTH.
Sorry...I was in a hurry (late on a Friday), and completely over-looked the bit about the 90 day gaps.  So my CASESTOVARS suggestion is incomplete.  
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

drfg2008
In reply to this post by Anton-24
This should work:

you need to have 7 lines for each case (fixed!)
the txt file is "cards.txt"
you can add more (7 are in your example, but here are 4).

GET DATA   /TYPE  = TXT
 /FILE = 'C:\<path>\cards.txt'
 /ARRANGEMENT  = FIXED
 /FIXCASE  = 7
 /FIRSTCASE  =1
 /VARIABLES  = /1 id 0-1 f8.0 date_1 2-11 a10 q1_1 13-14 f8.0 q2_1 16-17 f8.0 q3_1 19-20 f8.0
                         /2                date_2 2-11 a10 q1_2 13-14 f8.0 q2_2 16-17 f8.0 q3_2 19-20 f8.0
                         /3                date_3 2-11 a10 q1_3 13-14 f8.0 q2_3 16-17 f8.0 q3_3 19-20 f8.0
                         /4                date_4 2-11 a10 q1_4 13-14 f8.0 q2_4 16-17 f8.0 q3_4 19-20 8.0.          
EXECUTE .

Dr. Frank Gaeth

Reply | Threaded
Open this post in threaded view
|

Re: File restucture

Bruce Weaver
Administrator
In reply to this post by David Marso
If I understand Hunter's problem re the 90 days though, a solution involving LAG and DATEDIFF may have to be applied more than once.  E.g., for the sample data posted, here are the elapsed days from one row to the next (after sorting by date):


id       date q1 q2 q3 firstrec  days

 1 02/27/2004 20 16 20     1        .
 1 08/08/2004 12 15 23     0      163
 1 10/08/2004 10 14 24     0       61  <-- delete
 1 12/22/2004 13 13 29     0       75  <-- recompute elapsed days after deleting previous row
 1 05/08/2005 18 17 21     0      137
 1 12/24/2006 20 11 19     0      595
 1 01/08/2007 18 14 21     0       15  <-- delete

Number of cases read:  7    Number of cases listed:  7

I think Hunter wants to delete the two rows so marked above, but then to recompute the elapsed days where it is currently = 75.  After doing so, the elapsed days would be greater than 90, so the row would be good.  And I think THIS aspect of the problem is what he was asking about more than CASESTOVARS.



David Marso wrote
LAG and DATEDIFF (or CTIME)

Bruce Weaver wrote
Bruce Weaver wrote
Anton-24 wrote
Hello fellow Listers~

I have a multi-record data set that I'd like to restructure into a flat
file, one row per id.  Each record has an id, date, and several scores.
Ultimately, I need a data set with the first record followed by the date and
scores of the next record *that's at least 90 days* from the first.  This is
followed by the next record that's at least 90 days from this new "second"
record, and so on.

In other words, I need to go from this:

id date q1 q2 q3

1 08/08/2004 12 15 23
1 10/08/2004 10 14 24
1 12/22/2004 13 13 29
1 02/27/2004 20 16 20
1 05/08/2005 18 17 21
1 12/24/2006 20 11 19
1 01/08/2007 18 14 21

To this:

id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
q1.4 q2.4 q3.4

1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
11 19

Thanks in advance for any help,

Hunter
Hunter, look up CASESTOVARS in the Help.  Pay attention to the GROUPBY sub-command, because you do not want the default setting for it.

HTH.
Sorry...I was in a hurry (late on a Friday), and completely over-looked the bit about the 90 day gaps.  So my CASESTOVARS suggestion is incomplete.  
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

Anton-24
In reply to this post by Anton-24
Thanks to Bruce, for interpreting my original, less than clear, request.  I
do indeed need to identify (and ultimately restructure) only those records
that are at least 90 days apart from each *newly identified* subsequent record.

Hunter


On Mon, 14 Feb 2011 11:02:52 -0800, Bruce Weaver <[hidden email]>
wrote:

>If I understand Hunter's problem re the 90 days though, a solution involving
>LAG and DATEDIFF may have to be applied more than once.  E.g., for the
>sample data posted, here are the elapsed days from one row to the next
>(after sorting by date):
>
>
>id       date q1 q2 q3 firstrec  days
>
> 1 02/27/2004 20 16 20     1        .
> 1 08/08/2004 12 15 23     0      163
> 1 10/08/2004 10 14 24     0       61  <-- delete
> 1 12/22/2004 13 13 29     0       75  <-- recompute elapsed days after
>deleting previous row
> 1 05/08/2005 18 17 21     0      137
> 1 12/24/2006 20 11 19     0      595
> 1 01/08/2007 18 14 21     0       15  <-- delete
>
>Number of cases read:  7    Number of cases listed:  7
>
>I think Hunter wants to delete the two rows so marked above, but then to
>recompute the elapsed days where it is currently = 75.  After doing so, the
>elapsed days would be greater than 90, so the row would be good.  And I
>think THIS aspect of the problem is what he was asking about more than
>CASESTOVARS.
>
>
>
>
>David Marso wrote:
>>
>> LAG and DATEDIFF (or CTIME)
>>
>>
>> Bruce Weaver wrote:
>>>
>>>
>>> Bruce Weaver wrote:
>>>>
>>>>
>>>> Anton-24 wrote:
>>>>>
>>>>> Hello fellow Listers~
>>>>>
>>>>> I have a multi-record data set that I'd like to restructure into a flat
>>>>> file, one row per id.  Each record has an id, date, and several scores.
>>>>> Ultimately, I need a data set with the first record followed by the
>>>>> date and
>>>>> scores of the next record *that's at least 90 days* from the first.
>>>>> This is
>>>>> followed by the next record that's at least 90 days from this new
>>>>> "second"
>>>>> record, and so on.
>>>>>
>>>>> In other words, I need to go from this:
>>>>>
>>>>> id date q1 q2 q3
>>>>>
>>>>> 1 08/08/2004 12 15 23
>>>>> 1 10/08/2004 10 14 24
>>>>> 1 12/22/2004 13 13 29
>>>>> 1 02/27/2004 20 16 20
>>>>> 1 05/08/2005 18 17 21
>>>>> 1 12/24/2006 20 11 19
>>>>> 1 01/08/2007 18 14 21
>>>>>
>>>>> To this:
>>>>>
>>>>> id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3
>>>>> date.4
>>>>> q1.4 q2.4 q3.4
>>>>>
>>>>> 1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21
>>>>> 12/24/2006 20
>>>>> 11 19
>>>>>
>>>>> Thanks in advance for any help,
>>>>>
>>>>> Hunter
>>>>>
>>>>>
>>>>
>>>> Hunter, look up CASESTOVARS in the Help.  Pay attention to the GROUPBY
>>>> sub-command, because you do not want the default setting for it.
>>>>
>>>> HTH.
>>>>
>>>>
>>>
>>> Sorry...I was in a hurry (late on a Friday), and completely over-looked
>>> the bit about the 90 day gaps.  So my CASESTOVARS suggestion is
>>> incomplete.
>>>
>>>
>>>
>>
>>
>
>
>-----
>--
>Bruce Weaver
>[hidden email]
>http://sites.google.com/a/lakeheadu.ca/bweaver/
>
>"When all else fails, RTFM."
>
>NOTE: My Hotmail account is not monitored regularly.
>To send me an e-mail, please use the address shown above.
>
>--
>View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/File-restucture-tp3382125p3384918.html
>Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

hillel vardi
In reply to this post by Anton-24
  Shalom

Here is a syntax to do what you need .

DATA LIST fixed / id  date q1 q2 q3 (f2 1x adate10 3f3) .
BEGIN DATA
  1 08/08/2004 12 15 23
  1 10/08/2004 10 14 24
  1 12/22/2004 13 13 29
  1 02/27/2004 20 16 20
  1 05/08/2005 18 17 21
  1 12/24/2006 20 11 19
  1 01/08/2007 18 14 21
END DATA.
add files file=* / by id  / first=first.
compute days= datedeff(date,date.dmy(1,1,2000),'days') .
if first eq 1 testday=days.
if first eq 1 seq=1.
leave testday seq .
do if   days-testday lt 90 .
compute keep=0 .
else if days-testday ge 90 .
compute keep=1.
compute testday=days .
compute seq=sum(seq,1) .
end if.
if first eq 1 keep=1.
execute .
select if keep eq  1.
SORT CASES BY id seq.
CASESTOVARS
   /ID=id
   /INDEX=seq
   /drop=first keep testday days
   /GROUPBY = INDEX     .


Hillel Vard
BGU



On 11/02/2011 22:39, Anton wrote:

> Hello fellow Listers~
>
> I have a multi-record data set that I'd like to restructure into a flat
> file, one row per id.  Each record has an id, date, and several scores.
> Ultimately, I need a data set with the first record followed by the date and
> scores of the next record *that's at least 90 days* from the first.  This is
> followed by the next record that's at least 90 days from this new "second"
> record, and so on.
>
> In other words, I need to go from this:
>
> id date q1 q2 q3
>
> 1 08/08/2004 12 15 23
> 1 10/08/2004 10 14 24
> 1 12/22/2004 13 13 29
> 1 02/27/2004 20 16 20
> 1 05/08/2005 18 17 21
> 1 12/24/2006 20 11 19
> 1 01/08/2007 18 14 21
>
> To this:
>
> id date.1 q1.1 q2.1 q3.1 date.2 q1.2 q2.2 q3.2 date.3 q1.3 q2.3 q3.3 date.4
> q1.4 q2.4 q3.4
>
> 1 08/08/2004 12 15 23 12/22/2004 13 13 29 05/08/2005 18 17 21 12/24/2006 20
> 11 19
>
> Thanks in advance for any help,
>
> Hunter
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

David Marso
Administrator
Here is a similar approach to the setup code which is somewhat more direct.
Leaving out the CASESTOVARS.
DO IF ID<> LAG(ID) OR $CASENUM=1.
+  COMPUTE REFDATE=date.
+  COMPUTE RETAIN=1.
ELSE.
+  COMPUTE refdate=LAG(refdate).
+  COMPUTE DELTA=CTIME.DAYS(date-refdate).
+  DO IF (DELTA > 90).
+    COMPUTE RETAIN=1.
+    COMPUTE refdate=date.
+  END IF.
END IF.
EXE.
SELECT IF RETAIN.
MATCH FILES / FILE * / DROP refdate retain delta.
--
Add the CASESTOVARS here.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: File restucture

lucameyer
In reply to this post by hillel vardi
Just a couple of details on Hillem suggestions:

(1)
DATA LIST fixed / id  date q1 q2 q3 (f2 1x adate10 3f3) .
it doesn't work on my machine but this works fine:
DATA LIST fixed / id  date q1 q2 q3 (f3 1x adate10 3f3) .

(2)
In case your data are not sorted by id I think you would need to add
SORT CASES BY id.
before
add files file=* / by id  / first=first.

Cheers,
Luca
Mr. Luca Meyer
www.lucameyer.com