Checking Data

Checking Data

DKUKEC
I have a dataset that resembles the following.  The dataset should follow a specific order/logic, where the variable RECORD follows dates in sequence.  You will notice that for ID 1 and date 01-MAY-2013 the RECORD value EQ 2, and for the same ID and date 03-MAY-2013 the RECORD value EQ 1.  I am trying to figure out how to identify where this has occurred in the dataset and reverse the values of the variable RECORD within each specific ID.  I have prepared the sample dataset below and illustrated the desired change.  Any suggestions would be greatly appreciated.

NEW FILE .
DATA LIST LIST / ID (F5)  DATE (DATE11) RECORD (F5) .
BEGIN DATA  
1 12-MAY-2014  1
1 13-MAY-2014  2
1 01-MAY-2013  2    
1 03-MAY-2013  1
1 04-MAY-2013  4
2 10-MAY-2014 2
2 09-MAY-2014 1
3 05-MAY-2014 1
3 15-MAY-2014 2
3 16-MAY-2014 3
END DATA .
DATASET NAME TEST.
SORT CASES BY ID DATE RECORD (A) .

LIST ID DATE RECORD.
EXECUTE .

ID        DATE RECORD
 
    1 01-MAY-2013      2
    1 03-MAY-2013      1
    1 04-MAY-2013      4
    1 12-MAY-2014      1
    1 13-MAY-2014      2
    2 09-MAY-2014      1
    2 10-MAY-2014      2
    3 05-MAY-2014      1
    3 15-MAY-2014      2
    3 16-MAY-2014      3
 
 
Number of cases read:  10    Number of cases listed:  10

DESIRED FIX .

ID        DATE RECORD
 
    1 01-MAY-2013      1
    1 03-MAY-2013      2
    1 04-MAY-2013      4
    1 12-MAY-2014      1
    1 13-MAY-2014      2
    2 09-MAY-2014      1
    2 10-MAY-2014      2
    3 05-MAY-2014      1
    3 15-MAY-2014      2
    3 16-MAY-2014      3
 
 
Number of cases read:  10    Number of cases listed:  10

Re: Checking Data

Andy W
I'm having trouble reconciling your own criteria with your desired fix. You list the end result for ID 1 as:

    1 01-MAY-2013      1
    1 03-MAY-2013      2
    1 04-MAY-2013      4
    1 12-MAY-2014      1
    1 13-MAY-2014      2

But from your description I would expect it to be:

    1 01-MAY-2013      1
    1 03-MAY-2013      2
    1 04-MAY-2013      3
    1 12-MAY-2014      4
    1 13-MAY-2014      5

So, given your description, what signifies that

1 01-MAY-2013  2
1 03-MAY-2013  1

should be "flipped", but

1 13-MAY-2014  2
1 01-MAY-2013  2

should not be flipped? Is there another identifier, in addition to ID, that is not in this example?
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Checking Data

DKUKEC
Sorry Andy,

I should have noted that RECORD EQ 1 signifies the start of a new case.  Let me know if this clarifies my problem.

Thank you

Damir

Re: Checking Data

Andy W
I'm still having trouble. So your original data looks like:

1 12-MAY-2014  1
1 13-MAY-2014  2
1 01-MAY-2013  2    
1 03-MAY-2013  1
1 04-MAY-2013  4

So how do you know that the two records below belong to different "groups"?

1 13-MAY-2014  2
1 01-MAY-2013  2  

If you had a second variable that identifies these groups the task is pretty easy. It isn't clear from your example though if those groups can be easily identified. (I could provide syntax to work for this particular example - but I wouldn't be confident it extends to your application that might have more complicated patterns.)

Re: Checking Data

DKUKEC
Thank you again Andy,

Each ID represents a person who should follow specific tasks that make up a case within a given period of time.  So, as time increases for each ID and case, so does the value of the variable RECORD.  For example,

If I were to start today, my first RECORD value should be 1.  The next task associated with that case should have a greater value (e.g., 3).  If I start a new case on the same date or a later date, the RECORD value will begin again at 1 and then increase accordingly.

Sorry for the confusion and your assistance is much appreciated.

Damir

Re: Checking Data

Andy W
Imagine we have the situation

1 11-MAY-2014  1
1 12-MAY-2014  2
1 13-MAY-2014  1

So, there are by necessity two "cases" in this set. Is this set in the correct order, with the first two records signifying the first case and the 3rd record signifying the second case - or is it out of order and the second record really goes to the second case?

I don't believe you can do what you ask without other external information to identify your separate cases.

RE: Checking Data

DKUKEC

The situation you present is in the correct order: the first two records signify the first case and the third record signifies the second case.  Regrettably, I do not have any other external identifiers.

 

In reply to Andy W [via SPSSX Discussion], message sent Tuesday, May 06, 2014 10:34 AM.

 


If you reply to this email, your message will be added to the discussion below:

http://spssx-discussion.1045642.n5.nabble.com/Checking-Data-tp5725811p5725820.html


Re: Checking Data

David Marso
Administrator
In reply to this post by DKUKEC
Why not simply SORT by ID and date and then build a counter within ID using any of the many techniques which have been posted on this list over the years?
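
One commonly posted way to build such a counter is sketched below, assuming the sample dataset TEST from the first post is active. SEQ is a hypothetical variable name, not something from the original data. Note that this simply numbers the rows 1, 2, 3, ... within each ID in date order; it does not restart at 1 for a new case or keep gaps such as the RECORD value 4.

```spss
* Sketch of a running counter within ID (SEQ is a hypothetical name).
SORT CASES BY ID DATE.
* Every row starts at 1, then rows with the same ID as the previous row add 1.
COMPUTE SEQ = 1.
IF (ID EQ LAG(ID)) SEQ = LAG(SEQ) + 1.
FORMATS SEQ (F5).
LIST ID DATE RECORD SEQ.
```

On the sample data this should give SEQ values 1 through 5 for ID 1, which matches the listing Andy W expected from the description rather than the DESIRED FIX listing.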

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Checking Data

Rich Ulrich
In reply to this post by DKUKEC
Without an external identifier, your option seems to be: you can tag or list
all the cases that fail to meet some criterion of "consistency" so that
you can go in and figure it out case-by-case.

From what I gather, using a file sorted by ID and Date:
For the 1st occurrence of an ID --
  Flag the line if the first RECORD for an ID is not 1.
For the later occurrences of an ID --
  RECORD EQ 1 is always okay.  Otherwise,
  Flag the line if the RECORD is LE the LAG(RECORD).
When a line is flagged, the error might be in the preceding lines.
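
A minimal sketch of these rules in SPSS syntax, applied to the RECORD variable of the sample dataset from the first post. PREV_ID, PREV_REC and FLAG are hypothetical helper names introduced only for illustration. The lagged values are computed unconditionally first, on the assumption that this is the safer way to use LAG alongside a DO IF block.

```spss
* Sketch of the consistency check (PREV_ID, PREV_REC, FLAG are hypothetical names).
SORT CASES BY ID DATE.
COMPUTE PREV_ID = LAG(ID).
COMPUTE PREV_REC = LAG(RECORD).
COMPUTE FLAG = 0.
* First branch handles the first row of an ID, where RECORD should be 1.
* Second branch handles later rows, where RECORD 1 is fine and anything else must exceed the previous RECORD.
DO IF ($CASENUM EQ 1 OR ID NE PREV_ID).
  IF (RECORD NE 1) FLAG = 1.
ELSE IF (RECORD NE 1 AND RECORD LE PREV_REC).
  COMPUTE FLAG = 1.
END IF.
FORMATS FLAG (F1).
LIST ID DATE RECORD FLAG.
```

On the sample data sorted this way, the only row these rules flag is ID 1 on 01-MAY-2013, whose RECORD is 2. A flag marks a row for manual review; it does not say which of the neighbouring rows holds the wrong value.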

--
Rich Ulrich

