Identifying same transaction

Mark Webb-5
I would appreciate some help with syntax that I thought would be simple, but I can't get it to work.
I have transactional data from a store, with each product in a transaction on a separate line.
I want to be able to identify the lines of data that belong to the same transaction.
I can work out a transaction by identifying duplicate cases based on 3 variables (store / date / time), which together are unique to a transaction.
My data looks like the table below; I want to compute the TransactionNumber column shown there.

Store  Date  Time  Product  PrimaryLast  TransactionNumber
1      x     y     a        0            1
1      x     y     b        0            1
1      x     y     c        1            1
1      p     q     b        0            2
1      p     q     c        1            2
2      r     s     a        1            3
3      f     g     a        1            4

This data represents 4 transactions: the first 3 lines are transaction 1, the next 2 lines are transaction 2, and so on.
Unique transactions are identified by the SPSS procedure 'Identify Duplicate Cases', using Store, Date & Time as the basis for matching.
This procedure produces the PrimaryLast column above (a 1 marks the last line of a transaction).

I need to number each line according to the transaction it belongs to, as shown above under TransactionNumber.

I have tried various things but am struggling to get it to work - I would appreciate a few hints.

Regards
--
Mark Webb

Line +27 (21) 786 4379
Cell +27 (72) 199 1000 [Poor reception]
Fax  +27 (86) 260 1946

Skype       tomarkwebb
Email       [hidden email] 

Re: Identifying same transaction

Andy W
One way to do this is to use MATCH FILES to flag the first record in each Store / Date / Time group and then take the cumulative sum of that flag. (Another way is to use LAG. Either approach requires that the file is sorted by the grouping variables.)

*************************************************************************************.
DATA LIST FREE /Store (F1.0) Date Time Product (3A1) PrimaryLast TransactionNumber (2F1.0).
BEGIN DATA
1        x       y      a           0                  1
1        x       y      b           0                  1
1        x       y      c           1                  1
1        p       q      b           0                  2
1        p       q      c           1                  2
2        r        s      a           1                  3
3        f        g      a           1                  4
END DATA.

*********************.
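*Use MATCH FILES to flag the first case of each Store/Date/Time group (MyCount = 1 for the first case, 0 otherwise).
*Note the sort reorders the cases, so transactions are numbered in sorted order (here 1/p/q comes before 1/x/y).
SORT CASES BY Store Date Time.
MATCH FILES FILE = *
  /FIRST = MyCount
  /BY Store Date Time.
*Then the cumulative sum of the flag increases by 1 at the start of each transaction.
CREATE Trans#2 = CSUM(MyCount).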
*********************.
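
*********************.
*Sketch of the LAG alternative mentioned above (an illustration under stated assumptions, not tested on real data).
*NewTrans and Trans#3 are hypothetical names, and the file must already be sorted by Store Date Time.
*A case starts a new transaction when it is the first case or when any key differs from the previous case.
COMPUTE NewTrans = ($CASENUM = 1 OR Store NE LAG(Store) OR Date NE LAG(Date) OR Time NE LAG(Time)).
*The running total of the start flags then numbers the transactions, in the same way as CSUM(MyCount).
CREATE Trans#3 = CSUM(NewTrans).
*********************.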

*************************************************************************************.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/