Dear Colleagues,
Let's suppose that my data contain three variables: CustomerID, orderID and orderTime. Every row in the data represents one order by a customer. OrderID is unique.

I need to create a variable that shows the sequential number of each customer's orders according to the time they were made: whether it is that customer's first order, second, third, etc.

It's easy to identify the first order using the AGGREGATE command. After that, I'm using LOOP to assign the order number. However, my dataset is very big and I'm looking for a more efficient method because LOOP does this task slowly. Do you have any ideas how to do that?

Many thanks!

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD
Dear Kosetsky,
Could you please provide us with the syntax you're currently using? It's not obvious how you'd go about this with AGGREGATE and LOOP, given the "long" format of the data you describe. I'd rather think towards sorting by CustomerID and OrderTime and then using the LAG function for the order count (so keep the long format and just compute one new variable).

Best,
Ruben
In reply to this post by kosetsky
I don't quite see how you can use aggregate and loop to number customers. What is your syntax?
Isn't the order id a sequentially assigned number? Is the sequential assignment across all orders or across orders within customers? Ignoring that, I think this will work:

    Rank ordertime by customerid /rank into orderseq.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andrey
Sent: Monday, May 06, 2013 5:54 AM
To: [hidden email]
Subject: need help to optimize LOOP syntax
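A slightly fuller version of the RANK approach might look like the sketch below. The variable names follow the original post; the TIES and PRINT settings are assumptions added for illustration and are not stated anywhere in the thread. By default RANK gives tied values their mean rank, so two orders by the same customer with an identical orderTime would both receive, for example, 1.5. TIES=LOW gives them the same whole number instead.

```spss
* Sketch: number each customer's orders by time with RANK.
* Assumes variables CustomerID and orderTime as described in the original post.
* TIES=LOW and PRINT=NO are illustrative choices, not taken from the thread.
RANK VARIABLES=orderTime (A) BY CustomerID
  /RANK INTO orderseq
  /TIES=LOW
  /PRINT=NO.
```

If every order must get a distinct number even when times coincide, sorting with orderID as a tiebreaker and counting with LAG is an alternative.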
In reply to this post by kosetsky
There does not appear to be any LOOP syntax 'to optimize' present in your post.
Given the subject line, I would have assumed you would post your attempt(s). LOOP is a within-row transformation device and will NOT work for building a counter across cases. See DO IF and LAG... or CREATE with CSUM? Or RANK? But LOOP? Nah. Not in SPSS.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
In reply to this post by kosetsky
Sort by customer and time (and orderID if needed). I'm going to assume that order time includes a date so
Use lag to identify new customers.

    if ($casenum=1 or customerID ne lag(customerID)) orderseq=1.
    if (customerID eq lag(customerID)) orderseq=lag(orderseq)+1.
    freq orderseq.
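For completeness, the sort step described above could be written out as in the sketch below, followed by the same kind of LAG counter. Variable names are taken from the original post; using orderID as a tiebreaker for identical order times is an assumption about what "if needed" means here.

```spss
* Sketch: sort first, then build a per-customer counter with LAG.
* Assumes CustomerID, orderTime and orderID as in the original post.
* orderID serves only as a tiebreaker for identical order times (assumption).
SORT CASES BY CustomerID (A) orderTime (A) orderID (A).
* Start every case at 1, then continue the count within the same customer.
COMPUTE orderseq = 1.
IF (CustomerID EQ LAG(CustomerID)) orderseq = LAG(orderseq) + 1.
EXECUTE.
```

Because the counter depends on the order of cases, the data must stay sorted this way until orderseq has been computed.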
A variation on the theme:
    data list list / cust .
    begin data
    1
    1
    1
    1
    2
    2
    2
    2
    end data.
    COMPUTE X=1.
    IF cust EQ LAG(cust) X=LAG(X)+1.
    *or.
    COMPUTE X2=SUM(1,(LAG(cust) EQ cust)*LAG(X2)).
    LIST.

      cust        X       X2
    1.0000   1.0000   1.0000
    1.0000   2.0000   2.0000
    1.0000   3.0000   3.0000
    1.0000   4.0000   4.0000
    2.0000   1.0000   1.0000
    2.0000   2.0000   2.0000
    2.0000   3.0000   3.0000
    2.0000   4.0000   4.0000

    Number of cases read: 8    Number of cases listed: 8
In reply to this post by kosetsky
Thank you all for your help!
The RANK command does what I need. I just didn't know about it.