need help to optimize LOOP syntax

kosetsky
Dear Colleagues,


Suppose my data contain three variables: CustomerID, orderID and orderTime.
Each row represents one order by a customer, and orderID is unique.

I need to create a variable that holds the sequential number of each
customer's orders by the time the order was made: whether it is that
customer's first order, second, third, and so on.

It's easy to identify the first order using the AGGREGATE command. After that,
I'm using LOOP to assign the order numbers. However, my dataset is very big
and LOOP does this slowly, so I'm looking for a more efficient method.
Do you have any ideas how to do that?

Many thanks!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: need help to optimize LOOP syntax

Ruben Geert van den Berg
Dear Kosetsky,

Could you please provide us with the syntax you're currently using? It's not very obvious how you'd go about this with AGGREGATE and LOOP, given the "long" format of the data you describe.

I'd rather sort by CustomerID and orderTime and then use the LAG function for the order count (so keep the long format and just compute one new variable).

Best,

Ruben

Re: need help to optimize LOOP syntax

Maguin, Eugene
In reply to this post by kosetsky
I don't quite see how you can use AGGREGATE and LOOP to number a customer's orders. What is your syntax?
Isn't the order ID a sequentially assigned number? Is it assigned sequentially across all orders, or across orders within each customer?

Ignoring that, I think this will work:
Rank ordertime by customerid/rank into orderseq.
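
Spelled out a bit more fully, that RANK command might look like the sketch below. The variable names are those from the original post; the TIES and PRINT settings are assumptions added for illustration, not something anyone in the thread specified.

```spss
* Sketch only: number each customer's orders by time with RANK.
* Variable names are from the original post; TIES and PRINT choices are assumptions.
RANK VARIABLES=orderTime (A) BY CustomerID
  /RANK INTO orderSeq
  /TIES=LOW
  /PRINT=NO.
```

As far as I know, RANK does not need the file sorted beforehand. If two orders of one customer share the same orderTime, TIES=LOW gives both the same number, while the default (MEAN) would give them a fractional rank, so it is worth checking for duplicate times first.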

Gene Maguin






Re: need help to optimize LOOP syntax

David Marso
Administrator
In reply to this post by kosetsky
There does not appear to be any LOOP syntax 'to optimize' in your post.
Given the subject line, I would have assumed you would post your attempt(s).
LOOP is a within-row transformation device and will NOT work for building a counter across rows.
See DO IF and LAG...
or CREATE with CSUM?
or RANK?
But LOOP? Nah. Not in SPSS.
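
For the CREATE with CSUM idea, a rough sketch might be a running sum of a constant within each customer. The variable names are those from the original post; the helper variable `one` is hypothetical, and the sketch assumes that CREATE restarts the cumulative sum at each SPLIT FILE group, which is worth confirming on a small sample before trusting it on the full file.

```spss
* Sketch only: running count of orders within each customer via CREATE and CSUM.
* The helper variable one is hypothetical.
SORT CASES BY CustomerID (A) orderTime (A).
COMPUTE one = 1.
SPLIT FILE BY CustomerID.
CREATE orderSeq = CSUM(one).
SPLIT FILE OFF.
```

The sort is needed here because SPLIT FILE expects the cases grouped by CustomerID, and the running sum follows file order within each group.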
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: need help to optimize LOOP syntax

Melissa Ives
In reply to this post by kosetsky
Sort by customer and time (and orderID if needed). I'm going to assume that order time includes a date, so that the sort puts each customer's orders in chronological order.
Then use LAG to identify new customers.

if ($casenum=1 or customerID ne lag(customerID)) orderseq=1.
If (customerID eq lag(customerID)) orderseq=lag(orderseq)+1.
freq orderseq.
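
Put together with the sort step, the whole thing might look like the sketch below. It uses the variable names from the original post; adding orderID as a tie-breaker for identical order times is an assumption, and this form starts every case at 1 instead of testing $casenum.

```spss
* Sketch only: sort, then count orders within each customer using LAG.
* orderID is used as a tie-breaker for identical order times, which is an assumption.
SORT CASES BY CustomerID (A) orderTime (A) orderID (A).
COMPUTE orderSeq = 1.
IF (CustomerID EQ LAG(CustomerID)) orderSeq = LAG(orderSeq) + 1.
EXECUTE.
```

On the very first case LAG(CustomerID) is missing, so the IF is skipped and orderSeq stays at 1.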


PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.


Re: need help to optimize LOOP syntax

David Marso
Administrator
A variation on the theme:
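data list list / cust .
begin data
1
1
1
1
2
2
2
2
end data.
* X starts at 1 and adds 1 whenever the previous case has the same cust.
COMPUTE X=1.
IF cust EQ LAG(cust) X=LAG(X)+1.
*or.
* Same counter in one COMPUTE: at the start of a group the product is 0 or missing, so SUM returns 1.
COMPUTE X2=SUM(1,(LAG(cust) EQ cust)*LAG(X2)).
LIST.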

  cust      X     X2

1.0000 1.0000 1.0000
1.0000 2.0000 2.0000
1.0000 3.0000 3.0000
1.0000 4.0000 4.0000
2.0000 1.0000 1.0000
2.0000 2.0000 2.0000
2.0000 3.0000 3.0000
2.0000 4.0000 4.0000


Number of cases read:  8    Number of cases listed:  8



Re: need help to optimize LOOP syntax

kosetsky
In reply to this post by kosetsky
Thank you all for your help!

The RANK command does what I need.
I just didn't know about it.


(In reply to Andrey's original post of Mon, 6 May 2013 05:53:33 -0400.)