Insufficient memory for time-dependent Cox regression

Dr Chris Poole
Dear Listers,

Does anyone have any advice about memory requirements for running time-dependent Cox regression analyses with large datasets?

We have a dataset of 184,000 cases with a quarterly exposure variable observed for up to twenty years (hence up to 80 time segments).

In SPSS v.18 on recently purchased desktop PCs we keep getting 'Insufficient memory to process the command'. This is annoying, as it only seems to happen after the TIME PROGRAM has run for many minutes.

I would appreciate hearing about others' experiences of time-dependent Cox regression and any solutions they found to overcome memory problems.

Kind regards,

Chris.

--
Dr Chris Poole
Senior Lecturer in the Evaluation of Medicines
Department of Primary Care & Public Health
School of Medicine
Cardiff University
Cardiff MediCentre, Heath Park, Cardiff, CF14 4UJ
+44 (0)29 2068 2102


Re: Insufficient memory for time-dependent Cox regression

David Marso
Administrator
See if
SET WORKSPACE=some large number
helps. The default is 6144 (KB).

I thought most procedures just grabbed whatever memory they required, but that may be a false assumption. It is hard to keep track of what does what.
I ran some *HUGE* MATRIX problems recently and my 4 GB machine happily
allowed SET WORKSPACE=200000 (i.e. about 200 MB); my 5000x5000 matrix inversion test ran in about 30 minutes.
HTH, David
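
As a rough check on those numbers, the sketch below estimates how much room one dense matrix needs and compares it with a workspace setting. It assumes values are held as 8-byte doubles and that the WORKSPACE figure counts 1024-byte kilobytes; the helper names `matrix_kb` and `fits` are made up for illustration, and an inversion will need more than a single copy of the matrix.

```python
# Rough sizing sketch: does one n x n matrix of doubles fit in a workspace?
BYTES_PER_DOUBLE = 8   # assumption: values are held as 8-byte doubles
BYTES_PER_KB = 1024    # assumption: the workspace setting counts 1024-byte units


def matrix_kb(n_rows, n_cols):
    """Kilobytes needed to hold one dense n_rows x n_cols matrix."""
    return n_rows * n_cols * BYTES_PER_DOUBLE / BYTES_PER_KB


def fits(n_rows, n_cols, workspace_kb):
    """True if a single copy of the matrix fits in workspace_kb kilobytes."""
    return matrix_kb(n_rows, n_cols) <= workspace_kb


needed = matrix_kb(5000, 5000)
print(f"5000 x 5000 matrix: {needed:,.1f} KB")
print("fits in 6144 KB:  ", fits(5000, 5000, 6144))
print("fits in 200000 KB:", fits(5000, 5000, 200000))
```

Under these assumptions a single 5000x5000 matrix just fits in a 200000 KB workspace and is far beyond the default, which is consistent with the experience described above.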


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Insufficient memory for time-dependent Cox regression

Chris Poole
Thanks for the prompt reply David,

I initially thought the same as you but didn't try such a big value for SET WORKSPACE.

There is a thread on the IBM SPSS forum explaining how the number of computed values increases nearly exponentially when the TIME PROGRAM is used. In our case, almost 20 billion values are produced before the regression is computed.
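
One way to see where a count of that size could come from is sketched below. It assumes the time-dependent covariate has to be re-evaluated for every case still at risk at every distinct event time, as the Cox partial likelihood requires; the thread does not say how the 20 billion figure was actually obtained. The function names and toy data are hypothetical.

```python
def count_evaluations(event_times, exit_times):
    """Count (event time, at-risk case) pairs.

    event_times: distinct times at which at least one event occurs
    exit_times:  for each case, the time it leaves the risk set
                 (through an event or censoring)
    A case is treated as at risk at time t if its exit time is >= t.
    """
    return sum(
        sum(1 for exit_t in exit_times if exit_t >= t)
        for t in event_times
    )


def worst_case(n_cases):
    """Every case has its own event time and nobody is censored:
    risk sets of size n, n-1, ..., 1."""
    return n_cases * (n_cases + 1) // 2


# Toy data: 5 cases, events at quarters 2, 3 and 5.
exit_times = [2, 3, 3, 5, 8]
event_times = [2, 3, 5]
print(count_evaluations(event_times, exit_times))   # 5 + 4 + 2 = 11

print(f"{worst_case(184_000):,}")                   # 16,928,092,000
```

Under this assumption the worst case for 184,000 cases is about 17 billion evaluations, the same order of magnitude as the figure quoted. The growth in that worst case is quadratic in the number of cases rather than truly exponential.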

We're also going to try running SPSS on a very large Amazon EC2 instance to improve performance.

Will let you know how we get on.

KR,

Chris.

--
Chris Poole
[hidden email] | P. +44(0)7733004258

Re: Insufficient memory for time-dependent Cox regression

David Marso
Administrator
You can actually set it much larger.
I'm running SPSS 11.5 on a 4 GB Win32 Vista machine, where
set workspace 8300000.
is accepted, but
set workspace 8400000.
fails with:
>Warning # 882 in column 15.  Text: 8400000
>The parameter of the WORKSPACE subcommand of the SET command must be a
>positive integer not larger than the computer's process memory limit.
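
To put the two values in more familiar units, the sketch below converts them to GiB, assuming the WORKSPACE figure counts 1024-byte kilobytes. The idea that the cut-off is exactly 8 GiB is only a guess that happens to fall between the accepted and rejected values; the warning text does not state the limit.

```python
KB_PER_GIB = 1024 * 1024   # assumption: WORKSPACE counts 1024-byte kilobytes


def kb_to_gib(kb):
    """Convert a workspace setting in KB to GiB."""
    return kb / KB_PER_GIB


accepted_kb = 8_300_000               # accepted in the post above
rejected_kb = 8_400_000               # rejected with warning 882
candidate_limit_kb = 8 * KB_PER_GIB   # hypothetical limit of exactly 8 GiB

print(f"accepted: {kb_to_gib(accepted_kb):.2f} GiB")
print(f"rejected: {kb_to_gib(rejected_kb):.2f} GiB")
print("8 GiB lies between them:",
      accepted_kb < candidate_limit_kb < rejected_kb)
```

Either way, both settings are well above the 4 GB of RAM in the machine, so an accepted value is not evidence that this much memory can actually be used.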

Now I'd better set it back to something reasonable before I forget ;-)


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Insufficient memory for time-dependent Cox regression

Jon K Peck
In reply to this post by Chris Poole
Two points:
First, the 64-bit version can generally handle a lot more memory (and it is covered by the same license).  Second, do not leave the workspace setting at a big number, because that will starve procedures that do not use the workspace, which is most of them.  (I don't know for sure that this procedure actually uses the workspace, but the points above still apply.)

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621





Re: Insufficient memory for time-dependent Cox regression

张伦
In reply to this post by Dr Chris Poole
Have you tried Stata?
I ran a Cox regression on a dataset of 600,000 cases with about 10 variables, and Stata handled it well.
ZHANG Lun 


Re: Insufficient memory for time-dependent Cox regression

MaxJasper

I recommend using Complex Samples to extract appropriate samples for analysis from very large datasets. It is a pleasant surprise how nicely this relieves you of analyzing the whole dataset.
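
The general idea of analysing a sample of cases can be sketched as follows. This is plain simple random sampling of whole cases, not the Complex Samples module itself; the function name, the toy records and the 30% fraction are all made up for illustration. Sampling is done per case so that every quarterly record of a selected case is kept.

```python
import random


def sample_cases(case_ids, fraction, seed=0):
    """Return a simple random sample of whole cases (without replacement)."""
    k = round(len(case_ids) * fraction)
    rng = random.Random(seed)   # fixed seed so the draw is reproducible
    return set(rng.sample(case_ids, k))


# Toy long-format data: (case_id, quarter, exposure), 10 cases x 4 quarters
records = [(cid, q, 0.0) for cid in range(1, 11) for q in range(1, 5)]
case_ids = sorted({cid for cid, _, _ in records})

keep = sample_cases(case_ids, fraction=0.3)
subset = [r for r in records if r[0] in keep]
print(len(keep), "cases,", len(subset), "records")
```

Whether a sample is acceptable depends on the study; with rare events, a random subsample may discard much of the information in the data.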

 

Max.

 


Re: Insufficient memory for time-dependent Cox regression

Hector Maletta
In reply to this post by Dr Chris Poole

I have run Cox regressions with several million cases, about 20 predictor variables and several observations per case, which is more than your study of 184,000 x (80 observations + some predictors). I had no problem processing those analyses, apart from the (sometimes annoyingly long) time SPSS needed to perform the iterations, and I never got any message about insufficient memory. I did sometimes get that message with categorical principal components (CATPCA) and other procedures that require the whole dataset to be held in RAM, but not with Cox.

Thus, it is possible that something other than the size of the dataset (cases x (measures + predictors)) is causing the message. Perhaps, for instance, your machine is short of memory in general.
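
For a sense of scale, the raw size of such a dataset can be estimated as below. The sketch assumes every value is stored as an 8-byte double, and the 20 predictors are a placeholder, since the thread only says "some predictors".

```python
BYTES_PER_VALUE = 8   # assumption: each numeric value is an 8-byte double


def dataset_mb(n_cases, n_measures, n_predictors):
    """Approximate size in MB of cases x (measures + predictors) values."""
    values = n_cases * (n_measures + n_predictors)
    return values * BYTES_PER_VALUE / 1_000_000


# 184,000 cases x 80 quarterly measures, plus a hypothetical 20 predictors
print(f"{dataset_mb(184_000, 80, 20):.1f} MB")   # 147.2 MB
```

On these assumptions the data alone come to roughly 150 MB, which a recent desktop PC should hold comfortably, so the raw dataset size by itself does not obviously explain the error.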

Hector

 
