SPSS - best practices?

Linda George
Hi,

I'm pretty sure I'm losing saved data at random times (i.e., the
"disappearing data" problem), even though I'm on 16.0.2.  I just spent
a few hours re-creating some variables from 3 days ago.  Not pretty -
I'm almost to the point of learning R and doing something destructive
with my SPSS CD (not that it would help, but it might feel good for a
short while :)

But - in the interest of making sure that my problems are not my own
doing...

I work with some fairly large datasets that evolve over time through
ongoing data collection.  Periodically I download new files, recalculate
variables, and save successive versions (in addition to
interim saves due to the "disappearing data" issue).  I'm trying to
think of practices I can use to make sure I'm working with the most
recent version of everything. I do keep variable definitions and
analysis syntax in syntax files (including relevant filters, selection
commands, etc.), so I can reproduce analyses when needed.

If you have any SPSS "best practices" you'd be willing to share I'd
love to hear them.  I'm concerned that it's all too easy to make all
sorts of silly mistakes in SPSS.

Thanks, --Linda

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: SPSS - best practices?

Maguin, Eugene
Linda,

Are you losing cases, variables, or values? By the latter I mean: are values
changing when they haven't otherwise been operated on? As you have probably
seen, there have been reported (and, I believe, acknowledged) problems with
the display of records, but I think the records themselves have not been
deleted. Perhaps others can comment on this point.

It sounds like your operations are quite complex. My suggestion, and you may
already be doing this, is to institute an audit trail and ruthlessly stick
to it. What I mean is that every operation or set of operations that adds or
deletes records, or modifies values, is saved to a new dataset, which is
named in a structured way so that the sequence is apparent without looking at
creation/modification dates. The new dataset is then used for the next
set of operations. As a second step you could do the same with your syntax
files, either through separate files or through a single, increasingly large
file with internal documentation.

Gene Maguin
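
One way to picture the structured naming Gene describes is a small helper
that works out the next file name in a numbered sequence. This is only a
sketch in Python, not SPSS syntax, and the convention it assumes
(<base>_s<NN>_<label>.sav, e.g. survey_s03_recode.sav) as well as the
function names are hypothetical; nothing in the thread prescribes them.

```python
import re
from pathlib import Path

# Assumed (hypothetical) convention: <base>_s<NN>_<label>.sav
# e.g. survey_s01_raw.sav, survey_s02_merge.sav, survey_s03_recode.sav
STEP_PATTERN = re.compile(
    r"^(?P<base>.+)_s(?P<step>\d{2,})_(?P<label>[A-Za-z0-9-]+)\.sav$"
)


def next_step_name(existing_names, base, label):
    """Return the file name for the next step in the sequence for `base`."""
    steps = []
    for name in existing_names:
        match = STEP_PATTERN.match(name)
        # Only count files that belong to the same dataset family.
        if match and match.group("base") == base:
            steps.append(int(match.group("step")))
    next_step = max(steps, default=0) + 1
    return f"{base}_s{next_step:02d}_{label}.sav"


def next_step_path(folder, base, label):
    """Same as next_step_name, but scans a folder for existing .sav files."""
    folder = Path(folder)
    names = [p.name for p in folder.glob("*.sav")]
    return folder / next_step_name(names, base, label)
```

Because the step number is zero-padded, an ordinary alphabetical directory
listing shows the files in processing order, which is the point of not
having to rely on creation/modification dates.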

Re: SPSS - best practices?

Linda George
In reply to this post by Linda George
Hi Gene,

I'm losing new variables that I created, in a file that I thought I
saved. The file is there but it doesn't have my recent additions.

Thanks for the audit trail suggestion. I've been thinking about how to
keep some sort of log noting changes to each file; a strict naming
convention for data and syntax files would help.

Thanks, -Linda

-----------------------------

Date:    Fri, 25 Jul 2008 10:13:19 -0400
From:    Gene Maguin <[hidden email]>
Subject: Re: SPSS - best practices?

Re: SPSS - best practices?

Hal 9000
If you really wanted to get hardcore about it, you could add a
timestamp to the filename (date + time). That nails things down to the
second.
-Gary
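
A minimal sketch of the timestamp idea, again in Python rather than SPSS
syntax. The function name, the .sav default, and the YYYYMMDD_HHMMSS
layout are assumptions made for illustration; that layout is chosen only
because it sorts chronologically as plain text.

```python
from datetime import datetime


def timestamped_name(stem, when=None, ext=".sav"):
    """Build a file name such as survey_20080725_214705.sav.

    `when` defaults to the current local time; pass a datetime to get a
    reproducible name.
    """
    when = when or datetime.now()
    # Year-month-day first, so alphabetical order equals time order.
    return f"{stem}_{when:%Y%m%d_%H%M%S}{ext}"


# Example with a fixed time:
# timestamped_name("survey", datetime(2008, 7, 25, 21, 47, 5))
# -> "survey_20080725_214705.sav"
```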

On Fri, Jul 25, 2008 at 9:47 PM, Linda George
<[hidden email]> wrote:

> Hi Gene,
>
> I'm losing new variables that I created, in a file that I thought I
> saved. The file is there but it doesn't have my recent additions.
>
> Thanks for the audit trail suggestion. I've been thinking about how to
> keep some sort of log noting changes to each file; a strict naming
> convention for data and syntax files would help.
>
> Thanks, -Linda

Re: SPSS - best practices?

Dennis Deck
In reply to this post by Linda George
First, be sure your new variables were actually created.

Second, you did not mention multiple files, but they are a likely culprit.
Note that v15 and v16 of SPSS now support multiple data files active in
a job. If more than one file is open, v15 will drop files that were not
explicitly named. To keep them available during the job, follow each GET
(or similar command) with a DATASET NAME command.

Dennis Deck, PhD
RMC Research Corporation
111 SW Columbia Street, Suite 1200
Portland, Oregon 97201-5843
voice: 503-223-8248 x715
voice: 800-788-1887 x715
fax:  503-223-8248
[hidden email]


-----Original Message-----
From: Linda George [mailto:[hidden email]]
Sent: Friday, July 25, 2008 9:48 PM
Subject: Re: SPSS - best practices?

Hi Gene,

I'm losing new variables that I created, in a file that I thought I
saved. The file is there but it doesn't have my recent additions.

Thanks for the audit trail suggestion. I've been thinking about how to
keep some sort of log noting changes to each file; a strict naming
convention for data and syntax files would help.

Thanks, -Linda

Re: SPSS - best practices?

Dennis Deck
In reply to this post by Linda George
Can you still run v15?  I have had problems with v16 using code that had
long been running fine with earlier versions.  After multiple attempts I
finally just ran the job with v15 and that worked fine.
Personally I have had more problems with v16 than any prior upgrade.

It is possible that your problem is memory related. I saw a message
today suggesting that v16 is more of a memory hog, and that kind of
problem would be hard to diagnose from the error messages. So another
option is to find the command that dedicates more memory to the job, or
to reduce the demands of the task that is failing.

Dennis Deck, PhD
RMC Research Corporation
111 SW Columbia Street, Suite 1200
Portland, Oregon 97201-5843
voice: 503-223-8248 x715
voice: 800-788-1887 x715
fax:  503-223-8248
[hidden email]


-----Original Message-----
From: Linda George [mailto:[hidden email]]
Sent: Thursday, July 24, 2008 5:14 PM
Subject: SPSS - best practices?
