Massive reduction in file size from SAVE out =

John F Hall

I recently created a cumulative SPSS *.sav mother file for all years of the British Social Attitudes Survey (BSAS).  Because the metadata from later waves were better defined I did this in reverse year order.  This means that for any one year the variables are not in questionnaire order.  The mother file has all inconsistencies and incompatibilities removed.  From this mother file, I’m now recreating files for each year with variables in questionnaire order starting with 1983. 

 

USE all.

SELECT IF year = < year>.

SAVE out 'C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother<year>.sav'

/keep <varlist> .

 

Using the original files as deposited, I’m opening the file for each year, copying the full set of variable names from the Names column in the Data Editor (anything from 650 to 1500 names), then pasting them into the syntax file.  This worked fine for 1983 to 1989.  The resultant *.sav files are between 1,500 and 3,500 kb, but for 1990 and 1991 they come out at 220 kb and 330 kb instead of 3,250 kb and 3,520 kb.  I’ve repeated the exercise three times now and get the same result.  Any ideas anyone?  Thanks in advance.

 

John F Hall

[Retired academic survey researcher]

IBM-SPSS Academic Author 9900074

 

Website:          http://surveyresearch.weebly.com/

SPSS course:   http://surveyresearch.weebly.com/1-survey-analysis-workshop-spss.html

Research:        http://surveyresearch.weebly.com/3-subjective-social-indicators-quality-of-life.html

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Massive reduction in file size from SAVE out =

John F Hall

The problem occurs with other years as well.  In all cases there are no data in Data View, just blank cells.

 

From: John F Hall [mailto:[hidden email]]
Sent: 22 September 2017 18:37
To: '[hidden email]' <[hidden email]>
Subject: Massive reduction in file size from SAVE out =

 


Re: Massive reduction in file size from SAVE out =

Jon Peck
Many things could be going on here.  Some suggestions:
Run DROP DOCUMENTS on the master file.  You may have accumulated a lot of sludge.
Use TO rather than copying long lists of names so you can see what is being specified better.
Run DESCRIPTIVES on the problem years to verify that the master file actually contains the expected data.

Since SELECT IF results in unselected cases being deleted, make sure that you really reopened the master file before the next selection.  USE ALL is irrelevant here.
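
A minimal sketch of these checks, under stated assumptions: the file name mother.sav is hypothetical (the thread does not name the mother file), the three variables are taken from the syntax posted later in the thread and are assumed to be numeric, and TEMPORARY is added here only so that the check itself does not delete cases.

```spss
* Reopen the mother file so that no earlier SELECT IF is still in effect.
* The file name mother.sav is a placeholder, not the real name.
GET FILE='C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother.sav'.

* Remove any accumulated document text from the file.
DROP DOCUMENTS.

* Check that the mother file really holds data for a problem year.
* TEMPORARY restricts the selection to the next procedure only.
TEMPORARY.
SELECT IF (year = 1990).
DESCRIPTIVES VARIABLES=serial married religsum.
```

If the valid N reported by DESCRIPTIVES is zero for these variables, the blank cells are already present in the mother file and are not produced by SAVE.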

On Fri, Sep 22, 2017 at 11:04 AM John F Hall <[hidden email]> wrote:

The problem occurs with other years as well.  In all cases there are no data in Data View, just blank cells.

 


Re: Massive reduction in file size from SAVE out =

Rich Ulrich
In reply to this post by John F Hall

That sounds like you are saying: The file has the right number of cases, but data are blank.

In that case, the small size comes from compression.

You would have blank data if you gave the variable list for the wrong year.


--

Rich Ulrich
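
One way to test this explanation, sketched under assumptions: the output path is the one used in the syntax posted later in the thread, the three variables come from that same syntax and are assumed to be numeric, and n_sysmis is a hypothetical name for a new helper variable.

```spss
* Open one of the suspiciously small output files.
GET FILE='C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother90.sav'.

* For each case, count how many of the listed variables are system-missing.
* n_sysmis is a hypothetical helper variable, not part of the BSAS data.
COUNT n_sysmis = serial married religsum (SYSMIS).

* The N in this table is the number of cases in the file.
FREQUENCIES VARIABLES=n_sysmis.
```

If the table shows the expected number of cases but every case has the maximum count, the file has the right cases with blank data, and the small size would be consistent with compression of the empty cells.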




Re: Massive reduction in file size from SAVE out =

John F Hall

Jon, Rich

 

Thanks both.

 

After each attempt I return to the mother file, so I’m in the correct file.  I tried var TO var as well but got the same result for 1990, 1991 and 1995.  I’ll try DROP DOCUMENTS and also some later waves, 2001 to 2015, then report back.  I can’t rely on var TO var because the variables aren’t in the same order in the source and mother files.

 

Latest syntax:

 

drop documents.

 

select if year = 1990.

freq year.

 

save out 'C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother90.sav'

/keep

year

serial

~ ~ ~ ~

married

religsum.

 

This works! 

 

3250 kb in: 3250 kb out.

 

Looks like Jon was right, as usual.  Only another 27 files to process.

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Rich Ulrich
Sent: 22 September 2017 20:35
To: [hidden email]
Subject: Re: Massive reduction in file size from SAVE out =

 


Re: Massive reduction in file size from SAVE out =

John F Hall

Have now managed to recreate all BSAS files from 1983 to 2014.  Inconsistencies and incompatibilities all solved, but 2015 still needs work.  Variables from all files can now be merged across years to create whatever is needed.  Occasionally got reports of no data, but resolved this by using TEMPORARY (abbreviated TEMP), which limits SELECT IF to the next procedure so that the other years are not deleted from the mother file:

 

temp.

select if year = <year>.

save out 'C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother<year>.sav'

/keep

year <varlist in original order from each year>.

 

I had to do it the tedious way, but it was quicker and it worked.

 

Next step is to rebuild the mother file, but that can wait.
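
A minimal sketch of why TEMPORARY matters when several years are saved in one session. The mother file name mother.sav and the output name mother91.sav are hypothetical, and the short KEEP lists stand in for the full per-year variable lists, which are not reproduced in this thread.

```spss
* Open the mother file once (mother.sav is a placeholder name).
GET FILE='C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother.sav'.

* The selection applies to the SAVE that follows and is then discarded.
TEMPORARY.
SELECT IF (year = 1990).
SAVE OUTFILE='C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother90.sav'
  /KEEP=year serial married religsum.

* The active dataset still holds every year, so a second selection finds cases.
* Without TEMPORARY, only the 1990 cases would remain and this file would be empty.
TEMPORARY.
SELECT IF (year = 1991).
SAVE OUTFILE='C:\Users\John Hall\Desktop\BSAS 1983-2015 mother files\mother91.sav'
  /KEEP=year serial married religsum.
```

The variables are written in the order given on KEEP, which is what allows each yearly file to follow questionnaire order even though the mother file does not.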

 

From: John F Hall [mailto:[hidden email]]
Sent: 23 September 2017 14:33
To: [hidden email]
Cc: 'Cadogan, Susan J' <[hidden email]>; [hidden email]
Subject: RE: Massive reduction in file size from SAVE out =

 
