Recovery of data from a binary file

Recovery of data from a binary file

John F Hall

Please can anyone help with getting SPSS to read a binary file or alternatively to (re)create an ASCII file from one?

 

I am working with SPSS files distributed by the UK Data Service (University of Essex) for Garry Runciman’s 1962 survey, as reported in his book:

 

W G Runciman

Relative Deprivation and Social Justice

(RKP, 1966)

 

The original data were multipunched on 80-column cards: a single column can hold data for more than one variable.  The data were later spread out on a new set of single-punched cards and read into SPSS using INPUT FORMAT.  Later operations involved a new version of the setup file using DATA LIST instead.

 

During data checks I discovered that the counts for birthsno (number of live children ever born) and kidsdied (number of children who died) were impossible:

 

birthsno Q.2b: Total children including dead

                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1 One           341      24.1           100.0                100.0
Missing  System         1074      75.9
Total                   1415     100.0

 

 

 

kidsdied Q.2c: Number of deceased children

                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1 One           383      27.1            27.1                 27.1
         2 Two           522      36.9            36.9                 64.0
         3 Three         326      23.0            23.0                 87.0
         4 Four           99       7.0             7.0                 94.0
         5 Five           37       2.6             2.6                 96.6
         6                21       1.5             1.5                 98.1
         7                17       1.2             1.2                 99.3
         8                 7        .5              .5                 99.8
         9                 3        .2              .2                100.0
         Total          1415     100.0           100.0

 

 

These figures do not make sense.

 

Further checks revealed a discrepancy between the frequencies in the codebook and the frequencies obtained from the SPSS file for these and two other (adjacent) variables.  I immediately suspected a data format reading error and reported it back to UKDS.

 

UKDS have now privately supplied me with three files used to create the SPSS file:

d2028.bin    a binary file with multipunched data (my computer thinks it’s a movie)
r028.dat     some sort of conversion from multi-punch to ASCII (Fortran?)
d028.fnt     holecount of multipunched data?

 

In the package distributed by UKDS there are two *.sps files and one *.sav file:

d028.sps     original 1975 setup file using INPUT FORMAT
d028a.sps    modified setup file using DATA LIST
sn28.sav     SPSS saved file

 

d028.sps:

 

FILE NAME      RUNCIMAN,RELATIVE DEPRIVATION AND SOCIAL JUSTICE
VARIABLE LIST  CASENO,CARDNO,NEWHOME,OLDHOME,MARITAL,BIRTHSNO,KIDSDIED,KIDSLIVE,
               LEAVESCH,TEENLIVE,TEENSCH,FEESCHS,MOREEDUC,EDUCTYPE
~ ~ ~ ~
               LIFESTYL,ACCENT,AGE,SEX,OCCUP,EDUCFIN,INCOME,WIFECASH,SEENHOME,
               SEENWORK,SEENOTH
# OF CASES     1415
INPUT FORMAT   FIXED(F4.0,F2.0,3F1.0,F2.0,F1.0,F2.0,F1.0,F2.0,9F1.0,F2.0,52F1.0/
               6X,64F1.0,F2.0,8F1.0/6X,74F1.0/6X,53F1.0,2F2.0,3F1.0)
INPUT MEDIUM   SPL:D2028.DAT
~ ~ ~ ~

BIRTHSNO,TOTAL CHILDREN INCLUDING DEAD/
               KIDSDIED,NUMBER OF DECEASED CHILDREN/
               KIDSLIVE,NUMBER OF LIVE CHILDREN UNDER 15 YRS/
               LEAVESCH,AGE EXPECT KIDS TO LEAVE SCHOOL/
               TEENLIVE,NUMBER OF LIVE CHILDREN OVER 15 YRS/

 

d028a.sps

 

title     RUNCIMAN RELATIVE DEPRIVATION AND SOCIAL JUSTICE
file handle d028a/name='/ufs3/howas/028/d2028.dat'
data list fixed file=d028a records=4
         /1 caseno 1-4 cardno01 5-6 newhome 7 oldhome 8 marital 9 birthsno 10
            kidsdied 11 kidslive 12 leavesch 13 teenlive 16-17 teensch 18
            feeschs 19 moreeduc 20 eductype 21 madwell 22 head 23 adults 24

 

There is clearly an error in 4 variables which were read in from the wrong columns. 

 

The SPSS syntax supplied in d028a.sps above is wrong:

 

data list fixed file=d028a records=4
         /1 caseno 1-4 cardno01 5-6 newhome 7 oldhome 8 marital 9 birthsno 10
            kidsdied 11 kidslive 12 leavesch 13 teenlive 16-17 teensch 18

 

It needs changing to:

 

data list fixed file=d028a records=4
         /1 caseno 1-4 cardno01 5-6 newhome 7 oldhome 8 marital 9 birthsno 10-11
            kidsdied 12 kidslive 13-14 leavesch 15 teenlive 16-17 teensch 18
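
The corrected columns agree with the field widths in the original INPUT FORMAT. As a check outside SPSS, here is a minimal Python sketch that accumulates the first ten widths (F4.0,F2.0,3F1.0,F2.0,F1.0,F2.0,F1.0,F2.0) into start and end columns. Pairing the names with the widths assumes the variables are read in VARIABLE LIST order; the function name is only illustrative.

```python
from itertools import accumulate

# First ten variables of record 1, in VARIABLE LIST order, with widths taken
# from INPUT FORMAT: F4.0,F2.0,3F1.0,F2.0,F1.0,F2.0,F1.0,F2.0
names = ["caseno", "cardno", "newhome", "oldhome", "marital",
         "birthsno", "kidsdied", "kidslive", "leavesch", "teenlive"]
widths = [4, 2, 1, 1, 1, 2, 1, 2, 1, 2]


def column_ranges(names, widths):
    """Return {name: (first_col, last_col)} using 1-based card columns."""
    ends = list(accumulate(widths))
    return {n: (e - w + 1, e) for n, w, e in zip(names, widths, ends)}


ranges = column_ranges(names, widths)
for name, (first, last) in ranges.items():
    print(name, first if first == last else f"{first}-{last}")
```

Run on these widths, the sketch puts birthsno in 10-11, kidsdied in 12, kidslive in 13-14 and leavesch in 15, which is the layout proposed in the corrected DATA LIST.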

 

Unfortunately the original single-punched raw data ASCII file no longer exists. Unless SPSS can read the binary file, the correct data can only be recovered in SPSS syntax from a spread-out, single-punched ASCII file. However, I think there may be an error in lines 8-14 of the conversion file r028.dat:

 

COPY 10
CONVERT 11
GET 12 R 7 8 9 0 -
GET 13 R 1 2 3 4
GET 13 R 5 6 7 8
GET 14 R 1 2 3 - +
GET 15 R 1 2 3 4 - 0 +

 

I’ve managed to look inside the d028.fnt file: it’s some sort of holecount (see the extract for columns 10-19 below).

 

          : 10:                     341                                                  1074         : 10:
          : 11:                     383   522   326    99    37    21    17     7     3               : 11:
          : 12:              1282    96    24     8     2     3                                       : 12:
          : 13:                     325                                                  1090         : 13:
          : 14:               594   228   165   390    25     6     4     2     1                     : 14:
          : 15:                     128   263    31    74                                 919         : 15:
          : 16:                     330                                                  1085         : 16:
          : 17:               384   295   206   428    54    21    14     6     2     5               : 17:
          : 18:               709   350   256    95     3     2                                       : 18:
          : 19:               346   172   862    26     9
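
A quick arithmetic check on this extract, sketched in Python: add up the counts on each holecount line and compare with the 1415 cases declared in d028.sps. The numbers are copied from the extract; treating the far-right count on a line as blanks is my assumption, not something the file states.

```python
# Counts copied from the holecount extract, keyed by the line label (10-19).
holecount = {
    10: [341, 1074],
    11: [383, 522, 326, 99, 37, 21, 17, 7, 3],
    12: [1282, 96, 24, 8, 2, 3],
    13: [325, 1090],
    14: [594, 228, 165, 390, 25, 6, 4, 2, 1],
    15: [128, 263, 31, 74, 919],
    16: [330, 1085],
    17: [384, 295, 206, 428, 54, 21, 14, 6, 2, 5],
    18: [709, 350, 256, 95, 3, 2],
    19: [346, 172, 862, 26, 9],
}
N_CASES = 1415  # from "# OF CASES" in d028.sps


def line_totals(counts_by_line):
    """Sum the punch counts on each holecount line."""
    return {label: sum(counts) for label, counts in counts_by_line.items()}


totals = line_totals(holecount)
mismatched = [label for label, total in totals.items() if total != N_CASES]
print(totals)
print("lines not summing to", N_CASES, ":", mismatched)
```

Every line in the extract sums to 1415, which is consistent with one code (or a blank) per case in each of these positions. The counts on lines 10 and 11 are also the frequencies SPSS reported for birthsno and kidsdied.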

 

I’ve had a look at the FM but am none the wiser as to how to read the binary file into SPSS, or how to recreate the first data record.  Can anyone help? 

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

                         


Re: Recovery of data from a binary file

David Marso
Administrator
John,
I suspect you would need some sort of time machine.
That first code snippet takes me back some 30 years (I first used SPSS in 1983).
Wondering why this 50+ year old data is of any interest in 2014?
Good Luck!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Recovery of data from a binary file

PRogman
In reply to this post by John F Hall
Have you tried to rename the file d2028.bin to d2028.txt to see what's inside?
If it is very big it might help to have a good editor, otherwise notepad may be sufficient (unless you want to dive deep into DOS: WIN+R, cmd, type path\d2028.bin|more, Ctrl+C, Exit...).
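
If a text editor only shows garbage, a hex dump is easier to read. Here is a minimal Python sketch that prints the first few hundred bytes as hex plus printable characters; the file location and the helper names `hexdump` and `peek` are just assumptions for illustration.

```python
from pathlib import Path


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values and printable ASCII, one row per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def peek(path: str, nbytes: int = 320) -> str:
    """Dump the first nbytes of a file without loading the rest."""
    with open(path, "rb") as fh:
        return hexdump(fh.read(nbytes))


if __name__ == "__main__":
    target = Path("d2028.bin")  # assumed location of the file
    if target.exists():
        print(peek(str(target)))
```

Long runs of printable digits would suggest plain character data; mostly unprintable bytes would point to a binary card image.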

From the definitions (INPUT FORMAT...) there seem to be 4 lines to each record, and the lines seem to have 80, 80, 80 and 66 columns respectively, making each record 306 columns holding 72+73+74+58 = 277 variables.
I would try to read in d2028.bin using something like
DATA LIST FIXED FILE='d2028.bin' RECORDS=4 /1 row1varlist (formats) /2 row2varlist (formats)...
provided that d2028.bin is in an understandable format.
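
The column and variable counts can be tallied directly from the format string rather than by hand. A rough Python sketch follows; it only understands the two item types that occur in the INPUT FORMAT quoted in the first post (nFw.d fields and nX skips), and `summarise` is an illustrative name.

```python
import re

# INPUT FORMAT from d028.sps, with "/" separating the four records.
INPUT_FORMAT = (
    "F4.0,F2.0,3F1.0,F2.0,F1.0,F2.0,F1.0,F2.0,9F1.0,F2.0,52F1.0/"
    "6X,64F1.0,F2.0,8F1.0/6X,74F1.0/6X,53F1.0,2F2.0,3F1.0"
)

FIELD = re.compile(r"(\d*)F(\d+)\.\d+$")  # nFw.d: n numeric fields of width w
SKIP = re.compile(r"(\d*)X$")             # nX: skip n columns


def summarise(fmt):
    """Return a list of (columns_used, variables_read), one entry per record."""
    out = []
    for record in fmt.split("/"):
        columns = variables = 0
        for token in record.split(","):
            token = token.strip().upper()
            field = FIELD.match(token)
            skip = SKIP.match(token)
            if field:
                repeat = int(field.group(1) or 1)
                columns += repeat * int(field.group(2))
                variables += repeat
            elif skip:
                columns += int(skip.group(1) or 1)
            else:
                raise ValueError(f"unrecognised format item: {token!r}")
        out.append((columns, variables))
    return out


summary = summarise(INPUT_FORMAT)
for number, (cols, nvars) in enumerate(summary, start=1):
    print(f"record {number}: {cols} columns, {nvars} variables")
print("total:", sum(c for c, _ in summary), "columns,",
      sum(v for _, v in summary), "variables")
```

Note that the 6X at the start of records 2 to 4 counts towards the columns but reads no variables.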

HTH,
PR



Re: Recovery of data from a binary file

David Marso
Administrator
In reply to this post by John F Hall
I suspect that .bin file is column binary (multipunch).
At one point long ago I knew how to read those (with major eye gouging and great resistance) and the OLD SPSS manuals had some documentation in the appendix.
These days you see a reference in FILE HANDLE to MODE=MULTIPUNCH, but no examples of how to read the bloody thing.
Why not go with the .dat file? It sounds like someone has already converted the multipunch data, and all you need to do is hit the .dat with an appropriate DATA LIST command.
--
As far as Multipunch files?  I deliberately forgot everything I ever knew about them about 15 years ago.
Here is a hint from an OLD post in the SPSSX-L archive:
http://listserv.uga.edu/cgi-bin/wa?A2=ind9612&L=spssx-l&P=27638

Quoting from that:
"FILE HANDLE mpdata  NAME='your file and path' / MODE=MULTIPUNCH.
DATA LIST FILE=mpdata RECORDS n
   etc.


 
where n is the number of records per respondent, and mpdata is the
filename. You would attach your data file to the filename mpdata, and
specify the option "MODE=MULTIPUNCH", for a multipunch file. BTW, FILE
HANDLE can be used for ASCII files with record length>1024, as in the
LRECL=2000 option.
 
If you want to read a multipunched column (as with unaided brand
awareness), take a look at this example. Assume that you're interested
in card 1, column 23, and awareness punches are 1-9, 0, X, and Y.
Coding would work like this:
 
  /1 AWARE01 TO AWARE09 23:4-12 AWARE10 23:3 AWARE11 23:2 AWARE12 23:1
 
This creates separate "dummy" variables for each punch, coded as 1 for
the response code, and 0 if not coded for the respondent. You can then
run TABLES or MULTIPLE RESPONSE with these variables.
 
The numeric sequence is because SPSS reads multipunch "rows" as
Y, X, 0, 1-9. Hope this helps you out."
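
To make the row numbering in that quote concrete, here is a small Python sketch of the same mapping: row 1 = Y, row 2 = X, row 3 = 0, rows 4-12 = punches 1-9. It works from a set of punch labels, not from the bytes of a column-binary file, and the function names are mine.

```python
# Row order as described in the quoted post: Y, X, 0, then 1-9.
ROW_LABELS = ["Y", "X", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def punch_dummies(punches):
    """Return twelve 0/1 flags in row order for the punches in one column."""
    unknown = set(punches) - set(ROW_LABELS)
    if unknown:
        raise ValueError(f"not a punch position: {sorted(unknown)}")
    return [1 if label in punches else 0 for label in ROW_LABELS]


def aware_variables(punches):
    """Name the flags as in the quoted DATA LIST (23:4-12, 23:3, 23:2, 23:1)."""
    flags = punch_dummies(punches)
    # AWARE01..AWARE09 come from rows 4-12, i.e. list indexes 3-11.
    named = {f"AWARE{i:02d}": flags[i + 2] for i in range(1, 10)}
    named["AWARE10"] = flags[2]  # row 3 = punch 0
    named["AWARE11"] = flags[1]  # row 2 = punch X
    named["AWARE12"] = flags[0]  # row 1 = punch Y
    return named


example = aware_variables({"1", "3", "X"})
print([name for name, flag in example.items() if flag])
```

A column punched 1, 3 and X therefore sets AWARE01, AWARE03 and AWARE11 to 1 and leaves the other nine at 0.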





Re: Recovery of data from a binary file

John F Hall
In reply to this post by John F Hall

Bob

 

Thanks for this.  As far as I know it’s not a nested file.  I know about the + and - zone punches as I had to deal with many of these in the 1970s, reading them in as alpha and then converting to numeric.  SPSS doesn’t like invalid combinations and is registering an error:

 

An invalid combination of punches was detected reading a variable in a MULTIPUNCHed file.  The field will be processed as though it contained blanks. This process may precipitate an additional warning.

 

 

I’m still working my way round the binary file and so far have managed to read part of it and produce correct frequencies for two named variables, plus frequencies for single columns.  The problem I now have is that the columns are not the same as the ones in the codebook and I still have to reconcile the frequencies and values tabulated with those in the codebook.

 

For instance the number of children born to R including those who died is supposed to be in columns 10-11 but is actually in column 7.

 

I’m using the tedious method of reading one column at a time, but at least it works.

 

FILE HANDLE sn28
    /NAME='C:\Users\John\d2028.bin' /MODE=MULTIPUNCH.
data list file sn28
/1 serial 1-4 v105 to v120 5-20 (a).
list serial /cases 5.
freq v105 to v120.

 

What I’m not sure about is the number of records in the binary file: it looks like there are eight, but there may be only one.
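
One rough way to narrow down the number of records per case is to see which card sizes divide the file size evenly across the 1415 cases. A Python sketch follows; the candidate sizes (80 or 160 bytes per card, with or without one or two line-ending bytes) are guesses on my part, not documented properties of d2028.bin.

```python
import os

N_CASES = 1415  # from "# OF CASES" in d028.sps


def cards_per_case(file_size, bytes_per_card, n_cases=N_CASES):
    """Return the cards per case if the size divides evenly, else None."""
    per_case, remainder = divmod(file_size, bytes_per_card * n_cases)
    return per_case if remainder == 0 and per_case > 0 else None


def candidate_layouts(file_size, card_sizes=(80, 81, 82, 160, 161, 162)):
    """Try several hypothetical card sizes and report those that fit."""
    return {size: cards_per_case(file_size, size) for size in card_sizes}


if __name__ == "__main__":
    path = "d2028.bin"  # assumed location of the file
    if os.path.exists(path):
        print(candidate_layouts(os.path.getsize(path)))
```

More than one layout can fit the same size (eight 80-byte cards and four 160-byte cards give the same total), so this narrows the possibilities rather than settling them.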

 

John

 


 

 

 

 

 

From: Bob Walker [mailto:[hidden email]]
Sent: 15 May 2014 19:23
To: John F Hall
Subject: RE: Recovery of data from a binary file

 

Hi John,

 

You’re implying that you have column-binary card image data and are using 80-column card format. The FILE HANDLE command using MODE=MULTIPUNCH should be able to read your *.bin file without too much effort, I think. Keep in mind that you will need to account for 0-9, ‘x’, ‘y’, and ‘&’ by using RECODE with CONVERT.

 

It is rather hard to figure out the layout from what you pasted, but you may have a nested record structure (i.e., one master household record, and then perhaps one per child). The hole counts should help figure that out.

 

Bob Walker

Surveys & Forecasts, LLC

www.safllc.com

 

From: SPSSX(r) Discussion [[hidden email]] On Behalf Of John F Hall
Sent: Thursday, May 15, 2014 4:20 AM
To: [hidden email]
Subject: Recovery of data from a binary file

 


Re: Recovery of data from a binary file

Robert Walker

Hi John,

 

Without a valid codebook, it will be a bit of a guessing game, so I don’t envy you. The first few columns were typically set aside for a respondent ID and 79-80 for a card number, but your data file may not adhere to this. I’d only suggest that you experiment with the MODE switches: for example, /MODE=MULTIPUNCH vs. /MODE=EBCDIC (the old IBM standard) may yield different output and perhaps make that alignment process a little less tedious for you.
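
Outside SPSS, a crude way to tell ASCII from EBCDIC character data is to decode a sample both ways and see which gives more digits, letters, spaces and signs. Below is a Python sketch using the standard `ascii` and `cp037` codecs (cp037 is one common EBCDIC code page; whether it is the right one for this file is an assumption). It will not help if the file is column binary and not character data.

```python
def printable_share(raw: bytes, encoding: str) -> float:
    """Fraction of bytes that decode to ASCII letters, digits, space, + or -."""
    text = raw.decode(encoding, errors="replace")
    wanted = sum((ch.isascii() and ch.isalnum()) or ch in " +-" for ch in text)
    return wanted / len(text) if text else 0.0


def guess_encoding(raw: bytes, candidates=("ascii", "cp037")):
    """Return the best-scoring candidate encoding and all the scores."""
    scores = {enc: printable_share(raw, enc) for enc in candidates}
    return max(scores, key=scores.get), scores


# Made-up sample: a serial number and card number encoded as EBCDIC.
sample = "0001 01".encode("cp037")
print(guess_encoding(sample))
```

For real use, the sample would be the first few hundred bytes of the data file read in binary mode.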

 

Bob Walker

Surveys & Forecasts, LLC

www.safllc.com

 

From: John F Hall [mailto:[hidden email]]
Sent: Thursday, May 15, 2014 4:18 PM
To: Bob Walker
Cc: [hidden email]; 'UK Data Service Collections Development Team'; [hidden email]
Subject: RE: Recovery of data from a binary file

 
