SPSSX Discussion

reading text

Dogan, Enis

reading text

Dear all

I have my data stored in text format. It looks like this:

1 1 10 11 5 0 4 0 1 1 1 1

3 4 4 2 1 1 0

-------------------

2 1 9 11 5 8 4 0 1 1 1 1

5 4 3 3 1 1 0

------------------

So each case spans 2 lines and has 19 variables (including the ID, which
appears at the beginning, before a tab).

Cases are separated by a line of dashes (---------).

There are also blank lines between the end of each data string and the
dashed line, and again after it, before the next data string begins.

Is there an easy way to read this into SPSS?

If I delete the dashed lines and the blank lines by hand, I can read the
file, but that takes too much time and I have other files just like this
one.

Thank you

Enis

Mike P-5

Re: reading text

If you have Python programmability installed, you can strip out the -----
lines and the blank lines with a little piece of code, and then read the
cleaned file in using your current method.

HtH

Mike

begin program.
# Strip the dashed separator lines and the blank lines from each file,
# then write the result to a new file. The data values are left untouched.
# List your own files here (placeholder name):
file_names = ["your file here and its location.txt"]
for file_name in file_names:
    f = open(file_name, "r")
    lines = f.readlines()
    f.close()

    # this is what your new text file will be written as
    working_file = file_name.replace(".txt", "_new.txt")
    new_file = open(working_file, "w")
    for line in lines:
        # a line that is empty once spaces and dashes are stripped is dropped
        if line.strip().strip("-").strip() == "":
            continue
        new_file.write(line)
    new_file.close()
end program.
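
For anyone who wants the cleaning step to also put each case on a single line, here is a minimal sketch in plain Python (no spss module needed). It assumes that a line made up only of dashes closes a case and that values are separated by spaces or tabs. The names clean_text and clean_file are made up for this illustration and are not part of any SPSS API.

```python
def clean_text(raw):
    """Return the data with one case per line, separators and blanks removed."""
    cases, current = [], []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # blank line: nothing to keep
        if set(stripped) == {"-"}:        # line made up only of dashes
            if current:
                cases.append(" ".join(current))
                current = []
            continue
        current.extend(stripped.split())  # split() also handles the tab after the ID
    if current:                           # last case without a trailing separator
        cases.append(" ".join(current))
    return "\n".join(cases) + "\n"


def clean_file(src, dst):
    """Write a cleaned copy of the file src to dst (both paths are up to you)."""
    with open(src) as f:
        cleaned = clean_text(f.read())
    with open(dst, "w") as f:
        f.write(cleaned)


# The two cases from the original post, blank lines included.
sample = (
    "1 1 10 11 5 0 4 0 1 1 1 1\n\n3 4 4 2 1 1 0\n\n-------------------\n\n"
    "2 1 9 11 5 8 4 0 1 1 1 1\n\n5 4 3 3 1 1 0\n\n------------------\n"
)
print(clean_text(sample))
```

Calling clean_file once per input file would cover the "other files just like this" part of the question.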

Richard Ristow

Re: reading text

At 09:52 AM 11/9/2006, Dogan, Enis wrote:

>I have my data stored in text format. Each case is 2 lines and 19
>variables (including ID that appears at the beginning before tab).
>
>Cases are separated with ---------.
>
>Also there are blank lines between the end of the data string and the
>--------- sign and blank after that again before the next data string
>begins.
>
> Looks like this:

Yours is one of many postings that seem to have extra blank lines
introduced in the data. Is this what it looks like? I've put SPSS
"BEGIN DATA/END DATA." around it:

BEGIN DATA
1 1 10 11 5 0 4 0 1 1 1 1
3 4 4 2 1 1 0

-------------------

2 1 9 11 5 8 4 0 1 1 1 1
5 4 3 3 1 1 0

------------------

END DATA.

Python is great, but for this it's like swatting a fly with a
sledgehammer. The following is SPSS draft output:

NEW FILE.
DATA LIST FREE
/ ID DATA01 TO DATA18 (19F2) HYPHENS(A20).
BEGIN DATA
1 1 10 11 5 0 4 0 1 1 1 1
3 4 4 2 1 1 0

-------------------

2 1 9 11 5 8 4 0 1 1 1 1
5 4 3 3 1 1 0

------------------

END DATA.
DELETE VARIABLES HYPHENS.
LIST.
|-----------------------------|---------------------------|
|Output Created               |09-NOV-2006 12:17:54       |
|-----------------------------|---------------------------|
     DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT DAT
  ID A01 A02 A03 A04 A05 A06 A07 A08 A09 A10 A11 A12 A13 A14 A15 A16 DATA17 DATA18

   1   1  10  11   5   0   4   0   1   1   1   1   3   4   4   2   1      1      0
   2   1   9  11   5   8   4   0   1   1   1   1   5   4   3   3   1      1      0

Number of cases read:  2    Number of cases listed:  2
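
The idea behind the free-format read above can be sketched in Python as a rough analogy: values are taken in order, line breaks and blank lines do not matter, and every 20th value (the run of dashes) is thrown away. This is only an illustration of the logic, not how SPSS implements DATA LIST FREE, and the function name read_free_format is invented for the example.

```python
def read_free_format(text, n_numeric=19):
    """Group whitespace-separated values into records of n_numeric numbers
    followed by one separator token, ignoring where the lines break."""
    tokens = text.split()          # blank lines and line breaks vanish here
    record_len = n_numeric + 1     # 19 numbers plus one token for the dashes
    rows = []
    for start in range(0, len(tokens) - record_len + 1, record_len):
        record = tokens[start:start + record_len]
        # keep the numbers, drop the trailing dashes (like DELETE VARIABLES)
        rows.append([int(tok) for tok in record[:n_numeric]])
    return rows


# Same layout as the BEGIN DATA block above.
data = (
    "1 1 10 11 5 0 4 0 1 1 1 1\n3 4 4 2 1 1 0\n\n-------------------\n\n"
    "2 1 9 11 5 8 4 0 1 1 1 1\n5 4 3 3 1 1 0\n\n------------------\n"
)
for row in read_free_format(data):
    print(row)
```

One consequence of reading this way: if any case has a value missing, every later value shifts by one position. In this sketch that shows up as an error when the dashes land in a numeric slot, so it is worth checking the case count after reading.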

Edward Boadi

Re: reading text

Hi Enis,
Suppose this is the format of TxtData.txt:

1 1 10 11 5 0 4 0 1 1 1 1 3 4 4 2 1 1 0
-------------------

2 1 9 11 5 8 4 0 1 1 1 1 5 4 3 3 1 1 0

------------------

The following syntax does what you want.

****Read Txt Data .
GET DATA /TYPE = TXT
/FILE = 'C:\TxtData.txt'
/DELCASE = LINE
/DELIMITERS = " "
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 1
/IMPORTCASE = ALL
/VARIABLES =
V1 A30
V2 F2.0
V3 F2.0
V4 F2.0
V5 F2.0
V6 F2.0
V7 F2.0
V8 F2.0
V9 F2.0
V10 F2.0
V11 F2.0
V12 F2.0
V13 F2.0
V14 F2.0
V15 F2.0
V16 F2.0
V17 F2.0
V18 F2.0
V19 F2.0
.
CACHE.
EXECUTE.

* Delete the dashed separator lines.
SELECT IF INDEX(V1,'----') = 0.
EXECUTE.
* Convert V1 (the ID) to numeric, if you want.
RECODE V1 (CONVERT) INTO Id.
EXECUTE.
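
For comparison, the same filter-then-convert logic can be sketched in Python, assuming (as above) that each case already sits on one line. The function name is made up for the illustration, and skipping blank lines is an extra step in this sketch that the syntax above does not spell out.

```python
def read_one_case_per_line(text):
    """Keep the data lines, drop separator lines, return numeric rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                 # blank line (extra step in this sketch)
        if "----" in fields[0]:
            continue                 # same test as INDEX(V1,'----') = 0
        rows.append([int(f) for f in fields])  # the ID becomes numeric too
    return rows


txt_data = (
    "1 1 10 11 5 0 4 0 1 1 1 1 3 4 4 2 1 1 0\n"
    "-------------------\n\n"
    "2 1 9 11 5 8 4 0 1 1 1 1 5 4 3 3 1 1 0\n\n"
    "------------------\n"
)
for row in read_one_case_per_line(txt_data):
    print(row)
```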

Thanks.
Edward.
