Hi, my name is Sean McKenzie, and I am up here in Anchorage Alaska.
I am a new user. My previous experience is mainly using GAUSS to program, a PASCAL like language especially suited to matrix operations, and EVIEWS and its precursor TSP, pretty much Econometrics stuff. I am by background, and Economist (booo, hiss). But now I am working on Child Care issues (hip, hip, hurray!), and we finally bought SPSS, which I actually recommended despite my lack of knowledge, because I know that it is commonly used in the Social Sciences arena outside of Economics. Most of my experience with lists etal are mainly from playing wargames. (So if any of you are up here in Alaska send me an email at: [hidden email] and may be we can play.) Given my backround, I believe for me it will be easier to work with matrix operations, with which I am essentially thoroughly familiar. I am just starting, and a routine for me in learning a new program is first to read in data and then spit it back out and confirm that the spit out data is exactly the same as the spit in. I did that easily enough. What I want to do in general is load data from an excel spreadsheet and put it into a matrix. So far to me it looks like the easiest way to do this is to upload the data, save it as an spss.sav matrix file and then upload it again. I am looking for a more efficient way. To directly upload data from excel as a matrix or to convert excel data to a matrix without first having to save/re-upload. My usual second experiment is to upload data, pull out the field names and then spit those out. Something I could not figure out how to do without using matrices. One of the desirable things is for me not to have to explicitly know the names of my fields or their order prior to uploading data/running a program, and not to need to know the row or column length of my matrices, as even when using the "same data" these typically change, even from day to day. Typically, thinking in GAUSS, the commands are something like LOADMATRIX M=<file> Names=M[1,.] Write Names The above is pulling out the field names in row 1 of my Excel SS. The above is just a test, but in practical terms I would then assign variable names using the first row of the matrix...since I have perhaps a 100 or coupla hundred variables in my database, I load a matrix of variable names into the program and using do loops I compare the list of variable names to the names I uploaded, so when Variablenames[i,1]=Names[1,i] and suppose the string there was GDP GDP=M[2:Countrows(M),i] In the end I will want to load data directly from databse, which is an access database, but it is just a file on my directory, no password et al. For now I'll settle for downloading data into excel and then taking it into SPSS. I can also import my data from text files. I know that was long, but can any body please help? As an aside, one of the things I find somewhat difficult to deal with is: In GAUSS the = sign is an "ASSIGNMENT" operator and works the same for MATRICES and Scalars. In general in GAUSS there was no special need to declare something to be a matrix, in general it treated everthing like a matrix, and a scalar is a [1,1] matrix. Is there something similar in SPSS, an assignment operator? Is it possible to operate in matrix style/matrix notation without going through the whole matrix procedure? |
At 01:03 PM 9/8/2006, Sean McKenzie wrote:
>I am by background, and Economis (booo, hiss). Hey, it happens to the best of us. >But now I am working on Child Care issues (hip, hip, hurray!), and we >finally bought SPSS, because I know that it is commonly used in the >Social Sciences arena outside of Economics. > >Given my backround, I believe for me it will be easier to work with >matrix operations, with which I am essentially thoroughly familiar. Maybe, and by all means do. But, depending on what you need to do, look at the native statistical and reporting procedures in SPSS, too. They were put there for this, after all. By all means post, with what statistical operations you want to do, and somebody likely can make suggestions. >My usual second experiment is to upload data, pull out the field names >and then spit those out. Something I could not figure out how to do >without using matrices. Well, if you just want to display the variable names, command DISPLAY VARIABLES should do it for you. If you want to access the variable names as string values, it's harder, and not particularly natural. You're right, that they're there in a matrix data file; you can read in your data, take the correlation matrix with CORRELATIONS or some such, and read the matrix data file as if it were a regular file. Of course, that loses the names of string variables. Better: You probably have SPSS 14, so you can access the data dictionary through a Python program. Or, see my post "Data dictionary in SPSS", Wed, 6 Sep 2006 13:30:05 -0400. >What I want to do in general is load data from an excel spreadsheet >and put it into a matrix. So far to me it looks like the easiest way >to do this is to upload the data, save it as an spss.sav matrix file >and then upload it again. There's your problem. An SPSS matrix, as in BEGIN MATRIX/ END MATRIX, has nothing to do with matrix-format datasets. I assume you want a matrix whose rows are the cases, or records, in your file, and whose columns are the variables, or fields. For that, you want a regular-format SPSS file, including the working file. See (you probably have seen) the GET statement in the matrix language - see section MATRIX in the Command Syntax reference. Load your data from the Excel spreadsheet, issue a MATRIX command, and get the data. (I'd save it to an SPSS format file, too; I'd do that for any data read from an outside source.) >One of the desirable things is for me not to have to explicitly know >the names of my fields or their order prior to uploading data/running >a program, and not to need to know the row or column length of my >matrices, as even when using the "same data" these typically change, >even from day to day. Well, there's GET TRANSLATE with the /FIELDNAMES subcommand, which is probably what you want; or GET DATA. SPSS is very flexible about the "column length", by which I take it you mean the number of cases or records. It is flexible about "row length", by which I take it you mean number of variables, when reading using one of the above methods; but you'll usually have to know the variable names in the file in order to do anything else. >I would then assign variable names using the first row of the >matrix... Does GET TRANSLATE ... /FIELDNAMES do this for you? >since I have perhaps a 100 or coupla hundred variables in my database, >I load a matrix of variable names into the program and using do loops >I compare the list of variable names to the names I uploaded, so when I think, here, you're getting far away from SPSS 'idiom'. SPSS isn't very good at meta-processing, dealing with its own data dictionary. It shouldn't be needed for what looks like something relatively simple: loading data from a spreadsheet. So, don't ask how to you mimic what you'd do in GAUSS; it's probably somewhere between awkward and impossible. Help us understand what you're trying to accomplish. I know. One of the most frustrating times in any of our lives is changing from one system to another with very different style and idiom. It can be maddeningly difficult to even phrase the question, in the new system's terms, how to do something that was obvious in the old system. BE PATIENT: It probably isn't that SPSS can't do it; it's probably that you're having trouble thinking in SPSS's idiom, when you're used to something very different. >As an aside, one of the things I find somewhat difficult to deal with >is: > >In GAUSS the = sign is an "ASSIGNMENT" operator and works the same for >MATRICES and Scalars. In general in GAUSS there was no special need >to declare something to be a matrix, in general it treated everthing >like a matrix, and a scalar is a [1,1] matrix. Is there something >similar in SPSS, an assignment operator? Yes; the '=' sign. But it has to be on a COMPUTE command. On the COMPUTE command, you can do assignment pretty much as you'd expect. >Is it possible to operate in matrix style/matrix notation without >going through the whole matrix procedure? If you mean, without a MATRIX/ END MATRIX structure, no, it isn't. That's what MATRIX is for; and the 'idiom' in MATRIX is very different from that in the rest of SPSS. That shouldn't be a problem, if you want to work that way: read your data, issue MATRIX, and then the GET statement of the MATRIX language. BUT, here's where you're hitting a difference in paradigm and idiom. It sounds like you're used to having your data in an explicit matrix, that you manipulate with matrix commands. The SPSS MATRIX language is pretty much like that. But in idiomatic standard SPSS, you're not looking at your data as an object. Think, not of operating on all the data at once, but on one row at a time. (Drive you crazy? But you can, in fact, do most of what you want that way. Its advantage is being able to handle MUCH larger files.) When you read about the transformation language, remember that everything you're writing is the interior of a loop, the implicit loop through the cases in the SPSS file. Procedures (like CORRELATION or REGRESSION) accept this kind of file and process it sequentially. Now, keep asking. You'll be frustrated for a little while longer, until you get the SPSS idiom. Quickly, - In the MATRIX language, you can work pretty much the way you used to in Gauss, though printed output, say, is probably less flexible - To work in general SPSS, outside MATRIX, you will need to think in a different idiom - To just read your data from day to day, I'm not sure where your difficulties lie, but it doesn't sound like it should be giving you as much trouble as it seems to be. Remember: in SPSS, you don't assign names to 'columns' (variables). SPSS is more like a database system than like a matrix system: variables exist ONLY with names. If I weren't so tired, I'd have made this shorter. I hope it's some help. Do post again. Good luck, Richard Ristow |
Free forum by Nabble | Edit this page |