I need to know the number of variables in each of a large number of system files. Is there any syntax for this? I'm on SPSS 23.
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Example:

A B C D E
1 2 1 2 1
2 2 2 2 2

CODEBOOK can show me that I have 2 cases but I need to know that I have 5 variables (A-E) and I can't figure out how to make CODEBOOK show that. From the few files that have been manually opened and counted I know there are files with from 18 to 662 variables. What I'd like is a way to have my computer run through the rest of them for me. Sorry if this has been discussed before but I didn't find it in my search.

Thanks!
Catherine
sysfile info "<filename>".
Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: Catherine Kubitschek <[hidden email]>
To: [hidden email]
Date: 06/29/2016 04:00 PM
Subject: Syntax for counting variables in system file(s)
Sent by: "SPSSX(r) Discussion" <[hidden email]>
If you want to summarize a lot of files, try the GATHERMD extension command. It takes a wildcard expression like c:\mydata\*.sav and creates a dataset listing all the variables in all the files it finds along with the file name and variable label. So if you just want counts, aggregate that dataset using the file name as the break variable.

The syntax would be like

gathermd "c:\mydata\*.sav".

There are some other options, but you probably don't need them. The command searches the subdirectories of the filespec as well as the specified directory. If you don't have this command installed, you can add it from the Utilities menu.

On Wed, Jun 29, 2016 at 3:34 PM, Rick Oliver <[hidden email]> wrote:
sysfile info "<filename>".
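To make the aggregation idea concrete: the GATHERMD dataset has one row per variable, so counting rows per file name gives the number of variables in each file. The Python sketch below only illustrates that counting step outside SPSS. The file and variable names are hypothetical placeholders, not real GATHERMD output, and the function name is made up for this example.

```python
from collections import Counter

# Hypothetical (file name, variable name) pairs standing in for the rows of the
# dataset that GATHERMD creates: one row per variable per file.
rows = [
    ("c:/mydata/a.sav", "id"),
    ("c:/mydata/a.sav", "age"),
    ("c:/mydata/a.sav", "sex"),
    ("c:/mydata/b.sav", "id"),
    ("c:/mydata/b.sav", "score"),
]


def count_variables(rows):
    """Count rows per file name, i.e. aggregate with the file name as break variable."""
    return Counter(source for source, _variable in rows)


num_vars = count_variables(rows)
for source, n in sorted(num_vars.items()):
    print(source, n)
```

Inside SPSS the same result comes from AGGREGATE with the file name as the break variable and N as the summary function.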
In reply to this post by Rick Oliver-3
Ah yes, good idea Rick. You could combine that with some OMS commands to generate one file that includes the variable counts for all the files. Something like this, maybe:
* Suppress all output via OMS .
OMS /DESTINATION VIEWER=NO /TAG='suppressall'.

DATASET DECLARE FileInfo.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Sysfile Info'] SUBTYPES=['File Information']
  /DESTINATION FORMAT=SAV NUMBERED=TableNumber_
   OUTFILE='FileInfo' VIEWER=NO.

* For this bit, you could use a macro that loops through a list of files.
SYSFILE INFO "C:\SPSSdata\accidents.sav".
SYSFILE INFO "C:\SPSSdata\car_sales.sav".
SYSFILE INFO "C:\SPSSdata\salesperformance.sav".
OMSEND.

DATASET ACTIVATE FileInfo.
SELECT IF ANY(Var1,"Source","Data Information") and
  NOT ANY(Var2,"Weight Variable","Compressed").
EXECUTE.

Here is the main output from my small example:

TableNumber_ Var1 Var2 Var3
1 Source C:\SPSSdata\accidents.sav
1 Data Information N of Cases 6
1 Data Information N of Defined Variable Elements 4
1 Data Information N of Named Variables 4
2 Source C:\SPSSdata\car_sales.sav
2 Data Information N of Cases 157
2 Data Information N of Defined Variable Elements 29
2 Data Information N of Named Variables 26
3 Source C:\SPSSdata\salesperformance.sav
3 Data Information N of Cases 60
3 Data Information N of Defined Variable Elements 2
3 Data Information N of Named Variables 2

Number of cases read: 12 Number of cases listed: 12
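The OMS output is in long form, with several rows per file tied together by TableNumber_. As a rough sketch of how those rows reduce to one variable count per file, here is the join done in Python on the values listed above. The tuple layout (table number, Var1, row label, value) is an assumption about how the columns line up, and the function name is invented for this example.

```python
# Rows transcribed from the listing above as (table number, Var1, label, value).
# Assumption: the "Source" row carries the file path as its value.
rows = [
    (1, "Source", "", r"C:\SPSSdata\accidents.sav"),
    (1, "Data Information", "N of Cases", "6"),
    (1, "Data Information", "N of Defined Variable Elements", "4"),
    (1, "Data Information", "N of Named Variables", "4"),
    (2, "Source", "", r"C:\SPSSdata\car_sales.sav"),
    (2, "Data Information", "N of Cases", "157"),
    (2, "Data Information", "N of Defined Variable Elements", "29"),
    (2, "Data Information", "N of Named Variables", "26"),
    (3, "Source", "", r"C:\SPSSdata\salesperformance.sav"),
    (3, "Data Information", "N of Cases", "60"),
    (3, "Data Information", "N of Defined Variable Elements", "2"),
    (3, "Data Information", "N of Named Variables", "2"),
]


def variable_counts(rows):
    """Return {file path: N of Named Variables}, joining rows on table number."""
    paths = {}
    counts = {}
    for table, var1, label, value in rows:
        if var1 == "Source":
            paths[table] = value
        elif label == "N of Named Variables":
            counts[table] = int(value)
    return {paths[t]: counts[t] for t in sorted(paths) if t in counts}


summary = variable_counts(rows)
for path, n in summary.items():
    print(path, n)
```

The same reduction could presumably be done in SPSS itself by selecting the relevant rows and restructuring on TableNumber_.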
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
And why would you go to all that trouble, including having to list all the filespecs to be processed, instead of using the one-line solution I posted (or two if you count AGGREGATE)?

gathermd "c:\mydata\*.sav".

On Wed, Jun 29, 2016 at 4:08 PM, Bruce Weaver <[hidden email]> wrote:
Ah yes, good idea Rick. You could combine that with some OMS commands to
I like Jon's idea. And that first thing Bruce posted with FLIP ALL is a complete nightmare (what if the file has 1M cases?). You should always have a SELECT IF $CASENUM LE 1 before doing that sort of shenanigans.
I might delete it myself (I have the power). OTOH, no need to resort to such barbaric crap in the first place ;-)
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Deleted Bruce's monstrosity ;-) He will thank me later (even so, it is still on UGA. Maybe he should contact Joe?)
In reply to this post by David Marso
-----
--
Bruce Weaver
Cat got your tongue? ;-)

On Thu, Jun 30, 2016 at 7:47 AM, Bruce Weaver [via SPSSX Discussion] <[hidden email]> wrote:
-----
In reply to this post by David Marso
I also like Jon's suggestion. I didn't see it until after posting my follow-up to Rick.
And yes FLIP ALL is a monstrosity (especially if you have a large file and omit the SELECT IF $CASENUM EQ 1). But I would have left it there as an illustration that coding is an iterative process, and the first thought that pops into one's head isn't necessarily a great one! ;-)
--
Bruce Weaver
OTOH: Sometimes the first thought is a great thought and you hit walls with SW bugs such as the crap I ran into recently with IMPUTE MISSING with split file and then a butt load of issues with OMS... turned a 3 day project into a F'ing 2 week nightmare with deplorable workaround bandaid bullshit!!!
Usually my first thoughts are the best thoughts due to years of experience (and making a whole lot of mistakes in the early years ;-))))). Some of those mistakes are still being reposted by people who can't tell the difference. ------
In reply to this post by David Marso
I hit "Post" when aiming for "Quote". Need more coffee!
--
Bruce Weaver
You need a new mouse (or more coffee ;-)
Quote is FAR far away from the post button my friend ;-) On my rendering Quote is at the top and Post is at the bottom. ---
In reply to this post by Jon Peck
Now that I'm back at the office, I have tried GATHERMD. After some Googling for more info and some experimentation, I discovered that it can take a list of files, which will be of interest to the OP. E.g.,
GATHERMD
  "C:\SPSSdata\accidents.sav"
  "C:\SPSSdata\car_sales.sav"
  "C:\SPSSdata\salesperformance.sav" .

****************************************************.
DATASET ACTIVATE xDataset7. /* See my question below.
****************************************************.

* Use AGGREGATE to create a simple file with one row for each source file
* (variable Source) and variable NumVars showing the number of variables.

AGGREGATE
  /OUTFILE=* MODE=REPLACE
  /BREAK=source
  /NumVars=N.
DATASET NAME NumVars.
ALTER TYPE Source(AMIN).
LIST.

Output from LIST:

source                            NumVars
C:/SPSSdata/accidents.sav               4
C:/SPSSdata/car_sales.sav              26
C:/SPSSdata/salesperformance.sav        2

Number of cases read: 3 Number of cases listed: 3

QUESTION: Is there some way to restart numeric sequencing of these 'xDataset' dataset names at 1? I got xDataset7 here despite having closed Xdataset1-xDataset6 before running this latest version of my syntax. Alternatively, is there a way to specify one's own dataset name?

EDITED: By the way, I eventually discovered that the MD in GATHERMD stands for "meta-data". I thought I should add that in case anyone else was still puzzling over it.
--
Bruce Weaver
"QUESTION: Is there some way to restart numeric sequencing of these 'xDataset' dataset names at 1? I got xDataset7 here despite having closed Xdataset1-xDataset6 before running this latest version of my syntax. "
Likely not short of restarting SPSS. AFAIK it keeps a counter from start of session and increments each time a new dataset is defined ;-). Last week I was staring at DATASET 20001 etc...due to unforeseen nasty backend bugs which I will refrain from reiterating lest I crawl back into that nightmare and pull out the remainder of my rapidly graying hair.

"Alternatively, is there a way to specify one's own dataset name? "

I do not know, what do the docs say?
In reply to this post by Bruce Weaver
My guess is that the OP, who said they had many files, would want to use the wildcard to avoid listing them, but an explicit list also works. Or regular expression patterns can be used with the FILENAMEPATTERN keyword on the OPTIONS subcommand.

Statistics does not reuse automatic dataset names within a session. You can specify a dataset name to GATHERMD as DSNAME=name.

The full syntax help is available via F1 in the Syntax Editor with the cursor on the command (just like built-in commands) with V23 or later or by running

GATHERMD /HELP.

On Thu, Jun 30, 2016 at 8:30 AM, Bruce Weaver <[hidden email]> wrote:
Now that I'm back at the office, I have tried GATHERMD. After some Googling
Is the GATHERMD help available online somewhere? Here's what I get when I try your suggestions, Jon:
GATHERMD /HELP.

Help file not found:file://C:\\ProgramData\\IBM\\SPSS\\Statistics\\23\\extensions\GATHERMD\markdown.html
--
Bruce Weaver
I'll (at least partially) answer my own question. One can get a dialog for GATHERMD by clicking on File > Collect Variable Information. When I clicked on the Help button in that dialog, I was taken to a local copy of the help file, which I've appended at the bottom of this post (in case others are also having trouble getting access to it). Via the dialog, and after a little fiddling around, I was able to produce this example, which is probably close to what Catherine (the OP) is looking for. It reads all .sav files in the specified folder.
GATHERMD "C:\SPSSdata"
  /OPTIONS FILETYPES=spss ATTRLENGTH=256 DSNAME = GMDresults.
DATASET ACTIVATE GMDresults.
AGGREGATE
  /OUTFILE=* MODE=REPLACE
  /BREAK=source
  /NumVars=N.
DATASET NAME NumVars.
ALTER TYPE source(AMIN).
LIST /CASES FROM 1 to 10.

Output from LIST:

source                                            NumVars
C:/SPSSdata/1991 U.S. General Social Survey.sav        43
C:/SPSSdata/accidents.sav                               4
C:/SPSSdata/adl.sav                                    14
C:/SPSSdata/advert.sav                                  2
C:/SPSSdata/aflatoxin.sav                               2
C:/SPSSdata/aflatoxin20.sav                             2
C:/SPSSdata/anorectic.sav                              22
C:/SPSSdata/anticonvulsants.sav                         9
C:/SPSSdata/bankloan_binning.sav                        9
C:/SPSSdata/bankloan_cs_noweights.sav                  12

Number of cases read: 10 Number of cases listed: 10

APPENDIX: HELP FILE FOR GATHERMD

Scan Data Files for Variable Information

Using this dialog box you can create a dataset containing information about the variable dictionaries in one or more IBM® SPSS® Statistics, SAS, or Stata files.

► To run this procedure, from the menus choose:
  Files > Collect Variable Information...
► All variables in the selected files are included.

Directory to Search. Enter a directory to search. Its subdirectories are also searched.

File Name Pattern. A regular expression pattern can be used to filter the files processed. For example, car would limit the files to those whose name starts with "car". The pattern .*car would accept any filenames containing "car".
• Regular expressions are not the same as the filename wildcards found in many operating systems. For example, abc* will match any name starting with ab: it means literally ab followed by zero or more c's.
• The regular expression is not case sensitive, and it is applied to the name of the file without the extension.
For a full explanation of regular expressions, one good source is http://www.amk.ca/python/howto/regex/. If no pattern is given, all files of the specified types are processed.

File Types to Scan. Check the boxes for the desired file types. The file types are:
• IBM SPSS Statistics: .sav, .por
• SAS: .sas7bdat, .sd7, .sd2, .ssd01, and .xpt
• Stata: .dta

List of Custom Attributes to Include. Enter the names of any custom attributes whose values should be included in the output.

Maximum Length for Attribute Values. Custom attribute values can be up to 32767 bytes long. By default values are truncated to 256 bytes. Enter a different value to change this limit.

Additional Features

This dialog generates syntax for the GATHERMD extension command. To display help for this command, run the following syntax:

GATHERMD /HELP.

In syntax, a list of directories can be specified.

Requirements

This dialog requires the Integration Plug-In for Python and the GATHERMD extension command. For IBM SPSS Statistics 19 and higher, the Plug-In and the extension command are installed with the Essentials for Python package. For more information, see How to Get Integration Plug-Ins, under Core System > Frequently Asked Questions in the IBM SPSS Statistics Help system. If you downloaded the GATHERMD extension command from the SPSS Community, then please follow the instructions in the associated readme file.

© Copyright IBM Corp. 1989, 2013
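The difference between regular expressions and filename wildcards described in the help file is easy to trip over, so here is a small Python sketch of the three example patterns. It assumes the pattern is applied with a start-anchored, case-insensitive match against the file name without its extension, which is how the help text reads; whether GATHERMD does exactly this internally is not confirmed here. The helper name and the sample file names are made up for illustration.

```python
import re
from pathlib import PurePath


def matches(pattern, filename):
    """Match pattern at the start of the extension-less name, ignoring case.

    This mirrors the behavior the help text describes; it is an assumption
    about GATHERMD, not its actual code.
    """
    stem = PurePath(filename).stem
    return re.match(pattern, stem, re.IGNORECASE) is not None


# Illustrative file names only.
names = ["car_sales.sav", "Cars.sav", "used_car.sav", "ab.sav", "abccc.sav", "abd.sav"]

starts_with_car = [n for n in names if matches("car", n)]
contains_car = [n for n in names if matches(".*car", n)]
abc_star = [n for n in names if matches("abc*", n)]

print(starts_with_car)  # names beginning with "car", any case
print(contains_car)     # names containing "car" anywhere
print(abc_star)         # names beginning with "ab", since c* may match nothing
```

Note that abc* accepts abd.sav: as a regular expression it only requires "ab" followed by zero or more c's, whereas a shell wildcard abc* would require the literal prefix "abc".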
--
Bruce Weaver
In reply to this post by Bruce Weaver
Firefox changed its behavior after V23 was released in a way that causes the syntax help access to fail. I worked around this in V24, but I also posted the updated code (extension.py) on the old Community website:

The code is compatible with V23 and older versions. So, you can access the help in three ways:

1) Replace the extension.py module you have with the one above. It will be in the python\lib\site-packages directory under your Statistics installation in most cases. You will need to restart Statistics after doing that.
2) Use a different browser as your default browser. FF was the only one that failed when I tested this some time ago.
3) Access the help file directly from any browser as
file://C:\ProgramData\IBM\SPSS\Statistics\23\extensions\GATHERMD\markdown.html
in your case.

This applies to the syntax help for all the extensions.

On Thu, Jun 30, 2016 at 11:17 AM, Bruce Weaver <[hidden email]> wrote:
Is the GATHERMD help available online somewhere? Here's what I get when I
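For option 3, browsers differ in how forgiving they are about backslashes in a file:// address. The following Python sketch shows one way to turn the Windows path from this thread into a conventional file URI with forward slashes. The helper name is hypothetical and this is not part of GATHERMD or extension.py; it is only an illustration of the path-to-URI conversion.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def windows_path_to_file_uri(path):
    """Convert a Windows path string to a file:/// URI (hypothetical helper)."""
    posix = PureWindowsPath(path).as_posix()      # backslashes become forward slashes
    return "file:///" + quote(posix, safe="/:")   # percent-encode spaces and the like


help_uri = windows_path_to_file_uri(
    r"C:\ProgramData\IBM\SPSS\Statistics\23\extensions\GATHERMD\markdown.html"
)
print(help_uri)
```

Pasting the resulting address into the browser's location bar should open the same local help file, assuming it exists at that location.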