|
Dear ListServ Members,

We have data stored in .XML files and would like to use SPSS (v19) to import it into a common SPSS file (many .XML files imported into one dataset). Each .XML file has the same format (variable names, etc.) as all of the other .XML files we would like to import.

Is there a way to import .XML formatted information into SPSS v19, preferably in batches? There are tens of thousands of these files, so we would rather not import each file separately and then do some sort of add-cases step.

Best Regards,
John Turner
Virginia Department of Corrections |
|
Dear John,

One way to do this is to write a Python program to parse and import the data. Raynald Levesque has some examples on his site, http://www.spsstools.net

Garry Gelade
Business Analytic Ltd |
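The "parse it with Python" route suggested above might look roughly like the sketch below. It is only an illustration under loud assumptions: the thread never shows the XML layout, so the `record` element name, the idea that each child element of a record is one variable, and the function name `xml_files_to_csv` are all hypothetical. It flattens every matching file into a single CSV file rather than writing an SPSS file directly.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET


def xml_files_to_csv(pattern, csv_path, record_tag="record"):
    """Flatten every <record_tag> element in the matching XML files into one CSV.

    Assumption (not from the thread): each child element of a record is one
    variable, with the tag as the variable name and the text as the value.
    Returns the number of records written.
    """
    written = 0
    writer = None
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        for path in sorted(glob.glob(pattern)):
            root = ET.parse(path).getroot()
            for record in root.iter(record_tag):
                row = {child.tag: (child.text or "").strip() for child in record}
                row["source_file"] = os.path.basename(path)
                if writer is None:
                    # The first record fixes the columns; the files are said
                    # to share the same variables.
                    writer = csv.DictWriter(
                        out, fieldnames=list(row), restval="", extrasaction="ignore"
                    )
                    writer.writeheader()
                writer.writerow(row)
                written += 1
    return written
```

The resulting CSV file can then be read into SPSS in one step, for example with GET DATA /TYPE=TXT. The `source_file` column is an extra added by this sketch so that each case can be traced back to its file.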
|
In reply to this post by Turner, John E. (VADOC)

There are two parts to this problem. First, can you read a single XML data file into SPSS Statistics? Second, how do you read a large set of identically structured XML files into one dataset?

On the first part, XML files come in infinite variety, so try reading one with Statistics. To do this, you need to install the SPSS Data Access Pack (DAP), which contains the XML driver. The DAP can be downloaded from http://www.spss.com/drivers/ if you don't already have it. Then use File>Open Database>New Query and define the data source using the XML driver. From all this you can generate the GET DATA syntax to read an XML file.

Once you are set up to read a single dataset, you are ready to set up a job to combine multiple datasets. For this you need to use Python programmability, directly or indirectly, in order to avoid having to enumerate each dataset individually.

Using programmability, you can use the glob.glob function to find all the files matching your input set, say "c:/myxmldata/*.xml". Then iterate over these files, opening and matching each of them by submitting commands to Statistics until you have the entire aggregate.

Alternatively, if you are more comfortable just using SPSS Statistics, you can use the SPSSINC PROCESS FILES extension command from the SPSS Community (www.ibm.com/developerworks/spssdevcentral). It will iterate through the specified files and run a block of Statistics syntax for each one, so you could use the file handles or macros that it provides to your Statistics code to open and match each file. See the help for the command or its dialog box help for more information.

HTH,
Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: "Turner, John E." <[hidden email]>
To: [hidden email]
Date: 02/16/2011 12:39 PM
Subject: [SPSSX-L] Building a dataset from data stored in .XML format
Sent by: "SPSSX(r) Discussion" <[hidden email]> |
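The glob-and-iterate step Jon Peck describes could be sketched as follows. This is not his code and has not been run against Statistics. The function name `build_import_jobs`, the `{path}` placeholder convention, and the dataset names `combined` and `incoming` are made up for illustration. The GET DATA command itself is deliberately left as a template, because the real one has to be pasted from the Database Wizard for your driver.

```python
import glob


def build_import_jobs(pattern, read_template):
    """Return one block of Statistics syntax per file matching pattern.

    read_template: the GET DATA command pasted from the Database Wizard for
    one file, with that file's path replaced by the literal text {path}
    (a convention invented for this sketch).
    """
    jobs = []
    for index, path in enumerate(sorted(glob.glob(pattern))):
        # Statistics accepts forward slashes in Windows paths.
        read_cmd = read_template.replace("{path}", path.replace("\\", "/"))
        if index == 0:
            # The first file starts the combined dataset.
            lines = [read_cmd, "DATASET NAME combined."]
        else:
            # Later files are read into a scratch dataset and appended.
            lines = [
                read_cmd,
                "DATASET NAME incoming.",
                "DATASET ACTIVATE combined.",
                "ADD FILES /FILE=* /FILE=incoming.",
                "EXECUTE.",
                "DATASET CLOSE incoming.",
            ]
        jobs.append("\n".join(lines))
    return jobs


if __name__ == "__main__":
    # Placeholder only: paste the real GET DATA command generated for your driver.
    template = "GET DATA /TYPE=ODBC /CONNECT='<connection string for {path}>' /SQL='<query>'."
    for job in build_import_jobs("c:/myxmldata/*.xml", template):
        print(job)  # inside Statistics: import spss; spss.Submit(job)
```

Appending one file at a time means one data pass per file. With tens of thousands of files, appending in larger batches would cut the number of passes considerably.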
|
Jon, I have to import data from an XML data source, too, but I haven't been able to locate an appropriate driver in the SPSS Data Access Pack. I use SPSS version 22 and DAP version 7.1. Which of the drivers should I use?

Moreover, there are XML data standards defined by the CDISC organization that are becoming increasingly important in clinical research (please see my previous post at http://spssx-discussion.1045642.n5.nabble.com/CDISC-standards-td5716047.html). Has SPSS ever considered providing a solution for this (SAS has ...)?

Best,
Andreas |
|
----- Original Message -----
> From: Andreas Voelp <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Tuesday, April 1, 2014 10:54 AM
> Subject: Re: [SPSSX-L] Building a dataset from data stored in .XML format
>
> Moreover, there are XML data standards defined by the CDISC organization that
> are becoming increasingly important in clinical research (please see my
> previous post under
> http://spssx-discussion.1045642.n5.nabble.com/CDISC-standards-td5716047.html
> ). Has SPSS ever considered providing a solution for this (SAS has ...)?

Hi,

I do not know CDISC other than by name. There are a lot of XML formats out there whose goals at least partly overlap: HL7, OpenEHR, SDMX. Would you say that CDISC is somehow more important or popular? HL7 is very widely used (maybe CDISC is a subset of HL7?), so why not have importers for that format too? Just curious.

Regards,
Albert-Jan |
