Dear all,
is there syntax to calculate the mean absolute difference of a distribution (see http://en.wikipedia.org/wiki/Mean_difference )? I need it to calculate the mean absolute difference in life expectancies around the world. Your help is greatly appreciated!
Best regards, Peter
Administrator
|
Here's a start:
http://spssx-discussion.1045642.n5.nabble.com/TIP-Cartesian-product-using-MATRIX-td5719597.html
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by PeterMuenchen
Simple Python solution with example. Just change the variable name at the bottom. The code could be elaborated to visit each pair only once, halving the number of absolute differences computed, but the time saving would be tiny.
begin program.
# calculate Mean Average Deviation.
# Assumes that data fit in memory.
# Cases with missing values are omitted.
import spss, spssdata

def mad(x):
    """Return MAD for variable x"""
    dta = spssdata.Spssdata(x, names=False, omitmissing=True).fetchall()
    ncases = len(dta)
    xsum = 0.
    for i in range(ncases):
        for j in range(ncases):
            xsum += abs(dta[i][0] - dta[j][0])
    print("Mean Absolute Deviation of %s: %s" % (x, xsum/(ncases*ncases)))

mad("salary")
end program.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: PeterMuenchen <[hidden email]>
To: [hidden email]
Date: 02/27/2015 10:43 AM
Subject: [SPSSX-L] Calculation of Mean Absolute Difference in SPSS
Sent by: "SPSSX(r) Discussion" <[hidden email]>

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Calculation-of-Mean-Absolute-Difference-in-SPSS-tp5728831.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
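For anyone wanting to try the remark about doing only half the work, here is a minimal sketch in plain Python, outside SPSS. The function names and the sample list are made up for illustration and are not part of Jon's program. It assumes the values are already in a list, and it checks that visiting each unordered pair once and doubling the sum gives the same result as the full double loop.

```python
def mean_abs_diff_full(values):
    """Loop over all n*n ordered pairs and divide by n**2."""
    n = len(values)
    total = 0.0
    for a in values:
        for b in values:
            total += abs(a - b)
    return total / (n * n)


def mean_abs_diff_half(values):
    """Visit each unordered pair (i < j) once, then double the sum.

    The i == j pairs contribute zero, so they can be skipped.
    """
    n = len(values)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(values[i] - values[j])
    return 2.0 * total / (n * n)


# Hypothetical sample, not data from the thread.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mean_abs_diff_full(sample), mean_abs_diff_half(sample))
```

Both functions still do quadratic work, which matches the comment that the saving is small.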
Dear Jon, dear all,
thanks a lot for your reply. Actually, it's not the mean average deviation I'm trying to calculate, but the absolute mean difference (also known as mean difference or mean absolute difference; see the Wikipedia article on it: http://en.wikipedia.org/wiki/Mean_difference ). It is used, among other things, for the analysis of absolute inequality in health outcomes such as life expectancy between different countries. It is the mean of the absolute differences between any pair of countries. I'd be very grateful for any ideas on how this could be calculated in SPSS!
best, peter
Peter, I wonder if you'd have an easier time calculating it via R. E.g.,
http://www.inside-r.org/packages/cran/lmomco/docs/gini.mean.diff
The Wikipedia page you gave said "Gini mean difference" is another name for what you're after. But it also suggested that one requires two variables (X and Y, independently & identically distributed) to calculate it. The inside-r link above, on the other hand, makes it look like something that can be computed for a single variable. So I have to confess, I'm a bit confused! Good luck!
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by PeterMuenchen
To put everybody on the same page: are you wanting to compute this using the formula (equation 5) on the linked page? Yes?

MD = (Sum [i=1 to n] (Sum [j=1 to n] abs(y(i)-y(j))))/n**2

I think this might be most easily done in MATRIX-END MATRIX. If your sample weren't "too big", another way would be to do a CASESTOVARS on a file containing only y (or FLIP a file containing only y), then vectorize y and embed the abs function in a double loop structure.

Vector y=y1 to y2000.
Compute md=0.
Loop #i=1 to 2000.
Loop #j=1 to 2000.
Compute md=md+abs(y(#i)-y(#j)).
End loop.
End loop.
Compute md=md/2000**2.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of PeterMuenchen
Sent: Friday, February 27, 2015 3:38 PM
To: [hidden email]
Subject: Re: Calculation of Mean Absolute Difference in SPSS
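If the sample is "too big" for a double loop, the same quantity can be obtained from the sorted values. The sketch below is plain Python and is not from any post in this thread; the function name is hypothetical. It relies on the fact that, after sorting, the value at 0-based rank k is the larger member of k pairs and the smaller member of n-1-k pairs, so its net coefficient in the sum over unordered pairs is 2k - n + 1.

```python
def mean_abs_diff_sorted(values):
    """Mean absolute difference (divisor n**2) via sorting, O(n log n).

    After sorting, the value at 0-based rank k is added in k pairs and
    subtracted in n - 1 - k pairs, giving the coefficient 2k - n + 1.
    """
    xs = sorted(values)
    n = len(xs)
    # Sum of (larger - smaller) over all unordered pairs.
    half_sum = sum((2 * k - n + 1) * x for k, x in enumerate(xs))
    # Each unordered pair appears twice in the double sum of the formula.
    return 2.0 * half_sum / (n * n)


# Hypothetical sample, not data from the thread.
print(mean_abs_diff_sorted([3.0, 1.0, 5.0, 2.0, 4.0]))
```

Tied values need no special handling because their pairwise differences are zero whichever order the sort leaves them in.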
In reply to this post by David Marso
Code from cited post.

MATRIX.
COMPUTE x={1;2;3;4;5}.
COMPUTE XX={KRONEKER(X,MAKE(NROW(X),1,1)),KRONEKER(MAKE(NROW(X),1,1),X)}.
PRINT XX.
END MATRIX.

/*Adapted for MAD*/.
DEFINE !mad (!POS !TOKENS(1))
MATRIX.
GET y / FILE * /VAR !1 /MISSING=OMIT.
COMPUTE nY=NROW(y).
COMPUTE yy={KRONEKER(y,MAKE(nY,1,1)),KRONEKER(MAKE(nY,1,1),y)}.
COMPUTE mad=CSUM(ABS(yy(:,1)-yy(:,2)))/(nY*nY).
PRINT mad /FORMAT "F10.6".
END MATRIX.
!ENDDEFINE.

GET FILE='G:\SPSS_InstallDir\Ver22\Samples\English\Employee data.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
!mad salary.
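For readers who do not use MATRIX, the two KRONEKER calls build a two-column table holding every ordered pair of values. A rough plain-Python analogue, assuming nothing beyond the standard library, is sketched below; the function names are hypothetical, and itertools.product stands in for the Kronecker construction rather than reproducing it.

```python
from itertools import product


def cartesian_pairs(values):
    """All n*n ordered pairs, in the same row order as the XX matrix:
    the first element varies slowest, the second cycles fastest."""
    return list(product(values, repeat=2))


def mean_abs_diff_pairs(values):
    """Average |a - b| over every ordered pair (divisor n**2)."""
    pairs = cartesian_pairs(values)
    return sum(abs(a - b) for a, b in pairs) / float(len(pairs))


# Same toy vector as in the MATRIX example.
x = [1, 2, 3, 4, 5]
print(cartesian_pairs(x)[:6])
print(mean_abs_diff_pairs(x))
```

Like the MATRIX version, this materialises all n*n pairs in memory, so it is only sensible for modest n.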
In reply to this post by PeterMuenchen
Peter,
It appears that Jon's code does precisely this! So do my MATRIX program and Gene's VECTOR approach (which requires a CASESTOVARS).
HTH, David
In reply to this post by David Marso
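Reviewing the Wikipedia article (see the section on sample estimates near the bottom), the sample estimate divides by n(n-1) rather than n**2. In the !mad macro, replace
COMPUTE mad=CSUM(ABS(yy(:,1)-yy(:,2)))/(nY*nY).
with
COMPUTE mad=CSUM(ABS(yy(:,1)-yy(:,2)))/(nY*(nY-1)).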
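To see what the change of divisor does, here is a small plain-Python sketch with hypothetical function names. The numerator is identical in both cases; only the denominator differs, so the n(n-1) version is always n/(n-1) times the n**2 version. Dividing by n(n-1) amounts to averaging over pairs of distinct cases, since the n self-pairs contribute zero to the sum.

```python
def abs_diff_sum(values):
    """Sum of |a - b| over all n*n ordered pairs (self-pairs add zero)."""
    return float(sum(abs(a - b) for a in values for b in values))


def md_divisor_n_squared(values):
    """Divide by n**2, as in the original macro."""
    n = len(values)
    return abs_diff_sum(values) / (n * n)


def md_divisor_n_nminus1(values):
    """Divide by n*(n-1), i.e. average over pairs of distinct cases."""
    n = len(values)
    return abs_diff_sum(values) / (n * (n - 1))


# Hypothetical sample, not data from the thread.
y = [1, 2, 3, 4, 5]
print(md_divisor_n_squared(y), md_divisor_n_nminus1(y))
```

For a few hundred countries the two versions differ by well under one percent, but for very small samples the choice matters.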
In reply to this post by David Marso
Another approach using MATRIX.

DEFINE !Mad (!POS !TOKENS (1))
MATRIX.
GET y / FILE * / VARIABLES !1 /MISSING=OMIT.
COMPUTE N=NROW(y).
COMPUTE AbsSum=0.
LOOP i=1 TO N.
+ LOOP j=1 TO N.
+   COMPUTE AbsSum=AbsSum + ABS(y(i,1) - y(j,1)).
+ END LOOP.
END LOOP.
COMPUTE MAD=AbsSum/(N*(N-1)).
PRINT MAD.
END MATRIX.
!ENDDEFINE.
!Mad Salary.
The Python solution proposed by Jon works fine. Is it possible to introduce a weighting variable into it? (The normal SPSS weighting is not recognised by the Python solution. I have to weight by the population size of the countries I'm comparing.)
Thanks a lot! Peter
Here's a version that takes a second variable and a weight variable as additional parameters.

begin program.
# calculate Mean Average Deviation.
# Assumes that data fit in memory.
# Cases with missing values are omitted.
import spss, spssdata

def mad(x, y, w):
    """Return MAD for variable x vs y, weighted by w"""
    dta = spssdata.Spssdata([x, y, w], names=False, omitmissing=True).fetchall()
    ncases = len(dta)
    xsum = 0.
    wsum = 0.
    for i in range(ncases):
        for j in range(ncases):
            xsum += dta[i][2] * dta[j][2] * abs(dta[i][0] - dta[j][1])
            wsum += dta[i][2] * dta[j][2]
    print("Mean Absolute Deviation of %s, %s weighted by %s: %s" % (x, y, w, xsum/(wsum)))

mad("salary", "salbegin", "w")
end program.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: PeterMuenchen <[hidden email]>
To: [hidden email]
Date: 02/28/2015 05:12 AM
Subject: Re: [SPSSX-L] Calculation of Mean Absolute Difference in SPSS
Sent by: "SPSSX(r) Discussion" <[hidden email]>
What would be the Python code for one variable, but with weighting? And is it possible to get the result not printed in the output window, but as a new variable in the file?
The version I would need the weighting for is the one-variable Python solution, mad(x), from Jon's first reply.
The result is a scalar, so it doesn't really fit with the casewise data, but it's easy to assign it to every case.
This is the last variation I'm going to do.

* Encoding: UTF-8.
begin program.
# calculate Mean Average Deviation.
# Assumes that data fit in memory.
# Cases with missing values are omitted.
import spss, spssdata

def mad(x, w):
    """Return MAD for variable x weighted by w"""
    dta = spssdata.Spssdata([x, w], names=False, omitmissing=True).fetchall()
    ncases = len(dta)
    xsum = 0.
    wsum = 0.
    for i in range(ncases):
        for j in range(ncases):
            xsum += dta[i][1] * dta[j][1] * abs(dta[i][0] - dta[j][0])
            wsum += dta[i][1] * dta[j][1]
    madvalue = xsum/(wsum)
    print("Mean Absolute Deviation of %s weighted by %s: %s" % (x, w, madvalue))
    spss.Submit("""compute MAD = %s""" % madvalue)

mad("salary", "w")
end program.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: PeterMuenchen <[hidden email]>
To: [hidden email]
Date: 02/28/2015 09:13 AM
Subject: Re: [SPSSX-L] Calculation of Mean Absolute Difference in SPSS
Sent by: "SPSSX(r) Discussion" <[hidden email]>
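One way to read the weighted formula: with whole-number weights it gives the same answer as the unweighted n**2 version applied to a file in which each case is repeated w times. The plain-Python sketch below illustrates that reading under stated assumptions; the function names and the toy values and weights are invented for the example and do not come from the thread.

```python
def weighted_mean_abs_diff(values, weights):
    """Weighted mean absolute difference: each pair (i, j) counts
    with weight w_i * w_j, and the divisor is the sum of those weights."""
    num = 0.0
    den = 0.0
    for xi, wi in zip(values, weights):
        for xj, wj in zip(values, weights):
            num += wi * wj * abs(xi - xj)
            den += wi * wj
    return num / den


def unweighted_mean_abs_diff(values):
    """Unweighted version with divisor n**2."""
    n = len(values)
    return sum(abs(a - b) for a in values for b in values) / float(n * n)


def replicate(values, counts):
    """Repeat each value count times (whole-number weights only)."""
    out = []
    for x, c in zip(values, counts):
        out.extend([x] * c)
    return out


# Toy data, not life expectancies from the thread.
x = [1.0, 2.0, 4.0]
w = [2, 1, 3]
print(weighted_mean_abs_diff(x, w))
print(unweighted_mean_abs_diff(replicate(x, w)))
```

Population weights are of course not small integers, but the formula does not require that; the replication check is only there to make the meaning of the weighting concrete.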
Weighted MAD using MATRIX:
Edited: Changed matrix name yyww to ywyw (more appropriate).

DEFINE !MADW (!POS !TOKENS(1) /!POS !TOKENS(1))
PRESERVE.
CD '%userprofile%\Desktop' .
DATASET DECLARE @MAD.
MATRIX.
GET yw / FILE * /VARIABLES !1 !2 /MISSING=OMIT.
COMPUTE ywyw={KRONEKER(yw,MAKE(NROW(yw),1,1)),KRONEKER(MAKE(NROW(yw),1,1),yw) }.
SAVE {1, T(ABS(ywyw(:,1)-ywyw(:,3))) * (ywyw(:,2) &* ywyw(:,4) )/(T(ywyw(:,2)) * ywyw(:,4))}
  /OUTFILE @MAD/ VARIABLES @link MAD.
END MATRIX.
RESTORE.
!ENDDEFINE.

* Get your data first.
DATASET NAME rawdata.
COMPUTE @link=1.
!MADW y w .
MATCH FILES FILE rawdata/TABLE @MAD / BY @link.
EXECUTE.
DELETE VARIABLES @link.
DATASET CLOSE @MAD.