SPSSX Discussion

Re: Using SPSS-Linear Regression to develop Mortgage Models for a financial institution

Quentin Zavala

Re: Using SPSS-Linear Regression to develop Mortgage Models for a financial institution

Hello,

I'm reaching out to this community because I'm trying to identify which customers have the best chance of acquiring a mortgage.

Initially I used stepwise OLS regression in SPSS version 19 to select a model, scored the dataset with the regression equation, sorted the data in descending order of score, and grouped it into equal deciles. I then used these groups to build a Gain Chart like the one shown below. The regression equation used for the model explained only about 14% of the variance (adjusted R², the coefficient of determination) in the dependent variable (i.e., First Time Mortgagees vs. Non Mortgagees). I'm trying to develop a predictive model to see which variables best predict the outcomes of mortgage applications, for the Marketing Department of a credit union.
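The score, sort, and group steps described above can be sketched as follows. This is only an illustration in Python rather than SPSS syntax; the function names and the toy data are hypothetical, and `n_groups` would be 10 for deciles or 20 for a table with twenty groups.

```python
def assign_groups(scores, n_groups=10):
    """Return a group number (1 = highest scores) for each record.

    Records are ranked by descending score and cut into n_groups
    near-equal-sized groups.
    """
    n = len(scores)
    # Indices of records ordered from highest to lowest predicted score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    groups = [0] * n
    for rank, idx in enumerate(order):
        # rank * n_groups // n maps ranks 0..n-1 onto 0..n_groups-1.
        groups[idx] = rank * n_groups // n + 1
    return groups


def responders_by_group(groups, outcomes, n_groups=10):
    """Count members and responders (outcome == 1) in each group."""
    members = [0] * n_groups
    responders = [0] * n_groups
    for g, y in zip(groups, outcomes):
        members[g - 1] += 1
        responders[g - 1] += y
    return members, responders


# Hypothetical toy data: 20 scored records, 1 = has a mortgage.
scores = [0.95, 0.10, 0.80, 0.30, 0.60, 0.20, 0.70, 0.40, 0.90, 0.50,
          0.15, 0.85, 0.25, 0.75, 0.35, 0.65, 0.45, 0.55, 0.05, 0.99]
outcomes = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0,
            0, 1, 0, 0, 0, 1, 0, 0, 0, 1]

groups = assign_groups(scores, n_groups=5)
members, responders = responders_by_group(groups, outcomes, n_groups=5)
print(members)
print(responders)
```

A model that ranks well concentrates the responders in the first groups, which is what the Gain Chart is meant to show.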

My former boss, who has 40 years of experience in Direct Marketing, told me that it isn't important how much variance is explained; what I should look for instead is that the top decile is more than 10 times the bottom decile.

My basic question is whether this is a viable approach, given how I was told to develop the model using OLS regression. The dependent variable was Mortgage Balance (currency). Would it be better to use Logistic Regression rather than Linear Regression, given the research question (which variables predict obtaining a mortgage versus being denied)?
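To make the linear-versus-logistic question concrete, here is a minimal sketch, assuming a single hypothetical predictor and made-up 0/1 data (not the credit union's). It fits both a least-squares line and a logistic curve to a yes/no outcome. The gradient-ascent fitter is a teaching stand-in and is not claimed to be what SPSS does internally.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """One-predictor logistic regression by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def fit_ols(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: x is some predictor, y is 1 if the member has a mortgage.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

a0, a1 = fit_ols(xs, ys)
b0, b1 = fit_logistic(xs, ys)

ols_fitted = [a0 + a1 * x for x in xs]
logit_fitted = [sigmoid(b0 + b1 * x) for x in xs]

for x, lin, log in zip(xs, ols_fitted, logit_fitted):
    print(f"x={x}  OLS={lin:6.3f}  logistic={log:5.3f}")
```

In this toy data the straight line gives fitted values below 0 at the low end and above 1 at the high end, while the logistic values always stay between 0 and 1. For ranking members into groups, both models may still order the records similarly.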

Any insights are highly appreciated, because I'm new to Gain Charting and to ignoring R² when using regression analysis to develop models.

Group	Members	# of Mortgages	% Mortgages in Group	Cum # of Mortgages	Cum % of Mortgages	Gain	Mail Potential
1	19,820	4,272	21.55%	4,272	21.55%	630%	15,548
2	20,569	2,247	10.92%	6,519	16.14%	447%	18,322
3	18,629	1,284	6.89%	7,803	13.22%	348%	17,345
4	19,306	877	4.54%	8,680	11.08%	275%	18,429
5	17,183	592	3.45%	9,272	9.71%	229%	16,591
6	21,919	508	2.32%	9,780	8.33%	182%	21,411
7	18,488	322	1.74%	10,102	7.43%	152%	18,166
8	21,251	303	1.43%	10,405	6.62%	124%	20,948
9	17,268	207	1.20%	10,612	6.08%	106%	17,061
10	21,420	203	0.95%	10,815	5.52%	87%	21,217
11	20,719	171	0.83%	10,986	5.07%	72%	20,548
12	18,123	130	0.72%	11,116	4.74%	60%	17,993
13	17,841	107	0.60%	11,223	4.44%	51%	17,734
14	19,804	140	0.71%	11,363	4.17%	41%	19,664
15	20,540	95	0.46%	11,458	3.91%	32%	20,445
16	20,366	71	0.35%	11,529	3.68%	25%	20,295
17	20,403	60	0.29%	11,589	3.47%	18%	20,343
18	20,985	55	0.26%	11,644	3.28%	11%	20,930
19	20,793	41	0.20%	11,685	3.11%	5%	20,752
20	20,317	0	0.00%	11,685	2.95%	0%	20,317
Total	395,744	11,685	2.95% (overall mortgage rate)	11,685
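The post does not define the table's columns, but the figures appear consistent with the arithmetic below. This sketch recomputes the first three rows from the member and mortgage counts. The definitions of Gain (cumulative rate divided by the overall rate, minus 1) and Mail Potential (members minus mortgages) are inferred from the numbers, not stated by the poster.

```python
# (members, mortgages) for groups 1-3 and the file totals, copied from the table.
groups = [(19820, 4272), (20569, 2247), (18629, 1284)]
total_members, total_mortgages = 395744, 11685

overall_rate = total_mortgages / total_members  # about 2.95%

rows = []
cum_members = cum_mortgages = 0
for members, mortgages in groups:
    cum_members += members
    cum_mortgages += mortgages
    group_rate = mortgages / members          # "% Mortgages in Group"
    cum_rate = cum_mortgages / cum_members    # "Cum % of Mortgages"
    gain = cum_rate / overall_rate - 1        # "Gain" (inferred definition)
    mail_potential = members - mortgages      # "Mail Potential" (inferred)
    rows.append((group_rate, cum_mortgages, cum_rate, gain, mail_potential))

for group_rate, cum_n, cum_rate, gain, mail_potential in rows:
    print(f"{group_rate:.2%}  {cum_n:,}  {cum_rate:.2%}  {gain:.0%}  {mail_potential:,}")
```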

Thank you,

Quentin Zavala

SchoolsFirst Federal Credit Union
Business Analyst, Research and Analytics

714-258-4000 ext 8601

qzavala[hidden email]

[hidden email]

Rich Ulrich

Re: Using SPSS-Linear Regression to develop Mortgage Models for a financial institution

I'm new to Gain Charting, too, but I have a lot of experience
with OLS regression, doing them and describing them.
Here are some of my insights.

About the apparent results.
An R^2 of 0.14 is pretty fair for a dichotomous outcome, though
that sort of statement always depends on What is Possible or
What is Useful. Using the top 10% versus the bottom 10% to judge
the usefulness seems like a good approach -- especially if that is
how it is going to be used. And it has long been my opinion that
screening applications of this sort should probably focus on the
extremes -- especially to exclude the "worst" before considering
other criteria. It sounds like there ought to be quite a few examples
available elsewhere, in order to judge these results.
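As a rough check of the top-10% versus bottom-10% criterion against the posted table, here is a small sketch. The table has 20 groups, so it treats groups 1-2 as roughly the top 10% of the file and groups 19-20 as roughly the bottom 10%; that pairing is an assumption made for illustration.

```python
# (members, mortgages) copied from the posted table.
top = [(19820, 4272), (20569, 2247)]   # groups 1 and 2
bottom = [(20793, 41), (20317, 0)]     # groups 19 and 20


def rate(groups):
    """Mortgage rate across the given groups combined."""
    members = sum(m for m, _ in groups)
    mortgages = sum(k for _, k in groups)
    return mortgages / members


top_rate = rate(top)        # 6,519 / 40,389
bottom_rate = rate(bottom)  # 41 / 41,110
ratio = top_rate / bottom_rate

print(f"top {top_rate:.2%}, bottom {bottom_rate:.2%}, ratio {ratio:.0f}x")
```

On these figures the ratio is well above the 10-to-1 rule of thumb quoted in the original post.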

About the methodology.
I always flinch when I see "stepwise" because of the problems
inherent in those approaches. See
http://www.stata.com/support/faqs/stat/stepwise.html
Your N of 200 000 eliminates the questions about the F-tests
being invalid, but it does nothing about the questions of biases
and collinearity.

It is a good idea to use sub-samples in order to create replications,
to show the validity. You might do cross-validation by repeating
your methodology with 10 random sub-samples, each 1/10th of the
original, and fitting the equations *outside* the deriving sample.
That would be conventional and fairly convincing.
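The sub-sampling idea above might look like the sketch below, under some loud assumptions: the data are simulated, and a one-predictor least-squares line stands in for the full stepwise procedure, which would be rerun inside each sub-sample in practice. All function names are hypothetical.

```python
import random


def random_subsamples(n, k=10, seed=0):
    """Shuffle record indices 0..n-1 and deal them into k disjoint sub-samples."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the given line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Simulated data: one predictor x and a noisy outcome y.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(1000)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

results = []
for sample in random_subsamples(len(x), k=10):
    inside = set(sample)
    outside = [i for i in range(len(x)) if i not in inside]
    # Derive the equation on the sub-sample only ...
    b0, b1 = fit_line([x[i] for i in sample], [y[i] for i in sample])
    # ... then judge it on the records outside the deriving sample.
    results.append(r_squared([x[i] for i in outside],
                             [y[i] for i in outside], b0, b1))

print([round(r, 3) for r in results])
```

If the ten out-of-sample figures are close to each other and to the in-sample value, that is some evidence the equation replicates. The same loop could compare top-versus-bottom group rates instead of R^2.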

However, in order to further reduce the chance of irrelevant
biases, it could be wise to do some *non*-random sub-sampling.
- Does a formula created from one region of the country (say)
replicate when applied to data from another region? ... and so on.

The Gain Chart looks useful, but your description does leave me
wondering what you were regressing. It *seems* to me that
you say that you are trying to predict whether a mortgage was
granted, but that you are using some other, continuous variable
(amount) as DV in a regression. That doesn't seem to be a
problem if the Gain Chart is useful and intelligible, except that
the eventual write-up should be clearer on what was done.

--
Rich Ulrich

Date: Tue, 22 May 2012 21:54:30 +0000
From: [hidden email]
Subject: Re: Using SPSS-Linear Regression to develop Mortgage Models for a financial institution
To: [hidden email]
