Statistical modelling question



speedy_thrills

Original Poster:

7,760 posts

243 months

Saturday 20th January 2018
First, I apologise if this is the wrong place to post this question. I did consider the "finance" section, but most of the questions there pertain to personal finance.

I'm normally a "people manager" (although really I'm just a subject matter expert for a team), but I've been asked to look at a small fixed-rate residential loan portfolio with a consistently very high churn rate. Unfortunately, it has come to light that little consideration had been given to managing customer churn in the portfolio, which we know is poor by external measures (acquisition is excellent, retention is terrible). We have a range of historic data: all the usual loan data (e.g. amounts, repayment rates) and a few peripherals, such as whether early exit fees have previously been calculated, whether discounts and incentives were offered previously, whether a client has been retained using incentives, and monthly reporting on how many customers have refinanced with other providers.

Every month we generate a report of fixed-rate agreements that are about to expire, but those customers are infrequently contacted. What I want to do is build a simple model so that we can pick out the residential loans coming due that have a higher probability of refinancing with another financial services provider.

What sort of modelling do you think I should look into initially? I studied a STEM subject at university, but it was a few years ago now and I did not take higher-level statistics courses, so your assistance and/or thoughts would be very much appreciated.

V8LM

5,174 posts

209 months

Saturday 20th January 2018
You could look at a multivariate analysis (partial least squares, principal component analysis, multivariate curve resolution), where you look for a model that takes all your input data and predicts the refinancing probability. The value of this depends on whether there is indeed any relationship, and on how large and accurate your historical data is. Or you could use a machine learning method, such as a neural network.

Jag_luvver

81 posts

77 months

Saturday 20th January 2018
Neural networks seem to be 'de rigueur' at the moment. This document seems to give a good overview of stuff that should be relevant:

https://www.cc.gatech.edu/~isbell/tutorials/rbf-in...

The intro highlights some search terms that you could chuck into google scholar (i.e. different names for doing the same thing), and there seem to be some good examples that should help to apply RBF networks to your data.

speedy_thrills

Original Poster:

7,760 posts

243 months

Saturday 20th January 2018
Thanks, I'll investigate both of those and read through that document.

(steven)

448 posts

214 months

Saturday 20th January 2018
Logistic regression is probably the best approach: it is well known, and it models a binary outcome (the customer stays or leaves), which is exactly what churn is.

If you work in finance, it's also the approach commonly used to forecast delinquent loans, so you may already have some SMEs in the business.

There are more glamorous techniques, but logistic regression is probably the most understood in this context.
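To make that a bit more concrete, here is a minimal sketch of the idea in plain Python, not a production model. Everything in it is an assumption for illustration: the data is synthetic, the two features (`rate_gap`, `had_incentive`) and the loan names are hypothetical, and the fitting is done by simple gradient descent rather than by a statistics package. It fits a logistic model on past outcomes, then ranks loans that are about to expire by predicted probability of leaving.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    # w[0] is the intercept, w[1:] are the feature coefficients
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit coefficients by batch gradient descent on the average log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic history (made up, NOT real portfolio data).
# rate_gap: our renewal rate minus a market rate, in percentage points (hypothetical)
# had_incentive: 1 if the customer was previously retained with an incentive (hypothetical)
random.seed(0)
X, y = [], []
for _ in range(400):
    rate_gap = random.uniform(-1.0, 2.0)
    had_incentive = random.choice([0, 1])
    true_p = sigmoid(-1.0 + 1.5 * rate_gap - 1.0 * had_incentive)
    X.append([rate_gap, had_incentive])
    y.append(1 if random.random() < true_p else 0)  # 1 = refinanced elsewhere

w = fit_logistic(X, y)

# Score this month's expiring loans (hypothetical) and rank highest risk first
expiring = {"loan_A": [1.8, 0], "loan_B": [-0.5, 1], "loan_C": [0.6, 0]}
ranked = sorted(expiring, key=lambda k: predict_proba(w, expiring[k]), reverse=True)
print(ranked)
```

The sign of each coefficient in `w` tells you which direction a variable pushes the churn probability, which is a large part of why this method is easy to explain to the business. In practice you would use a proper package and hold back some data to check the predictions.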

V8LM

5,174 posts

209 months

Saturday 20th January 2018
‘R’ is possibly the best way to test and try things out. Plenty of tutorials and examples on the web.

nammynake

2,589 posts

173 months

Sunday 21st January 2018
As above, assuming you have data for each account (i.e. key variables that you suspect are good indicators of whether a customer will leave) then logistic regression is perfectly suited to this kind of problem. It's the method used to build application scorecards.

speedy_thrills

Original Poster:

7,760 posts

243 months

Monday 22nd January 2018
Logistic regression does look like a good start. After reading a few case studies, it looks like kNN is often the next step on from that.
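For anyone following along, this is roughly how I understand kNN (k-nearest neighbours) to work, as a small sketch in plain Python. The data points, the two features, and the function name `knn_churn_share` are all made up for illustration. The idea is to find the k past customers most similar to the one being scored and use the share of them that left as a rough churn estimate. The features are assumed to be already scaled to a 0-1 range, since kNN relies on distances and an unscaled variable such as loan amount would otherwise dominate.

```python
import math

def knn_churn_share(train_X, train_y, x, k=3):
    """Share of the k nearest past customers (Euclidean distance) who churned."""
    # Pair each past customer's distance to x with their outcome, nearest first
    neighbours = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    nearest = neighbours[:k]
    return sum(label for _, label in nearest) / len(nearest)

# Made-up, pre-scaled history: [rate_gap_scaled, tenure_scaled] (hypothetical features)
train_X = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.00],  # these customers left
    [0.20, 0.90], [0.10, 0.80], [0.30, 1.00],  # these customers stayed
]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = refinanced elsewhere, 0 = retained

likely_leaver = knn_churn_share(train_X, train_y, [0.70, 0.10])
likely_stayer = knn_churn_share(train_X, train_y, [0.20, 0.70])
print(likely_leaver, likely_stayer)
```

Unlike logistic regression, this gives no coefficients to interpret, so it seems better suited as a comparison against the regression than as a replacement for it.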

Toaster

2,939 posts

193 months

Tuesday 23rd January 2018
speedy_thrills said:
Logistic regression does look like a good start. After reading a few case studies, it looks like kNN is often the next step on from that.
Take your pick; there is a lot of software out there: https://en.wikipedia.org/wiki/List_of_statistical_...