Create a scatterplot of Age versus Income in RapidMiner, using color to differentiate customers who accepted the loan from those who did not.
INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS
MBA 738: Homework 3
- Case: Acceptance of Loan Offers
Universal Bank is a relatively young bank growing rapidly in terms of overall customer acquisition. The majority of these customers are liability customers (depositors) with varying sizes of relationship with the bank. The customer base of asset customers (borrowers) is quite small, and the bank is interested in expanding this base rapidly to bring in more loan business. In particular, it wants to explore ways of converting its liability customers to personal loan customers (while retaining them as depositors).
A campaign that the bank ran last year for liability customers showed a healthy conversion rate of over 9%. This has encouraged the retail marketing department to devise smarter campaigns with better target marketing. Your goal is to model the previous campaign’s customer behavior to analyze what combination of factors makes a customer more likely to accept a personal loan offer. This will serve as the basis for the design of a new campaign.
The file UniversalBank.xls contains data on 5000 liability customers of Universal Bank who were targeted in the previous personal loan campaign. The data include customer demographic information (age, income, etc.), the customer’s relationship with the bank (mortgage, securities account, etc.), and the customer response to the last campaign (Personal Loan). A 1 in the Personal Loan column indicates the loan offer was accepted. The descriptions of the variables are in the Description worksheet in the file. (Read the descriptions to get a better idea about the variables.)
Use the dataset to answer the following questions.
- Create a scatterplot of Age versus Income in RapidMiner, using color to differentiate customers who accepted the loan from those who did not. Which variable (i.e., age or income) appears to be potentially more useful in classifying customers? Explain.
- Using RapidMiner, build a logistic regression model to classify customers into those who are likely to accept a personal loan offer and those who are not. Use all the available variables as predictors except ID and ZIP.
(Hint: Since the Logistic Regression operator expects a nominal target variable, when the target variable is numeric you will have to convert it to binominal by using the Numerical to Binominal operator.)
- Evaluate the predictive accuracy of the model using appropriate metrics. (Do not just provide the numbers; offer your own analysis of what you think of the model based on those numbers.)
- What was the default cutoff probability used by RapidMiner to generate the classifications?
- a) Assuming that the dataset contains a representative sample of the liability customers of the bank, if the bank randomly targeted 100 liability customers, how many of them would potentially accept a personal loan offer?
- b) Now, if the bank uses the predictive model you developed in Part (ii) to select 100 customers, how many of them would potentially accept a personal loan offer?
(Hint: Revise the process from Part (ii) to generate a lift chart.)
- c) Based on your responses to Parts a and b above, would your predictive model be useful to the bank for its purpose of trying to convert liability customers to personal loan customers?
- a) How good is your model in Part (ii) at identifying the potential positive responders, i.e., what percentage of the customers who accepted the loan were correctly classified by the model?
- b) Suppose the bank is interested in improving the accuracy of identifying the potential positive responders (i.e., those who would accept the loan offer) to at least 80%. Revise the model in Part (ii) to achieve this.
- c) Compare the predictive accuracy of this revised model with that of the model developed in Part (ii). (Again, try to be analytical instead of just providing the numbers.)
- Aside from the problem of predicting the likelihood of accepting loan offers, think of two other business problems where logistic regression can be utilized for predictive modeling. For each problem, identify a target variable and four possible predictors.
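
The questions above rely on three quantities: the cutoff probability that turns a predicted probability into a class label, the percentage of actual acceptors that the model identifies (sensitivity), and the lift obtained by targeting the highest-scored customers. The Python sketch below illustrates how these are computed outside RapidMiner. It is not a solution to the assignment: the ten scored records are made up for illustration and do not come from UniversalBank.xls, the function names are hypothetical, and the cutoffs 0.5 and 0.4 are example values only.

```python
# Hypothetical toy data: (predicted probability of acceptance, actual outcome).
# These values are invented for illustration; they are NOT from UniversalBank.xls.
scored = [
    (0.95, 1), (0.85, 1), (0.70, 0), (0.60, 1), (0.45, 1),
    (0.40, 0), (0.30, 0), (0.20, 1), (0.10, 0), (0.05, 0),
]


def classify(records, cutoff):
    """Label a record 1 when its predicted probability is at least the cutoff."""
    return [(1 if prob >= cutoff else 0, actual) for prob, actual in records]


def confusion_counts(labelled):
    """Return (TP, FP, FN, TN) for a list of (predicted, actual) pairs."""
    tp = sum(1 for pred, act in labelled if pred == 1 and act == 1)
    fp = sum(1 for pred, act in labelled if pred == 1 and act == 0)
    fn = sum(1 for pred, act in labelled if pred == 0 and act == 1)
    tn = sum(1 for pred, act in labelled if pred == 0 and act == 0)
    return tp, fp, fn, tn


def sensitivity(records, cutoff):
    """Share of actual acceptors that the model classifies as acceptors."""
    tp, _, fn, _ = confusion_counts(classify(records, cutoff))
    return tp / (tp + fn)


def accuracy(records, cutoff):
    """Share of all records whose predicted class matches the actual class."""
    tp, fp, fn, tn = confusion_counts(classify(records, cutoff))
    return (tp + tn) / (tp + fp + fn + tn)


def lift_at_k(records, k):
    """Acceptance rate among the k highest-scored records divided by the overall rate."""
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    top_rate = sum(actual for _, actual in ranked[:k]) / k
    base_rate = sum(actual for _, actual in records) / len(records)
    return top_rate / base_rate


if __name__ == "__main__":
    for cutoff in (0.5, 0.4):  # example cutoffs only
        print(cutoff, confusion_counts(classify(scored, cutoff)),
              sensitivity(scored, cutoff), accuracy(scored, cutoff))
    print("lift among top 4:", lift_at_k(scored, 4))
```

On this toy data, lowering the cutoff from 0.5 to 0.4 raises sensitivity from 0.6 to 0.8 while the number of false positives grows from 1 to 2 and overall accuracy stays at 0.7. This is the kind of trade-off the comparison between the original and revised models asks you to analyze. A lift above 1 for the top-ranked group means that model-based targeting finds proportionally more acceptors than random targeting.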