Dr. Carlos Eduardo Canfield Rivera
Universidad Anahuac-Mexico, Mexico
E-mail: Carlos.canfield@anahuac.mx
Submission: 07/09/2016
Revision: 23/09/2016
Accept: 08/06/2017
ABSTRACT
P2P
lending is a method of informal finance that uses the internet to connect
borrowers with on-line communities. Through limited participation of financial
intermediaries, P2P credit becomes a new model for unsecured loan origination.
The question framing this research is: What are the elements that could help
on-line financiers characterize default risk in these loans? It attempts to
advance knowledge about P2P default determinants, from the perspective of the
availability of information to on-line lending communities in emerging
countries. With the aid of a logistic regression model and data provided by
Mexican platforms and public information available to on-line investors, this
inquiry explores the effect of credit scores and other variables related to
loan and borrower characteristics on P2P default behavior. The results
show that information provided by the platform is relevant for analyzing
credit risk, yet not conclusive. Consistent with the literature, on a scale
running from the safest to the riskiest grade, riskier loan grades are associated
with higher default. Other determinants that increase the odds of default
are the payment-to-income ratio and having been refinanced on the same
platform. Conversely, loan purpose and being a female applicant reduce such
odds. Evidence from the sample shows that, under equal credit conditions, a
case for differential default behavior by gender, age or
geographical location could not be established. However, it was found that,
after controlling for loan quality, women have longer loan survival times than
men. This is one of the first studies about debt crowdfunding in Latin America
and Mexico. Implications for lenders, researchers and policy-makers are also
discussed.
Keywords: Peer-to-peer
lending, default risk, emerging countries, Mexico, loan survival, logistic
regression, differential default behavior
1. INTRODUCTION
New
means of project sourcing have flourished with the advent of the Web 2.0 and
the upcoming popularity of online communities (Iturbide; Canfield, 2015). On-line financing and the
development of crowdfunding platforms (CF) that evolved from the Fintech
ecosystem constitute an example of a disruptive business model innovation (Markides, 2006).
With
limited participation of financial intermediaries, online peer-to-peer (P2P) or
market-place lending becomes a new model for unsecured loan origination (Galloway, 2009), where anonymous
backers parcel out the amount loaned. P2P lending is argued to be a good alternative for the financing community: borrowers receive better credit conditions than in traditional finance, creditors take advantage of an investment model where risk is coupled to the credit rating of the funded loans (Bachmann et al., 2011), and lending websites benefit by charging fees for successfully completed transactions.
P2P lending began in the U.K. with the first online lending platform, Zopa (HULME; WRIGHT, 2006), and is developing apace around the world (Wan et al. 2016; Weiss et al. 2010; Berger; Gleisner, 2009). Various P2P platforms have evolved from the Mexican Fintech ecosystem; some of them operate nation-wide. The loan approval rate for these platforms is less than 5%, a common figure for this type of financial product (Hand; Henley, 1997).
The purpose of this article is to gain further understanding of applicants' delinquent credit behavior in P2P lending in emerging markets, particularly in the Mexican context.
This study addresses the following question: what are the elements that could help on-line financiers characterize default risk in these loans?
Platforms collect information about the applicants and feed it into their credit rating algorithm. As is the case with many on-line lending sites, each loan is classified according to its risk characteristics and assigned a corresponding loan rate believed to capture credit risk. Loan rates increase as the grade worsens. In our sample, loan quality is graded from A down to G, with three notches each: Grade A defines the highest-quality loans, charging 8.9% p.a., and Grade G, which contains the riskiest credits, charges 28.9% p.a. in the lowest notch.
Arguably, some of the most important sources of concern for lenders in credit markets derive from information asymmetry (IA), which stems from the fact that borrowers are better informed than lenders on at least two conditions that shape credit and re-investment risks: (i) the ability and willingness to repay the debt and (ii) the propensity for early payment. In this research, only credit risk arising from default behavior is analyzed.
Creditors benefit from knowing the true characteristics of borrowers, but moral hazard hampers direct information sharing (IS) between market participants. IS and monitoring activities lie at the base of financial intermediation: a stream in financial theory considers that, through IS, financial intermediaries can reduce the adverse consequences of private information problems and transaction costs.
Accordingly, the following argument constitutes the basis for the present study: in P2P lending, the platforms collect valuable information for understanding default behavior and share it with the lending community in an attempt to mitigate the adverse effects of IA, moral hazard and adverse selection.
Investors rely on the quality of the rating process, but “soft and self-reported” information also plays an important role in the screening process (Ruiz-Ugarte, 2010). Khwaja et al. (2009) found that lenders infer the most from standard banking “hard” information but also use non-standard information, particularly when it provides credible signals regarding the borrower's creditworthiness. Some lenders even go so far as to use looks and race as observable proxies for the determinants of default (Duarte et al. 2012; Ravina, 2012).
Statistical discrimination around appearance and popularity has not always proved effective in understanding delinquent behavior (Freedman; Jin, 2014; Ravina, 2012; Pope; Sydnor, 2011). The evidence warns that return-maximizing investors should be careful in interpreting social ties, appearance and other social characteristics alone when screening loan applicants.
Through the platform's web-page, backers have access to credit grades, direct messaging with borrowers and, allegedly, relevant information about the applicant's characteristics and loan conditions that will help them identify default risk. This is an important step toward allowing on-line lenders to overcome private information problems. In that sense, this inquiry studies, from the lender's perspective, the pertinence of the information provided to backers by the web-site.
In many ways, the present research seeks to contribute to the understanding of this new and fast-paced P2P ecosystem. For one thing, notwithstanding the existing literature on credit rating and its effect on on-line lending behavior (cf. Bachmann et al., 2011 for a survey), academic work in the field tends to concentrate geographically in developed countries such as the United States, Germany and other European countries, or in China, where P2P lending activities have flourished in the recent past (Wan et al., 2016).
Most studies use rich data sets from P2P platforms such as Prosper, Lending Club and Smava. To the best of our knowledge, this study is one of the few that analyze lending behavior in on-line communities in emerging countries, particularly in Latin America, where this activity is relatively new. From the perspective of lenders, academics and policy makers, this research is important because it is one of the first to consider the Mexican crowdfunding and P2P ecosystems, currently under construction.
This investigation identifies relevant variables based on collected demographic and financial characteristics and, using a multivariate regression analysis framework, estimates default probabilities and analyzes the effect of gender on delinquent behavior after controlling for loan quality. Finally, based on gender and age, this study examines the loan survival times of borrowers in order to provide a better view of the possibility of differential non-payer behavior in the Mexican P2P environment.
To preview our results, the model shows that information provided by the platforms and available to financiers is relevant for analyzing credit risk, yet not conclusive. The most important determinants that increase the odds of delinquent behavior, besides the credit score, are the ratio of monthly payment to income and having been refinanced on the same platform.
On the other hand, loan purpose and being female reduce such odds. Within credit ratings, being a female applicant is a relevant determinant of loan default only in credit grades C and F. Consequently, there is no conclusive evidence that, under equal credit characteristics, lenders might benefit from screening default behavior by gender in this sample. However, it was found that women have longer loan survival times, measured in months (9.2), than men (5.4).
Overall, at equal credit ratings, women have better default behavior, as measured by loan survival times alone. These results are in line with those found in the incipient literature on P2P lending and contribute to three streams of knowledge: the effects of case and statistical discrimination in financial services; the study of determinants of default and differential attitudes toward delinquency by gender and age; and research in the recent field of P2P lending.
The
remainder of the article is organized as follows. Section II provides an
overview of the existing literature. The generality of lending processes at P2P
platforms and data collected are described in Section III. Section IV provides
the research hypothesis and the test methodology. Section V reports the
estimation results of the logistic model and the survival tables and Section VI
concludes.
2. LITERATURE REVIEW
P2P lending is a new method of informal finance that began in the U.K. and is developing apace around the world. This model for loan origination uses the internet to directly connect borrowers with on-line communities. Various authors have studied the upsurge and evolution of market-place lending (BERGER; GLEISNER, 2009; BACHMANN et al., 2011).
With limited participation of financial intermediaries, these types of platforms facilitate unsecured micro-credits. Arguably, P2P lending gives lenders the opportunity to increase their income and offers debtors a means of accessing financing that would not be available under the strict approval requirements of traditional financial intermediaries (Zeng, 2013).
Conceptually,
financial intermediation (FI) theory sets the foundation for market-place
lending based on transaction costs (SCHOLES et al. 1976). It has been argued
that FI could be used for alleviating the effects of market failures such as
adverse selection and moral hazard (PAULY, 1974; AKERLOF, 1970; ALLEN;
SANTOMERO, 1997).
Hence, information asymmetry (IA) is perhaps one of the most important sources of concern in financial markets, leading to credit rationing (STIGLITZ; WEISS, 1981). IA stems from the fact that borrowers are better informed than lenders about their ability and willingness to repay the debt (credit risk) and the propensity for early payment (re-investment risk).
Creditors benefit from knowing the true characteristics of debtors, but moral hazard hampers direct IS between market participants (LELAND; PYLE, 1977), and verification by outside parties may be either costly or impossible (TOWNSEND, 1979). An important stream in financial theory considers that financial intermediaries provide the means to reduce the economic consequences of private information problems and transaction costs (DIAMOND; DYBVIG, 1983; DIAMOND, 1984; LEVINE et al. 2000).
The effect of IA on nonpayment behavior has been quantified by Karlan and Zinman (2009), who, in their study of South African credit markets, found that about 15% of default was due to private information problems. The literature suggests that IS may overcome adverse selection and reduce moral hazard by raising borrowers’ effort to repay loans (JAPPELLI; PAGANO, 2006) or by avoiding excessive lending when each borrower may patronize several banks (BENNARDO; PAGANO, 2007). Brown et al. (2009) showed that IS is associated with improved availability and lower cost of credit to firms in transition economies.
In the case of P2P lending, there is very little assurance on the part of the lender that the borrower, having possibly been rejected by other financial intermediaries, will repay the loan (ZENG, 2013).
Weiss et al. (2010) found evidence that the screening of potential borrowers is a major instrument in alleviating adverse selection, preventing the online market from collapsing. Information availability improves lender screening and dramatically reduces the default rate for high-risk credits, but has little effect on low-risk loans (MILLER, 2015).
Given the importance of credit rating, Moro et al. (2015) found that credit scoring is the main application used for predicting risk and supporting the loan approval process. In a joint effort with credit rating agencies, the platforms share valuable information for understanding default behavior with the lending communities. Investors rely on credit scoring and the quality of the rating process in market-place loans; nevertheless, “soft and self-reported” information also plays an important role in their internal decision-making systems (KHWAJA et al., 2009; FREEDMAN; JIN, 2014; DUARTE et al., 2012; RAVINA, 2012).
Based on a statistical discrimination approach (PHELPS, 1972; ARROW, 1973), lenders have the opportunity to check the debtor's credit grades, work and academic history, property, income-to-debt ratios and subjective information regarding use of credit, communication capacities and social ties in an attempt to separate payers from non-payers (RUIZ-UGARTE, 2010).
Moreover,
Zhang and Liu (2012) in their study about the herding behavior among backers at
Prosper found that investors infer the solvency of borrowers by observing
decisions of other creditors, and by using the publicly observable borrower
characteristics.
There
is a somewhat richer literature explaining funding success in P2P loans, but
this is by no means exhaustive (LEE; LEE, 2012; YUM, et al. 2012; LIN et al.
2013; GONZALEZ; LOUREIRO, 2014; ZHANG; LIU, 2012). For example, Herzenstein et
al. (2008) showed that debtors’ financial strength, their listing and
publicizing efforts and their demographic attributes, have an effect over the
likelihood of funding success. In their study about P2P loan bidding, Weiss et
al., (2010) showed that the most important factor used by lenders to allocate
funds is the rating assigned by the P2P lending site.
The
empirical evidence evaluating non-payment behavior in P2P lending is limited
and geographically concentrated in developed countries. To the best of our
knowledge there are no studies analyzing such behavior in Latin America, or
Mexico specifically.
According to the literature, recent studies of the determinants of default behavior based on platforms like Lending Club (Serrano-Cinca et al. 2015) tested the pertinence of variables classified into the following categories: borrower assessment, loan characteristics, borrower characteristics, credit history, indebtedness and income-related information. The factors explaining default were loan purpose, annual income, current housing situation, credit history and indebtedness; a model for predicting defaults was also estimated.
Emekter et al. (2015) found that variables like credit grade, debt-to-income ratio, FICO score and revolving line utilization played an important role in loan defaults, while Dietrich and Wernli (2016) showed that borrower-specific factors such as economic status significantly influence lender evaluations of the borrower’s credit risk and thus the interest rates offered. Studies of default behavior in the Mexican credit card market consider as determinants variables such as loan history, payment-income ratios and loan characteristics, factors that are usually included in parametric credit scoring models (García et al. 2015) and are consistent with the regulation and procedures established by the Mexican Banking Commission (CNBV, 2014).
3. DATA COLLECTION
As part of the application process, platforms collect quantitative and qualitative information about applicants. Using credit rating algorithms and a minimum of human intervention, the platforms evaluate, approve and grade the loans if appropriate. As is the norm in the credit industry, their model factors in elements such as credit history, payment capacity and other pieces of information, some of them traditionally used in parametric credit analysis. After the loan has been formalized, it is submitted to their on-line lending community. The data collected and included in the study are described in Table 1.
Table 1: Variables provided for the study
Variables | Description | Type
RQAMT | Requested amount in pesos (original)^d | Scale
Amount_less_100000 | Is requested amount less than 100,000 MP? (Coded)^e | Binary^a
Approved | Was the loan approved? (Coded)^e | Binary^a
LTERM | Loan term in months (original)^d | Scale
Short_Term | Is term less than 12 months? (Coded)^e | Binary^a
CONSDEBT | Is the purpose of loan to consolidate debt? (Coded)^e | Binary^a
BUSINESS | Is the purpose of loan to finance business? (Coded)^e | Binary^a
CARLOAN | Is the purpose a car loan? (Coded)^e | Binary^a
HOME | Is the purpose of loan home improvements? (Coded)^e | Binary^a
EDUC | Is the purpose of loan education? (Coded)^e | Binary^a
Gender_of | Is the gender male? (Coded)^e | Binary^a
Marriage | Is the status of women married? (Coded)^e | Binary^a
AMOUNT_FUNDED | Actual amount funded (original)^d | Scale
LRATE | Loan rate in % (original)^d | Scale
GRADE | 7 categories for credit grade (A-G) (Coded)^e | Ordinal^b
Credit_Risk | Is credit grade Prime (C1 or less)? (Coded)^e | Binary^a
Default | Coded DV. Is loan delinquent or bad credit? (Coded)^e | Binary^a
Funding_Time | Weeks elapsed from registration to payment (Coded)^e | Scale
Success_Funding | Scale from 1 = a week through 6 = 2 months (Coded)^e | Ordinal^c
REFINANCED | Was the loan refinanced? (Coded)^e | Binary^a
INVESTORS | Number of investors per loan (original)^d | Scale
Dec_INCOME | Self-reported monthly income (original)^d | Scale
Paid_capital | Loan amortization (original)^d | Scale
Balance | Amount - Paid_capital (Calculated) | Scale
PMT_Balance | Ratio of monthly payment to balance (Calculated) | Scale
Monthly_PMT | Monthly payment (original)^d | Scale
PMT_Income | Ratio of payment to income (Calculated) | Scale
Credit_Mos | Loan number of months to date (time effect) (Calculated) | Scale
OWNAUTO | Owns car? (Coded)^e | Binary^a
OWNHOUSE | Owns home? (Coded)^e | Binary^a
GRAD_UG | Declared graduate or undergraduate studies (Coded)^e | Binary^a
METRO | Lives in the Mexico City Area? (Coded)^e | Binary^a
AGE | Age of applicant (Calculated) | Scale
Under_25 | Is age under 25 years old? (Coded)^e | Binary^a
Male | Is the borrower male? (Coded)^e | Binary^a
Female | Is the borrower female? (Coded)^e | Binary^a
Previous_loan | Borrowed before in the platform? (Coded)^e | Binary^a
Default_mos | Months in delinquency (Calculated) | Scale
Days_default | Days in delinquency (Calculated) | Scale
Notes: ^a Binary response variables: 0 = No, 1 = Yes. ^b Ordinal, 7 categories for Credit Grade from A down to G, A-grade being the safest. ^c Ordinal, 6 categories, from 1 being up to one week of funding time through 6 corresponding to more than 2 months to complete funding. ^d (original) = variable directly in the sample. ^e Coded by researcher.
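The calculated and coded fields in Table 1 can be illustrated with a minimal Python sketch. It assumes that "Amount" in the Balance row means the amount funded and that the ratios use the monthly payment and the self-reported monthly income; the function name and the example record are hypothetical and are not taken from the dataset.

```python
def derive_variables(loan):
    """Compute some calculated/coded Table 1 fields for one loan record (a dict)."""
    # Assumption: "Amount" in "Amount - Paid_capital" is AMOUNT_FUNDED.
    balance = loan["AMOUNT_FUNDED"] - loan["Paid_capital"]
    return {
        "Balance": balance,
        # Ratio of monthly payment to self-reported monthly income.
        "PMT_Income": loan["Monthly_PMT"] / loan["Dec_INCOME"],
        # Guard against division by zero for fully amortized loans.
        "PMT_Balance": loan["Monthly_PMT"] / balance if balance > 0 else None,
        # Binary codes follow the 0 = No, 1 = Yes convention of Table 1.
        "Short_Term": int(loan["LTERM"] < 12),
        "Amount_less_100000": int(loan["RQAMT"] < 100_000),
        "Under_25": int(loan["AGE"] < 25),
    }


# Hypothetical record, for illustration only.
example_loan = {
    "RQAMT": 80_000,
    "AMOUNT_FUNDED": 80_000,
    "Paid_capital": 20_000,
    "Monthly_PMT": 3_500,
    "Dec_INCOME": 20_000,
    "LTERM": 24,
    "AGE": 30,
}
derived = derive_variables(example_loan)
print(derived)
```

Keeping the derivations in one function makes the coding rules explicit and easy to audit against the table notes.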
Our dataset includes 25,598 loan applications filed from June 2012 through February 2016. Loan approval rates depend on the product and are defined by management. Usually the approval strategy is designed around reducing risk by selecting only those applicants who are thought to have a very low risk of defaulting; thus the proportion accepted and the proportion of those accepted who subsequently default are inversely related. As per the information provided, the main cause for loan denial was the inability to meet one or more approval requirements. The descriptive statistics for variables in the data-set are presented in Table 2.
Table 2: Descriptive statistics for variables in the study
Variables | M | SD | Range | n
RQAMT | 72,373 | 72,307 | 5,000-250,000 | 25,998
Amount_less_100000 | .78 | .42 | 0-1 | 25,998
Approved | .05 | .21 | 0-1 | 25,998
LTERM | 27.42 | 9.98 | 7-120 | 25,598
Short_Term | .24 | .43 | 0-1 | 25,598
CONSDEBT | .41 | .49 | 0-1 | 25,998
BUSINESS | .23 | .42 | 0-1 | 25,998
CARLOAN | .07 | .25 | 0-1 | 25,998
HOME | .11 | .32 | 0-1 | 25,998
EDUC | .06 | .24 | 0-1 | 25,998
Gender_of | .65 | .48 | 0-1 | 25,998
Marriage | .64 | .48 | 0-1 | 8,940
AMOUNT_FUNDED | 74,254 | 59,906 | 0-250,000 | 1,161
LRATE | 19.74% | 3.76% | 8.9%-27.9% | 1,161
GRADE | 4.27 | 1.27 | 1-7 | 1,161
Credit_Risk | .13 | .34 | 0-1 | 1,161
Default | .11 | .32 | 0-1 | 1,161
Funding_Time | 3.21 | 2.19 | 0-6 | 1,161
Success_Funding | .29 | .46 | 0-1 | 1,161
REFINANCED | .06 | .23 | 0-1 | 1,161
INVESTORS | 61.22 | 46.68 | 0-336 | 1,161
Multiple_Investors | .50 | .50 | 0-1 | 1,161
DecINCOME | 22,838 | 22,306 | 2,600-27,648 | 1,161
Paid_capital | 18,285 | 22,110 | 0-250,000 | 1,161
Balance | 55,968 | 54,558 | 0-250,000 | 1,161
PMT_Balance | .11 | .56 | 0-11.95 | 1,161
Monthly_PMT | 3,498 | 2,729 | 352-23,946 | 1,161
PMT_Income | .18 | .11 | .009-1.55 | 1,161
Credit_Mos | 17.41 | 10.98 | 0-48 | 1,161
OWNAUTO | .67 | .47 | 0-1 | 1,161
OWNHOUSE | .47 | .50 | 0-1 | 1,161
property | .77 | .42 | 0-1 | 1,161
GRAD_UG | .85 | .35 | 0-1 | 1,161
METRO | .42 | .49 | 0-1 | 1,161
AGE | 36.67 | .82 | 17-82 | 1,161
Over_60 | .04 | .19 | 0-1 | 1,161
Under_25 | .05 | .22 | 0-1 | 1,161
Male | .67 | .47 | 0-1 | 1,161
Female | .33 | .47 | 0-1 | 1,161
Previous_loan | .15 | .36 | 0-1 | 1,161
Rural | .04 | .19 | 0-1 | 1,161
Median_Income | .76 | .43 | 0-1 | 1,161
Default_mos | 7.88 | 7.92 | .27-37.7 | 130
Days_default | 236 | 237 | 0-1165 | 130
Notes: n = 25,598 refers to the total loan applications, n = 1,161 to approved loans, n = 8,940 to women in the sample and n = 130 to defaulted loans.
Of total applicants, 65% are men and 35% women. The loan denial rate was 94.6% for men and 93.3% for women. Table 3 describes the data collected through the platform, grouped by gender.
Table 3: Results of t-tests and Descriptive Statistics of variables in the dataset, grouped by gender
Variable | Men M | Men SD | Men n | Women M | Women SD | Women n | 95% CI for Mean Difference | t | df
Loan characteristics
RQAMT | 74,772 | 73,461 | 16,650 | 67,910 | 69,893 | 8,948 | 5,033, 8,690 | 7.36* | 19,105
LTERM | 27 | 10 | 16,650 | 27.61 | 9.88 | 8,948 | -.54, -.03 | -2.21* | 18,565
LRATE | 19.6% | 3.8% | 774 | 20.1% | 3.7% | 387 | -.97%, -.06% | -2.21* | 1,159
INVESTORS | 63 | 48 | 774 | 57 | 44 | 387 | 1.03, 12.42 | 2.32* | 1,159
REFINANCED | 5.3% | 22.4% | 774 | 7.0% | 25.5% | 387 | -4.7%, 1.3% | -1.1 | 690
CONSDEBT | 40.8% | 49.1% | 16,650 | 41.8% | 49.3% | 8,948 | -2.3%, .3% | 1.54 | 18,249
BUSINESS | 24.1% | 42.8% | 16,650 | 21.5% | 41.1% | 8,948 | 1.5%, 3.6% | 4.62* | 18,930
CARLOAN | 8.0% | 27.2% | 16,650 | 4.9% | 21.6% | 8,948 | 2.5%, 3.7% | 10.02* | 22,053
HOME | 11.3% | 31.7% | 16,650 | 11.3% | 31.7% | 8,948 | -.8%, .9% | .11 | 25,596
EDUC | 5.5% | 22.7% | 16,650 | 7.2% | 25.8% | 8,948 | -2.4%, -1.1% | -5.33* | 16,404
Property
OWNAUTO | 51.9% | 50.0% | 16,650 | 40.1% | 49.0% | 8,948 | -1.1%, 1.3% | 18.32* | 18,615
OWNHOUSE | 44.1% | 49.7% | 16,650 | 41.8% | 49.3% | 8,948 | 1.1%, 3.6% | 3.58* | 18,412
GRAD_UG | 68.6% | 46.4% | 16,650 | 64.2% | 47.9% | 8,948 | 3.2%, 5.6% | 7.11* | 17,799
AGE | 35.3 | 9.9 | 16,650 | 35.3 | 9.9 | 8,948 | -.31, .21 | -.35 | 24,629
METRO | 37.7% | 48.5% | 16,650 | 34.7% | 47.6% | 8,948 | 1.8%, 4.3% | 4.81* | 18,593
Notes: *p < .05. n = 16,650 for men and n = 8,948 for women.
Loan purpose can be related to the riskiness perceived by lenders. As shown in Table 3, debt consolidation (CONSDEBT) is the most frequently self-reported loan purpose, followed by business financing (BUSINESS). The Mexico City Metro Area (METRO) concentrates more than one third of total applications. The mean loan term (LTERM) is 27 months and the non-weighted average loan interest rate charged (LRATE) was 19.7%.
Results of the grouped-sample t-tests show that the mean requested amount (RQAMT) differs between men (M = 74,772, SD = 73,461) and women (M = 67,910, SD = 69,893) at the .05 level of significance (t = 7.36, df = 19,105, n = 25,598, p < .05; 95% CI for the mean difference is 5,033 to 8,690). On average, women requested smaller loans than men (6,862 pesos less). With respect to loan rates, LRATE also differs between men (M = 19.6%, SD = 3.8%) and women (M = 20.1%, SD = 3.7%) at the .05 level of significance (t = -2.21, df = 1,159, n = 1,161, p < .05; 95% CI for the mean difference is -.97% to -.06%). Female loan rates are quoted 50bp higher than those quoted for men. In the sample, women are less likely to own property and are less concentrated in the Mexico City Area than men.
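The RQAMT comparison can be checked from the summary statistics in Table 3 alone. The sketch below assumes an unequal-variance (Welch) t-test with Satterthwaite degrees of freedom and a normal critical value of 1.96 for the interval; the function name is illustrative, and the choice of test is inferred from the reported degrees of freedom rather than stated in the text.

```python
import math


def welch_t_test(m1, s1, n1, m2, s2, n2, z=1.96):
    """Welch t statistic, Satterthwaite df and approximate 95% CI from summary data."""
    v1 = s1 ** 2 / n1  # squared standard error of the first group mean
    v2 = s2 ** 2 / n2  # squared standard error of the second group mean
    se = math.sqrt(v1 + v2)
    diff = m1 - m2
    t_stat = diff / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    ci = (diff - z * se, diff + z * se)
    return t_stat, df, ci


# Requested amount (RQAMT): men versus women, values from Table 3.
t_stat, df, ci = welch_t_test(74772, 73461, 16650, 67910, 69893, 8948)
print(f"t = {t_stat:.2f}, df = {df:.0f}, CI = ({ci[0]:.0f}, {ci[1]:.0f})")
```

Under these assumptions the computed statistic, degrees of freedom and interval come out close to the values reported for RQAMT in Table 3.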
On the platform, filings are classified into 21 categories ranging from A1, the highest quality rating, through G3, the lowest. In Table 4, we observe the following gender patterns after controlling for credit rating, that is, after collapsing the full scoring set into 7 categories, from A down to G, A-grade being the safest.
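The collapsing step can be sketched in a few lines of Python. The sketch assumes that notch labels are written as a grade letter followed by a digit from 1 to 3 (A1 through G3) and that the ordinal GRADE code runs from 1 for A to 7 for G; the function name is hypothetical.

```python
GRADES = "ABCDEFG"
NOTCHES = ("1", "2", "3")


def collapse_notch(label):
    """Map a 21-category label such as 'C2' to (grade letter, ordinal code 1-7)."""
    label = label.strip().upper()
    if len(label) != 2 or label[0] not in GRADES or label[1] not in NOTCHES:
        raise ValueError(f"unexpected credit label: {label!r}")
    letter = label[0]
    return letter, GRADES.index(letter) + 1


# The full scoring set (A1 ... G3) and its collapsed version.
all_labels = [g + k for g in GRADES for k in NOTCHES]
collapsed = {label: collapse_notch(label) for label in all_labels}
print(len(all_labels), sorted({grade for grade, _ in collapsed.values()}))
```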
Table 4: Means and Standard Errors for variables: LRATE, LTERM and RQAMT grouped by Grade
Variable/grade | Men M | Men SE | Men n | Women M | Women SE | Women n
LOAN RATE
A | 9.65% | .16% | 28 | 10.23% | .29% | 9
B | 13.31% | .12% | 41 | 13.40% | .18% | 16
C | 16.04% | .07% | 133 | 16.05% | .10% | 66
D | 19.07% | .05% | 229 | 19.02% | .08% | 111
E | 21.87% | .06% | 233 | 21.83% | .07% | 107
F | 24.68% | .07% | 102 | 24.78% | .09% | 68
G | 26.90% | 2.1E-17% | 8 | 27.40% | .17% | 10
LOAN TERM
A | 30.00 | 1.799 | 28 | 32.00 | 2.828 | 9
B | 33.65 | .861 | 41 | 32.25 | 2.112 | 16
C | 30.56 | .704 | 133 | 30.67 | 1.042 | 66
D | 30.48 | .573 | 229 | 29.46 | .885 | 111
E | 29.74 | .591 | 233 | 30.30 | .851 | 107
F | 33.46 | 1.442 | 102 | 29.29 | 1.157 | 68
G | 33.00 | 3.000 | 8 | 29.90 | 5.740 | 10
AMOUNT
A | 120,464.30 | 13,867.62 | 28 | 136,666.70 | 30,092.45 | 9
B | 91,560.98 | 10,705.65 | 41 | 101,187.50 | 16,825.13 | 16
C | 83,630.08 | 5,391.32 | 133 | 93,878.79 | 9,471.35 | 66
D | 93,310.04 | 4,467.80 | 229 | 80,666.67 | 6,551.46 | 111
E | 101,055.80 | 4,872.76 | 233 | 84,943.93 | 6,651.94 | 107
F | 129,098.00 | 7,910.49 | 102 | 105,147.10 | 10,590.73 | 68
G | 140,375.00 | 19,858.37 | 8 | 112,600.00 | 25,800.60 | 10
Notes: Credit ratings (Grades) collapse the full scoring set (21 categories) into a subset comprising only 7 categories, from A down to G, A-grade being the safest. N = 1,161 loans approved.
As can be observed in Table 4, for loans of the same quality, women tend to pay higher rates and borrow larger amounts than men in the safest categories (grades A to C); in grades D to G, the mean amounts borrowed by men are larger.
Defaulted loans are those that are charged off, refinanced and/or late in payment. Good-status loans are loans that are fully paid or current in their payment schedule. Accordingly, the rate of default in this article is defined as the share of defaulted loans among all approved loans: $\text{Default rate} = \frac{\text{defaulted loans}}{\text{defaulted loans} + \text{good-status loans}}$. Default is the dependent variable (DV) in this study; it is a binary coded response variable equal to 1 if the loan has defaulted and 0 otherwise. The simple mean rate of default is 11.76% for men and 10.07% for women. Classified by grade, as can be observed in Table 5, there are no significant differences by gender in actual default rates.
Table 5: Results of t-test and Descriptive Statistics for Default by gender, classified over Credit Grades
Outcome | Men M | Men SE | Men n | Women M | Women SE | Women n | t | df^a
A | 0 | 0 | 28 | 0 | 0 | 9 | -- | --
B | .049 | .03 | 41 | .063 | 0.25 | 16 | .19 | 24
C | .070 | .02 | 133 | .030 | 0.02 | 66 | -1.64 | 187
D | .122 | .02 | 229 | .100 | 0.03 | 111 | -.65 | 236
E | .120 | .02 | 233 | .150 | 0.03 | 107 | .73 | 189
F | .210 | .04 | 102 | .130 | 0.04 | 68 | -1.27 | 159
G | .125 | .13 | 8 | .000 | 0.00 | 10 | -1.00 | 7
Notes: ^a df: Satterthwaite’s degrees of freedom. *p < .05
4. RESEARCH HYPOTHESES AND TEST METHODOLOGY
The research question addressed in this study is: what are the elements that could help lenders characterize default risk in P2P loans? Given that on-line investors are exposed to the totality of credit risk, lending web-sites must provide, besides credit scoring, pertinent information for overcoming the adverse effect of IA. Default probabilities by grade are not directly observed by backers; it is therefore believed that they can rationally benefit from statistical discrimination by factoring in information believed to be correlated with default probabilities.
From a set of collected variables, mostly available to the on-line community, this study attempts to identify which of them are pertinent for assessing default rates. Previous studies on P2P non-payment behavior initially guide our variable selection process (EMEKTER et al. 2015; GARCÍA et al. 2015; WEISS et al., 2010; SERRANO-CINCA et al. 2015).
In our study, bearing in mind the lender's perspective and the fact that the platform has already factored all the information collected into a scoring model, we test whether variables related to loan characteristics, demographics, loan performance in the platform, credit purpose, income and debt servicing capacity, besides credit scores, are correlated with default behavior in the sample.
The first hypothesis to be tested is: grades, as derived from the platform's credit scoring model, are determinants of default behavior in our P2P lending case.
The second hypothesis is: lenders benefit from statistical discrimination, in the sense given by Schwab (1986), by factoring in variables related to loan and demographic characteristics of the applicants (CONSDEBT, PMT_Income, Female, Paid_capital, OWNHOUSE, Credit_Mos (time effect), REFINANCED) in addition to the credit scoring variable (GRADE) provided by the lending web-site. In this case, the benefit for investors resides in attaining better knowledge of non-payment behavior by debtors.
The third hypothesis is: the gender variable, specifically whether the loan applicant is female, is correlated with default behavior. Therefore, lenders may benefit from screening loans by sex, using taste-based discrimination. To identify the case discrimination effect of gender, we need to control for credit quality. This hypothesis is therefore also tested after controlling for GRADE, as derived from the platform’s credit scoring model.
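To make the hypotheses concrete, the following minimal Python sketch shows how covariates of this kind would enter a logit specification and how a coefficient translates into an odds ratio. All coefficient values, the intercept and the example borrower are hypothetical: they were chosen only to mirror the direction of the effects discussed in the text and are not estimates from this study.

```python
import math

# Hypothetical log-odds coefficients (NOT estimates from this study).
INTERCEPT = -4.0
BETA = {
    "GRADE": 0.30,       # ordinal code, 1 = A (safest) ... 7 = G (riskiest)
    "PMT_Income": 2.0,   # ratio of monthly payment to income
    "REFINANCED": 1.0,   # 1 if refinanced on the platform
    "CONSDEBT": -0.4,    # 1 if the purpose is debt consolidation
    "Female": -0.3,      # 1 if the borrower is female
}


def default_probability(x):
    """Logistic link: P(default = 1 | x) = 1 / (1 + exp(-(b0 + sum b_k x_k)))."""
    z = INTERCEPT + sum(BETA[name] * x[name] for name in BETA)
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Hypothetical borrower, evaluated as male and as female with all else equal.
borrower = {"GRADE": 4, "PMT_Income": 0.18, "REFINANCED": 0, "CONSDEBT": 1, "Female": 0}
p_male = default_probability(borrower)
p_female = default_probability({**borrower, "Female": 1})

# With everything else held fixed, the odds ratio equals exp(coefficient).
odds_ratio = odds(p_female) / odds(p_male)
print(p_male, p_female, odds_ratio)
```

In such a specification, a test of the third hypothesis amounts to asking whether the Female coefficient differs from zero once GRADE is included among the covariates.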
Our hypothesis testing relies on a reduced-form model for $E(Y \mid X)$, where $E(Y \mid X)$ is the expected value of $Y$ given $X$. In our case, $E(Y \mid X)$ is the probability of default as a function of a set of available information about the borrower. Following Aguilera et al. (2006), the logistic regression model used for testing the hypotheses is defined in the following way: Let $X_1, X_2, \ldots, X_p$ be a set of continuous or categorical observed variables and let us consider n observations of those variables represented in the matrix $X = (x_{ij})_{n \times p}$.
. Let Y =