DETERMINANTS OF DEFAULT IN P2P LENDING: THE MEXICAN CASE

 

Dr. Carlos Eduardo Canfield Rivera

Universidad Anahuac-Mexico, Mexico

E-mail: Carlos.canfield@anahuac.mx

 

Submission: 07/09/2016

Revision: 23/09/2016

Accept: 08/06/2017

 

ABSTRACT

P2P lending is a method of informal finance that uses the internet to connect borrowers with on-line communities. Through limited participation of financial intermediaries, P2P credit becomes a new model for unsecured loan origination. The question framing this research is: What are the elements that could help on-line financiers characterize default risk in these loans? It attempts to advance knowledge about P2P default determinants from the perspective of the information available to on-line lending communities in emerging countries. With the aid of a logistic regression model, data provided by Mexican platforms and public information available to on-line investors, this inquiry explores the effect of credit scores and other variables related to loan and borrower characteristics on P2P default behavior. The results show that information provided by the platform is relevant for analyzing credit risk, yet not conclusive. In line with the literature, on a scale going from the safest to the riskiest, the loan's risk grade is positively associated with default behavior. Other determinants that increase the odds of default are the payment-to-income ratio and having been refinanced on the same platform. Conversely, loan purpose and being a female applicant reduce such odds. Evidence from the sample shows that, under equal credit conditions, a case for differential default behavior across gender, age or geographical location could not be established. However, after controlling for loan quality, women were found to have longer loan survival times than men. This is one of the first studies about debt crowdfunding in Latin America and Mexico. Implications for lenders, researchers and policy-makers are also discussed.

Keywords: Peer-to-peer lending, default risk, emerging countries, Mexico, loan survival, logistic regression, differential default behavior

1.     INTRODUCTION

            New means of project sourcing have flourished with the advent of Web 2.0 and the rising popularity of online communities (Iturbide; Canfield, 2015). On-line financing and the development of crowdfunding (CF) platforms that evolved from the Fintech ecosystem constitute an example of disruptive business model innovation (Markides, 2006).

            With limited participation of financial intermediaries, online peer-to-peer (P2P) or market-place lending becomes a new model for unsecured loan origination (Galloway, 2009), where anonymous backers each fund a portion of the amount loaned. Allegedly, P2P lending constitutes a good alternative for the financing community: borrowers receive better credit conditions than in traditional finance; creditors take advantage of an investment model where risk is coupled to the credit rating of the funded loans (Bachmann et al., 2011); and lending websites benefit by collecting fees on successfully completed transactions.

            P2P lending originated in the U.K. with the first online lending platform, Zopa (HULME; WRIGHT, 2006), and is developing apace around the world (Wan et al. 2016; Weiss et al. 2010; Berger; Gleisner, 2009). Various P2P platforms have evolved from the Mexican Fintech ecosystem; some of them operate nation-wide. The loan approval rate for these platforms is less than 5%, a common figure for this type of financial product (Hand; Henley, 1997).

            The purpose of this article is to attain further understanding of applicants' delinquent credit behavior in P2P lending activities in emerging markets, particularly in the Mexican context.

            This study addresses the following question: what are the elements that could help on-line financiers characterize default risk in these loans?

            Platforms collect information about the applicants and feed it into their credit rating algorithms. As is the case with many on-line lending sites, each loan is classified according to its risk characteristics and assigned a corresponding loan rate believed to capture credit risk. Grades and loan rates increase with the evaluated risk. In our sample, loan quality is measured on a scale ranging from A down to G, with three notches each: Grade A defines the highest quality loans, charging 8.9% p.a., and Grade G, which contains the riskiest credits, charges 28.9% p.a. in the lowest notch.

            Arguably, some of the most important sources of concern for lenders in credit markets derive from information asymmetry (IA), which stems from the fact that borrowers are better informed than lenders on at least two conditions that shape credit and re-investment risks: (i) the ability and willingness to repay the debt and (ii) the propensity for early payment. In this research, only credit risk arising from default behavior is analyzed.

            Creditors benefit from knowing the true characteristics of borrowers, but moral hazard hampers direct information sharing (IS) between market participants. IS and monitoring activities lie at the base of financial intermediation: one stream of financial theory holds that, through IS, financial intermediaries can reduce the adverse consequences of private information problems and transaction costs.

            Accordingly, the following argument forms the basis of the present study: in P2P lending, the platforms collect information that is valuable for understanding default behavior and share it with the lending community in an attempt to mitigate the adverse effects of IA, moral hazard and adverse selection.

            Investors rely on the quality of the rating process, but “soft and self-reported” information also plays an important role in the screening process (Ruiz-Ugarte, 2010). Authors such as Khwaja et al. (2009) found that lenders infer the most from standard banking “hard” information, but they likewise use non-standard information, particularly when it provides credible signals regarding the borrower's creditworthiness. Some lenders go as far as to check looks and race as observed proxies for the determinants of default (Duarte et al. 2012; Ravina, 2012).

            Statistical discrimination based on appearance and popularity has not always proved effective in understanding delinquent behavior (Freedman; Jin, 2014; Ravina, 2012; Pope; Sydnor, 2011). In all cases, the evidence provides a sound warning that return-maximizing investors should be careful in interpreting social ties, appearance and other social characteristics alone when screening loan applicants.

            Through the platform's web page, backers have access to credit grades, direct messaging with borrowers and, allegedly, relevant information about the applicant's characteristics and loan conditions that will help them identify default risk. This is an important step toward allowing on-line lenders to overcome private information problems. In that sense, this inquiry studies, from the lender's perspective, the pertinence of the information that the web-site provides to backers.

            The present research seeks to contribute to the understanding of this new and fast-paced P2P ecosystem in several ways. For one thing, notwithstanding the existing literature about credit rating and its effect on on-line lending behavior (cf. Bachmann et al., 2011 for a survey), academic work in the field tends to concentrate geographically on developed countries such as the United States, Germany and other European countries, or on China, where P2P lending activities have flourished in the recent past (Wan et al., 2016).

            Most studies use rich data sets from P2P platforms such as Prosper, Lending Club and Smava. To the best of our knowledge, this study is one of the few that analyze lending behavior in on-line communities in emerging countries, particularly in Latin America, where this activity is relatively new. From the perspective of lenders, academics and policy makers, this research is important because it is one of the first to consider the Mexican crowdfunding and P2P ecosystems, which are currently under construction.

            This investigation identifies relevant variables based on collected demographic and financial characteristics and, using a multivariate regression analysis framework, estimates default probabilities and analyzes the effect of gender on delinquent behavior after controlling for loan quality. Finally, based on gender and age, this study examines the loan survival times of borrowers in order to provide a better view of the possibility of differential non-payer behavior in the Mexican P2P environment.

            To preview our results, the model shows that information provided by the platforms and available to financiers is relevant for analyzing credit risk, yet not conclusive. Besides the credit score, the most important determinants that increase the odds of delinquent behavior are the ratio of monthly payment to income and having been refinanced on the same platform.

            On the other hand, loan purpose and being female reduce such odds. When loans are grouped by credit rating, being a female applicant is a relevant determinant of loan default only in credit grades C and F; consequently, there is no conclusive evidence that, under equal credit characteristics, lenders might benefit by screening default behavior by gender in this sample. However, it was found that women have longer loan survival times, measured in months (9.2), than men (5.4).

            Overall, at equal credit ratings, women have better default behavior, as measured by loan survival times alone. The above-mentioned results are in line with those found in the incipient literature on P2P lending and contribute to three streams of knowledge: the effects of case and statistical discrimination in financial services; the determinants of default and differential attitudes toward delinquency by gender and age; and the recent field of P2P lending research.

            The remainder of the article is organized as follows. Section 2 provides an overview of the existing literature. The general lending process at P2P platforms and the data collected are described in Section 3. Section 4 presents the research hypotheses and the test methodology. Section 5 reports the estimation results of the logistic model and the survival tables, and Section 6 concludes.

2.     LITERATURE REVIEW

            P2P lending is a new method of informal finance that originated in the U.K. and is developing apace around the world. This model for loan origination uses the internet to directly connect borrowers with on-line communities. Various authors have studied the upsurge and evolution of market-place lending (BERGER; GLEISNER, 2009; BACHMANN et al., 2011).

            With limited participation of financial intermediaries, these types of platforms facilitate unsecured micro-credits. Arguably, P2P lending gives lenders the opportunity to increase their income and offers debtors a means of accessing financing that would otherwise not be available when traditional financial intermediaries impose strict approval requirements (Zeng, 2013).

            Conceptually, financial intermediation (FI) theory sets the foundation for market-place lending based on transaction costs (SCHOLES et al. 1976). It has been argued that FI could be used for alleviating the effects of market failures such as adverse selection and moral hazard (PAULY, 1974; AKERLOF, 1970; ALLEN; SANTOMERO, 1997).

            Hence, information asymmetry (IA) is perhaps one of the most important sources of concern in financial markets, leading to credit rationing (STIGLITZ; WEISS, 1981). IA stems from the fact that borrowers are better informed than lenders about their ability and willingness to repay the debt (credit risk) and the propensity for early payment (re-investment risk).

            Creditors benefit from knowing the true characteristics of debtors, but moral hazard hampers direct IS between market participants (LELAND; PYLE, 1977), and verification by outside parties may be either costly or impossible (TOWNSEND, 1979). An important stream in financial theory considers that financial intermediaries provide the means to reduce the economic consequences of private information problems and transaction costs (DIAMOND; DYBVIG, 1983; DIAMOND, 1984; LEVINE et al. 2000).

            The effect of IA on nonpayment behavior has been quantified by Karlan and Zinman (2009), who, in their study of the South African credit markets, found that about 15% of default was due to private information problems. The literature suggests that IS may overcome adverse selection and reduce moral hazard by raising borrowers’ effort to repay loans (JAPPELLI; PAGANO, 2006) or by avoiding excessive lending when each borrower may patronize several banks (BENNARDO; PAGANO, 2007). Brown et al. (2009) showed that IS is associated with improved availability and lower cost of credit to firms in transition economies.

            In the case of P2P lending, the lender has very little assurance that the borrower, having possibly been rejected by other financial intermediaries, will repay the loan (ZENG, 2013).

            Weiss et al. (2010) found evidence that the screening of potential borrowers is a major instrument in alleviating adverse selection and preventing the online market from collapsing. Information availability improves lender screening and dramatically reduces the default rate for high-risk credits, but has little effect on low-risk loans (MILLER, 2015).

            Regarding the importance of credit rating, Moro et al. (2015) found that credit scoring is the main application used for predicting risk and supporting the loan approval process. In a joint effort with credit rating agencies, the platforms share with the lending communities information that is valuable for understanding default behavior. Investors rely on credit scoring and the quality of the rating process in market-place loans; nevertheless, “soft and self-reported” information also plays an important role in their internal decision-making systems (KHWAJA, et al., 2009; FREEDMAN; JIN, 2014; DUARTE, et al., 2012; RAVINA, 2012).

            Based on a statistical discrimination approach (PHELPS, 1972; ARROW, 1973), lenders have the opportunity to check the debtor's credit grades, work and academic history, property, income-to-debt ratios and subjective information regarding use of credit, communication capacities and social ties in an attempt to separate payers from non-payers (RUIZ-UGARTE, 2010).

            Moreover, Zhang and Liu (2012), in their study of herding behavior among backers at Prosper, found that investors infer the solvency of borrowers by observing the decisions of other creditors and by using publicly observable borrower characteristics.

            There is a somewhat richer literature explaining funding success in P2P loans, but it is by no means exhaustive (LEE; LEE, 2012; YUM, et al. 2012; LIN et al. 2013; GONZALEZ; LOUREIRO, 2014; ZHANG; LIU, 2012). For example, Herzenstein et al. (2008) showed that debtors’ financial strength, their listing and publicizing efforts and their demographic attributes have an effect on the likelihood of funding success. In their study of P2P loan bidding, Weiss et al. (2010) showed that the most important factor used by lenders to allocate funds is the rating assigned by the P2P lending site.

            The empirical evidence evaluating non-payment behavior in P2P lending is limited and geographically concentrated in developed countries. To the best of our knowledge there are no studies analyzing such behavior in Latin America, or Mexico specifically.

            In a recent study of the determinants of default behavior based on Lending Club data, Serrano-Cinca et al. (2015) tested the pertinence of variables classified into the following categories: borrower assessment, loan characteristics, borrower characteristics, credit history, indebtedness and income-related information. The factors explaining default were loan purpose, annual income, current housing situation, credit history and indebtedness; a model for predicting defaults was also estimated.

            Emekter et al. (2015) found that variables such as credit grade, debt-to-income ratio, FICO score and revolving line utilization played an important role in loan defaults, while Dietrich and Wernli (2016) showed that borrower-specific factors such as economic status significantly influence lender evaluations of the borrower’s credit risk and thus the interest rates offered. Studies of default behavior in the Mexican credit card market consider, as determinants, variables such as loan history, payment-to-income ratios and loan characteristics, factors that are usually included in parametric credit scoring models (García et al. 2015) and are consistent with the regulation and procedures established by the Mexican Banking Commission (CNBV, 2014).

3.     DATA COLLECTION

            As part of the application process, platforms collect quantitative and qualitative information about applicants. Using credit rating algorithms and a minimum of human intervention, the platforms evaluate, approve and grade the loans if appropriate. As is the norm in the credit industry, their model factors in elements such as credit history, payment capacity and other pieces of information, some of them traditionally used in parametric credit analysis. After the loan has been formalized, it is submitted to their on-line lending community. The data collected and included in the study are described in Table 1.

Table 1: Variables provided for the study

| Variables | Description | Type |
|---|---|---|
| RQAMT | Requested amount in pesos (original) [d] | Scale |
| Amount_less_100000 | Is requested amount less than 100,000 MP? (Coded) [e] | Binary [a] |
| Approved | Was the loan approved? (Coded) [e] | Binary [a] |
| LTERM | Loan Term in months (original) [d] | Scale |
| Short_Term | Is Term less than 12 months? (Coded) [e] | Binary [a] |
| CONSDEBT | Is the purpose of loan to consolidate debt? (Coded) [e] | Binary [a] |
| BUSINESS | Is the purpose of loan finance business? (Coded) [e] | Binary [a] |
| CARLOAN | Is the purpose a car loan? (Coded) [e] | Binary [a] |
| HOME | Is the purpose of loan home improvements? (Coded) [e] | Binary [a] |
| EDUC | Is the purpose of loan education? (Coded) [e] | Binary [a] |
| Gender_of | Is the gender male? (Coded) [e] | Binary [a] |
| Marriage | Is the status of women married? (Coded) [e] | Binary [a] |
| AMOUNT_FUNDED | Actual amount funded (original) [d] | Scale |
| LRATE | Loan rate in % (original) [d] | Scale |
| GRADE | 7 categories for credit grade (A-G) (Coded) [e] | Ordinal [b] |
| Credit_Risk | Is credit grade Prime (C1 or less)? (Coded) [e] | Binary [a] |
| Default | Coded DV. Is loan delinquent or bad credit? (Coded) [e] | Binary [a] |
| Funding_Time | Weeks elapsed from registration to payment (Coded) [e] | Scale |
| Success_Funding | Scale from 1 = a week through 6 = 2 months (Coded) [e] | Ordinal [c] |
| REFINANCED | Was the loan refinanced? (Coded) [e] | Binary [a] |
| INVESTORS | Number of investors per loan (original) [d] | Scale |
| Dec_INCOME | Self-reported monthly income (original) [d] | Scale |
| Paid_capital | Loan amortization (original) [d] | Scale |
| Balance | Amount - Paid_capital (Calculated) | Scale |
| PMT_Balance | Ratio of monthly payment to balance (Calculated) | Scale |
| Monthly_PMT | Monthly payment (original) [d] | Scale |
| PMT_Income | Ratio of payment to income (Calculated) | Scale |
| Credit_Mos | Loan number of months to date (time effect) (Calculated) | Scale |
| OWNAUTO | Owns car? (Coded) [e] | Binary [a] |
| OWNHOUSE | Owns home? (Coded) [e] | Binary [a] |
| GRAD_UG | Declared graduate or undergraduate studies (Coded) [e] | Binary [a] |
| METRO | Lives in the Mexico City Area? (Coded) [e] | Binary [a] |
| AGE | Age of applicant (Calculated) | Scale |
| Under_25 | Is age under 25 years old? (Coded) [e] | Binary [a] |
| Male | Is the borrower male? (Coded) [e] | Binary [a] |
| Female | Is the borrower female? (Coded) [e] | Binary [a] |
| Previous_loan | Borrowed before in the platform? (Coded) [e] | Binary [a] |
| Default_mos | Months in delinquency (Calculated) | Scale |
| Days_default | Days in delinquency (Calculated) | Scale |

Notes: [a] Binary response variables: 0 = No, 1 = Yes. [b] Ordinal, 7 categories for Credit Grade from A down to G, A-grade being the safest. [c] Ordinal, 6 categories, from 1 being up to one week of funding time through 6 corresponding to more than 2 months to complete funding. [d] (original) = variable directly in the sample. [e] Coded by researcher.

            Our dataset includes 25,598 loan applications filed from June 2012 through February 2016. Loan approval rates depend on the product and are defined by management. Usually the approval strategy is designed around reducing risk by selecting only those applicants who are thought to have a very low risk of defaulting; thus the proportion accepted and the proportion of those accepted who subsequently default are inversely related. According to the information provided, the main cause of loan denial was the inability to meet one or more approval requirements. The descriptive statistics for the variables in the data set are presented in Table 2.

Table 2: Descriptive statistics for variables in the study

| Variables | M | SD | Range | n |
|---|---|---|---|---|
| RQAMT | 72,373 | 72,307 | 5,000-250,000 | 25,998 |
| Ammount_less_100000 | .78 | .42 | 0-1 | 25,998 |
| Approved | .05 | .21 | 0-1 | 25,998 |
| LTERM | 27.42 | 9.98 | 7-120 | 25,598 |
| Short_Term | .24 | .43 | 0-1 | 25,598 |
| CONSDEBT | .41 | .49 | 0-1 | 25,998 |
| BUSINESS | .23 | .42 | 0-1 | 25,998 |
| CARLOAN | .07 | .25 | 0-1 | 25,998 |
| HOME | .11 | .32 | 0-1 | 25,998 |
| EDUC | .06 | .24 | 0-1 | 25,998 |
| Gender_of | .65 | .48 | 0-1 | 25,998 |
| Marriage | .64 | .48 | 0-1 | 8,940 |
| AMOUNT_FUNDED | 74,254 | 59,906 | 0-250,000 | 1,161 |
| LRATE | 19.74% | 3.76% | 8.9%-27.9% | 1,161 |
| GRADE | 4.27 | 1.27 | 1-7 | 1,161 |
| Credit_Risk | .13 | .34 | 0-1 | 1,161 |
| Default | .11 | .32 | 0-1 | 1,161 |
| Funding_Time | 3.21 | 2.19 | 0-6 | 1,161 |
| Success_Funding | .29 | .46 | 0-1 | 1,161 |
| REFINANCED | .06 | .23 | 0-1 | 1,161 |
| INVESTORS | 61.22 | 46.68 | 0-336 | 1,161 |
| Multiple_Investors | .50 | .50 | 0-1 | 1,161 |
| DecINCOME | 22,838 | 22,306 | 2,600-27,648 | 1,161 |
| Paid_capital | 18,285 | 22,110 | 0-250,000 | 1,161 |
| Balance | 55,968 | 54,558 | 0-250,000 | 1,161 |
| PMT_Balance | .11 | .56 | 0-11.95 | 1,161 |
| Monthly_PMT | 3,498 | 2,729 | 352-23,946 | 1,161 |
| PMT_Income | .18 | .11 | .009-1.55 | 1,161 |
| Credit_Mos | 17.41 | 10.98 | 0-48 | 1,161 |
| OWNAUTO | .67 | .47 | 0-1 | 1,161 |
| OWNHOUSE | .47 | .50 | 0-1 | 1,161 |
| property | .77 | .42 | 0-1 | 1,161 |
| GRAD_UG | .85 | .35 | 0-1 | 1,161 |
| METRO | .42 | .49 | 0-1 | 1,161 |
| AGE | 36.67 | .82 | 17-82 | 1,161 |
| Over_60 | .04 | .19 | 0-1 | 1,161 |
| Under_25 | .05 | .22 | 0-1 | 1,161 |
| Male | .67 | .47 | 0-1 | 1,161 |
| Female | .33 | .47 | 0-1 | 1,161 |
| Previous_loan | .15 | .36 | 0-1 | 1,161 |
| Rural | .04 | .19 | 0-1 | 1,161 |
| Median_Income | .76 | .43 | 0-1 | 1,161 |
| Default_mos | 7.88 | 7.92 | .27-37.7 | 130 |
| Days_default | 236 | 237 | 0-1165 | 130 |

Notes: n = 25,598 refers to the total loan applications, n = 1,161 to approved loans, n = 8,940 to women in the sample and n = 130 to defaulted loans.

            Of total applicants, 65% are men and 35% are women. The loan denial rate was 94.6% for men and 93.3% for women. Table 3 describes the data collected through the platform, grouped by gender.

Table 3: Results of t-tests and Descriptive Statistics of variables in the dataset, grouped by gender

 

| Variable | Men M | Men SD | Men n | Women M | Women SD | Women n | 95% CI for Mean Difference | t | df |
|---|---|---|---|---|---|---|---|---|---|
| Loan characteristics | | | | | | | | | |
| RQAMT | 74,772 | 73,461 | 16,650 | 67,910 | 69,893 | 8,948 | 5,033, 8,690 | 7.36* | 19,105 |
| LTERM | 27 | 10 | 16,650 | 27.61 | 9.88 | 8,948 | -.54, -.03 | -2.21* | 18,565 |
| LRATE | 19.6% | 3.8% | 774 | 20.1% | 3.7% | 387 | -.97%, -.06% | -2.21* | 1,159 |
| INVESTORS | 63 | 48 | 774 | 57 | 44 | 387 | 1.03, 12.42 | 2.32* | 1,159 |
| REFINANCED | 5.3% | 22.4% | 774 | 7.0% | 25.5% | 387 | -4.7%, 1.3% | -1.1 | 690 |
| CONSDEBT | 40.8% | 49.1% | 16,650 | 41.8% | 49.3% | 8,948 | -2.3%, .3% | 1.54 | 18,249 |
| BUSINESS | 24.1% | 42.8% | 16,650 | 21.5% | 41.1% | 8,948 | 1.5%, 3.6% | 4.62* | 18,930 |
| CARLOAN | 8.0% | 27.2% | 16,650 | 4.9% | 21.6% | 8,948 | 2.5%, 3.7% | 10.02* | 22,053 |
| HOME | 11.3% | 31.7% | 16,650 | 11.3% | 31.7% | 8,948 | -.8%, .9% | .11 | 25,596 |
| EDUC | 5.5% | 22.7% | 16,650 | 7.2% | 25.8% | 8,948 | -2.4%, -1.1% | -5.33* | 16,404 |
| Property | | | | | | | | | |
| OWNAUTO | 51.9% | 50.0% | 16,650 | 40.1% | 49.0% | 8,948 | -1.1%, 1.3% | 18.32* | 18,615 |
| OWNHOUSE | 44.1% | 49.7% | 16,650 | 41.8% | 49.3% | 8,948 | 1.1%, 3.6% | 3.58* | 18,412 |
| GRAD_UG | 68.6% | 46.4% | 16,650 | 64.2% | 47.9% | 8,948 | 3.2%, 5.6% | 7.11* | 17,799 |
| AGE | 35.3 | 9.9 | 16,650 | 35.3 | 9.9 | 8,948 | -.31, .21 | -.35 | 24,629 |
| METRO | 37.7% | 48.5% | 16,650 | 34.7% | 47.6% | 8,948 | 1.8%, 4.3% | 4.81* | 18,593 |

Notes: *p < .05. n = 16,650 for men and n = 8,948 for women.

            Loan purpose can be read by lenders as a signal of riskiness. As shown in Table 3, debt consolidation (CONSDEBT) is the most frequently self-reported loan purpose, followed by business financing (BUSINESS). The Mexico City Metro Area (METRO) concentrates more than one third of total applications. The mean loan term (LTERM) is 27 months, and the unweighted average loan interest rate charged (LRATE) was 19.7%.

            Results of the grouped-sample t-tests show that the mean requested amount (RQAMT) differs between men (M = 74,772, SD = 73,461) and women (M = 67,910, SD = 69,893) at the .05 level of significance (t = 7.36, df = 19,105, n = 25,598, p < .05; 95% CI for the mean difference is 5,033 to 8,690). On average, women requested smaller loans than men (6,862 pesos less). With respect to loan rates, LRATE also differs between men (M = 19.6%, SD = 3.8%) and women (M = 20.1%, SD = 3.7%) at the .05 level of significance (t = -2.21, df = 1,159, n = 1,161, p < .05; 95% CI for the mean difference is -.97% to -.06%). Female loan rates are quoted 50 bp higher than those quoted for men. In the sample, women are less likely than men to own property and are less concentrated in the Mexico City Area.
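The test statistics reported for RQAMT can be approximately reproduced from the group summaries in Table 3. The following is a minimal sketch, assuming the comparison is Welch's unequal-variance t-test with Satterthwaite's degrees of freedom (the notes to Table 5 name Satterthwaite's approximation; its use here is an assumption). The function name `welch_from_summary` is hypothetical, and the confidence interval uses a normal critical value, which is only a large-sample approximation.

```python
import math
from statistics import NormalDist


def welch_from_summary(m1, sd1, n1, m2, sd2, n2, level=0.95):
    """Welch's unequal-variance t-test computed from group summaries.

    Returns (t, df, (ci_low, ci_high)) for the difference m1 - m2.
    The interval uses a normal critical value (large-sample approximation).
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of the group 2 mean
    se = math.sqrt(v1 + v2)
    diff = m1 - m2
    t = diff / se
    # Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return t, df, (diff - z * se, diff + z * se)


# RQAMT row of Table 3: men first, women second
t, df, ci = welch_from_summary(74772, 73461, 16650, 67910, 69893, 8948)
print(round(t, 2), round(df), [round(x) for x in ci])
```

Under these assumptions the output lands close to the t, df and confidence limits shown for RQAMT in Table 3; small differences can arise from rounding in the published means and standard deviations.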

            On the platform, filings are classified into 21 categories ranging from A1, the highest quality rating, through G3, the lowest. Table 4 shows gender patterns after controlling for credit rating, that is, after collapsing the full scoring set into 7 categories, from A down to G, A-grade being the safest.

Table 4: Means and Standard Errors for variables: LRATE, LTERM and RQAMT grouped by Grade

 

| Variable/grade | Men M | Men SE | Men n | Women M | Women SE | Women n |
|---|---|---|---|---|---|---|
| LOAN RATE | | | | | | |
| A | 9.65% | .16% | 28 | 10.23% | .29% | 9 |
| B | 13.31% | .12% | 41 | 13.40% | .18% | 16 |
| C | 16.04% | .07% | 133 | 16.05% | .10% | 66 |
| D | 19.07% | .05% | 229 | 19.02% | .08% | 111 |
| E | 21.87% | .06% | 233 | 21.83% | .07% | 107 |
| F | 24.68% | .07% | 102 | 24.78% | .09% | 68 |
| G | 26.90% | 2.1E-17% | 8 | 27.40% | .17% | 10 |
| LOAN TERM | | | | | | |
| A | 30.00 | 1.799 | 28 | 32.00 | 2.828 | 9 |
| B | 33.65 | .861 | 41 | 32.25 | 2.112 | 16 |
| C | 30.56 | .704 | 133 | 30.67 | 1.042 | 66 |
| D | 30.48 | .573 | 229 | 29.46 | .885 | 111 |
| E | 29.74 | .591 | 233 | 30.30 | .851 | 107 |
| F | 33.46 | 1.442 | 102 | 29.29 | 1.157 | 68 |
| G | 33.00 | 3.000 | 8 | 29.90 | 5.740 | 10 |
| Amount | | | | | | |
| A | 120,464.30 | 13,867.62 | 28 | 136,666.70 | 30,092.45 | 9 |
| B | 91,560.98 | 10,705.65 | 41 | 101,187.50 | 16,825.13 | 16 |
| C | 83,630.08 | 5,391.32 | 133 | 93,878.79 | 9,471.35 | 66 |
| D | 93,310.04 | 4,467.80 | 229 | 80,666.67 | 6,551.46 | 111 |
| E | 101,055.80 | 4,872.76 | 233 | 84,943.93 | 6,651.94 | 107 |
| F | 129,098.00 | 7,910.49 | 102 | 105,147.10 | 10,590.73 | 68 |
| G | 140,375.00 | 19,858.37 | 8 | 112,600.00 | 25,800.60 | 10 |

Notes: Credit ratings (Grades) collapse the full scoring set (21 categories) into 7 categories, from A down to G, A-grade being the safest. N = 1,161 loans approved.
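The collapsing step described in the notes to Table 4 can be sketched as follows. This is an illustration only: it assumes that notch labels are two-character strings such as "A1" through "G3" and that the ordinal GRADE variable runs from 1 (grade A) to 7 (grade G). The function name `collapse_notch` is hypothetical and does not come from the platform.

```python
GRADES = "ABCDEFG"  # A is the safest grade, G the riskiest


def collapse_notch(notch):
    """Map a notch label such as 'C2' to (grade letter, ordinal 1-7)."""
    label = notch.strip().upper()
    if len(label) != 2 or label[0] not in GRADES or label[1] not in "123":
        raise ValueError(f"unexpected notch label: {notch!r}")
    letter = label[0]
    # Assumed coding: A -> 1, B -> 2, ..., G -> 7
    return letter, GRADES.index(letter) + 1


# The full scoring set: 7 grades with 3 notches each
all_notches = [f"{g}{k}" for g in GRADES for k in (1, 2, 3)]
collapsed = {n: collapse_notch(n) for n in all_notches}
print(len(all_notches), collapsed["A1"], collapsed["G3"])
```

Keeping both the letter and the ordinal code makes it possible to group descriptive tables by grade while still entering GRADE as an ordered variable in a regression.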

As can be observed in Table 4, for loans of the same quality, women in general pay higher rates and borrow greater amounts than men, especially in the safest categories.

Defaulted loans are those that are charged-off, refinanced and/or late in payment. Good status loans are loans that are fully paid or current in their payment schedule. Accordingly, the rate of default in this article is defined as: $\text{Default rate} = \dfrac{\text{Defaulted loans}}{\text{Defaulted loans} + \text{Good status loans}}$. Default is the dependent variable (DV) in this study; it is a binary coded response variable equal to 1 if the loan has defaulted and 0 otherwise. The simple mean rate of default is 11.76% for men and 10.07% for women. Classified by grade, as can be observed in Table 5, there are no significant differences by gender in actual default rates.

Table 5: Results of t-test and Descriptive Statistics for Default by gender, classified over Credit Grades

 

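| Outcome | Men M | Men SE | Men n | Women M | Women SE | Women n | t | df [a] |
|---|---|---|---|---|---|---|---|---|
| A | 0 | 0 | 28 | 0 | 0 | 9 | -- | -- |
| B | .049 | .03 | 41 | .063 | 0.25 | 16 | .19 | 24 |
| C | .070 | .02 | 133 | .030 | 0.02 | 66 | -1.64 | 187 |
| D | .122 | .02 | 229 | .100 | 0.03 | 111 | -.65 | 236 |
| E | .120 | .02 | 233 | .150 | 0.03 | 107 | .73 | 189 |
| F | .210 | .04 | 102 | .130 | 0.04 | 68 | -1.27 | 159 |
| G | .125 | .13 | 8 | .000 | 0.00 | 10 | -1.00 | 7 |

Notes: [a] df: Satterthwaite’s degrees of freedom. *p < .05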
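The coding of the Default variable and the per-grade default rates summarized in Table 5 can be sketched as below. The status labels (`charged_off`, `refinanced`, `late`, `fully_paid`, `current`) are hypothetical stand-ins for whatever values the platform actually records, and the toy records are made up for illustration; they are not data from the study.

```python
from collections import defaultdict

# Hypothetical status labels; the platform's actual field values are not given.
DEFAULT_STATUSES = {"charged_off", "refinanced", "late"}
GOOD_STATUSES = {"fully_paid", "current"}


def code_default(status):
    """Return 1 for a defaulted loan and 0 for a loan in good status."""
    if status in DEFAULT_STATUSES:
        return 1
    if status in GOOD_STATUSES:
        return 0
    raise ValueError(f"unknown loan status: {status!r}")


def default_rate_by_grade(loans):
    """loans: iterable of (grade, status) pairs -> {grade: default rate}."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for grade, status in loans:
        totals[grade] += 1
        defaults[grade] += code_default(status)
    # Rate = defaulted loans / (defaulted loans + good status loans)
    return {g: defaults[g] / totals[g] for g in totals}


# Made-up records for illustration only, not data from the study
toy_loans = [("B", "current"), ("B", "late"), ("C", "fully_paid"),
             ("C", "current"), ("C", "charged_off"), ("C", "current")]
rates = default_rate_by_grade(toy_loans)
print(rates)
```

Raising an error on an unrecognized status, rather than silently coding it as 0, avoids understating the default rate when a new status label appears in the data.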

4.     RESEARCH HYPOTHESES AND TEST METHODOLOGY

            The research question addressed in this study is: what are the elements that could help lenders characterize default risk in P2P loans? Given that on-line investors are exposed to the totality of credit risk, lending web-sites must provide, besides credit scoring, pertinent information for overcoming the adverse effects of IA. Default probabilities by grade are not directly observed by backers; it is therefore believed that they can rationally benefit from statistical discrimination by factoring in information believed to be correlated with default probabilities.

            From a set of collected variables, mostly available to the on-line community, this study attempts to identify which are pertinent for assessing default rates. Previous studies on P2P non-payment behavior guide our initial variable selection (EMEKTER et al. 2015; GARCÍA et al. 2015; WEISS et al., 2010; SERRANO-CINCA et al. 2015).

            In our study, bearing in mind the lender's perspective and the fact that the platform has already factored all the information collected into a scoring model, we test whether variables related to loan characteristics, demographics, loan performance on the platform, credit purpose, income and debt servicing capacity, besides credit scores, are correlated with default behavior in the sample.

            The first hypothesis to be tested is: Grades, as derived from the platform's credit scoring model, are determinants of default behavior in our P2P lending case.

            The second, more general hypothesis is: Lenders benefit from statistical discrimination, in the sense given by Schwab (1986), by factoring in variables related to loan and demographic characteristics of the applicants (CONSDEBT, PMT_Income, Female, Paid_capital, OWNHOUSE, Credit_Mos (time effect), REFINANCED) in addition to the credit scoring variable (GRADE) provided by the lending web-site. In this case, the benefit for investors lies in attaining better knowledge of debtors' non-payment behavior.
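In a logistic framework, the second hypothesis amounts to asking whether such covariates shift the odds of default once GRADE is included. The sketch below illustrates the mechanics only: the coefficient values are placeholders chosen for illustration and are not estimates from this study, only a subset of the listed variables is used, and the helper names `default_probability` and `odds_ratio` are hypothetical.

```python
import math


def default_probability(intercept, coefs, x):
    """Logistic model: P(default = 1 | x) = 1 / (1 + exp(-(b0 + sum(b_j * x_j))))."""
    z = intercept + sum(coefs[name] * x[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(beta):
    """Multiplicative change in the odds of default per unit increase in a covariate."""
    return math.exp(beta)


# Placeholder coefficients for illustration; they are NOT estimates from the study.
coefs = {"GRADE": 0.3, "PMT_Income": 1.0, "REFINANCED": 0.8, "Female": -0.2}
# A hypothetical borrower: grade D (coded 4), payment-to-income of 18%, not refinanced, female
borrower = {"GRADE": 4, "PMT_Income": 0.18, "REFINANCED": 0, "Female": 1}
p = default_probability(-3.5, coefs, borrower)
print(round(p, 3), round(odds_ratio(coefs["Female"]), 3))
```

A coefficient above zero corresponds to an odds ratio above 1 (higher odds of default), and a coefficient below zero to an odds ratio below 1, which is how statements such as "reduces the odds of default" are read from an estimated model.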

            The third hypothesis is: The gender variable, specifically whether the loan applicant is female, is correlated with default behavior. Therefore, lenders may benefit from screening loans by sex, using taste-based discrimination. To identify the case discrimination effect of gender, we need to control for credit quality. This hypothesis is therefore also tested after controlling for GRADE, as derived from the platform’s credit scoring model.

            Our hypothesis testing relies on a reduced-form model of $E(Y \mid X)$, the expected value of $Y$ given $X$. In our case, $E(Y \mid X)$ is the probability of default as a function of a set of available information about the borrower. Following Aguilera et al. (2006), the logistic regression model used for testing the hypotheses is defined in the following way: Let $X_1, X_2, \ldots, X_p$ be a set of continuous or categorical observed variables and let us consider n observations of those variables represented in the matrix $X = (x_{ij})_{n \times p}$.  Let Y =