Determinants of using formal vs informal financial sector in BRICS group

The determinants of the usage of the formal versus the informal financial sector within the BRICS countries are analysed. Regression tree and probit methods are applied to a subset of observations from the 2021 Global Findex database. The results of these different methods are robust and complement each other. The main findings are: (a) individuals with regular income have a higher probability of using the formal financial sector; (b) there is a nonlinear relationship between age and the choice of financial sector channel: individuals above 36 are less likely to use the informal channel but more likely to use the formal channel.


Introduction
Financial inclusion is an enabler for seven of the 17 Sustainable Development Goals (SDGs) of the United Nations (Bathula and Gupta, 2021). The World Bank advocates financial inclusion as a tool to reduce poverty and inequality while advancing economic development (Daud and Ahmad, 2023). It defines financial inclusion as access to and use of affordable financial products and services. Financial institutions, however, can be classified as informal or formal: the former are based on interpersonal relationships, whereas the latter depend on anonymous interaction between a client and a regulated formal institution.
This paper uses the most comprehensive individual-level data available, from the Global Findex database of the World Bank. Our objective is to identify and quantify the factors driving the use of the formal or informal financial sectors in Brazil (BRA), Russia (RUS), India (IND), China (CHN), and South Africa (ZAF). The following research questions are addressed: (1) What are the factors determining financial inclusion in BRICS countries? To answer this question, descriptive statistics and the regression tree method are applied to find who is most likely to carry out financial transactions. (2) What are the factors determining the choice of either formal or informal financial services in the BRICS countries for savings and borrowing? The rest of the paper is structured as follows: Section 2 presents the data and methodologies applied, Section 3 focuses on the empirical results and discussion, and Section 4 concludes the study.

Data and methodology
The analysis uses 2021 individual-level data on the BRICS nations from the World Bank's Global Findex database. Of the individuals in the sample, 55% had borrowed and 48% had saved in the year before the survey. Most of them used the formal sector; use of the informal sector is less widespread. In the analysis, the main variables of interest are dummies describing whether an individual saves or borrows, either formally or informally. The "financial transaction" variable takes the value 1 if the individual either saves or borrows. In addition, data on the demographic, educational and income characteristics of the individuals are used to identify and quantify the most important factors behind this decision. Table A1 in the appendix describes the seven financial inclusion measures that enter our models as dependent variables, together with the explanatory variables. Table A2 in the appendix shows the main descriptive statistics of the variables.
The analysis uses several methods to identify the factors behind the abovementioned variables. First, the conditional inference tree ('Ctree') algorithm is applied, which estimates a regression relationship by binary recursive partitioning of the response variable in a conditional inference framework (Hothorn et al., 2006; Kuhn and Johnson, 2018). In this procedure, the dataset is randomly partitioned into two subsamples, a test and a train sample with 7486 and 2981 observations, respectively. The regression tree method is particularly important because it reveals that certain characteristics are often accompanied by different behaviour. Ctree models can handle different types of data and, unlike linear models, do not require the functional form of the relationship between the predictors and the response to be specified in advance (Hothorn et al., 2006; Kuhn and Johnson, 2018).
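The random partition described above can be illustrated with a minimal Python sketch. It is not the authors' code (the tree in Fig. 1 is reported as R output); the function name `split_sample`, the seed, and the placeholder respondent IDs are assumptions made for illustration. Only the two subsample sizes are taken from the text.

```python
import random


def split_sample(ids, n_train, seed=0):
    """Randomly partition ids into disjoint train and test lists.

    The seed is a hypothetical choice; the paper does not report one.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Subsample sizes as reported in the text.
N_TRAIN, N_TEST = 2981, 7486

# Placeholder respondent IDs standing in for the Findex observations.
respondent_ids = range(N_TRAIN + N_TEST)

train_ids, test_ids = split_sample(respondent_ids, N_TRAIN)
print(len(train_ids), len(test_ids))
```

Fixing the seed makes the partition reproducible, which matters here because the node counts reported for the tree depend on which observations fall into each subsample.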
From the initial results of the regression tree, a probit model is specified to quantify the importance and significance of each explanatory variable. Both saving and borrowing behaviour turn out to be highly influenced by having a regular income, as theory predicts, so the endogeneity of this explanatory variable cannot be excluded. An instrumental variable probit model is therefore estimated to understand how this endogeneity influences the results. The income variable ("receive wage") of the first-stage regression is a dummy variable indicating whether the individual received a wage in the previous year. It is instrumented with another dummy, "mobile phone", indicating whether the individual owns a mobile phone.

Regression tree
Fig. 1 shows the decision tree, drawn top-down with 15 nodes and obtained from the test sample. Terminal nodes are represented by shaded box plots (inter-quartile range) of our response variable (Financial Transaction), which takes a value of 1 or 0. Node 1 represents the most important variable in our tree. Nodes 1 and 2 suggest that the formal and informal financial sectors are substitutes in the BRICS. These results differ from those of Sibindi and Mpofu (2022) for Nigeria, where the formal financial sector complements the informal. At node 1, individuals are split into two groups: those who use the formal sector for financial transactions go to node 15, which holds about 3311 observations from the test sample (45%). Terminal node 14 contains 1215 individuals (16% of the test sample) who opt only for informal financial transactions.
Individuals who did not use either sector to borrow or save can still carry out financial transactions indirectly through the mobile phone channel (node 3). From node 3, individuals without a mobile phone go to terminal node 4, which represents individuals who do not carry out financial transactions. Terminal node 13 represents individuals who carry out financial transactions using a mobile phone (node 3); these individuals also receive a wage (node 5) and fall above the middle-income quintile (node 11). These results support the argument that higher income and owning a mobile phone increase the probability of having access to digital finance (Bathula and Gupta, 2021; Pandey et al., 2023). Finally, terminal nodes 7, 9, 10 and 12 show that individuals who own a mobile phone but have lower income, lower education levels, and no regular income are excluded from financial transactions. Overall, the tree results are in line with previous studies (Bathula and Gupta, 2021; Demirgüç-Kunt et al., 2016; Goodstein and Rhine, 2017). Individual attributes have the larger effect on financial transactions: owning a mobile phone alone does not translate into a financial transaction unless the individual receives a wage within the high-income quintile.

Regression results
Regression analysis is used to detect how different factors influence the usage of the financial sector. First, financial inclusion is analysed without distinguishing between borrowing and saving. This analysis is then complemented by an examination of the differences between borrowing and saving decisions.

Probit models on financial inclusion
Individuals' relationship with the financial sector is heterogeneous: more than one in four people in the sample reports neither saving nor borrowing in either of the two sectors. The factors increasing the likelihood of performing any kind of financial transaction are not very surprising: people with higher education or higher income are more likely to actively manage their finances. Age has an inverted U-shaped relationship with using financial services: people older than 42 are less likely to be actively engaged with the financial sector. The estimated coefficients and standard errors of the probit model are reported in Table 1. The table shows the results for all three of our main variables of interest: actively using the formal financial sector (formal: regression 2), the informal financial sector (informal: regression 3), or either of them (financial services: regression 1).
The use of the formal financial sector is very similar to the general picture, in the sense that education and income level both play an even stronger role and the threshold age is also 42 years. The use of the informal financial sector, on the other hand, is different: young, less educated, and low-income people have a higher probability of using it. Receiving wage income regularly plays a key part in the usage of financial services; however, it is less important in the informal sector, as the estimated coefficient suggests. Higher-income families also tend to have less of a relationship with the informal sector. Gender differences appear in the regression: educated men are generally more likely than educated women to use services in both the formal and informal sectors, whereas less educated persons generally favour the informal sector. James (2015) found that the informal financial sector of South Africa caters for the demand for financial transactions from poorer relatives or neighbours, the less educated, and members of savings clubs.

Instrumented Probit models on financial inclusion
Financial decisions, whether saving or borrowing, are closely related to the level of income people have. The largest coefficients among the explanatory dummy variables belong to the receive_wage variable, which indicates that the individual has previously received labour income. The endogeneity of the wage variable is expected, since regular income can make financial services affordable to individuals. Endogeneity tests confirmed this hypothesis; therefore, the initial regression was repeated using an instrumental variable (IV) probit method. In the IV probit, mobile phone ownership was used as an instrument.
Some financial services are available on mobile phones, and these are partially captured in the initial dataset. Some individuals in four of the countries (all except China) reported owning mobile money accounts or using them to borrow. In the total sample, 18 percent reported having a mobile money account and only 2 percent used it to borrow. There is no evidence that mobile phone ownership and financial inclusion are directly linked. On the other hand, mobile phone ownership can be used as an instrumental variable to predict receive_wage.
Guided by the literature (e.g. Allen et al., 2016; Bathula and Gupta, 2021), tertiary education has been removed from the financial services equation and used as an additional instrument. Educational level is assumed to influence the usage of financial services only through its impact on the labour market opportunities of the individuals, namely a higher probability of employment and a higher level of income.
The first-stage regressions on receive_wage have highly significant explanatory variables, including age, age squared, country dummies, mobile phone ownership, tertiary education, primary education, and gender (Table A3 in the appendix). The instrumented receive_wage variable was then used to predict the usage of financial services in both the formal and the informal sector. The results of the IV probit regressions are shown in Table 2.
The analysis supports the role of income in using financial services: individuals with regular income (receive wage) have a higher probability of using the formal financial sector. This variable is again the main factor behind the choice of channel. Another difference compared to the regular probit regression is that the age threshold seems to be different: individuals above 36 are less likely to use the informal channel but more likely to use the formal channel. Because of this mixed effect, the age variables are not significant in regression (4). Men are more likely than women to choose the informal channel; however, there seems to be no difference between higher-educated and lower-educated women.
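One common way to write down a two-equation setup of this kind is sketched below. The notation is introduced here for illustration and is an assumption, not the paper's own: $w_i$ stands for receive_wage, $z_i$ for the instruments (mobile phone ownership and tertiary education), $x_i$ for the remaining controls, and $y_i$ for the financial services outcome. A linear first stage and jointly normal errors are also assumed; the exact estimator used by the authors is not stated in the text.

```latex
\begin{align}
  w_i   &= z_i'\pi + x_i'\gamma + v_i
        && \text{(first stage)} \\
  y_i^* &= \alpha\, w_i + x_i'\beta + u_i, \qquad
  y_i = \mathbf{1}\left[\, y_i^* > 0 \,\right]
        && \text{(outcome equation)} \\
  (u_i, v_i) &\sim \mathcal{N}\!\left(0,\;
      \begin{pmatrix} 1 & \rho\,\sigma_v \\ \rho\,\sigma_v & \sigma_v^2 \end{pmatrix}
  \right)
\end{align}
```

In this formulation, endogeneity of the wage variable corresponds to $\rho \neq 0$, and the instruments $z_i$ enter the outcome equation only through $w_i$.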

Instrumented probit models on borrowing and saving decisions
Financial access does not guarantee that borrowing and saving are equally available or equally demanded. Separate IV probit regressions are used to detect the differences between borrowing and saving. Table A4 in the appendix shows the first-stage regression results. Using the same specification as for the financial transactions variable, the following general picture emerges (see Table 3 below). Regular income plays a dividing role in the decisions: people with regular income are more likely to save and to choose the formal sector for their savings. They are also more likely to borrow (but to a smaller degree), mainly from the formal sector. Borrowing from the informal sector is negatively related to receiving income: people with regular income are likely to avoid this financial source. Similar to our findings, Babajide (2011) finds strong links between the formal and informal financial sectors in the savings market but not in the credit market using Nigerian data.

Table 1
Probit regression on using formal vs informal financial sectors.

The age thresholds, as defined by the coefficients of the linear and quadratic terms, show a very intuitive picture. Borrowing follows an inverted U-shaped pattern: after a certain age borrowing declines. There is no difference between the formal and the informal sector in this regard. However, the threshold age differs: it is lower in the informal sector (36) than in the formal sector (40). Saving in the informal sector shows a similar story: above 43 years of age, people are less likely to use it. However, the usage of the formal financial sector for savings follows a U-shaped pattern: people above 36 are more likely to save there, and every additional year increases the probability of saving in the formal sector. Apart from the few years in which the intervals overlap, the pattern that emerges from the regression confirms that the informal sector is more likely to be used by the younger generations.
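How a threshold age follows from the linear and quadratic age coefficients can be shown with a short Python sketch. For an index of the form b_age * age + b_age_sq * age^2, the turning point is at -b_age / (2 * b_age_sq); because the probit link is monotone, the predicted probability turns at the same age. The coefficient values below are hypothetical and chosen only to reproduce a threshold of 36; they are not taken from the paper's tables, and the function names are illustrative.

```python
def turning_point(b_age, b_age_sq):
    """Age at which b_age*age + b_age_sq*age**2 reaches its peak or trough."""
    if b_age_sq == 0:
        raise ValueError("no turning point without a quadratic term")
    return -b_age / (2.0 * b_age_sq)


def shape(b_age_sq):
    """Negative quadratic term: inverted U. Positive quadratic term: U."""
    return "inverted U" if b_age_sq < 0 else "U"


# Hypothetical coefficients (linear, quadratic), NOT the paper's estimates.
examples = {
    "informal borrowing (hypothetical)": (0.072, -0.001),
    "formal saving (hypothetical)": (-0.036, 0.0005),
}

for label, (b1, b2) in examples.items():
    print(f"{label}: {shape(b2)}, threshold age ~ {turning_point(b1, b2):.0f}")
```

The sign of the quadratic coefficient decides whether the probability rises and then falls with age (inverted U) or falls and then rises (U), which is the distinction drawn between the informal and formal channels.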

Conclusion
Results show that people with a low education level and without regular income save less, borrow more, and have a higher probability of turning to the informal financial sector. In the dataset, these decisions are mainly carried out by male respondents, which might reflect the influence of social norms and cultural values. In China, lower levels of financial inclusion for women are associated with a high level of male dominance in bank account ownership (Pandey et al., 2023). More specifically, there is evidence of an inverted U-shaped relationship between age and transacting in the informal financial sector, as individuals above 36 years old are less likely to use this channel. Intuitively, individuals above 36 have stable careers with stable salaries and thus opt for the formal sector to save and borrow. For the formal sector, on the other hand, a U-shaped relationship has been detected. Receiving wages is an important variable for financial transactions in both sectors: individuals who receive a wage save and borrow through the formal sector.
The results of this study provide an understanding of how individual factors determine the level of both formal and informal financial inclusion. More specifically, the implications of our findings regarding the factors driving informal and formal financial transactions can be incorporated when building specialized policies for enhancing financial inclusion.

Declaration of Competing Interest
The authors declare no conflict of interest.

Fig. 1. Regression tree: factors determining the decision to participate in financial transactions. Source: R output.

Table 2
Instrumental variable probit regressions on using the formal vs informal financial sectors.