Binary Logistic Regression Modeling on Net Income of Pagar Alam Coffee Farmers

Pagar Alam Coffee is a Besemah coffee originating from the smallholder plantations in South Sumatra, Indonesia. Most Pagar Alam coffee farming is a hereditary business. Coffee farmers' income depends heavily on coffee production, production costs, and coffee prices. This study aims to obtain a probability model of Pagar Alam coffee farmers' income based on the factors that influence it. The independent variables studied were the number of dependents, economic conditions, number of trees, age of trees, frequency of fertilizer use, frequency of pesticide use, production at harvest time, production outside harvest time, number of women workers from outside the family, minimum price of coffee, maximum price of coffee, farmers' gross income, and land productivity. The data from 179 respondents were modeled using binary logistic regression. Three methods were used: the enter, forward, and backward methods. The model from the enter method gives the highest prediction accuracy, 87.7%. The factors that have a significant influence on the net income of Pagar Alam coffee farmers are gross income, land productivity, and the number of women workers from outside the family. The most influential variable is gross income.


Introduction
Coffee is one of the mainstay export commodities of Indonesia's plantation sector. Based on export value in 2017, Indonesia is among the 10 largest coffee exporting countries in the world. This can be seen from the results of the selection of leading export commodities (winning commodities) using several analytical methods, namely Computable General Equilibrium (CGE) and Export Product Dynamics (EPD) (data source: Indonesia Eximbank Institute and UNIED, 2019, in [1]). Indonesia is the fourth largest coffee producer in the world, with 6.84% of world coffee production. The provinces that contribute the most to Indonesia's coffee production are South Sumatra, Lampung, North Sumatra, Bengkulu, and Aceh.
In the WTO Series Webinars and Trade Policy Analysis on June 8-14, 2020, one of the speakers, Dedi Budiman Hakim, said that in 2017 the number of smallholder estates was estimated to have decreased by 0.07%, while state-owned and private estates rose by 0.07% and 0.18%, respectively. The volume of coffee produced by smallholder plantations in 2017 was predicted to reach 599,902 tons, with production growth decreasing by 0.37% compared to 2016, while the production volumes of state and private plantations increased by 0.42% and 2.36%, respectively. The decline in Indonesian coffee production was caused by weather that did not support production activities. In addition, some farmers' lack of knowledge about coffee plantations and the high price of fertilizers have kept Indonesia's coffee production from reaching its maximum level.
The world coffee price is projected to increase because of growing demand from retail coffee shops, which is on a positive trend following a change in coffee drinking culture. In 2019, it is projected that Indonesian coffee exports to the United States and several other major destination countries will again grow positively in line with the increase in domestic production. Projections can be higher than expected. This study aims to obtain a probability model based on factors that are significantly related to net income. The data used are based on [6]. The method used to determine the net income probability model is logistic regression. The estimated model in this paper can serve as a recommendation for relevant agencies that need information about the factors that simultaneously influence the net income of coffee farmers in Pagar Alam. It can also be an input for policy makers and related institutions to improve the welfare of coffee farmers through the optimization of production factors that affect farmers' incomes.
Logistic regression is a regression analysis used to describe the relationship between a dependent variable that is dichotomous (nominal or ordinal scale with 2 categories) or polychotomous (nominal or ordinal scale with more than 2 categories) and a set of predictor variables that are continuous or categorical [8]. If the dependent variable consists of two categories (binary), the logistic regression is called binary logistic regression [9,10].
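For reference, the binary logistic model described above is commonly written in the following standard form. The coefficient symbols \beta_0, ..., \beta_p and the number of predictors p are generic notation assumed here, not values taken from this study; \pi(x) denotes the probability that the dependent variable equals 1 (here, high net income).

```latex
\pi(x) = \frac{\exp\left(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p\right)}
              {1 + \exp\left(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p\right)},
\qquad
g(x) = \ln\frac{\pi(x)}{1 - \pi(x)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p .
```

The logit g(x) is linear in the predictors, which is why positive coefficients raise the probability \pi(x).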
One application of binary logistic regression models is given in [11]. There, the model is used to analyse the characteristics of songket craftsmen in Ogan Ilir Regency, so that the potential of the craftsmen and the factors that directly influence their productivity can be identified. Factors that strongly influence productivity can be recommended to policy makers for improving the welfare of craftsmen. The same can be obtained from the probability model of the factors that affect the income of coffee farmers.

Results and Discussion
The variables used in this paper are significantly related to the net income of Pagar Alam coffee farmers according to the results of correspondence analysis [6]. With the variables defined as in Table 1, the data matrix of 179 respondents was processed using SPSS version 24. Some of the outputs from data processing using the enter method are shown in Table 2. Based on Table 2, 179 respondents were analyzed and no data were missing. In Table 3, the dependent variable is coded 0 for low net income and 1 for high net income. Based on Table 5, the Wald test rejects H0, so there are independent variables that affect farmers' net income.

Omnibus Tests of Model Coefficients (columns: Chi-square, df, Sig.)

Model Summary
Similarly, the Nagelkerke R-square value of 0.719 indicates that the independent variables in the model explain 71.9% of the variation between high and low net income of a farmer. The Classification Table in Table 8 shows how well the model classifies net income cases into the 2 categories. The overall prediction accuracy of the model is 87.7%. This value is obtained by dividing the number of correctly predicted cases by the number of respondents. The model is therefore better than the constant-only model, whose accuracy in predicting the net income category of Pagar Alam coffee farmers is only 63.1%. The prediction accuracies for low and high net income farmers are 72.7% and 96.5%, respectively.
Substituting the category values of the significant variables into the model gives the probability π(x). Based on the value of π(x) obtained, the probability of high net income for coffee farmers who do not employ women workers from outside the family (TKWL), have a gross income of 10 to 25 million rupiahs, and have land productivity in the category considered is 74.04%.
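The link between a fitted logit and the reported probability can be illustrated with a minimal Python sketch. The fitted coefficients are not available in this text, so the sketch works backwards from the reported 74.04%: the function names `logistic` and `logit` and the variable names are illustrative assumptions, not part of the study's SPSS output.

```python
import math

def logistic(g):
    """Map a logit (linear predictor) g to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-g))

def logit(p):
    """Inverse of logistic(): map a probability p to its log-odds."""
    return math.log(p / (1.0 - p))

# Probability of high net income reported in the text for one farmer profile.
p_reported = 0.7404

# Odds and log-odds implied by that probability (derived here, not reported).
odds_implied = p_reported / (1.0 - p_reported)
g_implied = logit(p_reported)

print(f"implied odds  : {odds_implied:.3f}")
print(f"implied logit : {g_implied:.3f}")
print(f"round trip    : {logistic(g_implied):.4f}")
```

Under these assumptions, a farmer with this profile is a little under three times as likely to have high net income as low net income.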
The results of calculating the probability of high net income for each combination of categories of the three variables can be seen in Table 10. Based on Table 10, the higher the category of each independent variable, the higher the probability of high net income. The most influential variable is gross income, while the variable with the smallest influence on the model is TKWL.
In each TKWL category, the probability values for gross income category 4 combined with land productivity categories 1 to 4 are listed in Table 10.
The same holds, in each TKWL category, for gross income category 3 combined with land productivity categories 1 to 4, and for gross income category 2 combined with land productivity category 4.
In this case, an increase in the net income of coffee farmers is associated with high gross income, a higher number of women workers from outside the family, and high land productivity. In each TKWL category, if the gross income category is 1, then regardless of the land productivity category, the probability of high net income is very small.
The results when data processing uses the forward method are shown in Table 11.
Table 11. Some outputs using the forward stepwise method

Model Summary
Based on Table 11, the forward stepwise method (2 steps) yields two independent variables that have a significant effect on net income, namely gross income and TKWL. The overall prediction accuracy is 86.6%. This percentage is lower than the accuracy of the model from the enter method.
The results when data processing uses the backward method are shown in Table 12. Based on Table 12, the backward stepwise method (11 steps) yields 3 independent variables that have a significant effect on net income, namely gross income, land productivity, and TKWL. The overall prediction accuracy is 86.6%.
This percentage is the same as the accuracy of the model using the forward method, but lower than that of the model generated by the enter method. Table 13 shows a recapitulation of the data processing results based on the 3 methods.

Method      Accuracy (%)    Significant variables
Enter       87.7            X9, X12, and X13
Forward     86.6            X9 and X12
Backward    86.6            X9, X12, and X13
Note: X9: TKWL, X12: Gross income, and X13: Land productivity

Based on Table 13, the accuracy of the model resulting from the enter method is the greatest. The coefficients of the independent variables in this model are also the highest among the three models.
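The accuracy figures reported for the enter method can be checked with a small Python sketch of how a classification table turns into percentages. The cell counts below are not reported in this text: they are an assumption, inferred as the only whole-number counts found to reproduce the reported 63.1%, 72.7%, 96.5%, and 87.7% for 179 respondents.

```python
# ASSUMED counts (inferred from the reported percentages, not taken from Table 8).
n_total = 179
n_low, n_high = 66, 113            # observed low / high net income farmers
correct_low, correct_high = 48, 109  # cases the model classifies correctly

def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as in the reported figures."""
    return round(100.0 * numerator / denominator, 1)

# Constant-only model: every farmer is assigned to the larger category.
baseline = pct(max(n_low, n_high), n_total)

# Per-category and overall accuracy of the fitted model.
acc_low = pct(correct_low, n_low)
acc_high = pct(correct_high, n_high)
overall = pct(correct_low + correct_high, n_total)

print(f"constant-only accuracy : {baseline}%")
print(f"low net income         : {acc_low}%")
print(f"high net income        : {acc_high}%")
print(f"overall                : {overall}%")
```

Because the two categories are unequal in size, the overall accuracy is a weighted average that sits closer to the high net income accuracy than to the low net income accuracy.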

International Journal of Applied Sciences and Smart Technologies
The resulting model contains the gross income variable, which has the greatest effect on net income. Net income is gross income minus the production costs incurred by farmers. Production costs include land management, crop maintenance, labor, and other costs. Wages are usually paid to workers from outside the family, both men and women. Women workers are paid for picking coffee fruit.
Plant maintenance includes fertilizer application and weed control with herbicides. Tillage also supports crop maintenance. Crop maintenance costs also relate to the age and number of trees. Older trees need better maintenance so that the roots remain sturdy, and they also need to be rejuvenated. The more trees, the greater the maintenance costs. Plant spacing that is too tight can reduce production. In this case, the frequency of fertilizer use, frequency of pesticide use, number of trees, and age of trees are related to production costs. Production costs are contained in the gross income variable.

Conclusion
This study concludes that the factors that have a significant influence on the net income of Pagar Alam coffee farmers are gross income, land productivity, and the number of women workers from outside the family.
Simultaneously, the gross income (X12), land productivity (X13), and number of women workers from outside the family (X9) variables affect net income through the binary logistic probability model. All the coefficients of the significant variables are positive, so these variables increase the probability value of the model. If each variable category gets higher, the probability of high net income is higher. In each TKWL category, the probability values for gross income category 4 and land productivity categories starting from 1 are those reported in Table 10.
This study does not describe the variables related to land productivity. Because land productivity was found to have a significant effect in the binary logistic regression model, it is necessary to examine the indirect relationship between farmers' net income and the variables that have no significant effect in the model. This calls for further analysis using path analysis of the indirect effects of the number of trees, frequency of fertilizer use, frequency of pesticide use, crop production, land area, area per tree, and length of harvest time.