Title: | Local and Global Beta Regression |
---|---|
Description: | Fit a regression model for when the response variable is presented as a ratio or proportion. This adjustment can occur globally, with the same estimate for the entire study space, or locally, where a beta regression model is fitted for each region, considering only influential locations for that area. Da Silva, A. R. and Lima, A. O. (2017) <doi:10.1016/j.spasta.2017.07.011>. |
Authors: | Roberto Marques [aut, cre], Alan da Silva [aut] |
Maintainer: | Roberto Marques <[email protected]> |
License: | GPL-3 |
Version: | 1.0.5 |
Built: | 2024-10-31 21:09:45 UTC |
Source: | https://github.com/romarq23/gwbr |
Fits a global regression model using the beta distribution, recommended for rates and proportions, via maximum likelihood using a parametrization with mean (transformed by the link function) and precision parameter (called phi). For more details see Ferrari and Cribari-Neto (2004).
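The mean and precision parametrization described above can be sketched in a few lines of base R. This is an illustration only, not the package's internal code: the function name `beta_loglik`, the simulated data, the logit link and the use of `optim()` are all assumptions made for the example. With mean `mu` and precision `phi`, the beta shape parameters are `mu * phi` and `(1 - mu) * phi`.

```r
# Log-likelihood of a beta regression with a logit link (illustrative sketch).
# theta = c(regression coefficients, log(phi)); working with log(phi)
# keeps the precision parameter positive during optimization.
beta_loglik <- function(theta, X, y) {
  k <- ncol(X)
  eta <- drop(X %*% theta[seq_len(k)])  # linear predictor
  mu <- plogis(eta)                     # inverse logit gives the mean in (0, 1)
  phi <- exp(theta[k + 1])              # precision parameter
  sum(dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi, log = TRUE))
}

# Simulated data, for illustration only.
set.seed(1)
n <- 500
x <- runif(n)
X <- cbind(1, x)
mu_true <- plogis(-1 + 2 * x)
y <- rbeta(n, mu_true * 20, (1 - mu_true) * 20)

# optim() minimizes, so the negative log-likelihood is passed.
fit <- optim(c(0, 0, 1), function(th) -beta_loglik(th, X, y), method = "BFGS")
beta_hat <- fit$par[1:2]
phi_hat <- exp(fit$par[3])
```

A larger `phi` means less dispersion around the fitted mean, which is why it is called a precision parameter.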
betareg_gwbr( yvar, xvar, data, link = c("logit", "probit", "loglog", "cloglog"), maxint = 100 )
yvar |
A vector with the response variable name. |
xvar |
A vector with descriptive variable(s) name(s). |
data |
A data set object with |
link |
The link function used in modeling. The options are: "logit", "probit", "loglog" and "cloglog". |
maxint |
Maximum number of iterations used to numerically maximize the log-likelihood function when searching for the estimators. The default is 100. |
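As a rough guide to the four link names, the sketch below writes each link and its inverse in base R. The list `links` is a hypothetical helper for this example, and the sign convention used for "loglog" (`-log(-log(mu))`) is an assumption; the package may define it differently.

```r
# Each link g maps a mean in (0, 1) to the real line; ginv maps it back.
links <- list(
  logit   = list(g = qlogis, ginv = plogis),
  probit  = list(g = qnorm,  ginv = pnorm),
  cloglog = list(g = function(mu) log(-log(1 - mu)),
                 ginv = function(eta) 1 - exp(-exp(eta))),
  # Assumed convention for the log-log link.
  loglog  = list(g = function(mu) -log(-log(mu)),
                 ginv = function(eta) exp(-exp(-eta)))
)

mu <- c(0.1, 0.5, 0.9)
# Applying g and then ginv should return the original means.
round_trip <- sapply(links, function(l) l$ginv(l$g(mu)))
```

The logit and probit links are symmetric around 0.5, while the two log-log variants are asymmetric, which can matter when the proportions cluster near 0 or 1.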
A list that contains:
parameter_estimates
- Parameter estimates.
phi
- Precision parameter estimate.
residuals
- Table with observed values (y), estimated values in classical regression (yhatcl), pure residual in classical regression (ecl), estimated values (yhat), the link function applied to the estimated values (eta), pure residual (res), standardized residual (resstd), standardized weighted residual 2 (resstd2), residual deviance (resdeviance), Cook's distance (cookD) and generalized leverage (glbp).
log_likelihood
- Log-likelihood of the fitted model.
aicc
- Corrected Akaike information criterion.
r2
- Pseudo R2 and adjusted pseudo R2 statistics.
bp_test
- Breusch-Pagan test for heteroscedasticity.
link_function
- The link function used in modeling.
n_iter
- Number of iterations used in convergence.
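For reference, the usual small-sample correction of the Akaike information criterion can be written as below. This is the textbook formula, shown as a sketch; the helper name `aicc_value` is hypothetical and the package may count parameters or compute the criterion differently.

```r
# Corrected AIC: -2 * logLik + 2k + 2k(k + 1) / (n - k - 1),
# where k is the number of estimated parameters and n the sample size.
aicc_value <- function(loglik, k, n) {
  -2 * loglik + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# Hypothetical numbers: log-likelihood of -100, 3 parameters, 50 observations.
example_aicc <- aicc_value(loglik = -100, k = 3, n = 50)
```

In a global beta regression with an intercept and two covariates, one plausible count is k = 4 (three coefficients plus phi); that count is an assumption here.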
data(saopaulo)
output_list <- betareg_gwbr("prop_landline", c("prop_urb", "prop_poor"), saopaulo)
## Parameters
output_list$parameter_estimates
## R2 and AICc
output_list$r2
output_list$aicc
Uses the Golden Section Search (GSS) algorithm to find the best bandwidth for geographically weighted regression. For more details see Da Silva and Mendes (2018).
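The general idea of a golden section search can be sketched as follows. This is a generic version of the algorithm, not the package's implementation: `golden_section_min` and the toy `criterion` function are hypothetical, and in practice the criterion would be the CV or AIC score of a model fitted with bandwidth `h`.

```r
# Minimize a unimodal function f on [lower, upper] by golden section search.
golden_section_min <- function(f, lower, upper, tol = 1e-6, max_iter = 200) {
  r <- (sqrt(5) - 1) / 2            # inverse golden ratio, about 0.618
  a <- lower
  b <- upper
  x1 <- b - r * (b - a)             # left interior point
  x2 <- a + r * (b - a)             # right interior point
  f1 <- f(x1)
  f2 <- f(x2)
  for (i in seq_len(max_iter)) {
    if (abs(b - a) < tol) break
    if (f1 < f2) {
      # The minimum lies in [a, x2]; the old x1 becomes the new x2.
      b <- x2; x2 <- x1; f2 <- f1
      x1 <- b - r * (b - a); f1 <- f(x1)
    } else {
      # The minimum lies in [x1, b]; the old x2 becomes the new x1.
      a <- x1; x1 <- x2; f1 <- f2
      x2 <- a + r * (b - a); f2 <- f(x2)
    }
  }
  (a + b) / 2
}

# Toy criterion with a single minimum at h = 100 (hypothetical).
criterion <- function(h) (h - 100)^2 + 5
h_best <- golden_section_min(criterion, lower = 10, upper = 500)
```

Each iteration reuses one of the two previous function evaluations, so only one new model fit is needed per step. The search assumes a single minimum inside the interval; with several local minima it can settle on any of them.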
gss_gwbr( yvar, xvar, lat, long, data, method = c("fixed_g", "fixed_bsq", "adaptive_bsq"), link = c("logit", "probit", "loglog", "cloglog"), type = c("cv", "aic"), globalmin = TRUE, distancekm = TRUE, maxint = 100 )
yvar |
A vector with the response variable name. |
xvar |
A vector with descriptive variable(s) name(s). |
lat |
A vector with the latitude variable name. |
long |
A vector with the longitude variable name. |
data |
A data set object with |
method |
Kernel function used to set the bandwidth parameter. The options are: "fixed_g", "fixed_bsq" and "adaptive_bsq". |
link |
The link function used in modeling. The options are: "logit", "probit", "loglog" and "cloglog". |
type |
Can be "cv" or "aic". |
globalmin |
Logical. If |
distancekm |
Logical. If |
maxint |
Maximum number of iterations used to numerically maximize the log-likelihood function when searching for the parameter estimates. The default is 100. |
A list that contains:
global_min
- Global minimum of the function, giving the best bandwidth (h).
local_mins
- Local minima of the function.
type
- Function used to estimate the bandwidth.
data(saopaulo)
output_list <- gss_gwbr("prop_landline", c("prop_urb", "prop_poor"), "y", "x", saopaulo, "fixed_g")
## Best bandwidth
output_list$global_min
Fits a local regression model for each location using the beta distribution, recommended for rates and proportions, using a parametrization with mean (transformed by the link function) and precision parameter (called phi). For more details see Da Silva and Lima (2017).
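In a geographically weighted model, each local fit weights the observations by their distance to the focal location. The sketch below shows the Gaussian and bisquare kernels in their common textbook form. It is an assumption that the options "fixed_g" and "fixed_bsq" correspond to exactly these formulas; the function names and coordinates are hypothetical.

```r
# Gaussian kernel: weights decay smoothly and never reach exactly zero.
gaussian_weights <- function(d, h) exp(-0.5 * (d / h)^2)

# Bisquare kernel: weights drop to zero at and beyond the bandwidth h.
bisquare_weights <- function(d, h) ifelse(d < h, (1 - (d / h)^2)^2, 0)

# Hypothetical planar coordinates of four locations.
coords <- cbind(x = c(0, 3, 6, 30), y = c(0, 4, 8, 40))

# Euclidean distances from every location to the first one.
d <- sqrt(rowSums(sweep(coords, 2, coords[1, ])^2))

h <- 10
w_gauss <- gaussian_weights(d, h)
w_bsq <- bisquare_weights(d, h)
```

With a fixed bandwidth the same `h` is used everywhere. An adaptive bandwidth is commonly defined through a number of nearest neighbours, so the distance cutoff changes from one location to the next.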
gwbr( yvar, xvar, lat, long, h, data, xglobal = NA_character_, grid = data.frame(), method = c("fixed_g", "fixed_bsq", "adaptative_bsq"), link = c("logit", "probit", "loglog", "cloglog"), distancekm = TRUE, global = FALSE, maxint = 100 )
yvar |
A vector with the response variable name. |
xvar |
A vector with descriptive variable(s) name(s). |
lat |
A vector with the latitude variable name. |
long |
A vector with the longitude variable name. |
h |
The bandwidth parameter. |
data |
A data set object with |
xglobal |
A vector with descriptive variable(s) name(s) with global effect. |
grid |
A data set with the location variables. Only used when the location variables are in another data set, different from the data set used in parameter
method |
The kernel function used. The options are: "fixed_g", "fixed_bsq" and "adaptative_bsq". |
link |
The link function used in modeling. The options are: "logit", "probit", "loglog" and "cloglog". |
distancekm |
Logical. If |
global |
Logical. If |
maxint |
Maximum number of iterations used to numerically maximize the log-likelihood function when searching for the parameter estimates. The default is 100. |
A list that contains:
parameter_estimates_qtls
- Parameter estimates quartiles and interquartile range.
parameter_estimates_desc
- Parameter estimates mean, minimum and maximum.
std_qtls
- Standard deviation quartiles and interquartile range.
std_desc
- Standard deviation mean, minimum and maximum.
est_n_parameters
- Number of parameters.
est_gwr_parameters
- Effective number of parameters in the local model.
phi
- Vector of precision parameter estimates.
global_parameter
- Global parameter estimates, when existing.
global_phi
- Global precision parameter estimate, when existing.
global_parameter_tab
- Global parameter estimates table, when existing.
residuals
- Table with observed values (y), estimated values (yhat), the link function applied to the estimated values (eta), pure residual (res), standardized residual (resstd), standardized weighted residual 2 (resstd2), residual deviance (resdeviance), Cook's distance (cookD), generalized leverage (glbp) and number of iterations (iteration).
log_likelihood
- Log-likelihood of the fitted model.
aicc
- Corrected Akaike information criterion.
r2
- Pseudo R2 and adjusted pseudo R2 statistics.
bp_test
- Breusch-Pagan test for heteroscedasticity.
w
- Matrix of weights.
parameters
- Table with parameter estimates of each model.
significance
- Significance level of each model.
bandwidth
- Bandwidth used.
link_function
- The link function used in modeling.
data(saopaulo)
output_list <- gwbr("prop_landline", c("prop_urb", "prop_poor"), "y", "x", 116.3647, saopaulo)
## Descriptive statistics of the parameter estimates
output_list$parameter_estimates_desc
## Table with all parameter estimates and their respective statistics
output_list$parameters
Data from 2010 of the municipalities of Sao Paulo state, Brazil.
data(saopaulo)
A data frame with 644 observations and 14 variables:
municipality
Municipality name.
state
State.
geocode
Municipality geocode according to IBGE.
households
Number of households.
landline
Number of households with landline.
pop
Total population.
pop_rural
Rural population.
pop_urb
Urban population.
hdim
Municipal Human Development Index.
prop_urb
Proportion of urban population.
prop_poor
Proportion of poor population (per capita household income of R$140.00 per month or less).
prop_landline
Proportion of households with landline.
x
Longitude of the centroid of the city.
y
Latitude of the centroid of the city.
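Because the beta distribution is defined on the open interval (0, 1), a response proportion should be checked before fitting. The sketch below uses a small made-up data frame rather than the real data set. The values are hypothetical, and the way the proportions are derived from the counts (for example `landline / households`) is an assumption about how the columns relate.

```r
# Hypothetical counts for three municipalities (not real data).
toy <- data.frame(
  households = c(1200, 540, 9800),
  landline   = c(300, 81, 4900),
  pop        = c(4000, 1800, 30000),
  pop_urb    = c(3000, 900, 30000)
)

# Assumed derivation of the proportion columns from the counts.
toy$prop_landline <- toy$landline / toy$households
toy$prop_urb <- toy$pop_urb / toy$pop

# TRUE where a proportion lies strictly between 0 and 1.
in_open_unit <- function(p) p > 0 & p < 1

response_ok <- in_open_unit(toy$prop_landline)
covariate_ok <- in_open_unit(toy$prop_urb)
```

Only the response needs to lie strictly inside (0, 1). A covariate such as the third `prop_urb` value, which equals 1 in this toy example, does not violate the distributional assumption.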