| Title: | Hierarchical Bayesian Area-Level Small Area Estimation Models |
|---|---|
| Description: | Fits area-level Hierarchical Bayesian Small Area Estimation models. The methodological foundation follows the standard area-level Small Area Estimation literature, primarily Rao and Molina (2015, ISBN: 9781118735787) <doi:10.1002/9781118735855>, while computational implementation is adapted to the parameterisation and prior-specification conventions of the 'brms' package <doi:10.18637/jss.v080.i01>, which targets the Stan back-end. Supports a principled Bayesian workflow <doi:10.48550/arXiv.2011.01808>, with prior predictive checks, convergence diagnostics, model comparison, spatial random effects, custom distributions, missing-data handling, and a bilingual 'shiny' application for non-programmer analysts. |
| Authors: | Achmad Syahrul Choir [aut, cre]
|
| Maintainer: | Achmad Syahrul Choir <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.1.0 |
| Built: | 2026-06-03 11:32:18 UTC |
| Source: | https://github.com/madsyair/hbsaems |
A small example adjacency matrix used for fitting Conditional
Autoregressive (CAR) random effects on the **province** level.
Pairs with data_fhnorm and data_betalogitnorm, whose
province column has matching labels.
adjacency_matrix_caradjacency_matrix_car
A binary symmetric matrix with row- and
column-names province_01 .. province_05; entries are
1 for adjacent province pairs and 0 otherwise.
Simulated.
A small example adjacency matrix used for fitting Conditional
Autoregressive (CAR) random effects on the **regency** (coarse
spatial-cluster, kabupaten) level. Pairs with data_binlogitnorm
and data_lnln, whose regency column has matching labels.
adjacency_matrix_car_regencyadjacency_matrix_car_regency
A binary symmetric matrix with row- and
column-names regency_01 .. regency_05; entries are
1 for adjacent regency pairs and 0 otherwise.
The naming regency_01..05 (two-digit suffix) is reserved
in this package for the COARSE 5-level cluster used by
data_binlogitnorm and data_lnln. The 100-level FINE
regency column in data_fhnorm and data_betalogitnorm
uses three-digit suffixes (regency_001..100) and pairs with
the larger spatial_weight_sar matrix. See
vignette("hbsaems-spatial") for the naming convention.
Simulated.
Returns the custom_family + stanvars pair required to fit a
brms model with a Loglogistic response distribution. The Stan code
is loaded from inst/stan/loglogistic.stan via
build_brms_custom_family; post-processing functions
(log_lik, posterior_predict, posterior_epred) are
wired in automatically.
brms_custom_loglogistic()brms_custom_loglogistic()
Two parameters:
muScale (mu > 0, log link).
betaShape (beta > 0, log link).
A list with elements custom_family and
stanvars_family ready for use with brms::brm(). The
returned custom_family object has log_lik,
posterior_predict, and posterior_epred registered so
that brms::loo(), brms::posterior_predict(), and
brms::posterior_epred() work without further setup.
dloglogistic,
build_brms_custom_family,
register_hbsae_brms_custom.
library(hbsaems) library(brms) ll <- brms_custom_loglogistic() # fit <- brm(y ~ x, data = d, # family = ll$custom_family, # stanvars = ll$stanvars_family) # loo(fit) # uses log_lik_loglogistic # posterior_predict(fit) # uses posterior_predict_loglogistic # posterior_epred(fit) # uses posterior_epred_loglogisticlibrary(hbsaems) library(brms) ll <- brms_custom_loglogistic() # fit <- brm(y ~ x, data = d, # family = ll$custom_family, # stanvars = ll$stanvars_family) # loo(fit) # uses log_lik_loglogistic # posterior_predict(fit) # uses posterior_predict_loglogistic # posterior_epred(fit) # uses posterior_epred_loglogistic
Returns the custom_family + stanvars pair required to fit a
brms model with a Shifted Loglogistic response distribution. The Stan
code is loaded from inst/stan/shifted_loglogistic.stan via
build_brms_custom_family; post-processing functions
(log_lik, posterior_predict, posterior_epred) are
wired in automatically.
brms_custom_shifted_loglogistic()brms_custom_shifted_loglogistic()
Three parameters:
muLocation parameter (identity link).
sigmaScale parameter (sigma > 0, log link).
xiShape parameter (identity link).
A list with elements custom_family and
stanvars_family ready for use with brms::brm().
dshifted_loglogistic,
build_brms_custom_family,
register_hbsae_brms_custom.
library(hbsaems) library(brms) sll <- brms_custom_shifted_loglogistic() # fit <- brm(y ~ x, data = d, # family = sll$custom_family, # stanvars = sll$stanvars_family)library(hbsaems) library(brms) sll <- brms_custom_shifted_loglogistic() # fit <- brm(y ~ x, data = d, # family = sll$custom_family, # stanvars = sll$stanvars_family)
Convenience function that combines custom_family and
stanvar into the standard list(custom_family,
stanvars_family) pair returned by every brms_custom_* helper in
hbsaems. The Stan code is read directly from
inst/stan/<name>.stan – the user does not need to maintain it as
an R string.
build_brms_custom_family( name, dpars, links, lb = NA, ub = NA, type = c("real", "int"), loop = FALSE, log_lik = NULL, posterior_predict = NULL, posterior_epred = NULL, use_stan_native = FALSE )build_brms_custom_family( name, dpars, links, lb = NA, ub = NA, type = c("real", "int"), loop = FALSE, log_lik = NULL, posterior_predict = NULL, posterior_epred = NULL, use_stan_native = FALSE )
name |
Character. Family name; must match a
|
dpars |
Character vector of distributional parameter names. The
first MUST be |
links |
Character vector of link functions, same length as
|
lb, ub
|
Numeric vectors of lower / upper bounds; |
type |
|
loop |
Logical. |
log_lik |
Optional function for computing observation-level
log-likelihoods (used by |
posterior_predict |
Optional function for drawing from the posterior
predictive distribution (used by |
posterior_epred |
Optional function returning the conditional
expectation |
use_stan_native |
Logical. If |
Stan code is built once per call; if your code is reused many times, wrap the returned object in a function and call it lazily.
A list with two elements:
custom_familyA brms::customfamily object.
stanvars_familyA brms::stanvars object with the
Stan code in the functions block, or NULL when
use_stan_native = TRUE.
read_stan_function,
register_hbsae_brms_custom.
library(hbsaems) library(brms) # Build the loglogistic family. Stan code is loaded from # inst/stan/hbsae_loglogistic.stan — the `hbsae_` prefix avoids # collision with Stan's BUILT-IN `loglogistic_lpdf` (Stan >= 2.29). ll <- build_brms_custom_family( name = "hbsae_loglogistic", dpars = c("mu", "beta"), links = c("log", "log"), lb = c(0, 0), ub = c(NA, NA), type = "real" ) class(ll$custom_family)library(hbsaems) library(brms) # Build the loglogistic family. Stan code is loaded from # inst/stan/hbsae_loglogistic.stan — the `hbsae_` prefix avoids # collision with Stan's BUILT-IN `loglogistic_lpdf` (Stan >= 2.29). ll <- build_brms_custom_family( name = "hbsae_loglogistic", dpars = c("mu", "beta"), links = c("log", "log"), lb = c(0, 0), ub = c(NA, NA), type = "real" ) class(ll$custom_family)
Constructs a spatial weight / adjacency matrix suitable for use
as the M argument of hbm. The function accepts
either a path to a shapefile (.shp) or an sf /
Spatial* object already in memory, and returns a square numeric
matrix that is automatically validated against the theoretical
requirements of the target model class via
check_spatial_weight.
build_spatial_weight( shp, for_model = NULL, type = NULL, style = NULL, k = 4L, threshold = NULL, id_col = NULL, longlat = FALSE, validate = TRUE )build_spatial_weight( shp, for_model = NULL, type = NULL, style = NULL, k = 4L, threshold = NULL, id_col = NULL, longlat = FALSE, validate = TRUE )
shp |
Either a character path to a |
for_model |
Optional convenience argument: |
type |
Character. Neighbour definition:
|
style |
Character. |
k |
Integer. Number of nearest neighbours when
|
threshold |
Numeric. Distance threshold (in CRS units) when
|
id_col |
Optional character. Column name in the spatial
object whose values become the matrix |
longlat |
Logical. Whether coordinates are in longitude/latitude. |
validate |
Logical. Whether to run
|
Build a Spatial Weight or Adjacency Matrix from Shapefile
The combination of type (neighbour definition) and style
(matrix coding) determines whether the resulting matrix is
theoretically appropriate for a CAR or SAR model. The recommended
combinations are:
| Model | type | style | Reference |
| CAR / ICAR / BYM2 | queen or rook | B | Besag (1974, 1991), Riebler et al. (2016) |
| SAR (lag/error) | knn or distance | W | Anselin (1988), Whittle (1954) |
| CAR (irregular) | knn or distance | B | kNN auto-symmetrised when style="B"
|
Use the convenience wrapper for_model = "car" or
for_model = "sar" to set both type and style
automatically from the recommendation above. When type or
style is also supplied, the explicit argument wins (the helper
only fills in the missing one).
must be binary, symmetric, and have
zero diagonal. The full distribution is
with .
should be row-standardised so that
lies in . Symmetry is not required.
All matrices are post-checked with check_spatial_weight
and offending matrices are flagged.
A square numeric matrix with row/column names taken from
id_col (if supplied) or sequential integers, plus the
following attributes: "hbsaems_type", "hbsaems_style",
"hbsaems_for_model", "hbsaems_check" (the result of
check_spatial_weight).
KNN graphs depend on the order in which ties are broken, so when several centroids are exactly equidistant the resulting matrix may depend on feature ordering.
Anselin, L. (1988). Spatial Econometrics: Methods and Models. Kluwer Academic Publishers.
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). JRSS B 36(2), 192–236.
Bivand, R. S., Pebesma, E., & Gomez-Rubio, V. (2013). Applied Spatial Data Analysis with R (2nd ed.). Springer.
check_spatial_weight, hbm,
adjacency_matrix_car, spatial_weight_sar
# Inspect a pre-built CAR adjacency matrix shipped with the package data("adjacency_matrix_car") dim(adjacency_matrix_car) check_spatial_weight(adjacency_matrix_car, spatial_model = "car", verbose = FALSE)$compatible # Build a CAR matrix from an sf object (requires sf + spdep) if (requireNamespace("sf", quietly = TRUE) && requireNamespace("spdep", quietly = TRUE)) { library(sf) # A small 2x2 grid of polygons g <- st_sf( id = LETTERS[1:4], geometry = st_sfc( st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0)))), st_polygon(list(rbind(c(1,0), c(2,0), c(2,1), c(1,1), c(1,0)))), st_polygon(list(rbind(c(0,1), c(1,1), c(1,2), c(0,2), c(0,1)))), st_polygon(list(rbind(c(1,1), c(2,1), c(2,2), c(1,2), c(1,1)))) ) ) M_car <- build_spatial_weight(g, for_model = "car") M_sar <- build_spatial_weight(g, for_model = "sar", k = 2L) }# Inspect a pre-built CAR adjacency matrix shipped with the package data("adjacency_matrix_car") dim(adjacency_matrix_car) check_spatial_weight(adjacency_matrix_car, spatial_model = "car", verbose = FALSE)$compatible # Build a CAR matrix from an sf object (requires sf + spdep) if (requireNamespace("sf", quietly = TRUE) && requireNamespace("spdep", quietly = TRUE)) { library(sf) # A small 2x2 grid of polygons g <- st_sf( id = LETTERS[1:4], geometry = st_sfc( st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0)))), st_polygon(list(rbind(c(1,0), c(2,0), c(2,1), c(1,1), c(1,0)))), st_polygon(list(rbind(c(0,1), c(1,1), c(1,2), c(0,2), c(0,1)))), st_polygon(list(rbind(c(1,1), c(2,1), c(2,2), c(1,2), c(1,1)))) ) ) M_car <- build_spatial_weight(g, for_model = "car") M_sar <- build_spatial_weight(g, for_model = "sar", k = 2L) }
Performs three independent checks on the supplied dataset and returns a
structured hbsaems_data_check object that summarises the results.
This function is intended to be called before hbm or
any of the distribution-specific wrappers.
check_data( data, response, auxiliary = NULL, area_var = NULL, spatial_var = NULL, M = NULL, trials = NULL, n_var = NULL, predictors = NULL, group = NULL, sre = NULL )check_data( data, response, auxiliary = NULL, area_var = NULL, spatial_var = NULL, M = NULL, trials = NULL, n_var = NULL, predictors = NULL, group = NULL, sre = NULL )
data |
A |
response |
Character. Name of the response variable in
|
auxiliary |
Character vector. Names of the auxiliary variables (the SAE 'X' covariates). |
area_var |
Optional character. Name of the area (random-effect grouping) variable. |
spatial_var |
Optional character. Name of the spatial-grouping
variable (column in |
M |
Optional spatial weight matrix to dimension-check
against |
trials |
Optional character. Name of the trials variable (binomial models). |
n_var |
Optional character. Name of the sample-size variable (beta / lognormal direct-estimator models). |
predictors |
Deprecated. Use |
group |
Deprecated. Use |
sre |
Deprecated. Use |
Verifies that response, every name in auxiliary, and
the optional area_var / spatial_var / trials /
n_var columns exist in data. Missing variables are
reported in $issues.
The pattern is one of:
"none"All listed columns are complete.
"y_only"Only the response has NAs.
"x_only"Only the auxiliary variables have NAs.
"both"Both Y and X have NAs.
Based on the pattern, a strategy is recommended:
"y_only"First, check whether the NA-Y rows
are non-sample (out-of-sample) areas – domains for which a
prediction is desired but no direct estimate exists. If yes, do
not delete them; fit on the complete-Y subset and pass the
NA-Y rows to sae_predict via the newdata
argument. If they are merely missing observations within sampled
areas, use handle_missing = "deleted".
"x_only"handle_missing = "multiple" – multiple
imputation via mice.
"both" (continuous outcome)handle_missing = "model"
– joint Bayesian imputation via brms::mi().
"both" (discrete outcome, binomial)handle_missing
= "multiple".
When M is supplied, verifies that it is square and that
nrow(M) matches the number of distinct levels in
data[[spatial_var]] (or nrow(data) when
spatial_var is NULL).
An object of class hbsaems_data_check with components:
n_obsNumber of rows in the data.
missing_summaryNamed integer vector: per-variable
count of NA.
missing_patternCharacter: "none", "y_only",
"x_only", or "both".
dimension_checkNamed list of dimension diagnostics.
non_sample_warningCharacter or NULL – a hint
to investigate whether NA-Y rows are non-sample (out-of-sample)
areas.
recommended_methodCharacter: suggested
handle_missing value ("deleted", "multiple",
"model", or NA when no missing values are present).
recommendation_textHuman-readable rationale.
issuesCharacter vector of fatal errors (length 0 if OK).
data("data_fhnorm") # 1. Complete data -> no warnings, no recommendation chk <- check_data(data_fhnorm, response = "y", auxiliary = c("x1", "x2", "x3")) print(chk) # 2. Missing-Y pattern -> recommends checking for non-sample areas d <- data_fhnorm d$y[1:5] <- NA chk2 <- check_data(d, response = "y", auxiliary = c("x1", "x2", "x3")) summary(chk2) # 3. Missing-X-only -> recommends multiple imputation d2 <- data_fhnorm d2$x1[10:15] <- NA chk3 <- check_data(d2, response = "y", auxiliary = c("x1", "x2", "x3")) chk3$recommended_method # 4. Spatial dimension check data("adjacency_matrix_car") chk4 <- check_data(data_fhnorm[1:5, ], response = "y", auxiliary = c("x1", "x2", "x3"), M = adjacency_matrix_car) chk4$dimension_checkdata("data_fhnorm") # 1. Complete data -> no warnings, no recommendation chk <- check_data(data_fhnorm, response = "y", auxiliary = c("x1", "x2", "x3")) print(chk) # 2. Missing-Y pattern -> recommends checking for non-sample areas d <- data_fhnorm d$y[1:5] <- NA chk2 <- check_data(d, response = "y", auxiliary = c("x1", "x2", "x3")) summary(chk2) # 3. Missing-X-only -> recommends multiple imputation d2 <- data_fhnorm d2$x1[10:15] <- NA chk3 <- check_data(d2, response = "y", auxiliary = c("x1", "x2", "x3")) chk3$recommended_method # 4. Spatial dimension check data("adjacency_matrix_car") chk4 <- check_data(data_fhnorm[1:5, ], response = "y", auxiliary = c("x1", "x2", "x3"), M = adjacency_matrix_car) chk4$dimension_check
Classifies and inspects the optional packages used by the Shiny dashboard.
The dashboard is launched by run_sae_app and the dependencies
it touches live in Suggests, not Imports, so users who only
use the modelling functions never have to install them.
check_shiny_deps(verbose = TRUE)check_shiny_deps(verbose = TRUE)
verbose |
Logical. When |
Invisibly, a named list with components:
critical_missingCharacter vector of critical packages not installed. When non-empty, the app cannot start.
optional_missingCharacter vector of optional packages not installed. When non-empty, the app starts but some panels degrade.
install_cmdA ready-to-paste install.packages()
call covering all missing packages, or NULL when none are
missing.
check_shiny_deps()check_shiny_deps()
Runs five theoretical compatibility checks on a spatial weight matrix
and reports any deviations from the standard requirements of
the chosen model class. Returns a structured object summarising the
results.
check_spatial_weight( M, spatial_model = c("car", "sar"), verbose = TRUE, sre_type = NULL )check_spatial_weight( M, spatial_model = c("car", "sar"), verbose = TRUE, sre_type = NULL )
M |
A square numeric matrix. |
spatial_model |
Character. |
verbose |
Logical. When |
sre_type |
Deprecated. Use |
For a CAR model (Besag 1974), the joint distribution
is well-defined only when is symmetric with
zero diagonal. By convention, is taken as the
binary adjacency matrix (style = "B") so that
has integer entries.
For a SAR model (Whittle 1954, Anselin 1988),
and is conventionally row-standardised
(style = "W") so that the spatial autoregressive parameter
can be interpreted as a normalised correlation in
. Symmetry is not required for SAR.
"B" (binary): all values in {0, 1}.
"W" (row-standardised): every non-zero row sums to 1.
"other": neither of the above.
Invisibly, an object of class hbsaems_spatial_check with
components:
is_squareLogical.
has_zero_diagLogical.
is_symmetricLogical.
detected_styleCharacter: "B", "W",
or "other".
n_isolatedInteger: number of areas with no neighbours.
n_componentsInteger: number of connected components.
issuesCharacter vector of fatal errors.
warningsCharacter vector of soft warnings.
compatibleLogical: TRUE if matrix is theoretically
compatible with spatial_model.
# Build a small valid CAR matrix M <- matrix(c(0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0), 4, 4) check_spatial_weight(M, spatial_model = "car") # An asymmetric matrix flagged for CAR M2 <- M; M2[1, 2] <- 2 check_spatial_weight(M2, spatial_model = "car", verbose = FALSE)$issues# Build a small valid CAR matrix M <- matrix(c(0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0), 4, 4) check_spatial_weight(M, spatial_model = "car") # An asymmetric matrix flagged for CAR M2 <- M; M2[1, 2] <- 2 check_spatial_weight(M2, spatial_model = "car", verbose = FALSE)$issues
Computes a battery of convergence tests and diagnostic plots for an
hbmfit object. This is the primary convergence diagnostic function
in hbsaems (supersedes the deprecated hbcc).
convergence_check( model, diag_tests = c("rhat", "geweke", "heidel", "raftery"), plot_types = c("trace", "dens", "acf", "nuts_energy", "rhat", "neff"), ... )convergence_check( model, diag_tests = c("rhat", "geweke", "heidel", "raftery"), plot_types = c("trace", "dens", "acf", "nuts_energy", "rhat", "neff"), ... )
model |
An |
diag_tests |
Character vector of tests to run. Any subset of
|
plot_types |
Character vector of plot types to generate. Any subset
of |
... |
Additional arguments passed to the underlying brms or coda routines. |
For each parameter , the Gelman-Rubin statistic is computed
as
where is the within-chain variance and
is the marginal posterior variance estimate.
Values close to 1 (typically below 1.1) indicate convergence.
An hbcc_results object containing:
rhat_essMatrix with columns Rhat,
Bulk_ESS, Tail_ESS.
geweke,raftery,heidelOutputs from the corresponding
coda routines, or NULL when the test fails.
plotsNamed list of ggplot/bayesplot objects.
is_converged, diagnostic_summary,
hbm_warnings
library(hbsaems) library(brms) data("data_fhnorm") # Uses brms's default MCMC settings (chains = 4, iter = 2000, # warmup = 1000). For challenging posteriors (e.g. funnel # geometries in Fay-Herriot with small D_i), consider # chains = 4, iter = 4000, warmup = 2000 and # control = list(adapt_delta = 0.99). model <- hbm(brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) diag <- convergence_check(model) summary(diag) is_converged(model) is_converged(model, threshold = 1.05)library(hbsaems) library(brms) data("data_fhnorm") # Uses brms's default MCMC settings (chains = 4, iter = 2000, # warmup = 1000). For challenging posteriors (e.g. funnel # geometries in Fay-Herriot with small D_i), consider # chains = 4, iter = 4000, warmup = 2000 and # control = list(adapt_delta = 0.99). model <- hbm(brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) diag <- convergence_check(model) summary(diag) is_converged(model) is_converged(model, threshold = 1.05)
A simulated dataset for 100 regencies under a Beta logit-normal model.
The response y is a proportion in .
data_betalogitnormdata_betalogitnorm
A data frame with 100 rows and 9 variables:
yDirect estimator of the area-level proportion.
thetaTrue area-level proportion.
x1, x2, x3
Auxiliary covariates.
nArea sample size.
deffDesign effect.
regencyRegency identifier ("regency_001" ..
"regency_100").
provinceProvince identifier ("province_01" ..
"province_05") – spatial cluster level for CAR / SAR.
Simulated.
A simulated dataset for 100 districts under a Binomial logit-normal
model. Each district provides a number of successes (y) out of
n trials.
data_binlogitnormdata_binlogitnorm
A data frame with 100 rows and 14 variables:
nNumber of trials in the district.
yNumber of successes.
pDirect proportion (y / n).
x1, x2, x3
Auxiliary covariates.
u_trueTrue area-level random effect (logit scale).
eta_trueTrue linear predictor (logit scale).
p_trueTrue success probability.
psi_iSampling variance.
y_obs, p_obs
Observed (direct) values.
districtDistrict identifier ("district_001" ..
"district_100") – the 100 small areas to estimate.
regencyRegency identifier ("regency_01" ..
"regency_05") – coarse spatial-cluster level (5 regencies
each containing 20 districts). Pair with
adjacency_matrix_car_regency.
Simulated.
A simulated dataset for 100 regencies under the Fay-Herriot Normal
small-area model. Used as the running example throughout the package
documentation and vignettes. The simulation is engineered so that the
canonical Fay-Herriot fit
(hbm(..., sampling_variance = "D")) converges with default
brms / Stan settings – no divergent transitions, no manual
tuning required.
data_fhnormdata_fhnorm
A data frame with 100 rows and 9 variables:
yDirect (survey) estimator of the area mean.
DSampling variance of the direct estimator (KNOWN from the survey design; treat as input, not as a parameter).
x1, x2, x3
Auxiliary covariates at the
area level, simulated from .
theta_trueTrue area-level latent value
.
uTrue area-level random effect .
regencyRegency identifier ("regency_001"
through "regency_100") used as the IID random-effect
grouping variable. Use with re = ~ (1 | regency) or
area_var = "regency".
provinceProvince identifier ("province_01"
through "province_05") – 20 regencies per province.
Used as the spatial random-effect grouping variable for CAR /
SAR / BYM2 examples; also serves as the higher level in the
hierarchical-area example
area_var = c("province", "regency").
Generative model. For each regency ,
with auxiliary covariates (already
standardised), area RE SD , and known
sampling variances – a realistic spread ()
that mirrors varying sample sizes across regencies.
Important: pass D as the sampling variance. In any
fit on this dataset, supply sampling_variance = "D"; otherwise
the residual and the area-RE compete to
explain the same variance, producing weak identifiability and
divergent transitions.
Simulated. Reproducible script in data-raw/data_fhnorm.R.
A simulated dataset for 100 districts under a Lognormal-Lognormal model. Suitable for strictly positive, right-skewed outcomes.
data_lnlndata_lnln
A data frame with 100 rows and 13 variables:
districtDistrict identifier ("district_001" ..
"district_100") – the 100 small areas to estimate.
x1, x2, x3
Auxiliary covariates.
u_trueTrue area-level random effect (log scale).
teta_trueTrue linear predictor (log scale).
mu_orig_trueTrue mean on the original scale.
nArea sample size.
y_obsObserved direct estimator.
lambda_dirDirect estimator scale parameter.
y_log_obsObserved value on the log scale.
psi_iSampling variance on the log scale.
regencyRegency identifier ("regency_01" ..
"regency_05") – coarse spatial-cluster level (5 regencies
each containing 20 districts). Pair with
adjacency_matrix_car_regency.
Simulated.
These functions were the main entry point in hbsaems 0.2.x.
They are retained for backwards compatibility but now call the new
primary functions and emit a deprecation warning via
.Deprecated. They will be removed in
v2.0.0.
hbcc( model, diag_tests = c("rhat", "geweke", "heidel", "raftery"), plot_types = c("trace", "dens", "acf", "nuts_energy", "rhat", "neff"), ... ) hbmc( model, model2 = NULL, ndraws_ppc = 100, moment_match = FALSE, moment_match_args = list(), reloo_args = list(), plot_types = c("pp_check", "params"), comparison_metrics = c("loo", "waic", "bf"), run_prior_sensitivity = FALSE, sensitivity_vars = NULL, ... ) hbpc(model, data = NULL, response_var = NULL, ndraws_ppc = 50, ...) hbsae(model, newdata = NULL, ...)hbcc( model, diag_tests = c("rhat", "geweke", "heidel", "raftery"), plot_types = c("trace", "dens", "acf", "nuts_energy", "rhat", "neff"), ... ) hbmc( model, model2 = NULL, ndraws_ppc = 100, moment_match = FALSE, moment_match_args = list(), reloo_args = list(), plot_types = c("pp_check", "params"), comparison_metrics = c("loo", "waic", "bf"), run_prior_sensitivity = FALSE, sensitivity_vars = NULL, ... ) hbpc(model, data = NULL, response_var = NULL, ndraws_ppc = 50, ...) hbsae(model, newdata = NULL, ...)
model |
An |
diag_tests, plot_types, ndraws_ppc, moment_match, moment_match_args, reloo_args, comparison_metrics, run_prior_sensitivity, sensitivity_vars
|
Forwarded unchanged to the replacement function. |
... |
Additional arguments forwarded to the replacement. |
model2 |
For |
data |
For |
response_var |
For |
newdata |
For |
Identical to the corresponding replacement function.
| Deprecated | Replacement |
hbcc(model, ...) |
convergence_check(model, ...) |
hbmc(model, ...) |
model_compare(model, ...) |
hbpc(model, data, response_var) |
prior_check(model, data, response_var) |
hbsae(model, ...) |
sae_predict(model, ...)
|
Extract a Diagnostic Summary
diagnostic_summary(x)diagnostic_summary(x)
x |
An |
A named list of diagnostic statistics.
Inspect a Registered HBSAE Model Specification
get_hbsae_model(key)get_hbsae_model(key)
key |
Character. A model key returned by
|
The named list spec, or NULL if not found. Useful
fields include:
family – brms family name passed to
brmsfamily.
link – default link function used by the family
("identity", "logit",
"log", ...). See
brmsfamily for the
complete set of supported links per
family.
discrete – whether the response is discrete.
supports_mi – whether brms-canonical mi()
imputation is allowed for this family
(FALSE for all discrete responses).
register_hbsae_model,
brmsfamily for the canonical brms family /
link reference.
get_hbsae_model("lognormal") get_hbsae_model("beta")$link # "logit"get_hbsae_model("lognormal") get_hbsae_model("beta")$link # "logit"
Fits a hierarchical Bayesian model for Small Area Estimation (SAE) using
the brms package (Stan back-end). The function supports fixed
effects, random effects, spatial random effects (CAR/SAR), user-defined
priors, and three strategies for handling missing data in auxiliary
(predictor) variables.
hbm( formula, family = NULL, hb_sampling = "gaussian", hb_link = "identity", link_phi = "log", re = NULL, spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, data, prior = NULL, fixed_params = NULL, sampling_variance = NULL, prior_type = "default", hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE, nonlinear = NULL, nonlinear_type = "spline", spline_k = -1L, spline_bs = "tp", gp_k = NA_integer_, gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL, handle_missing = NULL, m = 5L, mice_args = list(), measurement_error = NULL, control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", sre = NULL, sre_type = NULL, stanvars = NULL, ... )hbm( formula, family = NULL, hb_sampling = "gaussian", hb_link = "identity", link_phi = "log", re = NULL, spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, data, prior = NULL, fixed_params = NULL, sampling_variance = NULL, prior_type = "default", hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE, nonlinear = NULL, nonlinear_type = "spline", spline_k = -1L, spline_bs = "tp", gp_k = NA_integer_, gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL, handle_missing = NULL, m = 5L, mice_args = list(), measurement_error = NULL, control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", sre = NULL, sre_type = NULL, stanvars = NULL, ... )
formula |
A |
family |
The model family (primary argument, brms-consistent).
Accepts a family name string (e.g.\ |
hb_sampling |
Character string naming the distribution family of the
response variable (default: |
hb_link |
Character string specifying the link function
(default: |
link_phi |
Character string specifying the link function for the
precision/phi parameter (default: |
re |
An optional one-sided |
spatial_var |
Character. Name of the column in |
spatial_model |
Character. Type of spatial model: |
car_type |
Character. CAR subtype passed to |
sar_type |
Character. SAR subtype: |
M |
Spatial matrix supplied as |
data |
A data frame containing all variables referenced in
|
prior |
Priors specified via |
fixed_params |
Named list pinning distributional parameters to known values instead of sampling them. Each entry maps a parameter name to one of:
Internally, each pinned parameter |
sampling_variance |
Optional character. Name of a column in
Family compatibility. For non-Gaussian SAE families the analogous pinning mechanism is family-specific:
|
prior_type |
Character. Global-local shrinkage prior applied to all
regression coefficients (
If Cascading to smooth and GP terms. When the formula
contains |
hs_df |
Numeric |
hs_df_global |
Numeric |
hs_df_slab |
Numeric |
hs_scale_global |
Numeric |
hs_scale_slab |
Numeric |
hs_par_ratio |
Numeric |
hs_autoscale |
Logical. Whether |
r2d2_mean_R2 |
Numeric in |
r2d2_prec_R2 |
Numeric |
r2d2_cons_D2 |
Numeric |
r2d2_autoscale |
Logical. Whether |
nonlinear |
Character vector or |
nonlinear_type |
Character. Smooth term family to use. One of
|
spline_k |
Integer. Spline basis dimension (number of knots)
passed to |
spline_bs |
Character. Spline basis type passed to
|
gp_k |
Integer or |
gp_cov |
Character. GP covariance function passed to
|
gp_c |
Numeric |
gp_scale |
Deprecated. Use |
handle_missing |
Character or |
m |
Integer. Number of imputations when
|
mice_args |
A named list of additional arguments forwarded to
|
measurement_error |
Optional named list specifying which
auxiliary variables are measured with error and where to find
their standard errors. The list maps variable names to columns
in |
control |
A named list of sampler control parameters
(default: |
chains |
Integer. Number of MCMC chains (default: |
iter |
Integer. Total iterations per chain (default: |
warmup |
Integer. Warm-up iterations per chain
(default: |
cores |
Integer. Number of CPU cores for parallel sampling
(default: |
sample_prior |
Character. Whether to draw from the prior
distribution. One of |
sre |
Deprecated. Use |
sre_type |
Deprecated. Use |
stanvars |
An optional |
... |
Additional arguments forwarded to |
Hierarchical Bayesian Model for Small Area Estimation
**Spatial Small Area Estimation Models**
For spatially correlated areas, hbsaems extends the standard area-level SAE model (Fay-Herriot 1979) by adding a spatially structured random effect:
where is the spatial random effect for area and
the sampling error. Two families of spatial structures are
supported.
Specified by spatial_model = "car". The joint distribution of the
spatial effects is
where is a binary adjacency matrix (1 if neighbour, 0
otherwise) and . Sub-types via
car_type: "icar" (intrinsic, ; Besag 1991);
"escar", "esicar" (exact sparse formulations of
Morris et al.\ 2019); "bym2" (BYM2 reparameterisation of
Riebler et al.\ 2016, recommended for disconnected graphs).
Specified by spatial_model = "sar". The model is
where is row-standardised so that
carries an interpretable correlation
meaning. Sub-types via sar_type: "lag" (spatial lag
of the response, );
"error" (spatial error model).
Use check_spatial_weight to verify that satisfies
the theoretical requirements (square, zero diagonal, symmetry for CAR,
style-appropriate for the model class). Use build_spatial_weight
to construct from a shapefile.
**Missing Data Strategies**
The three strategies differ in scope and statistical assumptions:
"deleted"Complete-case analysis. Only rows where all response variable(s) are observed are used for model fitting. Auxiliary variables must be complete; otherwise an informative error is raised. Appropriate under MCAR (Missing Completely At Random).
"multiple"Multiple imputation via mice for auxiliary (predictor)
variables only. The response variable is never
imputed. In a Bayesian model, missing outcomes are naturally
marginalised through the posterior predictive distribution:
Imputing before fitting would replace this integral with a
single point substitute, deflate posterior uncertainty, and potentially
bias the estimates if the imputation model is misspecified. If
has missing values, those rows are excluded from model fitting (a
warning is issued) but are retained in the returned object for
subsequent prediction via hbsae. Appropriate under MAR
(Missing At Random).
"model"Model-based imputation using brms::mi(). Missing values in
auxiliary variables are jointly estimated with the model parameters.
The user must specify imputation sub-models explicitly in the
formula argument, e.g.:
bf(y | mi() ~ mi(x1) + x2) + bf(x1 | mi() ~ x2).
Only applicable to continuous distributions. Appropriate
under MAR.
If data are Missing Not At Random (MNAR), none of the above strategies applies directly; sensitivity analyses and explicit missingness models are recommended.
An object of class hbmfit, which is a named list containing:
model |
The fitted |
handle_missing |
The missing-data strategy used ( |
data |
The original data frame passed to |
hbsaems exposes three layers for pinning distributional parameters to known values, in increasing order of generality:
Family-specific sugar (least typing, most readable). Each wrapper exposes a convenience argument that maps to a well-defined Fay-Herriot-style transformation of survey design quantities:
| Wrapper | Sugar argument | Pinned dpar |
hbm(..., hb_sampling = "gaussian") |
sampling_variance = "D"
|
|
hbm_lnln() |
sampling_variance = "psi"
|
(on log scale) |
hbm_betalogitnorm() |
n = "n", deff = "deff"
|
(Liu 2009) |
hbm_binlogitnorm() |
trials = "n"
|
(sampling variance built into the family) |
Universal fixed_params (works everywhere).
A named list pinning any distributional parameter – accepts
a column name (character), a scalar (numeric of length 1),
a vector of length nrow(data), or a one-sided
formula (evaluated in data's environment). Examples:
fixed_params = list(sigma = "D") – pin sigma to a column.
fixed_params = list(phi = ~ I(n / deff - 1)) – pin phi via formula.
fixed_params = list(nu = 4) – pin Student-t df to a scalar.
Raw stanvars (for power users authoring
custom Stan code). Direct injection of Stan code blocks –
see stanvar.
Sugar arguments are simply thin wrappers around layer 2: they
validate the survey-design inputs (no NAs, strictly positive) and
translate to fixed_params before delegating to the universal
machinery. Hence the conflict policy below applies uniformly to
both sugar and explicit fixed_params.
hbm() provides four orthogonal mechanisms to influence the
prior / parameter specification of the underlying brms model:
prior – explicit brmsprior object(s).
prior_type – global-local shrinkage prior on the
regression coefficients (cascades to
"sds" / "sdgp" when
splines or GPs are present).
fixed_params – pin distributional parameters to known
values via the offset trick.
stanvars – inject custom Stan code blocks.
Plus two family-specific sugar arguments that translate to
fixed_params internally:
sampling_variance (continuous families): pins
, equivalent to
fixed_params$sigma = sqrt(data$D).
n + deff (hbm_betalogitnorm): pins
, equivalent to
fixed_params$phi = n / deff - 1.
Combining these without rules in mind can produce unidentified models
or compile-time errors. hbm() therefore enforces the
following conflict matrix, where each cell describes what
happens when the row and column inputs both target the same
distributional parameter (e.g.\ both pin sigma):
| fixed_params | prior | prior_type | stanvars | |
| sampling_variance | error | error (transitive) | no overlap | error (transitive) |
| n + deff | error | error (transitive) | no overlap | error (transitive) |
| fixed_params | -- | error (10b.i) | no overlap | error (10b.ii) |
| prior | error (10b.i) | -- | warning, user wins | no check needed |
| prior_type | no overlap | warning, user wins | -- | no check needed |
| stanvars | error (10b.ii) | no check needed | no check needed | -- |
Resolution semantics in detail:
prior vs prior_type. If the user
supplies a global (no coef =) prior on
class = "b", "sds", or "sdgp",
prior_type is silently dropped for that class and a
warning is emitted. Coefficient-specific user priors
(coef = "x1") are kept alongside the shrinkage prior
without warning.
fixed_params vs prior. A pinned
parameter is removed from the sampler; supplying a prior on
that same parameter therefore has no effect and is treated
as a user error – an informative stop() is issued.
fixed_params vs stanvars. Same logic
as above: a sampling statement in stanvars that
targets a pinned parameter would fail at Stan compile time;
hbm() catches this and stops with a clear message.
Sugar vs fixed_params on the same parameter.
The sugar translators (.translate_sampling_variance(),
.translate_n_deff_to_phi()) error out if the user has
also pre-populated fixed_params for the target
parameter – there should never be two pin sources for the
same dpar.
Sugar vs prior / stanvars transitively.
After the sugar -> fixed_params translation, the
downstream fixed_params-vs-prior / -stanvars checks
fire automatically. E.g.\ sampling_variance = "D"
plus prior = set_prior("normal(0, 1)", class = "sigma")
errors via the fixed_params vs prior rule.
The intent is to fail fast and explicitly rather than silently producing an unidentified or mis-specified model.
Common convergence pathologies in hierarchical Bayesian SAE models
and how to address them. Run convergence_check() after
fitting to inspect , effective sample size (ESS), and
divergent transitions.
1. Default sampler settings (recommended starting point).
chains = 4, iter = 4000, warmup = 2000
control = list(adapt_delta = 0.95, max_treedepth = 12)
cores = parallel::detectCores() - 1
2. Divergent transitions. Most common cause is the funnel geometry of hierarchical variance parameters.
First-line: increase adapt_delta to 0.99.
If still diverging, increase warmup and consider a
tighter prior on the area-level standard deviation
(sd class), e.g.\ set_prior("normal(0, 0.5)",
class = "sd").
For Beta/Binomial logit-normal models, prior on the random intercept SD should also be on the logit scale.
Gaussian (Fay-Herriot) only. Always supply the
known sampling variance via sampling_variance =
"<col>". Without this, the residual and the
area-RE compete to explain the same variance
component, producing weak identifiability and divergent
transitions almost regardless of adapt_delta. This
is the single most common cause of divergences in
Fay-Herriot fits and should be checked first before
any sampler-tuning.
3. Low effective sample size (ESS < 1000).
Increase iter (e.g.\ to 6000); this is the
single most reliable fix.
Centre and scale the auxiliary variables before fitting.
Check for prior–data conflict via priorsense; see
?prior_check.
4. Gaussian processes.
Exact GP scales . For
areas, set gp_k to use the Hilbert-space approximate
GP (Riutort-Mayol et al.\ 2020). A heuristic is
gp_k = ceiling(min(n / 5, 25)).
Try gp_cov = "matern25" (Matern 5/2) if the default
squared-exponential covariance is numerically unstable.
The boundary-scale factor gp_c (brms default 1.25)
may need increasing if the posterior GP is truncated at
the domain edges.
5. Splines.
Start with spline_k = -1 (auto). Increase only if
the residual diagnostics suggest under-smoothing.
For strongly correlated auxiliary variables, try
spline_bs = "cr" (cubic regression spline) for
better numerical stability than the default thin-plate.
For variable selection, use spline_bs = "cs" (cubic
with shrinkage); coefficients on irrelevant smooths shrink
toward zero.
6. Spatial models.
For CAR, car_type = "bym2" (Riebler et al.\ 2016) is
the modern recommendation; it stabilises the
spatial/IID decomposition via a single mixing parameter.
Verify the weight matrix with
check_spatial_weight(); isolated areas or
multiple disconnected components cause non-identifiability.
7. Prior predictive check first. Always call
prior_check() (sample_prior = "only") before
the full posterior run. Implausible prior predictives are the
single most common cause of slow / divergent sampling.
Achmad Syahrul Choir, Saniyyah Sri Nurhayati, and Sofi Zamzanah
Rao, J. N. K., & Molina, I. (2015). Small Area Estimation. John Wiley & Sons.
Burkner, P. C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28.
Riutort-Mayol, G., Burkner, P.-C., Andersen, M. R., Solin, A., & Vehtari, A. (2023). Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming. Statistics and Computing, 33, 17. doi:10.1007/s11222-022-10167-2
Riebler, A., Sorbye, S. H., Simpson, D., & Rue, H. (2016). An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research, 25(4), 1145–1165.
van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67.
library(hbsaems) library(brms) data("data_fhnorm") data <- data_fhnorm # Standard brms-default MCMC settings used throughout these # examples (chains = 4, iter = 2000, warmup = 1000). For tougher # posteriors (funnel geometry, weakly identified priors), bump to # iter = 4000, warmup = 2000, control = list(adapt_delta = 0.99). FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) # -- Basic model -------------------------------------------------------------- model <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), hb_sampling = "gaussian", hb_link = "identity", re = ~(1 | regency), data = data), FAST )) summary(model) # -- Horseshoe prior (sparse coefficients) ------------------------------------ model_hs <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, prior_type = "horseshoe", hs_df = 1), FAST )) summary(model_hs) # -- R2D2 prior (prior on model R-squared) ------------------------------------- model_r2 <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, prior_type = "r2d2", r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2), FAST )) summary(model_r2) # -- Spline smooth for x1 (nonlinear) ----------------------------------------- # x1 is modelled with s(x1); x2 and x3 remain linear. model_spline <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, nonlinear = "x1", nonlinear_type = "spline"), FAST )) summary(model_spline) # -- Gaussian process for x2 (nonlinear) -------------------------------------- model_gp <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, nonlinear = "x2", nonlinear_type = "gp"), FAST )) summary(model_gp) # -- Missing data: deletion (Y missing, X complete) --------------------------- data_miss_y <- data data_miss_y$y[3:5] <- NA model_deleted <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data_miss_y, handle_missing = "deleted"), FAST )) summary(model_deleted) # -- Missing data: multiple imputation (X only -- Y is never imputed) ---------- data_miss_x <- data data_miss_x$x1[6:8] <- NA model_multiple <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data_miss_x, handle_missing = "multiple", m = 5), FAST )) summary(model_multiple) # -- Missing data: model-based imputation with mi() --------------------------- data_miss_x2 <- data data_miss_x2$x1[6:7] <- NA model_model <- do.call(hbm, c( list(formula = bf(y | mi() ~ mi(x1) + x2 + x3) + bf(x1 | mi() ~ x2 + x3), re = ~(1 | regency), data = data_miss_x2, handle_missing = "model"), FAST )) summary(model_model) # -- Spatial: CAR (Conditional Autoregressive) -------------------------------- data("adjacency_matrix_car") model_car <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), data = data, spatial_var = "province", spatial_model = "car", M = adjacency_matrix_car), FAST )) summary(model_car) # -- Spatial: SAR (Simultaneous Autoregressive) ------------------------------- # spatial_weight_sar is a 100x100 row-standardised matrix with row- # names regency_001..regency_100, so it pairs with the fine-grained # "regency" column (100 levels) -- NOT with "province" (5 levels). data("spatial_weight_sar") model_sar <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), data = data, spatial_var = "regency", spatial_model = "sar", M = spatial_weight_sar), FAST )) summary(model_sar)library(hbsaems) library(brms) data("data_fhnorm") data <- data_fhnorm # Standard brms-default MCMC settings used throughout these # examples (chains = 4, iter = 2000, warmup = 1000). For tougher # posteriors (funnel geometry, weakly identified priors), bump to # iter = 4000, warmup = 2000, control = list(adapt_delta = 0.99). FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) # -- Basic model -------------------------------------------------------------- model <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), hb_sampling = "gaussian", hb_link = "identity", re = ~(1 | regency), data = data), FAST )) summary(model) # -- Horseshoe prior (sparse coefficients) ------------------------------------ model_hs <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, prior_type = "horseshoe", hs_df = 1), FAST )) summary(model_hs) # -- R2D2 prior (prior on model R-squared) ------------------------------------- model_r2 <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, prior_type = "r2d2", r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2), FAST )) summary(model_r2) # -- Spline smooth for x1 (nonlinear) ----------------------------------------- # x1 is modelled with s(x1); x2 and x3 remain linear. model_spline <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, nonlinear = "x1", nonlinear_type = "spline"), FAST )) summary(model_spline) # -- Gaussian process for x2 (nonlinear) -------------------------------------- model_gp <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data, nonlinear = "x2", nonlinear_type = "gp"), FAST )) summary(model_gp) # -- Missing data: deletion (Y missing, X complete) --------------------------- data_miss_y <- data data_miss_y$y[3:5] <- NA model_deleted <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data_miss_y, handle_missing = "deleted"), FAST )) summary(model_deleted) # -- Missing data: multiple imputation (X only -- Y is never imputed) ---------- data_miss_x <- data data_miss_x$x1[6:8] <- NA model_multiple <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), re = ~(1 | regency), data = data_miss_x, handle_missing = "multiple", m = 5), FAST )) summary(model_multiple) # -- Missing data: model-based imputation with mi() --------------------------- data_miss_x2 <- data data_miss_x2$x1[6:7] <- NA model_model <- do.call(hbm, c( list(formula = bf(y | mi() ~ mi(x1) + x2 + x3) + bf(x1 | mi() ~ x2 + x3), re = ~(1 | regency), data = data_miss_x2, handle_missing = "model"), FAST )) summary(model_model) # -- Spatial: CAR (Conditional Autoregressive) -------------------------------- data("adjacency_matrix_car") model_car <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), data = data, spatial_var = "province", spatial_model = "car", M = adjacency_matrix_car), FAST )) summary(model_car) # -- Spatial: SAR (Simultaneous Autoregressive) ------------------------------- # spatial_weight_sar is a 100x100 row-standardised matrix with row- # names regency_001..regency_100, so it pairs with the fine-grained # "regency" column (100 levels) -- NOT with "province" (5 levels). data("spatial_weight_sar") model_sar <- do.call(hbm, c( list(formula = bf(y ~ x1 + x2 + x3), data = data, spatial_var = "regency", spatial_model = "sar", M = spatial_weight_sar), FAST )) summary(model_sar)
Convenience wrapper around hbm for SAE problems where
the response is modelled as
hbm_betalogitnorm( response, auxiliary = NULL, data, n = NULL, deff = NULL, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, link_phi = NULL, prior = NULL, stanvars = NULL, handle_missing = NULL, m = 5L, control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", fixed_params = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )hbm_betalogitnorm( response, auxiliary = NULL, data, n = NULL, deff = NULL, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, link_phi = NULL, prior = NULL, stanvars = NULL, handle_missing = NULL, m = 5L, control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", fixed_params = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )
response |
Character. Name of the response column (must lie strictly between 0 and 1). |
auxiliary |
Character vector of auxiliary (fixed-effect) variable names; corresponds to area-level covariates in SAE literature (see Rao & Molina 2015 Ch. 4). |
data |
A data frame. |
n |
Character or |
deff |
Character or |
area_var |
Character vector or |
area_re_structure |
Either |
spatial_var, spatial_model, car_type, sar_type, M
|
Spatial
random-effect arguments, forwarded to |
link_phi |
Character or |
prior |
Optional
The user may pass a partial prior: missing default classes are
filled in automatically. To put a custom prior on |
stanvars |
Optional |
handle_missing, m, control, chains, iter, warmup, cores, sample_prior, ...
|
Passed through to |
fixed_params |
Optional named list pinning distributional
parameters to known values. See |
predictors |
Deprecated. Use |
group |
Deprecated. Use |
sre |
Deprecated. Use |
sre_type |
Deprecated. Use |
The precision parameter can either be pinned to a survey
design effect (n + deff) or sampled with brms's default
weakly-informative prior .
An object of class hbmfit.
Earlier versions of this wrapper introduced a hierarchical
construction
for the random-phi mode, declaring alpha and beta as
Stan parameters with hyperpriors injected via stanvars.
Starting v1.0.0, that construction has been removed in favour of
brms's own default prior,
with lower bound 0 (mean 1, variance 100; weakly informative on
the precision scale). Three reasons:
Brittle priors on alpha/beta. The hyperprior
on has prior mean 1,
which is on the boundary of the parameter space declared
as real<lower=1>. This routinely produced divergent
transitions when the data were not informative about phi.
Parameter blow-up. Estimating two extra Stan parameters per area-level model with limited data inflated the effective posterior dimension and slowed convergence.
Cleaner brms semantics. Letting brms apply its own
default means that passing
prior = NULL now does exactly what the user expects:
“brms defaults”. No surprises.
If you need to reproduce the old behaviour, supply
stanvars yourself to declare alpha, beta and
their hyperpriors, and pass prior = set_prior("gamma(alpha,
beta)", class = "phi"). Pre-v1.0.0 code that supplied
stanvars with hyperpriors on alpha/beta
will now raise an informative error.
When the precision parameter is pinned via n +
deff (or via fixed_params$phi), the function refuses
any additional specification that would also set .
Specifically, all of the following are rejected with an informative
error at construction time:
n supplied without deff, or vice versa.
n + deff and fixed_params$phi.
n + deff and a user prior on
class = "phi".
n + deff and a stanvars
hyperprior on alpha or beta (the
hyperparameters used in
random- mode).
auxiliary and the deprecated predictors
in the same call.
Liu, B. (2009). Hierarchical Bayes Estimation and Empirical Best Prediction of Small-Area Proportions. University of Maryland.
Rao, J. N. K., & Molina, I. (2015). Small Area Estimation, 2nd ed. Wiley, p. 390.
library(hbsaems) library(brms) data("data_betalogitnorm") data <- data_betalogitnorm # -- 1. Basic model (phi random, with default hyperprior) -------------------- model1 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), area_var = "regency", data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) summary(model1) # -- 2. Fixed phi via survey design (n + deff) ------------------------------- model2 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), n = "n", deff = "deff", area_var = "regency", data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) summary(model2) # -- 3. Custom prior on phi via the standard `prior` argument ---------------- # # Starting v1.0.0, hbm_betalogitnorm() no longer declares its own # alpha / beta hyperparameters; phi uses brms's native default # Gamma(0.01, 0.01). To override that prior, pass a `brms::set_prior()` # entry to the `prior` argument -- the legacy # stanvar("alpha ~ gamma(...); beta ~ gamma(...)") pattern would now # error because alpha/beta are no longer Stan parameters. model3 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), area_var = "regency", data = data, prior = brms::set_prior("gamma(2, 0.5)", class = "phi"), chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 4. Spatial CAR model ---------------------------------------------------- data("adjacency_matrix_car") model4 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), n = "n", deff = "deff", spatial_var = "province", spatial_model = "car", M = adjacency_matrix_car, data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )library(hbsaems) library(brms) data("data_betalogitnorm") data <- data_betalogitnorm # -- 1. Basic model (phi random, with default hyperprior) -------------------- model1 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), area_var = "regency", data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) summary(model1) # -- 2. Fixed phi via survey design (n + deff) ------------------------------- model2 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), n = "n", deff = "deff", area_var = "regency", data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) summary(model2) # -- 3. Custom prior on phi via the standard `prior` argument ---------------- # # Starting v1.0.0, hbm_betalogitnorm() no longer declares its own # alpha / beta hyperparameters; phi uses brms's native default # Gamma(0.01, 0.01). To override that prior, pass a `brms::set_prior()` # entry to the `prior` argument -- the legacy # stanvar("alpha ~ gamma(...); beta ~ gamma(...)") pattern would now # error because alpha/beta are no longer Stan parameters. model3 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), area_var = "regency", data = data, prior = brms::set_prior("gamma(2, 0.5)", class = "phi"), chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 4. Spatial CAR model ---------------------------------------------------- data("adjacency_matrix_car") model4 <- hbm_betalogitnorm( response = "y", auxiliary = c("x1", "x2", "x3"), n = "n", deff = "deff", spatial_var = "province", spatial_model = "car", M = adjacency_matrix_car, data = data, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )
Convenience wrapper that fits a Hierarchical Bayesian Small Area
Estimation model with a binomial likelihood and logit link.
Internally delegates to hbm_flex with
family_key = "binomial".
hbm_binlogitnorm( response, trials, auxiliary = NULL, data, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, fixed_params = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )hbm_binlogitnorm( response, trials, auxiliary = NULL, data, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, fixed_params = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )
response |
Character. Name of the successes variable
(non-negative integer, |
trials |
Character. Name of the trials variable (positive integer). |
auxiliary |
Character vector of auxiliary (fixed-effect) variable names; corresponds to area-level covariates in SAE literature (see Rao & Molina 2015 Ch. 4). |
data |
A |
area_var |
Optional character vector. Name(s) of column(s) in
|
area_re_structure |
Either |
spatial_var |
Optional character. Name of a column identifying
the spatial cluster. Must be supplied together with
|
spatial_model |
Optional character. Spatial dependence:
|
car_type |
Optional character. CAR sub-type passed to brms:
|
sar_type |
Optional character. SAR sub-type:
|
M |
Optional numeric matrix. Spatial weight matrix.
Required when |
fixed_params |
Optional named list pinning distributional
parameters to known values. See |
predictors |
Deprecated. Use |
group |
Deprecated. Use |
sre |
Deprecated. Use |
sre_type |
Deprecated. Use |
... |
Additional arguments forwarded to |
Let denote the number of successes in area out of
trials. The model is
An object of class hbmfit.
auxiliary and the deprecated predictors
in the same call are rejected with an informative error.
handle_missing = "model" is not supported (binomial is a
discrete family; see Notes on missing data).
The binomial family does not support handle_missing = "model"
(joint Bayesian imputation via brms::mi()). When NA values are
detected and handle_missing is left NULL, the wrapper
auto-selects "multiple" (multiple imputation via mice on
the predictors only).
library(hbsaems) library(brms) data("data_binlogitnorm") # -- 1. Standard binomial logit-normal SAE --------------------------------- fit <- hbm_binlogitnorm( response = "y", trials = "n", auxiliary = c("x1", "x2", "x3"), area_var = "district", # area-level random effect (1 | district) data = data_binlogitnorm, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 2. With spatial CAR random effect ------------------------------------- data("adjacency_matrix_car_regency") fit_car <- hbm_binlogitnorm( response = "y", trials = "n", auxiliary = c("x1", "x2", "x3"), spatial_var = "regency", spatial_model = "car", M = adjacency_matrix_car_regency, data = data_binlogitnorm, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )library(hbsaems) library(brms) data("data_binlogitnorm") # -- 1. Standard binomial logit-normal SAE --------------------------------- fit <- hbm_binlogitnorm( response = "y", trials = "n", auxiliary = c("x1", "x2", "x3"), area_var = "district", # area-level random effect (1 | district) data = data_binlogitnorm, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 2. With spatial CAR random effect ------------------------------------- data("adjacency_matrix_car_regency") fit_car <- hbm_binlogitnorm( response = "y", trials = "n", auxiliary = c("x1", "x2", "x3"), spatial_var = "regency", spatial_model = "car", M = adjacency_matrix_car_regency, data = data_binlogitnorm, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )
Bundles the MCMC sampler arguments of hbm (and the
hbm_* family wrappers) into a single named list. Use this when
you want a reusable sampler profile or to reduce the size of long
hbm() calls.
hbm_control( chains = 4L, iter = 4000L, warmup = NULL, thin = 1L, cores = 1L, seed = NULL, refresh = NULL, adapt_delta = NULL, max_treedepth = NULL, control = NULL )hbm_control( chains = 4L, iter = 4000L, warmup = NULL, thin = 1L, cores = 1L, seed = NULL, refresh = NULL, adapt_delta = NULL, max_treedepth = NULL, control = NULL )
chains |
Integer. Number of Markov chains (default |
iter |
Integer. Total iterations per chain (default |
warmup |
Integer. Warm-up iterations per chain. Default
|
thin |
Integer. Thinning interval (default |
cores |
Integer. Number of cores for parallel chains
(default |
seed |
Optional integer seed for reproducibility. |
refresh |
Integer. Stan progress refresh frequency
(default |
adapt_delta |
Numeric in |
max_treedepth |
Integer. Max tree depth, included in
|
control |
Optional |
This is entirely opt-in: the flat signatures of hbm(),
hbm_lnln(), etc.\ continue to work exactly as before. Pass the
result directly to hbm() – it is auto-spliced via
....
A named list whose elements are valid arguments of
hbm.
hbm_priors, hbm_nonlinear,
hbm
# Build a reusable "high-quality" profile hq <- hbm_control(chains = 4, iter = 8000, cores = 4, adapt_delta = 0.99, seed = 1) str(hq) # Quick draft profile draft <- hbm_control(chains = 2, iter = 1000)# Build a reusable "high-quality" profile hq <- hbm_control(chains = 4, iter = 8000, cores = 4, adapt_delta = 0.99, seed = 1) str(hq) # Quick draft profile draft <- hbm_control(chains = 2, iter = 1000)
Return the Data Used to Fit an hbmfit
hbm_data(model)hbm_data(model)
model |
An |
The original data.frame passed to the fitting function.
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) head(hbm_data(model))library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) head(hbm_data(model))
Flexible factory that fits an HBSAE model using any distribution
currently registered in the family registry, together with the full
set of cross-cutting features: spatial random effects (CAR/SAR),
shrinkage priors (horseshoe, R2D2), smooth terms (penalised splines,
Gaussian processes), auxiliary-parameter hyperpriors, and
missing-data strategies. Distribution-specific wrappers
(hbm_lnln, hbm_binlogitnorm,
hbm_betalogitnorm) are thin signature shims around this
function with preset family_key values. Advanced users can
also call it directly once a custom family has been registered via
register_hbsae_model.
hbm_flex( family = NULL, response, auxiliary = NULL, data, family_key = NULL, addition_var = NULL, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, prior = NULL, fixed_params = NULL, sampling_variance = NULL, prior_type = "default", hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE, nonlinear = NULL, nonlinear_type = "spline", spline_k = -1L, spline_bs = "tp", gp_k = NA_integer_, gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL, handle_missing = NULL, m = 5L, mice_args = list(), control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", link = NULL, aux_args = NULL, stanvars = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )hbm_flex( family = NULL, response, auxiliary = NULL, data, family_key = NULL, addition_var = NULL, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, prior = NULL, fixed_params = NULL, sampling_variance = NULL, prior_type = "default", hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE, nonlinear = NULL, nonlinear_type = "spline", spline_k = -1L, spline_bs = "tp", gp_k = NA_integer_, gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL, handle_missing = NULL, m = 5L, mice_args = list(), control = list(), chains = 4L, iter = 4000L, warmup = floor(iter/2), cores = 1L, sample_prior = "no", link = NULL, aux_args = NULL, stanvars = NULL, predictors = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )
family |
The model family (primary argument). Accepts EITHER a
character family name (e.g.\ |
response |
Character. Name of the response variable column. |
auxiliary |
Character vector. Names of auxiliary (fixed-effect) variables; corresponds to area-level covariates in SAE literature. |
data |
A |
family_key |
Character. Backward-compatible alias for
|
addition_var |
Character or |
area_var |
Character vector or
|
area_re_structure |
Either |
spatial_var, spatial_model, car_type, sar_type, M
|
Spatial
random-effect arguments forwarded to |
prior, prior_type, hs_df, hs_df_global, hs_df_slab, hs_scale_global, hs_scale_slab, hs_par_ratio, hs_autoscale, r2d2_mean_R2, r2d2_prec_R2, r2d2_cons_D2, r2d2_autoscale
|
Shrinkage prior arguments forwarded to |
fixed_params |
Optional named list pinning distributional
parameters to known values. See |
sampling_variance |
Optional character. Name of a column in
|
nonlinear |
Character vector of variable names to include as
smooth/nonlinear terms (rather than linear). Variables listed here
that also appear in |
nonlinear_type |
Character. |
spline_k |
Integer. Spline basis dimension passed to
|
spline_bs |
Character. Spline basis type passed to
|
gp_k |
Integer or |
gp_cov |
Character. Covariance function: |
gp_c |
Numeric. Hilbert-space GP boundary-scale factor passed
to |
gp_scale |
Deprecated. Use |
handle_missing, m, mice_args
|
Missing-data arguments. When
|
control, chains, iter, warmup, cores, sample_prior, link
|
Sampler and
model-spec arguments forwarded to |
aux_args |
Optional named list of family-specific auxiliary
arguments (e.g.\ |
stanvars |
Optional |
predictors |
Deprecated. Use |
group |
Deprecated. Use |
sre |
Deprecated. Use |
sre_type |
Deprecated. Use |
... |
Additional arguments forwarded to |
The factory performs five duties that were previously duplicated across wrappers:
Validate that response, auxiliary, and optional
variables exist in data.
Run the family's response_check on the response and
report a human-readable error on failure.
Auto-select a missing-data strategy that respects the family's
supports_mi flag (e.g.\ binomial cannot use "model").
Build the brms formula with optional addition terms and apply spline/GP transformations.
Invoke the family's aux_param_hyperprior callback (if
defined) so distributions with a hyperprior on auxiliary parameters
– e.g.\ phi for the Beta family, shape for Gamma, nu for
Student-t – can inject Stan code without writing a thick wrapper
file.
An object of class hbmfit.
register_hbsae_model,
list_hbsae_models, hbm
library(hbsaems) data("data_lnln") # Equivalent to hbm_lnln(...) fit <- hbm_flex( family = "lognormal", response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", # area-level random effect: (1 | district) data = data_lnln, chains = 4, iter = 2000, refresh = 0 )library(hbsaems) data("data_lnln") # Equivalent to hbm_lnln(...) fit <- hbm_flex( family = "lognormal", response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", # area-level random effect: (1 | district) data = data_lnln, chains = 4, iter = 2000, refresh = 0 )
Returns a one-page summary of the fitted model's metadata: number of observations, family, link function, formula, MCMC settings, missing-data strategy, and so on.
hbm_info(model)hbm_info(model)
model |
An |
A named list of metadata.
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) hbm_info(model)library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) hbm_info(model)
Convenience wrapper that fits a Hierarchical Bayesian Small Area
Estimation model with a lognormal likelihood. Internally
delegates to hbm_flex with
family_key = "lognormal".
hbm_lnln( response, auxiliary = NULL, data, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, sampling_variance = NULL, fixed_params = NULL, predictors = NULL, sampling_var = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )hbm_lnln( response, auxiliary = NULL, data, area_var = NULL, area_re_structure = c("nested", "crossed"), spatial_var = NULL, spatial_model = NULL, car_type = NULL, sar_type = NULL, M = NULL, sampling_variance = NULL, fixed_params = NULL, predictors = NULL, sampling_var = NULL, group = NULL, sre = NULL, sre_type = NULL, ... )
response |
Character. Name of the response column (must be strictly positive). |
auxiliary |
Character vector of auxiliary (fixed-effect) variable names; corresponds to area-level covariates in SAE literature (see Rao & Molina 2015 Ch. 4). |
data |
A |
area_var |
Optional character vector. Name(s) of a column (or
columns) in |
area_re_structure |
Either |
spatial_var |
Optional character. Name of a column in |
spatial_model |
Optional character. Type of spatial dependence:
|
car_type |
Optional character. CAR sub-type passed to brms:
|
sar_type |
Optional character. SAR sub-type:
|
M |
Optional numeric matrix. Spatial weight matrix. Required
when |
sampling_variance |
Optional character. Name of a column in |
fixed_params |
Optional named list pinning distributional
parameters to known values. See |
predictors |
Deprecated. Use |
sampling_var |
Deprecated. Use |
group |
Deprecated. Use |
sre |
Deprecated. Use |
sre_type |
Deprecated. Use |
... |
Additional arguments forwarded to |
The response in area is assumed to follow
with the log-mean linked to auxiliary variables via
When the user supplies sampling_variance = "psi_i" (the column name of
the known per-area sampling variance ), is pinned for each area via an offset. This recovers
the Fay-Herriot-style lognormal model in which residual variability is
fully determined by the survey design.
An object of class hbmfit.
When the residual standard deviation
is pinned via sampling_variance (or via
fixed_params$sigma), the function refuses any additional
specification that would also set . Specifically, all
of the following are rejected with an informative error at
construction time:
sampling_variance and fixed_params$sigma.
sampling_variance and a user prior on
class = "sigma".
sampling_variance and a stanvars sampling
statement involving sigma.
auxiliary and the deprecated predictors
in the same call.
library(hbsaems) data("data_lnln") # -- 1. Standard lognormal SAE with area random effect ---------------------- fit1 <- hbm_lnln( response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", data = data_lnln, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 2. Fay-Herriot style with known sampling variance ---------------------- # (assumes psi_i column is available) fit2 <- hbm_lnln( response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", sampling_variance = "psi_i", data = data_lnln, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )library(hbsaems) data("data_lnln") # -- 1. Standard lognormal SAE with area random effect ---------------------- fit1 <- hbm_lnln( response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", data = data_lnln, chains = 4, iter = 2000, warmup = 1000, refresh = 0 ) # -- 2. Fay-Herriot style with known sampling variance ---------------------- # (assumes psi_i column is available) fit2 <- hbm_lnln( response = "y_obs", auxiliary = c("x1", "x2", "x3"), area_var = "district", sampling_variance = "psi_i", data = data_lnln, chains = 4, iter = 2000, warmup = 1000, refresh = 0 )
Bundles the nonlinear-term arguments of hbm into a
single named list. Same opt-in pattern as hbm_control.
hbm_nonlinear( terms, type = c("spline", "gp"), k = -1L, spline_bs = "tp", gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL )hbm_nonlinear( terms, type = c("spline", "gp"), k = -1L, spline_bs = "tp", gp_cov = "exp_quad", gp_c = NULL, gp_scale = NULL )
terms |
Character vector of predictor names to be wrapped in
|
type |
Character. |
k |
Integer. Basis dimension. For splines: passed to
|
spline_bs |
Character. Spline basis type ( |
gp_cov |
Character. GP covariance function: |
gp_c |
Numeric or NULL. HSGP boundary-scale factor for
|
gp_scale |
Deprecated. Use |
A named list of arguments for hbm, with class
c("hbm_config_nonlinear", "hbm_config", "list").
# Penalised regression spline (auto-chosen basis dimension): nl_spline <- hbm_nonlinear(c("x1", "x3"), type = "spline") # Spline with cubic-regression basis and fixed k: nl_cr <- hbm_nonlinear("x1", type = "spline", k = 8, spline_bs = "cr") # Hilbert-space approximate GP with Matern 5/2 covariance (recommended): nl_gp <- hbm_nonlinear("x2", type = "gp", k = 20, gp_cov = "matern25")# Penalised regression spline (auto-chosen basis dimension): nl_spline <- hbm_nonlinear(c("x1", "x3"), type = "spline") # Spline with cubic-regression basis and fixed k: nl_cr <- hbm_nonlinear("x1", type = "spline", k = 8, spline_bs = "cr") # Hilbert-space approximate GP with Matern 5/2 covariance (recommended): nl_gp <- hbm_nonlinear("x2", type = "gp", k = 20, gp_cov = "matern25")
Bundles the shrinkage-prior arguments of hbm into a
single named list. Same opt-in pattern as hbm_control.
hbm_priors( prior_type = c("default", "horseshoe", "r2d2"), prior = NULL, hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE )hbm_priors( prior_type = c("default", "horseshoe", "r2d2"), prior = NULL, hs_df = 1, hs_df_global = 1, hs_df_slab = 4, hs_scale_global = NULL, hs_scale_slab = 2, hs_par_ratio = NULL, hs_autoscale = TRUE, r2d2_mean_R2 = 0.5, r2d2_prec_R2 = 2, r2d2_cons_D2 = NULL, r2d2_autoscale = TRUE )
prior_type |
Character. One of |
prior |
Optional |
hs_df, hs_df_global, hs_df_slab, hs_scale_global, hs_scale_slab, hs_par_ratio, hs_autoscale
|
Horseshoe-prior hyperparameters; see |
r2d2_mean_R2, r2d2_prec_R2, r2d2_cons_D2, r2d2_autoscale
|
R2D2-prior
hyperparameters; see |
A named list of arguments for hbm.
hbm_control, hbm_nonlinear,
hbm
p_hs <- hbm_priors(prior_type = "horseshoe", hs_df = 1, hs_df_slab = 4) p_r2d2 <- hbm_priors(prior_type = "r2d2", r2d2_mean_R2 = 0.5)p_hs <- hbm_priors(prior_type = "horseshoe", hs_df = 1, hs_df_slab = 4) p_r2d2 <- hbm_priors(prior_type = "r2d2", r2d2_mean_R2 = 0.5)
Inspects the fitted model for common convergence problems and returns a
character vector of human-readable warnings. When no warnings apply, the
string "No warnings detected." is returned.
hbm_warnings(model)hbm_warnings(model)
model |
An |
A character vector of warning messages.
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) hbm_warnings(model)library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) hbm_warnings(model)
Quick getters for an hbmfit object: metadata, training data, and
diagnostic warnings.
Convenience constructor that calls new_hbmfit then
validate_hbmfit – the safe public entry-point if you
ever need to build an hbmfit object manually (e.g.\ when
adapting a non-brms model). In normal usage you do not
need to call this: hbm and the distribution-specific
wrappers do it for you.
hbmfit(model, data, missing_method = NULL)hbmfit(model, data, missing_method = NULL)
model |
A |
data |
The |
missing_method |
Character scalar or |
A validated hbmfit object.
new_hbmfit, validate_hbmfit,
hbm
# Uses brms-default MCMC settings (chains = 4, iter = 2000, # warmup = 1000) -- this toy data is only for verifying the # hbmfit class structure, not for inference. raw <- brms::brm(y ~ x1, data = data.frame(y = rnorm(10), x1 = 1:10), chains = 4, iter = 2000, warmup = 1000, refresh = 0) fit <- hbmfit(model = raw, data = data.frame(y = rnorm(10), x1 = 1:10), missing_method = NULL) validate_hbmfit(fit)# Uses brms-default MCMC settings (chains = 4, iter = 2000, # warmup = 1000) -- this toy data is only for verifying the # hbmfit class structure, not for inference. raw <- brms::brm(y ~ x1, data = data.frame(y = rnorm(10), x1 = 1:10), chains = 4, iter = 2000, warmup = 1000, refresh = 0) fit <- hbmfit(model = raw, data = data.frame(y = rnorm(10), x1 = 1:10), missing_method = NULL) validate_hbmfit(fit)
hbmfit is the result class returned by all model-fitting functions
in hbsaems: hbm, hbm_betalogitnorm,
hbm_binlogitnorm, and hbm_lnln. It wraps a
brmsfit object together with the original data and fitting metadata.
modelA brmsfit (or brmsfit_multiple) object.
missing_methodCharacter scalar – the missing-data strategy
used ("deleted", "model", "multiple") or
NULL when the data were complete.
dataThe original data.frame passed to the
fitting function before any row deletion or imputation. Downstream
functions (e.g.\ sae_predict) need all rows – including
those with missing – to produce area-level predictions.
handle_missingBackwards-compatibility alias for
missing_method; identical value.
These methods allow hbmfit objects to be used with familiar base-R
generics – summary(), plot(), coef(), etc. – in the
same way as brmsfit objects from brms.
object, x
|
An |
type |
Plot type. See Details for available options. |
newdata |
Optional new data frame for predictions. |
... |
Additional arguments passed to the underlying brms method. |
Varies by method; see brms documentation for the underlying return types.
Returns a single TRUE / FALSE based on the Gelman-Rubin
statistic.
is_converged(model, threshold = 1.1, ...)is_converged(model, threshold = 1.1, ...)
model |
An |
threshold |
|
... |
Currently unused. |
A single logical.
Lightweight type-checking predicates for all five result classes produced by hbsaems. Each predicate returns a single logical.
is.hbmfit(x) is.hbcc_results(x) is.hbmc_results(x) is.hbpc_results(x) is.hbsae_results(x)is.hbmfit(x) is.hbcc_results(x) is.hbmc_results(x) is.hbpc_results(x) is.hbsae_results(x)
x |
Any R object. |
A single logical value.
is.hbmfit("not a model") # FALSEis.hbmfit("not a model") # FALSE
Predicate for the "hbsaems_check" base class. All pre-fit
inspection functions in hbsaems (check_data,
check_spatial_weight, check_shiny_deps)
return an object that inherits from this class, enabling generic
handling.
is.hbsaems_check(x)is.hbsaems_check(x)
x |
Any R object. |
A single logical.
check_data, check_spatial_weight,
check_shiny_deps
chk <- check_data(data.frame(y = 1:5, x = 1:5), response = "y", auxiliary = "x") is.hbsaems_check(chk) # TRUE is.hbsaems_check("not a check") # FALSEchk <- check_data(data.frame(y = 1:5, x = 1:5), response = "y", auxiliary = "x") is.hbsaems_check(chk) # TRUE is.hbsaems_check("not a check") # FALSE
Returns the keys of all model specs currently registered in the
hbsaems model registry. Built-in models ("gaussian",
"beta", "binomial", "lognormal", etc.) are always
included; user-registered models appear in addition.
list_hbsae_models(verbose = FALSE)list_hbsae_models(verbose = FALSE)
verbose |
Logical. If |
Character vector of model keys, or a data.frame with
columns key, family, link, discrete, supports_mi when
verbose = TRUE.
register_hbsae_model, hbm_flex
list_hbsae_models() list_hbsae_models(verbose = TRUE)list_hbsae_models() list_hbsae_models(verbose = TRUE)
Density, distribution function, quantile function, and random generation
for the loglogistic (Fisk) distribution with scale parameter
mu > 0 and shape parameter beta > 0.
dloglogistic(x, mu = 1, beta = 1, log = FALSE) ploglogistic(q, mu = 1, beta = 1, lower.tail = TRUE, log.p = FALSE) qloglogistic(p, mu = 1, beta = 1, lower.tail = TRUE, log.p = FALSE) rloglogistic(n, mu = 1, beta = 1)dloglogistic(x, mu = 1, beta = 1, log = FALSE) ploglogistic(q, mu = 1, beta = 1, lower.tail = TRUE, log.p = FALSE) qloglogistic(p, mu = 1, beta = 1, lower.tail = TRUE, log.p = FALSE) rloglogistic(n, mu = 1, beta = 1)
x, q
|
Vector of quantiles ( |
mu |
Scale parameter ( |
beta |
Shape parameter ( |
log, log.p
|
Logical; if |
lower.tail |
Logical; if |
p |
Vector of probabilities ( |
n |
Number of random draws. |
Numeric vector of the same length as the input.
This implementation follows the canonical Wikipedia / flexsurv / eha parameterisation (Jackson 2016; Bennett 1983):
with probability density function
cumulative distribution function
median , and mean
when
. Equivalently, .
Why not match the brms lognormal convention? The
brms::lognormal() family parameterises on the log
scale (so is unconstrained and uses an identity link).
Doing the same for the log-logistic would require redefining
– which deviates from every
standard R reference (flexsurv, eha, Wolfram, scipy,
Stata). We deliberately follow the survival-analysis convention
instead: is the median (positive, log link), keeping
interpretation simple and posterior summaries comparable with the
rest of the R survival ecosystem.
Bennett, S. (1983). Log-logistic regression models for survival data. Journal of the Royal Statistical Society, Series C, 32(2), 165-171. doi:10.2307/2347295
Jackson, C. H. (2016). flexsurv: A platform for parametric survival modelling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
Kleiber, C., & Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. Wiley.
dloglogistic(c(0.5, 1, 2), mu = 1, beta = 2) ploglogistic(c(0.5, 1, 2), mu = 1, beta = 2) qloglogistic(c(0.25, 0.75), mu = 1, beta = 2) set.seed(1); rloglogistic(5, mu = 1, beta = 2)dloglogistic(c(0.5, 1, 2), mu = 1, beta = 2) ploglogistic(c(0.5, 1, 2), mu = 1, beta = 2) qloglogistic(c(0.25, 0.75), mu = 1, beta = 2) set.seed(1); rloglogistic(5, mu = 1, beta = 2)
Averages the area-level predictions across multiple fitted HBMs.
Weights may be supplied manually or computed automatically
from leave-one-out cross-validation via
loo_model_weights – the canonical Bayesian
stacking / pseudo-BMA approach of Yao et al.\ (2018).
model_average( ..., weights = NULL, method = c("manual", "stacking", "pseudobma"), predict_type = c("epred", "response", "linpred", "proportion"), newdata = NULL )model_average( ..., weights = NULL, method = c("manual", "stacking", "pseudobma"), predict_type = c("epred", "response", "linpred", "proportion"), newdata = NULL )
... |
Two or more |
weights |
Numeric weights of the same length as the number of
models, or |
method |
Character. Weighting method: |
predict_type |
Passed to |
newdata |
Optional new |
Three weighting modes are supported:
method = "manual" (default when weights is supplied):
use the user-supplied weights vector directly. Internally
normalised to sum to 1.
method = "stacking": weights are obtained from
loo::loo_model_weights(loo_list, method = "stacking"),
which optimises a log-score over a simplex. Recommended when
models are well-specified but capture different features of
the data (Yao et al.\ 2018).
method = "pseudobma": weights are obtained from
loo::loo_model_weights(loo_list, method = "pseudobma"),
a smoothed pseudo-BMA+ that resembles classical BMA but uses
PSIS-LOO log scores. Use as a robust default when one model
is much better than the others.
Internally calls sae_predict on each model and then
sae_aggregate with method = "weighted".
An hbsae_results object of averaged predictions. The
computed weights are attached as an attribute "weights".
Yao, Y., Vehtari, A., Simpson, D., & Gelman, A. (2018). Using stacking to average Bayesian predictive distributions (with discussion). Bayesian Analysis, 13(3), 917–1007. doi:10.1214/17-BA1091
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432.
# See ?model_compare_all for the full example fitting m1 / m2. # Manual: avg <- model_average(m1, m2, weights = c(0.6, 0.4)) # Stacking: avg <- model_average(m1, m2, method = "stacking") # Pseudo-BMA: avg <- model_average(m1, m2, method = "pseudobma")# See ?model_compare_all for the full example fitting m1 / m2. # Manual: avg <- model_average(m1, m2, weights = c(0.6, 0.4)) # Stacking: avg <- model_average(m1, m2, method = "stacking") # Pseudo-BMA: avg <- model_average(m1, m2, method = "pseudobma")
Computes LOO, WAIC, and posterior predictive check diagnostics. With a single model this gives goodness-of-fit metrics; with two models it adds a pairwise comparison.
model_compare( model, model2 = NULL, ndraws_ppc = 100, moment_match = FALSE, moment_match_args = list(), reloo_args = list(), plot_types = c("pp_check", "params"), comparison_metrics = c("loo", "waic", "bf"), run_prior_sensitivity = FALSE, sensitivity_vars = NULL, ... )model_compare( model, model2 = NULL, ndraws_ppc = 100, moment_match = FALSE, moment_match_args = list(), reloo_args = list(), plot_types = c("pp_check", "params"), comparison_metrics = c("loo", "waic", "bf"), run_prior_sensitivity = FALSE, sensitivity_vars = NULL, ... )
model |
An |
model2 |
Optional second |
ndraws_ppc |
Number of draws for the posterior-predictive plot
(default |
moment_match |
Logical; use moment matching for LOO
(default |
moment_match_args |
Named list of arguments for moment matching. |
reloo_args |
Named list of arguments for |
plot_types |
Character vector. Any subset of
|
comparison_metrics |
Character vector. Any subset of
|
run_prior_sensitivity |
Logical; run prior sensitivity analysis using
priorsense (default |
sensitivity_vars |
Variables for the sensitivity analysis. |
... |
Additional arguments. |
An hbmc_results object with components loo1,
waic1, pp_check, params, and – when model2
is given – also loo2, waic2, bf, and
model2.
model_compare_all, model_average
library(hbsaems) library(brms) data("data_fhnorm") FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) m1 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm), FAST)) m2 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2), data = data_fhnorm), FAST)) model_compare(m1) # single-model goodness-of-fit model_compare(m1, m2) # pairwise comparisonlibrary(hbsaems) library(brms) data("data_fhnorm") FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0) m1 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm), FAST)) m2 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2), data = data_fhnorm), FAST)) model_compare(m1) # single-model goodness-of-fit model_compare(m1, m2) # pairwise comparison
Ranks N models by LOO and/or WAIC, returning a sorted hbm_table.
Analogous to loo_compare.
model_compare_all(..., criterion = c("loo", "waic", "both"))model_compare_all(..., criterion = c("loo", "waic", "both"))
... |
Named |
criterion |
One of |
A hbm_table (a sorted data.frame) with columns
Model, ELPD_LOO, LOO_SE, LOO_rank (and
analogous *_WAIC* columns when requested).
library(hbsaems) library(brms) data("data_fhnorm") FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) m1 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1), data = data_fhnorm), FAST)) m2 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2), data = data_fhnorm), FAST)) model_compare_all(simple = m1, medium = m2)library(hbsaems) library(brms) data("data_fhnorm") FAST <- list(chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) m1 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1), data = data_fhnorm), FAST)) m2 <- do.call(hbm, c(list(formula = brms::bf(y ~ x1 + x2), data = data_fhnorm), FAST)) model_compare_all(simple = m1, medium = m2)
Primary model-comparison functions in hbsaems.
model_compare handles one or two models (supersedes deprecated
hbmc); model_compare_all handles N models (analogous
to loo_compare).
Wraps brms::mcmc_plot() and brms::pp_check() with named plot
types.
## S3 method for class 'hbmfit' plot( x, type = c("trace", "density", "acf", "nuts_energy", "rhat", "neff", "pp_check"), ... )## S3 method for class 'hbmfit' plot( x, type = c("trace", "density", "acf", "nuts_energy", "rhat", "neff", "pp_check"), ... )
x |
An |
type |
Plot type, one of: |
... |
Additional arguments passed to the underlying brms plotting function. |
A ggplot or bayesplot object.
Extract Posterior Draws as a Matrix
posterior_draws(model, params = NULL, ...)posterior_draws(model, params = NULL, ...)
model |
An |
params |
Optional character vector of parameter names; |
... |
Additional arguments forwarded to
|
A draws matrix (rows = MCMC iterations, columns = parameters).
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) draws <- posterior_draws(model) dim(draws)library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) draws <- posterior_draws(model) dim(draws)
The posterior_interval generic is re-exported
from rstantools and an S3 method is provided that dispatches on
hbmfit objects. This lets users call
posterior_interval(fit) on the return value of
hbm just as they would on a brmsfit.
## S3 method for class 'hbmfit' posterior_interval(object, prob = 0.95, params = NULL, ...)## S3 method for class 'hbmfit' posterior_interval(object, prob = 0.95, params = NULL, ...)
object |
An |
prob |
Coverage probability in |
params |
Optional character vector of parameter names to keep. |
... |
Additional arguments forwarded to
|
A matrix with two rows giving lower and upper bounds.
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) posterior_interval(model, prob = 0.90)library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) posterior_interval(model, prob = 0.90)
Returns fixed effects, random effects, and model-fit statistics in a single named list.
posterior_summary_hbm(model, probs = c(0.025, 0.975), ...)posterior_summary_hbm(model, probs = c(0.025, 0.975), ...)
model |
An |
probs |
Probability bounds for credible intervals
(default |
... |
Additional arguments forwarded to the underlying brms functions. |
A named list with components fixed_effects,
random_effects, and model_fit (containing loo,
waic, and R2).
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) s <- posterior_summary_hbm(model) s$fixed_effectslibrary(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) s <- posterior_summary_hbm(model) s$fixed_effects
Convenience wrappers around brms draw extractors.
Generates prior predictive samples from a model fit with
sample_prior = "only" and compares them to the observed data.
This is the primary prior-check function (supersedes the deprecated
hbpc).
prior_check(model, data = NULL, response_var = NULL, ndraws_ppc = 50, ...)prior_check(model, data = NULL, response_var = NULL, ndraws_ppc = 50, ...)
model |
An |
data |
A |
response_var |
Character scalar naming the response column.
Optional since 1.1.0: if |
ndraws_ppc |
Integer. Number of prior predictive draws to overlay
on the plot (default |
... |
Currently unused; reserved for future extensions. |
The prior predictive distribution is
the marginal distribution of new data under the prior alone. Comparing it to the observed data is a fast sanity check: if the prior predictive places no mass anywhere near the data, the priors are likely too tight or in the wrong location.
An hbpc_results object with components:
prior_predictive_plotA ggplot from
pp_check, or NULL if it could not be
generated.
prior_drawsA draws matrix from
posterior_predict sized
ndraws_ppc \times nrow(data).
observedThe observed response vector.
When data is omitted it is taken from model$data (the model
frame stored on the fit). When response_var is omitted it is read
from the model formula via brmsterms; if the formula
has no left-hand side (so no response can be determined), an error is
raised asking the caller to supply response_var explicitly.
library(hbsaems) library(brms) data("data_fhnorm") model_prior <- hbm( formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, sample_prior = "only", prior = c( brms::prior(normal(0, 1), class = "b"), brms::prior(normal(0, 5), class = "Intercept") ), chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 42, refresh = 0 ) # Explicit (as before): pc <- prior_check(model_prior, data = data_fhnorm, response_var = "y") # New in 1.1.0 -- data and response auto-detected from the model: pc <- prior_check(model_prior) print(pc) plot(pc)library(hbsaems) library(brms) data("data_fhnorm") model_prior <- hbm( formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, sample_prior = "only", prior = c( brms::prior(normal(0, 1), class = "b"), brms::prior(normal(0, 5), class = "Intercept") ), chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 42, refresh = 0 ) # Explicit (as before): pc <- prior_check(model_prior, data = data_fhnorm, response_var = "y") # New in 1.1.0 -- data and response auto-detected from the model: pc <- prior_check(model_prior) print(pc) plot(pc)
The prior_draws generic is re-exported from
brms and an S3 method is provided that dispatches on
hbmfit objects. Requires the model to have been fit with
sample_prior = "yes" or sample_prior = "only".
## S3 method for class 'hbmfit' prior_draws(x, ...)## S3 method for class 'hbmfit' prior_draws(x, ...)
x |
An |
... |
Additional arguments forwarded to
|
A data.frame of prior draws or NULL if no prior
samples were stored in the model.
library(hbsaems) library(brms) data("data_fhnorm") # `sample_prior = "yes"` works best when all coefficients have a # proper prior; supply explicit priors on the regression class. model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect sample_prior = "yes", prior = c( brms::prior(normal(0, 1), class = "b"), brms::prior(normal(0, 5), class = "Intercept") ), chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) pd <- prior_draws(model) head(pd)library(hbsaems) library(brms) data("data_fhnorm") # `sample_prior = "yes"` works best when all coefficients have a # proper prior; supply explicit priors on the regression class. model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect sample_prior = "yes", prior = c( brms::prior(normal(0, 1), class = "b"), brms::prior(normal(0, 5), class = "Intercept") ), chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) pd <- prior_draws(model) head(pd)
Computes prior and likelihood power-scaling sensitivity diagnostics for
a fitted hbmfit model using the priorsense package.
Useful for assessing whether posterior conclusions are driven by the
prior or the data – a critical step in any principled Bayesian SAE
workflow.
prior_sensitivity(model, ...)prior_sensitivity(model, ...)
model |
An |
... |
Additional arguments forwarded to
|
Prior sensitivity analysis answers the question: “If I had used a slightly different prior, would the substantive conclusions change?”. The power-scaling approach of Kallioinen et al.\ (2023) detects:
Prior–likelihood conflict: the posterior moves non-negligibly when the prior is up- or down-weighted. Often indicates an overly informative or misspecified prior.
Weak likelihood: the posterior is dominated by the prior. Common in SAE for areas with few sampled units.
Reported diagnostics include the Kullback–Leibler divergence between
the original posterior and the power-scaled posterior (prior,
likelihood) and a categorical flag (prior-data conflict,
strong prior, -).
Computational cost. No re-sampling is required: importance sampling reuses the existing posterior draws. Hence a typical run costs only a few seconds even for large hierarchical models.
A powerscale_sensitivity_summary object (data frame)
with one row per monitored parameter and columns
variable, prior, likelihood, diagnosis.
NULL (with a message) when the priorsense package is
not installed.
Always. Specifically:
After every model fit, before drawing substantive conclusions.
Whenever convergence diagnostics from
convergence_check() are clean but the posterior
seems implausibly narrow or implausibly wide.
When comparing models with shrinkage priors – horseshoe and R2D2 are both informative, and small differences in their hyperparameters can move estimates noticeably.
Kallioinen, N., Paananen, T., Burkner, P.-C., & Vehtari, A.\ (2024). Detecting and diagnosing prior and likelihood sensitivity with power-scaling. Statistics and Computing, 34, 57. doi:10.1007/s11222-023-10366-5
prior_check, convergence_check,
model_compare
if (requireNamespace("priorsense", quietly = TRUE)) { data("data_fhnorm") fit <- hbm(brms::bf(y ~ x1 + x2), data = data_fhnorm, re = ~(1 | regency), chains = 4, iter = 2000, refresh = 0) ps <- prior_sensitivity(fit) print(ps) }if (requireNamespace("priorsense", quietly = TRUE)) { data("data_fhnorm") fit <- hbm(brms::bf(y ~ x1 + x2), data = data_fhnorm, re = ~(1 | regency), chains = 4, iter = 2000, refresh = 0) ps <- prior_sensitivity(fit) print(ps) }
Loads the contents of inst/stan/<name>.stan from the installed
hbsaems (or from the local source tree when running tests). This is
a thin wrapper around readLines() that resolves the package path
once and validates that the file exists.
read_stan_function(name)read_stan_function(name)
name |
Character. The distribution name – must match a
|
A character scalar containing the Stan code (functions block).
build_brms_custom_family,
register_hbsae_brms_custom.
code <- read_stan_function("hbsae_loglogistic") cat(code)code <- read_stan_function("hbsae_loglogistic") cat(code)
Wraps a brms::custom_family + stanvars pair into a model
spec usable by hbm and the flexible factory
hbm_flex. The custom likelihood is then available
throughout the package – in run_sae_app, in
sae_predict, in the Shiny application, and so on – as if it
were a built-in family.
register_hbsae_brms_custom( key, custom_family, stanvars, response_check = NULL, response_check_msg = NULL, supports_mi = FALSE, discrete = FALSE, overwrite = FALSE )register_hbsae_brms_custom( key, custom_family, stanvars, response_check = NULL, response_check_msg = NULL, supports_mi = FALSE, discrete = FALSE, overwrite = FALSE )
key |
Character. Unique registry key (e.g.\ |
custom_family |
A |
stanvars |
A |
response_check |
Optional function |
response_check_msg |
Character. Error message if
|
supports_mi |
Logical. Whether |
discrete |
Logical. Whether the family is discrete (default
|
overwrite |
Logical. If |
After registration, the family is usable in any of the following ways:
# Direct via the registry key
fit <- hbm_flex("loglogistic", response = "y",
auxiliary = c("x1", "x2"),
data = d, area_var = "area")
# Direct via hbm() (canonical):
fit <- hbm(brms::bf(y ~ x1 + x2 + (1 | area)),
data = d,
hb_sampling = "loglogistic")
Internally, hbm detects that the registered family is a
brms::custom_family and (i) passes the family object directly to
brms::brm() instead of constructing a built-in
brms::brmsfamily(), and (ii) merges the family's stanvars with
any user-supplied stanvars.
Invisibly returns the registered model spec (a list).
register_hbsae_model,
brms_custom_loglogistic,
brms_custom_shifted_loglogistic,
brms vignette on custom families.
library(hbsaems) library(brms) # Loglogistic (built in to hbsaems 1.0.0; this just shows the # registration mechanism). We use a key with a "_user" suffix to # avoid shadowing the built-in "loglogistic" entry, and unregister # at the end so re-running this example block keeps the registry # clean. ll <- brms_custom_loglogistic() if ("loglogistic_user" %in% list_hbsae_models()) { # Idempotent rerun: remove any previous registration first. rm("loglogistic_user", envir = hbsaems:::.hbsae_model_env) } register_hbsae_brms_custom( key = "loglogistic_user", custom_family = ll$custom_family, stanvars = ll$stanvars_family, response_check = function(y) all(y > 0, na.rm = TRUE), response_check_msg = "Loglogistic response must be positive." ) "loglogistic_user" %in% list_hbsae_models() # Cleanup so the registry is unchanged after the example runs. rm("loglogistic_user", envir = hbsaems:::.hbsae_model_env)library(hbsaems) library(brms) # Loglogistic (built in to hbsaems 1.0.0; this just shows the # registration mechanism). We use a key with a "_user" suffix to # avoid shadowing the built-in "loglogistic" entry, and unregister # at the end so re-running this example block keeps the registry # clean. ll <- brms_custom_loglogistic() if ("loglogistic_user" %in% list_hbsae_models()) { # Idempotent rerun: remove any previous registration first. rm("loglogistic_user", envir = hbsaems:::.hbsae_model_env) } register_hbsae_brms_custom( key = "loglogistic_user", custom_family = ll$custom_family, stanvars = ll$stanvars_family, response_check = function(y) all(y > 0, na.rm = TRUE), response_check_msg = "Loglogistic response must be positive." ) "loglogistic_user" %in% list_hbsae_models() # Cleanup so the registry is unchanged after the example runs. rm("loglogistic_user", envir = hbsaems:::.hbsae_model_env)
Adds a new model spec to the hbsaems model registry so that it
can be used by hbm and the flexible factory
hbm_flex. Useful for extending the package with new
likelihoods (e.g.\ Gamma, Tweedie, Skew-Normal),
new link functions, or new auxiliary-parameter hyperpriors without
modifying hbsaems source.
register_hbsae_model( key, family, link = "identity", discrete = FALSE, supports_mi = !discrete, has_addition_term = FALSE, addition_template = NULL, response_check = function(y) TRUE, response_check_msg = NULL, default_priors = function(...) NULL, aux_param_hyperprior = NULL, overwrite = FALSE )register_hbsae_model( key, family, link = "identity", discrete = FALSE, supports_mi = !discrete, has_addition_term = FALSE, addition_template = NULL, response_check = function(y) TRUE, response_check_msg = NULL, default_priors = function(...) NULL, aux_param_hyperprior = NULL, overwrite = FALSE )
key |
Character. Unique identifier (e.g.\ |
family |
Character. The brms family name passed to
|
link |
Character. Default link function (default
|
discrete |
Logical. Is the response discrete? Affects whether
|
supports_mi |
Logical. Whether |
has_addition_term |
Logical. Whether the LHS uses an addition term
such as |
addition_template |
Character. An |
response_check |
Function |
response_check_msg |
Character. Error message displayed when
|
default_priors |
Function |
aux_param_hyperprior |
Optional function
|
overwrite |
Logical. Permit overwriting an existing key
(default |
After registering, you can fit a model directly with
hbm_flex:
register_hbsae_model(
key = "gamma_log",
family = "Gamma",
link = "log",
discrete = FALSE,
supports_mi = TRUE,
response_check = function(y) all(y > 0, na.rm = TRUE),
response_check_msg = "Gamma response must be strictly positive."
)
fit <- hbm_flex(
family_key = "gamma_log",
response = "expenditure",
auxiliary = c("x1", "x2"),
data = my_data
)
Invisibly returns the registered model spec (a named list).
Opens the interactive HBSAE application in the default web browser. The application provides a graphical interface for data upload, exploratory analysis, model specification, fitting, and result visualisation, covering all modelling functions in hbsaems.
run_sae_app(check_deps = TRUE)run_sae_app(check_deps = TRUE)
check_deps |
Logical. Whether to verify dependencies before
launching (default |
Launch the HBSAE Shiny Application
The application is located in inst/shiny/sae_app/app.R within the
installed package. It depends on several packages that are listed in
Suggests (rather than Imports) so they are not required for
users who only need the modelling functions.
Two classes of dependencies are checked at launch:
Packages without which the app cannot start (shinydashboard, DT). Missing critical packages raise an error.
Packages that enable individual panels and features (shinyWidgets, readxl, energy, minerva, sf, spdep, bridgesampling). Missing optional packages produce a warning and an in-app banner; the corresponding feature degrades gracefully.
Use check_shiny_deps() to inspect dependency status without
launching the app.
Does not return a value; called for its side effect of launching a Shiny server in the current R session.
if (interactive()) { run_sae_app() }if (interactive()) { run_sae_app() }
Combines area-level predictions across multiple hbsae_results
objects. All objects must report predictions for the same number of areas
(in the same order).
sae_aggregate(..., method = c("mean", "median", "weighted"), weights = NULL)sae_aggregate(..., method = c("mean", "median", "weighted"), weights = NULL)
... |
Two or more |
method |
One of |
weights |
Numeric vector of weights, required when
|
An hbsae_results object containing the combined predictions.
p1 <- structure(list(result_table = data.frame(Prediction = 1:3, RSE_percent = c(5, 5, 5)), rse_model = 5, pred = 1:3), class = "hbsae_results") p2 <- structure(list(result_table = data.frame(Prediction = 2:4, RSE_percent = c(4, 4, 4)), rse_model = 4, pred = 2:4), class = "hbsae_results") sae_aggregate(p1, p2, method = "mean") sae_aggregate(p1, p2, method = "weighted", weights = c(0.6, 0.4))p1 <- structure(list(result_table = data.frame(Prediction = 1:3, RSE_percent = c(5, 5, 5)), rse_model = 5, pred = 1:3), class = "hbsae_results") p2 <- structure(list(result_table = data.frame(Prediction = 2:4, RSE_percent = c(4, 4, 4)), rse_model = 4, pred = 2:4), class = "hbsae_results") sae_aggregate(p1, p2, method = "mean") sae_aggregate(p1, p2, method = "weighted", weights = c(0.6, 0.4))
Adjusts area-level predictions so that their weighted sum matches one or more known aggregate “benchmark” totals (e.g.\ official provincial or national figures). Two modes are supported:
sae_benchmark( predictions, target, weights = NULL, target_type = c("total", "mean"), method = c("ratio", "difference", "raking"), groups = NULL, posterior = NULL, probs = c(0.025, 0.5, 0.975), max_iter = 100L, tol = 1e-08 )sae_benchmark( predictions, target, weights = NULL, target_type = c("total", "mean"), method = c("ratio", "difference", "raking"), groups = NULL, posterior = NULL, probs = c(0.025, 0.5, 0.975), max_iter = 100L, tol = 1e-08 )
predictions |
An |
target |
Numeric. For |
weights |
Optional numeric vector of length equal to the number
of areas in |
target_type |
Character. Either |
method |
Character. One of:
|
groups |
Optional integer/character vector of length equal to the
number of areas, assigning each area to a benchmarking group.
Required for |
posterior |
Optional logical or matrix. Controls Bayesian mode:
|
probs |
Numeric vector of quantile probabilities to summarise the
adjusted posterior with (default |
max_iter |
Integer. Maximum iterations for raking (default
|
tol |
Numeric. Convergence tolerance for raking (default
|
Applies the adjustment only to the posterior mean. The RSE column is left unchanged as a working approximation.
posterior = TRUE or supply
a posterior matrix)Applies the adjustment to every posterior draw and recomputes SD, quantiles, and RSE from the adjusted draws. This is the statistically correct procedure and produces proper uncertainty intervals after benchmarking.
Benchmarking is widely used in official statistics to ensure consistency between model-based small-area estimates and published aggregate totals from a more reliable source.
Applying a deterministic adjustment factor to a posterior point estimate distorts the uncertainty structure: a multiplicative factor scales both the mean and the SD by the same amount (so the RSE percentage is preserved), but an additive shift does not change the SD, so the RSE percentage shrinks at small areas and grows at large ones. Neither of these post-hoc fixes is guaranteed to be correct in general. In Bayesian mode every draw is benchmarked independently, so the resulting posterior carries the right uncertainty.
An hbsae_results object with benchmarked
Prediction values, plus an additional element
$benchmark_info that records method, target,
weights, the implied adjustment factor, and
converged (logical, raking only). In Bayesian mode the
result_table also contains updated SD and
RSE_percent columns reflecting the post-benchmark posterior
uncertainty, plus quantile columns named after probs.
Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science 28(1), 40–68.
Wang, J., Fuller, W. A., & Qu, Y. (2008). Small area estimation under a restriction. Survey Methodology 34, 29–36.
sae_predict, sae_aggregate,
model_average
# Synthetic predictions (point-estimate mode) p <- structure( list( result_table = data.frame(Prediction = c(10, 12, 9, 11), SD = c(1, 1, 1, 1), RSE_percent = c(10, 8, 11, 9)), rse_model = 9.5, pred = c(10, 12, 9, 11) ), class = "hbsae_results" ) bm1 <- sae_benchmark(p, target = 50, method = "ratio") # Fully Bayesian mode with user-supplied draws set.seed(1) D <- 1000 draws <- matrix(rnorm(D * 4, mean = c(10, 12, 9, 11), sd = 1), nrow = D, byrow = TRUE) bm2 <- sae_benchmark(p, target = 50, method = "ratio", posterior = draws) bm2$result_table # SD, RSE updated from draws# Synthetic predictions (point-estimate mode) p <- structure( list( result_table = data.frame(Prediction = c(10, 12, 9, 11), SD = c(1, 1, 1, 1), RSE_percent = c(10, 8, 11, 9)), rse_model = 9.5, pred = c(10, 12, 9, 11) ), class = "hbsae_results" ) bm1 <- sae_benchmark(p, target = 50, method = "ratio") # Fully Bayesian mode with user-supplied draws set.seed(1) D <- 1000 draws <- matrix(rnorm(D * 4, mean = c(10, 12, 9, 11), sd = 1), nrow = D, byrow = TRUE) bm2 <- sae_benchmark(p, target = 50, method = "ratio", posterior = draws) bm2$result_table # SD, RSE updated from draws
Filter SAE Predictions by a Logical Condition
sae_filter(x, condition)sae_filter(x, condition)
x |
An |
condition |
Logical vector of length equal to the number of areas. |
A new hbsae_results object containing only rows where
condition is TRUE.
p <- structure(list(result_table = data.frame(Prediction = 1:5, RSE_percent = rep(5, 5)), rse_model = 5, pred = 1:5), class = "hbsae_results") sae_filter(p, p$pred > 2)p <- structure(list(result_table = data.frame(Prediction = 1:5, RSE_percent = rep(5, 5)), rse_model = 5, pred = 1:5), class = "hbsae_results") sae_filter(p, p$pred > 2)
Primary SAE prediction function in hbsaems (supersedes deprecated
hbsae). Computes area-level posterior predictive means,
standard deviations, and relative standard errors (RSE).
sae_predict( model, newdata = NULL, predict_type = c("epred", "response", "linpred", "proportion"), ... )sae_predict( model, newdata = NULL, predict_type = c("epred", "response", "linpred", "proportion"), ... )
model |
An |
newdata |
Optional new |
predict_type |
Character; the posterior quantity to summarise. One of:
Binomial note. For a binomial family |
... |
Additional arguments forwarded to
|
For each area , the function computes
where are posterior draws of the area-mean target
(predict_type = "epred", the default) – or of a new observation
when predict_type = "response" – and is
the number of draws. The relative standard
error is .
An hbsae_results object with components:
result_tableA data.frame with columns
Prediction, SD, RSE_percent.
rse_modelMean of RSE_percent across all areas.
predNumeric vector of point predictions (= result_table$Prediction).
sae_aggregate, model_average,
sae_transform, sae_scale,
sae_filter
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm( formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0 ) est <- sae_predict(model) summary(est) plot(est, type = "predictions") plot(est, type = "uncertainty")library(hbsaems) library(brms) data("data_fhnorm") model <- hbm( formula = brms::bf(y ~ x1 + x2 + x3), data = data_fhnorm, chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 123, refresh = 0 ) est <- sae_predict(model) summary(est) plot(est, type = "predictions") plot(est, type = "uncertainty")
Standardise SAE Predictions
sae_scale(x, center = TRUE, scale = TRUE)sae_scale(x, center = TRUE, scale = TRUE)
x |
An |
center |
Logical or numeric centering (passed to |
scale |
Logical or numeric scaling (passed to |
A new hbsae_results object with standardised predictions.
p <- structure(list(result_table = data.frame(Prediction = 1:5, RSE_percent = rep(5, 5)), rse_model = 5, pred = 1:5), class = "hbsae_results") sae_scale(p)p <- structure(list(result_table = data.frame(Prediction = 1:5, RSE_percent = rep(5, 5)), rse_model = 5, pred = 1:5), class = "hbsae_results") sae_scale(p)
Apply a Transformation to SAE Predictions
sae_transform(x, fun, ...)sae_transform(x, fun, ...)
x |
An |
fun |
A function applied element-wise to the predictions. |
... |
Additional arguments passed to |
A new hbsae_results object.
p <- structure(list(result_table = data.frame(Prediction = c(2, 4, 8), RSE_percent = c(5, 5, 5)), rse_model = 5, pred = c(2, 4, 8)), class = "hbsae_results") sae_transform(p, log)p <- structure(list(result_table = data.frame(Prediction = c(2, 4, 8), RSE_percent = c(5, 5, 5)), rse_model = 5, pred = c(2, 4, 8)), class = "hbsae_results") sae_transform(p, log)
Density, distribution function, quantile function and random generation
for the shifted log-logistic (generalised log-logistic) distribution
with location mu (real), scale sigma > 0 and shape
xi (real). The two-parameter logistic distribution is
recovered as xi -> 0.
dshifted_loglogistic(x, mu = 0, sigma = 1, xi = 0, log = FALSE) pshifted_loglogistic( q, mu = 0, sigma = 1, xi = 0, lower.tail = TRUE, log.p = FALSE ) qshifted_loglogistic( p, mu = 0, sigma = 1, xi = 0, lower.tail = TRUE, log.p = FALSE ) rshifted_loglogistic(n, mu = 0, sigma = 1, xi = 0)dshifted_loglogistic(x, mu = 0, sigma = 1, xi = 0, log = FALSE) pshifted_loglogistic( q, mu = 0, sigma = 1, xi = 0, lower.tail = TRUE, log.p = FALSE ) qshifted_loglogistic( p, mu = 0, sigma = 1, xi = 0, lower.tail = TRUE, log.p = FALSE ) rshifted_loglogistic(n, mu = 0, sigma = 1, xi = 0)
x, q
|
Numeric vector of quantiles. |
mu |
Location parameter (real; equals the median). |
sigma |
Scale parameter ( |
xi |
Shape parameter (real; |
log, log.p
|
Logical. See |
lower.tail |
Logical. See |
p |
Vector of probabilities. |
n |
Number of random draws. |
Numeric vector.
This implementation uses the GEV-style parameterisation of
Hosking & Wallis (1997) and the Flood Estimation Handbook (Robson &
Reed 1999), in which is a pure location parameter (the
median), a pure scale parameter and a pure
shape parameter:
with corresponding density
The support depends on :
: (bounded below).
: (bounded above).
: (logistic limit).
The median is always ; the mean exists when
and is ,
. Reducing further, the family contains:
the standard log-logistic when (reparameterised);
the logistic distribution as ;
the generalised Pareto family at .
Why this parameterisation? An alternative
"simple-shift" form, ,
exists in the literature (Geskus 2001) and is closer in spirit to
brms::shifted_lognormal()'s positive shift ndt. We
deliberately follow the GEV-style parameterisation because
it provides a smooth limit to the logistic
distribution at ;
the parameters are
orthogonally interpretable (location / scale / shape);
it is the canonical form in hydrology and extreme-value applications (Hosking & Wallis 1997).
Geskus, R. B. (2001). Methods for estimating the AIDS incubation time distribution when date of seroconversion is censored. Statistics in Medicine, 20(5), 795-812.
Hosking, J. R. M., & Wallis, J. R. (1997). Regional Frequency Analysis: An Approach Based on L-Moments. Cambridge University Press. ISBN 0-521-43045-3.
Robson, A., & Reed, D. (1999). Flood Estimation Handbook, Volume 3: Statistical Procedures for Flood Frequency Estimation. Institute of Hydrology, Wallingford, UK.
dshifted_loglogistic(c(1, 2, 5), mu = 0, sigma = 1, xi = 0.5) pshifted_loglogistic(c(1, 2, 5), mu = 0, sigma = 1, xi = 0.5) qshifted_loglogistic(c(0.25, 0.75), mu = 0, sigma = 1, xi = 0.5) set.seed(1); rshifted_loglogistic(5, mu = 0, sigma = 1, xi = 0.5)dshifted_loglogistic(c(1, 2, 5), mu = 0, sigma = 1, xi = 0.5) pshifted_loglogistic(c(1, 2, 5), mu = 0, sigma = 1, xi = 0.5) qshifted_loglogistic(c(0.25, 0.75), mu = 0, sigma = 1, xi = 0.5) set.seed(1); rshifted_loglogistic(5, mu = 0, sigma = 1, xi = 0.5)
A row-standardised spatial weight matrix for 100 regencies, used for
fitting Simultaneous Autoregressive (SAR) random effects. Pairs with
data_fhnorm's regency column.
spatial_weight_sarspatial_weight_sar
A numeric matrix with row- and
column-names regency_001 .. regency_100 and row sums
equal to one (when at least one neighbour is present).
Simulated.
Provides a fall-back summary that simply calls the underlying
print() method. Subclasses that need a more detailed summary
(e.g.\ summary.hbsaems_data_check) override this.
## S3 method for class 'hbsaems_check' summary(object, ...)## S3 method for class 'hbsaems_check' summary(object, ...)
object |
An object inheriting from |
... |
Unused. |
The input object, invisibly.
Looks up a translation key in the hbsaems translation dictionary. When the requested language does not contain the key, it falls back to English; when English also lacks the key, it returns the key itself wrapped in brackets so missing strings stand out during development.
tr(key, lang = "en")tr(key, lang = "en")
key |
Character. Translation key (e.g. |
lang |
Character. Language code; currently |
A character scalar with the translated UI string.
tr("menu_home", "en") # "Home" tr("menu_home", "id") # "Beranda" tr("nonexistent", "id") # "[nonexistent]"tr("menu_home", "en") # "Home" tr("menu_home", "id") # "Beranda" tr("nonexistent", "id") # "[nonexistent]"
Useful when adding a new language: enumerates every key that needs a translation.
tr_keys(lang = "en")tr_keys(lang = "en")
lang |
Character. Reference language (default |
Sorted character vector of translation keys.
head(tr_keys())head(tr_keys())
List Available Languages
tr_langs()tr_langs()
Character vector of supported language codes.
tr_langs()tr_langs()
Refits an hbmfit model with one or more arguments changed. Useful
for re-running with longer chains, more iterations, or new data without
retyping the full hbm call.
update_hbm( object, newdata = NULL, formula. = NULL, iter = NULL, warmup = NULL, chains = NULL, cores = NULL, control = NULL, ... )update_hbm( object, newdata = NULL, formula. = NULL, iter = NULL, warmup = NULL, chains = NULL, cores = NULL, control = NULL, ... )
object |
An |
newdata |
Optional replacement |
formula. |
Optional new formula (note the trailing dot, following
|
iter |
Optional new total number of iterations. |
warmup |
Optional new warm-up length. |
chains |
Optional new number of MCMC chains. |
cores |
Optional new number of cores. |
control |
Optional new control list (e.g.\ |
... |
Additional arguments forwarded to |
An updated hbmfit object.
When you supply a new formula. that references variables not in
the original model frame, brms's default update.brmsfit
refuses with "New variables found ...; supply data again via
newdata". update_hbm catches this specific error and
automatically retries with newdata = object$data (the data
frame stored on the original hbmfit). Pass an explicit
newdata to override this behaviour.
Models fitted with sampling_variance, n + deff
(in hbm_betalogitnorm), or fixed_params carry hidden
offset columns named .hbsaems_<par>_fixed in their data
frames. When update_hbm receives a newdata that
does not have these columns, brms refuses to refit with
"variables can neither be found in 'data' nor in 'data2'".
update_hbm now detects this case and:
warns the user when offset columns are present in the
original data but missing in newdata;
either copies the columns over from the original
object$data (when nrow(newdata) ==
nrow(object$data), which is the typical "same areas,
updated covariates" case), or
raises an informative error pointing the user to either
supply the columns in newdata explicitly or refit
from scratch with a fresh hbm() call.
library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) # Re-run with slightly more iterations (no data change needed). # In production you would jump to e.g. iter = 4000, warmup = 2000. model2 <- update_hbm(model, iter = 1000, warmup = 500) # Add a predictor: the auto-fallback transparently retries with the # stored data frame. Equivalent to passing newdata = data_fhnorm. model3 <- update_hbm(model, formula. = . ~ . + x2)library(hbsaems) library(brms) data("data_fhnorm") model <- hbm(brms::bf(y ~ x1), data = data_fhnorm, re = ~ (1 | regency), # area-level random effect chains = 4, iter = 2000, warmup = 1000, cores = 1, seed = 1, refresh = 0) # Re-run with slightly more iterations (no data change needed). # In production you would jump to e.g. iter = 4000, warmup = 2000. model2 <- update_hbm(model, iter = 1000, warmup = 500) # Add a predictor: the auto-fallback transparently retries with the # stored data frame. Equivalent to passing newdata = data_fhnorm. model3 <- update_hbm(model, formula. = . ~ . + x2)
Public validator for the hbmfit-class{hbmfit} S3 class.
Runs all invariants that the cheap constructor (new_hbmfit)
is permitted to skip. Useful when reconstructing an hbmfit
object manually, reading one back from disk, or testing custom
family wrappers.
validate_hbmfit(x)validate_hbmfit(x)
x |
An object to validate. |
Invariants verified:
Object is a list with class "hbmfit".
Has mandatory slots: model, missing_method,
data.
model inherits from brmsfit or
brmsfit_multiple.
missing_method is NULL or a single character
string in c("deleted", "multiple", "model").
data is a data.frame with row.
The handle_missing alias (if present) equals
missing_method.
The input x, invisibly, when all invariants hold.
Otherwise raises an informative error.
# Minimal example without area-level RE (fixed-effects baseline) -- # suppress the area-RE advisory because this 5-row toy dataset cannot # meaningfully estimate a random effect. Uses brms-default MCMC # settings (chains = 4, iter = 2000, warmup = 1000); on this toy # data the fit is only used to verify the hbmfit class structure, # not for inference. fit <- suppressWarnings( hbm(brms::bf(y ~ x1), data = data.frame(y = rnorm(5), x1 = 1:5), chains = 4, iter = 2000, warmup = 1000, refresh = 0) ) validate_hbmfit(fit)# Minimal example without area-level RE (fixed-effects baseline) -- # suppress the area-RE advisory because this 5-row toy dataset cannot # meaningfully estimate a random effect. Uses brms-default MCMC # settings (chains = 4, iter = 2000, warmup = 1000); on this toy # data the fit is only used to verify the hbmfit class structure, # not for inference. fit <- suppressWarnings( hbm(brms::bf(y ~ x1), data = data.frame(y = rnorm(5), x1 = 1:5), chains = 4, iter = 2000, warmup = 1000, refresh = 0) ) validate_hbmfit(fit)