Minor release with Bayesian-workflow improvements. Read the behaviour change below before upgrading production scripts.
sae_predict() now defaults to predict_type = "epred" (the posterior
distribution of the area mean theta_i), replacing the previous
posterior-predictive-of-a-new-observation behaviour. This is the
theoretically correct target for small area estimation: the reported SD
/ RSE_percent now measure the uncertainty of the estimator theta_i and
are therefore smaller than in 1.0.x (which included observation-level
likelihood variance). Every RSE number your scripts produce will
change. To reproduce the old numbers exactly, pass
predict_type = "response". A third option, "linpred", returns the
linear predictor on the response scale.Four SAE-oriented families added to the model registry, usable via
hbm_flex() / hbm() and list_hbsae_models():
gamma (positive continuous, log link -- incomes/expenditures),
skew_normal (skewed continuous area means),
student (heavy-tailed, robust Fay-Herriot), and
hurdle_lognormal (positive continuous with excess zeros). Each
carries the appropriate response-domain check (e.g. Gamma requires y > 0;
hurdle-lognormal allows y >= 0).
hbm_flex()'s family argument is now the primary way to choose the
family. It accepts a string (e.g. "gaussian", exactly like the old
family_key), a brms family object (e.g. gaussian(), binomial()), or a
registered custom family object (e.g.
brms_custom_loglogistic()$custom_family). The family name is matched to a
registered spec, so custom distributions reuse the same Stan code
(stanvars); an unregistered custom family raises an informative error. The
family_key argument is retained as a backward-compatible alias (not
deprecated). Supply either family or family_key, not both.
hbm() gains a family argument mirroring hbm_flex(): pass a string
(e.g. "gaussian") or a brms family object (e.g. gaussian(),
Gamma(link = "log")); a link carried on the object is honoured when
hb_link is left at its default. hb_sampling is retained as a
backward-compatible alias. Supply either family or hb_sampling, not
both. family is thus the uniform distribution selector across hbm()
and hbm_flex().
sae_predict() gains the predict_type = c("epred", "response", "linpred", "proportion") argument (see above). model_average() and
sae_benchmark() honour it. For a binomial family, posterior_epred()
returns the expected COUNT (n_i * p_i), which is not comparable across areas
with unequal trial counts; predict_type = "proportion" returns the area
proportion p_i, and predict_type = "epred" on a binomial model now
auto-converts to p_i (with a warning). Use "response" for the count.
convergence_check() now reports divergent transitions, E-BFMI per
chain, and the max-treedepth hit rate as numbers in summary() (not
just the energy plot), and warns when divergences occur or E-BFMI < 0.3.
model_compare() / model_compare_all() now check Pareto-k for
PSIS-LOO reliability: a n_high_k column is added to the ranking table and
a warning is issued when any k > 0.7. The previously dead reloo_args
argument is now functional -- supplying it triggers brms::reloo() to
refit the high-k folds.
hbm() now defaults to control = list(adapt_delta = 0.95, max_treedepth = 12) (the values long documented as the recommended
starting point) instead of brms's adapt_delta = 0.8. Hierarchical SAE
funnels need the higher value to avoid spurious divergences. User-supplied
control values are always respected. Seed-level results may differ
slightly from 1.0.x.convergence_check() no longer reports Tail_ESS = Bulk_ESS in the
fallback path when the 'posterior' package is unavailable; it reports
NA rather than falsely implying the tails were checked.model_compare() (and the deprecated hbmc()) accept a list of fitted
models again. When model_compare() became an S3 generic in 1.0.0 it
dispatched only on hbmfit/brmsfit, so the pre-1.0.0 convention
hbmc(model = list(m1, m2)) failed with "no applicable method for
'model_compare' applied to an object of class 'list'". A
model_compare.list() method now unwraps a 1-2 element list to the
single/pairwise path (hbmc_results), routes a 3+ element list to
model_compare_all() (ranked hbm_table), and rejects an empty or
non-model list with an informative error.First stable release. hbsaems fits area-level Hierarchical
Bayesian Small Area Estimation models following the methodological
tradition of Rao and Molina (2015) doi:10.1002/9781118735855, with
computational implementation adapted to the parameterisation and
prior-specification conventions of the 'brms' package
(Buerkner 2017 doi:10.18637/jss.v080.i01) and Stan back-end. The
package is designed to support the principled Bayesian workflow
advocated by Gelman et al.\ (2020)
doi:10.48550/arXiv.2011.01808 -- prior predictive checks, MCMC
convergence diagnostics, posterior predictive checks, leave-one-out
cross-validation, Bayesian model comparison and averaging, prior
sensitivity analysis, and design-consistent benchmarking are all part
of the standard pipeline.
This entry consolidates the changelog for every development cycle since the 0.1.2 maintenance release; the package now follows semantic versioning and the 1.0.0 line constitutes a stable user-facing function set whose signatures will not change before v2.0.0.
Shiny app: memory management for multi-model comparison. The dashboard's "snapshot library" lets users compare or average predictions across several fitted models in one session. Each snapshot is a full \code{hbmfit} object (50-200 MB for a typical Fay-Herriot fit), so a session that snapshots 5+ models could easily blow past 1 GB of RAM. Three composable mitigations ship in v1.0.0: \itemize{ \item \strong{Snapshot shrinker} (\code{.shrink_hbmfit()}): drops the heaviest \pkg{rstan} internals (the compiled Stan module handle and posterior summary cache) before the snapshot is stored. The shrunken fit is still usable for everything the Shiny app does (\code{brms::loo()}, \code{posterior_predict()}, \code{summary()}); only \code{update()} of the Stan program is unavailable. \item \strong{LRU eviction} (\code{.multimodel_library_add()}): the library is capped at 5 snapshots by default (\code{MAX_SNAPSHOTS} constant near the top of the observer). When the user adds a 6th, the oldest is evicted with a clear notification. The user can also press the new "Clear library" button to free everything. \item \strong{Aggressive GC} (\code{.hbsaems_gc()}): a full \code{gc(full = TRUE)} runs after each model fit and after each library clear. Important on long-running sessions where intermediate Stan compilation artifacts can accumulate. } The multi-model panel now also shows a per-snapshot size-in-MB readout below the comparison buttons, so users can see exactly how much memory each snapshot is costing.
Shiny app now exposes the Fay-Herriot sampling_variance
argument. The interactive run_sae_app() previously offered
the survey-design route to fixing $\phi_i$ in the Beta-Logitnormal
workflow (via n + deff) but did NOT expose the equivalent
sampling_variance = "psi_i" slot for the Lognormal-Lognormal
workflow (where it pins $\sigma_i = \sqrt{\psi_i}$) nor for the
Custom (Gaussian / Lognormal / Student) workflow. Both gaps have
been closed: a new "Sampling Variance (psi_i)" dropdown appears
on the Lognormal-Lognormal panel and a "Sampling Variance (D_i)"
dropdown appears under the Custom panel when the user picks a
family that supports it. Selecting a column wires it straight
through to hbm_lnln() / hbm(), which then apply the
link-override fix from this release transparently. This means
the classical Fay-Herriot SAE workflow is now end-to-end usable
from the GUI without dropping back to scripted calls.
Shiny app exposes measurement_error sugar (Ybarra-Lohr 2008).
A new collapsible "Measurement-error covariates" box on the
Modeling tab lets the user mark one auxiliary covariate as
noisily-measured and supply the column holding its standard
error. hbsaems then rewrites \code{y ~ x} into
\code{y ~ mi(x, se_x)} transparently.
Shiny app exposes the generic fixed_params slot. A new
collapsible "Fixed distributional parameters (advanced)" box on
the Modeling tab accepts an R expression evaluating to a named
list (e.g.\ \code{list(sigma = "D")} or
\code{list(shape = 2)}). Power users can now pin any
distributional parameter (sigma, phi, shape, nu, ...) from the
GUI without writing scripts.
Shiny app exposes the sae_* post-processing helpers. A new
"Post-processing" sub-tab under Results lets the user apply
\code{sae_transform()} (log / exp), \code{sae_scale()} (centre +
std, or centre only), and \code{sae_filter()} (RSE-threshold
cutoff) to the most recent prediction and download the
transformed table.
Shiny app exposes multi-model workflows. A new "Multi-model" sub-tab under Results provides an in-app library of named model snapshots. The user clicks "Snapshot current fit" to save a copy of \code{model_fit()} under a custom name, then selects 2+ snapshots to feed into \code{model_compare_all()} (LOO/WAIC) or \code{model_average()} (Bayesian model averaging). This closes the long-standing gap between the scripted multi-model API and the GUI workflow.
Generic fixed_params mechanism in hbm() and hbm_flex().
Lets the user pin any distributional parameter to a value derived
from a column, scalar, vector, or formula evaluated against the
data. Centralises a pattern that previously had to be coded
per-wrapper.
hbm_betalogitnorm() refactor. Four operational modes:
random $\phi$ with hyperprior; fixed $\phi$ from survey
$n$ and $\mathrm{deff}$; user-overridable hyperprior on
$\alpha,\beta$ via stanvars; and generic fixed_params for
power users. Default priors filled in automatically and follow
Liu (2009).
hbm_lnln(sampling_variance = ...) for the Fay--Herriot lognormal
model. Pins $\sigma_i = \sqrt{\psi_i}$ from a known per-area
sampling variance.
Custom brms response distributions. Built-in support for
Loglogistic and Shifted Loglogistic, plus a public extension
framework (register_hbsae_brms_custom(),
read_stan_function(), build_brms_custom_family()). Stan code
lives in inst/stan/ as plain .stan files. Each registered
family ships log-likelihood, posterior-predict and posterior-epred
hooks so loo(), posterior_predict(), and
posterior_epred() work out of the box.
Spatial random effects. CAR (ICAR / proper / BYM2) and SAR
(lag / error) via spatial_var, spatial_model, car_type,
sar_type, and M. Weight matrices can be constructed with the
bundled build_spatial_weight() from a shapefile or coordinates.
Missing-data handling. Three strategies (deleted,
multiple via mice, model via brms::mi()) with auto-selection
when handle_missing = NULL.
Shrinkage priors. Horseshoe (regularised, Piironen & Vehtari
2017) and R2D2 (Zhang et al.\ 2022) selectable via prior_type.
Nonlinear smooth terms with full brms-canonical API. Penalised regression splines via \pkg{mgcv} and Gaussian processes via \pkg{brms}:
spline_k (basis dim) and spline_bs (basis type:
"tp", "cr", "cs", "ps").gp_k (Hilbert-space approximate GP basis dimension --
Riutort-Mayol et al.\ 2023), gp_cov (covariance function:
"exp_quad", "matern15", "matern25", "exponential"),
gp_c (boundary-scale factor).gp_k value.gp_scale deprecated in favour of gp_c; removal scheduled
for v2.0.0.Bilingual Shiny dashboard (run_sae_app()). English /
Indonesian, dedicated spatial setup tab, in-app code preview, and
CSV / RDS data upload. Source under
inst/shiny/sae_app/.
Benchmarking helpers (sae_benchmark(),
sae_predict()). Pfeffermann-style design-consistent benchmarking
and out-of-sample prediction for unsampled areas.
Bayesian model averaging with LOO weights. model_average()
now accepts a method argument: "manual" (default, user weights),
"stacking", or "pseudobma" (both via loo::loo_model_weights,
the canonical Bayesian stacking / pseudo-BMA+ of Yao et al.\ 2018).
Power-scale prior sensitivity diagnostics via
prior_sensitivity(): thin wrapper around
priorsense::powerscale_sensitivity() (Kallioinen et al.\ 2024) for
detecting prior-data conflict and weak likelihood in fitted models.
Hierarchical area_var for multi-stage SAE. hbm_flex(),
hbm_lnln(), hbm_betalogitnorm(), and hbm_binlogitnorm() now
accept area_var as a character vector (highest level first, e.g.
c("province", "regency")) and a companion area_re_structure
argument that selects between "nested" (default; produces
(1 | province / regency)) and "crossed" random intercepts.
Length-1 input behaves exactly as in earlier releases, so existing
code continues to work unchanged.
Custom Stan family name prefix hbsae_. To avoid a symbol
collision with Stan's built-in loglogistic_lpdf (Stan >= 2.29),
the Stan function definitions for the loglogistic and shifted
loglogistic families are now named
hbsae_loglogistic_lpdf / hbsae_shifted_loglogistic_lpdf and live
in inst/stan/hbsae_loglogistic.stan / inst/stan/hbsae_shifted_loglogistic.stan.
The user-facing R helpers (dloglogistic, brms_custom_loglogistic,
etc.) and the registry keys ("loglogistic", "shifted_loglogistic")
are unchanged.
hbm() now accepts sampling_variance = "<col>" as the
Fay-Herriot sugar (previously only available in hbm_lnln()).
Pins \eqn{\sigma_i = \sqrt{D_i}} via offset and is the canonical
way to fit a Gaussian Fay-Herriot model. Without it the residual
\eqn{\sigma} and the area-RE \eqn{\sigma_u} compete to explain the
same variance, producing weak identifiability and divergent
transitions almost regardless of adapt_delta. All vignettes
using data_fhnorm have been updated accordingly.
Family compatibility check: sampling_variance is only valid
for continuous families that expose a residual SD parameter named
sigma (gaussian, lognormal, student, skew_normal, exgaussian,
asym_laplace). Passing it with Beta / Binomial / Poisson /
Gamma / Weibull families now raises an explicit error pointing the
user at the appropriate family-specific mechanism (e.g.
fixed_params$phi for Beta via design effect, trials for
Binomial).
Centralised sugar-to-fixed_params translation. The previously
duplicated translation logic in hbm(), hbm_lnln(), and
hbm_betalogitnorm() is now consolidated in two internal helpers
(\code{.translate_sampling_variance}, \code{.translate_n_deff_to_phi})
with consistent validation, conflict checks against
fixed_params, and error messages. Behaviour is unchanged from the
user's point of view; the refactor eliminates ~60 lines of
duplicated code.
data_fhnorm regenerated with a deterministic, well-identified
simulation (set.seed(20260518L)). Covariates are standardised to
\eqn{\mathcal{N}(0, 1)}, sigma_u = 1.0, and \eqn{D_i \sim
\mathrm{Gamma}(4, 4)} so that vignettes fit cleanly with default
brms / Stan settings. See data-raw/data_fhnorm.R for the
reproducible generator.
measurement_error sugar (Ybarra and Lohr 2008). hbm() and
the hbm_* wrappers accept
\code{measurement_error = list(x1 = "se_x1", ...)} which rewrites
the brmsformula on the fly to wrap the listed auxiliary variables
with \code{mi(var, se_col)}. Validation enforces non-negative,
NA-free standard errors and that the named variables are part of
auxiliary.
Automatic mi() / me() detection. When the user writes
\code{mi(...)} or \code{me(...)} explicitly in the formula,
hbm() no longer demands handle_missing be set and no longer
drops rows with NA: brms's joint-modelling / measurement-error
framework handles the imputation internally. Internally
handle_missing is silently set to "model" in that case.
Conditional link_phi resolution in hbm_betalogitnorm().
Default is now NULL and resolves automatically: "identity"
when phi is pinned via fixed_params$phi (the survey-design
mode), "log" (brms default) when phi is estimated via a
hyperprior. This eliminates a class of divergent transitions
caused by NUTS proposing negative \eqn{\phi} on the identity
scale. Manually setting "identity" in random mode emits a
warning.
Simplified phi prior in hbm_betalogitnorm() random mode.
The pre-existing hierarchical construction
\eqn{\phi \sim \mathrm{Gamma}(\alpha, \beta),
\alpha \sim \mathrm{Gamma}(1,1),
\beta \sim \mathrm{Gamma}(1,1)}
has been replaced by brms's own default
\eqn{\phi \sim \mathrm{Gamma}(0.01, 0.01)} (lower bound 0).
The wrapper no longer declares \code{alpha} or \code{beta} as
Stan parameters and no longer injects \code{stanvars} sampling
statements for them. Rationale: the \eqn{\mathrm{Gamma}(1, 1)}
prior on \eqn{\alpha} (declared as \code{real<lower=1>}) was on
the boundary of its support, producing divergent transitions on
weakly-informative data; the extra layer also inflated the
effective posterior dimension for what is essentially one scalar
parameter per area model. \strong{Migration:} users who relied
on the old construction can build it manually via the
\code{stanvars} + \code{prior} arguments; see
\code{?hbm_betalogitnorm} for the migration note. Legacy code
that still passes \code{stanvars} containing sampling statements
on \code{alpha} / \code{beta} now raises an informative error
pointing the user at the new mechanism.
sae_benchmark() defaults made semantically explicit. New
argument target_type = c("total", "mean") (default "total")
is consulted only when weights = NULL, choosing a safe default
weighting (rep(1, n) for total, rep(1 / n, n) for mean). An
informational message is emitted so the chosen weighting is
always visible. Production users should still pass explicit
weights = N_i (population size per area).
predictors -> auxiliary. The new name
aligns with Small Area Estimation literature (Rao & Molina 2015,
Pfeffermann 2013). Old usage continues to work with a one-time
soft-deprecation warning and is scheduled for removal in
v2.0.0.hbm_generic() -> hbm_flex(). Old name
removed.hbcc(), hbmc(),
hbpc(), hbsae(). All four emit a soft-deprecation warning
pointing at hbm(), convergence_check(), prior_check(),
sae_aggregate(), or sae_predict() respectively.
Scheduled for removal in v2.0.0.hbsaems-modelling (overview), hbsaems-lnln-model,
hbsaems-betalogitnorm-model, hbsaems-binlogitnorm-model,
hbsaems-spatial, hbsaems-handle-missing,
hbsaems-run_sae_app. All follow a CRAN-safe pattern: heavy
Stan fits are not evaluated at build time; representative outputs
are printed as illustrations with an explicit disclaimer.complete-workflow,
advanced-features, migration-guide.hbsaems-betalogitnorm-model vignette now documents two modes
for $\phi$ (Random and Fixed) instead of three, and includes a
legacy-reproduction recipe for users who need the pre-v1.0.0
hierarchical hyperprior. The migration-guide vignette gained
two new sections covering the phi prior simplification and the
critical link function override fix. The hbsaems-lnln-model
vignette adds an implementation note explaining why the
sampling_variance offset is now interpreted on the natural
(untransformed) scale.CRAN vignette now carries real output. The
complete-workflow vignette previously used
knitr::opts_chunk$set(eval = FALSE) globally and showed all R
code as display-only. This kept CRAN's vignette build short but
also meant readers could not see what any of the seven workflow
steps actually produces.
The revised vignette evaluates a curated subset of chunks:
\itemize{
\item library(hbsaems) -- loads the package (instant).
\item data("data_fhnorm"); str(...) -- prints the structure
of the bundled Fay-Herriot dataset, so readers can see
column names, types, and the first few rows (instant).
\item A small demonstration fit (chains = 1, iter = 200, warmup = 100) is run once, cached via knitr, and then
reused by the convergence-diagnostic and small-area-
estimate chunks below. The chunk header clearly labels
these as toy settings unsuitable for inference.
\item convergence_check() + is_converged() +
hbm_warnings() are run on the demo fit so readers see
the actual structure of the diagnostic objects.
\item sae_predict() + head(est$result_table) shows a real
small-area-estimate table.
\item sessionInfo() for reproducibility.
}
The production-grade hbm(..., chains = 4, iter = 4000) calls
for the prior-predictive check, the headline model fit, the
model-comparison, and model-averaging are kept as display-only
\code{```r} blocks (lowercase, not evaluated by knitr) -- those
would push the CRAN vignette build well past its time budget.
The text now explicitly tells readers to use those full settings
in their own analyses, not the toy iter = 200 of the demo.
A Stan-toolchain probe in the setup chunk silently falls back to display-only mode when Boost / a C++ compiler is unavailable (e.g.\ on a minimal CI container), so the vignette renders cleanly everywhere -- it just has fewer outputs on Stan-less systems.
GitHub Actions: automated pkgdown deployment. Three workflows
live in .github/workflows/:
\itemize{
\item R-CMD-check.yaml -- run R CMD check --as-cran on five
OS x R-version combinations on every push and PR. This
existed in earlier development snapshots and is kept here
for completeness.
\item vignettes.yaml -- path-filtered render check of both
the CRAN vignette and the pkgdown-only articles, with the
rendered HTMLs uploaded as a CI artifact for PR review.
Catches Rmd-level breakage in seconds, well before the
full website build.
\item pkgdown.yaml -- on every push to main and on each
release, build the full pkgdown site (covering the one
CRAN vignette plus the ten website articles) and deploy
to the gh-pages branch using the official
\code{pkgdown::build_site_github_pages()} +
\code{JamesIves/github-pages-deploy-action} flow. A
verification step fails the build if any of the core
articles (\code{complete-workflow},
\code{ast-formula-manipulation}, \code{hbsaems-modelling},
\code{advanced-features}) was not rendered, so an empty
or truncated site never reaches production. Pull-request
runs render the site as a downloadable artifact for
review without deploying.
}
A new \code{PKGDOWN_DEPLOY_GUIDE.md} at the repository root
explains first-time setup (GitHub Pages settings, Actions write
permissions, URL field in DESCRIPTION) and the local
pkgdown::build_site() preview workflow. The guide is
Rbuildignored so it does not bloat the CRAN tarball.
Vignette strategy reorganised for the CRAN release. Through
v1.0.0 development we built up an extensive collection of
vignettes covering every distribution-specific wrapper, every
advanced topic, and the internals. Eleven vignettes was a great
documentation set but a heavy load for a CRAN tarball; CRAN's
guidance for new submissions favours a small set of essential
vignettes. We now ship only one vignette in the tarball,
complete-workflow, which is the canonical end-to-end SAE
pipeline using hbm() and the standard Bayesian workflow
diagnostics. The remaining ten files are kept in the source
repository under vignettes/articles/ (a pkgdown convention)
and are rendered as articles on the package website at
https://madsyair.github.io/hbsaems/. R CMD build skips
vignettes/articles/ via .Rbuildignore, so the CRAN tarball
is leaner (-46 KB) without losing any documentation -- users
who install from CRAN still get the website link via the
URL: field in DESCRIPTION.
New article: "AST-based Formula Manipulation". A walkthrough
of how hbsaems rewrites user formulas internally to apply the
nonlinear, measurement_error, area_var, sampling_variance,
and handle_missing sugar -- and why a regex-based approach
would silently corrupt many legitimate formulas. Available at
https://madsyair.github.io/hbsaems/articles/ast-formula-manipulation.html.
sae_predict(model, newdata = X) failed with "variables can
neither be found in 'data' nor in 'data2'" when the model was
fit with sampling_variance = or fixed_params =. Those
sugar arguments inject internal offset columns named
\code{.hbsaems_
\code{sae_predict()} now repopulates the offset columns automatically when the user-supplied \code{newdata} has the same number of rows as the training data (the typical case of "predict at the same areas, possibly with updated covariates"). When \code{nrow(newdata)} differs, sae_predict() raises an informative error telling the user either to align the row count or to compute the offset column themselves (e.g.\ \code{newdata$.hbsaems_sigma_fixed <- sqrt(newdata$D)}).
The same fix benefits \code{model_average()}, which internally calls \code{sae_predict(m, newdata = newdata)} for each candidate model.
Seven regression tests cover the in-place copy path, the pass-through paths (no offset cols, or already populated by user), the nrow-mismatch error, and the NULL-training-data guard.
GitHub Actions R-CMD-check --as-cran failed with "Boost not
found" on Linux runners. CRAN's build farm ships the full Stan
toolchain (BH / RcppEigen / RcppParallel / StanHeaders) by
default, but GitHub's minimal R container does not -- and the
r-lib/actions/setup-r-dependencies@v2 step had no way to know
which transitive C++ header packages our \donttest{} examples
needed. All three workflows (R-CMD-check.yaml,
vignettes.yaml, pkgdown.yaml) now declare these packages
explicitly via extra-packages:, so the rstan compile path
works.
MCMC settings in \donttest{} examples standardised to
brms defaults. Examples now consistently use
chains = 4, iter = 2000, warmup = 1000 (the brms defaults
documented in ?brms::brm). Each example block also carries a
brief comment noting that production-grade inference on tougher
posteriors (funnel geometry, weakly identified priors) may
require iter = 4000, warmup = 2000 and
control = list(adapt_delta = 0.99), with cross-references to
the complete-workflow vignette.
GitHub Actions R-CMD-check --as-cran failed with "Boost not
found" on Linux runners. CRAN's build farm ships the full Stan
toolchain (BH / RcppEigen / RcppParallel / StanHeaders) by
default, but GitHub's minimal R container does not -- and the
r-lib/actions/setup-r-dependencies@v2 step had no way to know
which transitive C++ header packages our \donttest{} examples
needed. All three workflows (R-CMD-check.yaml,
vignettes.yaml, pkgdown.yaml) now declare these packages
explicitly via extra-packages:, so the rstan compile path
works.
MCMC sampling in \donttest{} examples was unnecessarily
heavy. Many examples ran chains = 2, iter = 2000 (some
chains = 4, iter = 4000), which added up to several minutes of
example-run time during R CMD check --run-donttest. CRAN's
guidance is that examples should run in a few seconds total per
function. All such examples have been retuned to
chains = 1, iter = 500, warmup = 250 (a few seconds each) and
prefixed with a comment explaining that production settings
should be used in real analyses. The displayed code in
vignettes still references the full production settings.
vignettes/articles/hbsaems-betalogitnorm-model.Rmd showed a
legacy code block that no longer runs. The article had a
"Reproducing the legacy hierarchical hyperprior" section that
demonstrated the pre-v1.0.0 \eqn{\phi \sim \mathrm{Gamma}(\alpha,
\beta)} construction by calling
\code{hbm_betalogitnorm(..., stanvars = stanvar("alpha ~
gamma(1, 1);", ...) + stanvar("beta ~ gamma(1, 1);", ...))}.
As of v1.0.0, \code{hbm_betalogitnorm()} explicitly rejects
that pattern at construction time with an informative error
(\code{alpha}/\code{beta} are no longer declared as Stan
parameters by the wrapper). Users who copy-pasted the
article's example would have hit the runtime error.
The section has been rewritten: \itemize{ \item the recommended v1.0.0 pattern (a phi-prior via \code{prior = brms::set_prior("gamma(2, 0.05)", class = "phi")}) is shown first; \item the legacy-reproduction example now uses \code{hbm_flex()} instead of \code{hbm_betalogitnorm()}, which does not apply the wrapper-specific guard, so the code actually runs. The generated Stan code includes the expected \code{lprior += gamma_lpdf(phi | alpha, beta);} declaration, verified by inspecting \code{brms::make_stancode()} output. } The migration-guide article has been updated to cross-reference the new corrected example.
Article YAML headers used rmarkdown::html_vignette. Files
under \code{vignettes/articles/} are pkgdown-only and no longer
carry a \\VignetteIndexEntry{} field (that field is what
\code{html_vignette} relies on for the page title). Manually
rendering these files with \code{rmarkdown::render()} would emit
a benign warning about the missing field. We now use
\code{rmarkdown::html_document} for articles, which works
identically when rendered by pkgdown but is silent under direct
\code{rmarkdown::render()} too -- helpful for contributors
previewing changes locally.
?register_hbsae_brms_custom example was non-idempotent.
The example registered \code{loglogistic_user} every time it ran
but never cleaned up. Running the example a second time in the
same R session (which R CMD check's example runner can do, and
which interactive users routinely do) would error with
\emph{"Model 'loglogistic_user' is already registered. Pass
overwrite = TRUE to replace it"}. The example now removes any
pre-existing \code{loglogistic_user} entry before registering
and unregisters at the end, so re-runs are clean.
\code{inst/examples/} subdirectory removed. The \code{inst/examples/hbsaems-examples.Rmd} file (31 KB) shipped with the package because it lived under \code{inst/}. Its content was redundant with the curated articles on the package website and bloated the CRAN tarball. Also removed three internal maintainer documents that had drifted into \code{inst/}: \code{inst/PKGDOWN_DEPLOY_GUIDE.md} (duplicate of the root version), \code{inst/YAML_FILES_GUIDE.md} (internal note), and \code{inst/DEPRECATED.md} (moved to repository root and Rbuildignored). Only the runtime-essential \code{inst/WORDLIST}, \code{inst/shiny/}, and \code{inst/stan/} remain. Net reduction in installed package size: ~50 KB.
?hbm_betalogitnorm example 3 used the removed
alpha/beta hyperprior pattern. R CMD check
\code{--run-donttest} caught a leftover example block that
attempted to declare priors on \code{alpha} and \code{beta}
via \code{stanvars}, both of which were Stan parameters in
the pre-v1.0.0 hierarchical phi construction but were removed
in this release (see the migration note at the top of
\code{?hbm_betalogitnorm}). Example 3 has been rewritten to
demonstrate the supported v1.0.0 pattern -- a non-default phi
prior is passed via brms's standard
\code{prior = brms::set_prior("gamma(2, 0.5)", class = "phi")}
rather than via the legacy stanvars sampling statements. The
resulting Stan code uses \code{phi ~ gamma(2, 0.5)} as
intended.
R CMD check NOTE: tests/spelling.Rout.save comparison.
R CMD check was diffing the live spelling test output against
a saved snapshot, which broke any time a new acceptable term
appeared in NEWS or vignettes (e.g.\ \code{pred} from an
internal predictions data structure). We have removed the
\code{tests/spelling.Rout.save} comparison file and rely solely
on \code{spelling::spell_check_test(skip_on_cran = TRUE)}, which
uses the proper \code{inst/WORDLIST} mechanism. This is the
approach recommended by Jeroen Ooms (author of \pkg{spelling})
for packages that want spelling checks during development but
not on CRAN's machines.
AST-based formula rewriting silently dropped offset() terms.
Both \code{.replace_nl_in_formula()} (used by the
\code{nonlinear=} sugar) and \code{.apply_measurement_error()}
(used by the \code{measurement_error=} sugar) decompose the
formula RHS via \code{stats::terms()} and reassemble it with
\code{stats::reformulate()}. The reassembly step iterated only
over \code{attr(terms, "term.labels")}, which is the list of
predictor terms \emph{excluding} \code{offset()} -- those live in
a separate \code{attr(terms, "offset")} attribute as integer
positions into \code{attr(terms, "variables")}. As a result, a
user formula like
\code{y ~ x1 + offset(log(pop)) + x2} called with
\code{nonlinear = "x1"} would silently lose the offset, producing
\code{y ~ s(x1) + x2}.
Both helpers now extract the offset terms and splice them back
into the assembled RHS verbatim. Eight regression tests cover
offset preservation, substring-name safety, wrapper preservation
(I, poly, me), interaction-term preservation, and the
hierarchical \code{.build_area_re_formula()}.
Conflict protection: custom brms families could silently shadow built-in or brms-native families. Three classes of conflict were previously possible: \enumerate{ \item Overwriting hbsaems built-in custom families. \code{register_hbsae_brms_custom(key = "loglogistic", ...)} silently replaced the package-curated \code{loglogistic} family because the override check only inspected the brms-native list and missed \code{loglogistic}/\code{shifted_loglogistic} (which were registered separately by \code{.onLoad()}). \item Shadowing brms-native families. A user calling \code{register_hbsae_brms_custom(key = "weibull", ...)} inserted a custom record that \code{.resolve_family_object()} would consult \emph{before} falling back to \code{brms::brmsfamily()}, so subsequent \code{hbm(hb_sampling = "weibull")} calls silently dispatched to the user's stub rather than to brms's built-in weibull family. \item Stale custom registrations leaking past \code{.init_model_registry()} resets. }
This release introduces a comprehensive \code{.builtin_keys()} list that covers both brms-native AND hbsaems-bundled custom families, makes the override-protection logic in both \code{register_hbsae_model()} and \code{register_hbsae_brms_custom()} consult it, adds a \code{.is_brms_native_family()} probe so registrations that would shadow a brms-native family emit an informative warning pointing the user at the \code{hbsae_} prefix convention, and rewrites \code{.init_model_registry()} to clear non-builtin keys before re-asserting the built-ins.
An internal option (\code{hbsaems.in_builtin_registration}) lets \code{.register_builtin_custom_families()} bypass the new check during package load, avoiding a chicken-and-egg circular dependency.
Six regression tests cover all paths.
dloglogistic() errored when mu or beta was NA.
\code{stopifnot(all(mu > 0))} treats \code{NA} as failure, so
\code{dloglogistic(1, mu = NA, beta = 1)} stopped with
\emph{"all(mu > 0) is not TRUE"}. Now NAs in parameters
propagate as NA in the result (matching the v1.0.0 NA-in-x fix
and the wider R density-function convention), while finite
non-positive parameters still error explicitly.
Stan function names safely prefixed with hbsae_. Audited
\code{inst/stan/hbsae_loglogistic.stan} and
\code{inst/stan/hbsae_shifted_loglogistic.stan} to confirm every
Stan-side function declaration uses the \code{hbsae_} prefix,
avoiding collision with Stan 2.29+'s built-in
\code{loglogistic_lpdf}. This was already correct in the code;
the audit added documentation comments and a verifying dev-test.
dloglogistic() silently coerced NA inputs to 0. Base R
density functions follow the convention that
\code{dnorm(NA, 0, 1) = NA}, but our \code{dloglogistic(NA, ...)}
returned 0 (the density at \eqn{x \le 0}, which is also 0). The
vectorised case was particularly misleading because a partial NA
vector produced no warning even though one position was being
silently replaced with 0. We now propagate NA from any of
\code{x}, \code{mu}, or \code{beta} -- matching the
\code{dshifted_loglogistic} behaviour and the wider R density-
function convention.
run_sae_app() produced a cryptic error when \pkg{shiny} was
missing. Because \pkg{shiny} lives in \code{Suggests}, users
who installed hbsaems with only the modelling dependencies would
see "could not find function runApp" rather than an actionable
error. We now check \code{requireNamespace("shiny")} at the top
of \code{run_sae_app()} and emit an informative error pointing
to \code{install.packages("shiny")}.
run_sae_app(check_deps = FALSE) carried stale optional_missing
state across sessions. If the user previously launched with
\code{check_deps = TRUE} (which writes
\code{options(hbsaems.shiny_missing_optional = ...)}) and then
re-launched with \code{check_deps = FALSE}, the GUI banner would
still display the cached missing-package list. We now clear the
option to \code{character(0L)} in the \code{FALSE} branch.
DESCRIPTION Suggests: missing energy and minerva. The
Shiny app's exploratory-correlation panel (Distance correlation,
MIC) depends on these two packages, but \code{check_shiny_deps()}
was the only place that referenced them. Without listing them in
Suggests, downstream automated checks (e.g.\ \code{R CMD check
--as-cran}) would have flagged the unconditional
\code{requireNamespace()} call as a NOTE. Both packages are now
in Suggests as documented in \code{check_shiny_deps()}.
Bundled data: HIERARCHICAL CONSISTENCY across datasets. All
four bundled datasets now use the same administrative-level
labels with the same meanings:
\itemize{
\item province -- Level 1 (Indonesian provinsi).
\item regency -- Level 2 (Indonesian kabupaten/kota; analogue
of US county).
\item district -- Level 3 (Indonesian kecamatan; analogue of
US census tract).
}
What differs between datasets is the analysis resolution -- i.e.
at which level the 100 small-area observations live:
\itemize{
\item data_fhnorm, data_betalogitnorm: 100 regencies nested
within 5 provinces (analysis at regency level).
\item data_binlogitnorm, data_lnln: 100 districts nested
within 5 regencies (analysis at district level; "province"
omitted because it would be a single value above 5
regencies and contribute no variation).
}
Label format conventions disambiguate at a glance:
three-digit suffixes (\code{regency_001..100},
\code{district_001..100}) for the 100-level fine analysis areas
and two-digit suffixes (\code{province_01..05},
\code{regency_01..05}) for the 5-level coarse spatial clusters.
Through earlier v1.0.0 iterations the same role was variously
called "regency" (creating dual-semantics confusion), "kabupaten"
(Indonesian-only), and "county" (US-only); none of those reached
CRAN. The matrix has been correspondingly renamed
adjacency_matrix_car_regency (5x5, rownames
\code{regency_01..05}). Total impact on data folder vs original
pre-audit state is +4 bytes (29822 -> 29826) -- effectively
neutral.
Documentation: incorrect spatial pairings in three examples.
Two SAR examples paired spatial_weight_sar (100x100,
regency-named) with spatial_var = "province" (5 levels) --
this would error at fit time because brms cannot resolve a
5-level grouping factor against a 100-row weight matrix. One
BYM2 example paired data_binlogitnorm (which has no
province column) with spatial_var = "province". All three
examples in ?hbm, vignette('hbsaems-spatial'), and inline
documentation have been corrected.
print.hbcc_results() reported PASS on all-NA Rhat. Same
pattern as the \code{is_converged()} fix, but in the print
method: \code{all(rh < 1.1, na.rm = TRUE)} returns TRUE on
\code{numeric(0)} and silently displayed \emph{"R-hat < 1.1 :
PASS"} for degenerate fits with no finite Rhat values. Now
reports \emph{"R-hat < 1.1 : ?? (all NA -- model fit may be
degenerate)"}.
print.hbsae_results() printed NaN% and Inf to -Inf.
When \code{rse_model} is NA / NaN (all areas had zero predictions
-- see the v1.0.0 RSE guard in \code{sae_predict()}) the print
method would output literal \code{"NaN %"} and \code{"Inf to -Inf"}
for the prediction range. Both are now handled with explicit
branches that label them as \code{"NA (no finite predictions)"}.
Bundled-data integrity now checked at every test run. A new \code{test-data-integrity.R} test file verifies that: \itemize{ \item the four datasets retain their documented columns and dimensions, \item all key columns are non-NA, finite, and within their domains (\code{y} in \eqn{(0, 1)} for the Beta data, \code{y <= n} for the binomial data, \code{psi_i > 0} for the lognormal data, etc.), \item the documented pairings between adjacency matrices and datasets are satisfied at the level of their row-name / factor-level coincidence. } These tests would have caught the historical "regency means 100 levels in two datasets and 5 levels in two others" mismatch before it shipped.
hbm_warnings() failed to flag degenerate model fits. The
R-hat convergence check used \code{!all(rh < 1.1, na.rm = TRUE)},
which evaluates to \code{!TRUE} (i.e.\ no warning fires) when all
Rhat values are NA -- the textbook signature of a sampler that
produced no useful draws. Same issue affected the
\code{neff_ratio} check. Both now detect the all-NA / all-non-
finite case explicitly and emit a dedicated
\dQuote{model fit appears degenerate} warning.
register_hbsae_model() silently overwrote built-in families.
Passing \code{overwrite = TRUE} for a built-in key
(\code{"gaussian"}, \code{"beta"}, \code{"lognormal"},
\code{"binomial"}, etc.) silently replaced the package-curated
spec, losing the validation rules and link handling that the
built-in provides. Without \code{overwrite = TRUE} the function
now refuses to touch a built-in; with \code{overwrite = TRUE} it
emits a warning so the choice is conscious.
register_hbsae_model() accepted invalid registry keys. Keys
with whitespace (\code{"my key"}), leading digits
(\code{"3factor"}), or other punctuation that would force
backtick-quoting are not usable as R names and broke downstream
registry access. Keys are now validated with \code{make.names()}.
Empty-string keys are also rejected.
register_hbsae_model() did not validate
\code{response_check_msg} type. Non-character or vector-of-
strings values would either error cryptically downstream or
produce garbled \code{stop()} messages. Now validated as
\code{NULL} or a single character string.
register_hbsae_brms_custom() had thinner validation than its
sibling. The brms-custom-family registration path used a bare
\code{stopifnot()} for the key, didn't validate
\code{response_check}, \code{response_check_msg},
\code{supports_mi}, or \code{discrete}, and let callers silently
shadow built-in families. Validation now matches
\code{register_hbsae_model()}.
is_converged() returned silent TRUE on all-NA Rhat. Both
S3 methods relied on \code{all(rhat < threshold, na.rm = TRUE)},
which returns TRUE for \code{numeric(0)} after the NA's are
stripped -- producing a degenerate "converged" signal whenever
the underlying brmsfit had no finite Rhat values (e.g.\ a
collapsed sampler, all chains diverged, mock object).
\code{is_converged()} now returns \code{NA} with an informative
warning in that case.
is_converged() did not validate the threshold argument.
A string threshold (\code{"1.1"}), \code{NULL}, length>1 vector,
negative value, or non-finite value all silently returned TRUE
due to R's coercion rules in \code{<} and \code{all()}. We now
reject those inputs explicitly via \code{.check_rhat_threshold()}.
sae_benchmark() silently produced NA / Inf when target was
non-finite. Passing \code{target = NA} or \code{target = Inf}
propagated through the ratio / difference computation and
produced an all-NA / all-Inf benchmarked table. Both are
now rejected with informative errors. Likewise, non-finite
values in \code{predictions$pred} now raise an early error
pointing the user back to the model-fit step.
Legacy \code{aux_param_hyperprior} callback removed from the Beta family registry. The pre-v1.0.0 hierarchical hyperprior $\phi \sim \mathrm{Gamma}(\alpha, \beta)$ with $\alpha, \beta \sim \mathrm{Gamma}(1, 1)$ was previously hard-wired in the registry under \code{models-registry.R}. That made \code{hbm_flex(family_key = "beta", aux_args = list(n = "n", deff = "deff"))} silently re-inject the old construction even after \code{hbm_betalogitnorm()} was simplified to use brms's default \eqn{\mathrm{Gamma}(0.01, 0.01)} prior in \href{#}{phi prior simplification}. The callback has been removed from the built-in spec; custom families can still register one via \code{\link{register_hbsae_model}}.
hbm_flex() length-1 area_var validation crashed on length>1 vectors.
When the user supplied an area_var character vector of length 2 or
more (the hierarchical-area mode), the validation
!is.null(area_var) && !(area_var %in% names(data)) raised the R 4.2+
error \emph{"'length = 2' in coercion to 'logical(1)'"} -- the &&
operator does not coerce length>1 logicals to scalar. Replaced with
a setdiff()-based check that produces an informative error listing
every missing column. Same idiom applied defensively to
spatial_var (which is always length-1 by API design).
re = ~ (1 | x/y) nested syntax rejected by re validator.
.build_area_re_formula() synthesises nested random-effect terms
with the lme4 sugar \code{(1 | g1/g2)}, but the regex in
hbm() only accepted \code{(1 | g)} and \code{(1 | g1 + g2)}.
The regex now accepts the / separator as well, so hierarchical-
area models built via area_var = c("province", "regency"), area_re_structure = "nested" reach brms with the correct formula.
sae_aggregate(method = "weighted") with all-zero weights produced NaN.
The internal normalisation weights / sum(weights) silently
produced NaN when all weights were zero (and Inf when sum was
negative). We now validate that the weight sum is strictly
positive and that all individual weights are finite, raising
informative errors otherwise.
sae_transform() recycled scalar return values silently.
A user passing a reducer function (e.g.\ fun = sum, fun = mean)
instead of an element-wise transform (fun = log, fun = exp) saw
silent scalar recycling -- every area's prediction became identical.
We now validate that fun(pred) returns a numeric vector of the
same length as the input and emit an informative error otherwise.
update_hbm() failed silently when newdata lacked offset columns.
Models fitted with \code{sampling_variance}, \code{n} + \code{deff}
(in \code{hbm_betalogitnorm}), or \code{fixed_params} attach hidden
offset columns (\code{.hbsaems_
Multivariate (mi() / joint) models failed in
\code{sae_predict()}, \code{prior_check()}, and
\code{model_compare()}. brms's \code{posterior_predict()} returns
a 3-D array of shape (draws x obs x responses) for multivariate
formulas (e.g.\ \code{bf(y | mi() ~ ...) + bf(x1 | mi() ~ ...)});
\code{colMeans()} and \code{apply()} downstream produced 2-D
outputs that broke the construction of \code{result_table}.
Additionally, \code{brms::pp_check()} requires an explicit
\code{resp =
CRITICAL: .add_fixed_pforms() failed on multivariate formulas.
When the user combined \code{mi()} joint-imputation formulas
(e.g.\ \code{bf(y | mi() ~ mi(x1) + x2) + bf(x1 | mi() ~ x2)})
with \code{sampling_variance} / \code{fixed_params}, the helper
attempted to append \code{sigma ~ 0 + offset(.hbsaems_sigma_fixed)}
via \code{bform + brms::lf(rhs)}. brms refused this because the
dpar formula did not specify which response (\code{y} or
\code{x1}) the \code{sigma} applies to, producing
\emph{"Don't know how to add a list object without the response
variable name"}. The helper now detects \code{mvbrmsformula}
objects, extracts the primary response (first sub-formula by
convention), and passes \code{resp =
CRITICAL: sampling_variance /
\code{fixed_params$sigma} silently corrupted by brms's default
log link on sigma. When the user supplied
\code{sampling_variance = "D"}, the wrapper stored
\eqn{\sqrt{D_i}} in \code{.hbsaems_sigma_fixed} and attached it
to the brms formula as \code{sigma ~ 0 + offset(.hbsaems_sigma_fixed)}.
However, brms applies the dpar's link function before plugging
the linear predictor into the likelihood; for Gaussian /
Lognormal / Student families the default \code{link_sigma = "log"}
caused the Stan model to compute \code{sigma = exp(sqrt(D_i))}
instead of \eqn{\sqrt{D_i}}. E.g.\ \eqn{D_i = 4} should give
\eqn{\sigma_i = 2.0} but actually produced \eqn{\sigma_i \approx
e^{2.0} \approx 7.39} -- a catastrophic miscalibration of the
Fay-Herriot model. Same bug affected any user pinning sigma via
the generic \code{fixed_params = list(sigma = ...)}, and would
also have affected phi for Beta if \code{hbm_betalogitnorm()}
had not already manually set \code{link_phi = "identity"}. The
fix forces \code{link_
posterior_interval() and prior_draws() now re-export the
upstream generics from \pkg{rstantools} and \pkg{brms}
respectively, rather than defining new generics with conflicting
signatures. This fixes a name-collision crash that occurred when
\pkg{brms} was attached together with \pkg{hbsaems} and the user
called \code{posterior_interval(fit)} on an \code{hbmfit} object:
the error message
\emph{"For the default method 'object' should be a matrix"} no
longer occurs. The fix follows the standard R package design
pattern used by \pkg{brms} itself.
sae_benchmark() scale corruption fix. The previous default
weights = rep(1 / n, n) silently assumed target was a population
mean. When users instead passed a population total, the implied
ratio adjustment came out roughly \eqn{n} times too large --
benchmarked estimates were scale-corrupt by a factor of \eqn{n}.
Fixed by introducing the explicit target_type argument (see New
features) and emitting a message when the default kicks in.
sae_scale() zero-variance NaN propagation. When all area
predictions were identical, base::scale() produced NaN
throughout, silently corrupting result_table. Now detected
with a warning; the method returns centred-only (or raw)
predictions instead.
hbm_betalogitnorm() divergent-transitions on random phi.
The prior default link_phi = "identity" together with the
hyperprior \eqn{\phi \sim \mathrm{Gamma}(\alpha, \beta)} let NUTS
propose negative \eqn{\phi}, triggering \eqn{\log(0)} evaluations.
Default is now resolved conditionally (see New features).
dev-tests library(hbsaems) auto-attach. The helper file
tests/testthat/dev-tests/helper-dev-setup.R now attaches the
package automatically, fixing 150+ spurious could not find function errors when dev-tests were run via
testthat::test_dir().
hbm_betalogitnorm() early validation of spatial_var.
When the user supplied spatial_var referencing a column that
did not exist in data, the error previously surfaced only
inside brms with the unfriendly message "variables can neither
be found in 'data' nor in 'data2'". The wrapper now validates
the column up front and emits the friendlier "Spatial variable
'data."area_var.
hbm() data2 collision with user-supplied .... Calling
hbm() with both spatial_model = "car" (or "sar") and a
user-supplied data2 = list(...) via ... crashed with the
R-level error "formal argument 'data2' matched by multiple
actual arguments". hbm() builds an internal data2 = list(M = M)
for the spatial weight matrix and also splices ... into the
brms::brm() call, producing two data2 keys at the
\code{do.call()} site. The fix extracts any user-supplied
\code{data2} from \code{...} before constructing the brms argument
list and merges it into the internal \code{data2} via
\code{modifyList()} -- the spatial-matrix slot (\code{M}) supplied
by hbsaems wins on collision, but additional user keys
(e.g.\ auxiliary matrices for nonlinear models) are preserved.
As a related hardening, attempts to override other internally-
managed brms arguments (\code{save_pars}, \code{stanvars},
\code{prior}, \code{control}, etc.) via \code{...} now raise an
informative error pointing the user to the dedicated hbsaems
argument.
Test suite reorganisation. Per the recommendation of an external code review, validation-only tests (those exercising only input validation, deprecation warnings, and error paths) have been split out from the heavy integration tests and migrated to \code{tests/testthat/}. Each \code{test_that()} block was classified by static analysis: blocks whose only \code{hbm}/ \code{hbm_*} call appears inside \code{expect_error}/ \code{expect_warning} are CRAN-safe and now run on every check; the remaining ~60 integration tests that actually compile a Stan model stay in \code{tests/testthat/dev-tests/} and are gated by \code{NOT_CRAN=true} (or \code{R_RUN_DEV_TESTS=true}). Net effect: CRAN test count rose from 391 to ~466 with no measurable runtime increase. Internal helpers \code{.brm_stub()} and \code{.hbm_stub()} in \code{tests/testthat/helper-mocks.R} provide brms-compatible shells for the few tests that need to exercise the deprecation pipeline without compiling Stan.
Project polish. Added CONTRIBUTING.md documenting the
development workflow (test tiers, mock-stub usage, AST-based
formula manipulation patterns, spelling regeneration); added a
light .lintr configuration for contributors; updated
cran-comments.md to describe the first-submission test
environment and intentional notes.
All exported functions documented with full roxygen2 blocks
including @param, @return, @examples, and references where
appropriate.
Custom-distribution Stan code stored as separate .stan files in
inst/stan/ (one source of truth, syntax highlighting, no
string-escaping noise).
Test suite reorganised into CRAN-safe unit tests
(tests/testthat/) and heavy integration tests
(tests/testthat/dev-tests/, gated by skip_on_cran() and
excluded via .Rbuildignore).
hbm(), hbcc(),
hbmc(), hbpc(), hbsae(), and three example datasets
(data_fhnorm, data_lnln, data_betalogitnorm,
data_binlogitnorm).