| Title: | Hierarchical Probit Estimation for Dichotomized Data |
|---|---|
| Description: | Provides likelihood-based and hierarchical estimation methods for thresholded (binomial-probit) data. Supports fixed-mean and random-mean models with maximum likelihood estimation (MLE), generalized linear mixed model (GLMM), and Bayesian Markov chain Monte Carlo (MCMC) implementations. For methodological background, see Albert and Chib (1993) <doi:10.1080/01621459.1993.10476321> and McCulloch (1994) <doi:10.2307/2297959>. |
| Authors: | Zhaoze Liu [aut], Longwen Shang [aut], Mary Lesperance [aut], Shuqing Zhou [aut], Xuekui Zhang [aut, cre, fnd] |
| Maintainer: | Xuekui Zhang <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-05-15 09:14:27 UTC |
| Source: | https://github.com/cran/bin2norm |
This function handles two data-collection settings for estimating normal parameters from threshold-based (dichotomized) data:
Single threshold per study:
Each study $i$ reports one threshold $c_i$, a sample size $n_i$,
and the observed proportion $p_i^{\text{obs}}$ of samples above that threshold.
We assume one normal distribution $N(\mu, \sigma^2)$ across all studies.
Methods include "MLE" and "probit".
Multiple thresholds per study:
Each study $i$ reports several thresholds $c_{ij}$, each with an
observed proportion $p_{ij}^{\text{obs}}$. We assume a study-specific mean
$\mu_i \sim N(\mu_0, \tau^2)$ and within-study variance $\sigma^2$.
Because each study has multiple cutpoints, one can estimate $(\mu_0, \sigma, \tau)$.
Methods include "MLE_integration", "GLMM", or "Bayesian" (MCMC).
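To make the two input layouts concrete, the following sketch simulates dichotomized data in base R. The true parameter values, the thresholds, and the helper name `simulate_study()` are illustrative assumptions and are not part of the package.

```r
set.seed(1)

# Hypothetical helper: draw n values from N(mu, sigma^2) and report the
# proportion of draws above each threshold in `cuts`.
simulate_study <- function(n, mu, sigma, cuts) {
  x <- rnorm(n, mean = mu, sd = sigma)
  vapply(cuts, function(cc) mean(x > cc), numeric(1))
}

# Single threshold per study: one common N(1, 0.5^2) distribution
n_i <- c(100, 120, 80)
c_i <- c(1.2, 1.0, 1.5)
p_i_obs <- mapply(simulate_study, n = n_i, cuts = c_i,
                  MoreArgs = list(mu = 1, sigma = 0.5))

# Multiple thresholds per study: study means vary around mu0 = 1 (tau = 0.3)
c_ij <- list(c(1.0, 1.2), c(0.8, 1.5, 2.0))
mu_i <- rnorm(length(c_ij), mean = 1, sd = 0.3)
data_list <- list(
  n_i = c(100, 120),
  c_ij = c_ij,
  p_ij_obs = Map(function(n, m, cuts) simulate_study(n, m, 0.5, cuts),
                 c(100, 120), mu_i, c_ij)
)
str(data_list)
```

Within one simulated study the proportion above a threshold can only fall as the threshold rises, because all proportions are computed from the same draws.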
bin2norm( scenario = c("single_threshold", "multiple_thresholds"), method = NULL, n_i = NULL, c_i = NULL, p_i_obs = NULL, data_list = NULL, ... )
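| scenario | character string, either "single_threshold" or "multiple_thresholds". |
|---|---|
| method | character string indicating which estimation method to use. |
| n_i, c_i, p_i_obs | used only if scenario = "single_threshold". |
| data_list | used only if scenario = "multiple_thresholds". |
| ... | additional arguments passed to lower-level functions (e.g. gh_points, use_lme4, iter, chains). |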
A list of estimated parameters, depending on the data-collection setting
(scenario) and the chosen method. Typically includes:
mu or mu0
sigma
tau (only for multiple-threshold methods)
# Single-threshold example
n_i <- c(100, 120, 80)
c_i <- c(1.2, 1.0, 1.5)
p_i_obs <- c(0.30, 0.25, 0.40)
bin2norm(scenario = "single_threshold", method = "MLE",
         n_i = n_i, c_i = c_i, p_i_obs = p_i_obs)

# Multiple-thresholds example
data_list <- list(
  n_i = c(100, 120),
  c_ij = list(c(1.0, 1.2), c(0.8, 1.5, 2.0)),
  p_ij_obs = list(c(0.20, 0.30), c(0.15, 0.40, 0.55))
)

# MLE with numeric integration
bin2norm(scenario = "multiple_thresholds", method = "MLE_integration",
         data_list = data_list, gh_points = 5)

# GLMM approximation
# library(lme4)
bin2norm(scenario = "multiple_thresholds", method = "GLMM",
         data_list = data_list, use_lme4 = TRUE)

# Bayesian MCMC approach
# library(rstan)
bin2norm(scenario = "multiple_thresholds", method = "Bayesian",
         data_list = data_list, iter = 1000, chains = 2)
Get initial values from data
estimate_initial_values_from_data(data_list)
| data_list | your inputs |
|---|---|
a named list of initial values
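Creates a single data frame stacking all thresholds from all studies, then
calls lme4::glmer(..., family=binomial(link='probit')) to fit a probit model
with a random intercept for each study.
Interpreting results: the fitted fixed effects and the random-intercept variance
are mapped back to mu0, sigma, and
tau (if not forced to 0).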
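One way to read the mapping from GLMM coefficients to the normal parameters is sketched below. It assumes that $p_{ij}$ is the probability of exceeding threshold $c_{ij}$, and that the model has a fixed intercept $\beta_0$, a fixed slope $\beta_1$ on the threshold, and a random intercept $u_i$ with standard deviation $\sigma_u$. These symbols are introduced here for illustration; the package's own parameterization may differ, for example in sign.

```latex
\begin{aligned}
p_{ij} &= P(X_{ij} > c_{ij} \mid \mu_i)
        = \Phi\!\left(\frac{\mu_i - c_{ij}}{\sigma}\right),
        \qquad \mu_i = \mu_0 + b_i,\quad b_i \sim N(0, \tau^2) \\
\Phi^{-1}(p_{ij}) &= \underbrace{\frac{\mu_0}{\sigma}}_{\beta_0}
        + \underbrace{\left(-\frac{1}{\sigma}\right)}_{\beta_1} c_{ij}
        + \underbrace{\frac{b_i}{\sigma}}_{u_i},
        \qquad u_i \sim N(0, \sigma_u^2),\quad \sigma_u = \frac{\tau}{\sigma} \\
\sigma &= -\frac{1}{\beta_1}, \qquad
\mu_0 = -\frac{\beta_0}{\beta_1}, \qquad
\tau = \frac{\sigma_u}{|\beta_1|}
\end{aligned}
```

Under this reading, a random-intercept variance estimated at zero corresponds to $\tau = 0$, that is, a single common mean across studies.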
estimate_multiThresh_GLMM(data_list, use_lme4 = TRUE)
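| data_list | same structure as for bin2norm: a list with n_i, c_ij, and p_ij_obs. |
|---|---|
| use_lme4 | logical; if TRUE, the model is fitted with lme4::glmer. |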
A list with mu0, sigma, tau, method="GLMM_probit".
Builds an inline Stan model for multiple thresholds per study. The user must have
the rstan package installed. We place random effects $\mu_i \sim N(\mu_0, \tau^2)$ on the study means
and use a binomial likelihood for each threshold. By default, the model uses simple weakly
informative priors.
estimate_multiThresh_MCMC(data_list, iter = 2000, chains = 2)
| data_list | same structure as above: a list with n_i, c_ij, and p_ij_obs. |
|---|---|
| iter | total number of iterations for each chain (default 2000) |
| chains | number of MCMC chains (default 2) |
a list containing stan_fit (the full Stan fit object), plus
mu0_est, sigma_est, tau_est as posterior means, and
method="Bayesian_MCMC".
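Each study $i$ has thresholds $c_{ij}$, each with an observed proportion
$p_{ij}^{\text{obs}}$. We assume $\mu_i \sim N(\mu_0, \tau^2)$ and
$X \mid \mu_i \sim N(\mu_i, \sigma^2)$ within study $i$. The log-likelihood integrates out $\mu_i$
via Gauss-Hermite quadrature.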
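The integral and its quadrature approximation can be written out as follows. This is a sketch under three assumptions: $p_{ij}$ is the probability of exceeding $c_{ij}$, the counts are $y_{ij} = n_i\,p_{ij}^{\text{obs}}$ with an independent binomial term per threshold, and $(x_k, w_k)$ are Gauss-Hermite nodes and weights for the weight function $e^{-x^2}$. The package's exact likelihood may differ in detail.

```latex
\begin{aligned}
p_{ij}(\mu_i) &= 1 - \Phi\!\left(\frac{c_{ij} - \mu_i}{\sigma}\right) \\
L_i(\mu_0, \sigma, \tau) &= \int_{-\infty}^{\infty}
   \prod_{j} \binom{n_i}{y_{ij}}\, p_{ij}(\mu_i)^{y_{ij}}
   \bigl(1 - p_{ij}(\mu_i)\bigr)^{n_i - y_{ij}}\,
   \frac{1}{\tau\sqrt{2\pi}}
   \exp\!\left(-\frac{(\mu_i - \mu_0)^2}{2\tau^2}\right) d\mu_i \\
&\approx \frac{1}{\sqrt{\pi}} \sum_{k=1}^{m} w_k
   \prod_{j} \binom{n_i}{y_{ij}}\,
   p_{ij}\bigl(\mu_0 + \sqrt{2}\,\tau x_k\bigr)^{y_{ij}}
   \Bigl(1 - p_{ij}\bigl(\mu_0 + \sqrt{2}\,\tau x_k\bigr)\Bigr)^{n_i - y_{ij}} \\
\ell(\mu_0, \sigma, \tau) &= \sum_i \log L_i(\mu_0, \sigma, \tau)
\end{aligned}
```

The substitution $\mu_i = \mu_0 + \sqrt{2}\,\tau x$ turns the normal density into $e^{-x^2}/\sqrt{\pi}$, which is why the sum carries the factor $1/\sqrt{\pi}$.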
estimate_multiThresh_MLE(data_list, gh_points = 20)
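| data_list | A list with n_i (sample sizes), c_ij (a list of threshold vectors, one per study), and p_ij_obs (a list of observed-proportion vectors, one per study). |
|---|---|
| gh_points | integer; number of Gauss-Hermite points (default 20). |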
A list with mu0, sigma, tau, method="MLE_integration".
Treats the count of "above threshold" in study $i$ as binomial with probability
$1 - \Phi\left(\frac{c_i - \mu}{\sigma}\right)$. This uses numerical optimization (optim)
to maximize the binomial likelihood. Optionally uses weighted OLS (WOLS) estimates as starting
values to improve convergence.
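A minimal base-R sketch of this kind of fit is shown below. It is not the package's implementation: the function name `negloglik_single()`, the log-sigma parameterization, the starting values, and the noise-free example data are all assumptions made for illustration.

```r
# Negative binomial log-likelihood for single-threshold data (sketch).
# par = c(mu, log(sigma)); the log keeps sigma positive during optimization.
negloglik_single <- function(par, n_i, c_i, p_i_obs) {
  mu <- par[1]
  sigma <- exp(par[2])
  # Model probability of exceeding each threshold
  p <- pnorm((c_i - mu) / sigma, lower.tail = FALSE)
  p <- pmin(pmax(p, 1e-12), 1 - 1e-12)  # guard against log(0)
  y <- n_i * p_i_obs                     # counts above threshold (may be non-integer)
  # The binomial coefficient is dropped; it does not depend on mu or sigma
  -sum(y * log(p) + (n_i - y) * log(1 - p))
}

# Hypothetical noise-free data generated from N(mu = 1, sigma = 0.5)
n_i <- c(100, 120, 80, 150)
c_i <- c(0.6, 0.9, 1.2, 1.5)
p_i_obs <- pnorm((c_i - 1) / 0.5, lower.tail = FALSE)

# Simple starting values: mean threshold for mu, sigma = 1
fit <- optim(c(mean(c_i), 0), negloglik_single,
             n_i = n_i, c_i = c_i, p_i_obs = p_i_obs)
est <- c(mu = fit$par[1], sigma = exp(fit$par[2]))
print(est)
```

Because the example proportions are exact, the optimizer should land close to the generating values; with real data the estimates carry sampling error.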
estimate_singleThresh_MLE(n_i, c_i, p_i_obs, use_wols_init = TRUE)
| n_i | numeric vector of sample sizes |
|---|---|
| c_i | numeric vector of thresholds |
| p_i_obs | numeric vector of observed proportions above threshold |
| use_wols_init | logical; if TRUE, weighted OLS estimates are used as starting values. |
A list with mu, sigma, method="MLE".
For each group $i$, we assume the data follow
$$P(X_i > c_i) = 1 - \Phi\left(\frac{c_i - \mu}{\sigma}\right),$$
where $c_i$ is a known threshold and $\Phi$ is the standard normal CDF (whose inverse is the probit link).
The function reconstructs individual binary outcomes from the observed proportions
and estimates the parameters by fitting a generalized linear model with a probit link.
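The idea can be sketched with `glm()` in base R. For brevity this sketch fits aggregated counts rather than reconstructed individual outcomes, which gives the same binomial likelihood; the example data and the back-transformation shown in the comments are assumptions, not the package's code.

```r
# Hypothetical noise-free data from N(mu = 1, sigma = 0.5)
n_i <- c(200, 200, 200, 200)
c_i <- c(0.6, 0.9, 1.2, 1.5)
p_i_obs <- pnorm((c_i - 1) / 0.5, lower.tail = FALSE)

# Counts above / not above each threshold (rounded to whole observations)
above <- round(n_i * p_i_obs)
below <- n_i - above

# Probit regression of "above threshold" on the threshold value
fit <- glm(cbind(above, below) ~ c_i, family = binomial(link = "probit"))
b <- unname(coef(fit))

# probit(P(X > c)) = mu/sigma - c/sigma, so slope = -1/sigma, intercept = mu/sigma
sigma_hat <- -1 / b[2]
mu_hat <- -b[1] / b[2]
print(c(mu = mu_hat, sigma = sigma_hat))
```

The slope must be negative for the back-transformation to give a positive sigma, which holds whenever the proportion above the threshold falls as the threshold rises.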
estimate_singleThresh_probit(n_i, c_i, p_i_obs)
| n_i | numeric vector of sample sizes |
|---|---|
| c_i | numeric vector of thresholds |
| p_i_obs | numeric vector of observed proportions above threshold |
A list with mu, sigma, method="probit".
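Estimates mu and sigma from the linear relationship between the thresholds and the
probit-transformed observed proportions, fitted in a weighted least-squares sense with study-level weights.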
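A weighted least-squares fit of this kind might look like the sketch below. The regression direction, the choice of weights equal to `n_i`, and the clipping of extreme proportions are assumptions made for illustration; the package may use a different formula or weighting.

```r
# Hypothetical noise-free data from N(mu = 1, sigma = 0.5)
n_i <- c(100, 120, 80, 150)
c_i <- c(0.6, 0.9, 1.2, 1.5)
p_i_obs <- pnorm((c_i - 1) / 0.5, lower.tail = FALSE)

# Keep proportions away from 0 and 1 so qnorm() stays finite
p_adj <- pmin(pmax(p_i_obs, 0.5 / n_i), 1 - 0.5 / n_i)

# z_i = qnorm(1 - p_i) = (c_i - mu) / sigma is linear in c_i
z_i <- qnorm(1 - p_adj)
fit <- lm(z_i ~ c_i, weights = n_i)  # weights = n_i is an assumption
b <- unname(coef(fit))

# slope = 1/sigma, intercept = -mu/sigma
sigma_hat <- 1 / b[2]
mu_hat <- -b[1] * sigma_hat
print(c(mu = mu_hat, sigma = sigma_hat))
```

Estimates like these are cheap to compute, which is what makes them convenient as starting values for an iterative likelihood fit.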
estimate_singleThresh_WOLS(n_i, c_i, p_i_obs)
| n_i | numeric vector of sample sizes |
|---|---|
| c_i | numeric vector of thresholds |
| p_i_obs | numeric vector of observed proportions above threshold |
A list with mu, sigma.
Returns (nodes, weights) for approximating $\int_{-\infty}^{\infty} e^{-x^2} f(x)\,dx$,
ignoring any normalizing constant. This is a simple demonstration; for serious
applications, a more robust quadrature library may be preferable.
gaussHermite(n)
| n | integer number of quadrature points |
|---|---|
list with nodes and weights
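For reference, Gauss-Hermite nodes and weights for the weight function $e^{-x^2}$ can be computed in base R with the Golub-Welsch eigenvalue method, as sketched below. The function name `gauss_hermite_sketch()` is hypothetical, and this is not necessarily how `gaussHermite()` is implemented.

```r
# Golub-Welsch sketch: nodes are the eigenvalues of the Jacobi matrix of the
# Hermite recurrence; weights come from the first eigenvector components.
gauss_hermite_sketch <- function(n) {
  if (n == 1) return(list(nodes = 0, weights = sqrt(pi)))
  off <- sqrt(seq_len(n - 1) / 2)          # off-diagonal entries sqrt(k / 2)
  J <- matrix(0, n, n)
  J[cbind(1:(n - 1), 2:n)] <- off
  J[cbind(2:n, 1:(n - 1))] <- off
  e <- eigen(J, symmetric = TRUE)
  ord <- order(e$values)                   # sort nodes in increasing order
  list(nodes = e$values[ord],
       weights = sqrt(pi) * e$vectors[1, ord]^2)
}

gh <- gauss_hermite_sketch(5)

# Expectation of g(Z) for Z ~ N(0, 1): substitute z = sqrt(2) * x and
# divide by sqrt(pi) to restore the normalizing constant.
second_moment <- sum(gh$weights * (sqrt(2) * gh$nodes)^2) / sqrt(pi)
print(second_moment)
```

The weights sum to $\sqrt{\pi}$, the integral of $e^{-x^2}$ over the real line, which is the normalizing constant a caller has to divide out when computing normal expectations.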