CausalELM
CausalELM.CausalELM
— Module
Macros, functions, and structs for applying ensembles of extreme learning machines to causal inference tasks where the counterfactual is unavailable or biased and must be predicted. Supports causal inference via interrupted time series designs, parametric G-computation, double machine learning, and S-learning, T-learning, X-learning, R-learning, and doubly robust estimation.
For more details on Extreme Learning Machines see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Types
CausalELM.InterruptedTimeSeries
— TypeInterruptedTimeSeries(X₀, Y₀, X₁, Y₁; kwargs...)
Initialize an interrupted time series estimator.
Arguments
X₀::Any
: AbstractArray or Tables.jl API compliant data structure of covariates from the pre-treatment period.
Y₀::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes from the pre-treatment period.
X₁::Any
: AbstractArray or Tables.jl API compliant data structure of covariates from the post-treatment period.
Y₁::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes from the post-treatment period.
Keywords
activation::Function=swish
: activation function to use.
sample_size::Integer=size(X₀, 1)
: number of bootstrapped samples for the extreme learner.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X₀, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For a simple linear regression-based tutorial on interrupted time series analysis see: Bernal, James Lopez, Steven Cummins, and Antonio Gasparrini. "Interrupted time series regression for the evaluation of public health interventions: a tutorial." International journal of epidemiology 46, no. 1 (2017): 348-355.
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> m2 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁; regularized=false)
julia> x₀_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100))
julia> y₀_df = DataFrame(y=rand(100))
julia> x₁_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100))
julia> y₁_df = DataFrame(y=rand(100))
julia> m3 = InterruptedTimeSeries(x₀_df, y₀_df, x₁_df, y₁_df)
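A minimal sketch of a lighter configuration using the keywords documented above, as suggested in the Notes; the specific values here are illustrative only:
julia> m4 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁; sample_size=50, num_machines=20, num_feats=3, num_neurons=10)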
CausalELM.GComputation
— TypeGComputation(X, T, Y; kwargs...)
Initialize a G-Computation estimator.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
quantity_of_interest::String
: ATE for average treatment effect or ATT for average treatment effect on the treated.
activation::Function=swish
: activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For a good overview of G-Computation see: Chatton, Arthur, Florent Le Borgne, Clémence Leyrat, Florence Gillaizeau, Chloé Rousseau, Laetitia Barbin, David Laplaud, Maxime Léger, Bruno Giraudeau, and Yohann Foucher. "G-computation, propensity score-based methods, and targeted maximum likelihood estimator for causal inference with different covariates sets: a comparative simulation study." Scientific reports 10, no. 1 (2020): 9219.
Examples
julia> X, T, Y = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = GComputation(X, T, Y)
julia> m2 = GComputation(X, T, Y; task="regression")
julia> m3 = GComputation(X, T, Y; task="regression", quantity_of_interest="ATE")
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m5 = GComputation(x_df, t_df, y_df)
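A short sketch using the documented quantity_of_interest keyword to request the average treatment effect on the treated instead of the ATE (the variable name m6 is arbitrary):
julia> m6 = GComputation(X, T, Y; quantity_of_interest="ATT")
julia> estimate_causal_effect!(m6)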
CausalELM.DoubleMachineLearning
— TypeDoubleMachineLearning(X, T, Y; kwargs...)
Initialize a double machine learning estimator with cross fitting.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates of interest.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
folds::Integer
: number of folds to use for cross fitting.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For more information see: Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. "Double/debiased machine learning for treatment and structural parameters." (2016): C1-C68.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = DoubleMachineLearning(X, T, Y)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m2 = DoubleMachineLearning(x_df, t_df, y_df)
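A minimal sketch using the documented folds keyword to control cross fitting; the value is illustrative:
julia> m3 = DoubleMachineLearning(X, T, Y; folds=3)
julia> estimate_causal_effect!(m3)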
CausalELM.SLearner
— TypeSLearner(X, T, Y; kwargs...)
Initialize a S-Learner.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: the activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For an overview of S-Learners and other metalearners see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = SLearner(X, T, Y)
julia> m2 = SLearner(X, T, Y; task="regression")
julia> m3 = SLearner(X, T, Y; task="regression", regularized=true)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m4 = SLearner(x_df, t_df, y_df)
CausalELM.TLearner
— TypeTLearner(X, T, Y; kwargs...)
Initialize a T-Learner.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: the activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For an overview of T-Learners and other metalearners see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = TLearner(X, T, Y)
julia> m2 = TLearner(X, T, Y; regularized=false)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m3 = TLearner(x_df, t_df, y_df)
CausalELM.XLearner
— TypeXLearner(X, T, Y; kwargs...)
Initialize an X-Learner.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: the activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For an overview of X-Learners and other metalearners see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = XLearner(X, T, Y)
julia> m2 = XLearner(X, T, Y; regularized=false)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m3 = XLearner(x_df, t_df, y_df)
CausalELM.RLearner
— TypeRLearner(X, T, Y; kwargs...)
Initialize an R-Learner.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates of interest.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: the activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For an explanation of R-Learner estimation see: Nie, Xinkun, and Stefan Wager. "Quasi-oracle estimation of heterogeneous treatment effects." Biometrika 108, no. 2 (2021): 299-319.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = RLearner(X, T, Y)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m2 = RLearner(x_df, t_df, y_df)
CausalELM.DoublyRobustLearner
— TypeDoublyRobustLearner(X, T, Y; kwargs...)
Initialize a doubly robust CATE estimator.
Arguments
X::Any
: AbstractArray or Tables.jl API compliant data structure of covariates of interest.
T::Any
: AbstractArray or Tables.jl API compliant data structure of treatment statuses.
Y::Any
: AbstractArray or Tables.jl API compliant data structure of outcomes.
Keywords
activation::Function=swish
: the activation function to use.
sample_size::Integer=size(X, 1)
: number of bootstrapped samples for the extreme learners.
num_machines::Integer=50
: number of extreme learning machines for the ensemble.
num_feats::Integer=Int(round(0.75 * size(X, 2)))
: number of features to bootstrap for each learner in the ensemble.
num_neurons::Integer
: number of neurons to use in the extreme learning machines.
Notes
To reduce the computational complexity you can reduce sample_size, num_machines, or num_neurons.
References
For an explanation of doubly robust cate estimation see: Kennedy, Edward H. "Towards optimal doubly robust estimation of heterogeneous causal effects." Electronic Journal of Statistics 17, no. 2 (2023): 3008-3049.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = DoublyRobustLearner(X, T, Y)
julia> x_df = DataFrame(x1=rand(100), x2=rand(100), x3=rand(100), x4=rand(100))
julia> t_df, y_df = DataFrame(t=rand(0:1, 100)), DataFrame(y=rand(100))
julia> m2 = DoublyRobustLearner(x_df, t_df, y_df)
julia> w = rand(100, 6)
julia> m3 = DoublyRobustLearner(X, T, Y, W=w)
CausalELM.CausalEstimator
— TypeAbstract type for GComputation and DoubleMachineLearning
CausalELM.Metalearner
— TypeAbstract type for metalearners
CausalELM.ExtremeLearner
— TypeExtremeLearner(X, Y, hidden_neurons, activation)
Construct an ExtremeLearner for fitting and prediction.
Notes
While it is possible to use an ExtremeLearner for regression, it is recommended to use RegularizedExtremeLearner, which imposes an L2 penalty, to reduce multicollinearity.
References
For more details see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = ExtremeLearner(x, y, 10, σ)
CausalELM.ELMEnsemble
— TypeELMEnsemble(X, Y, sample_size, num_machines, num_feats, num_neurons, activation)
Initialize a bagging ensemble of extreme learning machines.
Arguments
X::Array{Float64}
: array of features for predicting labels.
Y::Array{Float64}
: array of labels to predict.
sample_size::Integer
: how many data points to use for each extreme learning machine.
num_machines::Integer
: how many extreme learning machines to use.
num_feats::Integer
: how many features to consider for each extreme learning machine.
num_neurons::Integer
: how many neurons to use for each extreme learning machine.
activation::Function
: activation function to use for the extreme learning machines.
Notes
ELMEnsemble uses the same bagging approach as random forests when the labels are continuous but uses the average predicted probability, rather than voting, for classification.
Examples
julia> X, Y = rand(100, 5), rand(100)
julia> m1 = ELMEnsemble(X, Y, 10, 50, 5, 5, CausalELM.relu)
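The ensemble can then be trained and used for prediction with fit! and predict, documented under Extreme Learning Machines below; a minimal sketch:
julia> fit!(m1)
julia> predictions = predict(m1, X)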
CausalELM.Nonbinary
— TypeAbstract type used to dispatch risk_ratio on nonbinary treatments
CausalELM.Binary
— TypeType used to dispatch risk_ratio on binary treatments
CausalELM.Count
— TypeType used to dispatch risk_ratio on count treatments
CausalELM.Continuous
— TypeType used to dispatch risk_ratio on continuous treatments
Activation Functions
CausalELM.binary_step
— Functionbinary_step(x)
Apply the binary step activation function.
Examples
julia> binary_step(1)
1
julia> binary_step([-1000, 100, 1, 0, -0.001, -3])
6-element Vector{Int64}:
0
1
1
1
0
0
CausalELM.σ
— Functionσ(x)
Apply the sigmoid activation function.
Examples
julia> σ(1)
0.7310585786300049
julia> σ([1.0, 0.0])
2-element Vector{Float64}:
0.7310585786300049
0.5
CausalELM.tanh
— Functiontanh(x)
Apply the hyperbolic tangent activation function.
Examples
julia> CausalELM.tanh([1.0, 0.0])
2-element Vector{Float64}:
0.7615941559557649
0.0
CausalELM.relu
— Functionrelu(x)
Apply the ReLU activation function.
Examples
julia> relu(1)
1
julia> relu([1.0, 0.0, -1.0])
3-element Vector{Float64}:
1.0
0.0
0.0
CausalELM.leaky_relu
— Functionleaky_relu(x)
Apply the leaky ReLU activation function to a number.
Examples
julia> leaky_relu(1)
1
julia> leaky_relu([-1.0, 0.0, 1.0])
3-element Vector{Float64}:
-0.01
0.0
1.0
CausalELM.swish
— Functionswish(x)
Apply the swish activation function to a number.
Examples
julia> swish(1)
0.7310585786300049
julia> swish([1.0, -1.0])
2-element Vector{Float64}:
0.7310585786300049
-0.2689414213699951
CausalELM.softmax
— Functionsoftmax(x)
Apply the softmax activation function to a number.
Examples
julia> softmax(1)
1.0
julia> softmax([1.0, 2.0, 3.0])
3-element Vector{Float64}:
0.09003057317038045
0.24472847105479764
0.6652409557748219
julia> softmax([1.0 2.0 3.0; 4.0 5.0 6.0])
2×3 Matrix{Float64}:
0.0900306 0.244728 0.665241
0.0900306 0.244728 0.665241
CausalELM.softplus
— Functionsoftplus(x)
Apply the softplus activation function to a number.
Examples
julia> softplus(1)
1.3132616875182228
julia> softplus([1.0, -1.0])
2-element Vector{Float64}:
1.3132616875182228
0.3132616875182228
CausalELM.gelu
— Functiongelu(x)
Apply the GeLU activation function to a number.
Examples
julia> gelu(1)
0.8411919906082768
julia> gelu([-1.0, 0.0])
2-element Vector{Float64}:
-0.15880800939172324
0.0
CausalELM.gaussian
— Functiongaussian(x)
Apply the gaussian activation function to a real number.
Examples
julia> gaussian(1)
0.36787944117144233
julia> gaussian([1.0, -1.0])
2-element Vector{Float64}:
0.3678794411714423
0.3678794411714423
CausalELM.hard_tanh
— Functionhard_tanh(x)
Apply the hard_tanh activation function to a number.
Examples
julia> hard_tanh(-2)
-1
julia> hard_tanh([-2.0, 0.0, 2.0])
3-element Vector{Real}:
-1
0.0
1
CausalELM.elish
— Functionelish(x)
Apply the ELiSH activation function to a number.
Examples
julia> elish(1)
0.7310585786300049
julia> elish([-1.0, 1.0])
2-element Vector{Float64}:
-0.17000340156854793
0.7310585786300049
CausalELM.fourier
— Functionfourier(x)
Apply the Fourier activation function to a real number.
Examples
julia> fourier(1)
0.8414709848078965
julia> fourier([-1.0, 1.0])
2-element Vector{Float64}:
-0.8414709848078965
0.8414709848078965
Average Causal Effect Estimators
CausalELM.g_formula!
— Functiong_formula!(g)
Compute the G-formula for G-computation and S-learning.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = GComputation(X, T, Y)
julia> g_formula!(m1)
julia> m2 = SLearner(X, T, Y)
julia> g_formula!(m2)
CausalELM.predict_residuals
— Functionpredict_residuals(D, x_train, x_test, y_train, y_test, t_train, t_test, Δ)
Predict treatment, outcome, and marginal effect residuals for double machine learning or R-learning.
Notes
This method should not be called directly.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> x_train, x_test = X[1:80, :], X[81:end, :]
julia> y_train, y_test = Y[1:80], Y[81:end]
julia> t_train, t_test = T[1:80], T[81:100]
julia> m1 = DoubleMachineLearning(X, T, Y)
julia> predict_residuals(m1, x_tr, x_te, y_tr, y_te, t_tr, t_te, zeros(100), 1e-5)
CausalELM.moving_average
— Functionmoving_average(x)
Calculates a cumulative moving average.
Examples
julia> moving_average([1, 2, 3])
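For reference, a cumulative moving average is the running mean of the first k elements; a plain-Julia sketch of that definition (the exact return type of moving_average may differ):
julia> cma(x) = [sum(x[1:i]) / i for i in eachindex(x)]
julia> cma([1, 2, 3])
3-element Vector{Float64}:
 1.0
 1.5
 2.0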
Metalearners
CausalELM.doubly_robust_formula!
— Functiondoubly_robust_formula!(DRE, X, T, Y)
Estimate the CATE for a single cross fitting iteration via doubly robust estimation.
Notes
This method should not be called directly.
Arguments
DRE::DoublyRobustLearner
: the DoublyRobustLearner struct to estimate the effect for.
X
: a vector of three covariate folds.
T
: a vector of three treatment folds.
Y
: a vector of three outcome folds.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(6, 100)
julia> m1 = DoublyRobustLearner(X, T, Y)
julia> doubly_robust_formula!(m1, X, T, Y)
CausalELM.stage1!
— Functionstage1!(x)
Estimate the first stage models for an X-learner.
Notes
This method should not be called by the user.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = XLearner(X, T, Y)
julia> stage1!(m1)
CausalELM.stage2!
— Functionstage2!(x)
Estimate the second stage models for an X-learner.
Notes
This method should not be called by the user.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = XLearner(X, T, Y)
julia> stage1!(m1)
julia> stage2!(m1)
CausalELM.weight_trick
— Functionweight_trick(R, T̃, Ỹ)
Use the weight trick to estimate the causal effect in the final stage of an R-learner.
Notes
This method should not be called directly.
Arguments
R::RLearner
: the RLearner struct to estimate the effect for.T̃
: a vector of residuals from predicting the treatment assignment.Ỹ
: a vector of residuals from predicting the outcome.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> r_learner = RLearner(X, T, Y)
julia> X, T̃, Ỹ = generate_folds(r_learner.X, r_learner.T, r_learner.Y, r_learner.folds)
julia> X_train, X_test = reduce(vcat, X[1:end .!== f]), X[f]
julia> Y_train, Y_test = reduce(vcat, Ỹ[1:end .!== f]), Ỹ[f]
julia> T_train, T_test = reduce(vcat, T̃[1:end .!== f]), T̃[f]
julia> Ỹ[f], T̃[f], _, _ = predict_residuals(
julia> r_learner, X_train, X_test, Y_train, Y_test, T_train, T_test, Δ
)
julia> weight_trick(r_learner, T̃, Ỹ)
Common Methods
CausalELM.estimate_causal_effect!
— Functionestimate_causal_effect!(its)
Estimate the effect of an event relative to a predicted counterfactual.
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> estimate_causal_effect!(m1)
estimate_causal_effect!(g)
Estimate a causal effect of interest using G-Computation.
Notes
If treatments are administered at multiple time periods, the effect will be estimated as the average difference between the outcome of being treated in all periods and being treated in no periods. For example, given that individuals 1, 2, ..., i ∈ I received either a treatment or a placebo in p different periods, the model would estimate the average treatment effect as E[Yᵢ|T₁=1, T₂=1, ..., Tₚ=1, Xₚ] - E[Yᵢ|T₁=0, T₂=0, ..., Tₚ=0, Xₚ].
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = GComputation(X, T, Y)
julia> estimate_causal_effect!(m1)
estimate_causal_effect!(DML)
Estimate a causal effect of interest using double machine learning.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = DoubleMachineLearning(X, T, Y)
julia> estimate_causal_effect!(m1)
julia> W = rand(100, 6)
julia> m2 = DoubleMachineLearning(X, T, Y, W=W)
julia> estimate_causal_effect!(m2)
estimate_causal_effect!(s)
Estimate the CATE using an S-learner.
References
For an overview of S-learning see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m4 = SLearner(X, T, Y)
julia> estimate_causal_effect!(m4)
estimate_causal_effect!(t)
Estimate the CATE using a T-learner.
References
For an overview of T-learning see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m5 = TLearner(X, T, Y)
julia> estimate_causal_effect!(m5)
estimate_causal_effect!(x)
Estimate the CATE using an X-learner.
References
For an overview of X-learning see: Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for estimating heterogeneous treatment effects using machine learning." Proceedings of the national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = XLearner(X, T, Y)
julia> estimate_causal_effect!(m1)
estimate_causal_effect!(R)
Estimate the CATE using an R-learner.
References
For an overview of R-learning see: Nie, Xinkun, and Stefan Wager. "Quasi-oracle estimation of heterogeneous treatment effects." Biometrika 108, no. 2 (2021): 299-319.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = RLearner(X, T, Y)
julia> estimate_causal_effect!(m1)
estimate_causal_effect!(DRE)
Estimate the CATE using a doubly robust learner.
References
For details on how this method estimates the CATE see: Kennedy, Edward H. "Towards optimal doubly robust estimation of heterogeneous causal effects." Electronic Journal of Statistics 17, no. 2 (2023): 3008-3049.
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = DoublyRobustLearner(X, T, Y)
julia> estimate_causal_effect!(m1)
Inference
CausalELM.summarize
— Functionsummarize(mod, kwargs...)
Get a summary from a CausalEstimator or Metalearner.
Arguments
mod::Union{CausalEstimator, Metalearner}
: a model to summarize.
Keywords
n::Int=1000
: the number of iterations to generate the null distribution for randomization inference if inference is true.
inference::Bool=false
: whether to calculate p-values and standard errors.
mean_effect::Bool=true
: whether to estimate the mean or cumulative effect for an interrupted time series estimator.
Notes
p-values and standard errors are estimated using approximate randomization inference. If inference is set to true, this procedure takes a long time due to repeated matrix inversions. You can greatly speed this up by setting n to a lower number and launching Julia with more threads.
References
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> X, T, Y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(100)
julia> m1 = GComputation(X, T, Y)
julia> estimate_causal_effect!(m1)
julia> summarize(m1)
julia> m2 = RLearner(X, T, Y)
julia> estimate_causal_effect!(m2)
julia> summarize(m2)
julia> m3 = SLearner(X, T, Y)
julia> estimate_causal_effect!(m3)
julia> summarise(m3) # British spelling works too!
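A sketch requesting randomization inference with a reduced number of iterations via the documented keywords (values illustrative):
julia> summarize(m1; inference=true, n=100)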
CausalELM.generate_null_distribution
— Functiongenerate_null_distribution(mod, n)
generate_null_distribution(mod, n, mean_effect)
Generate a null distribution for the treatment effect of G-computation, double machine learning, or metalearning.
Arguments
mod::Any
: model to summarize.
n::Int=100
: number of iterations to generate the null distribution for randomization inference.
mean_effect::Bool=true
: whether to estimate the mean or cumulative effect for an interrupted time series estimator.
Notes
This method estimates the same model that is provided using random permutations of the treatment assignment to generate a vector of estimated effects under different treatment regimes. When mod is a metalearner, the null statistic is the difference in the ATE.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
Examples
julia> x, t, y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(1:100, 100, 1)
julia> g_computer = GComputation(x, t, y)
julia> estimate_causal_effect!(g_computer)
julia> generate_null_distribution(g_computer, 500)
julia> x₀, y₀, x₁, y₁ = rand(1:100, 100, 5), rand(100), rand(10, 5), rand(10)
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> generate_null_distribution(its, 10)
CausalELM.quantities_of_interest
— Functionquantities_of_interest(mod, n)
quantities_of_interest(mod, n, mean_effect)
Generate a p-value and standard error through randomization inference
This method generates a null distribution of treatment effects by reestimating treatment effects from permutations of the treatment vector and estimates a p-value and standard error from the generated distribution.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x, t, y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(1:100, 100, 1)
julia> g_computer = GComputation(x, t, y)
julia> estimate_causal_effect!(g_computer)
julia> quantities_of_interest(g_computer, 1000)
julia> x₀, y₀, x₁, y₁ = rand(1:100, 100, 5), rand(100), rand(10, 5), rand(10)
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> quantities_of_interest(its, 10)
CausalELM.confidence_interval
— Functionconfidence_interval(null_dist, effect)
Compute 95% confidence intervals via randomization inference.
This function should not be called directly by the user.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x, t, y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(1:100, 100, 1)
julia> g_computer = GComputation(x, t, y)
julia> estimate_causal_effect!(g_computer)
julia> null_dist = CausalELM.generate_null_distribution(g_computer, 1000)
julia> confidence_interval(null_dist, g_computer.causal_effect)
(-0.45147664642089147, 0.45147664642089147)
CausalELM.p_value_and_std_err
— Functionp_value_and_std_err(null_dist, test_stat)
Compute the p-value for a given test statistic and null distribution.
This is an approximate method based on randomization inference that does not assume any parametric form of the null distribution.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x, t, y = rand(100, 5), [rand()<0.4 for i in 1:100], rand(1:100, 100, 1)
julia> g_computer = GComputation(x, t, y)
julia> estimate_causal_effect!(g_computer)
julia> null_dist = CausalELM.generate_null_distribution(g_computer, 1000)
julia> p_value_and_std_err(null_dist, CausalELM.mean(null_dist))
(0.3758916871866841, 0.1459779344550146)
Model Validation
CausalELM.validate
— Functionvalidate(its; kwargs...)
Test the validity of an estimated interrupted time series analysis.
Arguments
its::InterruptedTimeSeries
: an interrupted time series estimator.
Keywords
n::Int
: number of times to simulate a confounder.
low::Float64=0.15
: minimum proportion of data points to include before or after the tested break in the Wald supremum test.
high::Float64=0.85
: maximum proportion of data points to include before or after the tested break in the Wald supremum test.
Notes
This method conducts a Chow Test, a Wald supremum test, and tests the model's sensitivity to confounders. The Chow Test tests for structural breaks in the covariates between the time before and after the event. p-values represent the proportion of times the magnitude of the break in a covariate would have been greater due to chance. Lower p-values suggest a higher probability the event affected the covariates and they cannot provide unbiased counterfactual predictions. The Wald supremum test finds the structural break with the highest Wald statistic. If this is not the same as the hypothesized break, it could indicate an anticipation effect, a confounding event, or that the intervention or policy took place in multiple phases. p-values represent the proportion of times we would see a larger Wald statistic if the data points were randomly allocated to pre and post-event periods for the predicted structural break. Ideally, the hypothesized break will be the same as the predicted break and it will also have a low p-value. The omitted predictors test adds normal random variables with uniform noise as predictors. If the included covariates are good predictors of the counterfactual outcome, adding irrelevant predictors should not have a large effect on the predicted counterfactual outcomes or the estimated effect.
This method does not implement the second test in Baicker and Svoronos because the estimator in this package models the relationship between covariates and the outcome and uses an extreme learning machine instead of linear regression, so variance in the outcome across different bins is not much of an issue.
References
For more details on the assumptions and validity of interrupted time series designs, see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> estimate_causal_effect!(m1)
julia> validate(m1)
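A sketch passing the documented keywords for the interrupted time series checks (values illustrative):
julia> validate(m1; n=100, low=0.2, high=0.8)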
validate(m; kwargs)
Arguments
m::Union{CausalEstimator, Metalearner}
: model to validate/test the assumptions of.
Keywords
devs::Any
: iterable of deviations from which to generate noise to simulate violations of the counterfactual consistency assumption.
num_iterations::Int=10
: number of times to simulate a violation of the counterfactual consistency assumption.
min::Float64=1.0e-6
: minimum probability of treatment for the positivity assumption.
high::Float64=1-min
: maximum probability of treatment for the positivity assumption.
Notes
This method tests the counterfactual consistency, exchangeability, and positivity assumptions required for causal inference. It should be noted that consistency and exchangeability are not directly testable, so these tests do not provide definitive evidence of a violation of these assumptions. To probe the counterfactual consistency assumption, we simulate counterfactual outcomes that are different from the observed outcomes, estimate models with the simulated counterfactual outcomes, and take the averages. If the outcome is continuous, the noise for the simulated counterfactuals is drawn from N(0, dev) for each element in devs, otherwise the default is 0.25, 0.5, 0.75, and 1.0 standard deviations from the mean outcome. For discrete variables, each outcome is replaced with a different value in the range of outcomes with probability ϵ for each ϵ in devs, otherwise the default is 0.025, 0.05, 0.075, 0.1. If the average estimate for a given level of violation differs greatly from the effect estimated on the actual data, then the model is very sensitive to violations of the counterfactual consistency assumption for that level of violation. Next, this method tests the model's sensitivity to a violation of the exchangeability assumption by calculating the E-value, which is the minimum strength of association, on the risk ratio scale, that an unobserved confounder would need to have with the treatment and outcome variable to fully explain away the estimated effect. Thus, higher E-values imply the model is more robust to a violation of the exchangeability assumption. Finally, this method tests the positivity assumption by estimating propensity scores. Rows in the matrix are levels of covariates that have a zero probability of treatment. If the matrix is empty, none of the observations have an estimated zero probability of treatment, which implies the positivity assumption is satisfied.
References
For a thorough review of causal inference assumptions see: Hernan, Miguel A., and James M. Robins. Causal Inference: What If. Boca Raton: Taylor and Francis, 2024.
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, t, y = rand(100, 5), Float64.([rand()<0.4 for i in 1:100]), vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> validate(g_computer)
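A sketch passing the documented keywords for the consistency and positivity checks (values illustrative):
julia> validate(g_computer; devs=(0.25, 0.5, 0.75, 1.0), num_iterations=5, min=1.0e-6)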
CausalELM.covariate_independence
— Functioncovariate_independence(its; kwargs..)
Test for independence between covariates and the event or intervention.
Arguments
its::InterruptedTimeSeries
: an interrupted time series estimator.
Keywords
n::Int
: number of permutations for assigning observations to the pre and post-treatment periods.
This is a Chow Test for covariates with p-values estimated via randomization inference, which does not assume a distribution for the outcome variable. The p-values are the proportion of times randomly assigning observations to the pre or post-intervention period would have a larger estimated effect on the slope of the covariates. The lower the p-values, the more likely it is that the event/intervention affected the covariates and they cannot provide an unbiased prediction of the counterfactual outcomes.
For more information on using a Chow Test to test for structural breaks see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)),
randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> covariate_independence(its)
CausalELM.omitted_predictor
— Functionomitted_predictor(its; kwargs...)
See how an omitted predictor/variable could change the results of an interrupted time series analysis.
Arguments
its::InterruptedTimeSeries
: interrupted time series estimator.
Keywords
n::Int
: number of times to simulate a confounder.
Notes
This method reestimates interrupted time series models with uniform random variables. If the included covariates are good predictors of the counterfactual outcome, adding a random variable as a covariate should not have a large effect on the predicted counterfactual outcomes and therefore the estimated average effect.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)), randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> omitted_predictor(its)
CausalELM.sup_wald
— Functionsup_wald(its; kwargs)
Check if the predicted structural break is the hypothesized structural break.
Arguments
its::InterruptedTimeSeries
: interrupted time series estimator.
Keywords
n::Int
: number of times to simulate a confounder.
low::Float64=0.15
: minimum proportion of data points to include before or after the tested break in the Wald supremum test.
high::Float64=0.85
: maximum proportion of data points to include before or after the tested break in the Wald supremum test.
Notes
This method conducts Wald tests and identifies the structural break with the highest Wald statistic. If this break is not the same as the hypothesized break, it could indicate an anticipation effect, confounding by some other event or intervention, or that the intervention or policy took place in multiple phases. p-values are estimated using approximate randomization inference and represent the proportion of times we would see a larger Wald statistic if the data points were randomly allocated to pre and post-event periods for the predicted structural break.
References
For more information on using a Chow Test to test for structural breaks see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)),
randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> sup_wald(its)
CausalELM.p_val
— Functionp_val(x, y, β; kwargs...)
Estimate the p-value for the hypothesis that an event had a statistically significant effect on the slope of a covariate using randomization inference.
Arguments
x::Array{<:Real}
: covariates.
y::Array{<:Real}
: outcome.
β::Array{<:Real}
: fitted weights.
Keywords
two_sided::Bool=false
: whether to conduct a two-sided hypothesis test.
Examples
julia> x, y, β = reduce(hcat, (float(rand(0:1, 10)), ones(10))), rand(10), 0.5
julia> p_val(x, y, β)
julia> p_val(x, y, β; n=100, two_sided=true)
CausalELM.counterfactual_consistency
— Functioncounterfactual_consistency(m; kwargs...)
Arguments
m::Union{CausalEstimator, Metalearner}
: model to validate/test the assumptions of.
Keywords
num_devs::Tuple=(0.25, 0.5, 0.75, 1.0)
: number of standard deviations from which to generate noise from a normal distribution to simulate violations of the counterfactual consistency assumption.
num_iterations::Int=10
: number of times to simulate a violation of the counterfactual consistency assumption.
Notes
Examine the counterfactual consistency assumption. First, this function simulates counterfactual outcomes that are offset from the outcomes in the dataset by random scalars drawn from a N(0, dev) distribution. Then, the procedure is repeated num_iterations times and averaged. If the model is a metalearner, then the estimated individual treatment effects are averaged and the mean CATE is averaged over all the iterations, otherwise the estimated treatment effect is averaged over the iterations. The previous steps are repeated for each element in num_devs.
Examples
julia> x, t = rand(100, 5), Float64.([rand()<0.4 for i in 1:100])
julia> y = vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> counterfactual_consistency(g_computer)
CausalELM.simulate_counterfactual_violations
— Functionsimulate_counterfactual_violations(y, dev)
Arguments
y::Vector{<:Real}
: vector of real-valued outcomes.
dev::Float64
: deviation of the observed outcomes from the true counterfactual outcomes.
Examples
julia> x, t, y = rand(100, 5), Float64.([rand()<0.4 for i in 1:100]), vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> simulate_counterfactual_violations(y, 0.25)
CausalELM.exchangeability
— Functionexchangeability(model)
Test the sensitivity of a G-computation or doubly robust estimator or metalearner to a violation of the exchangeability assumption.
References
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, t = rand(100, 5), Float64.([rand()<0.4 for i in 1:100])
julia> y = vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> e_value(g_computer)
CausalELM.e_value
— Functione_value(model)
Test the sensitivity of an estimator to a violation of the exchangeability assumption.
References
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, t = rand(100, 5), Float64.([rand()<0.4 for i in 1:100])
julia> y = vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> e_value(g_computer)
CausalELM.binarize
— Functionbinarize(x, cutoff)
Convert a vector of counts or a continuous vector to a binary vector.
Arguments
x::Any
: iterable of numbers to binarize.
cutoff::Any
: threshold after which numbers are converted to 1 and before which they are converted to 0.
Examples
julia> CausalELM.binarize([1, 2, 3], 2)
3-element Vector{Int64}:
0
0
1
CausalELM.risk_ratio
— Functionrisk_ratio(model)
Calculate the risk ratio for an estimated model.
Notes
If the treatment variable is not binary and the outcome variable is not continuous then the treatment variable will be binarized.
References
For more information on how other quantities of interest are converted to risk ratios see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, t = rand(100, 5), Float64.([rand()<0.4 for i in 1:100])
julia> y = vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> risk_ratio(g_computer)
CausalELM.positivity
— Functionpositivity(model, [min], [max])
Find likely violations of the positivity assumption.
Notes
This method uses an extreme learning machine or regularized extreme learning machine to estimate probabilities of treatment. The returned matrix, which may be empty, contains the covariates that have a (near) zero probability of treatment or a near zero probability of being assigned to the control group, with their entry in the last column being their estimated treatment probability. In other words, they likely violate the positivity assumption.
Arguments
model::Union{CausalEstimator, Metalearner}
: a model to validate/test the assumptions of.
min::Float64=1.0e-6
: minimum probability of treatment for the positivity assumption.
high::Float64=1-min
: the maximum probability of treatment for the positivity assumption.
Examples
julia> x, t = rand(100, 5), Float64.([rand()<0.4 for i in 1:100])
julia> y = vec(rand(1:100, 100, 1))
julia> g_computer = GComputation(x, t, y, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> positivity(g_computer)
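A sketch passing the optional probability bounds positionally, as suggested by the signature above; the bounds are illustrative and this positional form is an assumption based on that signature:
julia> positivity(g_computer, 0.01, 0.99)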
Validation Metrics
CausalELM.mse
— Functionmse(y, ŷ)
Calculate the mean squared error
See also mae
.
Examples
julia> mse([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0])
4.0
CausalELM.mae
— Functionmae(y, ŷ)
Calculate the mean absolute error
See also mse
.
Examples
julia> mae([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0])
2.0
CausalELM.accuracy
— Functionaccuracy(y, ŷ)
Calculate the accuracy for a classification task
Examples
julia> accuracy([1, 1, 1, 1], [0, 1, 1, 0])
0.5
CausalELM.precision
— Functionprecision(y, ŷ)
Calculate the precision for a classification task
See also recall
.
Examples
julia> CausalELM.precision([0, 1, 0, 0], [0, 1, 1, 0])
1.0
CausalELM.recall
— Functionrecall(y, ŷ)
Calculate the recall for a classification task
See also CausalELM.precision
.
Examples
julia> recall([1, 2, 1, 3, 0], [2, 2, 2, 3, 1])
0.5
CausalELM.F1
— FunctionF1(y, ŷ)
Calculate the F1 score for a classification task
Examples
julia> F1([1, 2, 1, 3, 0], [2, 2, 2, 3, 1])
0.4
CausalELM.confusion_matrix
— Functionconfusion_matrix(y, ŷ)
Generate a confusion matrix
Examples
julia> CausalELM.confusion_matrix([1, 1, 1, 1, 0], [1, 1, 1, 1, 0])
2×2 Matrix{Int64}:
1 0
0 4
Extreme Learning Machines
CausalELM.fit!
— Functionfit!(model)
Fit an ExtremeLearner to the data.
References
For more details see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = ExtremeLearner(x, y, 10, σ)
julia> fit!(m1)
fit!(model)
Fit an ensemble of ExtremeLearners to the data.
Arguments
model::ELMEnsemble
: ensemble of ExtremeLearners to fit.
Notes
This uses the same bagging approach as random forests when the labels are continuous but uses the average predicted probability, rather than voting, for classification.
Examples
julia> X, Y = rand(100, 5), rand(100)
julia> m1 = ELMEnsemble(X, Y, 10, 50, 5, 5, CausalELM.relu)
julia> fit!(m1)
CausalELM.predict
— Functionpredict(model, X)
Use an ExtremeLearningMachine or ELMEnsemble to make predictions.
Notes
If using an ensemble to make predictions, this method returns a matrix where each row is a prediction and each column is a model.
References
For more details see: Huang G-B, Zhu Q-Y, Siew C. Extreme learning machine: theory and applications. Neurocomputing. 2006;70:489–501. https://doi.org/10.1016/j.neucom.2005.12.126
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = ExtremeLearner(x, y, 10, σ)
julia> fit!(m1)
julia> predict(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
julia> X, Y = rand(100, 5), rand(100)
julia> m2 = ELMEnsemble(X, Y, 10, 50, 5, 5, CausalELM.relu)
julia> fit!(m2)
julia> predict(m2, X)
CausalELM.predict_counterfactual!
— Functionpredict_counterfactual!(model, X)
Use an ExtremeLearningMachine to predict the counterfactual.
Notes
This should be run with the observed covariates. To use synthetic data for what-if scenarios, use predict.
See also predict
.
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = ExtremeLearner(x, y, 10, σ)
julia> fit!(m1)
julia> predict_counterfactual!(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
CausalELM.placebo_test
— Functionplacebo_test(model)
Conduct a placebo test.
Notes
This method makes predictions for the post-event or post-treatment period using data in the pre-event or pre-treatment period and the post-event or post-treatment period. If there is a statistically significant difference between these predictions the study design may be flawed. Due to the multitude of significance tests for time series data, this function returns the predictions but does not test for statistical significance.
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = ExtremeLearner(x, y, 10, σ)
julia> fit!(m1)
julia> predict_counterfactual!(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
julia> placebo_test(m1)
CausalELM.set_weights_biases
— Functionset_weights_biases(model)
Calculate the weights and biases for an extreme learning machine.
Notes
Initialization is done using uniform Xavier initialization.
References
For details see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Examples
julia> x, y = [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0], [0.0, 1.0, 0.0, 1.0]
julia> m1 = RegularizedExtremeLearner(x, y, 10, σ)
julia> set_weights_biases(m1)
Utility Functions
CausalELM.var_type
— Functionvar_type(x)
Determine the type of variable held by a vector.
Examples
julia> CausalELM.var_type([1, 2, 3, 2, 3, 1, 1, 3, 2])
CausalELM.Count()
CausalELM.mean
— Functionmean(x)
Calculate the mean of a vector.
Examples
julia> CausalELM.mean([1, 2, 3, 4])
2.5
CausalELM.var
— Functionvar(x)
Calculate the (sample) variance of a vector.
Examples
julia> CausalELM.var([1, 2, 3, 4])
1.6666666666666667
CausalELM.one_hot_encode
— Functionone_hot_encode(x)
One hot encode a categorical vector for multiclass classification.
Examples
julia> CausalELM.one_hot_encode([1, 2, 3, 4, 5])
5×5 Matrix{Float64}:
1.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
CausalELM.clip_if_binary
— Functionclip_if_binary(x, var)
Constrain binary values between 1e-7 and 1 - 1e-7, otherwise return the original values.
Arguments
x::Array
: array to clip if it is binary.
var
: type of x based on calling var_type.
See also var_type
.
Examples
julia> CausalELM.clip_if_binary([1.2, -0.02], CausalELM.Binary())
2-element Vector{Float64}:
1.0
0.0
julia> CausalELM.clip_if_binary([1.2, -0.02], CausalELM.Count())
2-element Vector{Float64}:
1.2
-0.02
CausalELM.@model_config
— Macromodel_config(effect_type)
Generate fields common to all CausalEstimator, Metalearner, and InterruptedTimeSeries structs.
Arguments
effect_type::String
: "averageeffect" or "individualeffect" to define fields for either models that estimate average effects or the CATE.
Examples
julia> struct TestStruct CausalELM.@model_config average_effect end
julia> TestStruct("ATE", false, "classification", true, relu, F1, 2, 10, 5, 100, 5, 5, 0.25)
TestStruct("ATE", false, "classification", true, relu, F1, 2, 10, 5, 100, 5, 5, 0.25)
CausalELM.@standard_input_data
— Macrostandard_input_data()
Generate fields common to all CausalEstimators except DoubleMachineLearning and all Metalearners except RLearner and DoublyRobustLearner.
Examples
julia> struct TestStruct CausalELM.@standard_input_data end
julia> TestStruct([5.2], [0.8], [0.96])
TestStruct([5.2], [0.8], [0.96])
CausalELM.generate_folds
— Functiongenerate_folds(X, T, Y, folds)
Create folds for cross validation.
Examples
julia> xfolds, tfolds, yfolds = CausalELM.generate_folds(zeros(4, 2), zeros(4), ones(4), 2)
([[0.0 0.0], [0.0 0.0; 0.0 0.0; 0.0 0.0]], [[0.0], [0.0, 0.0, 0.0]], [[1.0], [1.0, 1.0, 1.0]])
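The returned folds can be recombined into training and test splits for cross fitting, following the same indexing pattern used in the weight_trick example above; a minimal sketch (the held-out fold index 2 is chosen arbitrarily):
julia> X, T, Y = rand(100, 5), rand(0:1, 100), rand(100)
julia> xfolds, tfolds, yfolds = CausalELM.generate_folds(X, T, Y, 5)
julia> x_train, x_test = reduce(vcat, xfolds[1:end .!= 2]), xfolds[2]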
CausalELM.convert_if_table
— Functionconvert_if_table(t)
Convert a data structure that implements the Tables.jl API to a matrix, otherwise return the original data.
Examples
julia> CausalELM.convert_if_table([1 1; 1 1])
2×2 Matrix{Int64}:
1 1
1 1