Star

Overview

CausalELM provides easy-to-use implementations of modern causal inference methods in a lightweight package. While CausalELM implements a variety of estimators, they all have one thing in common—the use of machine learning models to flexibly estimate causal effects. This is where the ELM in CausalELM comes from—the machine learning model underlying all the estimators is an extreme learning machine (ELM). ELMs are a simple neural network that use randomized weights and offer a good tradeoff between learning non-linear dependencies and simplicity. Furthermore, CausalELM implements bagged ensembles of ELMs to reduce the variance resulting from randomized weights.

Estimators

CausalELM implements estimators for aggreate e.g. average treatment effect (ATE) and individualized e.g. conditional average treatment effect (CATE) quantities of interest.

Estimators for Aggregate Effects

Interrupted Time Series Estimator
G-computation
Double machine Learning

Individualized Treatment Effect (CATE) Estimators

S-learner
T-learner
X-learner
R-learner
Doubly Robust Estimator

Features

Estimate a causal effect, get a summary, and validate assumptions in just four lines of code
Enables using the same structs for regression and classification
Includes 13 activation functions and allows user-defined activation functions
Most inference and validation tests do not assume functional or distributional forms
Implements the latest techniques from statistics, econometrics, and biostatistics
Works out of the box with AbstractArrays or any data structure that implements the Tables.jl interface
Works with CuArrays, ROCArrays, and any other GPU-specific arrays that are AbstractArrays
CausalELM is lightweight—its only dependency is Tables.jl
Codebase is high-quality, well tested, and regularly updated

What's New?

Includes support for GPU-specific arrays and data structures that implement the Tables.jl API
Only performs randomization inference when the inference argument is set to true in summarize methods
Summaries support calculating marginal effects and confidence intervals
Randomization inference now uses multithreading
CausalELM was presented at JuliaCon 2024 in Eindhoven
Refactored code to be easier to extend and understand

What makes CausalELM different?

Other packages, mainly EconML, DoWhy, CausalAI, and CausalML, have similar funcitonality. Beides being written in Julia rather than Python, the main differences between CausalELM and these libraries are:

Simplicity is core to casualELM's design philosophy. CausalELM only uses one type of machine learning model, extreme learning machines (with bagging) and does not require you to import any other packages or initialize machine learning models, pass machine learning structs to CausalELM's estimators, convert dataframes or arrays to a special type, or one hot encode categorical treatments. By trading a little bit of flexibility for a simpler API, all of CausalELM's functionality can be used with just four lines of code.
As part of this design principle, CausalELM's estimators decide whether to use regression or classification based on the type of outcome variable. This is in contrast to most machine learning packages, which have separate classes or structs fro regressors and classifiers of the same model.
CausalELM's validate method, which is specific to each estimator, allows you to validate or test the sentitivity of an estimator to possible violations of identifying assumptions.
Unlike packages that do not allow you to estimate p-values and standard errors, use bootstrapping to estimate them, or use incorrect hypothesis tests, all of CausalELM's estimators provide p-values and standard errors generated via approximate randomization inference.
CausalELM strives to be lightweight while still being powerful and therefore does not have external dependencies: all the functions it uses are in the Julia standard library with the exception of model constructors, which use Tables.matrix to ensure integration with a wide variety of data structures.
The other packages and many others mostly use techniques from one field. Instead, CausalELM incorporates a hodgepodge of ideas from statistics, machine learning, econometrics, and biostatistics.
CausalELM doesn't use any unnecessary abstractions. The only structs are the actual models. Estimated effects are returned as arrays, summaries are returned in a dictionary, and the results of validating an estimator are returned as tuples. This is in contrast to other packages that utilize separate structs (classes) for summaries and inference results.

Installation

CausalELM requires Julia version 1.8 or greater and can be installed from the REPL as shown below.

using Pkg 
Pkg.add("CausalELM")

More Information

For a more interactive overview, see our JuliaCon 2024 talk here