Double Machine Learning

5. Double Machine Learning

In this lab, we replicate a study on internet-accessed sexually transmitted infection (eSTI) testing using a provided dataset. We begin with descriptive analyses: tables and graphs that compare the characteristics of the treated and control groups.

The core of the lab is Double Machine Learning (DML). We use four machine learning methods (Lasso, regression trees, boosted trees, and regression forests) to estimate the treatment effect while controlling for the other covariates. We then compare the estimated coefficients and standard errors across methods in a consolidated table and a plot. Finally, we compare how well each method predicts the outcome and the treatment variable, and identify the best-performing model as the one with the lowest Root Mean Squared Error (RMSE) on both.
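As a rough illustration of this workflow, the sketch below runs cross-fitted DML with the four learner families and reports the coefficient, standard error, and nuisance RMSEs for each. It is a minimal sketch under several assumptions that are not taken from the lab itself: the data are simulated rather than the eSTI dataset, the treatment is continuous, the model is partially linear (y = theta * d + g(X) + e), and the function name `dml_plr`, the scikit-learn learners, and their hyperparameters are illustrative choices.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor


def dml_plr(y, d, X, learner, n_folds=5, seed=0):
    """Cross-fitted DML for the partially linear model y = theta*d + g(X) + e.

    Hypothetical helper for illustration; not the lab's own function.
    """
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    # Out-of-fold predictions of E[y|X] and E[d|X]: every observation is
    # predicted by a model that did not see it during fitting.
    y_hat = cross_val_predict(learner, X, y, cv=folds)
    d_hat = cross_val_predict(learner, X, d, cv=folds)
    res_y, res_d = y - y_hat, d - d_hat
    # Final stage: OLS (no intercept) of outcome residuals on treatment residuals.
    theta = np.sum(res_d * res_y) / np.sum(res_d ** 2)
    # Standard error based on the orthogonal score (res_y - theta*res_d)*res_d.
    psi = (res_y - theta * res_d) * res_d
    se = np.sqrt(np.mean(psi ** 2) / np.mean(res_d ** 2) ** 2 / len(y))
    return {
        "theta": theta,
        "se": se,
        "rmse_y": np.sqrt(np.mean(res_y ** 2)),
        "rmse_d": np.sqrt(np.mean(res_d ** 2)),
    }


# Simulated data (an assumption; stands in for the lab's dataset).
rng = np.random.default_rng(0)
n, p, true_theta = 1000, 5, 0.5
X = rng.normal(size=(n, p))
d = 0.5 * X[:, 0] + rng.normal(size=n)
y = true_theta * d + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# One learner per method family; hyperparameters are arbitrary illustrations.
learners = {
    "lasso": LassoCV(cv=5),
    "tree": DecisionTreeRegressor(min_samples_leaf=20, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
    "forest": RandomForestRegressor(
        n_estimators=100, min_samples_leaf=5, random_state=0
    ),
}

results = {name: dml_plr(y, d, X, model) for name, model in learners.items()}

# Consolidated table: one row per method.
print(f"{'method':<10}{'theta':>8}{'se':>8}{'rmse_y':>9}{'rmse_d':>9}")
for name, r in results.items():
    print(
        f"{name:<10}{r['theta']:>8.3f}{r['se']:>8.3f}"
        f"{r['rmse_y']:>9.3f}{r['rmse_d']:>9.3f}"
    )

# Learner with the lowest out-of-fold RMSE for each nuisance function.
best_y = min(results, key=lambda k: results[k]["rmse_y"])
best_d = min(results, key=lambda k: results[k]["rmse_d"])
print("lowest RMSE for outcome:", best_y, "| for treatment:", best_d)
```

The same folds are used for both nuisance predictions, and the RMSEs are computed from out-of-fold residuals, so they measure predictive accuracy rather than in-sample fit. With a binary treatment, as in a treated-versus-control comparison, one might instead use a classifier for the treatment model; that variation is not shown here.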

This lab offers hands-on experience in applying advanced econometric techniques and machine learning methods to real-world data, the kind of rigorous analytical work often required in research assistant roles.