Work to Replicate
Wang, Yixin, and David M. Blei. "The Blessings of Multiple Causes." Journal of the American Statistical Association 114.528 (2019): 1574-1596. DOI: 10.1080/01621459.2019.1686987
Motivation
Addressing unobserved confounding in observational datasets is critically important when trying to make causal inferences. "The Blessings of Multiple Causes" promotes one method for doing so: the deconfounder.
In the experiments that my colleagues and I have performed, the technique has proven simultaneously harder to use than described and ultimately ineffective in simulations where we know the ground truth.
These observations persist in our initial experiments with the authors' original data.
My colleagues (@bouzaghrane and @hassanobeid1994) and I believe that the original presentation of Wang and Blei's work glosses over the issues that contribute to these hardships. Researchers attempting to make causal inferences with their own datasets may waste considerable time trying to use this method if they are not aware of these problems. Accordingly, we think a replication is a good idea: we can show, in the context of the original paper, new details that allow users to make informed choices about how to use this method and whether it is worth trying at all.
Beyond the points above, the authors' example code is in TensorFlow, and it is quite difficult to read and understand (in our opinion). Once we've replicated Wang and Blei's work in TensorFlow, we would like to rewrite their code in PyTorch / Pyro. We expect the rewrite to be easier both to understand and to edit for others who want to use or build upon their work.
Challenges
We expect the replication to take 2-3 months due to factors such as the ongoing pandemic and the fact that the collaborators on this project have other day jobs.
One mild difficulty, which we expect to surmount easily, is that the authors' example code is in TensorFlow, and we have only basic familiarity with this framework. However, the authors have posted most of the code needed to replicate their paper. Additionally, Wang and Blei's article describes their algorithms clearly, and the data used in their study is available.
Lastly, the original code for the paper is not in the public domain, but we expect the authors to be reachable via email or via the GitHub repo that provides tutorial / example code for the paper.
Questions
- Given that the original paper performs a simulation study, and we don't directly have access to the exact random seeds used in the article, are the editors of ReScience C okay with a replication that leads to the same qualitative conclusions even if the results are not identical?
- Beyond the pure replication of the article's results, are the editors of ReScience C okay with our providing further analysis of and commentary on the original data and methods? All such analysis and commentary will be reproducible and open source, with data and code available on GitHub.