Competing in a data science contest without reading the data(blog.mrtz.org) |
The disconnect between static and interactive data analysis that is at the heart of the post is probably the most ignored issue in science.
To be honest, it's hard not to ignore it given its implications: we only get one shot at a set of test/validation/experimental data, and if we mess up, we're screwed.
Here is a very bad, very old AAAI workshop paper that sums up the idea (the journal paper is behind a paywall).
http://aaaipress.org/Papers/Workshops/1999/WS-99-06/WS99-06-...