If you use this data set in publications please cite
A.-L. Boulesteix, S. Lauer, M. Eugster, 2013. A plea for neutral comparison studies in computational sciences. PLoS One 8(4):e61562.
The issue (v2) should not be included in the clustering.
It is possible to consider the journal (v1) or the discipline (v17) as external variables.
v1 journal
v2 issue
v3 comparison study of 2 or more methods (yes/no)
v4 evaluation on real data (yes/no)
v5 satisfies comparison study and evaluation on real data (yes/no)
v6 number of data sets (integer)
v7 number of considered methods / variants (integer)
v8 number of accuracy measures (integer)
v9 new method? (1=yes, 0=no)
v10 have the authors (cumulatively) published three or more papers on one of the methods included in the comparison? (yes/no)
v11 clear winner(s)? (yes/no)
v12 new method belongs to winners? (yes/no)
v13 winners mentioned in abstract? (yes/no)
v14 negative result (yes/no)
v15 if new method, results presented for several methods? (yes/no)
v16 simulation included? (yes/no)
v17 discipline ("BI"=bioinformatics, "CS"=computational statistics, "ML"=machine learning)
v18 winner categories (1 = clear winner , 2 = competitor with advantages, 3 = competitor without advantages (or drawbacks))
v19 new method or comparison ("new method", "comparison")
There are missing values in:
- v10 if the information was not found during our search
- v12 if no new method is presented
- v15 if no new method is presented or if there is no clear winner
- v18 if there is no clear winner
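These missing values are structural rather than random: each one follows from the value of another variable. A minimal sketch of how the rules above could be checked programmatically, using hypothetical rows (not actual records from the study) with None standing for a missing value:

```python
# Hypothetical example rows; v9 is coded 1/0, the other flags "yes"/"no",
# and None marks a missing value, as described in the list above.
rows = [
    {"v9": 1, "v10": "yes", "v11": "yes", "v12": "yes", "v15": "yes", "v18": 1},
    {"v9": 0, "v10": None,  "v11": "yes", "v12": None,  "v15": None,  "v18": 2},
    {"v9": 1, "v10": "no",  "v11": "no",  "v12": "no",  "v15": None,  "v18": None},
]

def obeys_structural_na(row):
    """Check the documented conditions under which a value must be missing."""
    if row["v9"] == 0 and row["v12"] is not None:      # no new method -> v12 missing
        return False
    if (row["v9"] == 0 or row["v11"] == "no") and row["v15"] is not None:
        return False                                   # -> v15 missing
    if row["v11"] == "no" and row["v18"] is not None:  # no clear winner -> v18 missing
        return False
    return True

print(all(obeys_structural_na(r) for r in rows))  # True
```

Note that v10 is excluded from the check: it is missing whenever the information was not found, which cannot be deduced from the other variables.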
If the journal (v1) or the discipline (v17) is used as an external variable, the clustering is not expected to correspond exactly to it.
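As a concrete illustration of the points above, a sketch of how one record might be split before clustering: v2 is dropped, and v1 and v17 are set aside as external variables. The record and its values are invented for illustration only.

```python
# Hypothetical record following the variable names above (values invented).
record = {"v1": "PLoS One", "v2": "8(4)", "v3": "yes", "v6": 12, "v17": "BI"}

# v1 and v17 are kept aside for an external comparison with the clusters.
external = {k: record[k] for k in ("v1", "v17")}

# v2 (the issue) is excluded entirely; the rest feeds the clustering.
features = {k: v for k, v in record.items() if k not in ("v1", "v2", "v17")}

print(sorted(features))  # ['v3', 'v6']
```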