If you use this data set in publications please cite
A.-L. Boulesteix, S. Lauer, M. Eugster, 2013. A plea for neutral comparison studies in computational sciences. PLoS One 8(4):e61562.
The issue (v2) should not be included in the clustering.
It is possible to consider the journal (v1) or the discipline (v17) as external variables.
v1 journal
v2 issue
v3 comparison study of 2 or more methods (yes/no)
v4 evaluation on real data (yes/no)
v5 satisfies comparison study and evaluation on real data (yes/no)
v6 number of data sets (integer)
v7 number of considered methods / variants (integer)
v8 number of accuracy measures (integer)
v9 new method? (1=yes, 0=no)
v10 have the authors (cumulatively) published three or more papers on one of the methods included in the comparison? (yes/no)
v11 clear winner(s)? (yes/no)
v12 new method belongs to winners? (yes/no)
v13 winners mentioned in abstract? (yes/no)
v14 negative result (yes/no)
v15 if new method, results presented for several methods? (yes/no)
v16 simulation included? (yes/no)
v17 discipline ("BI"=bioinformatics, "CS"=computational statistics, "ML"=machine learning)
v18 winner categories (1 = clear winner , 2 = competitor with advantages, 3 = competitor without advantages (or drawbacks))
v19 new method or comparison ("new method", "comparison")
There are missing values in:
- v10 if the information was not found during our search
- v12 if no new method is presented
- v15 if no new method is presented or if there is no clear winner
- v18 if there is no clear winner
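These missing values are structural rather than random: each one follows from the value of another variable. A minimal sketch of how the rules above could be checked programmatically, using hypothetical rows (not actual records from the study) with None standing for a missing value:

```python
# Hypothetical example rows; v9 is coded 1/0, the other flags "yes"/"no",
# and None marks a missing value, as described in the list above.
rows = [
    {"v9": 1, "v10": "yes", "v11": "yes", "v12": "yes", "v15": "yes", "v18": 1},
    {"v9": 0, "v10": None,  "v11": "yes", "v12": None,  "v15": None,  "v18": 2},
    {"v9": 1, "v10": "no",  "v11": "no",  "v12": "no",  "v15": None,  "v18": None},
]

def obeys_structural_na(row):
    """Check the documented conditions under which a value must be missing."""
    if row["v9"] == 0 and row["v12"] is not None:      # no new method -> v12 missing
        return False
    if (row["v9"] == 0 or row["v11"] == "no") and row["v15"] is not None:
        return False                                   # -> v15 missing
    if row["v11"] == "no" and row["v18"] is not None:  # no clear winner -> v18 missing
        return False
    return True

print(all(obeys_structural_na(r) for r in rows))  # True
```

Note that v10 is excluded from the check: it is missing whenever the information was not found, which cannot be deduced from the other variables.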
If the journal (v1) or the discipline (v17) is used as an external variable, the clustering is not expected to correspond exactly to it.
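As a concrete illustration of the points above, a sketch of how one record might be split before clustering: v2 is dropped, and v1 and v17 are set aside as external variables. The record and its values are invented for illustration only.

```python
# Hypothetical record following the variable names above (values invented).
record = {"v1": "PLoS One", "v2": "8(4)", "v3": "yes", "v6": 12, "v17": "BI"}

# v1 and v17 are kept aside for an external comparison with the clusters.
external = {k: record[k] for k in ("v1", "v17")}

# v2 (the issue) is excluded entirely; the rest feeds the clustering.
features = {k: v for k, v in record.items() if k not in ("v1", "v2", "v17")}

print(sorted(features))  # ['v3', 'v6']
```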