The Iris Data

Downloads: iris.arff, iris.csv, iris.RData

Summary of the data

Contributor:
Friedrich Leisch
Source:
Sir Ronald A. Fisher and Edgar Anderson; this copy was taken from R version 3.2.0.
License:
PDDL

General information about the data

Abstract:
This famous (Fisher's or Anderson's) iris data set gives measurements, in centimeters, of sepal length, sepal width, petal length, and petal width for 50 flowers from each of 3 species of iris (150 flowers in total). The species are Iris setosa, Iris versicolor, and Iris virginica.
Subject matter background:
Separate the three species from each other.
Data structure:
objects × variables data matrix (150 flowers × 4 measurements, plus the species label)
Data values:
no missing values
Preprocessing:
none
Relevant papers:
Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188.

Anderson, Edgar (1935). The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2–5.
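
The structure stated above (four measurements per flower, a species label, no missing values) can be checked with a few lines of Python. This is a minimal sketch, not part of the original distribution: the function name `summarize_iris` and the column names in the sample header are assumptions, and the header in the actual iris.csv may differ. The sample holds three illustrative rows, one per species, rather than the full file.

```python
import csv
import io
from collections import Counter


def summarize_iris(csv_text, label_column="species"):
    """Return (n_rows, per-species counts, n_missing) for iris-like CSV text.

    label_column is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    n_rows = 0
    n_missing = 0
    for row in reader:
        n_rows += 1
        counts[row[label_column]] += 1
        # Treat absent cells, empty cells, and the R-style "NA" marker as missing.
        for name in reader.fieldnames:
            value = row[name]
            if value is None or value.strip() in ("", "NA"):
                n_missing += 1
    return n_rows, dict(counts), n_missing


# Illustrative sample with an assumed header, one row per species.
SAMPLE = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

print(summarize_iris(SAMPLE))
```

Run against the full file, the same function should report 150 rows, 50 per species, and no missing cells if the description above holds.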

External criteria for clustering quality

External variable that represents the underlying true clustering
Variable species
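
One way to use the species variable as an external criterion is to measure how well a clustering agrees with it. The source does not name a specific index; the sketch below assumes the adjusted Rand index as one common choice and implements it with the standard library only. The function name and the toy label vectors are illustrative, not taken from the data set.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same objects."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two objects: agreement is trivial

    # Pair counts within contingency-table cells, rows, and columns.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (maximum - expected)


# Toy example: three objects per species.
species = ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3
perfect = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # recovers the species exactly
merged = [0, 0, 0, 1, 1, 1, 1, 1, 1]    # merges versicolor and virginica

print(adjusted_rand_index(species, perfect))
print(adjusted_rand_index(species, merged))
```

A value of 1 means the clustering reproduces the species partition exactly, regardless of how the clusters are numbered; values near 0 indicate agreement no better than chance.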

Internal criteria for clustering quality: cluster membership

Number of clusters
3: number of species in the data set
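
As an illustration of fixing the number of clusters at 3, the sketch below runs a plain k-means (Lloyd's algorithm) with three starting centers. The source does not prescribe a clustering method, so the choice of k-means, the function name `kmeans`, and the starting centers are all assumptions; the six points are illustrative measurement rows (two per species), not the full data set.

```python
from math import dist


def kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd's algorithm; the number of clusters is len(initial_centers)."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center (Euclidean).
        new_labels = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Illustrative rows: (sepal length, sepal width, petal length, petal width) in cm.
points = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2),   # setosa
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5),   # versicolor
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9),   # virginica
]
# Assumed initialization: one starting center per species, so k = 3.
labels, centers = kmeans(points, [points[0], points[2], points[4]])
print(labels)
```

On the full data set the result depends on the initialization, so in practice several random starts are usually compared.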