LYMPHOMA
Distinct types of diffuse large B-cell lymphoma
identified by gene expression profiling.
Ash A. Alizadeh, Michael B. Eisen, R. Eric Davis, Chi Ma, Izidore
S. Lossos, Andreas Rosenwald, Jennifer C. Boldrick, Hajeer Sabet, Truc Tran, Xin
Yu, John I. Powell, Liming Yang, Gerald E. Marti, Troy Moore, James Hudson Jr,
Lisheng Lu, David B. Lewis, Robert Tibshirani, Gavin Sherlock, Wing C. Chan,
Timothy C. Greiner, Dennis D. Weisenburger, James O. Armitage, Roger Warnke,
Ronald Levy, Wyndham Wilson, Michael R. Grever, John C. Byrd, David Botstein,
Patrick O. Brown & Louis M. Staudt
- NATURE, VOL 403, Nº 3, pp. 503-511, February 2000.
- Web supplement to the article
- Complete database (96 instances x 4026 genes) in ARFF format, with labelled classes (DLBCL + 8 classes) [2Mb]
- Complete database (96 instances x 4026 genes) in ARFF format, with labelled classes (the 45 DLBCL instances are relabelled as Germinal Centre, GCL, or Activated, ACL, giving 11 classes in total) [1Mb]
- Reduced database (45 instances x 4026 genes) in ARFF format, with two labelled classes (Germinal Centre, GCL, and Activated, ACL) [1Mb]
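The ARFF files listed on this page are plain text: a header of `@attribute` declarations followed by an `@data` block with one comma-separated row per instance. As a minimal sketch of how to check a downloaded file against the stated dimensions (for example 45 instances x 4026 genes), the reader below parses a dense ARFF document using only the Python standard library. The function name `read_dense_arff`, the toy sample, and the assumption that the class label is the last attribute are illustrative and not taken from the files themselves; sparse rows and quoted attribute names containing spaces are not handled.

```python
import csv
import io


def read_dense_arff(text):
    """Parse a dense ARFF document into (attribute_names, rows).

    Simplified sketch: skips comments and the @relation line, and does not
    support sparse rows or quoted attribute names that contain spaces.
    Values are returned as strings; '?' marks a missing value in ARFF.
    """
    names, rows, in_data = [], [], False
    for raw in io.StringIO(text):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank line or ARFF comment
        lower = line.lower()
        if in_data:
            # One instance per line, values separated by commas.
            rows.append(next(csv.reader([line], skipinitialspace=True)))
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>": keep only the name.
            names.append(line.split(None, 2)[1])
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# Hypothetical miniature file, not taken from the real data.
SAMPLE = """\
% toy example
@relation toy
@attribute gene1 numeric
@attribute gene2 numeric
@attribute class {GCL,ACL}
@data
0.12,-0.40,GCL
?,1.05,ACL
"""

names, rows = read_dense_arff(SAMPLE)
n_instances = len(rows)
n_genes = len(names) - 1  # assumes the class label is the last attribute
print(n_instances, "instances x", n_genes, "genes")
```

For a real file, the same function could be applied to the text returned by `open(path).read()`, where `path` is wherever the download was saved.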
LEUKEMIA
Molecular Classification of Cancer: Class Discovery and
Class Prediction by Gene Expression Monitoring.
T.R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.
P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D.
Bloomfield, E. S. Lander.
- SCIENCE, VOL 286, pp. 531-537, 15 October 1999.
- Training database (38 instances x 7129 genes) in ARFF format, with labelled classes [1.2Mb]
- Test database (34 instances x 7129 genes) in ARFF format, with labelled classes [1.1Mb]
GLOBAL CANCER MAP
Multiclass cancer diagnosis using tumor gene expression
signatures.
S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M.
Angelo, C. Ladd, M. Reich, E. Latulippe, J.P. Mesirov, T. Poggio, W. Gerald, M.
Loda, E.S. Lander and T.R. Golub.
- PNAS, VOL 98, Nº 26, pp. 15149-15154, December 18, 2001.
- Training database (144 instances x 16063 genes) in ARFF format, with labelled classes [13.1Mb]
- Test database (46 instances x 16063 genes) in ARFF format, with labelled classes [4.5Mb]
DISCOVERY CHALLENGE ECML 2004
The dataset was prepared by downloading and processing various information from the SAGEmap website as of December 2002.
- Training database (90 instances x 27679 genes) in ARFF format, with labelled classes [8Mb]
- Information about the Tag identity and about the related gene.
- Information about the biological situations.
EMBRYONAL TUMOURS OF THE CENTRAL NERVOUS SYSTEM
Prediction of Central Nervous
System Embryonal Tumour Outcome based on Gene Expression.
Scott L. Pomeroy, Pablo Tamayo, Michelle Gaasenbeek, Lisa M.
Sturla, Michael Angelo, Margaret E. McLaughlin, John Y. H. Kim, Liliana C.
Goumnerova, Peter M. Black, Ching Lau, Jeffrey C. Allen, David Zagzag, James M.
Olson, Tom Curran, Cynthia Wetmore, Jaclyn A. Biegel, Tomaso Poggio, Shayan
Mukherjee, Ryan Rifkin, Andrea Califano, Gustavo Stolovitzky, David N. Louis,
Jill P. Mesirov, Eric S. Lander & Todd R. Golub
- NATURE, VOL 415, pp. 436-442, 24 January 2002.
- Dataset C (60 instances x 7129 genes) in ARFF format, with two labelled classes [1.85Mb]
COLON CANCER
Broad patterns of gene expression revealed by
clustering of tumor and normal colon tissues probed by oligonucleotide arrays.
U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra,
D. Mack, and A. J. Levine
- PNAS, VOL 96, Issue 12, pp. 6745-6750, 8 June 1999
- Dataset (62 instances x 2000 genes) in ARFF format, with labelled classes [1.1Mb]
YEAST
Systematic determination of genetic network
architecture.
S. Tavazoie, J.D. Hughes, M.J. Campbell, R.J. Cho, G.M. Church.
- Nature Genetics, 1999 Jul;22(3):281-5.
- Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church (Web supplement)
- Dataset (17 instances x 1884 genes) in ARFF format, no labelled classes [0.3Mb]
- From Rosetta Compendium: Dataset (300 instances x 6325 genes) in ARFF format, no labelled classes [10.5Mb]
STATE FAILURE
Advanced Data- and Knowledge-Driven Methods for State Failure Risk Assessment
State Failure Task Force-III
- Dataset (ARFF) [2.5Mb]
- Dataset (Excel) [10.4Mb]
