A stable iterative method for refining discriminative gene clusters

Min Xu, Mengxia Zhu, Louxin Zhang

Research output: Contribution to journal (article)

3 Citations (Scopus)

Abstract

Background: Microarray technology is often used to identify genes that are differentially expressed between two biological conditions. Because microarray datasets contain few samples and many genes, it is usually desirable to identify small gene subsets whose expression patterns differ clearly between sample classes. Such gene subsets are highly discriminative in phenotype classification because their features are tightly coupled. Unfortunately, classifiers built from them tend to generalize poorly to test samples because of overfitting.

Results: We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters iteratively. Our experiments on both simulated and real datasets show that the method produces a series of robust gene clusters with good classification performance compared with existing approaches.

Conclusion: This backward approach to refining a series of highly discriminative gene clusters for classification proves consistent and stable when applied to various types of training samples.

Original language: English
Article number: S18
Journal: BMC Genomics
Volume: 9
Issue number: SUPPL. 2
DOI: 10.1186/1471-2164-9-S2-S18
State: Published - 16 Sep 2008

Fingerprint

Multigene Family
Genes
Learning
Technology
Phenotype
Datasets

Cite this

Xu, Min; Zhu, Mengxia; Zhang, Louxin. / A stable iterative method for refining discriminative gene clusters. In: BMC Genomics. 2008; Vol. 9, No. SUPPL. 2.
@article{6d867d8dcdcc455586367579280617b4,
title = "A stable iterative method for refining discriminative gene clusters",
author = "Min Xu and Mengxia Zhu and Louxin Zhang",
year = "2008",
month = "9",
day = "16",
doi = "10.1186/1471-2164-9-S2-S18",
language = "English",
volume = "9",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central Ltd.",
number = "SUPPL. 2",
}

A stable iterative method for refining discriminative gene clusters. / Xu, Min; Zhu, Mengxia; Zhang, Louxin.

In: BMC Genomics, Vol. 9, No. SUPPL. 2, S18, 16.09.2008.

TY - JOUR

T1 - A stable iterative method for refining discriminative gene clusters

AU - Xu, Min

AU - Zhu, Mengxia

AU - Zhang, Louxin

PY - 2008/9/16

Y1 - 2008/9/16

UR - http://www.scopus.com/inward/record.url?scp=52249098414&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-9-S2-S18

DO - 10.1186/1471-2164-9-S2-S18

M3 - Article

C2 - 18831783

AN - SCOPUS:52249098414

VL - 9

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - SUPPL. 2

M1 - S18

ER -