Clustering of short time-course gene expression data with dissimilar replicates

Ozan Cinar; Ozlem Ilk; Cem Iyigun

doi:10.1007/s10479-017-2583-3

Corresponding author: Ozan Cinar

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Microarrays are used in genetics and medicine to examine large numbers of genes simultaneously through their expression levels under any condition such as a disease of interest. The information from these experiments can be enriched by following the expression levels through time and biological replicates. The purpose of this study is to propose an algorithm which clusters the genes with respect to the similarities between their behaviors through time. The algorithm is also aimed at highlighting the genes which show different behaviors between the replicates and separating the constant genes that keep their baseline expression levels throughout the study. Finally, we aim to feature cluster validation techniques to suggest a sensible number of clusters when it is not known a priori. The illustrations show that the proposed algorithm in this study offers a fast approach to clustering the genes with respect to their behavior similarities, and also separates the constant genes and the genes with dissimilar replicates without any need for pre-processing. Moreover, it is also successful at suggesting the correct number of clusters when that is not known.

Original language	English
Pages (from-to)	405-428
Number of pages	24
Journal	Annals of Operations Research
Volume	263
Issue number	1-2
DOIs	https://doi.org/10.1007/s10479-017-2583-3
Publication status	Published - 1 Apr 2018

Keywords

Microarray gene expression
Short time-series
Replication
Distance
Clustering
Cluster validation
SERIES DATA
MICROARRAY EXPERIMENTS
FORECAST DENSITIES
DNA MICROARRAY
CELL-CYCLE
PROFILES
PATTERNS
MODEL
CLASSIFICATION
IDENTIFICATION

Cite this

@article{a860a686dd794de38f42efea527e1c0d,
title = "Clustering of short time-course gene expression data with dissimilar replicates",
abstract = "Microarrays are used in genetics and medicine to examine large numbers of genes simultaneously through their expression levels under any condition such as a disease of interest. The information from these experiments can be enriched by following the expression levels through time and biological replicates. The purpose of this study is to propose an algorithm which clusters the genes with respect to the similarities between their behaviors through time. The algorithm is also aimed at highlighting the genes which show different behaviors between the replicates and separating the constant genes that keep their baseline expression levels throughout the study. Finally, we aim to feature cluster validation techniques to suggest a sensible number of clusters when it is not known a priori. The illustrations show that the proposed algorithm in this study offers a fast approach to clustering the genes with respect to their behavior similarities, and also separates the constant genes and the genes with dissimilar replicates without any need for pre-processing. Moreover, it is also successful at suggesting the correct number of clusters when that is not known.",
keywords = "Microarray gene expression, Short time-series, Replication, Distance, Clustering, Cluster validation, SERIES DATA, MICROARRAY EXPERIMENTS, FORECAST DENSITIES, DNA MICROARRAY, CELL-CYCLE, PROFILES, PATTERNS, MODEL, CLASSIFICATION, IDENTIFICATION",
author = "Ozan Cinar and Ozlem Ilk and Cem Iyigun",
year = "2018",
month = apr,
day = "1",
doi = "10.1007/s10479-017-2583-3",
language = "English",
volume = "263",
pages = "405--428",
journal = "Annals of Operations Research",
issn = "0254-5330",
publisher = "Springer",
number = "1-2",
}

TY - JOUR
T1 - Clustering of short time-course gene expression data with dissimilar replicates
AU - Cinar, Ozan
AU - Ilk, Ozlem
AU - Iyigun, Cem
PY - 2018/4/1
Y1 - 2018/4/1
N2 - Microarrays are used in genetics and medicine to examine large numbers of genes simultaneously through their expression levels under any condition such as a disease of interest. The information from these experiments can be enriched by following the expression levels through time and biological replicates. The purpose of this study is to propose an algorithm which clusters the genes with respect to the similarities between their behaviors through time. The algorithm is also aimed at highlighting the genes which show different behaviors between the replicates and separating the constant genes that keep their baseline expression levels throughout the study. Finally, we aim to feature cluster validation techniques to suggest a sensible number of clusters when it is not known a priori. The illustrations show that the proposed algorithm in this study offers a fast approach to clustering the genes with respect to their behavior similarities, and also separates the constant genes and the genes with dissimilar replicates without any need for pre-processing. Moreover, it is also successful at suggesting the correct number of clusters when that is not known.
AB - Microarrays are used in genetics and medicine to examine large numbers of genes simultaneously through their expression levels under any condition such as a disease of interest. The information from these experiments can be enriched by following the expression levels through time and biological replicates. The purpose of this study is to propose an algorithm which clusters the genes with respect to the similarities between their behaviors through time. The algorithm is also aimed at highlighting the genes which show different behaviors between the replicates and separating the constant genes that keep their baseline expression levels throughout the study. Finally, we aim to feature cluster validation techniques to suggest a sensible number of clusters when it is not known a priori. The illustrations show that the proposed algorithm in this study offers a fast approach to clustering the genes with respect to their behavior similarities, and also separates the constant genes and the genes with dissimilar replicates without any need for pre-processing. Moreover, it is also successful at suggesting the correct number of clusters when that is not known.
KW - Microarray gene expression
KW - Short time-series
KW - Replication
KW - Distance
KW - Clustering
KW - Cluster validation
KW - SERIES DATA
KW - MICROARRAY EXPERIMENTS
KW - FORECAST DENSITIES
KW - DNA MICROARRAY
KW - CELL-CYCLE
KW - PROFILES
KW - PATTERNS
KW - MODEL
KW - CLASSIFICATION
KW - IDENTIFICATION
U2 - 10.1007/s10479-017-2583-3
DO - 10.1007/s10479-017-2583-3
M3 - Article
SN - 0254-5330
VL - 263
SP - 405
EP - 428
JO - Annals of Operations Research
JF - Annals of Operations Research
IS - 1-2
ER -