Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02

Parallel Computing for Biological Data


Listen Later

In the 1990s a number of technological innovations appeared that revolutionized biology, and 'Bioinformatics' became a new scientific discipline. Microarrays can measure the abundance of tens of thousands of mRNA species, data on the
complete genomic sequences of many different organisms are available, and other technologies make it possible to study various processes at the molecular level. In Bioinformatics and Biostatistics, current research and computations are limited by the available computer hardware. However, this problem can be solved using high-performance computing resources. There are several reasons for the increased focus on high-performance computing: larger data sets,
increased computational requirements stemming from more sophisticated methodologies, and latest developments in computer chip production.
The open-source programming language 'R' was developed to provide a powerful and extensible environment for statistical and graphical techniques. There are many
good reasons for preferring R to other software or programming languages for scientific computations (in statistics and biology). However, the development of
the R language was not aimed at providing a software for parallel or high-performance computing. Nonetheless, during the last decade, a great deal of research has been conducted on using parallel computing techniques with R.
This PhD thesis demonstrates the usefulness of the R language and parallel computing for biological research. It introduces parallel computing with R, and reviews and evaluates existing techniques and R packages for parallel computing on Computer Clusters, on Multi-Core Systems, and in Grid Computing. From a computer-scientific point of view the packages were examined as to their reusability in biological applications, and some upgrades were proposed.
Furthermore, parallel applications for next-generation sequence data and preprocessing of microarray data were developed. Microarray data are characterized by high levels of noise and bias. As these perturbations have to be removed, preprocessing of raw data has been a research topic of high priority over the past few years. A new Bioconductor package called affyPara for parallelized preprocessing of high-density oligonucleotide microarray data was developed and published. The partition of data can be performed on arrays using a block cyclic partition, and, as a result, parallelization of algorithms becomes directly
possible. Existing statistical algorithms and data structures had to be adjusted and reformulated for the use in parallel computing. Using the new parallel infrastructure, normalization methods can be enhanced and new methods became available. The partition of data and distribution to several nodes or processors solves the main memory problem and accelerates the methods by up to the factor fifteen for 300 arrays or more.
The final part of the thesis contains a huge cancer study analysing more than 7000 microarrays from a publicly available database, and estimating gene interaction networks. For this purpose, a new R package for microarray data management was developed, and various challenges regarding the analysis of this amount of data are discussed. The comparison of gene networks for different
pathways and different cancer entities in the new amount of data partly confirms already established forms of gene interaction.
...more
View all episodesView all episodes
Download on the App Store

Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02By Ludwig-Maximilians-Universität München

  • 5
  • 5
  • 5
  • 5
  • 5

5

1 ratings


More shows like Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02

View all
Tonspur Forschung by Annik Rubens

Tonspur Forschung

3 Listeners

Einführung in die Ethnologie by Prof. Dr. Frank Heidemann

Einführung in die Ethnologie

0 Listeners

Theoretical Physics Schools (ASC) by The Arnold Sommerfeld Center for Theoretical Physics (ASC)

Theoretical Physics Schools (ASC)

2 Listeners

MCMP – Mathematical Philosophy (Archive 2011/12) by MCMP Team

MCMP – Mathematical Philosophy (Archive 2011/12)

6 Listeners

Hegel lectures by Robert Brandom, LMU Munich by Robert Brandom, Axel Hutter

Hegel lectures by Robert Brandom, LMU Munich

6 Listeners

MCMP – Metaphysics and Philosophy of Language by MCMP Team

MCMP – Metaphysics and Philosophy of Language

2 Listeners

MCMP – Philosophy of Science by MCMP Team

MCMP – Philosophy of Science

1 Listeners

Sommerfeld Lecture Series (ASC) by The Arnold Sommerfeld Center for Theoretical Physics (ASC)

Sommerfeld Lecture Series (ASC)

0 Listeners

MCMP by MCMP Team

MCMP

2 Listeners

Women Thinkers in Antiquity and the Middle Ages - SD by Peter Adamson

Women Thinkers in Antiquity and the Middle Ages - SD

0 Listeners