Objective
The genomic era has resulted in an explosion of biological data thanks to huge advances in the fields of molecular biology and genomics. Now, in the post-genomic era, researchers in these areas have been overwhelmed by experimental data. The field of bioinformatics is dedicated to working with this data and can be used to answer some very interesting and important questions.
A key component of all cancer treatment is to correctly diagnose the patient's cancer subtype so that the optimal treatment can be provided. Since the most effective treatment can vary dramatically between cancer subtypes, giving the correct diagnosis is a very important step in the road to recovery. The aim of this project was to focus on classifying data into breast cancer subtypes.
This project was written in Python as an experiment to explore the effectiveness of current open-source software solutions for scientists. In addition, a framework was developed to enable very easy and flexible use of the cloud-based Platform as a Service (PaaS) provider, Amazon Web Services (AWS). This service was utilised for any large simulations. This involved the dynamic initialisation of virtual servers and then distributing work on anyway up to 50 servers concurrently. Developing such a system allows for a potential unlimited decrease in runtime for parrallelisable simulations.
A key component of all cancer treatment is to correctly diagnose the patient's cancer subtype so that the optimal treatment can be provided. Since the most effective treatment can vary dramatically between cancer subtypes, giving the correct diagnosis is a very important step in the road to recovery. The aim of this project was to focus on classifying data into breast cancer subtypes.
This project was written in Python as an experiment to explore the effectiveness of current open-source software solutions for scientists. In addition, a framework was developed to enable very easy and flexible use of the cloud-based Platform as a Service (PaaS) provider, Amazon Web Services (AWS). This service was utilised for any large simulations. This involved the dynamic initialisation of virtual servers and then distributing work on anyway up to 50 servers concurrently. Developing such a system allows for a potential unlimited decrease in runtime for parrallelisable simulations.