Thesis Proposals
Master Thesis proposals
Some preliminary considerations on the proposed thesis topics
Themes proposed for 2012/2013

Self-tuning data replication in large scale transactional data grids

Area
Distributed systems, data replication, autonomic computing

Context
This thesis will focus on large scale transactional data platforms, such as Cassandra, Infinispan and Coherence. To maximize scalability, these platforms rely on genuine partial replication mechanisms, which place a static bound on the number of copies of each data item in the system and use random hashing techniques to scatter the data uniformly across the nodes of the platform. The downside of these approaches is that they fail to take into account the data access locality of applications, which dramatically increases the probability of incurring expensive network communication to fetch data from other nodes while processing.

Objectives
The objective of the thesis is the design, development and evaluation of locality-aware data replication techniques that self-tune the placement of data replicas across the platform in order to maximize data locality and hence application performance. The self-tuning mechanism will have to deal with three main challenges:
Requirements
I strongly encourage potential candidates to arrange a short meeting to discuss the details of the proposal before applying. Simply send me an email to schedule a meeting.

Expected Results
International collaborations
This thesis work will be carried out in the scope of the European project Cloud-TM, which aims to develop a self-optimizing middleware platform that simplifies the development and administration of applications deployed on cloud computing infrastructures. The Cloud-TM consortium is composed of international representatives of academia (IST and CINI) and industry (Red Hat, Algorithmica), giving the student the opportunity to come into contact with international experts and to work on challenging, cutting-edge topics that are of interest to a very broad community. The results of this thesis will be integrated with one of the mainstream open source transactional data grids, namely Infinispan (www.infinispan.org) by Red Hat, which is also a partner of Cloud-TM. The thesis will provide plenty of occasions to collaborate closely with the Infinispan developers' team and to contribute code to some core components of the Cloud-TM platform and/or of Infinispan.

Possibility of Scholarships
A scholarship will be provided by the Cloud-TM project to support this thesis work.

Elastic auto scaling of transactional data grids in cloud environments

Area
Distributed Systems, Cloud Computing, Capacity Planning

Context
Over the last few years, Cloud Computing has emerged as a disruptive paradigm for the future generation of IT services. In the cloud, resources are dispensed “elastically”, with a seemingly unbounded amount of computational power and storage available on demand, under a pay-only-for-what-you-use pricing model. Just as the electric grid revolutionized access to electricity one hundred years ago, freeing corporations from having to generate their own power and enabling them to concentrate on their business differentiators, cloud computing is hailed as revolutionizing IT, freeing corporations from large IT capital investments and enabling them to plug into extremely powerful computing resources over the network.
Data management in cloud computing environments is currently one of the hottest research areas, both in the academic and industrial communities. This thesis will focus on the area of elastic transactional data grids, namely distributed transactional data platforms that are capable of dynamically adjusting their scale (number of nodes) to meet the characteristics of the incoming workload.

Objectives
The objective of this thesis is to build a "Transactional AutoScaler" (TAS), namely a module in charge of elastically scaling a transactional data grid on the basis of the actual workload demands. TAS will consist of two main modules:
Methodologies that will be employed/learnt during the thesis
The performance forecasting models will be based both on analytical methods, e.g. queuing theory or stochastic modeling techniques, and on machine learning tools, e.g. neural networks, decision trees, Q-learning. The student is not expected to have a background in the above areas, and will be assisted in learning their theoretical foundations and the tools that exploit them.

International Collaborations
This thesis work will be carried out in the scope of the European project Cloud-TM, which aims to develop a self-optimizing middleware platform that simplifies the development and administration of applications deployed on cloud computing infrastructures. The Cloud-TM consortium is composed of international representatives of academia (IST and CINI) and industry (Red Hat, Algorithmica), giving the student the opportunity to come into contact with international experts and to work on challenging, cutting-edge topics that are of interest to a very broad community. TAS will be integrated with one of the mainstream open source transactional data grids, namely Infinispan by Red Hat, which is also a partner of Cloud-TM. The thesis will provide plenty of occasions to collaborate closely with the Infinispan team and to contribute code to some core components of the Cloud-TM platform.

Possibility of Scholarships
The Cloud-TM project will provide a scholarship to support this thesis work.
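
Illustrative sketch (not part of the proposal itself)
To give prospective candidates a flavour of the analytical forecasting methods mentioned above for TAS, the Python sketch below sizes a cluster with a textbook M/M/c queue (Erlang C formula). It is only a minimal illustration under strong assumptions: the data grid is treated as c identical, independent servers with Poisson arrivals and exponential service times, and data contention, replication costs and network effects are ignored entirely. The function names and parameters (erlang_c, mean_response_time, min_nodes, max_nodes) are hypothetical and do not come from Cloud-TM or Infinispan.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving request has to wait in an M/M/c queue.

    offered_load is arrival_rate / service_rate (in Erlangs).
    """
    if offered_load >= servers:
        return 1.0  # unstable system: every request ends up waiting
    term = 1.0      # holds offered_load**k / k!, starting at k = 0
    partial = 0.0   # running sum of the terms for k = 0 .. servers - 1
    for k in range(servers):
        partial += term
        term *= offered_load / (k + 1)
    # After the loop, term == offered_load**servers / servers!
    tail = term * servers / (servers - offered_load)
    return tail / (partial + tail)


def mean_response_time(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (waiting + service) for an M/M/c queue."""
    offered_load = arrival_rate / service_rate
    if offered_load >= servers:
        return math.inf
    waiting = erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)
    return waiting + 1.0 / service_rate


def min_nodes(arrival_rate: float, service_rate: float,
              target_response_time: float, max_nodes: int = 200):
    """Smallest node count whose predicted mean response time meets the target.

    Returns None if no count up to max_nodes (an assumed search bound) suffices.
    """
    for n in range(1, max_nodes + 1):
        if mean_response_time(n, arrival_rate, service_rate) <= target_response_time:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical workload: 8 requests/s, each node serving 1 request/s on average.
    print(min_nodes(arrival_rate=8.0, service_rate=1.0, target_response_time=1.5))
```

An autoscaler built along these lines would re-run min_nodes whenever the measured arrival rate changes and add or remove nodes accordingly. A realistic model for a transactional data grid would also have to capture conflicts between transactions and the cost of state transfer during scaling, which is where the stochastic modeling and machine learning techniques listed above come in.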