Working inside SCALUS
SCALUS is a Marie Curie Initial Training Network (MCITN) and therefore focuses on the education of early-stage researchers (so-called ESRs) within the first four years after obtaining a bachelor's or master's degree. By joining SCALUS you will work for three years in full-time employment with top researchers in the field of storage systems (cluster, grid, and cloud storage). Depending on the position you choose (see the list of open positions below), you will be employed by one of the partners that form this Marie Curie Initial Training Network. You will be part of a joint research effort by all 16 partners and will cooperate with scientists across Europe. To enhance this cooperation and to develop your complementary skills, you will also have the opportunity to attend summer schools, workshops, and conferences and to take part in secondments and internships.
Who can apply?
Early-stage researchers (ESRs) are defined as those in the first four years (full-time equivalent) of their research careers, starting at the date of obtaining the degree which would formally entitle them to embark on a doctorate, either in the country in which the degree was obtained or in the country in which the research training is provided, irrespective of whether or not a doctorate is envisaged. Researchers supported by an ITN are normally required to undertake transnational mobility (i.e. move from one country to another) when taking up an appointment.
Example A: a researcher graduated with a first degree in biology in 2004 and would like to start her Ph.D. studies in 2007. She is eligible as an ESR within the ITN as she has less than four years of research experience and no Ph.D.
Example B: a researcher has already been working as a researcher in industry for two years since graduating with his first degree in chemistry. He would be able to benefit from participation in an ITN as an ESR even without pursuing a Ph.D. degree.
How to apply?
Recruitment has started and will be open at least until April 15th, 2010 or until all positions are filled. Click here to enter the submission site.
Contact
In case of any questions, please do not hesitate to contact us.
Open Positions
ESR 1: Accelerators for Cluster and Grid Storage
Hardware accelerators like FPGAs and GPGPUs are able to increase the performance of a number of storage-related tasks, such as the encoding or encryption of data. Nevertheless, investigating the complete IO stack often shows that the integration of these accelerators delivers only a small speed-up or even leads to a slow-down. The research of this ESR at the University of Paderborn and University of Bielefeld will combine hardware-oriented research with operating systems research. The ideal candidates would have developed familiarity with computer systems and computer architecture, and good systems building skills in low-level systems software (e.g. kernel modules, block-level storage systems, controller firmware) as well as in hardware description languages such as VHDL.
ESR 2: QoS Schemes for Large Scale Storage Systems
Quality of Service fulfilment is important in both local and globally distributed storage environments. Locally, greedy clients are able to overstrain the storage capabilities of cluster file systems. Globally, storage services also have to cope with network delays as well as environmental noise. The research of this ESR at the University of Paderborn will combine operating systems oriented research with theoretical work on the combination of deterministic and randomized data distribution strategies for parallel, Grid, and Cloud file systems.
ESR 3: Autonomic Replica Management
Caching and replication are two very similar concepts that traditionally have been kept separate in large distributed environments. The objective of this work, taking place at the Barcelona Supercomputing Center, is to use client-side caching (both memory and disk) to improve redundancy and performance. In addition, replication will target not only read-only files but also read-write files. The student assigned to this topic will have to propose data distribution and location mechanisms that integrate both caching and replication in an intelligent and automatic way. Finally, it is important that the proposed mechanisms scale to WAN and allow better cooperation between resources.
ESR 4: Placement Algorithms merging Deterministic and Randomized Approaches
Keeping the load balanced among the different disks in a large storage system is not always easy, especially if new disks are added every few months. So far, two different mechanisms have been proposed to achieve this load balance: a deterministic placement similar to RAID placement policies and a randomized mechanism based on pseudo-random algorithms. Both mechanisms have advantages and problems. The student assigned to this topic and working at the Barcelona Supercomputing Center will have to find mixed mechanisms and come up with a data distribution strategy that has most of the advantages and (if possible) none of the problems.
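To make the trade-off between the two families concrete, the following minimal sketch compares a deterministic striping rule with a pseudo-random rule when one disk is added. It is an illustration only and not a method used by the project: the function names are hypothetical, and rendezvous (highest-random-weight) hashing is chosen here merely as one well-known example of pseudo-random placement.

```python
import hashlib
from collections import Counter


def striped_disk(block_id: int, num_disks: int) -> int:
    """Deterministic, RAID-0-like striping: block i is stored on disk i mod n."""
    return block_id % num_disks


def hashed_disk(block_id: int, num_disks: int) -> int:
    """Pseudo-random placement (assumed stand-in: rendezvous hashing).

    Every (block, disk) pair gets a pseudo-random weight; the block is
    stored on the disk with the highest weight.
    """
    def weight(disk: int) -> int:
        digest = hashlib.sha256(f"{block_id}:{disk}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(range(num_disks), key=weight)


def moved_fraction(place, num_blocks: int, old_disks: int, new_disks: int) -> float:
    """Fraction of blocks whose disk changes when the disk count changes."""
    moved = sum(
        place(block, old_disks) != place(block, new_disks)
        for block in range(num_blocks)
    )
    return moved / num_blocks


def load_spread(place, num_blocks: int, num_disks: int) -> int:
    """Difference between the fullest and the emptiest disk, in blocks."""
    loads = Counter(place(block, num_disks) for block in range(num_blocks))
    per_disk = [loads.get(disk, 0) for disk in range(num_disks)]
    return max(per_disk) - min(per_disk)


if __name__ == "__main__":
    blocks = 7200
    for name, place in (("striped", striped_disk), ("hashed", hashed_disk)):
        print(
            name,
            "spread on 8 disks:", load_spread(place, blocks, 8),
            "moved when growing 8 -> 9:", moved_fraction(place, blocks, 8, 9),
        )
```

In this sketch the striping rule balances the load almost perfectly but relocates most blocks when a disk is added, whereas the hashed rule moves blocks only onto the new disk at the cost of a less even distribution. A mixed strategy of the kind sought in this position would aim to combine both properties.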
ESR 5: Load Balancing and Balls-into-Bins Games
The successful candidate will be required to have a strong background in theoretical computer science and mathematics. They will be working mostly on theoretically analysing random allocation strategies ("balls-into-bins" games) in non-standard models, where e.g. the balls or bins may be non-uniform, or the balls' choices may depend on each other. They will be expected to interact with the local research group at Durham University. Even though the focus of this position is theoretical, the ultimate goal of the overall project will be the design and implementation of a data storage system, and as such the candidate will obviously be expected to interact with the remaining collaborators in the project, and, in particular, will have to have a strong focus on, and interest in, applicability and practicality issues.
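For readers unfamiliar with balls-into-bins games, the following minimal sketch simulates the standard uniform model, in which each ball (e.g. a data block) is thrown into a bin (e.g. a disk). It is an illustration only: the function name and parameters are hypothetical, and the "several random choices" variant is a textbook example rather than one of the non-standard models this position will study.

```python
import random


def allocate(num_balls: int, num_bins: int, choices: int = 1, seed: int = 0) -> list[int]:
    """Throw balls into bins and return the final load of every bin.

    Each ball samples `choices` bins uniformly at random (with replacement)
    and is placed into the least loaded of the sampled bins.
    With choices=1 this is the classical single-choice game.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    loads = [0] * num_bins
    for _ in range(num_balls):
        candidates = [rng.randrange(num_bins) for _ in range(choices)]
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return loads


if __name__ == "__main__":
    n = 10_000
    for d in (1, 2):
        loads = allocate(n, n, choices=d)
        print(f"choices={d}: maximum load {max(loads)}, empty bins {loads.count(0)}")
```

Comparing the maximum load for one and two choices shows how strongly a small change in the allocation rule can affect the balance, which is the kind of effect that a theoretical analysis makes precise.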
ESR 6: Vulnerability measures for networks
The successful candidate will be required to have a strong background in theoretical computer science and discrete mathematics. They will be working mostly on theoretically investigating structural properties of graphs (that represent data or relations between data sets) and developing and analysing new vulnerability measures that suit this type of graph, as well as computational complexity issues related to these measures. They will be expected to interact with the local research group at Durham University. Even though the focus of this position is theoretical, the ultimate goal of the overall project will be the design and implementation of a data storage system, and as such the candidate will obviously be expected to interact with the remaining collaborators in the project, and, in particular, will have to have a strong focus on, and interest in, applicability and practicality issues.
ESR 7: High-Throughput Communication Architectures
Fault tolerance and security are crucial issues for modern storage systems. In order to increase the performance of costly encoding or encryption operations, the use of external accelerators (GPUs, FPGAs) as well as the use of built-in features of modern multi- and many-core systems are of research interest. The objective of this topic is the development of acceleration strategies for storage systems with a special emphasis on efficiency and portability. The ESR assigned to this topic will be working at the Goethe University Frankfurt am Main.
ESR 8: Load Balancing and Scheduling in Cluster File Systems
Parallel applications have different access profiles to I/O resources. However, current systems usually exhibit only static configurations with respect to compute and I/O resources, so the given resources are often not used optimally. The research team at the University of Hamburg investigates the question of how load balancing can be introduced into the cluster environment and the cluster file system. With an optimal strategy, the parallel job should be provided with just the I/O bandwidth that is necessary to satisfy its needs. This can be accomplished either by assigning an appropriate number of compute and I/O nodes to the job or by implementing mechanisms inside the cluster file system.
ESR 9: Block-level Scalable Storage System Virtualization
One junior researcher position (Early Stage Researcher, ESR) in the area of scalable storage systems is available in the CARV Laboratory of the Institute of Computer Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece. The area of storage systems is currently undergoing significant change due to changes in underlying technologies that open exciting possibilities. The ideal candidates would hold a Bachelor's or Master's degree in computer science, computer engineering or related fields, would have developed familiarity with computer systems and computer architecture, and good systems building skills in low-level systems software (e.g. kernel modules, network protocols, file systems, block-level storage systems, controller firmware).
ESR 10: File-level Scalable Storage System Virtualization
One junior researcher position (Early Stage Researcher, ESR) in the area of scalable storage systems is available in the CARV Laboratory of the Institute of Computer Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece (http://www.ics.forth.gr). The area of storage systems is currently undergoing significant change due to changes in underlying technologies that open exciting possibilities. The ideal candidates would hold a Bachelor's or Master's degree in computer science, computer engineering or related fields, would have developed familiarity with computer systems and computer architecture, and good systems building skills in low-level systems software (e.g. kernel modules, network protocols, file systems, block-level storage systems, controller firmware).
ESR 11: Autonomic Management of Data Grids and Cloud Storage
Data Grids and Cloud Storage are environments characterized by their complexity and dynamism. Resource management in such large-scale environments is therefore complex, and decisions taken at one time may become counterproductive a short time later. Knowing the future state of the environment is essential to implement policies tailored to each situation. The student assigned to this topic at Universidad Politecnica de Madrid will have to develop an autonomous prediction system that enables the enhanced management of storage resources in these environments.
ESR 12: High-performance, Secure and Fault Tolerant Grid File System
Fault tolerance is an issue that has not been adequately addressed in large-scale storage environments, due to their complexity. Knowing the behavior of these environments can help prevent certain failures, which results in improved fault tolerance. The student assigned to this topic at Universidad Politecnica de Madrid will develop strategies for fault tolerance based on the modeling of large-scale storage environments.
ESR 13: Toward an autonomic federation of distributed file systems
The goal of this thesis is to design and implement a generic framework that aims at federating remote distributed file systems around a common grid infrastructure. After analyzing how the file systems available on each site can be efficiently exploited at the grid level, the innovative part will consist in using new programming approaches such as the Component/AOP ones to design and implement a generic framework. The PhD applicant will be integrated into the ASCOLA research group from the Ecole des Mines de Nantes in France. Candidates should be skilled in operating systems, distributed systems, and systems programming.
ESR 14: Towards a scalable, fault-tolerant, self-adaptive Cloud file system
Data-intensive applications running on Cloud infrastructures require features such as file sharing among rented virtual machines with a high throughput under heavy access concurrency. Such features are not fully supported by today’s Cloud storage systems like S3, which is used in the EC2 Cloud. The KerData research team of INRIA Rennes - Bretagne Atlantique designs and implements BlobSeer, a generic data-sharing platform which aims at providing support for storing massive data with fine-grained access control under heavy concurrency on large-scale distributed infrastructures. The goal of this PhD thesis is to explore the possibility of using BlobSeer as a storage substrate for a higher-level, scalable, fault-tolerant, self-adaptive cloud file system optimized for high-throughput, massively parallel data processing.
ESR 15: Grid-wide platform for computations on large datasets
Grid-wide and cloud-wide parallel computations currently access and share data through various communication frameworks and through storage services such as global file systems and various cloud storage services. In addition to completely different APIs, some of the approaches require careful alignment of the data distribution to the nodes being used. This student will integrate the data distribution and caching mechanisms, grid file system, and hybrid RAM-based/secondary-storage data sharing, all developed by other students in this MCITN. The resulting computational platform will enable the application developer to focus on computational aspects and to simply use the data through a set of unified APIs. The position is available at XLAB.
ESR 16: Definition of scalable solutions for distributed caching in large cluster file systems
High performance and scalability require highly efficient usage of the capabilities of a distributed storage system and adequate means for applications and users to manage those capabilities and data objects in the right way. The definition of the respective mechanisms, protocols and interfaces must take into account the SCALUS approach to the storage architecture (cluster, grid, and cloud paradigms). The main task is to define scalable solutions for distributed caching in large cluster file systems. The position is available at Fujitsu Technology Solutions.


