GSoC/GCI Archive
Google Summer of Code 2009

The Globus Alliance

Web Page: http://dev.globus.org/wiki/Google_Summer_of_Code_2009_Ideas

Mailing List: There is no single main mailing list. See http://dev.globus.org/wiki/Mailing_Lists for a list of mailing lists. "gt-user" is, however, the mailing list that tends to attract most general queries (https://lists.globus.org/mailman/listinfo/gt-user)

The Globus Alliance is a community of organizations and individuals developing fundamental technologies behind the "Grid," which lets people share computing power, databases, instruments, and other on-line tools securely across corporate, institutional, and geographic boundaries without sacrificing local autonomy.

Since its creation in 1996, the Globus Alliance has been committed to developing open source software, although development was initially carried out by a small number of university research groups. However, since transitioning in 2005 to an open governance model (http://dev.globus.org/), derived from Apache's Jakarta project, the scope of participants has widened to include many more groups around the world, including companies and individuals.

Globus currently hosts more than 20 projects, actively developed by a community of more than 100 committers and spanning a variety of technology concerns in grid systems: common runtime, data management, information management, security, and documentation. Additionally, members of the Globus community can propose new projects which, after an "incubation" process (http://dev.globus.org/wiki/Incubator/Incubator_Process), can be promoted to full Globus projects. There are currently 25 active projects in incubation.

Projects

  • Ajax web interface for Globus Toolkit services: This project will develop a set of JavaScript APIs that enable access to Globus from a web client using Ajax technologies. The framework is based on a Service Oriented Architecture and will consist of a backend service that mediates service requests to the Globus Toolkit, as well as a web client that accesses the web service through Ajax technologies. Upon completion of this project, a Grid developer or user will be able to interact with a Grid that uses Globus from inside a web browser.
  • Develop a GUI for GridWay: The current GridWay environment does not provide users with a graphical interface for monitoring and submitting jobs. The interface can be built with GTK+ (the GIMP Toolkit) on top of the existing DRMAA API infrastructure (C bindings). The existing command-line interfaces will be supported in the graphical user interface, so that users can compose, manage, synchronise, and control their jobs by clicking in the graphical interface instead of typing commands.
  • Develop a new 'sync' feature for GridFTP client, globus-url-copy: This project adds a 'sync' feature to globus-url-copy that optimizes the data transfer required to synchronize source and destination files. It minimizes the amount of data transferred by sending only the changed or required sections of a file. This makes effective use of bandwidth by avoiding redundant data transfers and reduces the time taken for synchronization.
  • Distribution of computing jobs among different clouds (Nimbus, AWS): So far, a proof-of-principle job system has been developed for ATLAS computing (LHC at CERN, Geneva) using Amazon Web Services. The idea is to make the system compatible with Nimbus, so that jobs can easily be distributed among different clouds. Additionally, integrating GANGA (which provides a standardized job description language) into the system would be convenient.
  • Globus XIO Checksum Driver: This project aims to produce a driver for Globus XIO that will checksum data streams. The driver will have two modes of operation: 1) Checksum the stream of data as it passes through the driver and report the final value on close. The endpoints will then verify out-of-band that the checksums match. 2) Checksum each block of data and add the checksum to a block header before it is transmitted. If the checksums do not match, the driver will either close the connection or request a retransmission.
  • GridWay + GoogleMaps web interface: The goal is to develop a web interface with a geographical representation of the GridWay resources using a GoogleMaps mashup, including useful information such as usage statistics, workload, pending jobs, queue size, etc. It will also contain options that let the user filter and select which information is relevant (where the user's active jobs are, or what the status of the submitted jobs is...). This map could also be used as an interface to submit new jobs to certain hosts.
  • Multiple Cluster support for Nimbus: The Nimbus software does not have the capability to schedule requests across multiple clusters providing cloud resources, as highlighted in the Globus project idea "ClusterBuilder: Distributed Virtual Environments over the Clouds". The summer project involves two parts. Part 1 will create a tool to publish information about the Nimbus cloud resources at a particular site to a registry (likely Globus MDS). Part 2 will create a tool to schedule requests to multiple Nimbus clouds.
  • Performance characterization of GridFTP on 10+ Gigabit networks using hosts with 10 Gigabit network interface cards: In this project, I will characterize the performance of Globus GridFTP transfers over both the TCP and UDT protocols for hosts connected at 10 Gb/s over a wide area network, and compare the performance of GridFTP using each. Standard disk-to-disk GridFTP transfers will be tested, as well as memory-to-memory transfers. Characterization will include detailed measurements of the speed and latency of the transfers, as well as resource utilization at the source and destination hosts.
  • PSK Globus XIO driver: This project involves a large amount of research into security protocols, to find the best way to establish that a pre-shared key is known by both communicating endpoints, and into different encryption techniques to ensure data integrity. The XIO driver will be written in C so that it accepts or rejects a connection based on the key. An additional protocol will check that both sides have the same pre-shared key once the connection has been made.
  • The Taverna plug-in for constructing CQL queries to caGrid cancer research data services: Researchers use the Taverna workbench to create workflows that include calls to web services. In the case of caGrid data services, access to the data may be too complicated for end users because queries must be written in CQL. The plug-in will be a great help for biologists in that case, allowing them to focus on their work rather than on correct query construction. Besides the basic functionality of constructing CQL queries, the plug-in will offer additional conveniences for users.
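
The two modes described for the Globus XIO Checksum Driver project can be illustrated with a minimal sketch. This is not the Globus XIO driver API, and the project itself targets C; the function names, the CRC32 checksum, and the 8-byte header layout (payload length followed by checksum) are all assumptions chosen only to make the idea concrete.

```python
import struct
import zlib

# Hypothetical block header (an assumption, not a Globus XIO format):
# 4-byte payload length followed by 4-byte CRC32, both big-endian.
HEADER = struct.Struct("!II")


def stream_checksum(chunks):
    """Mode 1: update a running checksum as data passes through and
    return the final value, to be compared out-of-band on close."""
    running = 0
    for chunk in chunks:
        running = zlib.crc32(chunk, running)
    return running


def frame_block(payload: bytes) -> bytes:
    """Mode 2, sender side: prepend a header carrying the block checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unframe_block(frame: bytes) -> bytes:
    """Mode 2, receiver side: verify the checksum before passing data on.

    Raises ValueError on a mismatch; a real driver would instead close
    the connection or request a retransmission at this point.
    """
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than header")
    length, expected = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated block")
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch")
    return payload
```

In this sketch the per-block mode detects corruption as soon as a damaged block arrives, whereas the stream mode only reveals a mismatch once both endpoints have compared their final values.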