Funded by the National Science Foundation (IIS-0836560)
PI: Jimmy Lin, Co-PI: Philip Resnik
Note: This project concluded in June 2011. This website is no longer actively maintained, and is available primarily for archival purposes.
In October 2007, Google and IBM jointly announced the Academic Cloud Computing Initiative (ACCI), with the goal of helping both researchers and students address the challenges of "web-scale" computing. The initiative revolves around Google's MapReduce programming framework, which represents a proven approach to tackling data-intensive problems in a distributed manner. Six universities were involved in the collaboration at the outset: Carnegie Mellon University, the Massachusetts Institute of Technology, Stanford University, the University of California at Berkeley, the University of Maryland, and the University of Washington. See Google press release, IBM press release, and UMD press release.
As part of this initiative, IBM and Google have dedicated a large cluster of several hundred machines for use by faculty and students at the participating institutions. The cluster takes advantage of Hadoop, an open-source implementation of MapReduce in Java. By making these resources available, Google and IBM hope to encourage faculty adoption of cloud computing in their research and also integration of the technology into the curriculum. A few months later, the ACCI teamed up with the National Science Foundation to create the Cluster Exploratory (CLuE) initiative, whereby NSF would provide funding to support research on the ACCI infrastructure. This project was funded under that program.
In the context of this project, we have been exploring the intersection of large-scale text retrieval and statistical machine translation. One thread has been scaling up iterative machine learning algorithms to larger and larger datasets. Another thread has been the application of IR techniques to automatically extract bilingual training data.
Jimmy Lin, Associate Professor, The iSchool (College of Information Studies), University of Maryland
Philip Resnik, Professor, Department of Linguistics, University of Maryland
Chris Dyer, Ph.D. student, Department of Computer Science, University of Maryland (graduated Spring 2010)
Tamer Elsayed, Ph.D. student, Department of Computer Science, University of Maryland (graduated Summer 2009)
Ferhan Ture, Ph.D. student, Department of Computer Science, University of Maryland