[BDCSG2008] NSF Plans for Supporting Data Intensive Computing (Jeannette Wing and Christophe Bisciglia)

NSF listens to you academics. Jeannette opens the floor with this claim. Questions: What are the limitations of this (data-intensive) modeling paradigm? What are meaningful metrics of performance here? What about the security of processes and data on a shared resource? How can we reduce power consumption? Can this paradigm solve problems not possible otherwise, or simplify them, or open the door to new applications?

NSF is rolling out the Cluster Exploratory program and is also going to roll out a new solicitation for Data-Intensive Computing. It is also emphasizing the path from data to knowledge, since scientists are throwing data away. This is a great opportunity for collaborative efforts between CS and scientists. NSF's goal: provide access to cluster resources and access to massive data sets. Google and IBM are rolling out the cluster (for academics). Cluster Exploratory is the solicitation program, announced yesterday, through which NSF will distribute access to the cluster and research grants.

Christophe reviews his experience teaching a class about cluster computing, where he realized that giving away compute cycles is more valuable than plain grant money. The cluster runs on Hadoop. It will be allocated by rack-weeks, with 5 terabytes and priority on 80 processes (but there will still be other people on it at lower priority, and large data sets). Since reviewing proposals is not Google's expertise, they reached out to NSF to handle it. Googlers will start collaborations, and IBM will also help by providing support. Jeannette says this is a new model, but NSF is open to new models and other partners.