Description
ABSTRACT
Hadoop is an open-source data processing framework that includes a scalable, fault-tolerant distributed file system, HDFS. Although HDFS was designed to work in conjunction with Hadoop’s job scheduler, we have re-purposed it to serve as a grid storage element by adding GridFTP and SRM servers. We have tested the system thoroughly in order to understand its scalability and fault tolerance. The turn-on of the Large Hadron Collider (LHC) in 2009 poses a significant data management and storage challenge; we have been working to introduce HDFS as a solution for data storage for one LHC experiment, the Compact Muon Solenoid (CMS).
INTRODUCTION
The Large Hadron Collider in Geneva, Switzerland is not only a unique experiment for investigations in particle physics, but also an exceptional computing challenge for those tasked with recording and processing huge amounts of data. When originally confronted with this problem, the four detector experiments on the LHC decided to take a decentralized processing approach based on grid computing. The grid formed to handle this task is the Worldwide LHC Computing Grid (WLCG). At the Holland Computing Center in Nebraska, we have established a Tier-2 site for the CMS experiment. A Tier-2 site is primarily an analysis and simulation site for physicists. The site
currently has about 200 TB of storage and will grow to 400 TB during 2009. The heaviest usage of the storage system comes from transferring data from a Tier-1 site (such as FNAL in the US) and from running I/O-intensive analysis jobs. The downloads are expected to sustain 5 Gbps and burst at 9 Gbps. The analysis jobs read at a rate of a few MB/s, but currently perform a huge number of random read operations. The compute cluster is approximately 1,000 cores; as certain analysis jobs are CPU-bound, the cluster needs a few million I/O operations per minute. Since the LHC experiments formed the WLCG, several Internet-based companies have emerged as having far larger data needs. Google grew from processing 100 TB of data a day in 2004 [1] to processing 20 PB a day [2].
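The load figures above can be checked with a short back-of-the-envelope sketch. The per-job rates below are illustrative assumptions (the text says only "a few MB/s" per job and "a few million" operations per minute); only the 1,000-core cluster size comes from the excerpt.

```python
# Back-of-the-envelope check of the Tier-2 storage load described above.
# cores comes from the text; the per-job rates are assumed for illustration.

cores = 1000                 # approximate size of the compute cluster
read_mb_per_s = 2.5          # assumed per-job read rate ("a few MB/s")
reads_per_s_per_job = 50     # hypothetical random-read rate per job

aggregate_mb_per_s = cores * read_mb_per_s
reads_per_minute = cores * reads_per_s_per_job * 60

print(f"aggregate read bandwidth: {aggregate_mb_per_s / 1000:.1f} GB/s")
print(f"random reads per minute: {reads_per_minute:,}")
```

With these assumed rates the cluster would issue 3,000,000 random reads per minute, consistent with the "few million I/O operations per minute" the site reports, and the aggregate read bandwidth (2.5 GB/s) would exceed the sustained 5 Gbps transfer rate from the Tier-1, which is why local analysis I/O, not wide-area transfer, dominates the storage system's workload.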
Year : 2009
Publisher : IOP
By : Brian Bockelman
File Information : English / 7 pages / Size : 482 KB
Download : click