Description
ABSTRACT
While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
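The abstract notes that the Chandy-Lamport snapshot can be implemented by exploiting the GraphLab abstraction itself. As a rough, hedged illustration (not the authors' code: the function names, the dictionary-based graph, and the queue scheduler are all assumptions for this sketch), the idea is that snapshotting becomes an ordinary vertex update function: the first time a vertex is activated it records its local state, then schedules its neighbors, so the snapshot marker floods the graph and each vertex is saved exactly once.

```python
from collections import deque

def snapshot_update(vertex, graph, snapshot, scheduler):
    """Hypothetical GraphLab-style update function: save this vertex's
    state once, then propagate the snapshot marker to its neighbors."""
    if vertex in snapshot:                        # already saved: done
        return
    snapshot[vertex] = graph["state"][vertex]     # record local state
    for nbr in graph["edges"][vertex]:            # flood marker onward
        scheduler.append(nbr)

def run_snapshot(graph, start):
    """Drive the update function with a simple dynamic scheduler."""
    snapshot = {}
    scheduler = deque([start])
    while scheduler:
        snapshot_update(scheduler.popleft(), graph, snapshot, scheduler)
    return snapshot

# Toy graph: three vertices with some local state and directed edges.
graph = {
    "state": {"a": 1, "b": 2, "c": 3},
    "edges": {"a": ["b", "c"], "b": ["a"], "c": ["a"]},
}
print(run_snapshot(graph, "a"))  # {'a': 1, 'b': 2, 'c': 3}
```

In the real distributed setting the snapshot additionally records in-flight channel state and runs concurrently with computation; this sketch only shows why the marker-flooding step maps naturally onto scheduled vertex updates.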
INTRODUCTION
With the exponential growth in the scale of Machine Learning and Data Mining (MLDM) problems and increasing sophistication of MLDM techniques, there is an increasing need for systems that can execute MLDM algorithms efficiently in parallel on large clusters. Simultaneously, the availability of Cloud computing services like Amazon EC2 provides the promise of on-demand access to affordable large-scale computing and storage resources without substantial upfront investments. Unfortunately, designing, implementing, and debugging the distributed MLDM algorithms needed to fully utilize the Cloud can be prohibitively challenging, requiring MLDM experts to address race conditions, deadlocks, distributed state, and communication protocols while simultaneously developing mathematically complex models and algorithms. Nonetheless, the demand for large-scale computational and storage resources has driven many [2, 14, 15, 27, 30, 35] to develop new parallel and distributed MLDM systems targeted at individual models and applications. This time-consuming and often redundant effort slows the progress of the field as different research groups repeatedly solve the same parallel/distributed computing problems. Therefore, the MLDM community needs a high-level distributed abstraction that specifically targets the asynchronous, dynamic, graph-parallel computation found in many MLDM applications while hiding the complexities of parallel/distributed system design. Unfortunately, existing high-level parallel abstractions (e.g. MapReduce [8, 9], Dryad [19] and Pregel [25]) fail to support these critical properties. To help fill this void we introduced [24] the GraphLab abstraction, which directly targets asynchronous, dynamic, graph-parallel computation in the shared-memory setting.
Year: 2012
Publisher: ONR
By: Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein
File Information: English / 12 pages / Size: 1,390 KB
Download: click