A survey of open source tools for machine learning with big data in the Hadoop ecosystem


Product description

ABSTRACT
With an ever-increasing number of options, the task of selecting machine learning tools for big data can be difficult. The available tools have advantages and drawbacks, and many have overlapping uses. The world’s data is growing rapidly, and traditional tools for machine learning are becoming insufficient as we move towards distributed and real-time processing. This paper is intended to aid the researcher or professional who understands machine learning but is inexperienced with big data. In order to evaluate tools, one should have a thorough understanding of what to look for. To that end, this paper provides a list of criteria for making selections along with an analysis of the advantages and drawbacks of each. We do this by starting from the beginning and looking at what exactly the term “big data” means. From there, we go on to the Hadoop ecosystem for a look at many of the projects that are part of a typical machine learning architecture and an understanding of how everything might fit together. We discuss the advantages and disadvantages of three different processing paradigms along with a comparison of engines that implement them, including MapReduce, Spark, Flink, Storm, and H2O. We then look at machine learning libraries and frameworks, including Mahout, MLlib, and SAMOA, and evaluate them based on criteria such as scalability, ease of use, and extensibility. There is no single toolkit that truly embodies a one-size-fits-all solution, so this paper aims to help make decisions smoother by providing as much information as possible and quantifying what the tradeoffs will be. Additionally, throughout this paper, we review recent research in the field using these tools and talk about possible future directions for toolkit-based learning.

BACKGROUND
As the price of data storage has gone down and high-performance computers have become more widely accessible, we have seen an expansion of machine learning (ML) into a host of industries including finance, law enforcement, entertainment, commerce, and healthcare. As theoretical research is leveraged into practical tasks, machine learning tools are increasingly seen as not just useful, but integral to many business operations.

Year : 2015

Publisher : Springer

By : Sara Landset, Taghi M. Khoshgoftaar, Aaron N. Richter and Tawfiq Hasanin

File Information : English language / 36 pages / Size : 2.2 MB


