Description
ABSTRACT
With an ever-increasing number of options, selecting machine learning tools for big data can be difficult. The available tools have advantages and drawbacks, and many have overlapping uses. The world’s data is growing rapidly, and traditional tools for machine learning are becoming insufficient as we move towards distributed and real-time processing. This paper is intended to aid the researcher or professional who understands machine learning but is inexperienced with big data. In order to evaluate tools, one should have a thorough understanding of what to look for. To that end, this paper provides a list of criteria for making selections along with an analysis of the advantages and drawbacks of each. We start from the beginning by examining what exactly the term “big data” means. From there, we go on to the Hadoop ecosystem for a look at many of the projects that are part of a typical machine learning architecture and an understanding of how everything might fit together. We discuss the advantages and disadvantages of three different processing paradigms along with a comparison of engines that implement them, including MapReduce, Spark, Flink, Storm, and H2O. We then look at machine learning libraries and frameworks, including Mahout, MLlib, and SAMOA, and evaluate them based on criteria such as scalability, ease of use, and extensibility. There is no single toolkit that truly embodies a one-size-fits-all solution, so this paper aims to help make decisions smoother by providing as much information as possible and quantifying what the tradeoffs will be. Additionally, throughout this paper, we review recent research in the field using these tools and talk about possible future directions for toolkit-based learning.
BACKGROUND
As the price of data storage has gone down and high performance computers have become more widely accessible, we have seen an expansion of machine learning (ML) into a host of industries including finance, law enforcement, entertainment, commerce, and healthcare. As theoretical research is applied to practical tasks, machine learning tools are increasingly seen as not just useful, but integral to many business operations.
Year : 2015
Publisher : Springer
By : Sara Landset, Taghi M. Khoshgoftaar, Aaron N. Richter and Tawfiq Hasanin
File Information : English / 36 pages / Size : 2.2 MB