• A Case-based Data Warehousing Courseware[taliem.ir]

    A Case-based Data Warehousing Courseware

    Data warehousing is one of the important approaches for data integration and data preprocessing. The objective of this project is to develop a web-based interactive courseware that helps beginner data warehouse designers reinforce the key concepts of data warehousing using a case study approach. The case study is to build a data warehouse for a university student enrollment prediction data mining system. This data warehouse is able to generate summary reports as input data files for a data mining system to predict future student enrollment. The data sources include: (1) the enrollment data from California State University, Sacramento and (2) the related public data of California. The courseware is designed to build the data warehouse systematically using a set of four demonstrations covering the following data warehousing topics: fundamentals, design principles, building an enterprise data warehouse using an incremental approach, and aggregation.
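
    As a rough illustration of the aggregation topic, the sketch below rolls a small enrollment fact table up into a per-year summary report. The table name `fact_enrollment`, its columns, and the sample rows are hypothetical and are not taken from the courseware.

```python
import sqlite3

def build_summary(rows):
    """Aggregate (year, college, students) rows into per-year totals."""
    con = sqlite3.connect(":memory:")
    # Hypothetical fact table; the real courseware schema is not given in the abstract.
    con.execute(
        "CREATE TABLE fact_enrollment (year INTEGER, college TEXT, students INTEGER)"
    )
    con.executemany("INSERT INTO fact_enrollment VALUES (?, ?, ?)", rows)
    # The summary report: one row per year with the total head count.
    cur = con.execute(
        "SELECT year, SUM(students) FROM fact_enrollment GROUP BY year ORDER BY year"
    )
    result = cur.fetchall()
    con.close()
    return result

# Illustrative rows only, not real enrollment figures.
sample_rows = [(2001, "A", 10), (2001, "B", 5), (2002, "A", 7)]
summary = build_summary(sample_rows)
```

    A summary of this kind could then be exported as an input file for a prediction step.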

  • A hybrid evolutionary algorithm for attribute selection in data mining[taliem.ir]

    A hybrid evolutionary algorithm for attribute selection in data mining

    Real-life data sets are often interspersed with noise, making the subsequent data mining process difficult. The task of the classifier could be simplified by eliminating attributes that are deemed to be redundant for classification, as the retention of only pertinent attributes would reduce the size of the dataset and subsequently allow more comprehensible analysis of the extracted patterns or rules. In this article, a new hybrid approach comprising two conventional machine learning algorithms has been proposed to carry out attribute selection. Genetic algorithms (GAs) and support vector machines (SVMs) are integrated effectively based on a wrapper approach. Specifically, the GA component searches for the best attribute set by applying the principles of an evolutionary process. The SVM then classifies the patterns in the reduced datasets, corresponding to the attribute subsets represented by the GA chromosomes. The proposed GA-SVM hybrid is subsequently validated using datasets obtained from the UCI machine learning repository. Simulation results demonstrate that the GA-SVM hybrid produces good classification accuracy and a higher level of consistency that is comparable to other established algorithms. In addition, improvements are made to the hybrid by using a correlation measure between attributes as a fitness measure to replace the weaker members in the population with newly formed chromosomes. This injects greater diversity and increases the overall fitness of the population. The improved mechanism is also validated on the same data sets used in the first stage. The results justify the improvements in the classification accuracy and demonstrate its potential to be a good classifier for future data mining purposes.
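
    The wrapper idea can be sketched as follows: each chromosome is a bit mask over the attributes, and its fitness is the accuracy of a classifier trained on only the selected attributes. This is a minimal sketch under stated assumptions, not the authors' method: a leave-one-out 1-nearest-neighbour classifier stands in for the SVM so that the example needs only the standard library, the correlation-based replacement step is omitted, and the data set, function names and GA parameters are all hypothetical.

```python
import random

def loo_accuracy(rows, labels, mask):
    """Leave-one-out 1-NN accuracy using only attributes where mask[i] == 1.
    Stand-in for the SVM used in the paper (an assumption of this sketch)."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty attribute subset cannot classify anything
    correct = 0
    for i, row in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, other in enumerate(rows):
            if i == j:
                continue
            d = sum((row[c] - other[c]) ** 2 for c in cols)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(rows)

def ga_select(rows, labels, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Search attribute masks with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        return loo_accuracy(rows, labels, mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        nxt = [pop[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[: pop_size // 2], 2)  # parents from the top half
            cut = rng.randrange(1, n)                   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

# Toy data: attribute 0 separates the classes, the rest are noise.
rows = [
    [0.10, 5.0, 1.0, 3.0], [0.20, 1.0, 4.0, 9.0], [0.15, 8.0, 2.0, 1.0],
    [0.90, 5.5, 1.5, 2.5], [1.00, 1.2, 4.2, 8.0], [0.95, 7.5, 2.2, 1.5],
]
labels = [0, 0, 0, 1, 1, 1]
best, history = ga_select(rows, labels)
```

    Because the best mask is copied unchanged into each new generation, the best fitness recorded in `history` never decreases.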

  • A State Of The Art Opinion Mining And Its[taliem.ir]

    A State Of The Art Opinion Mining And Its Application Domains

    This paper critically evaluates existing work, presents an opinion mining framework and exposes new areas of research in opinion mining. Individuals, businesses and government can now easily know the general opinion prevailing on a product, company or public policy. At the core of this field is the semantic orientation of subjective terms in documents or reviews, which seeks to establish their contextual connotation through opinion mining. Overall item sentiment can be expressed based on its sentiment words in general or by specifically identifying its features and the opinions being expressed about them. This leads us to the motivation of the framework for opinion mining and to categorizing current literature in such a manner as to make research opportunities clear. The freedom offered by the web as a platform for presenting opinions on any subject brings with it many new opportunities.

  • Amoeba-Based Knowledge Discovery[taliem.ir]

    Amoeba-Based Knowledge Discovery

    We propose an amoeba-based knowledge discovery or data mining system that is implemented using an amoeboid organism and an associated control system. The amoeba system can be considered as one of the new non-traditional computing paradigms, and it can perform intriguing, massively parallel computing that utilizes the chaotic behavior of the amoeba. Our system is a hybrid of a traditional knowledge-based unit implemented on an ordinary computer with an amoeba-based search unit and an optical control unit interface. The solutions in our system can have one-to-one mapping to solutions of other well-known areas such as neural networks and genetic algorithms. This mapping feature allows the amoeba to use and apply techniques developed in other areas. Various forms of knowledge discovery processes are introduced. Also, a new type of knowledge discovery technique, called “autonomous metaproblem solving,” is discussed.

  • Database Preprocessing and Comparison between Data Mining[taliem.ir]

    Database Preprocessing and Comparison between Data Mining Methods

    Database preprocessing is very important for making efficient use of memory. Compression is one of the preprocessing steps needed to reduce the memory required to store and load data for processing. The compression method introduced in this paper was tested on proposed examples to show the effect of repetition in the database, as well as of the size of the database. The results showed that the compression ratio increases as the repetition increases. Compression is one of the important activities for data preprocessing before implementing data mining. Data mining methods such as Naïve Bayes, Nearest Neighbor and Decision Tree are tested. The implementation of the three methods showed that the Naïve Bayes method is effective when the data attributes are categorical, and that it can be used successfully in machine learning. The Nearest Neighbor method is most suitable when the data attributes are continuous or categorical. The third method tested is the Decision Tree, a simple predictive method implemented by using simple rules for data classification. The success of a data mining implementation depends on the completeness of the database, which is represented by the data warehouse and must be organized according to the important characteristics of a data warehouse.
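
    The relation between repetition and compression ratio can be illustrated with a small sketch. The abstract does not describe the paper's compression method, so run-length encoding is assumed here purely as a stand-in, and the ratio is defined (also as an assumption) as the number of original values divided by the number of stored items, counting two items per run.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a column: consecutive repeats become (value, count)."""
    return [(v, len(list(g))) for v, g in groupby(values)]

def compression_ratio(values):
    """Original item count divided by encoded item count (2 items per run).
    Expects a non-empty column."""
    encoded = rle_encode(values)
    return len(values) / (2 * len(encoded))

highly_repetitive = ["a"] * 8        # one long run
no_repetition = ["a", "b"] * 4       # every run has length 1
ratio_high = compression_ratio(highly_repetitive)
ratio_low = compression_ratio(no_repetition)
```

    With this definition a column made of one long run compresses well, while a column with no consecutive repeats actually grows, which matches the qualitative trend reported in the abstract.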

  • Four Decades of Data Mining in Network and[taliem.ir]

    Four Decades of Data Mining in Network and Systems Management

    How has the interdisciplinary data mining field been practiced in Network and Systems Management (NSM)? In Science and Technology, there is a wide use of data mining in areas like bioinformatics, genetics, Web and more recently astroinformatics. However, the application in NSM has been limited and inconsiderable. In this article, we provide an account of how data mining has been applied in managing networks and systems for the past four decades, presumably since its birth. We look into the field’s applications in the key NSM activities – discovery, monitoring, analysis, reporting and domain knowledge acquisition. In the end, we discuss our perspective on the issues that are considered critical for the effective application of data mining in the modern systems which are characterized by heterogeneity and high dynamism.

  • Knowledge management vs. data mining Research trend, forecast and[taliem.ir]

    Knowledge management vs. data mining: Research trend, forecast and citation approach

    Knowledge management (KM) and data mining (DM) have become more important today; however, there are few comprehensive studies and categorization schemes that discuss the characteristics of both. Using a bibliometric approach, this paper analyzes KM and DM research trends, forecasts and citations from 1989 to 2009 by locating the headings "knowledge management" and "data mining" in topics in the SSCI database. The bibliometric analysis of SSCI journals over this period found 1393 articles on KM and 1181 articles on DM. The paper classifies the KM and DM articles using the following eight categories: publication year, citation, country/territory, document type, institute name, language, source title and subject area. The resulting distributions are used to explore the differences between KM and DM, to show how the two technologies developed in this period, and to analyze their tendencies. Also, the paper performs the K-S test to check whether the distribution of author article production follows Lotka's law. The research findings can be extended to investigate author productivity by analyzing variables such as chronological and academic age, number and frequency of previous publications, access to research grants, job status, etc. In this way, characteristics of high, medium and low publishing activity of authors can be identified. These findings will also help to judge scientific research trends and understand the scale of development of research in KM and DM by comparing the growth in the number of article authors.
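
    The K-S check against Lotka's law can be sketched as below. Lotka's law states that the number of authors with n articles is proportional to 1/n^a, with a close to 2. The sketch assumes a fixed exponent of 2 and normalizes the expected distribution over the observed range of n; the paper may estimate the exponent and constant differently. The author counts are invented for illustration and are not the paper's data.

```python
def lotka_ks_statistic(author_counts, exponent=2.0):
    """Kolmogorov-Smirnov statistic between observed author productivity and
    a Lotka distribution. author_counts[n] = authors with exactly n articles."""
    ns = range(1, max(author_counts) + 1)
    total = sum(author_counts.values())
    # Normalizing constant so expected proportions sum to 1 over the observed range.
    norm = sum(n ** -exponent for n in ns)
    observed_cdf = expected_cdf = 0.0
    d_max = 0.0
    for n in ns:
        observed_cdf += author_counts.get(n, 0) / total
        expected_cdf += n ** -exponent / norm
        # The statistic is the largest gap between the two cumulative curves.
        d_max = max(d_max, abs(observed_cdf - expected_cdf))
    return d_max

# Hypothetical counts: the first follows 1/n^2 exactly, the second does not.
d_exact = lotka_ks_statistic({1: 36, 2: 9, 3: 4})
d_flat = lotka_ks_statistic({1: 10, 2: 10})
```

    The statistic is then compared with a critical value that depends on the number of authors; a value below the threshold means the data do not contradict Lotka's law.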

  • Mining-association-rules-for-the-quality-improvement-of-the-production-process.[taliem.ir]

    Mining association rules for the quality improvement of the production process

    Academics and practitioners have a common interest in the continuing development of methods and computer applications that support or perform knowledge-intensive engineering tasks. Operations management dysfunctions and lost production time are problems of enormous magnitude that impact the performance and quality of industrial systems as well as their cost of production. Association rule mining is a data mining technique used to discover useful and valuable information in huge databases. This work develops a better conceptual base for improving the application of association rule mining methods to extract knowledge on operations and information management. The emphasis of the paper is on the improvement of the operations processes. The application example details an industrial experiment in which association rule mining is used to analyze the manufacturing process of a fully integrated provider of drilling products. The study reports some new and interesting results with data mining and knowledge discovery techniques applied to a drill production process. Experimental results on real-life data sets show that the proposed approach is useful in finding effective knowledge associated with the causes of dysfunctions.
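
    An association rule such as "A implies B" is usually judged by its support, confidence and lift. The sketch below computes these three standard measures for one rule; the event names and records are hypothetical and only loosely evoke a production log, and nothing here reproduces the paper's data or results.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.
    Each transaction is a set of items."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(antecedent <= t for t in transactions)    # records containing A
    n_c = sum(consequent <= t for t in transactions)    # records containing B
    n_ac = sum((antecedent | consequent) <= t for t in transactions)  # both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical production records, for illustration only.
events = [
    {"tool_wear", "line_stop"},
    {"tool_wear", "line_stop"},
    {"tool_wear"},
    {"operator_change"},
]
support, confidence, lift = rule_metrics(events, {"tool_wear"}, {"line_stop"})
```

    A lift above 1 indicates that the antecedent and consequent occur together more often than they would if they were independent.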

  • Mining customer knowledge for tourism new product development and[taliem.ir]

    Mining customer knowledge for tourism new product development and customer relationship management

    In recent years tourism has become one of the fastest growing sectors of the world economy and is widely recognized for its contribution to regional and national economic development. Tourism product design and development have become important activities in many areas/countries as a growing source of foreign and domestic earnings. On the other hand, customer relationship management is a competitive strategy that businesses need in order to stay focused on the needs of their customers and to integrate a customer-oriented approach throughout the organization. Thus, this paper uses the Apriori algorithm as a methodology for association rules and clustering analysis for data mining, which is implemented for mining customer knowledge from the case firm, Phoenix Tours International, in Taiwan. Knowledge extraction from data mining results is illustrated as knowledge patterns, rules, and knowledge maps in order to propose suggestions and solutions to the case firm for new product development and customer relationship management.
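
    The Apriori algorithm finds frequent itemsets level by level, using the fact that every subset of a frequent itemset must itself be frequent. The following is a minimal sketch of that idea; the transactions and item names are made up for illustration and do not come from the case firm.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}
    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are
        # frequent, then confirm its support against the data.
        current = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent

# Hypothetical purchase records, for illustration only.
bookings = [
    {"flight", "hotel", "tour"},
    {"flight", "hotel"},
    {"flight", "tour"},
    {"hotel", "tour"},
    {"flight", "hotel", "tour"},
]
frequent = apriori(bookings, min_support=0.6)
```

    In this toy data every single item and every pair meets the 60% threshold, while the three-item set appears in only two of five records and is dropped.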

  • Using Data Mining to[taliem.ir]

    Using Data Mining to Detect Insurance Fraud

    Insurance companies lose millions of dollars each year through fraudulent claims, largely because they do not have a way to easily determine which claims are legitimate and which may be fraudulent. To ensure that adjusters target claims which have the greatest likelihood of adjustment, many insurance companies have incorporated IBM SPSS data mining into their investigating and auditing processes. This report describes how data mining techniques can enable you to improve accuracy and save time, money and resources.

  • Violent Web images classification based on MPEG7[taliem.ir]

    Violent Web images classification based on MPEG7 color descriptors

    In this article, we present a contribution to the classification of violent Web images. This subject is highly important, as it has potential use in many applications such as filtering violent Web sites. We propose to combine the techniques of image analysis and data mining to relate low-level characteristics extracted from the image's colors to a higher-level characteristic of violence which could be contained in the image. We present a comparative study of different data mining techniques to classify violent Web images. We also discuss how combining learning-based methods can improve the accuracy rate. Our results show that our approach can detect violent content effectively.