Difference Between Data Warehousing and Data Mining
Data warehousing and data mining are two powerful and popular techniques for analyzing data. Data mining tends to be used by people with a statistical bent: they apply statistical models to search for patterns hidden in the data, and the interactions they find among data elements can be valuable to a business. Analysts who can examine the dimensions of the business directly, on the other hand, tend to use data warehouses.
Data mining is also called Knowledge Discovery in Data (KDD). It is a field of computer science that deals with extracting previously unknown and interesting information from raw data. Because of the rapid growth of data, especially in business, manual extraction of patterns has become practically impossible over the past few decades, and data mining has become a very important instrument for converting this wealth of data into business intelligence. It is currently used in applications such as fraud detection, social network analysis, and marketing.
Data mining usually deals with four tasks: clustering, classification, regression, and association. Clustering identifies groups of similar items in raw data. Classification learns rules that can be applied to new data, and typically involves the following steps: data preprocessing, model design, learning/feature selection, and evaluation/validation. Regression finds a function that models the data with minimal error. Association looks for relationships between variables. Data mining is used to answer questions such as: which products could help Wal-Mart achieve high profit next year?
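To make the association task more concrete, here is a minimal sketch in Python that looks for relationships between items bought together. The baskets, the function name pair_rules, and the 0.4 support threshold are all made up for illustration. Real data mining tools handle larger item sets and far more data, so treat this as a toy under those assumptions rather than a reference implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets; each set is one customer's purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # skip pairs that are bought together too rarely
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Strongest rules (highest confidence) first.
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket containing butter also contains bread, so the rule "butter -> bread" comes out with a confidence of 1.0. That is the kind of relationship between variables that the association task is meant to surface.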
Data warehousing is also used to analyze data, but by a different set of users with a slightly different goal in mind. In the retail sector, for example, data warehousing users are more concerned with which kinds of purchases are popular among customers, so that the results of the analysis can be used to improve the customer experience. Data miners, by contrast, first form a hypothesis, such as which customers buy a certain type of product, and then analyze the data to test it. A typical data warehousing case is a major retailer that initially stocks all of its stores with the same mix of product sizes, only to learn later that its New York stores sell out of the smallest sizes much faster than its Chicago stores. With this result in hand, the retailer can stock its New York stores with more of the smaller sizes than its Chicago stores.
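The retailer example can be sketched as a small roll-up over a fact table. The Python below is only an illustration: the rows, the column layout, and the helper name sell_through_by are hypothetical, and a real warehouse would run this kind of query in SQL or an OLAP tool over far more data.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, size, units_stocked, units_sold).
sales_facts = [
    ("New York", "S", 100, 95),
    ("New York", "M", 100, 70),
    ("New York", "L", 100, 40),
    ("Chicago", "S", 100, 45),
    ("Chicago", "M", 100, 72),
    ("Chicago", "L", 100, 90),
]

# Position of each dimension inside a fact row.
DIMENSIONS = {"city": 0, "size": 1}


def sell_through_by(facts, *dims):
    """Roll up units sold / units stocked along the chosen dimensions."""
    totals = defaultdict(lambda: [0, 0])  # key -> [stocked, sold]
    for row in facts:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key][0] += row[2]
        totals[key][1] += row[3]
    return {key: sold / stocked for key, (stocked, sold) in totals.items()}


# Slice by city and size: how fast does each size sell in each city?
by_city_size = sell_through_by(sales_facts, "city", "size")
# Roll up to size only: the city-level difference disappears in the total.
by_size = sell_through_by(sales_facts, "size")

for (city, size), rate in sorted(by_city_size.items()):
    print(f"{city:8s} size {size}: {rate:.0%} of stock sold")
```

With these made-up numbers, small sizes sell through much faster in New York than in Chicago. The difference is visible only when the data is cut by both city and size, not in the size-only roll-up.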
At first glance, these two types of analysis seem to be of the same nature. Both aim to increase profits on the basis of historical data. There are key differences, however. In simple terms, data mining and data warehousing provide different types of analytics, and usually serve different types of users. Data mining looks for correlations and patterns to support a statistical hypothesis. Data warehousing answers a comparatively broader question, then slices and dices the data from there to identify ways to improve in the future.