Customer Login     Search     Contact Us

Data Warehousing


Businesses in all industries, working to provide all different types of products or services, can define their worth through different assets. An asset is anything of value in an organization. For example, an inventory item may be an asset, or a car in a fleet, or office furniture, or any number of other things that have a quantitative value.

Net Fusion has significant experience with data warehousing. Numerous databases have been designed for customers to consolidate data into a normalized warehouse or into a star schema used for dimensional analysis. We have created ETL processes that pull data from all types of sources including text files defined with fixed width columns, Microsoft Dynamics GP, CRM solutions, web services, XML source files, and web pages to name a few.

Of all the different company assets an organization could possibly own, one of the most important, if not the absolute most important is information in the form of data. This is data for analysis that can be used for trending, as key performance indicators, forecasting, and countless other purposes. When data are looked at as an asset, it becomes possible to incorporate business intelligence into the decision making process. In other words, statistics that are derived from company data can be used to directly influence strategic decisions.

Interestingly enough, for the majority of businesses, the availability of such a valuable asset, which makes the difference between an objective and subjective decision making process, is not available. The problem is not that the asset does not exist, but rather it cannot be used for analysis. More often than not the information is spread out across multiple sources in an organization and available within a limited scope.

For example, a marketing manager may have an Excel workbook used to track the success of a campaign in terms of units sold. An inventory manager may have a different Excel workbook that has information about remaining units, backordered units, and sold units. A company executive is only privy to the information reported on from the two different managers and has to manually cross reference data in order to gain an understanding of process that span multiple departments. This scenario can easily be expanded with the size of an organization from departments to divisions.

When a claim is made that a data warehouse will solve all of a company’s data issues, and immediately provide a facility for business intelligence it sounds like a good direction to take. Data warehousing, however, is often a misunderstood term that is lumped together with business intelligence and data analysis. It is very true that all of these different components work together, but a data warehouse is not synonymous with business intelligence.

A data warehouse is nothing more than a large repository for data. Depending on the needs of a business the design and implementation of the warehouse may take different forms. In some cases the database may be highly normalized, and in other cases it may use a star or snowflake schema to facilitate fast dimensional analysis. Regardless of the design, in its purest form, a data warehouse is nothing more than data.

The data stored in a warehouse typically comes from many different sources. Application data from e-commerce solutions, information out of a CRM, company information from an HR application, and financial information out of QuickBooks or Microsoft Dynamics GP are all examples of source data for a warehouse. Information from all of these disparity sources is consolidated, and relationships are established within the warehouse, thereby providing a consistent source of information that spans a complete business or organization.

The process of consolidating this data is generally referred to as “extract, transform, and load” (ETL). The extraction process is used to pull the data out of sources. Transformation refers to cleaning and manipulating the data prior to insertion in the warehouse. The load process is the act of placing the transformed source data into the warehouse.

In almost all cases this is the most difficult and time consuming process of a warehousing project. Source data is often inconsistent and needs to be cleaned before it can be inserted into the warehouse. It may have to be broken down into more granular forms than are immediately available in source data. When it is time to load the data in the warehouse different relationships must be established for the data to make sense.

Depending on business needs and cost constraints the ETL process could happen on a real time basis throughout the course of a normal day, or it could happen once every day, week, or even month. The frequency of the ETL process is dictated by data reporting and analysis requirements for an organization and cost impact. More frequent consolidation of data through the ETL process has a greater impact on source data and their associated applications which can be reduced with better hardware and servers.

Regardless of the warehouse design or the frequency of data consolidation, once it is in place and operational, the data asset becomes available for use, though not directly by decision makers. This gap between a completed warehouse implementation and the ability to use the data is bridged by other components that are technically very different than data warehousing. This is exactly why many data warehousing endeavors are either never completed or become unusable. Planning for these additional components is often overlooked.

In order for data to be used as an asset it must be accessible by executives and decision makers. This is facilitated through analysis tools and reports. SQL Server Analysis Services (SSAS) can be used to create cubes out of data in a warehouse (and other sources). The term cube is used to indicate a three dimensional view of data that is aggregated over many different dimensions. For example sales figures for an employee for a year might be one dimension. The next could be sales figures for all employees in a division for a year. A final dimension might be sales figures for all divisions in a company for a year.

SSAS also has a data mining feature which has the ability to establish relationships in data that is seemingly unrelated. Many times the pure quantity of data makes it impossible for a human eye to discern relationships. The advanced algorithms contained in the data mining functionality of SSAS are able to find the relationships however and make them available to decision makers.

Dimensional data analysis and data mining are part of a specialized field related to data called online analytical processing (OLAP). It is very important to understand the distinction between data warehousing and OLAP. OLAP is a tool used to analyze data in a warehouse; it is not actually part of a data warehouse. This is very often misunderstood and leads to project that do not take the time necessary to create these tools into account, leading to over budget and/or unsuccessful projects.

Net Fusion has significant experience with data warehousing. Numerous databases have been designed for customers to consolidate data into a normalized warehouse or into a star schema used for dimensional analysis. We have created ETL processes that pull data from all types of sources including text files defined with fixed width columns, Microsoft Dynamics GP, CRM solutions, web services, XML source files, and web pages to name a few.

More importantly, Net Fusion has significant experience in turning your data into an asset. Having a data warehouse does not mean executives and decision makers will automatically have the ability to incorporate business intelligence into strategic decision making. There are many other components related to data warehousing that need to be planned for, designed, and implemented before end users can make full use of a company’s data. The experience and expertise that Net Fusion can bring to your organization includes all of these components.