In our previous White Paper we covered the basic concepts of
Data Warehousing.
It is essential to understand that Data Warehousing is a process,
not a product. Keeping this distinction clear helps prevent
failure in the short or medium term.
The objective of this White Paper is to help identify the
"Critical Area" of ongoing implementation, maintenance and
improvement. It will also help you find a balance between
responding quickly to users and implementing the Data Warehousing
process successfully.
Achieving this balance is critical. To do so, we need to
understand thoroughly the complexity involved in Data Warehousing
processes and rely on an appropriate work methodology to
guarantee the success of the implementation.
Data Warehousing and total implementation environment

DSE - Data Storage Environment
The first and most important step in implementing and maintaining
a Data Warehousing process is understanding the complexity and
importance of the processes involved in the DSE.
According to Inmon (originator of the term "Data Warehouse"),
80% of the investment and effort spent on implementing and
maintaining a Data Warehousing process goes into the DSE. As
shown in the chart above, the data feeding the Data Warehouse
must be extracted from different sources, whether or not the
company already processes them.
Example: different hosts and RDBMSs, VSAM files, the same data
in different formats or units of measure, and external sources
such as marketing consultants, surveys, suppliers,
Internet/Intranet, etc.
The data extracted and transported to the Data Warehouse should
be consistent, remapped, integrated, clean and synchronized. The
business rules should be defined, and the data may also be
summarized if necessary.
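
As a minimal sketch of what "consistent, remapped and integrated"
can mean in practice, the following Python fragment merges records
from two sources that report the same measure in different units
and date formats. The source names (host_a, rdbms_b), the field
names and the conversion table are assumptions made for
illustration only; they do not come from any particular product.

```python
from datetime import datetime

# Hypothetical conversion factors to a common unit (kilograms).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

# Hypothetical date formats used by each source system.
DATE_FORMATS = {"host_a": "%d/%m/%Y", "rdbms_b": "%Y-%m-%d"}


def normalize(record, source):
    """Remap one source record to the warehouse layout (ISO date, kg)."""
    unit = record["unit"].strip().lower()
    if unit not in TO_KG:
        raise ValueError(f"unknown unit {unit!r} from {source}")
    day = datetime.strptime(record["date"], DATE_FORMATS[source]).date()
    return {
        "source": source,
        "product": record["product"].strip().upper(),
        "date": day.isoformat(),
        "weight_kg": round(float(record["qty"]) * TO_KG[unit], 3),
    }


def integrate(batches):
    """Normalize every batch and return rows sorted by date and product."""
    rows = [normalize(r, src) for src, recs in batches.items() for r in recs]
    return sorted(rows, key=lambda r: (r["date"], r["product"]))


batches = {
    "host_a": [
        {"product": " p-01 ", "date": "15/03/1996", "qty": "2500", "unit": "G"}
    ],
    "rdbms_b": [
        {"product": "P-02", "date": "1996-03-14", "qty": "10", "unit": "lb"}
    ],
}
warehouse_rows = integrate(batches)
```

Rejecting unknown units instead of guessing keeps a silent change
in a source from contaminating data that is already integrated.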
Data should be available to users or, preferably, distributed
among them.
Once the project has been implemented, users (Marketing,
Production, Finance, or Budget decision makers) will gradually
learn how to exploit the information in the Data Warehouse. They
will also need analysis tools, reporting, DataSets for DataMining,
or DataMarts suited to their needs.
Not all users need OLAP, and only a few need DataMining. Others
will do well with Excel and pivot tables.
All these dynamic changes should be considered in the DSE in
order to exploit a Data Warehouse fully.
It is very important to understand why a Data Warehouse must be
ready to grow constantly through additions and to absorb changes
in source data or new user requirements.
In general, the technologies used in Data Warehousing are not
very well known, but they are constantly growing. Pressure or
confusion very commonly leads people to approach a Data
Warehousing implementation through excellent and attractive tools
such as Reporting, OLAP analysis, DSS, DataMining, or DataMarts
and their templates.
The following are the two reasons why Data Warehouse feeding
processes should always be reviewed:
- Constant technological updates and OLTP application
maintenance, engineering, etc.
As a result, data sources are subject to changes (formats, units
of measure), which in turn should be reflected in the DW and
integrated with previous data. This will happen even when working
in a stabilized OLTP environment and after an excellent analysis
of needs. According to Inmon, if you plan to improve your
transactional data model (OLTP) in order to avoid or minimize the
DSE phase, you will never have a Data Warehouse.
- The pressure exerted by users and/or the company's decision
makers.
The Data Warehousing process itself gives users remarkable
independence from IS. In addition, they should have a complete
view of the data they can exploit.
Reporting, OLAP, DSS or DataMining products change decision
makers' perceptions, leading them to demand more and better
information that was not requested at the outset.
DSE - Environment Administration and Automation

The chart clearly shows how Data Warehouse maintenance costs
increase if the DSE is not administered and automated.
Obviously, if we lack the appropriate tools and maintenance
processes are not automated, analysis, programming and database
administration costs will keep increasing as Data Warehouse
processes evolve.
Strict controls should be applied to the data and its processes,
and administration should be dynamic and flexible.
To select the right tools and automate the DSE processes, we
should take the following into account:
- Data Warehouse records are not updated.
- The volume of information grows constantly through record
additions.
- Data sources are constantly added or changed.
- The tools should flag unexpected or unreported changes in the
data sources.
- Feeding is a batch process, but exception processing should be
considered.
- Processes involving granularity vs. volume analysis should also
be taken into account.
- It should be implemented with open, relational technology
(RDBMS).
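
Several of the points above (records are only added, never
updated; unreported source changes must be flagged; loading is a
batch process with an exception path) can be sketched in a few
lines of Python using the standard sqlite3 module as a stand-in
for any RDBMS. The table name sales_fact, the column layout and
the helper functions are hypothetical and serve only as an
illustration, not as a recommended design.

```python
import sqlite3

# Hypothetical layout the warehouse expects from this source.
EXPECTED_COLUMNS = ("product", "sale_date", "amount")


def layout_problems(batch):
    """Describe every row whose columns differ from the expected layout."""
    problems = []
    for i, row in enumerate(batch):
        missing = sorted(set(EXPECTED_COLUMNS) - set(row))
        extra = sorted(set(row) - set(EXPECTED_COLUMNS))
        if missing or extra:
            problems.append(f"row {i}: missing={missing} extra={extra}")
    return problems


def load_batch(conn, batch, batch_id):
    """Append a batch; existing warehouse rows are never updated or deleted."""
    problems = layout_problems(batch)
    if problems:
        # Exception path: reject the whole batch and report the problems.
        return 0, problems
    conn.executemany(
        "INSERT INTO sales_fact (batch_id, product, sale_date, amount) "
        "VALUES (?, ?, ?, ?)",
        [(batch_id, r["product"], r["sale_date"], r["amount"]) for r in batch],
    )
    conn.commit()
    return len(batch), []


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (batch_id INTEGER, product TEXT, "
    "sale_date TEXT, amount REAL)"
)

good = [{"product": "P-01", "sale_date": "1996-03-14", "amount": 120.0}]
# The source silently renamed a column; the load should flag it.
changed = [{"product": "P-01", "sale_dt": "1996-03-15", "amount": 80.0}]

loaded_1, problems_1 = load_batch(conn, good, batch_id=1)
loaded_2, problems_2 = load_batch(conn, changed, batch_id=2)
row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

Tagging each row with a batch identifier is one possible way to
trace a record back to the load that produced it without ever
modifying earlier rows.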
If all these requirements are met, the resulting Data Warehouse
will have the following characteristics:
- Upgradeable (very important considering the constant growth of
information).
- Easy to access and flexible.
- Able to distribute tools and user functions.
- Able to add more dimensions quickly in a DataMart
(ROLAP/MOLAP).
- Able to run DataMining processes without difficulty on the
appropriate DataSets.
- Ensures data quality.
- Does not interfere with transactional processes.
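
The relationship between granularity and volume, and the idea of
adding a dimension to a DataMart, can be illustrated with a small
Python sketch. The detail rows, the dimension names and the
summarize helper are hypothetical; a real ROLAP or MOLAP product
would do this work inside its own engine.

```python
from collections import defaultdict

# Hypothetical detail rows at the finest grain (one row per sale).
DETAIL = [
    {"region": "North", "product": "P-01", "month": "1996-01", "amount": 100.0},
    {"region": "North", "product": "P-01", "month": "1996-01", "amount": 50.0},
    {"region": "North", "product": "P-02", "month": "1996-01", "amount": 70.0},
    {"region": "South", "product": "P-01", "month": "1996-02", "amount": 30.0},
]


def summarize(rows, dimensions, measure="amount"):
    """Sum the measure for every distinct combination of the dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Coarser grain: fewer summary rows, less detail.
by_region = summarize(DETAIL, ["region"])
# Adding a dimension only changes the list passed in.
by_region_product = summarize(DETAIL, ["region", "product"])
```

Each added dimension increases the number of summary rows, which
is the volume side of the trade-off.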
To make Data Warehousing a permanent process in the company, at
an attainable cost that is inversely proportional to the benefits
rendered, the methodology should have the following
characteristics:
- A design specific to Data Warehousing.
- A pilot start-up phase with the "key" user and a work team with
managerial support. (This phase should be implemented
immediately and should not create ambitious expectations;
however, it should be effective enough to motivate prospective
users while the DSE is designed and administered at the same
time.)
- Administration from a Meta Data Dictionary.
- OLAP design (ROLAP/MOLAP) for DataMarts.
- Knowledge transfer.
- Ongoing improvement.
- Strict control of the work schedule.
Carlos A. Arabito (DataWarehousing/DSS
Consultant)
OSC S.A.
e-mail: carlos.arabito@gmail.com
Copyright (c) 1996 by OSC S.A.
All rights reserved. All brand names and product names used on
these web pages are trademarks or trade names of their respective
holders.