Business Challenge
The Extract, Transform, and Load (ETL) system is the most time-consuming and expensive part of building a data warehouse and delivering business intelligence to your user community. A decade ago the majority of ETL systems were hand-crafted, but the market for ETL software has grown steadily, and most practitioners now use ETL tools in place of hand-coded systems.
Using an ETL tool has several advantages, such as:
- Visual flow and self-documentation. The single greatest advantage of an ETL tool is that it provides a visual flow of the system's logic.
- Structured system design. ETL tools are designed for the specific problem of populating a data warehouse. Although they are only tools, they do provide a metadata-driven structure to the development team.
- Operational resilience. Many of the home-grown ETL systems we have evaluated are fragile: they have too many operational problems. ETL tools provide functionality and practices for operating and monitoring the ETL system in production.
- Advanced data cleansing functionality. Most ETL systems are structurally complex, with many sources and targets. At the same time, requirements for transformation are often fairly simple, consisting primarily of lookups and substitutions.
Solution
Tezauri Data Integration Tool is an ETL software development tool fully aligned with the IFW BDW concept. The Innovative Message Concept and a well-designed metadata model, together with custom-developed components and Tezauri Manager as the central application for system management, allow you to produce a state-of-the-art ETL system for your IFW BDW-based data warehouse, regardless of where your source data is stored.
Benefits
Tezauri – Data Integration benefits:
- Capitalizes on your investment in IFW models by giving you a state-of-the-art ETL tool aligned with the data models
- Solves the data structure gap between IFW models and data models of operational sources
- Raises the abstraction level for ETL developers, enabling them to focus on data concepts instead of on bits and bytes
- Speeds up system development
- Simplifies ETL applications through custom-developed SSIS components with encapsulated metadata logic
- Enables easy ETL system management, including metadata
- Enables real-time generation of metadata documentation and reports
Innovative Message Concept
The Message Concept is designed as a staging area between the source systems and the BDW in order to simplify the ETL process and improve its efficiency.
The major features of the Message Concept are:
- Messages are derived from BDW data concepts and entities
- Small number of entities, up to 20, represented as highly renormalized BDW entities
- Extensive use of XML for storing the results of complex transformations
- Physical representation of Messages can be: Entities (Tables), Files (txt, XML, etc.), Memory objects
- Message structure is built on top of different types of attributes (classifications, domain-mapped attributes, descriptors, etc.)
- Support for asynchronous and near-real-time data loads
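As an illustration only, a Message could be modeled as a small record that carries classification codes, descriptors, and an XML fragment holding the result of a complex transformation. The sketch below uses Python for brevity; the names `Message`, `classifications`, `descriptors`, `payload_xml`, and `build_payload`, as well as the sample values, are hypothetical and do not come from the TDI product.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Message:
    """Hypothetical staging record derived from one BDW entity."""
    entity: str                                           # BDW entity the message is derived from
    source_key: str                                       # natural key in the source system
    classifications: dict = field(default_factory=dict)  # attribute -> source classification code
    descriptors: dict = field(default_factory=dict)      # attribute -> free-text value
    payload_xml: str = ""                                 # XML holding a complex transformation result


def build_payload(items):
    """Serialize a list of (name, value) pairs into a small XML fragment."""
    root = ET.Element("payload")
    for name, value in items:
        child = ET.SubElement(root, "item", name=name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for the example.
msg = Message(
    entity="Arrangement",
    source_key="LOAN-0001",
    classifications={"arrangement_type": "LN"},
    descriptors={"name": "Sample loan"},
    payload_xml=build_payload([("balance", 1250.5), ("currency", "EUR")]),
)
```

The same record could equally be persisted as a table row or a file; the in-memory form is used here only because it is the shortest to show.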
Metadata model
Tezauri Data Integration uses metadata extensively to make the ETL process efficient and controllable. All metadata is stored in a common, unified TDI metadata model.
There are several metadata sources:
- Metadata related to operational data
  - Tables and columns metadata
  - Keys and relations metadata
  - Constraints and rules metadata
- Classification metadata
  - Schemas
  - Values
- Messages metadata
- Mappings metadata
  - Classification
  - Surrogate Keys
  - Descriptions
  - Products
- ETL transformation metadata
  - SSIS packages, tasks, groups
  - Jobs
  - Execution results
- Data Quality metadata
  - DQ rules and rule sets
  - DQ check results
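To give a feel for what "metadata-driven" can mean in practice, the following minimal sketch keeps a column mapping as data and applies it generically to a source row. It is not the TDI metadata model: the `MAPPINGS` structure, the `apply_mappings` function, and the column names are assumptions made up for the example.

```python
# Hypothetical mapping metadata: one entry per target attribute.
MAPPINGS = [
    {"source": "CUST_NO", "target": "customer_id", "transform": str.strip},
    {"source": "CUST_NM", "target": "customer_name", "transform": str.title},
    {"source": "BAL", "target": "balance", "transform": float},
]


def apply_mappings(source_row, mappings):
    """Build a target row by applying each mapping entry to the source row."""
    target = {}
    for entry in mappings:
        raw_value = source_row[entry["source"]]
        target[entry["target"]] = entry["transform"](raw_value)
    return target


row = apply_mappings(
    {"CUST_NO": " 42 ", "CUST_NM": "jane doe", "BAL": "10.50"},
    MAPPINGS,
)
```

Because the mapping lives in data rather than in code, adding a column means adding one metadata entry instead of editing the transformation logic.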
Custom developed SSIS Components
Custom-developed SSIS components encapsulate the metadata-related application logic and hide it from developers. TDI has the following custom-developed SSIS components:
- Message Mapper. Maps source system structures to the BDW.
- Classification resolver. Maps source system values to the BDW CL data concept.
- Surrogate key. Manages surrogate keys based on metadata.
- Accounting Unit Key. Manages accounting unit keys, based on the relationships between Arrangement, Accounting Unit Type and Accounting Unit Balance.
- Expression engine. This is one of the most important building blocks in the TDI environment. The TDI EE is used extensively for building data transformations, data mappings, and DQ rules.
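The behavior of two of these components can be sketched in a few lines. The example below is a simplified, in-memory illustration written in Python, not the actual SSIS components: the class names `ClassificationResolver` and `SurrogateKeyManager`, their methods, the `UNKNOWN` default, and the sample mappings are all assumptions.

```python
class ClassificationResolver:
    """Maps (schema, source value) pairs to target classification values."""

    def __init__(self, mappings, default="UNKNOWN"):
        self._mappings = dict(mappings)
        self._default = default  # returned when no mapping is defined

    def resolve(self, schema, source_value):
        return self._mappings.get((schema, source_value), self._default)


class SurrogateKeyManager:
    """Hands out one stable integer key per (entity, natural key) pair."""

    def __init__(self):
        self._keys = {}
        self._next = 1

    def get_key(self, entity, natural_key):
        pair = (entity, natural_key)
        if pair not in self._keys:
            # First time this natural key is seen: assign the next free key.
            self._keys[pair] = self._next
            self._next += 1
        return self._keys[pair]


resolver = ClassificationResolver({
    ("gender", "M"): "MALE",
    ("gender", "F"): "FEMALE",
})
keys = SurrogateKeyManager()
```

In a real deployment both lookups would be backed by the metadata store rather than by Python dictionaries; the point of the sketch is only that the developer calls `resolve` or `get_key` and never touches the underlying mapping tables.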
Tezauri Manager
Tezauri Manager is an enterprise desktop application for developing and maintaining a Tezauri DI solution.
The main purposes of this tool are:
- Metadata Management
- Development/Implementation
- Configuration/Administration
- Maintenance/Monitoring
Major Features:
- Specialized CDC system oriented to batch processing
- Organizes source tables into logical parts (Subject Areas) named Source Systems and Subsystems
- Defines and maintains the logical and physical structure of Messages
- Relates attributes to Classification Schema
- Surrogate key management
- Support for IBM FSDM as a classification-oriented data model
- Classification Module enables creation and maintenance of classification schemas and values
- Quality of data can be evaluated through different types of rules
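As a rough illustration of rule-based data quality evaluation, the sketch below defines rules as named predicates and collects every failed check. It does not reflect how Tezauri Manager stores or runs its rules: the `Rule` class, the `evaluate` function, and the sample rules and rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the row passes


def evaluate(rows, rules):
    """Return a list of (row_index, rule_name) for every failed check."""
    failures = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                failures.append((index, rule.name))
    return failures


# Sample rules and rows are invented for the example.
rules = [
    Rule("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Rule("balance is non-negative", lambda r: r.get("balance", 0) >= 0),
]
rows = [
    {"customer_id": "C-1", "balance": 100},
    {"customer_id": "", "balance": -5},
]
failures = evaluate(rows, rules)
```

Keeping the failures as plain (row, rule) pairs makes it straightforward to store them as DQ check results and report on them later.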