1. Report Requirements Collection
Business requirements determine the scope of a new data mart, a new functional area, or a new set of reports. These requirements also determine what data must be available in the data warehouse, how it is organized, and how often it is updated. They have a large impact on the data model design and reports design. Data Integra project manager, data architect and BI report developers meet with client executives, managers, report users and business analysts in a series of interview sessions to define and refine report requirements. This is also a first chance to get an overall understanding of the source data that is needed to support the report requirements. At the conclusion of this step, Data Integra project manager prepares formal report requirement document and send it to users for review and signoff.
2. Source Data Assessment
Data Integra data architect and BI report developers conduct source data assessment, sometimes also called data profiling. This step employs analytic methods for looking at data in order to develop a thorough understanding of the content, structure and quality of the source data. With the help of a good data-profiling tool, it is possible to process very large amounts of source data and uncover data anomalies and defects that need to be properly addressed early on in the project. Data-profiling results can also be used to prepare the business sponsors and end users for the realistic development schedules, the limitations in the source data, and the report requirements that can be supported
3. Data model design
Data Integra data architect works together with BI report developers to design logical data model for data warehouse. This is done using industry-standard dimensional data modeling techniques. We begin by constructing a matrix that represents key business processes and their dimensionality. The matrix serves as a blueprint to ensure that the data warehouse is extensible across the organization over time. Coupling with business requirements and data assessment obtained in the previous two steps, we then develop a dimensional data model. This model identifies the fact table grain, associated dimensions, attributes, and hierarchical drill paths. The preliminary aggregation plan is also developed. This set of activities concludes with ETL specifications of source-to-target data mapping. Next, Data Integra data architect works with client DBA to define the necessary physical database design to support the logical database design. Database naming standard is defined. Database environment is set up. Preliminary indexing and partitioning strategies are also determined. Both logical and physical designs can be greatly facilitated by using a graphical CASE tool such as ERStudio or ERWin. These tools have the advantage of centralizing designs, versioning and automatically generating database scripts.
4. ETL design and development
Data Integra ETL developer designs and implements ETL process.ETL (Extraction, Transformation, Load) is the process of extracting data from the source system, enforcing data quality and consistency standards, transforming data according to the ETL specifications created in the last step, and finally loading data into data warehouse. ETL process needs to execute efficiently, correctly and repeatedly at a pre-determined schedule (daily in most cases). This step consumes the bulk of time and resources (easily 70%) needed for implementation and maintenance of a typical data warehouse.
5. BI reports development and maintenance
Data Integra BI report developer creates reports and their associated supporting objects (templates, filters, metrics, prompts, etc.) according to reporting requirements specified in step 1. These reports include both canned reports with fixed templates and filters and ad hoc reports that users can manipulate dynamically at runtime. Data Integra BI report developer then unit-tests and system-tests these reports. Upon passing the tests, reports are migrated from development environment to production environment. Then, client project manager and Data Integra project manager and BI report developers conduct user acceptance test (UAT) with end users. End users sign off upon satisfactory evaluation of these reports. After these reports are deployed, users may periodically request enhancements and report defects. Data Integra BI report developer then needs to maintain these reports through enhancements and bug-fixings.
6. BI reports Quality Assurance (QA)
Once reports are deployed in production, to ensure a high level of quality of service to end users, report results need to be validated and report performances need to be monitored on a daily basis. Daily, weekly and monthly incremental load to data warehouse also need to be validated and monitored. Incorrect or missing report data and poor report performance could occur intermittently (sometimes even frequently during initial deployment period) due to a variety of reasons, such as:
- Enhancements or bug-fixes to other reports inadvertently broke existing working reports
- Data warehouse incremental load failed
- Reporting environment software or hardware upgrade
To build user confidence and increase data warehouse and BI acceptance, it is absolutely imperative to detect these report data and performance issues as much as possible, before end users run into them. Ideally, the QA process should be finished on a daily basis before users start executing reports at the beginning of a working day. |