Data Preparation can be seen as the practical pre-processing of certain data for a specific purpose such as a new implementation, business transformation, system conversion (upgrade), or migration.  

A Data Migration Strategy is vital in establishing the scope, principles, responsibilities, and approach for the migration. Typically, this strategy addresses the traditional “E-T-L” process, whereby data is Extracted, Transformed (validated, cleansed/prepared, transformed), and then Loaded to the target system in one continuous process. However, the variability and validity of the data being migrated can cause serious issues and are usually among the biggest threats to the success of a migration project.
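As a rough illustration of that continuous E-T-L flow, the following Python sketch runs a handful of made-up customer records through extract, transform, and load steps. The record layout, the validation rules (non-empty name, two-letter country code), and the in-memory "target system" are all assumptions chosen for the example; a real migration would use the project's own tooling and rules.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical target structure for a customer record."""
    customer_id: str
    name: str
    country: str


def extract(rows):
    """Extract: copy raw records out of the (hypothetical) legacy source."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transform: validate, cleanse, and map records to the target structure.

    Records that fail validation are returned separately so they can be
    fixed rather than silently dropped.
    """
    prepared, rejected = [], []
    for row in rows:
        name = (row.get("name") or "").strip()
        country = (row.get("country") or "").strip().upper()
        # Assumed rules: name is mandatory, country is a two-letter code.
        if not name or len(country) != 2:
            rejected.append(row)
            continue
        prepared.append(Customer(row["id"], name, country))
    return prepared, rejected


def load(records, target):
    """Load: append prepared records to the target and return the count."""
    target.extend(records)
    return len(records)


legacy_rows = [
    {"id": "C001", "name": "  Acme Ltd ", "country": "za"},
    {"id": "C002", "name": "", "country": "ZA"},
    {"id": "C003", "name": "Globex", "country": "South Africa"},
]

target_system = []
prepared, rejected = transform(extract(legacy_rows))
loaded = load(prepared, target_system)
print(f"loaded={loaded}, rejected={len(rejected)}")
```

In this sketch two of the three records are rejected during the Transform step, which is the point at which poor data quality typically surfaces when preparation has not been done beforehand.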

“Many projects only identify the effort and complexity involved in preparing the data mid-project, when it may already be too late,” states Ben Strydom, Director (Operations) at GlueData Services.

Data-readiness is always a key factor affecting project risk (timelines and delivery). Data ownership (integral to driving data cleansing) may be poorly defined or non-existent, meaning that identifying requirements and implementing the relevant measures to fix the data adds complexity and places an additional burden on business and project resources.

Furthermore, the full effort and complexity of data cleanup for migration are always relatively unknown until the final design is bedded down.  

“Ultimately, data of poor quality can technically be loaded to a system, but the accuracy of said data can be critical to a business’s processes. This results in unnecessary and increased risk which would add to business overhead (the cost to rectify) and could even halt key operations indefinitely.

“Generally, the appetite for such a Data Preparation initiative is low due to the perception that only short-term benefit is achieved, but imagine going live with a new system having loaded bad quality data, which causes critical processes to fall over at the same time as you are trying to improve data quality!” continues Ben.

GlueData recommends: 

  • Shifting the bulk of the activities related to the preparation of data from the Transformation phase of the migration project into a separate stream or project, which starts before the Migration project.  
  • Incorporating Data Preparation elements into the organisation’s Data Quality and Data Governance initiatives for long-term gain.

Data Preparation should start as soon as possible after the decision has been made to move to S/4HANA to have enough time for the design, implementation, and execution of the majority of the Data Preparation activities before Data Migration project dependencies occur. 

The key principles recommended for Data Preparation are as follows:  

Start early 

  • Data Cleansing is a highly iterative process. The earlier one starts, the more iterations can be completed. This should result in much improved Data Quality by the time the Data Migration Realisation phase starts.

Initiate Data Preparation with SAP Standard requirements 

  • The full scope and effort for Data Preparation are usually unknown at the outset. By starting with SAP Standard and Best Practice requirements (which are technically known), the majority of the scoped objects can be addressed immediately.

Data Preparation Assessments do not replace Technical Readiness Assessments 

  • Technical Readiness speaks to system configuration, code, and other mechanical components. Technical Readiness Assessments are typically concerned with ensuring a technically sound conversion.
  • Data Preparation emphasises Data Readiness and should be executed in conjunction with Technical Readiness Assessments to obtain deeper insight into conversion requirements, including the dependencies between data and the system configuration.

Limit scope early and often  

  • There are various means of limiting the scope during Data Preparation. The first step is to define what constitutes business- and migration-relevant data (“Data Relevancy”).
  • By focusing only on relevant data and excluding irrelevant data, volumes and effort are continuously managed.
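A Data Relevancy definition can be expressed as a small set of explicit rules applied to each record. The Python sketch below is illustrative only: the vendor fields, the cutoff date, and the three rules (open items always relevant, records flagged for deletion excluded, otherwise recent activity required) are assumptions, not rules prescribed by SAP or GlueData.

```python
from datetime import date

# Hypothetical cutoff: vendors with no activity since this date are
# treated as not migration-relevant.
CUTOFF = date(2020, 1, 1)


def is_relevant(vendor, cutoff=CUTOFF):
    """Return True if a vendor record is in scope under the assumed rules."""
    # Rule 1 (assumed): open items must always be migrated.
    if vendor["open_items"] > 0:
        return True
    # Rule 2 (assumed): records flagged for deletion are out of scope.
    if vendor["flagged_for_deletion"]:
        return False
    # Rule 3 (assumed): otherwise, require recent transactional activity.
    last = vendor["last_transaction"]
    return last is not None and last >= cutoff


vendors = [
    {"id": "V1", "last_transaction": date(2023, 5, 10), "open_items": 0, "flagged_for_deletion": False},
    {"id": "V2", "last_transaction": date(2015, 3, 1), "open_items": 0, "flagged_for_deletion": False},
    {"id": "V3", "last_transaction": date(2014, 7, 22), "open_items": 2, "flagged_for_deletion": False},
    {"id": "V4", "last_transaction": date(2022, 11, 2), "open_items": 0, "flagged_for_deletion": True},
]

relevant = [v for v in vendors if is_relevant(v)]
relevant_ids = [v["id"] for v in relevant]
print(f"{len(relevant)} of {len(vendors)} vendors in scope: {relevant_ids}")
```

Keeping the rules in one function makes it straightforward to re-run the relevancy check each time the rules or the source data change, so the cleansing volume is re-assessed with every iteration.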

Leverage and enrich existing Data Quality and Data Governance solutions 

  • Some Organisations may already have mature Data Quality and Data Governance solutions. In such cases, the Data Preparation stream should incorporate these solutions to speed up the processes of defining data ownership, cleansing approach, and the like. 

Define Data Ownership

  • Data Owners are responsible for the data in a specific system, process, sub-set of data, Data Object, Business Area or module, etc.  
  • Their responsibilities include controlling the Data Quality of Legacy data and will naturally extend to the same areas of control on the new S/4HANA system. Therefore, they must also own all the Data Preparation and Data Transformation steps applied to the data in between these systems.  
  • There is no point in developing a Cleansing approach for Data Preparation without being able to assign the task of making the changes. 
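To illustrate why ownership has to be defined before cleansing can progress, the Python sketch below routes data quality issues to owners through a simple ownership register. The register, the role names, and the issue records are hypothetical; the point is only that any issue without an owner has nowhere to go.

```python
from collections import defaultdict

# Hypothetical ownership register: data object -> accountable Data Owner.
OWNERS = {
    "customer": "Sales Master Data Lead",
    "material": "Supply Chain Data Lead",
}

# Hypothetical issues raised by data quality checks.
issues = [
    {"object": "customer", "key": "C002", "problem": "missing name"},
    {"object": "material", "key": "M117", "problem": "no base unit of measure"},
    {"object": "vendor", "key": "V045", "problem": "duplicate record"},
]


def assign_issues(issues, owners):
    """Group issues into per-owner worklists and collect those with no owner."""
    worklists = defaultdict(list)
    unowned = []
    for issue in issues:
        owner = owners.get(issue["object"])
        if owner is None:
            # No Data Owner defined: the cleansing task cannot be assigned.
            unowned.append(issue)
        else:
            worklists[owner].append(issue)
    return dict(worklists), unowned


worklists, unowned = assign_issues(issues, OWNERS)
for owner, items in worklists.items():
    print(owner, [i["key"] for i in items])
print("unassigned:", [i["key"] for i in unowned])
```

Here the vendor issue remains unassigned because no owner is registered for vendor data, which is the kind of gap that is better discovered during Data Preparation than during the migration itself.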

Cleanse data at Source 

  • During Data Preparation, the necessary cleansing of data must be executed within the Source system to ensure a single source of truth for the relevant data during Data Quality, Preparation, Migration, and general Governance. Having one standardized source for relevant data will ensure lower complexity in terms of consolidation, design, and even reconciliation.  

Run in parallel

  • Because Data Preparation pre-processes data before the Data Migration project is ready to execute, the timelines of Data Preparation and Data Migration may overlap.
  • Running Data Preparation in parallel allows for the gradual transfer of relevant activities and solution elements onto the migration project at a pace the project can absorb.
  • Alternatively, the Data Preparation can be run as an independent stream (perhaps within the migration project once it starts). In this case, more emphasis should be placed on properly planning the integration between Data Preparation and Data Migration. 

Establish a Data Preparation Landscape (DPL) 

  • The DPL will expand to integrate the Data Migration Landscape when the Data Migration project kicks off. This reduces the need for additional applications, systems, licenses, and so forth. It also allows for visibility and seamless technical integration between the two landscapes.  

By using this recommended approach to Data Preparation, the organisation can materially reduce the risk and effort during the Data Migration project. But there are numerous other possible benefits to be realized:

During the Data Preparation process, the organisation can leverage the learnings and insights for both Project and Business-as-usual benefits. 

Planning stages for the Data Migration project are improved through insights as to how data is used within existing Business Processes (as well as how these will change moving to S/4HANA). Recognising the Target system requirements, for example, allows Business to gain an understanding of what life will be like post-migration and to implement the right processes and solutions in preparation thereof. 

Architecture and solution design will be clearer as a result of these insights, which, by extension, may reduce fit-gap requirements later in the project.

Data Preparation, in general, improves the likely success of the migration project in multiple ways:

  • Reduced complexity and ‘noise’ across project phases
  • Early adoption by business
  • Reduced risk to project timelines and delivery by moving Data Preparation activities (effort) out of the migration project
  • Tighter management of scope and volume through Data Relevancy
  • Clearer dependencies between Data Quality and Data Migration
  • Improved integration planning and visibility of data impacts and dependencies
  • Greater agility to respond to scope changes (e.g. new data sources and business requirements)

Training (the ‘human factor’) cannot be overlooked and should also be addressed early on to support the adoption & ownership of the new systems and processes. Ideally, the relevant resources are included as stakeholders and recipients of training during the project. 

Rules, Data Quality requirements, and Data Ownership enrich the various (ongoing) Data Governance solutions and principles. The resulting benefits include:

  • Reduced overhead in rectifying issues 
  • Maturity in the understanding of data requirements for critical processes 
  • Clarity of design for future data control & management solutions 
  • Reduced data cleansing requirements (continuous improvement) 
  • Enriched rule sets for Data Quality/Data Governance