Every end product must meet, and ideally exceed, customer expectations. A successful delivery is not just about doing what matters, but also about how it is done: by following and implementing the desired standards.
This post outlines the best practices to consider with Informatica Intelligent Data Management Cloud (IDMC) Cloud Data Integration (CDI) ETL during the following phases:
- Development
- Operations
Development Best Practices
- Check that native datatypes match between the database table DDLs and the IDMC CDI mapping Source, Target, and Lookup objects.
- If any inconsistency in native datatype, precision, or scale is observed in IDMC mappings, edit the metadata to keep the DDL and the CDI mappings in sync.
- In CDI, for workflow parameter values to be consumed by taskflows, a dummy mapping task has to be created in which the list of parameters/variables is defined for further consumption by tasks within the taskflow (e.g., Command task, Email task).
- Make sure to limit the number of dummy mapping tasks during this process.
- The best practice is to create one dummy mapping task per folder to capture all the parameters/variables required for that entire folder.
- For variables whose values need to persist into the next taskflow run, make sure the variable value is mapped to the dummy mapping task via an Assignment task. This dummy mapping task is then used at the start and end of the taskflow so that the overall taskflow supports incremental data processing.
- If audit sessions are expected to run concurrently with other taskflows, ensure that the property “Allow the mapping task to be executed simultaneously” is enabled.
- Avoid the SUSPEND TASKFLOW option: it requires manual intervention during job restarts and can itself cause issues when jobs are restarted.
- Ensure correct parameter notation with the single-dollar ($, system parameters) and double-dollar ($$, user-defined parameters) prefixes. With incorrect notation, CDI will not read the parameters during job runs (see the parameter file sketch after this list).
- While working with flat files in CDI mappings, always enable the property “Retain existing fields at runtime”.
- If a sequence generator is likely to be used in multiple sessions/workflows, make it a reusable (shared) Sequence.
- Use the Sequential File connector type for mappings using Mainframe VSAM Sources/Normalizer.
- If a session is configured with STOP ON ERRORS > 0, set the LINK condition for the next task to “PreviousTask.TaskStatus – STARTS WITH ANY OF 1, 2” within CDI taskflows.
- For mapping task failure flows, set the LINK condition for the next task to “PreviousTask.Fault.Detail.ErrorOutputDetail.TaskStatus – STARTS WITH ANY OF 1, 2” within CDI taskflows.
- Partitions are not supported for sources in Query mode. As a workaround, create multiple sessions and run them in parallel.
- Currently, parameterization of the schema/table is not possible for mainframe DB2. Use an ODBC-type connection to access DB2 when schema/table parameterization is needed.
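To make the single-dollar/double-dollar distinction concrete, below is a minimal sketch of a CDI parameter file, assuming the section-based file format; the project, folder, task, and parameter names are illustrative, not taken from any real project.

```
#USE_SECTIONS
[Global]
$$LoadStartDate=2024-01-01
[MyProject].[MyFolder].[mt_Dummy_Params]
$$TargetSchema=SALES_DW
```

Here $$LoadStartDate and $$TargetSchema are user-defined parameters and therefore take the double-dollar prefix, while built-in system parameters such as $PMSessionLogDir keep the single-dollar prefix. Swapping the two notations is exactly the mistake that leaves parameters unread at run time.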
Operations Best Practices
- Use the Verbose Data session log configuration only if absolutely required, and then only in lower environments.
- Ensure the sessions pick up the parameter values properly during job execution.
- This can be verified by deliberately changing the parameter names and values to incorrect ones and checking whether the job fails during execution. If the job fails, the parameters are being read correctly by the CDI sessions.
- Ensure the taskflow name and API name always match. If they differ, the job will run into issues when executed via the runAJobCli utility from the command prompt.
- CDI does not store logs on the cloud beyond 1000 mapping task runs in 3 days (it does store logs on the Secure Agent). To retain cloud job run stats, create audit tables and use the Data Marketplace utility to load the audit info (volumes processed, start/end times, etc.) into those tables, scheduling the job at regular intervals (hourly or daily).
- To ensure generic restartability works without issues during operations, introduce a dummy Assignment task whenever the code contains a custom error-handling flow.
- To facilitate SKIP FAILED TASK and RESUME FROM NEXT TASK operations, append an additional condition, “Mapping task.Fault.Detail.ErrorOutputDetail.TaskStatus=1”, to every LINK condition.
- If mapping task log file names are to be suffixed with the concurrent-run workflow instance name, do it within the parameter file; this is not possible at the IDMC mapping task configuration level due to parameter concatenation issues.
- Copy mapping task log files to a safe location on the Secure Agent server after each job run, since IDMC does not honour the “Save Session log for these runs” property set at the mapping task level when the session log file name is parameterized (a copy-job sketch follows this list).
- Ensure the session log file directory does not contain a slash (/) when used together with parameters (e.g., $PMSessionLogDir/ABC) under the Session Log Directory path. When it does, every run’s log is appended to the same log file.
- Concurrent runs cannot be performed on taskflows from the CDI Data Integration UI. Use the ParamSet utility to upload concurrent paramsets and the runAJobCli utility to run taskflows with multiple concurrent run instances from the command prompt, as sketched below.
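As a rough sketch of such a command-line run, a taskflow can be triggered from the Secure Agent host with the runAJobCli utility. The install path, script name, and flags below are assumptions for illustration; verify the exact syntax against the utility’s help output for your agent version.

```
cd /opt/infaagent/apps/runAJobCli
./cli.sh runAJobCli -u <idmc_user> -p <password> -n MyTaskflow_API_Name -t TASKFLOW
```

This is also where the taskflow name/API name mismatch noted earlier bites: the name supplied on the command line is resolved against the API name, so keeping the two identical avoids surprises.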
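And as a minimal sketch of the log-copy practice above, a small scheduled job on the Secure Agent server can archive recent mapping task logs; the source and destination directories here are assumptions and will differ per agent installation.

```
# Copy session/mapping task logs modified in the last hour to an archive folder
find /opt/infaagent/apps/Data_Integration_Server/logs -name "*.log" -mmin -60 \
  -exec cp -p {} /archive/cdi_session_logs/ \;
```

Scheduling this via cron (or Task Scheduler on Windows) right after the regular job windows keeps a durable copy even when the cloud-side retention rolls over.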
Conclusion
In addition to coding best practices, following these development and operations best practices will help avoid rework and save effort, thereby achieving customer satisfaction with the delivery.