Pipelines

Production and Development branches on pipelines

As discussed earlier, VDH lets you add new branches to a pipeline. Branches were previously used as a versioning technique, but versions are now visible inside the pipeline itself, so we no longer create branches for new versions.

A pipeline should never have more than two branches: Production and Development.

The Production branch should always be ready for execution, so do not test on it. Test any change on the Development branch, and copy (pull) it to the Production branch only once you are sure the change is final.

So, in conclusion, this is not a good way to create branches:

And this is the way you should try to use them:

Please note that you may add a date to the name of the branch, but always make sure that there is a clear distinction between the Production and Development branches.
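The branch rules above can be sketched as a small check. This is only an illustration, not a VDH feature: the function `check_branches`, the name pattern, and the date format (`YYYY-MM-DD` after an underscore, space, or hyphen) are assumptions made for the example.

```python
import re

# Hypothetical convention (an assumption, not a VDH rule): a branch name is
# "Production" or "Development", optionally followed by a date such as
# "_2024-01-31".
BRANCH_PATTERN = re.compile(
    r"^(Production|Development)(?:[_ -]\d{4}-\d{2}-\d{2})?$"
)


def check_branches(names):
    """Return a list of problems found in a pipeline's branch names."""
    problems = []
    if len(names) > 2:
        problems.append(f"{len(names)} branches found, expected at most 2")

    kinds = []
    for name in names:
        match = BRANCH_PATTERN.match(name)
        if match is None:
            # The name does not clearly say Production or Development.
            problems.append(f"unclear branch name: {name!r}")
        else:
            kinds.append(match.group(1))

    for kind in ("Production", "Development"):
        if kinds.count(kind) > 1:
            problems.append(f"more than one {kind} branch")
    return problems
```

For example, `check_branches(["Production", "Development_2024-01-31"])` returns an empty list, while a pipeline with branches named `v1`, `v2`, and `v3` is reported both for having too many branches and for unclear names.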

Use numbers when naming pipelines

Pipelines often need to be executed in a specific order. In that case, it is a good habit to name them with numbers that follow the execution order. This makes it easier for other people to follow how the data is processed overall.

VDH allows sorting by name, so provided that we have numbered the projects, the running order is also easier to recognize.
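One detail worth checking when numbering names is zero-padding. The sketch below assumes that names are sorted as plain text (how VDH actually sorts is not confirmed here); under that assumption, unpadded numbers stop matching the execution order as soon as there are ten or more pipelines. The helper `numbered_name` and the example labels are hypothetical.

```python
def numbered_name(order, label, width=2):
    """Build a pipeline name with a zero-padded execution-order prefix."""
    return f"{order:0{width}d}_{label}"


# Unpadded prefixes: a plain text sort puts "10_..." before "1_..." and "2_...".
unpadded = ["1_import", "2_clean", "10_export"]
print(sorted(unpadded))  # ['10_export', '1_import', '2_clean']

# Zero-padded prefixes keep text order and execution order identical.
padded = [
    numbered_name(1, "import"),
    numbered_name(2, "clean"),
    numbered_name(10, "export"),
]
print(sorted(padded))  # ['01_import', '02_clean', '10_export']
```

If the sorting turns out to be number-aware, padding is harmless, so it is a safe default either way.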

Notes on pipelines

It is very important to include explanations and notes describing the specific processes and steps taken in a pipeline, for example whether a process is client-specific or a general rule, so that everyone can understand why it is there and how it is applied. Notes can cover the engineering steps as well as the business knowledge behind them.

Duplicating importers when not using versions

It is often necessary to run tests from importers that are versioned. Avoid inserting a hard-coded date into the main importer, because we might forget to switch it back to the original version. Instead, it is good practice to always add a separate new importer and load the respective hard-coded path there for testing.

When to push and when to commit

Once we have made changes in the pipeline, we save that version by clicking Commit. When the pipeline is run directly from VDH, the changes are reflected; if we make changes and run the pipeline directly, the Commit is applied automatically. However, when the pipeline is part of a scheduled job that runs automatically, we need to make sure that we have Pushed the pipeline, because the job only takes into account the latest Pushed version.

There is also an option to leave a comment on the pushed version, which makes it easier to find when scrolling through the activity log on the right. Apart from the version comment, you may also give the pushed version a tag name.
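The commit and push rules described in this section can be summarized with a toy model. It is not VDH's API: the class `PipelineModel` and all of its methods are hypothetical, and the model assumes that a push publishes the latest committed version. It only shows which version each kind of run would pick up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineModel:
    """Toy model of the commit/push rules (hypothetical, not VDH's API)."""

    pending: Optional[str] = None    # edited but not yet committed
    committed: Optional[str] = None  # latest committed version
    pushed: Optional[str] = None     # latest pushed version
    push_comment: str = ""

    def edit(self, change: str) -> None:
        self.pending = change

    def commit(self) -> None:
        if self.pending is not None:
            self.committed = self.pending
            self.pending = None

    def push(self, comment: str = "") -> None:
        # Assumption: pushing publishes the latest committed version.
        if self.committed is None:
            raise ValueError("nothing committed yet")
        self.pushed = self.committed
        self.push_comment = comment

    def run_directly(self) -> Optional[str]:
        # A direct run applies the commit automatically.
        self.commit()
        return self.committed

    def run_scheduled(self) -> Optional[str]:
        # A scheduled job only sees the latest pushed version.
        return self.pushed


pipeline = PipelineModel()
pipeline.edit("v1")
pipeline.commit()
pipeline.push("first release")
pipeline.edit("v2")

print(pipeline.run_directly())   # v2
print(pipeline.run_scheduled())  # v1, because v2 has not been pushed
```

In this model, the scheduled job keeps running `v1` until `push` is called again, which mirrors the reason for pushing before relying on a scheduled job.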
