AWS
❌ Don’t try to cancel a pipeline in EMR while it is running. A running pipeline cannot be canceled; it has to be killed 🔫
There are 2 ways to stop a pipeline, depending on its state:
If the pipeline has not started running in EMR yet, select the step and click Cancel at the top left.
If the pipeline has already started running in EMR, run “yarn application -kill idOfApplication” in the console. To find the idOfApplication, run “yarn application -list”, which lists the IDs of all running applications.
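The second case can be sketched in Python as follows. This is only an illustration, not the team's official tooling: the function names are hypothetical, and the parser assumes that each application row printed by “yarn application -list” starts with an ID of the form application_<timestamp>_<sequence>, so check it against the real output on your cluster.

```python
import re
import subprocess
from typing import List

# Assumption: application rows start with an ID such as application_1700000000000_0001.
APP_ID = re.compile(r"^(application_\d+_\d+)\b")


def parse_application_ids(listing: str) -> List[str]:
    """Extract application IDs from the text printed by `yarn application -list`."""
    ids = []
    for line in listing.splitlines():
        match = APP_ID.match(line.strip())
        if match:
            ids.append(match.group(1))
    return ids


def kill_command(application_id: str) -> List[str]:
    """Build the argument list for `yarn application -kill <id>`."""
    return ["yarn", "application", "-kill", application_id]


def list_running_applications() -> List[str]:
    """Run `yarn application -list` and return the IDs it reports."""
    result = subprocess.run(
        ["yarn", "application", "-list"],
        capture_output=True, text=True, check=True,
    )
    return parse_application_ids(result.stdout)


def kill_application(application_id: str) -> None:
    """Kill one application; raises CalledProcessError if yarn reports a failure."""
    subprocess.run(kill_command(application_id), check=True)
```

On a node where the yarn command is available, calling list_running_applications() and then kill_application() with the chosen ID mirrors the two manual commands. Always confirm that the ID belongs to your own pipeline before killing it, since the cluster is shared.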
❌ Don’t run pipelines that take longer than 20 minutes on the production cluster without consulting your colleagues. Every Data Team member uses the same cluster, so long-running pipelines slow down everyone else’s work. Run long pipelines during breaks or after working hours, or create a new cluster if the job is urgent.