Overview
The goal of this way of working is to keep things as isolated as possible, which means creating a new entity only if one does not already exist. Below, we wrap up what we discussed above and illustrate it with real examples.
Platform and S3 Linkage
To keep even more control over what we do, we link the platform to S3. The point is to navigate projects, file creation, and dashboards with ease, without having to search the whole platform. Ideally, each project in the platform represents a folder in S3. In our standardized output this looks as follows: s3://prime-data-lake/production/client/vdh/
Platform project                                            S3 path
Standardized Output                                         /standardized_output/
Standardized Output > Raw                                   /standardized_output/raw/
Standardized Output > Raw > Store Mapper (pipe)             /standardized_output/raw/store_mapper
Standardized Output > Retail Template                       /standardized_output/retail_template/
Standardized Output > Product 360                           /standardized_output/product_360/
ML Solutions > Promotion Effectiveness                      /solutions/promotion_effectiveness/
ML Solutions > Promotion Effectiveness > Input Prep         /solutions/promotion_effectiveness/input_prep
Data Quality Assurance > Retail Template > Point of Sale    /standardized_output/retail_template/DQA/point_of_sale
Ideally, the mapping looks like the table above, but it is normal for some items to fall outside these conventions.
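The project-to-folder mapping above can be sketched as a small lookup. This is only an illustration, not part of the platform: the names `BASE_URI`, `PROJECT_TO_S3`, and `s3_uri_for` are hypothetical, and the sketch assumes that the folder paths in the table are relative to the base location s3://prime-data-lake/production/client/vdh/ mentioned earlier.

```python
# Assumption: the table paths are relative to this base location.
BASE_URI = "s3://prime-data-lake/production/client/vdh/"

# Platform project path -> S3 folder, copied as-is from the table above.
PROJECT_TO_S3 = {
    "Standardized Output": "/standardized_output/",
    "Standardized Output > Raw": "/standardized_output/raw/",
    "Standardized Output > Raw > Store Mapper (pipe)": "/standardized_output/raw/store_mapper",
    "Standardized Output > Retail Template": "/standardized_output/retail_template/",
    "Standardized Output > Product 360": "/standardized_output/product_360/",
    "ML Solutions > Promotion Effectiveness": "/solutions/promotion_effectiveness/",
    "ML Solutions > Promotion Effectiveness > Input Prep": "/solutions/promotion_effectiveness/input_prep",
    "Data Quality Assurance > Retail Template > Point of Sale": "/standardized_output/retail_template/DQA/point_of_sale",
}


def s3_uri_for(project_path: str) -> str:
    """Return the full S3 URI for a project path such as 'A > B > C'."""
    # Normalize spacing around the '>' separators before the lookup.
    key = " > ".join(part.strip() for part in project_path.split(">"))
    if key not in PROJECT_TO_S3:
        # Projects outside the convention are expected; fail loudly, do not guess.
        raise KeyError(f"No S3 folder registered for project {key!r}")
    return BASE_URI + PROJECT_TO_S3[key].lstrip("/")


if __name__ == "__main__":
    print(s3_uri_for("Standardized Output > Raw"))
```

An explicit lookup is used instead of deriving the folder from the project name, because some rows in the table (for example the DQA folder) do not follow a purely mechanical naming rule.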
Standardized Output Graph
A newly created project should have at least these components, where:
The square represents a platform project (or subproject).
The name inside the square represents the project name.
The italic text under the name represents the corresponding S3 path.
The soft rectangle represents project components (pipelines, dashboards, etc., but not subprojects).

This can also serve as a starting-point template for a newly created project.