Introduction: we typically use Apache Airflow for organization - set the order of tasks, make sure the next task starts only after the previous one has completed successfully, and control the schedulin...
Airflow
Kubernetes Advanced
Helm Getting Started: Docker Desktop directly has an option to enable Kubernetes; this starts a single-node Kubernetes cluster when the Docker Desktop app is started; point kubectl to the righ...
DBT
DBT Introduction: the initial architecture was to load the transformed data into warehouses using ETL; the recent addition of cloud data warehouses led to the promotion of ELT - blast the data into the...
Spark Advanced
Stream Processing Motivation: popular Spark cloud platform - Databricks; popular Spark on-prem platform - Cloudera; setup for Databricks and local is already covered here; when our iterations...
Python Data Analysis
Jupyter Notebooks: Jupyter - web-based environment for notebook documents; allows us to have Python code along with headings, charts, tables, etc.; the entire flow - there is a notebook...
Python Basics
Getting Started: python3 vs python2 - python3 is not backwards compatible - a lot of big changes were made; it addresses a lot of issues with python2, and the world has moved to python3 now; c...
Data Engineering
Different Architectures: OLTP - online transaction processing; used for operational data keeping; we do not maintain a history of the data - we update records in place to ...
Java Multithreading
Concepts: there are two benefits of multithreading - responsiveness and performance; repeat - remember multithreading gives both the features above; concurrency means performing different task...
High Level Design
Software Architecture: what is software architecture - high-level design - hide implementations and express in terms of abstractions of the different components and how th...
Low Level Design
Some Principles: DRY - don’t repeat yourself - the code should be changed in a single place only; YAGNI - you aren’t gonna need it - do not introduce / foresee features you will not need in fut...