Stream Processing Motivation popular spark cloud platform - databricks popular spark on prem platform - cloudera setup for databricks and local is already covered here when our iterations...
Spark Advanced
Python Data Analysis
Jupyter Notebooks jupyter - web based environment for notebook documents allows you to have python code along with headings, charts, tables, etc the entire flow - there is a notebook...
Python Basics
Getting Started python3 vs python2 - python3 is not backwards compatible - a lot of big changes were made it addresses a lot of issues with python2, and the world has moved to python3 now c...
Data Engineering
Different Architectures oltp - online transactional processing used for operational data keeping we do not maintain a history of the data - we update records in place to ...
Java Multithreading
Concepts there are two benefits of multithreading - responsiveness and performance repeat - remember multithreading gives both of the benefits above concurrency means performing different task...
High Level Design
Software Architecture what is software architecture - high level design - hide implementations and express in terms of abstractions of the different components and how th...
Low Level Design
Some Principles dry - don’t repeat yourself - the code should be changed in a single place only yagni - you aren’t gonna need it - do not introduce / foresee features you will not need in fut...
Elasticsearch
Introduction elasticsearch is open source we interact with elasticsearch using rest api and json, making it easy to work with elasticsearch is written in java and uses apache lucene underne...
Spark
Introduction spark - developed at uc berkeley as an improvement of hadoop it borrowed concepts from hadoop but now works independently of it unlike hive etc, spark doesn’t convert into the map...
Java
Object Oriented Programming Basics a java program can have any number of classes. the classes can have any name and the java program can have any name however, only one public class...