Shameek Agarwal

Elasticsearch

Introduction: Elasticsearch is open source. We interact with Elasticsearch using a REST API and JSON, which makes it easy to work with. Elasticsearch is written in Java and uses Apache Lucene underne...

Spark

Introduction: Spark was developed at UC Berkeley as an improvement over Hadoop. It borrowed concepts from Hadoop but now works independently of it. Unlike Hive etc., Spark doesn't convert into the map...

Java

Object Oriented Programming Basics: A Java program can have any number of classes. The classes can have any name, and the Java program can have any name. However, only one public class...

Docker and Kubernetes

About Docker: Docker is a tool for managing containers. A container is a package of our code along with the dependencies and libraries needed to run that code. Docker follows a client-server architectu...

Hadoop

Introduction: 3 Vs of big data - these are the reasons why big data was needed in the first place. Data volume - as the resolution of cameras has increased, so has the size of the media...

Messaging Systems

Kafka Setup: Note - the environment should have Java 8+ installed. Download the archive from here. Extract it - tar -xzf kafka_2.13-3.5.0.tgz. Note - the 2.13… here is not the Kafka version, but the Scala ver...

Snowflake

Snowflake: We can create and run queries inside worksheets. We can see the snowflake_sample_data database by default, with some sample data. select * from snowflake_sample_data.tpch_sf1.custo...

Relational Databases

Downsides of File Based Systems: Data redundancy - data is repeated at different places. Data inconsistency - a data update at one place might not be reflected at another place. Difficult data acce...

Spring

Java and Maven Installation Steps (Ubuntu): Java 17 is needed for Spring Framework 6 / Spring Boot 3. Download the deb file from here. Run sudo apt install ./jdk-17_linux-x64_bin.deb. Download bi...