Big Data and NoSQL Tutorials

Apache Kafka Architecture and Design

Kafka is distributed: it runs as a cluster of nodes called brokers, and brokers can even span multiple datacenters. A distributed system of this kind is horizontally scalable and fault-tolerant.

Introduction to Apache Zookeeper

Zookeeper is a centralized open-source service for maintaining configuration information, naming, and synchronization in a distributed cluster environment.

Download and Install Apache Zookeeper on Ubuntu

In this article we will see how to download and install Zookeeper on Ubuntu. We will install Zookeeper on a single node; here are the steps you need to follow to start ...

Setup Apache Zookeeper multi node cluster on Ubuntu

In this article we will create an Apache Zookeeper cluster on three Ubuntu machines, each with 1 GB of RAM.

Multi-Broker Apache Kafka + Zookeeper Cluster Setup

In this article we will set up an Apache Kafka multi-broker cluster on three Ubuntu machines, each with 4 GB of RAM.

How to write a Kafka producer in java - Example

In previous articles we saw how to set up a multi-broker Apache Kafka cluster and Zookeeper. In this article we will see how to write a Kafka producer in Java that writes data to a Kafka cluster.

How to write a Kafka Consumer in java - Automatic offset commit

In the previous article we saw how to write a Kafka producer in Java. In this article we will see how to write a Kafka consumer in Java that reads data from a Kafka cluster with automatic offset commits.

How to write a Kafka Consumer in java - Manual offset commit

In a previous article we saw how to write a Kafka producer in Java. In this article we will see how to write a Kafka consumer in Java that reads data from a Kafka cluster with manual offset commits.

How to write a Kafka Consumer in java - assignable to a specific partition

In a previous article we saw how to write a Kafka producer in Java. In this article we will see how to write a Kafka consumer in Java that can be assigned manually to a specific partition.

Big Data Analytics with Apache Spark

Apache Spark is a fast, general-purpose cluster computing system that provides high-level APIs in Java, Scala, Python and R.

Install Apache Spark on Ubuntu

In this article we will see how to set up Apache Spark on Ubuntu machines by building it from source code.

Apache Spark Multi-Node cluster setup on Ubuntu (Standalone mode)

An Apache Spark multi-node cluster can be set up using a cluster manager such as Hadoop YARN, Apache Mesos, or Spark's own standalone cluster manager. In this article we will see how to set up an Apache Spark cluster on Ubuntu machines using the simple standalone cluster manager.

How to create Spark Java Application and Submit it to Spark Cluster

In this article we will see how to create a Spark Java application and submit it to a Spark cluster for execution. We will create a Maven Java application using the Spark Java API.

What is Spark SQL, SQLContext, SparkSession & DataFrames and Datasets

In this article we will see what Spark SQL, SQLContext and SparkSession are, and how to create and use SQLContext and SparkSession in Spark.

DataFrames in Apache Spark - Java Spark API

In this article we will see what DataFrames are in Apache Spark, how to create them, and which operations they support in the Spark Java API. We will also look at how to create a DataFrame from different sources such as an RDD, Java Lists, JSON and MySQL.

Datasets in Apache Spark - Java Spark API

In this article we will see what Datasets are in Apache Spark, how to create them, and how to operate on them using the Spark Java API.

Apache Spark Dataset Joins - Java API

In this article we will see how to join two datasets in Spark with the Java API, the different types of joins available in Spark, and the differences between them, with sample Java code.

Apache Redis - Standalone Setup on Ubuntu

Redis is an open-source in-memory data store, used as a database, cache and message broker. It offers rich data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs and geospatial indexes with radius queries. Redis also provides built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, with high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

Apache Redis Cluster Setup on Ubuntu

Redis Cluster provides automatic sharding and data replication across multiple nodes, along with some degree of fault tolerance: the cluster keeps working as long as a majority of its nodes are up and running.

A Redis Cluster node uses two ports: the client port (for example 7000) and an internal communication port, which by default is the client port + 10000 (17000 in this example). Make sure both ports are open, adjusting server security and firewall settings if needed.
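
The port rule above can be sketched in a few lines of Java. This is only an illustration and is not taken from the article itself: the class name RedisClusterPorts and its method are hypothetical, and the sketch assumes the default offset of 10000 between the client port and the internal communication port.

```java
// Hypothetical helper illustrating the port rule described above.
final class RedisClusterPorts {
    // Assumed default offset between the client port and the internal port.
    static final int BUS_PORT_OFFSET = 10000;

    // Returns the internal communication port for a given client port.
    static int busPort(int clientPort) {
        int bus = clientPort + BUS_PORT_OFFSET;
        if (clientPort <= 0 || bus > 65535) {
            throw new IllegalArgumentException("client port out of range: " + clientPort);
        }
        return bus;
    }

    public static void main(String[] args) {
        int clientPort = 7000; // example client port from the text
        System.out.println("Open ports " + clientPort + " and " + busPort(clientPort));
    }
}
```

A helper like this could be used to generate firewall rules for every node, since each node needs both of its ports reachable by the other nodes.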

How to configure Apache Redis Master-Slave on Ubuntu

To maintain fault tolerance we adopt a master-slave model, in which a master node has 1 to N complete replicas; the more replicas, the greater the fault tolerance.

Spring Data with Standalone Redis Using Jedis Client (Spring + Redis Template)

In previous articles we saw how to set up a Redis standalone server on Ubuntu. In this article we will set up Spring Data with standalone Redis using the Jedis client.

How to use ByteBuffer to get and set data on Apache Redis

In this article we will see how to save memory on Redis by storing data as byte buffers. The Java application gets and sets key-values on Redis as byte arrays, converting each byte array to and from a ByteBuffer.
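
The byte array to ByteBuffer conversion mentioned above might look roughly like the following. This is a minimal sketch using only java.nio, not the article's own code: the class ByteBufferCodec and the (id, count) record layout are hypothetical, and the Redis calls themselves are left out; the resulting byte arrays would be handed to whichever Redis client methods accept binary keys and values.

```java
import java.nio.ByteBuffer;

// Hypothetical codec: packs a (long id, int count) pair into a fixed 12-byte array.
final class ByteBufferCodec {
    // Writes both fields into a heap buffer and returns its backing array.
    static byte[] encode(long id, int count) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES + Integer.BYTES);
        buffer.putLong(id).putInt(count);
        return buffer.array();
    }

    // Reads the long stored in the first 8 bytes.
    static long decodeId(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Reads the int stored after the long (absolute read at offset 8).
    static int decodeCount(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt(Long.BYTES);
    }

    public static void main(String[] args) {
        byte[] value = encode(1234567890123L, 42);
        System.out.println("encoded length = " + value.length);
        System.out.println("id = " + decodeId(value) + ", count = " + decodeCount(value));
    }
}
```

The memory saving comes from the fixed binary layout: the long above takes 8 bytes here, whereas its decimal string form would take 13.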

Introduction to Apache Solr 4.0 with Apache Tomcat

In this blog we will look at the features of Solr 4.0 and see why Solr is a useful search server for full-text search. Solr is a simple, configuration-based implementation of full-text search built on the Lucene libraries.

Apache Solr 4.0 with Apache Tomcat 7 in Windows 7

In this blog I will show you how to integrate Solr 4.0 with Apache Tomcat 7 in a Windows environment. Before we start, I recommend reading the introduction to Solr 4.0.

Apache Solr 4.0 with Apache Tomcat 7 in Ubuntu Linux

In this blog I will show you how to integrate Solr 4.0 with Apache Tomcat 7 in a Linux environment. Install Tomcat on your machine and make sure it is ready to start.

How to integrate Highlighting in Search Results Using Apache solr4 and apache tomcat

In this blog we will look at a very useful feature of Solr 4: highlighting search keywords in the search results. In Solr 4, highlighting can be configured in the request URL as well as in solrconfig.xml.

How to use the Solr Data Import Handler to index a MySQL database table

In this article we will go through importing and indexing MySQL database table data in Solr 4 using the Data Import Handler. Now it's time to give Solr 4 some data to search.

BigData Technologies and NoSql - What is Apache Hadoop, MapReduce, HDFS, Hive and Pig

From our previous discussion we learned that big data is a term used to describe the rapid growth and maintainability of both structured and unstructured data.

What is Big Data? Introduction and Analysis of Big Data

In this blog we will give a brief introduction to Big Data and look at the factors and statistics behind it. Let's start with some real-life examples and scenarios from around us.

Basic Linux(ubuntu) terminal commands

Basic Linux (Ubuntu) terminal commands: file system commands, making or deleting a directory, checking the status of running processes and occupied ports, and changing the permissions of a file or folder.

What is cloud computing - Introduction to AWS (Amazon Web Services)

In this blog I will give you a short introduction to cloud computing. Today, with IT spreading rapidly and application development at its peak, cloud computing has become a useful part of the computing world.

What is Groovy - Difference between Groovy and Java

Groovy is known as a scripting language for the Java platform, and many Groovy users take advantage of its flexible nature. Groovy is often described as a modern alternative to Java or, as some would say, a better Java.

Integrate Neo4j (Graph DB) with Java - Neo4J with Java (Neo4J + Java)

In this article we will see what Neo4J and graph databases are and how to use Neo4J with Java: adding nodes, relationships, properties and much more.

MongoDB Crud Operations (Create, Retrieve, Update and Delete in MongoDB)

We will discuss how to create a database in MongoDB, how to create a collection (the MongoDB equivalent of a table), how to insert data into a collection, and how to update and delete data in a collection.

Install Setup and Start MongoDB on Windows

In this blog we will see how to set up and install MongoDB in a Windows environment. First we will walk through a quick and easy setup to get started with MongoDB.
