Troubleshooting: Cloudera Hue not able to access Solr collections through Search tab assuming both are set up with CDH4
Posts on 'Hadoop'
You are not so strong with SQL or you are not good at programming? And you need to create distributed scalable search on a very large dataset stored in HBase? And you need to achieve NRT (Near Real Time) indexing? Cloudera search along with Lily Hbase Indexer is there to rescue you!
It is affectionately said that what Oracle is to Relational Database, Cloudera is to Hadoop. Most of the Hadoop aspirants, at the beginning of their Hadoop development learning curve, fiddle with the setting up of CDH, some able to do it smoothly (Cloudera has put up an incredibly exhaustive installation guide), some requires to really sweat it out (few finer details and prerequisites are either missing or not enough emphasized upon) and very few actually gives it up losing their way in the verbose and cover-all-cases installation guide(multiple way of set up and different set of instructions for different linux OS; sometimes too much of variations for impatient starters). So there is a target audience for one more set up document which is leaner, meaner and streamlined with only one (the most preferred) variation. Try it at home!!
Hadoop Hands on - A POC Covering HDFS API, MapReduce, JSON and AVRO SerDe, HBase API With FuzzyRowFilter usage
My learning phase with Hadoop is still continuing. During this phase what I found is a great lack of a comprehensive POC which covers at least a few prominent Hadoop technologies. My POC can fill up that void. After having set up CDH4.7 in my laptop, I completed implementation of this POC touching HDFS API, MapReduce, JSON and AVRO SerDe, HBase.
Apache Hadoop Development Tools (HDT) is still in development phase. So, no official distribution of Hadoop 2.2.0 Eclipse Plugin is available now. But we can build the same using winghc/hadoop2x-eclipse-plugin. In this post, we'll build, install and configure the plugin with the Eclipse or any Eclipse based IDE (say, Spring Tool Suite) to ease the development activities using Hadoop framework.
Shell$ExitCodeException - Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/service/CompositeService
If you are getting "Exception from container-launch:org.apache.hadoop.util.Shell$ExitCodeException" in FAILED application's Diagnostics (or Command prompt) and "java.lang.NoClassDefFoundError: org/apache/hadoop/service/CompositeService" in 'stderr' containerlogs while running any Hadoop example on Windows, then add all the required Hadoop jars to the property 'yarn.application.classpath' in yarn-site.xml configuration file.
In this post, we'll use HDFS command 'bin\hdfs dfs' with different options like mkdir, copyFromLocal, cat, ls and finally run the wordcount MapReduce job provided in %HADOOP_HOME%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.2.0.jar. On successful execution of the job in the Single Node (pseudo-distributed mode) cluster, an output (contains counts of the occurrences of each word) will be generated.
Maven Build Failure - Hadoop 2.2.0 - [ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
In the previous post on Build, Install, Configure and Run Apache Hadoop 2.2.0 in Microsoft Windows OS, many people have encountered Maven build failure issue ("[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found") for Apache Hadoop Auth project. So thought of sharing the fix as a separate post.
If we directly take the binary distribution of Apache Hadoop 2.2.0 release and try to run it on Microsoft Windows, then we'll encounter ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path.In the previous post - Build, Install, Configure and Run Apache Hadoop 2.2.0 in Microsoft Windows OS, I have already described how to build Windows distribution of Apache Hadoop 2.2.0. But if you are feeling little bit lazy to perform all the lengthy steps described there and want to get started with Hadoop quickly by-passing those steps, then this is the post worth looking into.
Good news for Hadoop developers who want to use Microsoft Windows OS for their development activities. Finally Apache Hadoop 2.2.0 release officially supports for running Hadoop on Microsoft Windows as well. But the bin distribution of Apache Hadoop 2.2.0 release does not contain some windows native components (like winutils.exe, hadoop.dll etc). As a result, if we try to run Hadoop in windows, we'll encounter ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path.
In this article, I'll describe how to build bin native distribution from source codes, install, configure and run Hadoop in Windows Platform.