I have two machines. One runs HBase 0.92.2 in pseudo-distributed mode, while the other runs the Nutch 2.x crawler. How can I configure them so that the HBase 0.92.2 machine acts as back-end storage and the Nutch 2.x machine acts as the crawler?
How can I connect Apache Nutch 2.x to a remote HBase cluster?
Asked by zahid adeel
I finally got it working, and it was easy to do. I am sharing my experience here in case it helps someone.
1- Edit hbase-site.xml on the HBase machine for pseudo-distributed mode.
2- MOST IMPORTANT: on the HBase machine, replace the localhost IP in /etc/hosts with the machine's real network IP, like this:
10.11.22.189 master localhost
Here 10.11.22.189 is the HBase machine's IP. (Note: if you don't change the localhost IP on the HBase machine, the remote Nutch crawler won't be able to connect to it.)
3- Copy or symlink hbase-site.xml into $NUTCH_HOME/conf on the Nutch machine.
4- Start your crawler and see it working.
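If the crawler still cannot connect, a quick sanity check of the two files involved may help. The following is only a rough sketch in Python, not part of Nutch or HBase: the helper names (read_hbase_properties, hosts_entries, find_problems, port_open) are made up for illustration. It assumes the ZooKeeper quorum is taken from the hbase.zookeeper.quorum property in hbase-site.xml and flags any quorum host that /etc/hosts maps to a loopback address, which is the situation the /etc/hosts step is meant to avoid.

```python
import ipaddress
import socket
import xml.etree.ElementTree as ET


def read_hbase_properties(site_xml):
    """Return the <property> name/value pairs of an hbase-site.xml document."""
    root = ET.fromstring(site_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def hosts_entries(hosts_text):
    """Map each hostname in /etc/hosts-style text to the first IP listed for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


def is_loopback(address):
    """True for 127.x.x.x / ::1, or for the bare name 'localhost'."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return address == "localhost"


def find_problems(site_xml, hosts_text):
    """List quorum hosts that remote clients would resolve to a loopback address."""
    props = read_hbase_properties(site_xml)
    # Assumption: fall back to "localhost" when the property is not set.
    quorum = props.get("hbase.zookeeper.quorum", "localhost")
    hosts = hosts_entries(hosts_text)
    problems = []
    for host in (h.strip() for h in quorum.split(",")):
        address = hosts.get(host, host)
        if is_loopback(address):
            problems.append(f"{host} resolves to loopback address {address}")
    return problems


def port_open(host, port=2181, timeout=3.0):
    """Check from the Nutch machine that a TCP port (ZooKeeper's 2181 by default) is reachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On the HBase machine you could call find_problems(open("hbase-site.xml", "rb").read(), open("/etc/hosts").read()); an empty list means no quorum host points at loopback. From the Nutch machine, port_open("10.11.22.189") gives a rough idea of whether ZooKeeper is reachable at all, assuming it listens on the default client port 2181.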