I need some help running a MapReduce program/job in the Cloudera Docker container.
I am using a high-spec Linux (elementary OS) laptop (24 GB RAM, i7 processor).
I was able to install the Cloudera Docker image and run it, and I did the following without issues:
1. I see the # prompt and can run HDFS commands (hadoop fs -ls), though it doesn't return anything.
2. I can access the Hue editor.
3. I can run Cloudera Manager and start all services.
4. In my local environment (not inside the Docker container), I created a WordCount MapReduce program (JAR) and downloaded all Maven dependencies for it.
Now my questions are:
How do I submit this WordCount JAR to the running Docker container?
How do I run this MapReduce program/job (WordCount) against a text file uploaded to HDFS?
How to Execute MapReduce Job/JAR with Cloudera Quickstart Docker container
Asked by Srikanth (857 views)
There is 1 best solution below
If you start your container with a port mapping for port 8888, you will be able to access Hue, which includes a file browser. That makes it easy to put files into HDFS on your cluster.
To launch a MapReduce job, you will need to copy your JAR into the container. Cloudera doesn't provide any volumes in its container (at least, none documented here: http://www.cloudera.com/documentation/enterprise/latest/topics/quickstart_docker_container.html), so this can be challenging. Maybe you can try adding it via scp.
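As an alternative to scp, copying the JAR with `docker cp` and running the job through `docker exec` might look like the sketch below. It is only a sketch under assumptions: the container name `cloudera-quickstart`, the local paths, the main class `WordCount`, and the HDFS directories under `/tmp/wordcount` are all hypothetical placeholders, not values from the question. By default the script only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All values below are hypothetical placeholders; adjust them to your setup.
CONTAINER="${CONTAINER:-cloudera-quickstart}"   # name or ID shown by `docker ps`
JAR="${JAR:-target/wordcount.jar}"              # JAR built locally with Maven
MAIN_CLASS="${MAIN_CLASS:-WordCount}"           # needed only if the JAR manifest has no Main-Class
INPUT_FILE="${INPUT_FILE:-input.txt}"           # local text file to count words in
HDFS_IN="${HDFS_IN:-/tmp/wordcount/input}"      # assumed HDFS input directory
HDFS_OUT="${HDFS_OUT:-/tmp/wordcount/output}"   # must not exist before the job runs

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

submit_wordcount() {
  # Copy the JAR and the input file from the host into the container.
  run docker cp "$JAR" "$CONTAINER:/tmp/wordcount.jar"
  run docker cp "$INPUT_FILE" "$CONTAINER:/tmp/input.txt"
  # Upload the input file to HDFS from inside the container.
  run docker exec "$CONTAINER" hadoop fs -mkdir -p "$HDFS_IN"
  run docker exec "$CONTAINER" hadoop fs -put -f /tmp/input.txt "$HDFS_IN/"
  # Submit the job, then print the reducer output files.
  run docker exec "$CONTAINER" hadoop jar /tmp/wordcount.jar "$MAIN_CLASS" "$HDFS_IN" "$HDFS_OUT"
  run docker exec "$CONTAINER" hadoop fs -cat "$HDFS_OUT/part-*"
}

submit_wordcount
```

The input file could just as well be uploaded through the Hue file browser, in which case the second `docker cp` and the `hadoop fs -put` step can be dropped.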
I build some Cloudera containers myself, one container per node type (masternode, datanode, edgenode), and I added a volume to the edgenode because it seemed like a good thing to provide. You can find my container on Docker Hub: https://hub.docker.com/r/loicmathieu/cloudera-cdh-edgenode/
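To illustrate the volume idea in general terms, a host directory can be bind-mounted when the container is started, so the JAR is visible inside it without copying. The sketch below is an assumption, not the setup of the edgenode image above: it uses the `cloudera/quickstart` image with the start command from Cloudera's QuickStart documentation, and the container name, host directory, and mount point `/jobs` are hypothetical. It prints the command by default; set `DRY_RUN=0` to run it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values; adjust to your setup.
HOST_DIR="${HOST_DIR:-$PWD/target}"   # host directory containing the built JAR
MOUNT_POINT="${MOUNT_POINT:-/jobs}"   # where that directory appears inside the container
NAME="${NAME:-cloudera-quickstart}"   # container name used for later `docker exec` calls

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

start_with_volume() {
  # -p 8888:8888 exposes Hue; -v bind-mounts the host directory into the container.
  run docker run --name "$NAME" \
    --hostname=quickstart.cloudera --privileged=true -t -i \
    -p 8888:8888 \
    -v "$HOST_DIR:$MOUNT_POINT" \
    cloudera/quickstart /usr/bin/docker-quickstart
}

start_with_volume
```

With such a mount in place, a JAR in the host directory could be submitted from the container prompt with `hadoop jar` using its path under the mount point.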