I am downloading Twitter data into HDFS using Flume. Although I have more than 2 GB of data, the files written to HDFS are much smaller than 64 MB: for example, the first file is 300 KB and the second is 566 KB. Why is that happening?
Why are chunk files split even though the file size is not 64 MB?
73 Views Asked by Sakthi At
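That is because of your Flume configuration. The Flume HDFS sink decides when to close a file and start a new one from its own roll settings, not from the HDFS block size, so small files mean one of those triggers is firing early.
Check the HDFS sink settings: you will have to set hdfs.rollInterval or hdfs.rollSize.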
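As a rough sketch of what that could look like, the fragment below rolls files by size only. The agent name `TwitterAgent` and the sink name `HDFS` are placeholders, not taken from the question, so substitute the names from your own agent file. To my knowledge the sink defaults are a 30-second roll interval, a 1024-byte roll size and a 10-event roll count, which would explain files of a few hundred KB; setting a trigger to 0 disables it.

```properties
# Hypothetical agent/sink names (TwitterAgent, HDFS); replace with your own.
TwitterAgent.sinks.HDFS.type = hdfs

# Disable time-based rolling (0 = never roll on elapsed seconds).
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 0

# Disable event-count-based rolling (0 = never roll on number of events).
TwitterAgent.sinks.HDFS.hdfs.rollCount = 0

# Roll when the file reaches 64 MB (64 * 1024 * 1024 bytes).
TwitterAgent.sinks.HDFS.hdfs.rollSize = 67108864
```

With time and count rolling both disabled, a file stays open until it reaches the size limit, so on a slow feed you may prefer to keep a non-zero hdfs.rollInterval as an upper bound on how long a file remains open.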