If I want to ingest data from a spoolDir that contains gzip files, what do I need to change on the source side of the Flume agent? Is a customized EventDeserializer enough, or do I also need a new source type (e.g., a customized GzipSpoolDirectorySource instead of the default spooldir)?
Does anyone know how to read gzip files (gzip files in the spooling source directory) in a Flume process?
Asked by user3502577 (599 views)
OK, so if you don't need to unpack your gzip files at the Flume level and just want to pass them through as they are, that's actually quite easy. You can configure your spooling directory source to use a BlobDeserializer:
https://flume.apache.org/FlumeUserGuide.html#event-deserializers
This reads the entire file as a single event and spools that. If you want to store the result in HDFS, for instance, make sure you enable the fileHeader property on your spooling directory source. Each event then carries the source file's path in its "file" header, and you can use the %{file} variable in your HDFS path, which effectively lets you use Flume as a one-to-one file copy mechanism.
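As a rough sketch of what that could look like in the agent configuration: the agent and component names (a1, r1, c1, k1) and both directory paths are placeholders I made up, and I'm assuming the morphline Solr sink module that ships BlobDeserializer is on the agent's classpath. I also enabled basenameHeader here so the path can use %{basename} (just the file name) rather than %{file} (the absolute path); either header works.

```properties
# Hypothetical agent and component names: a1, r1, c1, k1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Spooling directory source; the spoolDir path is a placeholder
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/spool/flume/gzip-in
a1.sources.r1.channels = c1
# Put the absolute source path in the "file" header
a1.sources.r1.fileHeader = true
# Put just the file name in the "basename" header
a1.sources.r1.basenameHeader = true
# Read each file as one binary event instead of line by line
a1.sources.r1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
# Upper bound in bytes for one blob; a larger file is split into several events
a1.sources.r1.deserializer.maxBlobLength = 100000000

a1.channels.c1.type = file

# HDFS sink; the target path is a placeholder
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/gzip/%{basename}
# Write the event body as-is, without a SequenceFile wrapper
a1.sinks.k1.hdfs.fileType = DataStream
# Roll after every event so each source file ends up in its own HDFS file
a1.sinks.k1.hdfs.rollCount = 1
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollInterval = 0
```

Keep in mind that the whole file is buffered in memory as one event, so this only suits files that comfortably fit under maxBlobLength and in the agent's heap. The HDFS sink also chooses its own file names inside the target directory, so the copy is one-to-one in content rather than in name.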