I have a table stored in ORC format with a bloom filter defined for one column. Is it possible to add a filter for another column after the table has been created and populated, without reinserting the data?
Is it possible to add a bloom filter on an existing table with data?
Asked by Eugen (1.2k views)
No, it is not possible without rewriting the data.
ALTER TABLE does not update existing files, and indexes and bloom filters are stored in the data files, not in the metastore. If you alter the table without rewriting the data, filters are created only going forward, for newly inserted or updated data. So you need to reinsert the data, and it is much better to sort it by the filter columns so that the bloom filters are more efficient. Read about ORC indexes here.
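
As a rough sketch of what that rewrite could look like: the table name `events` and the columns `user_id` (assumed to already have a filter) and `session_id` (the column being added) are hypothetical, and the table is assumed to be non-partitioned and non-transactional. `orc.bloom.filter.columns` is the ORC table property that lists the bloom filter columns.

```sql
-- Hypothetical names: table `events`, existing filter column `user_id`,
-- new filter column `session_id`.

-- Step 1: update the table property. This alone changes only the metadata,
-- so it affects files written from now on, not the existing ones.
ALTER TABLE events
SET TBLPROPERTIES ('orc.bloom.filter.columns' = 'user_id,session_id');

-- Step 2: rewrite the existing data so the ORC files are regenerated
-- with bloom filters for both columns. Sorting by the filter column
-- groups equal values together, which makes the filters more selective.
INSERT OVERWRITE TABLE events
SELECT * FROM events
SORT BY session_id;
```

For a partitioned table the overwrite would probably need a `PARTITION` clause or dynamic partitioning, and it may be safer to write into a new table first and swap it in after checking the result.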