Imagine there is a big array of Strings S. From that array I need to get only those strings which contain a specific substring. For example if my array is
String s [] = {"hello world", "back to hell", "say hello world"};
and my keyword is "hello", then it should return me first and last element.
I tried using KMP and Boyer-Moor algorithms to check each string in array whether it contains a substring or not, but it takes too much time.
Then I learned about Aho-Corasick algorithm. I am still looking it up, but seemingly it needs an array of substrings and one big string to match,while what I want is exactly opposite.
So I was looking for a suggestions on how to modify Aho-Corasick algorithm for my purposes, or another means to achieve those. Would be thankful for any suggestions.
Is there an efficient algorithm to extract from an array of string only those which contain a specific substring?
355 Views Asked by Akbar Amanov At
1
There are 1 best solutions below
Related Questions in ARRAYS
- How could you print a specific String from an array with the values of an array from a double array on the same line, using iteration to print all?
- What does: "char *argv[]" mean?
- How to populate two dimensional array
- User input sanitization program, which takes a specific amount of arguments and passes the execution to a bash script
- Function is returning undefined but should be returning a matched object from array in JavaScript
- The rules of Conway's Game of Life aren't working in my Javascript version. What am I doing wrong?
- Array related question, cant find the pattern
- Setting the counter (j) for (inner for loop)
- I want to flip an image (with three channels RGB) horizontally just using array slicing. How can I do it with python?
- Numpy array methods are faster than numpy functions?
- How to enter data in mongodb array at specific position such that if there is only 2 data in array and I want to insert at 5, then rest data is null
- How to return array to ArrayPool when it was rented by inner function?
- best way to remove a word from an array in a react app
- Vue display output of two dimensional array
- Undot Array with Wildcards in Laravel
Related Questions in STRING
- What does: "char *argv[]" mean?
- User input sanitization program, which takes a specific amount of arguments and passes the execution to a bash script
- JSON Body is Not Passing Certain Strings
- Regex to match repeated substring in Google Sheets
- Find the sum of the numbers in the sequence
- Hello, how can I use a block parameter of withstyle parameter when we create a annotated string in jetpackpack compose
- How to convert an HTML string to an escaped one?
- Quintic Number Number Counting Hash Function
- From Buffer("string", "hex) to string JS
- Calling ToString with a nominated format returns Char rather than String
- How to update an already existing array by accessing it by a variable with the exact same name assigned to it
- Why does \b not interpreted as backslash in this regular expression
- Python: why aren’t strings being internalized if they are received from ints by using str()?
- If the element(s) in the first list equal element(s) of the second list, replace with element(s) of the third list
- About Suffix Trees features
Related Questions in ALGORITHM
- MCNP 6 - Doubts about cells
- Given partially sorted array of type x<y => first apperance of x comes before first of y, sort in average O(n)
- What is the algorithm behind math.gcd and why it is faster Euclidean algorithm?
- Purpose of last 2 while loops in the merge algorithm of merge sort sorting technique
- Dots and Boxes with apha-beta pruning
- What is the average and worst-case time complexity of my string searching algorithm?
- Building a School Schedule Generator
- TC problem 5-2:how to calculate the probability of the indicator random variable?
- LCA of a binary tree implemented in Python
- Identify the checksum algorithm
- Algorithm for finding a subset of nodes in a weighted connected graph such that the distance between any pair nodes are under a postive number?
- Creating an efficent and time-saving algorithm to find difference between greater than and lesser than combination
- Algorithm to find neighbours of point by distance with no repeats
- Asking code suggestions about data structure and algorithm
- Heap sort with multithreading
Related Questions in AHO-CORASICK
- Aho-Corasick algorithm with C language
- Aho-Corasick text matching on groups of patterns
- Aho-Corasick implementation is taking too long
- Algorithmic way to search a list of tuples for a matching substring?
- Aho-Corasick for partial text matching
- How to match only whole words with Aho corasick?
- Find occurrences of the adjacent sub strings in the text
- Efficient String Search Using Dictionary Java
- Is there an efficient algorithm to extract from an array of string only those which contain a specific substring?
- Counting the sum of occurrences of a string set S in a trie T
- How to get the same offset between readelf/IDA and Aho-Corasick in a binary
- Why doesn't the lambda example in hankcs/AhoCorasickDoubleArrayTrie work?
- Updating an Aho-Corasick trie in the face of inserts and deletes
- Finding Patterns Occurrences In A Large Text File (Currently With Aho-Corasick)
- number of occurrences of list of words in a string with O(n)
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
Popular # Hahtags
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
Build a suffix tree using Ukkonen algorithm or the one suggested in this source(PDF):
Then use the created suffix tree to search for a given pattern. The problem is to find all occurrences of pattern P (length m) in a suffix tree T. According to the above source:
Note that the length of the text (or the number of strings in the array) does not affect search efficiency. Therefore, you could pay for constructing a suffix tree once and then use it many times to search efficiently for short pattern strings.
EDIT: if you are in a hurry and do not mind a little bit of extra time complexity, you can construct suffix arrays instead of suffix trees using this approach(PDF) in just O(n*log^2(n)) with a very small piece of code. Here is the core idea of this approach:
And here is the pseudo-code reproduced from the above source:
After running this code,
Pwill contain the suffix array. Search using this approach is also straightforward: