I need to store documents such as .pdf, .doc and .txt files in MaprDB. I saw an example in HBase where files are stored as binary and retrieved as files in Hue, but I'm not sure how that could be implemented. Any idea how a document can be stored in MaprDB?
Store documents (.pdf, .doc and .txt files) in MaprDB
326 Views Asked by Amu At
First of all, I'm not familiar with MaprDB, as I use Cloudera. But I do have experience with HBase, where I have stored many types of objects as byte arrays, as described below.
The most basic way to store something in HBase, or in any other database, is as a byte array. See my answer.
You can do that with the Apache Commons Lang API. This is probably the best option, since it applies to all objects, including images, audio, video and so on.
Please test this method with an object of one of your file types.
SerializationUtils.serialize returns bytes, which you can insert. Note: the Apache Commons Lang jar is normally available on a Hadoop cluster, so it is not an external dependency.
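As a rough illustration of what this kind of serialization does, here is a minimal sketch that uses only the JDK. It assumes SerializationUtils.serialize is a thin wrapper around an ObjectOutputStream, which is how Commons Lang implements it. The class and method names (SerializeSketch, serialize, deserialize) are made up for this example. Note that the stored bytes are in Java serialization format, not the raw file bytes, so they have to go through deserialize when read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class SerializeSketch {

    // Turn any Serializable object into a byte array using Java serialization.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild the original object from bytes produced by serialize().
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the content of a document; a real file would be read from disk.
        byte[] content = "contents of a .txt document".getBytes(StandardCharsets.UTF_8);

        byte[] stored = serialize(content);              // what would go into the database
        byte[] restored = (byte[]) deserialize(stored);  // what comes back after a read

        System.out.println(Arrays.equals(content, restored)); // prints true
    }
}
```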
If for any reason you don't want to use the SerializationUtils class provided by Apache Commons Lang, you can serialize and deserialize a PDF by hand for a better understanding of what happens. That code is lengthier; using SerializationUtils shortens it. Once you have the byte array, you can prepare a put request to upload it to the database, i.e. HBase or any other database.
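The hand-written route can be sketched with plain JDK streams. This is not the original example from the answer, just one possible version: it reads a file into a byte array and writes a byte array back out as a file. The class name FileBytesSketch, the method names and the temporary files are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class FileBytesSketch {

    // Read the whole file into memory, chunk by chunk.
    // Files.readAllBytes(file) does the same thing in one call.
    static byte[] readFileToBytes(Path file) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        return buffer.toByteArray();
    }

    // Recreate a file from bytes that were previously stored.
    static void writeBytesToFile(byte[] bytes, Path target) throws IOException {
        Files.write(target, bytes);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for someFile.pdf and its restored copy.
        Path source = Files.createTempFile("someFile", ".pdf");
        Path restored = Files.createTempFile("restored", ".pdf");
        try {
            Files.write(source, "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII));

            byte[] stored = readFileToBytes(source); // bytes to put into the database
            writeBytesToFile(stored, restored);      // bytes read back, written as a file

            System.out.println(Arrays.equals(
                    Files.readAllBytes(source), Files.readAllBytes(restored))); // prints true
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(restored);
        }
    }
}
```

Reading the raw bytes this way keeps the stored value identical to the file on disk, so no deserialization step is needed when the document is fetched.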
Once the data is persisted, you can read it back with an HBase get or scan. You get your PDF bytes, which you then write out to recreate the same file, i.e. someFile.pdf in this case.
EDIT: Since you asked for HBase examples, I'm adding this. In the put request, yourcolumnasBytearray is your document (a PDF, for instance) converted to a byte array (using SerializationUtils.serialize) as described above.
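A put and get along these lines might look like the sketch below. It assumes the HBase 1.x+ Java client is on the classpath and that a table already exists. The table name "documents", the column family "d", the qualifier "content" and the class name are hypothetical, and whether the same code runs unchanged against MaprDB is not something this answer confirms. For brevity it stores the raw file bytes; bytes produced by SerializationUtils.serialize can be passed in the same way, as long as they are deserialized after the get.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch; table, family and qualifier names are assumptions.
class HBaseDocumentSketch {

    private static final TableName TABLE = TableName.valueOf("documents");
    private static final byte[] FAMILY = Bytes.toBytes("d");
    private static final byte[] QUALIFIER = Bytes.toBytes("content");

    // Store the document bytes in a single cell under the given row key.
    static void storeDocument(Table table, String rowKey, byte[] yourcolumnasBytearray)
            throws IOException {
        Put put = new Put(Bytes.toBytes(rowKey));
        put.addColumn(FAMILY, QUALIFIER, yourcolumnasBytearray);
        table.put(put);
    }

    // Fetch the document bytes back; returns null if the cell does not exist.
    static byte[] fetchDocument(Table table, String rowKey) throws IOException {
        Get get = new Get(Bytes.toBytes(rowKey));
        get.addColumn(FAMILY, QUALIFIER);
        Result result = table.get(get);
        return result.getValue(FAMILY, QUALIFIER);
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TABLE)) {

            // Use the file name as the row key purely for illustration.
            byte[] pdfBytes = Files.readAllBytes(Paths.get("someFile.pdf"));
            storeDocument(table, "someFile.pdf", pdfBytes);

            byte[] restored = fetchDocument(table, "someFile.pdf");
            Files.write(Paths.get("restored.pdf"), restored);
        }
    }
}
```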