I would like to crawl an Amazon S3 bucket using ManifoldCF and relay the crawl to OpenSearchServer. I've seen other products ship an Amazon S3 connector, and I'm wondering whether there is a publicly available one for ManifoldCF.
Is there an AmazonS3 connector available for ManifoldCF?
Asked by Mdalz
kuhajeyan
Currently ManifoldCF does not provide an Amazon S3 connector by default (see the list of connectors that are available by default).
As for how to start writing a connector, I would suggest checking out the source code from the ManifoldCF SVN repository and looking at how the other connectors are written. The Generic connector and the File System connector, for example, are good models for how to write your own.
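To give a rough idea of what to look for when reading those connectors, here is a minimal sketch of the general "list the documents, then fetch and hand off each one" pattern that a crawler-style connector tends to follow. Everything in it is hypothetical: `ObjectStore`, `IndexSink`, `InMemoryStore`, `seed` and `process` are names made up for illustration and are not ManifoldCF classes or methods, and the in-memory store stands in for a real S3 client. The actual interfaces should be taken from the ManifoldCF source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for illustration only.
// None of these types are part of ManifoldCF or of any AWS SDK.
public class ConnectorSketch {

    /** Hypothetical view of a bucket-like store: object key -> content. */
    interface ObjectStore {
        List<String> listKeys(String prefix);
        String read(String key);
    }

    /** Hypothetical sink standing in for the output side (e.g. a search index). */
    interface IndexSink {
        void ingest(String documentId, String content);
    }

    /** In-memory store so the sketch runs without any S3 dependency. */
    static class InMemoryStore implements ObjectStore {
        private final Map<String, String> objects = new LinkedHashMap<>();

        void put(String key, String content) {
            objects.put(key, content);
        }

        @Override
        public List<String> listKeys(String prefix) {
            List<String> keys = new ArrayList<>();
            for (String key : objects.keySet()) {
                if (key.startsWith(prefix)) {
                    keys.add(key);
                }
            }
            return keys;
        }

        @Override
        public String read(String key) {
            return objects.get(key);
        }
    }

    /** Step 1: enumerate the identifiers of the documents to crawl. */
    static List<String> seed(ObjectStore store, String prefix) {
        return store.listKeys(prefix);
    }

    /** Step 2: fetch each document and pass it on; returns how many were ingested. */
    static int process(ObjectStore store, List<String> documentIds, IndexSink sink) {
        int ingested = 0;
        for (String id : documentIds) {
            String content = store.read(id);
            if (content == null) {
                continue; // object disappeared between listing and fetching
            }
            sink.ingest(id, content);
            ingested++;
        }
        return ingested;
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.put("docs/a.txt", "alpha");
        store.put("docs/b.txt", "beta");
        store.put("images/c.png", "binary");

        Map<String, String> indexed = new LinkedHashMap<>();
        int count = process(store, seed(store, "docs/"), indexed::put);
        System.out.println(count + " documents ingested: " + indexed.keySet());
    }
}
```

In a real connector the listing and fetching would go through an S3 client, and the hand-off would go through whatever ingestion interface the framework provides; splitting the work into a listing step and a per-document step is just one way to keep each part easy to test.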
Since Aug 27 there is one: https://github.com/apache/manifoldcf/tree/trunk/connectors/amazons3
Happy hacking!