We are crawling data from a Documentum (DCTM) repository using the ManifoldCF Documentum connector and writing the crawled data to MongoDB. The crawl was started with a throttling value of 500, but it is very slow: the connector fetches only 170 documents per minute. The server where ManifoldCF (MCF) is installed has sufficient memory and 8 logical CPU cores. Can someone help us improve the crawling speed?
ManifoldCF Documentum crawling slowness
Asked by user1956938
There is 1 best solution below
Tuning the database that ManifoldCF uses for crawling (PostgreSQL) is a good place to start.
The ManifoldCF performance tuning guide is a useful reference: https://manifoldcf.apache.org/release/release-2.13/en_US/performance-tuning.html
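
Besides the database itself, the crawler's worker thread count and its database connection pool size are usually worth checking. Below is a rough sketch of the relevant entries in ManifoldCF's properties.xml. The property names are the ones I recall from the ManifoldCF deployment documentation, and the values are illustrative assumptions rather than recommendations for your setup, so please verify both against the documentation for your release before applying them.

```xml
<!-- Fragment of properties.xml; all values are illustrative assumptions. -->
<configuration>
  <!-- Number of crawler worker threads (assumed value). -->
  <property name="org.apache.manifoldcf.crawler.threads" value="30"/>
  <!-- Maximum pooled database connection handles (assumed value).
       Keep this above the total number of ManifoldCF threads. -->
  <property name="org.apache.manifoldcf.database.maxhandles" value="100"/>
</configuration>
```

If you raise the handle count, PostgreSQL's max_connections setting in postgresql.conf most likely needs to be at least as large, otherwise the pool cannot actually open that many connections. ManifoldCF needs to be restarted for changes to properties.xml to take effect.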