Error while crawling path\to\file_folder: java.net.ConnectException: Connection timed out: connect
I am trying to ingest files from a remote server into an existing Elasticsearch index (which is on my local machine) using FSCrawler, but I am getting the above exception.
Below is the _settings.yml file of FSCrawler:
---
name: "index_in_es_onefsc"
server:
  hostname: "machinename.abc.com"
  port: 22
  username: "username"
  password: "password@20"
  protocol: "ssh"
fs:
  url: "E:\\TestFilesToBeIndexed"
  update_rate: "15m"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "http://127.0.0.1:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
The documentation says that on Windows, when doing SSH from and to a Windows machine, you must use the following form:
I think that on Windows, you need to use:
Note that there was a known issue when running FSCrawler from a Windows machine. It has been fixed, but if you are using a SNAPSHOT version older than the one published on June 26th, you will most likely need to upgrade.
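Separately, a "Connection timed out: connect" error generally means the TCP connection to the remote host never completed, so it may be worth confirming that the SSH port is reachable from the machine running FSCrawler before changing anything else. Below is a minimal sketch of such a check in Python; it is not part of FSCrawler, the helper name can_connect is made up for illustration, and the host and port are simply the ones from the server section of the settings file in the question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it checks reachability of
    the port, not whether SSH authentication would succeed.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


# Example call, using the values from the server section of _settings.yml:
# can_connect("machinename.abc.com", 22)
```

If this returns False from the FSCrawler machine, the cause is more likely a firewall, a wrong host name or port, or an SSH server that is not running on the remote machine than the FSCrawler settings themselves.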