I am trying to read data from MongoDB (running on AWS DocumentDB) and write it to BigQuery.
I have written Python code for this and run it with the python3 command. My pipeline looks like this:
p | ReadFromMongoDB(uri='mongodb://documentdb_url:27017',db="test_db",coll="test_collection") | beam.Map(json_parse_fun) | 'WriteToBigQuery' >> beam.io.WriteToBigQuery('target_bq_table', write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
json_parse_fun() converts the MongoDB data to JSON.
But when I run this code, the Dataflow job fails with the following error:
pymongo.errors.OperationFailure: Feature not supported: splitVector
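MongoDB does not "run in DocumentDB".
DocumentDB is a separate database that imitates MongoDB and implements only some of its features. You found one it doesn't implement: the splitVector command, which ReadFromMongoDB issues to split the collection for parallel reads.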
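If you have to stay on DocumentDB, one possible workaround is to do the splitting yourself with ordinary range queries instead of relying on splitVector. This is my own suggestion, not something Beam provides out of the box. Below is a minimal sketch: build_range_filters and its boundaries argument are hypothetical names, and how you pick the boundary values (for example, by sampling _id values) is left to you.

```python
def build_range_filters(boundaries, field="_id"):
    """Turn boundary values into non-overlapping range filters.

    Hypothetical helper: `boundaries` is any iterable of comparable
    key values. The returned filters together cover the whole key
    space, so every document matches exactly one of them.
    """
    bounds = sorted(set(boundaries))
    if not bounds:
        # No boundaries: a single filter that matches everything.
        return [{}]

    # Everything below the first boundary.
    filters = [{field: {"$lt": bounds[0]}}]
    # Half-open ranges [lo, hi) between consecutive boundaries.
    for lo, hi in zip(bounds, bounds[1:]):
        filters.append({field: {"$gte": lo, "$lt": hi}})
    # Everything from the last boundary upwards.
    filters.append({field: {"$gte": bounds[-1]}})
    return filters


if __name__ == "__main__":
    for f in build_range_filters([100, 200]):
        print(f)
```

Each filter could then be passed to pymongo's collection.find() from inside your own DoFn, so that each range is read independently. Whether this performs acceptably depends on how evenly your boundaries divide the data.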
See "Feature not supported: $text" in document db with mongodb 3.6 compatiability.