I am trying to read data from MongoDB (running in AWS DocumentDB) and write it to BigQuery.

I have written Python code for this and run it with the python3 command. My pipeline looks like this:

    (
        p
        | ReadFromMongoDB(uri='mongodb://documentdb_url:27017', db='test_db', coll='test_collection')
        | beam.Map(json_parse_fun)
        | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
            'target_bq_table',
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )

json_parse_fun() converts the MongoDB data into JSON.
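
For illustration only, a minimal sketch of what such a conversion function might look like. The body is an assumption, not the actual code from my pipeline: it round-trips each document through the json module and turns any value json cannot encode (for example an ObjectId or a datetime) into its string form.

```python
import json
from datetime import datetime


def json_parse_fun(doc):
    """Return a copy of `doc` that contains only JSON-serializable values.

    Hypothetical implementation: values json cannot encode natively
    (e.g. ObjectId, datetime) are replaced by str(value).
    """
    return json.loads(json.dumps(doc, default=str))


# Example with a plain dict standing in for a MongoDB document.
row = json_parse_fun({"name": "a", "created": datetime(2020, 1, 1), "tags": ["x"]})
```

Stringifying everything is the simplest choice; a real pipeline may want to format dates to match the BigQuery column types instead.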

But when I run this code, the Dataflow job fails with the following error:

pymongo.errors.OperationFailure: Feature not supported: splitVector

There is 1 answer below.

MongoDB does not "run in DocumentDB".

DocumentDB is an imitation database that implements only some of MongoDB's features. You have hit one it doesn't implement: the splitVector command, which ReadFromMongoDB issues to split the collection into bundles.

See the related question "Feature not supported: $text" in DocumentDB with MongoDB 3.6 compatibility.
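
One thing that may be worth trying: Beam's ReadFromMongoDB accepts a bucket_auto flag that makes it split the collection with a $bucketAuto aggregation instead of the splitVector command. The sketch below reuses the names from the question. The helper names build_read_kwargs and build_pipeline are made up for illustration, and whether DocumentDB supports $bucketAuto is an assumption you would need to check against its compatibility documentation, so this could fail with a similar "Feature not supported" error.

```python
def build_read_kwargs(uri, db, coll, bucket_auto=True):
    """Keyword arguments for ReadFromMongoDB.

    bucket_auto=True asks Beam to split the collection with $bucketAuto
    rather than splitVector. Assumption: the target server supports it.
    """
    return {"uri": uri, "db": db, "coll": coll, "bucket_auto": bucket_auto}


def build_pipeline(p, json_parse_fun, table="target_bq_table"):
    """Attach the read/convert/write steps from the question to pipeline `p`."""
    # Imported lazily so build_read_kwargs can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io.mongodbio import ReadFromMongoDB

    kwargs = build_read_kwargs(
        "mongodb://documentdb_url:27017", "test_db", "test_collection"
    )
    return (
        p
        | "ReadFromMongoDB" >> ReadFromMongoDB(**kwargs)
        | "ToJson" >> beam.Map(json_parse_fun)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            table,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

If $bucketAuto is rejected as well, the remaining option is to avoid Beam's automatic splitting altogether, for example by exporting the collection first and reading the export.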