I have just set up Google Cloud Platform on a free trial. In order to run MapReduce tasks with Datastore, the docs say to run
./bdutil --upload_files "samples/*" run_command ./test-mr-datastore.sh
But I couldn't find this file locally, and there's a good reason for that: this way of running MapReduce jobs seems to be deprecated (see this on GitHub). Is that true? Is there an alternative way to create MapReduce tasks from the local command line without requiring BigQuery?
The Datastore connector is indeed deprecated.
To your question "is there an alternative way to create MapReduce tasks from local command line", one option is to use Google Cloud Dataflow. It's not MapReduce per se, but it's the programming model for parallel data processing which has replaced MapReduce at Google. The Dataflow SDK includes support for Datastore access.
Unlike Hadoop, you don't have to set up a cluster. You just write your code (using the Dataflow SDK) and submit your job from the CLI. The Dataflow service will create the needed workers on the fly to process your job and then terminate them.
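To give a feel for how a MapReduce-style job looks in this kind of pipeline model, here is a rough, library-free sketch in plain Python. It does not use the Dataflow SDK or talk to Datastore; the helper names (`flat_map`, `group_by_key`, `combine_per_key`, `word_count`) are made up for illustration and only mimic the general shape of the transforms (a per-element map step, a grouping step, and a per-key combine step). In a real job the input would come from a Datastore source and the steps would run on managed workers rather than in a local list.

```python
from collections import defaultdict
from itertools import chain


def flat_map(fn, items):
    """Map step: apply fn to each element and flatten the results."""
    return list(chain.from_iterable(fn(item) for item in items))


def group_by_key(pairs):
    """Shuffle step: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def combine_per_key(fn, groups):
    """Reduce step: collapse each key's list of values into one value."""
    return {key: fn(values) for key, values in groups.items()}


def word_count(lines):
    """Hypothetical example job: count word occurrences across lines."""
    pairs = flat_map(
        lambda line: [(word.lower(), 1) for word in line.split()],
        lines,
    )
    return combine_per_key(sum, group_by_key(pairs))


if __name__ == "__main__":
    # Stand-in for records that a real job would read from Datastore.
    records = ["a b a", "B c"]
    print(word_count(records))
```

The point of the sketch is only that the map and reduce phases become ordinary chained transforms over a collection; the actual SDK has its own classes and method names for these steps, so check its documentation before writing a real pipeline.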