I have logs that I am trying to push to Google BigQuery, and I want to build the entire pipeline with Google Dataflow. The logs differ in structure and fall into four different types. My pipeline reads logs from Pub/Sub, parses them, and writes them to a BigQuery table. The table each log must be written to depends on one parameter in the log. I am stuck on how to change the table name for BigQueryIO.Write at runtime.
Google Dataflow: write to multiple tables based on input
1.5k Views Asked by nikhil sharma At
You can use side outputs.
https://cloud.google.com/dataflow/model/par-do#emitting-to-side-outputs-in-your-dofn
The sample code for this answer reads a BigQuery table and splits it into three different PCollections. Each PCollection is then sent to a different Pub/Sub topic (these could just as well be different BigQuery tables).
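
Applied to the case in the question (Pub/Sub in, one of four BigQuery tables out), the side-output idea might look roughly like the sketch below. It uses the Apache Beam Python SDK, the successor to the Dataflow SDK, where side outputs are called tagged outputs. Everything specific here is an assumption for illustration: the logs are taken to be JSON with a `type` field, the four type names and table names are hypothetical, and the tables are assumed to exist already.

```python
import json

# Hypothetical mapping from the log's "type" field to a BigQuery table.
# The question only says there are four log types; these names are made up.
TABLES = {
    "access": "my-project:logs.access",
    "error": "my-project:logs.error",
    "audit": "my-project:logs.audit",
    "debug": "my-project:logs.debug",
}


def route(raw):
    """Parse one JSON log line and return (tag, row).

    Logs whose type is missing or not in TABLES get the tag "unknown".
    """
    row = json.loads(raw)
    tag = row.get("type")
    if tag not in TABLES:
        tag = "unknown"
    return tag, row


def build(pipeline, subscription):
    """Attach the read, route and write steps to an existing Beam pipeline.

    The pipeline is expected to run in streaming mode, since it reads
    from Pub/Sub. Beam is imported here so that route() can be tried
    out without Beam installed.
    """
    import apache_beam as beam
    from apache_beam import pvalue

    class RouteFn(beam.DoFn):
        def process(self, element):
            # ReadFromPubSub yields the message payload as bytes.
            tag, row = route(element.decode("utf-8"))
            # Emit the row to the output named after its log type.
            yield pvalue.TaggedOutput(tag, row)

    routed = (
        pipeline
        | "Read" >> beam.io.ReadFromPubSub(subscription=subscription)
        | "Route" >> beam.ParDo(RouteFn()).with_outputs(*TABLES, "unknown")
    )

    # One BigQuery sink per log type, each fed by its own tagged output.
    for tag, table in TABLES.items():
        routed[tag] | f"Write_{tag}" >> beam.io.WriteToBigQuery(
            table,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    return routed
```

Each table name is fixed when the pipeline is constructed, so this approach suits a small, known set of destinations such as the four log types here. The `unknown` output is left unconsumed in this sketch; it could be written to a dead-letter table instead.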