For simplicity, I have a table in BigQuery with a single field of type NUMERIC. When I try to write a one-column PySpark dataframe to BigQuery, it keeps raising a NullPointerException. I tried casting the PySpark column to int, float, and string, and even encoding it, but it still throws the NullPointerException. Even after spending 5 to 6 hours, I'm unable to figure out, on my own or from the internet, what the issue is and what the exact PySpark dataframe column type should be to map to the BigQuery NUMERIC column type. Any help or direction would be greatly appreciated. Thanks in advance.
Write PySpark dataframe to BigQuery "Numeric" datatype
1.1k Views Asked by Malina Dale At
2
There are 2 best solutions below
0
Piyush Namra
On
This is due to the range limit of Spark's integer type: it can accommodate numbers of at most 10 digits. To correct this issue, cast the column to the Long datatype.
IntegerType: Represents 4-byte signed integer numbers. The range of numbers is from -2147483648 to 2147483647.
https://spark.apache.org/docs/latest/sql-ref-datatypes.html
Hope this helps.
For anyone who faces the same issue: you just have to cast the column to a decimal type.