Why does starting a streaming query lead to "ExitCodeException exitCode=-1073741515"?


I've been trying to get used to the new Structured Streaming, but it keeps giving me the error below as soon as I start a .writeStream query.

Any idea what could be causing this? The closest I could find was an ongoing Spark bug about splitting checkpoint and metadata folders between the local filesystem and HDFS, but that doesn't seem to apply here. I'm running on Windows 10, Spark 2.2, and IntelliJ.
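For reference, the failure happens immediately at .start(), before any data is processed. The actual FileStream.scala isn't shown here, but a minimal query of this shape hits the same code path (the source, sink, and options below are illustrative):

    import org.apache.spark.sql.SparkSession

    object FileStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("FileStream")
          .master("local[*]")
          .getOrCreate()

        // Any streaming source/sink combination fails the same way, because
        // StreamMetadata is written to the checkpoint dir before the first batch.
        val lines = spark.readStream
          .format("socket")
          .option("host", "localhost")
          .option("port", "9999")
          .load()

        val query = lines.writeStream   // fails here, at .start()
          .format("console")
          .start()

        query.awaitTermination()
      }
    }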

17/08/29 21:47:39 ERROR StreamMetadata: Error writing stream metadata StreamMetadata(41dc9417-621c-40e1-a3cb-976737b83fb7) to C:/Users/jason/AppData/Local/Temp/temporary-b549ee73-6476-46c3-aaf8-23295bd6fa8c/metadata
ExitCodeException exitCode=-1073741515: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
    at org.apache.spark.sql.execution.streaming.StreamMetadata$.write(StreamMetadata.scala:76)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$6.apply(StreamExecution.scala:116)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$6.apply(StreamExecution.scala:114)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.streaming.StreamExecution.<init>(StreamExecution.scala:114)
    at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:240)
    at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278)
    at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:282)
    at FileStream$.main(FileStream.scala:157)
    at FileStream.main(FileStream.scala)
Exception in thread "main" ExitCodeException exitCode=-1073741515: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
    at org.apache.spark.sql.execution.streaming.StreamMetadata$.write(StreamMetadata.scala:76)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$6.apply(StreamExecution.scala:116)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$6.apply(StreamExecution.scala:114)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.streaming.StreamExecution.<init>(StreamExecution.scala:114)
    at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:240)
    at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278)
    at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:282)
    at FileStream$.main(FileStream.scala:157)
    at FileStream.main(FileStream.scala)
17/08/29 21:47:39 INFO SparkContext: Invoking stop() from shutdown hook
17/08/29 21:47:39 INFO SparkUI: Stopped Spark web UI at http://192.168.178.21:4040
17/08/29 21:47:39 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/29 21:47:39 INFO MemoryStore: MemoryStore cleared
17/08/29 21:47:39 INFO BlockManager: BlockManager stopped
17/08/29 21:47:39 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/29 21:47:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/29 21:47:39 INFO SparkContext: Successfully stopped SparkContext
17/08/29 21:47:39 INFO ShutdownHookManager: Shutdown hook called
17/08/29 21:47:39 INFO ShutdownHookManager: Deleting directory C:\Users\jason\AppData\Local\Temp\temporary-b549ee73-6476-46c3-aaf8-23295bd6fa8c
17/08/29 21:47:39 INFO ShutdownHookManager: Deleting directory C:\Users\jason\AppData\Local\Temp\spark-117ed625-a588-4dcb-988b-2055ec5fa7ec

Process finished with exit code 1
7 Answers

Answer 1

For me, @Moises Trelles' solution above worked. I just installed vcredist_x64.exe from https://www.microsoft.com/en-au/download/details.aspx?id=26999 and it worked.

Along with this, I had to copy the 64-bit winutils.exe into C:\hadoop271\bin. It must sit directly in the bin folder; any extra sub-folder causes the issue (e.g. C:\hadoop271\sub\bin\ created the problem for me).
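To double-check the layout, a small Scala snippet like this (a sketch; it only verifies the expected path) can confirm winutils.exe sits directly under %HADOOP_HOME%\bin:

    import java.io.File

    object WinutilsLayoutCheck {
      def main(args: Array[String]): Unit = {
        val hadoopHome = sys.env.getOrElse("HADOOP_HOME", sys.error("HADOOP_HOME not set"))
        // winutils.exe must be directly under %HADOOP_HOME%\bin -- no extra sub-folder.
        val exe = new File(hadoopHome, "bin\\winutils.exe")
        println(s"${exe.getAbsolutePath} exists: ${exe.isFile}")
      }
    }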

Thank you very much, Stack Overflow.

Answer 2

As others have pointed out, it seems WinUtils relies on a file called msvcr100.dll, which is included in the Microsoft Visual C++ 2010 Redistributable Package.

ExitCodeException with exitCode=-1073741515 suggests that msvcr100.dll is not present on your machine and, as such, WinUtils is not working correctly.
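As an aside, that exit code is just the signed 32-bit view of the Windows NTSTATUS value 0xC0000135 (STATUS_DLL_NOT_FOUND), which you can verify in a Scala REPL:

    println((-1073741515).toHexString)  // prints: c0000135 -> STATUS_DLL_NOT_FOUND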

To resolve this, install the redistributable package (with Service Pack 1) from the following location:

Microsoft Download Center

Note that later versions of the redistributable (e.g. 2012, 2013, …) do not seem to include the required msvcr100.dll file.

Answer 3

Actually, I had the same problem while running Spark unit tests on my local machine. It was caused by winutils.exe failing in the %HADOOP_HOME%\bin folder:

Input: %HADOOP_HOME%\bin\winutils.exe chmod 777 %SOME_TEMP_DIRECTORY%

Output:

winutils.exe - System Error
The code execution cannot proceed because MSVCR100.dll was not found.
Reinstalling the program may fix this problem.

After some searching on the Internet, I found an issue on Steve Loughran's winutils project: Windows 10: winutils.exe doesn't work.
In particular, it says that installing the VC++ redistributable packages should fix the problem (and that worked in my case): How do I fix this error "msvcp100.dll is missing"
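The same manual check can also be scripted, for example from Scala (a sketch; it assumes HADOOP_HOME is set and that C:\tmp exists):

    import scala.sys.process._

    object WinutilsSmokeTest {
      def main(args: Array[String]): Unit = {
        val hadoopHome = sys.env("HADOOP_HOME")
        // Exit code 0 means winutils ran; -1073741515 means a DLL it needs is missing.
        val exit = Seq(s"$hadoopHome\\bin\\winutils.exe", "chmod", "777", "C:\\tmp").!
        println(s"winutils exit code: $exit")
      }
    }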

Answer 4

In my case I was using Windows 10 and had to change the user environment variables (Environment Variables -> User Variables) TMP and TEMP to a custom location on some other volume (D:\Temp, E:\Temp, etc.) instead of the default

%USERPROFILE%\AppData\Local\Temp

and also set hadoop.home.dir:

System.setProperty("hadoop.home.dir", sys.env("HADOOP_HOME") + "\\winutils-master\\hadoop-2.x.x")

Don't forget to copy the hadoop.dll to C:\Windows\System32.

You can download the appropriate version from this link: DOWNLOAD WINUTILS.exe

For me hadoop-2.7.1 version fixed the issue.
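Putting it together: the property has to be set before the SparkSession is created, otherwise Hadoop has already tried (and failed) to locate winutils.exe. A sketch, with an illustrative path:

    import org.apache.spark.sql.SparkSession

    object Main {
      def main(args: Array[String]): Unit = {
        // Must point at the folder that contains bin\winutils.exe, and must
        // run before SparkSession.builder(). The path is illustrative -- for
        // me it was the hadoop-2.7.1 layout mentioned above.
        System.setProperty("hadoop.home.dir", "C:\\winutils-master\\hadoop-2.7.1")

        val spark = SparkSession.builder()
          .appName("Example")
          .master("local[*]")
          .getOrCreate()
      }
    }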

Answer 5

I solved this by installing https://www.microsoft.com/en-au/download/details.aspx?id=26999 and after that copying hadoop.dll along with winutils.exe.

Answer 6

This is a Windows problem:

The program can't start because MSVCP100.dll is missing from your computer. Try reinstalling the program to fix this problem.

You will need to install the VC++ redistributable packages:

  • Download Microsoft Visual C++ 2010 Service Pack 1 Redistributable Package from the Official Microsoft Download Center

https://www.microsoft.com/en-au/download/details.aspx?id=26999

The download provides options for x86 and x64 packages.

Answer 7

The problem I had occurred while converting a DataFrame to a Parquet file. It would create the directory, but fail with "ExitCodeException exitCode=-1073741515".
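For context, the write that failed was of this general shape (a sketch, not my original code):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
    df.write.parquet("C:/tmp/out.parquet")  // the directory got created, then the write failed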

I ran Spark from IntelliJ 2020.2.2 x64 on Windows 10 (version 2004). I have spark-3.0.1-bin-hadoop3.2 installed on my C drive (set up via Git Bash). I downloaded winutils from this GitHub repo, https://github.com/cdarlint/winutils, after being redirected there from https://github.com/steveloughran/winutils.

I installed them in the top-level C:\winutils directory, which contains a single subdirectory named bin\, holding winutils.exe and its associated files. This top-level path was added as a Windows system environment variable named HADOOP_HOME. I also have SPARK_HOME set to C:\Users\name\spark-3.0.1-bin-hadoop3.2\

I was getting this error until I found this SO post, Moises Trelles' answer above, and also this page: https://answers.microsoft.com/en-us/insider/forum/insider_wintp-insider_repair/how-do-i-fix-this-error-msvcp100dll-is-missing/c167d686-044e-44ab-8e8f-968fac9525c5?auth=1

I have a 64-bit system, so I installed both the x86 and x64 redistributables (which supply msvcp100.dll), as recommended in the answers.microsoft.com answer. I did not reboot, but I did close IntelliJ and reload it, and on rerunning, the correct output (a Parquet file) was generated. Good luck! I'm so thankful for Stack Overflow, Google, the Internet, and the community of helpful people.