Upgraded Azure Data Explorer to Python 3.10 sandbox image: impossible to install (extract) external packages


I am using an Azure Data Explorer cluster with the upgraded Python 3.10 sandbox image, and I want to test the latest prophet package (v1.1.5) within the sandbox. I followed the procedure here: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/python-plugin?pivots=azuredataexplorer That procedure works without problems for the "old" fbprophet package (v0.7.1): https://learn.microsoft.com/en-us/azure/data-explorer/kusto/functions-library/series-fbprophet-forecast-fl?tabs=query-defined&pivots=azuredataexplorer With prophet v1.1.5, however, installing the package fails.

The error has to do with the way the wheel packages are extracted from the zip file:

{"Count":1,"Text":"Partial query failure: 0x80131500 (message: 'Encountered an error during execution of local sandbox. Error details: Python code execution failed with the following error: EOFError: ; Traceback (most recent call last):\n  File \"C:\\python\\execute_python.py\", line 215, in exec_python\n    exec(code_to_execute, local_variables, local_variables)\n  File \"<string>\", line 6, in <module>\n  File \"C:\\python\\Lib\\site-packages\\sandbox_utils\\zipackage.py\", line 24, in install\n    zip_ref.extractall(output_dir)\n  File \"zipfile.py\", line 1645, in extractall\n  File \"zipfile.py\", line 1700, in _extract_member\n  File \"shutil.py\", line 195, in copyfileobj\n  File \"zipfile.py\", line 925, in read\n  File \"zipfile.py\", line 993, in _read1\n  File \"zipfile.py\", line 1028, in _read2\nEOFError\n.  ==> ExecutePluginOperator failure: ', details: 'Source: DataNode\n[0]Kusto.Cloud.Platform.Sandbox.Exceptions.SandboxExecutionException: Encountered an error during execution of local sandbox. 
Error details: Python code execution failed with the following error: EOFError: ; Traceback (most recent call last):\n  File \"C:\\python\\execute_python.py\", line 215, in exec_python\n    exec(code_to_execute, local_variables, local_variables)\n  File \"<string>\", line 6, in <module>\n  File \"C:\\python\\Lib\\site-packages\\sandbox_utils\\zipackage.py\", line 24, in install\n    zip_ref.extractall(output_dir)\n  File \"zipfile.py\", line 1645, in extractall\n  File \"zipfile.py\", line 1700, in _extract_member\n  File \"shutil.py\", line 195, in copyfileobj\n  File \"zipfile.py\", line 925, in read\n  File \"zipfile.py\", line 993, in _read1\n  File \"zipfile.py\", line 1028, in _read2\nEOFError\n.\r\nTimestamp=2024-02-06T14:32:20.9866693Z\r\nClientRequestId=KE.RunQuery;266613a4-4136-4ccb-99ce-1e1614c0ac0d\r\nActivityId=b18a353e-8dde-413c-9998-c5c31517a90c\r\nActivityType=DN.DEQP.EvalPass.Plugin.python\r\nServiceAlias=ADXPRDALVA\r\nMachineName=KEngine000000\r\nProcessName=Kusto.WinSvc.Svc\r\nProcessId=14340\r\nThreadId=7052\r\nActivityStack=(Activity stack: CRID=KE.RunQuery;266613a4-4136-4ccb-99ce-1e1614c0ac0d ARID=b18a353e-8dde-413c-9998-c5c31517a90c > SubqueryPull/5001436b-2f94-446d-89e4-ac1ed9e433dd > DN.DEQP.EvalPass.Plugin.python/9bdee143-9c68-4bb4-8867-64e6d5c85399)\r\nErrorCode=\r\nErrorReason=\r\nErrorMessage=\r\nDataSource=\r\nDatabaseName=\r\nClientRequestId=\r\nActivityId=00000000-0000-0000-0000-000000000000\r\nDetails=Python code execution failed with the following error: EOFError: ; Traceback (most recent call last):\n  File \"C:\\python\\execute_python.py\", line 215, in exec_python\n    exec(code_to_execute, local_variables, local_variables)\n  File \"<string>\", line 6, in <module>\n  File \"C:\\python\\Lib\\site-packages\\sandbox_utils\\zipackage.py\", line 24, in install\n    zip_ref.extractall(output_dir)\n  File \"zipfile.py\", line 1645, in extractall\n  File \"zipfile.py\", line 1700, in _extract_member\n  File \"shutil.py\", line 195, in 
copyfileobj\n  File \"zipfile.py\", line 925, in read\n  File \"zipfile.py\", line 993, in _read1\n  File \"zipfile.py\", line 1028, in _read2\nEOFError\n\r\n\r\n
   at Kusto.DataNode.QueryService.PluginsV2.SandboxedPluginBase.ExecuteInSandbox(SandboxKind sandboxKind, ISandboxManager sandboxManager, ClientRequestProperties clientRequestProperties, IDictionary`2 argumentsPropertyBag, IStreamSource inputStreamSource, IDictionary`2 externalArtifacts, Boolean spillToDisk, OperationStatistics& operationStatistics) in .\\Src\\Engine\\DataNode\\QueryService\\PluginsV2\\DistributedPlugins\\SandboxedPluginBase.cs:line 70\r\n
   at Kusto.DataNode.QueryService.PluginsV2.ScriptExecutionPluginBase.Execute(QueryPluginExecutionContext executionContext, PluginDistributionCapsule distributionCapsule, ClientRequestProperties clientRequestProperties, IStreamSource inputStreamSource, OperationStatistics& operationStatistics) in .\\Src\\Engine\\DataNode\\QueryService\\PluginsV2\\DistributedPlugins\\Languages\\ScriptExecutionPluginBase.cs:line 166\r\n
   at Kusto.DataNode.DataEngineQueryPlan.DataEngineQueryProcessor.DataEngineQueryCallback.ExecutePluginOperator(String pluginName, DataSourceStreamFormat inputStreamFormat, DataSourceStreamFormat outputStreamFormat, String pluginSerializedContext, String serializedQueryContextProperties, IStreamSource inputStream, OperationStatistics& operationStatistics) in .\\Src\\Engine\\DataNode\\QueryService\\DataEngineQueryPlan\\DataEngineQueryProcessor.cs:line 480')"}

I tried packing different wheel files and changing their names in the zip without success.
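
A local sanity check can help rule out a damaged archive before it is uploaded. In the standard library's zipfile module, an EOFError is raised when the archive data ends before a member has been fully read, which is consistent with a truncated or incompletely uploaded zip, although that is only an assumption about this particular failure. The sketch below reads every member of a zip in full and reports anything that cannot be read. The helper name `check_zip` and the file name `prophet.zip` in the usage note are hypothetical and used only for illustration.

```python
import zipfile
import zlib


def check_zip(path):
    """Return a list of problems found in the archive at `path` (empty if none)."""
    problems = []
    try:
        with zipfile.ZipFile(path) as zf:
            for info in zf.infolist():
                try:
                    # Read the member to the end in 1 MiB chunks. A member whose
                    # data is cut short raises EOFError; corrupted data fails
                    # decompression or the CRC check.
                    with zf.open(info) as member:
                        while member.read(1 << 20):
                            pass
                except (EOFError, zipfile.BadZipFile, zlib.error, OSError) as exc:
                    problems.append(f"{info.filename}: {type(exc).__name__}: {exc}")
    except zipfile.BadZipFile as exc:
        # The file has no readable central directory (for example, its tail is missing).
        problems.append(f"{path}: not a valid zip: {exc}")
    return problems
```

Calling `check_zip("prophet.zip")` on the machine where the zip was built, and again on a copy downloaded back from blob storage, would show whether the archive was already damaged before the sandbox tried to extract it. An empty list only means the archive is readable locally; it does not rule out a limit or incompatibility inside the sandbox itself.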


There is 1 best solution below

Adi E

Indeed, that implementation was built on Python 3.6.5 (which is being deprecated) and is not compatible with the updated Python 3.10.8 image. I have updated series_fbprophet_forecast_fl() to work with Prophet 1.1.5 on the ADX Python 3.10.8 sandbox image, and I also improved its performance. Please test it and share your feedback.

Thanks for reporting this issue!