What is the best way to pass data in the Apache Arrow format from Node.js to Rust? Storing the data in each language is easy enough, but it's sharing the memory that is giving me trouble.
I'm using Napi-rs to generate the Node.js API bindings.
I'm getting a "Failed to create reference from Buffer" error for the JavaScript code below. When I try to pass arrowVector.data[0].buffers to the Rust function instead, I get "../src/node_buffer.cc:245:char *node::Buffer::Data(Local<v8::Value>): Assertion `val->IsArrayBufferView()' failed."
I think I'm missing something core here.
Here is my sample Node test code:
import { makeVector } from 'apache-arrow';
import { testFn } from './index.js';
// Create an Arrow vector
const LENGTH = 2000;
const rainAmounts = Float32Array.from(
  { length: LENGTH },
  () => Number((Math.random() * 20).toFixed(1)));
const arrowVector = makeVector(rainAmounts);
// How do I get the buffers from the vector and send them to the Rust function?
testFn(arrowVector);
Here is my sample Rust code:
use napi::bindgen_prelude::Buffer;
use napi_derive::napi;
#[napi]
pub fn test_fn(buffers: Buffer) {
    println!("test_fn called");
}
NAPI-RS does mention:
But... what you are passing directly to the Rust function is not a buffer.
It is an Arrow vector.
That is not compatible with the Rust side:
Hence the "Failed to create reference from Buffer" error message.
In your Node.js code, get the buffers for the Arrow vector and pass them directly to Rust.
In Rust, the Napi-rs buffer should be able to directly take a buffer from Node.js.
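As a rough sketch of the Node.js side: a typed array is not itself a Node.js Buffer, so it can be wrapped in one that shares the same memory before it is handed to Rust. The helper name `toNodeBuffer` is hypothetical, and the small Float32Array below is only a stand-in for the values buffer of the Arrow vector (for a single-chunk Float32 vector, I'd expect that to be `arrowVector.data[0].values`, but treat that as an assumption to verify against your apache-arrow version).

```javascript
// Hypothetical helper: view a typed array's memory as a Node.js Buffer without copying.
function toNodeBuffer(typedArray) {
  // Buffer.from(arrayBuffer, byteOffset, length) creates a view over the same memory;
  // the offset and length matter when the typed array covers only part of its ArrayBuffer.
  return Buffer.from(typedArray.buffer, typedArray.byteOffset, typedArray.byteLength);
}

// Stand-in for the values buffer of a Float32 Arrow vector.
const values = Float32Array.from({ length: 4 }, (_, i) => i * 1.5);
const valuesBuffer = toNodeBuffer(values);

// valuesBuffer is now a real Buffer (an ArrayBufferView), which is what the
// Rust function declared with a `Buffer` parameter expects:
// testFn(valuesBuffer);
```

With the real vector, the call would presumably look like `testFn(toNodeBuffer(arrowVector.data[0].values))`, assuming the vector has a single chunk.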
However, keep in mind that Arrow vectors often contain multiple buffers. Depending on the data types involved, you might need to pass multiple buffers from Node.js to Rust and interpret them correctly in Rust.
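To illustrate what "interpret them correctly" can involve: besides the values buffer, a nullable Arrow column carries a validity bitmap, where bit i (least-significant bit first within each byte) is 1 if slot i holds a value and 0 if it is null. The sketch below shows that lookup in JavaScript; the function name `isValidSlot` is hypothetical, and the same bit arithmetic would apply on the Rust side if you pass the bitmap as a second buffer.

```javascript
// Hypothetical helper: check slot `index` in an Arrow validity bitmap (a Uint8Array).
// Arrow uses least-significant-bit numbering: slot i lives in byte i >> 3, bit i & 7.
function isValidSlot(validityBitmap, index) {
  // A missing or empty bitmap means the column has no nulls.
  if (!validityBitmap || validityBitmap.byteLength === 0) return true;
  return (validityBitmap[index >> 3] & (1 << (index & 7))) !== 0;
}

// Example bitmap 0b00000101: slots 0 and 2 are valid, slot 1 is null.
const exampleBitmap = Uint8Array.of(0b00000101);
const slotStates = [0, 1, 2].map((i) => isValidSlot(exampleBitmap, i));
```

Variable-length types such as strings add an offsets buffer on top of this, so the number of buffers to pass depends on the column type.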
See apache/arrow/js/src/data.ts#buffers().
This approach allows you to avoid copying the data, but the memory is still managed by Node.js, so you need to be careful about lifetimes. If Node.js garbage collects the original Arrow vector, the buffer's data you passed to Rust might be deallocated. To avoid this, ensure that the Arrow vector in Node.js remains in scope and alive as long as the Rust code might access its data.
See "how long does variables stay in memory in Node.js" for more.