I'm encountering a challenging issue with the UnstructuredLoader from the LangChain library in a Node.js application running on AWS Lambda. Despite providing what I believe is a valid API key, the loader fails with a 401 error whose message says the API key is invalid.
Here's the error log:
2023-12-05T14:20:10.745Z 9482936f-6dd0-5bc7-99ce-ce865e1eb340 ERROR Invoke Error {
  "errorType": "Error",
  "errorMessage": "Failed to partition file /tmp/README.md with error 401 and message {\"detail\":\"API key is invalid, please provide a valid API key in the header.\"}",
  "stack": [
    "Error: Failed to partition file /tmp/README.md with error 401 and message {\"detail\":\"API key is invalid, please provide a valid API key in the header.\"}",
    " at UnstructuredLoader._partition (/var/task/node_modules/langchain/dist/document_loaders/fs/unstructured.cjs:189:19)",
    " at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
    " at async UnstructuredLoader.load (/var/task/node_modules/langchain/dist/document_loaders/fs/unstructured.cjs:198:26)",
    " at async processFile (/var/task/index.js:103:21)",
    " at async Runtime.handler (/var/task/index.js:151:5)"
  ]
}
The issue arises specifically when using the UnstructuredLoader for processing file types that are not natively supported by other loaders in LangChain (e.g., PPTX files). I have confirmed that the API key is correctly set in the environment variables and works fine for other loaders.
Here's a relevant snippet of the implementation:
async function processFile(filename: string, key: string) {
  try {
    // Fetch the File content from S3 using the new method
    const command = new GetObjectCommand({
      Bucket: process.env.S3_BUCKET_NAME,
      Key: filename,
    });
    const content: any = await s3Client.send(command);
    const tempFilePath = path.join('/tmp', path.basename(filename));
    await fs.writeFile(tempFilePath, content.Body); // Directly save the buffer
    // Load and split the File
    const fileExtension = path.extname(tempFilePath).toLowerCase();
    let loader: any;
    switch (fileExtension) {
      ...
      default:
        loader = new UnstructuredLoader(tempFilePath, { apiKey: key });
    }
    ...
  } catch (error) {
    console.error(`Error processing and ingesting File: ${filename}. Error: ${error}`);
    throw error;
  }
}
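One way the key could be sanity-checked before it reaches the loader is sketched below. `describeApiKey` is a hypothetical helper, not part of LangChain or the Unstructured API. It only reports whether the string is non-empty and whether it carries stray whitespace or wrapping quotes (which might slip in when a value is pasted into an environment variable), without printing the secret itself.

```typescript
// Hypothetical helper (not part of LangChain): summarizes an API key
// string without logging the secret itself.
interface KeyReport {
  present: boolean;                  // non-empty after trimming
  length: number;                    // raw length, including any whitespace
  hasSurroundingWhitespace: boolean; // leading/trailing spaces or newlines
  hasWrappingQuotes: boolean;        // value is wrapped in matching ' or "
}

function describeApiKey(key: string | undefined): KeyReport {
  const value = key ?? "";
  const trimmed = value.trim();
  const first = trimmed.charAt(0);
  const last = trimmed.charAt(trimmed.length - 1);
  return {
    present: trimmed.length > 0,
    length: value.length,
    hasSurroundingWhitespace: value !== trimmed,
    hasWrappingQuotes:
      trimmed.length >= 2 && (first === '"' || first === "'") && first === last,
  };
}

// Illustrative call with a made-up value; in processFile this would be
// describeApiKey(key) just before the loader is constructed.
console.log(describeApiKey("  example-key\n"));
```

If the report shows surrounding whitespace or wrapping quotes, passing `key.trim()` (or stripping the quotes) to the loader would be the next thing to try. A clean report would instead point at the key itself or the endpoint it is being sent to.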
I've ensured that the API key is being passed correctly to the UnstructuredLoader. This setup works flawlessly for other file types but not for those requiring the UnstructuredLoader. Is there something I'm missing in the configuration or usage of the UnstructuredLoader? Any insights or experiences with similar issues would be immensely helpful.