I'm trying to use LangChain's S3FileLoader or S3DirectoryLoader, but both return documents with an empty page_content. I know the loader is finding the files, because the metadata source is filled in correctly.

My code is very simple:

from langchain_community.document_loaders.s3_directory import S3DirectoryLoader
from langchain_community.document_loaders.s3_file import S3FileLoader

# Bucket name as it appears in the metadata source printed below
s3_bucket = "myS3bucket"

loader = S3FileLoader(bucket=s3_bucket,
                      key='myfile.json',
                      region_name="us-east-1")

documents = loader.load()

When I print the loader and the documents, I get something like the following:

Loader - <langchain_community.document_loaders.s3_file.S3FileLoader object at 0x0000014CBB850B90>
Documents - [Document(metadata={'source': 's3://myS3bucket/myfile.json'}, page_content='')]

For simplicity, my Access Key, Secret Key and Session Token are all set in the Windows command prompt from which I run the Python script.
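To rule out a credentials problem, a check like the one below can confirm that the Python process actually sees the values. This is only a sketch: it assumes the keys are exposed through the standard AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN), and the helper name missing_aws_env_vars is hypothetical.

```python
import os

# Standard environment variable names read by the AWS SDKs.
AWS_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def missing_aws_env_vars(environ=None):
    """Return the AWS credential variables that are unset or empty.

    `environ` defaults to the current process environment; passing a
    plain dict makes the helper easy to test.
    """
    if environ is None:
        environ = os.environ
    return [name for name in AWS_ENV_VARS if not environ.get(name)]


# Report which variables (if any) the current process cannot see.
print("Missing AWS variables:", missing_aws_env_vars())
```

An empty list only shows that the variables are visible to the script. It does not prove that the credentials are valid or that the session token has not expired.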

In AWS, my role/policy has PutObject, GetObject and ListObject permissions on that bucket.
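A way to check that GetObject really returns data is to fetch the same object directly and look at its size. The sketch below assumes boto3 is installed and takes the S3 client as a parameter. The helper names fetch_object_bytes and describe_object are hypothetical, and the bucket and key in the usage comment are the ones from the output above.

```python
def fetch_object_bytes(s3_client, bucket, key):
    """Download one S3 object and return its raw bytes."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # The "Body" entry is a file-like stream; read() returns the payload.
    return response["Body"].read()


def describe_object(s3_client, bucket, key, preview_chars=200):
    """Return the object's size and a short text preview of its start."""
    body = fetch_object_bytes(s3_client, bucket, key)
    return {
        "size_bytes": len(body),
        # errors="replace" keeps the preview printable for non-UTF-8 data.
        "preview": body[:preview_chars].decode("utf-8", errors="replace"),
    }


# Usage (requires boto3 and valid credentials):
#   import boto3
#   client = boto3.client("s3", region_name="us-east-1")
#   print(describe_object(client, "myS3bucket", "myfile.json"))
```

If this reports a non-zero size and a sensible preview, S3 access and permissions are probably fine, and the empty page_content is more likely to come from the step that parses the downloaded file.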

I can't figure out why page_content is empty. The same thing happens with both S3FileLoader and S3DirectoryLoader.

I have also tried running the script from both a Windows command prompt and Ubuntu via WSL, with the same result.

Tags: python, Langchain s3DirectoryLoader and s3FileLoader have empty page_contents, Stack Overflow