
Setting a custom endpoint_url #120

Closed

npezolano opened this issue Mar 11, 2018 · 7 comments

@npezolano

How do I pass a custom endpoint URL to s3fs.S3FileSystem?

I've tried:

kwargs = {'endpoint_url': "https://s3.wasabisys.com",
          'region_name': 'us-east-1'}
self.client = s3fs.S3FileSystem(key=AWS_ACCESS_KEY_ID,
                                secret=AWS_SECRET_ACCESS_KEY,
                                use_ssl=True,
                                **kwargs)

However, I get the error:

  File "s3fs/core.py", line 215, in connect
    **self.kwargs)
TypeError: __init__() got an unexpected keyword argument 'endpoint_url'

I've also tried passing the kwargs as config_kwargs and as s3_additional_kwargs, with similar errors. The problem with those is that endpoint_url gets passed to botocore.Config, where it doesn't belong.

I can verify that boto3 works with the following:

import boto3

client = boto3.client(
    "s3",
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
    endpoint_url="https://s3.wasabisys.com",
    use_ssl=True,
    region_name="us-east-1",
    api_version=None,
    verify=None,
    config=None,
)

fs-s3fs also works.

@martindurant (Member)

Duplicate of #119

It's client_kwargs={'endpoint_url': 'https:...'} that you want.
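
For reference, here is roughly how the snippet from the report above looks with that change. A sketch only; the endpoint and region come from the original post, and the credential variables are assumed to be defined as before:

import s3fs

# endpoint_url belongs in client_kwargs, which s3fs forwards to the
# underlying boto3/botocore S3 client; region_name can go there too.
fs = s3fs.S3FileSystem(
    key=AWS_ACCESS_KEY_ID,
    secret=AWS_SECRET_ACCESS_KEY,
    use_ssl=True,
    client_kwargs={
        'endpoint_url': 'https://s3.wasabisys.com',
        'region_name': 'us-east-1',
    },
)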

@npezolano (Author)

Thanks for the quick reply! That needs to be added to the documentation.

@martindurant (Member)

@npezolano, since you've just looked through the documentation, you may be best placed to consider where this information should appear. Would you be willing to provide a PR?

@dapperfu commented Mar 26, 2020

My $0.02, after finding this solution only through extensive googling: put it somewhere new users will find it earlier.

We are using a self-hosted MinIO setup. Almost all online documentation starts from the assumption that everyone is using Amazon's S3 cloud. I spent multiple hours just working out the search terms I needed to find endpoint_url.

An example covering s3fs with a self-hosted or non-AWS server would probably have gone a long way, something like the sketch below.
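
A minimal sketch for a self-hosted MinIO server; the endpoint, credentials, and bucket name below are placeholders, not values from any real deployment:

import s3fs

# Point s3fs at a local MinIO server instead of AWS S3.
# http://localhost:9000 is MinIO's default address; the key and
# secret are made up and must match your server's configuration.
fs = s3fs.S3FileSystem(
    key='minio-access-key',
    secret='minio-secret-key',
    client_kwargs={'endpoint_url': 'http://localhost:9000'},
)

fs.ls('my-bucket')  # listing works the same as against AWS S3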

@martindurant (Member)

This would naturally find a place in the S3FileSystem docstring and in the RST documentation (the docs/source/ directory). Would you be willing to add the extra information in a PR? I have personally never used s3fs with anything other than AWS, and I suppose it shows; others have used it with various S3-compatible services.

@michaelosthege
Is my understanding correct that it is currently not possible/supported to use a non-Amazon S3 endpoint like this:

df.to_csv("s3://my_big_bucket/testdata.csv")

instead of:

fs = s3fs.S3FileSystem(
    anon=False,
    key=os.environ.get("aws_access_key_id", None),
    secret=os.environ.get("aws_secret_access_key", None),
    client_kwargs=dict(
        endpoint_url=os.environ.get("aws_endpoint_url", None)
    )
)

with fs.open("my_big_bucket/testfile.csv", "w") as file:
    df.to_csv(file)

?

@martindurant (Member)

Watch this space! Once pandas-dev/pandas#34266 is in, we will be adding storage_options= to the various pandas IO functions, as is already available in the Dask versions.
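
For example, this is what the Dask version looks like already, and the pandas form will mirror it. A sketch: the credentials are placeholders, and the endpoint is the one from the original report:

import dask.dataframe as dd

# Placeholder credentials; swap in your own service's values.
opts = {
    "key": "my-access-key",
    "secret": "my-secret-key",
    "client_kwargs": {"endpoint_url": "https://s3.wasabisys.com"},
}

# storage_options is forwarded to s3fs, so the custom endpoint works
# directly inside the IO calls, with no explicit S3FileSystem needed.
df = dd.read_csv("s3://my_big_bucket/testdata.csv", storage_options=opts)
df.to_csv("s3://my_big_bucket/out-*.csv", storage_options=opts)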
