I automated the creation and configuration of the Postgres db for my Django project, hosted as an S3 bucket on AWS through Heroku, using the dj_database_url.config() method inside my settings.py. Using this tool, I can just export the Postgres db env variables in my shell using the format:
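Something like the following (a minimal sketch; the user, password, host, and db name below are placeholders rather than my real values):

```python
# In my shell I export the connection string first, e.g.:
#   export DATABASE_URL="postgres://myuser:mypassword@myhost.amazonaws.com:5432/mydb"

# settings.py
import dj_database_url

DATABASES = {
    # dj_database_url.config() reads DATABASE_URL from the environment
    # and turns it into the dict Django expects for this entry.
    "default": dj_database_url.config(),
}
```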
Then I can enter data from my local dev server as if it were in the cloud. Very convenient!
The problem is, I got an email notification from Amazon yesterday saying that my db contents are public, which prompted me to check my S3 permissions. I found some guides and docs on AWS, on Heroku, and elsewhere around the web that explain how to use AWS and how to view and modify S3 permissions. Guides I came across include:
These guides explain how to set up an S3 bucket from scratch for a Python app; in my case, however, dj_database_url has already done all the heavy lifting. None of these guides explains how to access the HEROKU_POSTGRESQL_<color>_URL instances that already exist on AWS, configured by dj_database_url, and that appear as env variables within my Heroku Dashboard.
I've got an AWS root account set up but I can't figure out how to connect it to the S3 instances initialized by the helpful dj_database_url script.
My question: How do I access the Postgres db S3 buckets through the AWS Dashboard so that I can view and change their permissions?
This might not be true for AWS, but with the MongoDB integration on Heroku you can just click the add-on and it'd take you to the dashboard of whichever provider it is using. I believe it's set up through your Heroku account, not the AWS account.
I'm a bit unfamiliar with your scenario; can you explain how you are using the database with S3? When using AWS, I would have assumed the database is on RDS and the Django application's static and media content is uploaded to S3.
Based on your question, I've identified the issue: I was confusing AWS's RDS with S3. The email that I got from Amazon (that I mentioned in my original post) was warning me that my S3 files are public and may be indexed by public search engines. If that means my static files such as images, JavaScript, and CSS are exposed, then I'm not concerned at all. What concerned me was the production data (blog posts like essay material) I've entered into Postgres, which I had intended to protect from public search engines behind a gateway.
Copied verbatim at the bottom of this post is the email from Amazon.
Postgres is similar. For each Postgres db instance on Heroku, you can drill down and view categories of options like Durability, Settings, and Dataclips. I'm not sure how to use all of them. But under Settings, you can reveal the config vars such as the hostname, user, password, port, Heroku CLI access point, URI, and others. This is all useful information, but there is nothing about assigning db permissions to certain users. At any rate, I'm not concerned about this being a vulnerability any more given that Amazon's email was referring to the contents of the S3 bucket, which isn't a problem for the needs of my project.
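For what it's worth, here's a rough sketch of how I understand those revealed credentials can be used directly from my own machine (assuming psycopg2 is installed and the URI is exported as DATABASE_URL; nothing here is specific to my real setup):

```python
import os
import psycopg2

# The URI shown under Settings is the same value Heroku exposes as the
# DATABASE_URL (or HEROKU_POSTGRESQL_<color>_URL) config var.
conn = psycopg2.connect(os.environ["DATABASE_URL"], sslmode="require")
with conn.cursor() as cur:
    cur.execute("SELECT current_user, current_database();")
    print(cur.fetchone())
conn.close()
```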
We are writing to notify you that you have configured your S3 bucket(s) to be publicly accessible, and this may be a larger audience than you intended. By default, S3 buckets allow only the account owner to access the contents of a bucket; however, customers can configure S3 buckets to permit public access. Public buckets are accessible by anyone on the Internet, and content in them may be indexed by search engines.
We recommend enabling the S3 Block Public Access feature on buckets if public access is not required. S3 bucket permissions should never allow "Principal":"*" unless you intend to grant public access to your data. Additionally, S3 bucket ACLs should be appropriately scoped to prevent unintended access to "Authenticated Users" (anyone with an AWS account) or "Everyone" (anyone with Internet access) unless your use case requires it. For AWS's definition of "Public Access," please see The Meaning of "Public" [1].
The list of buckets which can be publicly accessed is below:
slashtest02 | us-east-2
You can ensure individual buckets, or all your buckets, prevent public access by turning on the S3 Block Public Access feature [2]. This feature is free of charge and it only takes a minute to enable. For step by step instructions on setting up S3 Block Public Access via the S3 management console, see Jeff Barr's blog [3], or check out the video tutorial on Block Public Access [4].
If you have a business need to maintain some level of public access, please see Overview of Managing Access [5] for more in-depth instructions on managing access to your bucket to make sure you've permitted the correct level of access to your objects. If you would like more information about policy configuration in S3, please refer to Managing Access in Amazon S3 [6], and S3 Security Best Practices [7].
We recommend that you make changes in accordance with your operational best practices.
If you believe you have received this message in error or if you require technical assistance, please open a support case [8].
Amazon Web Services, Inc. is a subsidiary of Amazon.com, Inc. Amazon.com is a registered trademark of Amazon.com, Inc. This message was produced and distributed by Amazon Web Services Inc., 410 Terry Ave. North, Seattle, WA 98109-5210
I don't fully understand the email from Amazon. What else is Amazon trying to say about my static file data being at risk? @CodenameTim and @Jefwillems: Based on your understanding (even a little insight would be great) of what is said in the email, what other security implications might there be for protecting data for a Django project in general?
Hi @Drone4four, Amazon is trying to warn you that the S3 bucket is available to the public (not your RDS instance / database). Your static content is likely meant to be public, unless there's some stuff you don't want available to everyone.
Your media content is a different story. If all file and image uploads to the web application are meant to be public, then you're still in the clear. However, if users are uploading something that is specific to themselves or their organization and is not meant to be shared with others, you should consider configuring your S3 bucket to block public access for that part of the bucket.
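If it helps, here's a rough sketch of enabling S3 Block Public Access with boto3; the bucket name is just the one from the email Amazon sent you, and this assumes your AWS credentials are already configured locally:

```python
import boto3

s3 = boto3.client("s3")

# Turn on all four Block Public Access settings for the bucket named in
# Amazon's email; this blocks public ACLs and public bucket policies.
s3.put_public_access_block(
    Bucket="slashtest02",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```

Note that Block Public Access applies to the whole bucket, so if only part of the bucket needs to stay public, a bucket policy scoped to a prefix (as described in the docs Amazon linked) is the more targeted tool.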
In addition to the information that @CodenameTim has provided, I'll add that my personal concern about such a warning is someone wanting to affect me personally by executing millions of requests for files stored on S3. At some point, they could really drive up your AWS expenses. (It's been about 6 years since I've done an S3-based site - I don't remember what all the key thresholds are for usage.)
Thanks @KenWhitesell for your reply as well. I've seen Matt Drudge pull this "sleight of hand" on his news aggregator website, where he shows the title of a story on another news site but then embeds a large hi-res "Getty" image/photo hosted by the Associated Press, which effectively delegates the bandwidth cost to a third party. Drudge gets tens of millions of hits a day at the expense of the Associated Press.
In the scenario you describe, Ken, my public S3 bucket data could be linked to or embedded by someone else and, hypothetically speaking, be hit by a high volume of third-party traffic. If this is what AWS is warning about in the original email, then the problem is much clearer to me now.
If the S3 data were restricted, this would protect it from invasive traffic. The AWS email above includes all sorts of resources and documentation on how to protect S3 buckets. I'll continue to munch on them for now. If I have any further questions, I will be sure to report back here or perhaps leverage Stack Overflow for more AWS-specific expertise if required.
Actually, I don't think this is their warning - it's mine, based on past experience. (I don't think Amazon cares how much you spend on S3 data transfers - or, perhaps, they may want you to spend more...)
No, it really is just a confidentiality / privacy issue. All they're really saying is that with a public bucket:
- anyone on the Internet can read every file in that bucket, and
- its contents may end up indexed by public search engines.
That's the warning in a nutshell. Everything else in that email is a description of methods to mitigate that risk, in case you're not ok with everyone-and-their-brother having access to every file in that bucket. (e.g. You might have files that should be accessible to the public, but you might also have files in different folders that you don't want everyone to be able to see. But by having them in that same bucket, everyone could find those non-public files.)
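If you ever want that split, a rough sketch of one way to express it is a bucket policy that only opens up a specific prefix; the bucket name here is the one from your email, the static/ prefix is just an illustration, and this assumes Block Public Access isn't set to reject public policies:

```python
import json

import boto3

s3 = boto3.client("s3")

# Hypothetical policy: only objects under static/ are publicly readable;
# everything else in the bucket stays private to the account.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadForStaticOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::slashtest02/static/*",
        }
    ],
}
s3.put_bucket_policy(Bucket="slashtest02", Policy=json.dumps(policy))
```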