Rails Sitemap_generator using aws_fog configuration - ruby-on-rails

I am using the sitemap_generator gem with Rails 6 on Heroku. I am told the easiest way is to use an S3 bucket on Amazon and bridge to it with fog-aws.
The implementation is well documented on the gem side, but I am struggling to make sure the Amazon config is correct.
I searched a lot and couldn't find anything, so I was hoping someone could help.
I configured an S3 bucket (let's name it example) and created it in the region US East (Ohio). This is all pretty simple.
The Properties tab: what should and shouldn't be selected? I selected nothing.
The Permissions tab: I made the bucket public, although this feels wrong. The bucket is for a sitemap, so it should be public, right?
I set up my region as per the documentation:
SitemapGenerator::Sitemap.default_host = "https://www.example.com"
SitemapGenerator::Sitemap.public_path = 'tmp/'
SitemapGenerator::Sitemap.sitemaps_host = "https://example.s3.amazonaws.com/"
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
SitemapGenerator::Sitemap.adapter = SitemapGenerator::S3Adapter.new(
  fog_provider: 'AWS',
  aws_access_key_id: Rails.application.credentials.aws[:access_key_id],
  aws_secret_access_key: Rails.application.credentials.aws[:secret_access_key],
  fog_region: 'us-east-2'
)
When I run rake sitemap:refresh:no_ping on my local machine I get :status_line => "HTTP/1.1 301 Moved Permanently\r\n".
I thought maybe I needed to add the sitemaps folder to the S3 bucket, so I did, but I still get :status_line => "HTTP/1.1 301 Moved Permanently\r\n".
Any tips would be great.

I am also using the sitemap_generator gem in my Rails application (Heroku-hosted, Rails 6). I have the following code inside config/sitemap.rb, before SitemapGenerator::Sitemap.create. I have configured it with the aws-sdk-s3 gem, and it goes like this:
require 'aws-sdk-s3'
SitemapGenerator::Sitemap.default_host = "https://www.example.com"
SitemapGenerator::Sitemap.sitemaps_host = 'https://example.s3.eu-west-2.amazonaws.com/'
SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new(
  Rails.application.credentials.dig(:amazon, :s3, :bucket),
  aws_access_key_id: Rails.application.credentials.dig(:amazon, :s3, :access_key_id),
  aws_secret_access_key: Rails.application.credentials.dig(:amazon, :s3, :secret_access_key),
  aws_region: Rails.application.credentials.dig(:amazon, :s3, :region)
)
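Note that this configuration uses a region-specific bucket host for sitemaps_host, whereas the question used the generic https://example.s3.amazonaws.com/ host. As a minimal sketch, assuming the virtual-hosted-style URL pattern https://<bucket>.s3.<region>.amazonaws.com/ and a hypothetical helper named s3_sitemaps_host (it is not part of sitemap_generator), the host can be derived from the same bucket and region values that are passed to the adapter, so the two cannot drift apart. Whether a host or region mismatch is what causes the 301 in the question is an assumption, not something confirmed here.

```ruby
# Hypothetical helper (not part of sitemap_generator): build the
# virtual-hosted-style S3 URL for a bucket in a given region.
def s3_sitemaps_host(bucket, region)
  "https://#{bucket}.s3.#{region}.amazonaws.com/"
end

# Example values taken from the answer above.
puts s3_sitemaps_host("example", "eu-west-2")
```

In config/sitemap.rb the result could be assigned to SitemapGenerator::Sitemap.sitemaps_host, using the bucket and region already read from the credentials.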

Related

aws-sdk-s3 doesn't give off NetworkingError, even if I put a wrong endpoint

I'm quite new to both RoR and AWS, so if I overlook something obvious, please forgive me.
I'm trying to create a Ruby on Rails system which fetches data from MinIO running in a Docker container.
MinIO is S3-compatible local storage, so they say the aws-sdk-s3 gem does the trick.
I wrote something like the following:
require 'aws-sdk-s3' # not needed if Bundler already requires the gem

Aws.config.update(
  endpoint: 'https://minio:9000',
  access_key_id: "minio",
  secret_access_key: "miniopass",
  force_path_style: true,
  region: 'us-east-1'
)
client = Aws::S3::Client.new
data = client.get_object(bucket: "bucket-01", key: 'example_01.txt').body.read
puts data
However, this code prints an empty string "".
I found it weird, so I intentionally gave a wrong endpoint to Aws.config.update, which should raise Seahorse::Client::NetworkingError.
Aws.config.update(
  endpoint: 'https://this-address-is-a-bullshit',
  access_key_id: "minio",
  secret_access_key: "miniopass",
  force_path_style: true,
  region: 'us-east-1'
)
However, it doesn't raise any error.
As a test, I put raise RuntimeError in the code, and then "rails aborted! RuntimeError" was shown as it should be. That means the missing NetworkingError isn't a console problem; aws-sdk-s3 simply doesn't raise an error.
I can't wrap my head around this. If you can share any ideas about it, I'd really appreciate it.

Amazon CloudFront doesn't require me to invalidate objects

I have a Ruby on Rails application where users can upload or change their avatar. At first I stored the images in Amazon S3, but then I realized that the content was being served slowly and decided to use Amazon CloudFront.
There is no problem uploading and fetching avatars. However, I can see that an updated photo changes immediately, whereas I expected to have to invalidate it through the CloudFront API. Also, uploading an image takes a lot of time.
At this point I can't tell whether I'm using CloudFront correctly or not.
This is my carrierwave.rb file inside config/initializers:
CarrierWave.configure do |config|
  config.fog_provider = 'fog/aws'
  config.fog_credentials = {
    provider: 'AWS',
    aws_access_key_id: 'key',
    aws_secret_access_key: 'value',
    region: 'us-east-1'
  }
  config.storage :fog
  config.asset_host = 'http://images.my-domain.com'
  config.fog_directory = 'bucket_name'
  config.fog_public = true
  config.fog_attributes = { cache_control: "public, max-age=315576000" }
end
I can't see what I'm missing. How can I be sure that I'm using CloudFront properly?
Thanks.
Your images aren't being stored in CloudFront, they're being served through CloudFront's CDN.
The first request for an image served through CF looks like this:
Browser -> CloudFront -> S3
                         |
Browser <- CloudFront <-
The second request for an image just looks like this:
Browser -> CloudFront
           |
Browser <-
The second request never hits S3 because CF has cached the result for that URL.
Now, your avatars are probably updating immediately simply because each upload to S3 results in a new URL, and thus an immediate update. This is how you want it to work.
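The request flow described above can be sketched with a toy model. This is only an illustration under simplifying assumptions (a single edge location, no TTLs or cache headers); ToyCdn and the avatar paths are made-up names, not CloudFront's API. It shows that a repeated URL is answered from the cache, while a new URL goes back to the origin, which is why no invalidation is needed when each upload gets its own URL.

```ruby
# Toy model of a CDN edge cache keyed by URL path. Not CloudFront's real
# behavior in detail: no TTLs, no headers, a single edge location.
class ToyCdn
  attr_reader :origin_hits

  def initialize(origin)
    @origin = origin   # Hash of path => content, standing in for S3
    @cache = {}        # what the edge has already fetched
    @origin_hits = 0
  end

  # Return the cached copy if present; otherwise fetch from the origin once.
  def get(path)
    @cache.fetch(path) do
      @origin_hits += 1
      @cache[path] = @origin.fetch(path)
    end
  end
end

origin = {
  "/avatars/1/v1.png" => "old avatar",
  "/avatars/1/v2.png" => "new avatar"
}
cdn = ToyCdn.new(origin)

cdn.get("/avatars/1/v1.png") # first request: goes to the origin
cdn.get("/avatars/1/v1.png") # second request: served from the cache
cdn.get("/avatars/1/v2.png") # new URL after a re-upload: goes to the origin again
puts cdn.origin_hits         # prints 2
```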

How to properly setup Rails + Paperclip + AWS CDN + Heroku

It seems like I finally figured out how to set up Rails + Paperclip + AWS CDN + Heroku.
Everything seems to be working. Both CSS and JS files load from the CDN, as well as images.
Unfortunately, the sharing functionality is broken. Open Graph can't parse the image URL. I assume it's because the links are in this format: https:////drex16ydhdd8s.cloudfront.net/...rest_of_url
Originally, a long time ago, I configured the CDN link to be //drex16ydhdd8s.cloudfront.net. I understand I need to remove the slashes in front of the link and make it drex16ydhdd8s.cloudfront.net instead.
The problem is, if I do that, Heroku gives me an Application Error (it displays their static page).
The logs don't show anything helpful, other than that the app seems to go over the memory limit pretty much immediately.
I've contacted Heroku support, but their response was
You should not need any slashes, it should just be a host name. (As seen in the documentation for config.action_controller.asset_host.)
If removing the slashes causes errors, you'll want to debug those errors.
I tried it locally, and everything seems to work as expected.
environments/production.rb
config.action_controller.asset_host = ENV.fetch("ASSET_HOST", ENV.fetch("APPLICATION_HOST"))
config.paperclip_defaults = {
  storage: :s3,
  s3_protocol: :https,
  s3_region: ENV["AWS_REGION"],
  url: ":s3_alias_url",
  path: "/:class/:attachment/:id_partition/:style/:filename",
  s3_host_alias: ENV.fetch("ASSET_HOST"),
  s3_credentials: {
    bucket: ENV["S3_BUCKET_NAME"],
    access_key_id: ENV["AWS_ACCESS_KEY_ID"],
    secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"]
  },
  default_url: "https://s3.amazonaws.com/ezpoisk/missing-small.png"
}
Env variable:
ASSET_HOST = //drex16ydhdd8s.cloudfront.net
On the CDN I have 2 behaviors:
/assets/* - that points to the domain name
default (*) - that points to the S3 bucket
Does anyone have any ideas?
Solution.
I had this in production.rb:
config.assets.compile = true
I'm not strong on the details here; I just remember that I made a note on this line to possibly remove it when switching to a CDN.
After digging through this answer I decided to try it out. So I
removed the line,
deployed,
and all works fine.
I then tried updating the CDN link.
At first the same issue persisted (the URL for some reason was /drex16ydhdd8s.cloudfront.net), but after a few seconds it now seems to be all good.
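Since Heroku support's advice was that the value should be just a host name, one way to guard against a stray scheme or leading slashes is to normalize the environment value before using it. This is a minimal sketch, not what the original setup did; bare_host is a hypothetical helper name, and applying its result to s3_host_alias is an assumption about where the https://// prefix came from.

```ruby
# Hypothetical helper: reduce an ASSET_HOST-style value to a bare host name
# by dropping an optional http(s): scheme, leading slashes, and trailing slashes.
def bare_host(value)
  value.to_s.strip
       .sub(%r{\A(?:https?:)?/+}, "")
       .sub(%r{/+\z}, "")
end

puts bare_host("//drex16ydhdd8s.cloudfront.net")
# prints drex16ydhdd8s.cloudfront.net
```

With this, something like bare_host(ENV.fetch("ASSET_HOST")) yields the same host whether the variable was set with or without the leading slashes.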

Http links from Paperclip with s3

I'm using Paperclip with s3_permissions = :private.
Some assets of the same model are public, and I want to generate an HTTP link for them (not HTTPS).
To generate the URL I'm currently using my_model.my_asset.expiring_url(1000).
How can I do that?
Thank you.
You should be able to configure the s3_protocol to be HTTP (it defaults to HTTPS when s3_permissions are not public_read):
# config/application.rb
config.paperclip_defaults = {
  storage: :s3,
  s3_protocol: 'http',
  s3_permissions: :private,
  s3_credentials: { ... }
}
That said, if you set the protocol to HTTP, you will potentially be exposing the assets anyway. HTTPS would be preferable if you care about the privacy of the assets.
You can read more about the available options here.

rails carrierwave + fog speed optimisation

I'm currently using CarrierWave with fog to upload and store images in an S3 bucket, but the images load much slower than they should. These images load almost instantly when stored as part of the application, but stored with CarrierWave and fog they take a few seconds.
Is this a problem with my S3 setup or with CarrierWave/fog? My CarrierWave config is the following:
CarrierWave.configure do |config|
  config.fog_credentials = {
    :provider => 'AWS', # required
    :aws_access_key_id => '***', # required
    :aws_secret_access_key => '***', # required
  }
  config.cache_dir = "#{Rails.root}/tmp/uploads" # To let CarrierWave work on heroku
  config.fog_directory = 'bucketname' # required NB: having '.' in the bucket name creates an untrusted certificate
  config.fog_public = false # optional, defaults to true
  config.fog_attributes = {'Cache-Control'=>'max-age=315576000'} # optional, defaults to {}
end
My S3 bucket is configured for the US and I'm located in Australia, so that might pose a few problems. But my Heroku app is also hosted in the US, and it loads the same images very quickly when they're stored as part of the app itself. Maybe AWS isn't the best solution?
Anyway, any suggestions on how I can improve the image load time would be great. It just seems slower than it should be.
It sounds like you want to use CloudFront, Amazon's CDN (content delivery network) service that integrates with S3. Using a CDN will globally replicate the content you're storing in S3 (for a price), which should improve your load times.
After you set up a CloudFront account and link it to S3, add a line like the following to your CarrierWave configuration:
config.asset_host = "http://1234567.cloudfront.net"
Use the URL that you get during CloudFront setup.
Unfortunately, it looks like you may also need to set config.fog_public = true for CarrierWave to be able to use Amazon's CDN.
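To make the effect of asset_host concrete: the object path stays the same and only the host in front of it changes. The sketch below illustrates that idea in plain Ruby; cdn_url is a hypothetical helper written for this example (it is not CarrierWave's API), and the bucket host and file path are made-up values.

```ruby
require "uri"

# Hypothetical helper, for illustration only: keep the object path from an
# S3 URL and put it behind the CDN host instead.
def cdn_url(s3_url, cdn_host)
  path = URI.parse(s3_url).path
  "#{cdn_host.chomp('/')}#{path}"
end

puts cdn_url("https://bucketname.s3.amazonaws.com/uploads/photo.png",
             "http://1234567.cloudfront.net")
# prints http://1234567.cloudfront.net/uploads/photo.png
```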

Resources