How can I preserve storage space and load time with Active Storage? - ruby-on-rails

I have a user submission form that includes images. Originally I was using Carrierwave, but with that the image is sent to my server for processing first before being saved to Google Cloud Services, and if the images are too large, the request times out and the user just gets a server error.
So what I need is a way to upload directly to GCS. Active Storage seemed like the perfect solution, but I'm getting really confused about how hard compression seems to be.
An ideal solution would be to resize the image automatically upon upload, but there doesn't seem to be a way to do that.
A next-best solution would be to create a resized variant upon upload using something like @record.images.first.variant(resize_to_limit: [xxx, xxx]) (using the image_processing gem), but the docs seem to imply that a variant can only be created upon page load, which would obviously be extremely detrimental to load time, especially if there are many images. More evidence for this is that when I create a variant, it's not in my GCS bucket, so it clearly only exists in my server's memory. If I try
@record.images.first.variant(resize_to_limit: [xxx, xxx]).service_url
I get a url back, but it's invalid. I get a failed image when I try to display the image on my site, and when I visit the url, I get these errors from GCS:
The specified key does not exist.
No such object.
so apparently I can't create a permanent url.
A third best solution would be to write a Google Cloud Function that automatically resizes the images inside Google Cloud, but reading through the docs, it appears that I would have to create a new resized file with a new url, and I'm not sure how I could replace the original url with the new one in my database.
To summarize, what I'd like to accomplish is to allow direct upload to GCS, but control the size of the files before they are downloaded by the user. My problems with Active Storage are that (1) I can't control the size of the files on my GCS bucket, leading to arbitrary storage costs, and (2) I apparently have to choose between users having to download arbitrarily large files, or having to process images while their page loads, both of which will be very expensive in server costs and load time.
It seems extremely strange that Active Storage would be set up this way and I can't help but think I'm missing something. Does anyone know of a way to solve either problem?

Here's what I did to fix this:
1- I upload the attachment that the user added directly to my service provider (I use S3).
2- I add an after_commit callback that enqueues a Sidekiq worker to generate the thumbs (a sketch of steps 2 and 3 follows this list)
3- My Sidekiq worker (AttachmentWorker) calls my model's generate_thumbs method
4- generate_thumbs will loop through the different sizes that I want to generate for this file
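Here is a minimal sketch of steps 2 and 3, assuming a model named Attachment with has_one_attached :file; the callback and worker names mirror the steps above, everything else is illustrative:

class Attachment < ApplicationRecord
  has_one_attached :file

  # Step 2: enqueue thumbnail generation once the record (and upload) is committed
  after_commit :enqueue_thumb_generation, on: :create

  def enqueue_thumb_generation
    AttachmentWorker.perform_async(id)
  end
end

class AttachmentWorker
  include Sidekiq::Worker

  # Step 3: the worker calls the model's generate_thumbs method
  def perform(attachment_id)
    Attachment.find(attachment_id).generate_thumbs
  end
end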
Now, here's the tricky part:
def generate_thumbs
  [
    { resize: '300x300^', extent: '300x300', gravity: :center },
    { resize: '600>' }
  ].each do |size|
    self.file_url(size, true)
  end
end

def file_url(size, process = false)
  value = self.file # where file is my has_one_attached
  if size.nil?
    url = value
  else
    url = value.variant(size)
    if process
      url = url.processed
    end
  end
  return url.service_url
end
In the file_url method, we only call .processed when process = true is passed. I've experimented a lot with this method to get the best possible performance out of it.
.processed checks with your bucket whether the variant file already exists; if it doesn't, it generates the new file and uploads it.
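Since the worker has already processed the variants, a view can then build the URL without triggering any processing on page load. A hypothetical usage line; the options hash must match one of the pre-generated sizes exactly so the already-uploaded variant is reused:

# attachment here stands for one of these records (illustrative name)
image_tag attachment.file_url({ resize: '300x300^', extent: '300x300', gravity: :center })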
Also, here's another question that I have previously asked concerning ActiveStorage that can also help you: ActiveStorage & S3: Make files public

I don't know Active Storage at all. However, a good pattern for your use case is to resize the image when it comes in. For this:
Let the user store the image in Bucket1
When the file is created in Bucket1, an event is triggered. Attach a Cloud Function to this event
The Cloud Function resizes the image and stores it in Bucket2 (sketched below)
You can delete the image in Bucket1 at the end of the Cloud Function, keep it for a few days, or move it to cheaper storage (to keep the original image in case of issues). For these last two actions, you can use lifecycle rules to delete files or change their storage class.
Note: You can use the same bucket (instead of Bucket1 and Bucket2), but an event to resize the image will then be sent every time a file is created in the bucket. You can use Pub/Sub as middleware and add a filter on it so your function is triggered only when the file is created in the correct folder. I wrote an article on this
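A minimal sketch of this flow as a Ruby Cloud Function, assuming the functions_framework, google-cloud-storage and mini_magick gems; the bucket names, the 600px limit and the use of MiniMagick are illustrative assumptions, not part of the pattern above:

require "functions_framework"
require "google/cloud/storage"
require "mini_magick"
require "tempfile"

# Triggered by the "object finalized" event on Bucket1
FunctionsFramework.cloud_event "resize_image" do |event|
  payload = event.data                      # bucket + object name from the event
  storage = Google::Cloud::Storage.new
  source  = storage.bucket(payload["bucket"]).file(payload["name"])

  Tempfile.create(["original", File.extname(payload["name"])]) do |tmp|
    source.download(tmp.path)               # pull the original from Bucket1
    image = MiniMagick::Image.open(tmp.path)
    image.resize("600x600>")                # shrink, never upscale
    image.write(tmp.path)
    # write the resized copy to Bucket2 under the same object name
    storage.bucket("bucket2-resized").create_file(tmp.path, payload["name"])
  end
end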

Related

How to create large CSV file and send it to front end

I'm building a project where the front end is react and the backend is ruby on rails and uses a postgres DB. A required functionality is the ability for users to export a large dataset. So they'll get a table view and click "export" and that will send a request to the backend which should create a CSV file and send it to the front end.
This is the query that displays the data in the table and how it's executed (using find_by_sql)
query = <<-SQL
SELECT * FROM ORDERS WHERE ORDERS.STORE_ID = ? OFFSET ? LIMIT ?
SQL
query_result = Order.find_by_sql([query, store_id.to_i, offset.to_i, 50])
Now whenever the users click export, it's going to make a request to the same endpoint except it'll set a flag to notify the backend that it wants a CSV file and the limit will be much greater than 50...it could be hundreds of thousands to millions of records.
What is the best way to create a CSV to send to the front end, taking into account that the number of records will be large?
You have a couple of options:
Create a temporary file, use the standard CSV library to populate it, and then use send_file to dispatch that file to the user (a minimal sketch of this option appears at the end of this answer).
Depending on the size of data and/or your server's ability to host large temporary files, spooling to a tempfile might take too long or be otherwise impractical. In that case, you might want to stream the CSV data as it's generated, which is more complicated to set up but lessens the impact on your server.
This article has some well thought out steps to set up an interface for streaming data. As a bonus, it also delegates the act of generating the CSV to PostgreSQL itself. This will give you the best possible performance, but at the expense of code readability. However, it should set you on your way.
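A minimal sketch of the first option; the controller action and column names are purely illustrative, and find_each keeps memory flat by loading rows in batches:

require "csv"

class OrdersController < ApplicationController
  def export
    tempfile = Tempfile.new(["orders", ".csv"])
    CSV.open(tempfile.path, "wb") do |csv|
      csv << %w[id store_id total created_at]                       # header row
      Order.where(store_id: params[:store_id]).find_each do |order|  # batches, not one huge query
        csv << [order.id, order.store_id, order.total, order.created_at]
      end
    end
    send_file tempfile.path, filename: "orders.csv", type: "text/csv"
  end
end

For the streaming option, the linked article's approach (CSV generated by PostgreSQL itself and streamed through the response body) avoids the temporary file entirely.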

Will this lua code work to download certain files on my GMOD server

I have recently been building my GMOD server and it is slowly getting popular, but I was interested in creating an addon, so I put together something that should download some workshop links in the loading screen and others in-game. This is what I have created.
sv_auto_download:
// Write the map download codes below
resource.addworkshop( "" )

function DownloadFiles()
    // Write the texture codes below
    resource.addworkshop( "" )
    return ""
end

hook.Add ( "PlayerInitialSpawn", "DownloadFiles" )
No, this will not work.
Firstly, PlayerInitialSpawn runs after the loading screen. Also, resource.addworkshop is a server-side function that is loaded once so that the server knows which workshop files to load, meaning addons will still be downloaded during the loading screen anyhow.
You cannot "download some workshop links in the loading screen and others in-game", and you should not force players to download 10 GB of models if they don't want to.
The best way to get players to download addons is through the workshop.
Create a collection on the steam workshop for your addons, for example http://steamcommunity.com/sharedfiles/filedetails/?id=1244735564
Head over to http://steamcommunity.com/dev/apikey and use your server's IP address as the website and save your api key somewhere safe (i.e. don't share it)
Go to your launch options for srcds.exe (either a .bat file or in the server dashboard) and add -authkey 3XAMPL3K3YF0RTH3T3ST3 +host_workshop_collection (collection ID), with the collection ID being the ?id=1244735564 part of the collection's URL
Then players will automatically download the server content, and it is easy for you to add more addons. It also serves as a way for players to quickly and permanently download large models if they wish to play on your server for an extended period of time.
By the way, you forgot to include the function delegate into the hook.Add:
hook.Add ( "PlayerInitialSpawn", "DownloadFiles", DownloadFiles )

Session loses its data with Carrierwave multi files upload using jquery file upload

Simply put, I have a model called Estate which has_many :images (Image model), and I'm using CarrierWave & jQueryFileUpload to handle these in the Estate New/Edit forms.
While creating a new estate and uploading an image, jQueryFileUpload does the upload via an AJAX request, so I store all uploaded image paths in a session array, then use this array in the create or update actions to move the images from the tmp directory to the actual directory.
This works fine for me, but the problem is that when I select more than one photo at a time, the session array stores only the last selected image; none of the other images get pushed into it.
def images_url_list
  @image = Image.new(image_params)
  session[:cached_images_paths] << @image.image_file.current_path
end
I've debugged this action and found that if I select 5 images at a time, the images_url_list action fires 5 times. Say I first upload an image whose path is "path0", then upload 5 images with paths ["path1","path2","path3","path4","path5"], and the session already holds "path0": after the first upload the session is ["path0", "path1"], after the second it is ["path0","path2"], and so on until the last image, path5.
So the final count is only 2 image paths rather than 6.
Can anyone tell me what exactly the problem is?
I think the problem is the combination of multiple concurrent requests and how the session is saved.
In my case, using a similar environment, only the last uploaded file was saved. I couldn't make it work by adding the files to a filesUploaded array, so I thought about an alternative solution.
A good approach might be (a minimal sketch follows this list):
Generate a session upload hash and store it for each uploaded file in the upload listener
Use it to identify the uploaded files after the multi-upload is finished
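A hypothetical sketch of that idea; the upload_token column and the exact places the token is set are assumptions. The point is that each AJAX request writes to the database instead of racing to overwrite one shared session array:

# When the New/Edit form is first rendered, remember one token for this upload batch
session[:upload_token] ||= SecureRandom.uuid

# In the AJAX upload action, stamp each uploaded image with the token
def images_url_list
  @image = Image.new(image_params)
  @image.upload_token = session[:upload_token]
  @image.save!
end

# In Estate create/update, collect everything uploaded under that token
@estate.images = Image.where(upload_token: session[:upload_token])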

Advantages and disadvantages of BLOBS (security)

A year ago, when I did simple PHP sites for people with a simple MySQL database, I was brought up to think that storing an entire image in the database was possible but a terrible idea. Instead you should store the image in the filesystem and simply store an image path in the database. I did agree with that from the start, despite my inexperience. It must keep the database light when you're backing it up to an external service, and makes it faster during actual local use. This latter point, however, is complete speculation, and I'd like someone to clarify my theories:
When you store the images associated with objects in the database as a BLOB, when you request this object, is the whole object and its attributes (including this huge amount of image information) written to memory, even when it's not needed? E.g.
2.0.0p247 :001 > Object.column_names
=> ["id", "name", "blob"]
2.0.0p247 :002 > Object.first.blob
=> # not sure what this will return! I'm guessing a matrix-like wall of image information?
2.0.0p247 :003 > Object.first.name
User Load (0.8ms) SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 1
=> "Kitty"
I understand that the call to Object.first.blob will take a relatively long amount of time because we're retrieving a large amount of image information. But will Object.first.name take the same amount of time because Object.first writes everything, id, name and blob all to memory? If the answer to this question is yes, that's a pretty good reason to never use BLOBS. However, if the answer is no, and rails is smart enough to only write requested attributes to memory then BLOBS suddenly become very attractive.
To be quite honest with you guys I'm really crossing my fingers that you'll say storing images in a BLOB is fine and dandy. It'll make things so much easier. Backing up will be simple. It'll feel very nice to back up the dynamic content of the site in one 'modular' upload instead of resorting to some elaborate whenever augmented rake task to make sure the paths and their respective images are uploaded to an external location.
More so it is absolutely impossible to make certain images private with Rails. I've searched high, I've searched low, I've asked here on SO. Got a few upvotes, but no solid response. No tutorials online. Nothing. Bags of tutorials on how to store images in the assets folder, but nothing to make images private.
Let's say I have three types of user, typeA, typeB and typeC. And let's say I have three types of images. So database schema would be as follows:
images
=> ["image_path","blob","type"]
users
=> ["name","type"]
What I want is that the users can request only the following:
typeA:
Can only view images with a type of A
Cannot view images with a type of B
Cannot view images with a type of C
typeB:
Can only view images with a type of B
Cannot view images with a type of A
Cannot view images with a type of C
typeC:
Can only view images with a type of C
Cannot view images with a type of A
Cannot view images with a type of B
And yes, I could have given you the example with two types of user and image, but I really want to make sure you understand the problem; the actual system I have in mind will have hundreds of types.
Like I say, pretty simple idea, but I've found it impossible with rails, because all images are stored in the public folder! So a typeB user can just type /assets/typeAImage.jpg and they've got it! Heck, even someone who isn't a user can do it.
The send_file method won't work for me, because I'm not sending them the image as a download per se, I'm showing them the image in the view.*
Now, using BLOBS would very neatly solve this problem if the images were stored in the database. I'm not sure of the actual syntax, but I could do something like this in a user view:
<% if current_user.type == image.type %>
<%= image_tag image.blob #=> <img src="/assets/typeAImage.jpg" alt="..." class="..."> %>
<% end %>
And yeah, you could do exactly the same thing with a path:
<% if current_user.type == image.type %>
<%= image_tag image.path #=> <img src="/assets/typeAImage.jpg" alt="..." class="..."> %>
<% end %>
but like I say, someone who isn't even a user could simply request /assets/typeAImage.jpg. Impossible to do this if it's stored in a BLOB.
In conclusion:
What's the problem with popplers BLOBS? I'm running a postgres database on Heroku with dozens of users per second. So yeah, not Facebook, but not Allegory on the Pointless of Life either, so performance matters. They've also got a strong mobile following so speed is of the essence. Will using BLOBS clash with this?
How do I display an image stored in a BLOB in a view?
And just to confirm, BLOBS will allow me to securely show secure secret images to certain members over https?
What about database backup speed? That'll take a hit, but I want to backup the images anyway and it's a nightly thing so who cares if it's slow? Right?
The images will be secure so long as the backup is encrypted, right? And just as passwords are stored as hashes within the database, should I store my super-secret BLOBS in an encrypted format as well? I'm thinking yes... Do you reckon bcrypt will be up to the task? I don't see why not.
Are BLOBS considered amateurish and lazy?
...and finally a bonus point (possibly outside the scope of the question):
*= As I wrote this I was thinking 'yes, but showing the image in the view is downloading the image to them.' So can the send_file method be used to create private images in the way I describe, and use the filesystem to store the images?
To answer the first question: yes, it's possible in Rails to lazy-load some attributes, but by default Active Record does not support it. There is a gem for this, though. (DataMapper does support this by default, and there is a plugin for Sequel as well.)
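As a quick illustration of the default behaviour and the usual gem-free workaround (the model and column names follow the question's example):

User.first                              # SELECT "users".* ... loads id, name AND the blob into memory
user = User.select(:id, :name).first    # the blob column is never read from the database
user.name                               # => "Kitty"
user.blob                               # raises ActiveModel::MissingAttributeError, it was never loaded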
For the second part: the biggest drawback of your approach is performance. Assets are best served via a fast, static web server that can load the files from the filesystem, and not through a dynamic application. Clogging up the database server to query large blobs is not recommended, especially since it's much harder to scale a database than a file system.
And for the last part: there are various options you can use to hide files from the user and only serve them when needed. One of them is X-SendFile (or X-Accel-Redirect), where you specify a filename in the response headers from Rails, and the web server that is proxying the requests (and which also supports this header) picks that up and serves the file itself. This is not a redirect, so the URL stays the same, and the file can still be hidden from normal access. Of course, for this you have to proxy your requests to Rails through a web server, which usually already happens at least at the load-balancing level, or if you are using Passenger.
Also note that you can tell Rails to send X-SendFile headers when serving the ordinary asset files as well.
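A minimal sketch of that setup, combined with the type-based access rule from the question; the controller, paths and MIME type are illustrative, and config.action_dispatch.x_sendfile_header must be set to whichever header your web server understands ("X-Sendfile" for Apache, "X-Accel-Redirect" for nginx):

class ImagesController < ApplicationController
  def show
    image = Image.find(params[:id])
    head :forbidden and return unless current_user && current_user.type == image.type

    # The files live outside public/, so they cannot be fetched directly.
    # With x_sendfile_header configured, send_file only writes the header
    # and the proxying web server streams the file itself.
    send_file Rails.root.join("private", "images", image.image_path),
              type: "image/jpeg", disposition: "inline"
  end
end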
Also see this answer.

How to store a picture within Active Directory using Ruby in a Rail3App?

All I want to do is to upload an image into the Active Directory. So far I can update any AD information but the image. I have tried to search for some idea but came up with nothing so far.
Do I have to encode an image in a certain way? Do I just ldap-replace the jpegPhoto attribute with a byte-string of the photo?
Any hint towards a solution would be great.
Thanks in advance!
First of all, there is an attribute in Active Directory called thumbnailPhoto. According to this Microsoft article, the thumbnailPhoto attribute contains octet-string data, which AD interprets as an array of bytes.
If you want sample code in C#, you can get something here.
From a theoretical point of view, you can also inject a photo with LDIF, using tools like "B64" to encode your image file in Base64.
Secondly, in my opinion a directory is not a database.
So, even if the attribute exists (created by Netscape under the OID 2.16.840.1.113730.3.1.35), and even if Microsoft explains how to put a picture into Active Directory, I think it's better to store a URL, or a path to a file on a file system, in the directory.
I have no idea of the impact on AD performance of loading each entry with 40 KB (the average size of a thumbnail photo). But I know that if there are badly written programs on the network, the kind that load every attribute whenever they search for an entry in the directory, this will put considerable load on the network.
I hope it helps.
JP
I had this issue and was able to get it working by creating a File stream and passing it through to @ldap.replace_attribute as binary data, i.e.
thumbnail_stream = open("path_to_file")
@ldap.replace_attribute USERS_DN, :thumbnailPhoto, File.binread(thumbnail_stream)
Where @ldap is an instance of net/ldap, bound to AD, i.e.
@ldap = Net::LDAP.new
@ldap.host = ''
@ldap.port = ''
@ldap.auth USERNAME, PASSWORD
@ldap.bind
