I have a site where a user can upload a photo. I have no idea how to handle photos. To make a thumbnail out of a large res photo, do I just resize the width and the height? Or is there a better way to do this?
If you could point me to any resources or give me any tips, that would be great.
I'm using Ruby on Rails, if that matters. I don't really want gems for this because I want to learn how to do it myself.
For some of this, "learning how to do it yourself" is going to be a significant undertaking. Resizing the image, for example. Go ahead and find an open source library (such as a gem) that resizes images and look through its source code. It's not impossible to do on your own, but a lot of that sort of thing is built on the expertise that's come before, etc. There's nothing wrong with making use of a tool somebody else has created, provided that you understand what the tool is doing.
A few points to hopefully help you out:
Go with a white-list approach of which image formats you support. Don't just let users upload anything that they call an image.
Each format you support is going to have its own standards (possibly multiple) for meta-data. Stripping out that data wholesale may or may not be a good idea. For example, a jpeg may contain its orientation in its EXIF data and if you strip that out you may be effectively rotating the image. Certain fields, such as geotagging, you may want to strip out in the effort to protect your users' privacy, etc. Again, look into existing libraries for this and see how they do it.
DO NOT implicitly trust the file name extension for determining the type of the image. It's possible for a user to construct a malicious file that isn't really an image, pass it off as an image to an unsuspecting host, and effectively open a security flaw on that host as it tries to process the file as an image. There was a question about determining file type in Ruby here, and I'm sure there's a lot more to be found on the subject.
David answered the question well, but I thought I might be able to provide some more specific information regarding your question.
Use the Paperclip gem, combined with RMagick, an ImageMagick wrapper for Ruby. You can set post-processing options and create multiple resizings.
If you really want to do it yourself, checkout the actual gem at https://github.com/thoughtbot/paperclip and you'll see how that author does it. Part of the thought behind Ruby on Rails is DRTW (Don't Reinvent the Wheel). Utilize what's out there and build on it. It will save you time, and enable you to do more in the longrun.
An alternative is to use a third party service such as http://resizer.co (I am not affiliated)
Replace a URL to an image from your site, say:
http://example.com/images/abc123.jpg
with the following (for a 200 x 200 image)
http://www.resizer.co?image=http://example.com/images/abc123.jpg&w=200&h=200
You may run into issues with distortion and the 'aspect ratio' not being maintained, but it could be a good start.
Related
please bear with me as I'm not trying to frustrate anyone with inane questions, and I did google search this but I couldn't really find anything recent or helpful.
I am a novice programmer and I am using a classic asp web application. I just enabled the users to upload and download images, but I'm quickly regretting it as it's eating up all of the router bandwidth. I am finding my solution inadequate, so I wanted to start over.
My desire is threefold with this functionality:
Compression. I understand that this is impossible to do BEFORE uploading without some kind of Java/Silverlight/Flash portion of the application to handle uploads, correct? What is the common way most places go about this? Just allow regular file uploads and compress once they are on the server?
Resizing. I want to resize all images before they are uploaded to a reasonable size, instead of just telling users that try and upload huge camera images that they can't upload. I figure I just want to let them upload and have it resize for them before uploading. Does this functionality exist already?
Changing filetype. I want to allow users to upload all image file types but make them .jpg on the server after the upload.
With these three requirements, how hard is it to implement something like this in just pure code and libraries? Would it be better to just use a 3rd party plugin, such as ASPjpeg or ASPupload? Have you encountered something similar, and what was your solution?
Thanks.
Take a look at ASPJpeg and ASPUpload from Persits. We use these components to upload a full size image (can be png even though the library is "ASPJpeg"), resize it to several different sizes we need on our site, then store the resized images on the server in a variety of folders. The ASPUpload component is a little tricky but if you follow their sample code you'll be fine.
I never found a good component for decompressing uploaded zip files and had to write my own, which I've since abandoned. In the end with upload speeds increasing and storage getting so cheap, it started to matter less and less that the files were compressed before being uploaded.
EDIT: Just noticed you mentioned these components in your question. Consider this an endorsement of your idea to use them. :-)
I am trying to make an application which make a editable document file(doc or pdf) from an image. I am planning to use tesseract for extraction of the text. But i am not yet sure how to get the basic formatting of the text(size,bold,italic,underline) & images that might be present in the document image. I am planning to use J2EE, to make a Web Based App(Have to use J2EE). I think i might be able to recognize the components and formatting of the document using OpenCV, but i am not really sure.
Given that you are planning to use Tesseract for the basic OCR capabilities, try looking into the hORC formatted output. That includes quite a lot of additional information about font-size, font-face, position, etc.
You can find a description of hOCR here:
https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview#heading=h.e903b9bca924
If that doesn't work out, it depends on how much effort you want to put into Tesseract. It's internal APIs (available in Java via Tess4J, among others) do provide much of the information that you would need to reconstruct the page layout.
First post on SO; hopefully I am doing it right :-)
Have a situation where users need to upload and view very high resolution files (they need to pan, tilt, zoom, and annotate images). A single file sometimes crosses 1 GB so loading complete file on client side is not an option.
We are thinking about letting the users upload files to the server (like everyone does), then apply some encryption on server side creating multiple, relatively small low resolution images with varying sizes. We then give users thumbnails with canvas size option on the webpage for them to pick and start their work.
Lets assume a user opens low grade image with 1280 x 1028 canvas size. Image will be broken into tiles before display, and when user clicks on a title it will be like zooming in to a specific tile. Client will send request to the server asking for higher resolution image for the title. Server will send the image which will be broken into titles again for the user to click and get another higher resolution image from server and so on ... Having multiple images with varying resolution will help us break images into tiles and serve user needs ('keep zooming in' or out using tiles).
Has anyone dealt with humongous image files? Is there a preferred technical design you can suggest? How to handle areas that have been split across tiles is bothering me a lot so not sure how above approach can be modified to address this issue.
We need to plan for 100 to 200 users connected to the website simultaneously, and ours is .NET environment if it matters
Thanks!
The question is a little vague. I assume you are looking for hints, so here are a few:
I see uploading the images is a problem in the firstplace. Where I come from, upload-speeds are way slower than download speeds. (But there is litte you can do if you need your user to upload gigabytes...) Perhaps offer some more stable upload than web. FTP if you must.
Converting in smaller pieces should be no big problem. Use one of the availabe tools. Perhaps imagemagick. I see there is a .net wrapper out: https://magick.codeplex.com/
More than converting alone I think it is important not to do it everytime on the fly (you would need a realy big machine) but only once the image is uploaded. If you want to scale you can outsource this to another box in the network.
For the viewer. This is the interessting part. There are some ready to use ones. Google has one. It's called 'Maps' :). But there is a free alternative: OpenLayers from the OpenStreetmap Project: http://wiki.openstreetmap.org/wiki/OpenLayers All you have to do is naming your generated files in the right way and a litte configuration.
Even if you must for some reasons create the tiles on the fly or can't use something like OpenLayers I would try to stick to its naming scheme. Having something working to start with is never a bad idea.
I am interested in building a Rails based system for handling the display and organization of large amounts of photos. This is sort of like Flickr but smaller. Each photo will have metadata associated with it. Photos will be shown in a selectable list and grid view. It would be nice to be able to load images as they are needed as well (as this would probably speed things up).
At the moment I have a test version of my database working by images loading from the assets/images directory but it is beginning to run slow when displaying several images (200-600 images). This is due to the way I have my view setup. I am using a straight loop to display the images in both list and grid layouts.
I also manually resized the thumbnails and a medium sized image from a full sized source image. I am investigating other resizing methods. Any advice is appreciated here as well.
As I am new to handling the images this way, could someone point me in a direction based on experience designing and implementing something like Flickr?
I am investigating the following tools:
Paperclip
http://railscasts.com/episodes/134-paperclip
Requirements: ImageMajick
attachment_fu
http://clarkware.com/blog/2007/02/24/file-upload-fu#FileUploadFu
Requirement: One of the following: ImageScience, RMagick, miniMagick, ImageMajick?
CarrierWave
http://cloudinary.com/blog/ruby_on_rails_image_uploads_with_carrierwave_and_cloudinary
http://cloudinary.com/blog/advanced_image_transformations_in_the_cloud_with_carrierwave_cloudinary
I'd go with Carrierwave anyday. It is very flexible and has lot of useful strategies. It generates it's on Uploader class and has all nifty and self explanatory features such as automatic generation of thumbnails (as specified by you), blacklisting, formatting image, size constraints etc; which you can put to your use.
This Railscast by Ryan Bates - http://railscasts.com/episodes/253-carrierwave-file-uploads is very useful, if you haven't seen it already.
Paperclip and CarrierWave are totally appropriate tools for the job, and the one you choose is going to be a matter of personal preference. They both have tons of users and active, ongoing development. The difference is whether you'd prefer to define your file upload rules in a separate class (CarrierWave), or if you'd rather define them inline in your model (Paperclip).
I prefer CarrierWave, but based on usage it's clear plenty of people feel otherwise.
Note that neither gem is going to do anything for your slow view with 200-600 images. These gems are just for handling image uploads, and don't help you with anything beyond that.
Note also that Rails is really pretty bad at handling file uploads and file downloads, and you should avoid this where possible by letting other services (a cdn, your web server, s3, etc) handle these for you. The central gotcha is that if you handle a file transfer with rails, your entire web application process is busy for the duration of the transfer. (For related discussion on this topic, see: Best Ruby on Rails Architecture for Image Heavy App).
I'm using Google's Custom Search API to dynamically provide web search results. I very intensely searched the API's docs and could not find anything that states it grants you access to Google's site image previews, which happen to be stored as base64 encodes.
I want to be able to provide image previews for sites for each of the urls that the Google web search API returns. Keep in mind that I do not want these images to be thumbnails, but rather large images. My question is what is the best way to go about doing this, in terms of both efficiency and cost, in both the short and long term.
One option would be to crawl the web and generate and store the images myself. However this is way beyond my technical ability, and plus storing all of these images would be too expensive.
The other option would be to dynamically fetch the images right after Google's API returns the search results. However where/how I fetch the images is another question.
Would there be a low cost way of me generating the images myself? Or would the best solution be to use some sort of site thumbnailing service that does this for me? Would this be fast enough? Would it be too expensive? Would the service provide the image in the correct size for me? If not, how could I change the size of the image?
I'd really appreciate answers that are comprehensive and for any code examples to be in ruby using rails.
So as you pointed out in your question, there are two approaches that I can see to your issue:
Use an external service to render and host the images.
Render and host the images yourself.
I'm no expert in field, but my Googling has so far only returned services that allow you to generate thumbnails and not full-size screenshots (like the few mentioned here). If there are hosted services out there that will do this for you, I wasn't able to find them easily.
So, that leaves #2. For this, my first instinct was to look for a ruby library that could generate an image from a webpage, which quickly led me to IMGKit (there may be others, but this one looked clean and simple). With this library, you can easily pass in a URL and it will use the webkit engine to generate a screenshot of the page for you. From there, I would save it to wherever your assets are stored (like Amazon S3) using a file attachment gem like Paperclip or CarrierWave (railscast). Store your attachment with a field recording the original URL you passed to IMGKit from WSAPI (Web Search API) so that you can compare against it on subsequent searches and use the cached version instead of re-rendering the preview. You can also use the created_at field for your attachment model to throw in some "if older than x days, refresh the image" type logic. Lastly, I'd put this all in a background job using something like resque (railscast) so that the user isn't blocked when waiting for screenshots to render. Pass the array of returned URLs from WSAPI to background workers in resque that will generate the images via IMGKit--saving them to S3 via paperclip/carrierwave, basically. All of these projects are well-documented, and the Railscasts will walk you through the basics of the resque and carrierwave gems.
I haven't crunched the numbers, but you can against hosting the images yourself on S3 versus any other external provider of web thumbnail generation. Of course, doing it yourself gives you full control over how the image looks (quality, format, etc.), whereas most of the services I've come across only offer a small thumbnail, so there's something to be said for that. If you don't cache the images from previous searches, then your costs reduces even further, since you'll always be rendering the images on the fly. However I suspect that this won't scale very well, as you may end up paying a lot more for server power (for IMGKit and image processing) and bandwidth (for external requests to fetch the source HTML for IMGKit). I'd be sure to include some metrics in your project to attach some exact numbers to the kind of requests you're dealing with to help determine what the subsequent costs would be.
Anywho, that would be my high-level approach. I hope it helps some.
Screen shotting web pages reliably is extremely hard to pull off. The main problem is that all the current solutions (khtml2png, CutyCapt, Phantom.js etc) are all based around QT which provides access to an embedded Webkit library. However that webkit build is quite old and with HTML5 and CSS3, most of the effects either don't show, or render incorrectly.
One of my colleagues has used most, if not all, of the current technologies for generating screenshots of web pages for one of his personal projects. He has written an informative post about it here about how he now uses a SaaS solution instead of trying to maintain a solution himself.
The TLDR version; he now uses URL2PNG to do all his thumbnail and full size screenshots. It isn't free, but he says that it does the job for him. If you don't want to use them, they have a list of their competitors here.