Automatically reduce thumbnail image size on Quarto website

I've set up a website using the Quarto blog template, which includes a list of posts with a thumbnail for each one. By default, if there is no image: specified in the YAML front matter of a post's .qmd file, the thumbnail is taken from the first image included in the post. This works great, but many of the images I use are very high resolution (thousands of px across) – that's what I want in the posts themselves, but it means that even on a fast connection the website's homepage can take a while to load all the images.
Is there any automatic way to resize each of the thumbnail images so they're at most a few hundred px across?
If there isn't already a simple way to do this, I could write a script that finds the thumbnail image for each post, creates a resized version, saves it to the post's directory, and then specifies that file as the image: for the post – I think I could get that to run using the pre-render: project option. But I thought it worth checking that there isn't already an easier way to achieve the same thing!
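For reference, a minimal sketch of that kind of pre-render script in Ruby (the posts/<slug>/ layout, the thumbnail filename, the "first image file in the folder is the thumbnail" assumption, and the mini_magick gem are all guesses – adjust to your project):

require "mini_magick"

Dir.glob("posts/*/").each do |post_dir|
  source = Dir.glob(File.join(post_dir, "*.{png,jpg,jpeg}"))
              .reject { |f| File.basename(f).start_with?("thumbnail") }
              .first
  next if source.nil?

  thumb = File.join(post_dir, "thumbnail#{File.extname(source)}")
  next if File.exist?(thumb)          # already generated on a previous run

  image = MiniMagick::Image.open(source)
  image.resize "400x400>"             # shrink only if larger than 400px
  image.write(thumb)
end

Each post would then point at the generated file with image: thumbnail.png in its front matter, and the script would be listed under the pre-render: option in _quarto.yml.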

Related

How can we compress images in the DAM in AEM 6.3?

We are trying to increase the Google PageSpeed score for our website. One of the options for doing this is image optimization.
As we have a huge number of images in the DAM, how can we compress/optimize them? Does AEM have any tool to achieve this?
ImageMagick is one of the tools that can do this. Do we need to integrate it with AEM, or will we have to re-upload all the images after compressing them with the tool?
Any suggestions?
In contrast to CSS, JS and HTML files, which can be gzipped by the dispatcher, images can only be compressed by reducing their quality or resizing them.
This is quite a common requirement on AEM projects, and there are a couple of options, some of which come out of the box and do not even require programming:
You can extend the DAM Update Asset workflow with the CreateWebEnabledImageProcess workflow process step. It allows you to generate a new image rendition with parameters such as size, quality and MIME type. Depending on the workflow launcher configuration, this rendition can be generated when assets are created or modified. You can also trigger the workflow to run on selected assets, or on all of them.
If the CreateWebEnabledImageProcess configuration is not sufficient for your requirements, you can implement your own workflow process step and generate the rendition programmatically, using for example ImageHelper or a Java image-processing framework. That might also be needed if you want to generate the compressed images on the fly: for example, instead of generating a rendition for each uploaded image, you can implement a servlet bound to appropriate selectors and image extensions (e.g. imageName.mobile.png) which returns the compressed image.
Finally, integration with ImageMagick is possible; the Adobe documentation describes how it can be achieved using the CommandLineProcess workflow process step. However, you need to be aware of the security vulnerabilities mentioned in that documentation.
It is also worth mentioning that if your client needs more advanced image transformations in the future, integration with Dynamic Media can also be considered; however, it is the most costly option.
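Whichever mechanism you pick, the underlying transformation is the same resize-and-recompress operation. Purely as an illustration of what the rendition step does (shown here with Ruby's mini_magick wrapper around ImageMagick, not with AEM's own APIs; file names and settings are arbitrary):

require "mini_magick"

image = MiniMagick::Image.open("original.jpg")
image.combine_options do |opts|
  opts.resize "1280x1280>"   # shrink only if larger than 1280px on either side
  opts.quality "80"          # recompress; 70-85 usually looks visually lossless
  opts.strip                 # drop EXIF and other metadata
end
image.write("web-rendition.jpg")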
There are many ways to optimise images in AEM. Here I will go through three of them.
1) Using the DAM Update Asset workflow.
This is an out-of-the-box workflow in AEM: when an image is uploaded, renditions are created. You can use those renditions' paths in the img src attribute.
2) Using the ACS Commons Image Transformer.
Install the ACS Commons package and use the Image Transformer servlet configuration to generate optimised or transformed images according to your requirements. For more info, see ACS AEM Commons.
3) Using Google PageSpeed at the dispatcher level.
If you want to reduce image sizes, Google PageSpeed is an option to consider. Install PageSpeed at the dispatcher level and add image-optimisation rules to meet your requirement.
The relevant PageSpeed Insights rule detects images on the page that can be optimised to reduce their file size without significantly impacting their visual quality.
For more info, see Optimising Images.
AEM offers options for "image optimisation", but this is a broad topic, so there is no magic switch you can flip to "optimise" your images. It all boils down to the number of kilo- or megabytes transferred from AEM to the user's browser.
The size of an asset is influenced by two things:
Asset dimensions (width and height).
Compression.
The biggest gains can be achieved by simply reducing the asset's dimensions, and AEM already does that. If you have a look at your asset's renditions, you will notice that besides the so-called original rendition there are several other renditions with different dimensions.
MyImage.jpg
└── jcr:content
    └── renditions/
        ├── cq5dam.thumbnail.140.100.png
        ├── cq5dam.thumbnail.319.319.png
        ├── cq5dam.thumbnail.48.48.png
        └── original
The numbers in a rendition's name are its width and height, so there is a version of MyImage.jpg that is 140px wide and 100px high, and so on.
This is all done by the DAM Update Asset workflow when the image is uploaded, and the workflow can be modified to generate more renditions with different dimensions.
But generating images with different dimensions is only half the story. AEM has to select the rendition with the right dimensions at the right moment. This is commonly referred to as "responsive images". The AEM image component does not support responsive images out of the box, and there are several ways to implement the feature.
The gist of it is that your image component has to contain a list of URLs for the different-sized renditions. When the page is rendered, client-side JavaScript determines which rendition is best for the current screen size and puts its URL in the img tag's src attribute.
I would recommend that you have a look at the fairly new AEM Core Components, which are not included with AEM itself. Those core components contain an image component that supports responsive images. You can read more about them here:
AEM Core Components Image Component (GitHub)
AEM Core Components Documentation
Usually, components like that will not use the "static" renditions already generated by the DAM Update Asset workflow but will instead rely on an Adaptive Image Servlet. This servlet essentially takes the asset path and a target width and returns the asset at the requested width. To avoid doing this work over and over, you should let the Dispatcher cache the resulting image.
Those are just the basics. There are plenty of other things you can do, but each offers smaller and smaller gains in terms of "optimisation".
I had the same need; I also looked at ImageMagick and researched various options. Ultimately I customized the workflows we use to create our image renditions to integrate with another tool: I modified them to use the Kraken.io API, automatically sending the renditions AEM produced to Kraken, where they would be fully web-optimized (using the default Kraken settings). I used their Java integration library for the basic integration code. So I ended up with properly web-optimized images for all the generated renditions (the same could be done for the original), generated automatically during a workflow without authors having to manually re-upload images. This API usage required a Kraken license.
So I believe the answer is that, at this time, AEM does not provide a feature to achieve this, and your best bet is to integrate with another tool that does (custom code).
TinyPng.com was another image optimization service that looked like it would be good for this need and that also had an API.
And for the record, I also submitted this as a feature request to our AEM rep. It seems like a glaring product gap to me, and I am surprised it hasn't been built into the product yet to let customers make their images fully web-optimized.

Handling very large image files in web browsers

First post on SO; hopefully I am doing it right :-)
We have a situation where users need to upload and view very high-resolution files (they need to pan, tilt, zoom, and annotate the images). A single file sometimes exceeds 1 GB, so loading the complete file on the client side is not an option.
We are thinking about letting users upload files to the server (like everyone does), then doing some processing on the server side to create multiple, relatively small, lower-resolution images at varying sizes. We then give users thumbnails with a canvas-size option on the webpage for them to pick and start their work.
Let's assume a user opens a low-grade image at a 1280 x 1028 canvas size. The image will be broken into tiles before display, and when the user clicks on a tile it will be like zooming in to that specific tile. The client will send a request to the server asking for a higher-resolution image for that tile. The server will send that image, which will be broken into tiles again for the user to click and fetch another, still higher-resolution image from the server, and so on... Having multiple images at varying resolutions will help us break the images into tiles and serve the user's needs (zooming in or out via tiles).
Has anyone dealt with humongous image files? Is there a preferred technical design you can suggest? How to handle areas that are split across tiles is bothering me a lot, and I'm not sure how the above approach can be modified to address that issue.
We need to plan for 100 to 200 users connected to the website simultaneously, and ours is a .NET environment, if that matters.
Thanks!
The question is a little vague. I assume you are looking for hints, so here are a few:
I see uploading the images as a problem in the first place. Where I come from, upload speeds are way slower than download speeds. (But there is little you can do if you need your users to upload gigabytes...) Perhaps offer a more stable upload channel than the web; FTP if you must.
Converting into smaller pieces should be no big problem. Use one of the available tools, perhaps ImageMagick. I see there is a .NET wrapper out: https://magick.codeplex.com/
More important than the conversion itself: do not do it on the fly every time (you would need a really big machine), but only once, when the image is uploaded. If you want to scale, you can outsource this to another box on the network.
As for the viewer, this is the interesting part. There are some ready-to-use ones. Google has one. It's called 'Maps' :). But there is a free alternative: OpenLayers, from the OpenStreetMap project: http://wiki.openstreetmap.org/wiki/OpenLayers All you have to do is name your generated files the right way and do a little configuration.
Even if you must create the tiles on the fly for some reason, or can't use something like OpenLayers, I would try to stick to its naming scheme. Having something working to start with is never a bad idea.
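To make the convert-once-into-tiles idea concrete, here is a rough sketch of one-time tile-pyramid generation. It is written in Ruby with the mini_magick gem purely for illustration (the stack above is .NET, so treat it as pseudocode for your wrapper of choice); the file names, tile size, and tiles/z/x/y.png layout are assumptions modelled on the slippy-map scheme OpenLayers understands.

require "mini_magick"
require "fileutils"

TILE   = 256
SOURCE = "huge_scan.jpg"   # hypothetical input file

src      = MiniMagick::Image.open(SOURCE)
max_zoom = [Math.log2([src.width, src.height].max.to_f / TILE).ceil, 0].max

(0..max_zoom).each do |z|
  # Zoom 0 is fully zoomed out; each level doubles the resolution.
  factor     = 2.0**(z - max_zoom)
  level_path = "level_#{z}.png"

  level = MiniMagick::Image.open(SOURCE)
  level.resize "#{(src.width * factor).ceil}x#{(src.height * factor).ceil}"
  level.write(level_path)
  level = MiniMagick::Image.open(level_path)     # reopen for fresh dimensions

  (0...(level.width.to_f / TILE).ceil).each do |x|
    (0...(level.height.to_f / TILE).ceil).each do |y|
      FileUtils.mkdir_p("tiles/#{z}/#{x}")
      tile = MiniMagick::Image.open(level_path)  # crop mutates, so reopen per tile
      tile.crop "#{TILE}x#{TILE}+#{x * TILE}+#{y * TILE}"
      tile.write("tiles/#{z}/#{x}/#{y}.png")
    end
  end
end

Reopening the level image for every tile is wasteful; a production version would slice each level in a single ImageMagick -crop pass, but the naming scheme is the part worth keeping.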

Can I use Amazon Elastic Transcoder to only create thumbnails?

I have a Rails app using Paperclip to upload and store videos on Amazon S3. I'm not particularly interested in converting the video files into another format or adding watermarks; nothing fancy. I just want to create thumbnails from the videos to use as poster images in my video players.
I see that Amazon Elastic Transcoder allows for free thumbnail creation (or rather, they don't charge for thumbnail creation), and since I'm already using Amazon services, I wanted to see if I can use this for my thumbnails.
Does anyone know how to set the input/output options so that no file is generated aside from the thumbnails? Could I just do the following?
transcoder = AWS::ElasticTranscoder::Client.new
transcoder.create_job(
  pipeline_id: APP_CONFIG[Rails.env][:pipeline_id],
  input: {
    key: VIDEOPATH,
    frame_rate: 'auto',
    resolution: 'auto',
    aspect_ratio: 'auto',
    interlaced: 'auto',
    container: 'auto'
  },
  output: {
    key: ,        # LEAVE THIS BLANK TOO?
    preset_id: ,  # LEAVE THIS BLANK?
    thumbnail_pattern: "thumbnail",
    rotate: '0'
  }
)
No.
There are no functions for creating only thumbnails.
It is also not possible to create a transcoding job without actually transcoding anything. The input parameters require, at minimum, the name of an input video; the output parameters require, at minimum, the name of an output file and a preset ID. Parameters are validated before the job starts, and there is no option that would produce the thumbnail while preventing the transcode itself from running.
You can read about all of the available functions here:
http://docs.aws.amazon.com/elastictranscoder/latest/developerguide/api-reference.html
Give ffmpeg a look. It can be a little bit of a hassle to install, but it can create thumbnails from videos.
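For example, a minimal sketch of grabbing a poster frame with ffmpeg from Ruby (the paths, output width, and 5-second offset are arbitrary assumptions):

def video_thumbnail(video_path, thumb_path, at_seconds: 5)
  # Passing arguments as a list avoids shell-escaping problems.
  system(
    "ffmpeg",
    "-ss", at_seconds.to_s,   # seek before decoding (fast)
    "-i", video_path,
    "-frames:v", "1",         # grab a single frame
    "-vf", "scale=640:-1",    # 640px wide, keep aspect ratio
    "-y", thumb_path          # overwrite any existing output
  )
end

video_thumbnail("movie.mp4", "poster.jpg")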
Amazon Elastic Transcoder does provide functionality for thumbnails.
http://docs.aws.amazon.com/elastictranscoder/latest/developerguide/preset-settings.html#preset-settings-thumbnails
It looks like you do indeed have to transcode a video file in order to get thumbnails, though.
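So a working version of the call in the question would need a real output key and preset ID; roughly like this (the output key and preset ID are placeholders, and note that thumbnail_pattern must contain {count}):

transcoder.create_job(
  pipeline_id: APP_CONFIG[Rails.env][:pipeline_id],
  input: { key: VIDEOPATH },
  output: {
    key: "discardable-output.mp4",        # required, even if you never use it
    preset_id: "SYSTEM_PRESET_ID",        # placeholder: pick a cheap, low-res preset
    thumbnail_pattern: "thumbs/{count}",  # {count} is mandatory in the pattern
    rotate: '0'
  }
)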
As mentioned in other answers, you have to pay the transcoding price for Elastic Transcoder to generate a thumbnail.
Another similar option Amazon provides is MediaConvert. With MediaConvert, you can add an additional output consisting of a number of image files, captured using a formula you provide (pick an image every X frames). As with Elastic Transcoder, this is expensive for getting only a thumbnail, and you still have no guarantee that the thumbnails you get are good images (not blurry, and representative of the video).
As mentioned in another comment, using FFmpeg works better in comparison. It's a decent solution if you can maintain the infrastructure for it (some sort of processing queue, running ffmpeg, and then uploading the thumbnails).
Full disclosure: we faced a similar problem. Our volume was large enough that generating thumbnails by hand was getting cumbersome, and we'd often get blank thumbnails, because it's hard to predict which frame is good across different videos. So we built a product that fixes this pain for us (and others in the same boat): https://mediamachine.io/
Instead of getting random, meaningless thumbnails (and, what is worse, paying for them), we use an ML algorithm to pick the most representative thumbnail of the video, saving time AND money.

Need assistance choosing an image management gem

I am interested in building a Rails-based system for handling the display and organization of large numbers of photos – sort of like Flickr, but smaller. Each photo will have metadata associated with it. Photos will be shown in selectable list and grid views. It would be nice to be able to load images only as they are needed, as this would probably speed things up.
At the moment I have a test version of my database working, with images loading from the assets/images directory, but it is beginning to run slowly when displaying several hundred images (200-600). This is due to the way I have my view set up: I am using a straight loop to display the images in both the list and grid layouts.
I have also been manually resizing the thumbnails and a medium-sized image from the full-sized source image, and am investigating other resizing methods. Any advice is appreciated here as well.
As I am new to handling images this way, could someone point me in a direction based on experience designing and implementing something like Flickr?
I am investigating the following tools:
Paperclip
http://railscasts.com/episodes/134-paperclip
Requirements: ImageMagick
attachment_fu
http://clarkware.com/blog/2007/02/24/file-upload-fu#FileUploadFu
Requirement: one of the following: ImageScience, RMagick, MiniMagick, ImageMagick
CarrierWave
http://cloudinary.com/blog/ruby_on_rails_image_uploads_with_carrierwave_and_cloudinary
http://cloudinary.com/blog/advanced_image_transformations_in_the_cloud_with_carrierwave_cloudinary
I'd go with CarrierWave any day. It is very flexible and has lots of useful strategies. It generates its own Uploader class and has nifty, self-explanatory features such as automatic generation of thumbnails (as specified by you), blacklisting, image formatting, size constraints, etc., which you can put to use.
This Railscast by Ryan Bates - http://railscasts.com/episodes/253-carrierwave-file-uploads is very useful, if you haven't seen it already.
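A minimal uploader along those lines might look like this (the class names, version names, and sizes are illustrative, and the mount at the end assumes a Photo model with an image column):

class PhotoUploader < CarrierWave::Uploader::Base
  include CarrierWave::MiniMagick   # or CarrierWave::RMagick

  storage :file

  # Small thumbnail for the grid view.
  version :thumb do
    process resize_to_fill: [150, 150]
  end

  # Medium size for the list view.
  version :medium do
    process resize_to_limit: [600, 600]
  end
end

# In the model:
class Photo < ActiveRecord::Base
  mount_uploader :image, PhotoUploader
end

With that in place, photo.image.thumb.url and photo.image.medium.url give you the pre-generated sizes, so the view never serves the full-resolution original.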
Paperclip and CarrierWave are both totally appropriate tools for the job, and the one you choose is a matter of personal preference. They both have tons of users and active, ongoing development. The difference is whether you'd prefer to define your file-upload rules in a separate class (CarrierWave) or inline in your model (Paperclip).
I prefer CarrierWave, but based on usage it's clear plenty of people feel otherwise.
Note that neither gem is going to do anything for your slow view with 200-600 images. These gems just handle image uploads and don't help with anything beyond that.
Note also that Rails is really pretty bad at handling file uploads and downloads, and you should avoid this where possible by letting other services (a CDN, your web server, S3, etc.) handle them for you. The central gotcha is that if you handle a file transfer with Rails, an entire web application process is busy for the duration of the transfer. (For related discussion on this topic, see: Best Ruby on Rails Architecture for Image Heavy App.)
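For example, one common way to keep Rails processes out of the download path is to hand file serving off to the web server via a sendfile header (a sketch; which header to use depends on the server fronting your app):

# config/environments/production.rb
config.action_dispatch.x_sendfile_header = "X-Sendfile"           # Apache (mod_xsendfile)
# config.action_dispatch.x_sendfile_header = "X-Accel-Redirect"   # nginx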

HTML parsing: How to find the image in the document, which is surrounded by most text?

I am writing a news scraper, which has to determine the main image (thumbnail), given an HTML document of a news article.
In other words, it's basically the same challenge: How does Facebook determine which images to show as thumbnails when posting a link?
There are many useful techniques (preferring larger dimensions, aspect ratios closer to square, etc.), but sometimes, after parsing a web page, the program ends up with a list of similarly sized images (half of which are ads) and needs to pick just one: the image that illustrates the story described in the document.
Visually, when you open a random news article, the main picture is almost always near the top and surrounded by text. How do I implement an HTML parser (for example, using XPath / Nokogiri) that finds such an image?
There is no good way to determine this from code unless you have prior knowledge of the site's layout.
HTML and DHTML allow you to position elements anywhere on the page, using CSS or JavaScript, and can do so after the page has loaded, which is invisible to Nokogiri.
You might be able to do it using one of the Watir APIs after the page has fully loaded; however, again, you really need to know what layout a site uses. Ads can be anywhere in the HTML stream and can be moved around the page after loading, and the real content can be loaded dynamically, with its location and size changed on the fly. As a result, you can't count on the position of the content in the HTML being significant, nor can you count on the content even being in the HTML. JavaScript and CSS are NOT your friends here.
When I wrote spiders and crawlers for site analytics, I had to deal with the same problem. Because I knew which sites I was going to look at, I'd do a quick pre-scan, find my landmark tags, and then write CSS or XPath accessors for them. Save those with the URLs in a database, and you can quickly fly through the pages, accurately grabbing what you want.
Without some idea of the page layout, your code is completely at the mercy of the page-layout people and of anything that moves the page's elements around.
Basically, you need to implement the wetware inside your brain in code, along with the ability to render the page graphically so your code can analyze it. When you, as a user, view a page in your browser, you are using visual and contextual clues to locate the significant content. All that contextual information is what's missing and what you'll need to reproduce.
If I understand you correctly, your problem lies less with parsing the page than with implementing logic that reliably decides which image to select.
The first step, I think, is to decide which images are news images and which are not (ads, for example).
You can find that out by reading the image URL (the src attribute of the img tag) and checking its host against the article's host; the middle part ("nytimes" in your example) should be the same.
The second step is to decide which of these is the most important. For that you can use the image's size in the article, its position on the page, and so on. For this step you will have to experiment with what works best across most sites and tweak your algorithm until it produces good results for most news sites.
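A rough Nokogiri sketch of those two steps (the host comparison and the "amount of surrounding text" score are heuristics, and every name here is illustrative):

require "nokogiri"
require "uri"

def main_image(html, page_url)
  doc  = Nokogiri::HTML(html)
  base = URI.parse(page_url)

  scored = doc.css("img[src]").map do |img|
    begin
      src = URI.join(base, img["src"])
    rescue URI::Error
      next
    end

    # Step 1: drop images hosted elsewhere -- frequently ad networks.
    # (Compares the "middle part" of the host, e.g. "nytimes".)
    next unless src.host && src.host.split(".")[-2] == base.host.split(".")[-2]

    # Step 2: score each remaining image by how much text sits in its
    # nearest block-level container.
    container = img.ancestors("p, div, article, section").first
    [img, container ? container.text.strip.length : 0]
  end.compact

  best = scored.max_by { |_, score| score }
  best && best.first
end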
Hope this helps
