How can I set a maximum file size for pages in Scanbot - cordova-plugins

In my app, documents are scanned and handled by Scanbot.io and then uploaded to the backend server. Is there a way to configure a maximum size for these documents in Scanbot?
I checked the documentation but could not find a relevant setting.
Ideally, scanned Page objects that are too large for upload would be flagged in the detectionResult field and could be handled accordingly.
Are there any experiences on achieving something similar?

A bit late, but in my experience with Scanbot there are a few settings that help control the resulting size. They aren't based on physical file size, but on the desired quality/resolution:
In initializeSdk you can set a storageImageQuality setting. I've heard that this can confidently be set at 80, since quality above that likely won't be noticeable to the eye.
A quality option exists on calls to other functions too (e.g. detectDocument, applyImageFilter), similar to the above.
The startDocumentScanner method has the ability to pass a documentImageSizeLimit object, which allows you to restrict the max width/height values.
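Putting those settings together, here is a rough sketch of what the configuration might look like with the Scanbot Cordova plugin. Treat it as an illustration only: storageImageQuality, quality and documentImageSizeLimit are the options named above, but the namespaces, callback order, license key handling and the 2000x2000 limit are assumptions to verify against your SDK version.

// Illustrative only; check option names and call signatures against your cordova-plugin-scanbot-sdk docs.
var ScanbotSdk = cordova.plugins.ScanbotSdk; // assumed plugin handle

// Cap the JPEG quality of pages the SDK stores (80 is the value suggested above).
ScanbotSdk.initializeSdk(function (result) {
    console.log('Scanbot SDK initialized', result);
}, function (err) {
    console.error('Scanbot SDK initialization failed', err);
}, {
    licenseKey: 'YOUR_LICENSE_KEY', // placeholder
    storageImageQuality: 80
});

// Restrict the maximum page dimensions when launching the document scanner.
ScanbotSdk.UI.startDocumentScanner(function (result) {
    // result.pages should now stay within the configured size limit
}, function (err) {
    console.error('Document scanner failed', err);
}, {
    uiConfigs: {
        documentImageSizeLimit: { width: 2000, height: 2000 }
    }
});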

Related

Resampling large bitmaps for lists in Android using MVVM Cross

I have a long list of cells which each contain an image.
The images are large on disk, as they are used for other things in the app like wallpapers etc.
I am familiar with the normal Android process for resampling large bitmaps and disposing of them when no longer needed.
However, I feel like trying to resample the images on the fly in a list adapter would be inefficient without caching them once decoded; otherwise a fling would spawn many threads and I would have to manage cancelling unneeded images, etc.
The app is built making extensive use of the fantastic MVVMCross framework. I was thinking about using the MvxImageViews as these can load images from disk and cache them easily. The thing is, I need to resample them before they are cached.
My question is, does anybody know of an established pattern to do this in MVVMCross, or have any suggestions as to how I might go about achieving it? Do I need to customise the Download Cache plugin? Any suggestions would be great :)
OK, I think I have found my answer. I had been accidentally looking at the old MVVMCross 3.1 version of the DownloadCache Plugin / MvxLocalFileImageLoader.
After cloning the up to date (v3.5) repo I found that this functionality has been added. Local files are now cached and can be resampled on first load :)
The MvxImageView has a Max Height / Width setter method that propagates out to its MvxImageHelper, which in turn sends it to the MvxLocalFileImageLoader.
One thing to note is that the resampling only happens if you are loading from a file, not if you are using a resource id.
Source is here: https://github.com/MvvmCross/MvvmCross/blob/3.5/Plugins/Cirrious/DownloadCache/Cirrious.MvvmCross.Plugins.DownloadCache.Droid/MvxAndroidLocalFileImageLoader.cs
Once again MVVMCross saves my day ^_^
UPDATE:
Now I actually have it all working, here are some pointers:
As I noted in the comments, the local image caching is only currently available on the 3.5.2 alpha MVVMCross. This was incompatible with my project, so using 3.5.1 I created my own copies of the 3.5.2a MvxImageView, MvxImageHelper and MvxAndroidLocalFileImageLoader, along with their interfaces, and registered them in the Setup class.
I modified the MvxAndroidLocalFileImageLoader to also resample resources, not just files.
You have to bind to the MvxImageView's ImageUrl property using the "res:" prefix as documented here (Struggling to bind local images in an MvxImageView with MvvmCross); if you bind to 'DrawableId', this assigns the image directly to the underlying ImageView and no caching / resampling happens.
I needed to be able to set the customised MvxImageview's Max Height / Width for resampling after the layout was inflated/bound, but before the images were retrieved (I wanted to set them during 'OnMeasure', but the images had already been loaded by then). There is probably a better way but I hacked in a bool flag 'SizeSet'. The image url is temporarily stored if this is false (i.e. during the initial binding). Once this is set to true (after OnMeasure), the stored url is passed to the underlying ImageHelper to be loaded.
One section of the app uses full screen images as the background of fragments in a pager adapter. The bitmaps are not getting garbage collected quickly enough, leading to eventual OOMs when trying to load the next large image. Manually calling GC.Collect() when the fragments are destroyed frees up the memory, but causes a UI stutter and also wipes the cache as it uses weak refs.
I was getting frequent SIGSEGV crashes on Lollipop when moving between fragments in the pager adapter (they never happened on KitKat). I managed to work around the issue by adding a SetImageBitmap(null) to the ImageView's Dispose method. I then call Dispose() on the ImageView in its containing fragment's OnDestroyView().
Hope this helps someone, as it took me a while!

Google Cloud Dataflow: 413 Request Entity Too Large

Any suggestions on how to work around this error besides reducing the number of transformations in the flow (or, more likely, reducing the total serialized size of all transformation objects in the flow graph)?
Thanks,
Dataflow currently has a limitation in our system that caps requests at 1MB. The size of the job is specifically tied to the JSON representation of the pipeline; a larger pipeline means a larger request.
We are working on increasing this limit. In the meantime, you can work around this limitation by breaking your job into smaller jobs so that each job description takes less than 1MB.
To estimate the size of your request, run your pipeline with the option
--dataflowJobFile=<path to output file>
This will write a JSON representation of your job to a file. The size of that file is a good estimate of size of the request. The actual size of the request will be slightly larger due to additional information that is part of the request.
Thank you for your patience.
We will update this thread once the limit has been increased.
This kind of error usually comes up when the batch size of your ingestion bundle exceeds the limit (20 MB).
I'm not sure if you're using WriteToBigQuery. If you're not, feel free to ignore this answer. I usually get this solved by trying one of these two solutions:
Solution 1: Set batch_size of WriteToBigQuery to a number lower than 500. The default is 500.
Solution 2: Set method of WriteToBigQuery to "FILE_LOADS", and also set the other necessary parameters, such as triggering_frequency and custom_gcs_temp_location.
If the above two solutions cannot solve your problem or are not suitable for your case, you have to make each row more granular so that the size of each row becomes smaller. This will require modifying your parsing logic and the BigQuery table schema.
For details on the parameters, please see the reference links.
Reference:
https://beam.apache.org/releases/pydoc/2.39.0/apache_beam.io.gcp.bigquery.html#apache_beam.io.gcp.bigquery.WriteToBigQuery
https://cloud.google.com/dataflow/docs/guides/common-errors#json-request-too-large
Are you serializing a large amount of data as part of your pipeline specification? For example, are you using the Create Transform to create PCollections from inlined data?
Could you possibly share the JSON file? If you don't want to share it publicly, you could email it privately to the Dataflow team.
This was merged into Beam on Nov 16, 2018. It should not be too much longer before this is included in Dataflow.

How can I tell whether a page is volatile, or predict the next time a page's content will be modified?

I'm running a virtual machine, so I can get all of the system information. How can I use it to detect whether a page (or relevant pages) is volatile? The result can be just an approximate volatility time based on empirical observation. I want to use time series analysis to predict the next time a page's content will be modified; is that possible and accurate? Are there any better methods? Thanks very much!
I'm going to answer for pages inside a process, as the question gets very complex if it relates to the OS as a whole.
You can use VirtualQuery() and VirtualQueryEx() to determine the status of a given memory page. This includes whether it is read-only, a guard page, a DLL image section, writeable, etc. From these statuses you can infer the volatility of some pages.
All the read-only pages can be assumed to be non-volatile. But that isn't strictly accurate, since you can use VirtualProtect() to change the protection status of a page, and you can use VirtualProtectEx() to do the same in a different process. So you'd need to re-check these.
What about the other pages? Any writeable pages you're going to have to check periodically. For example, calculate a checksum and compare it to previous checksums to see if they've changed, and then record the time between changes.
You could use the NTDLL Function NtQueryInformationProcess() with ProcessWorkingSetWatch to get data on the page faults for the system.
Not sure if this is what you're looking for, but it's the simplest approach I can think of. It's potentially a bit CPU hungry, and reading each page regularly to calculate the checksums will trash your cache.

handling large file image upload

On my ASP.NET MVC 4 site I have a feature where a user can upload a photo via the standard file uploader. The photo gets saved into a file table within SQL Server.
I have run into an issue recently where users are uploading very large photos, which in turn means bandwidth being eaten up when the image is rendered.
What is the best way to handle this? Can I restrict the size of file being uploaded? Or is there a way of reducing the number of bytes being uploaded while maintaining quality?
Refer to this post for the maxRequestLength config setting and a way to provide a friendlier error.
This question and answer may also be helpful.
You can also check the size of the file in JavaScript before uploading, so that it doesn't even get sent to the server if it is too big (the code below checks for anything bigger than 15 MB):
if (Math.floor(file.size / 1024 / 1024) >= 15)
{
    alert('File size is greater than maximum allowed. Please make sure that the file is smaller than 15 MegaBytes.');
    return false;
}
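For context, here is one way the check above might be wired to the file input so an oversized photo is rejected before it leaves the browser; the element id and the 15 MB limit are placeholders to adapt to your own markup and server-side limits.

// 'photoInput' is a hypothetical id for the <input type="file"> element.
document.getElementById('photoInput').addEventListener('change', function (e) {
    var file = e.target.files[0];
    if (!file) {
        return;
    }
    // Same check as above: reject anything of 15 MB or more before it is posted.
    if (Math.floor(file.size / 1024 / 1024) >= 15) {
        alert('File size is greater than maximum allowed. Please make sure that the file is smaller than 15 MegaBytes.');
        e.target.value = ''; // clear the selection so the oversized file is never submitted
    }
});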
Alternatively, on the server side you can use WebImage.Resize() to resize the image once the file has been uploaded. It won't help with the bandwidth during upload, but it will make subsequent downloads a lot faster. Making an image smaller will cause some loss in quality, but generally it does a good job; just make sure that you choose the option to maintain the aspect ratio to prevent distortion.
As for reducing the bytes before uploading, there isn't any way I know of to do this in the browser. You could provide a separate client-side application that resizes the files for them before the upload, using the WebImage.Resize method in your app.

What's the best way of saving/displaying images? (not blob vs. txt)

I'm making a gallery on my site and don't know what the best solution is. I need advice.
In my opinion, there are two ways of handling the images.
The user uploads an image. I save it on the server only once, at its original size. Then, whenever the image needs to be displayed on screen, I resize it to the necessary size, for example as an avatar. So I store only ONE original-sized image and resize it to ANY required size RIGHT BEFORE displaying it.
The user uploads an image. I save it on the server at its original size and also make and save several resized copies (thumbnail-sized, avatar-sized, etc.). So when the image is displayed it is not resized every time; the properly sized copy is simply used.
I think that the second way is better, because there's no need to spend server resources on resizing images every time. But what if I decide to change the design of my site and some of the image dimensions change too? I'll end up with lots of images on the server that don't fit the new design.
On various forums people explain how to make galleries, and every time they say that thumbnail-sized copies are also made and saved. But it seems this doesn't make sense if the design changes over time. Please advise. Language – PHP.
One solution that others have come up with is a mix between the two. So, the user uploads the photo and you save it in its original form on your server. Then, when an avatar is needed, you check to see if you have the avatar saved on disk (maybe user12345_50x50.jpg - where 50x50 is widthxheight). If it does exist, show that image. If not, then use the server to resize/crop whatever, then save that image to disk and serve that to the user. This will allow you to request any size file and serve it as-needed -- taking advantage of caching those that have already been requested [Note that this is a server-side cache, so would apply for all users].
You sort of get the best of both worlds. You don't need to handle all of the image manipulation up front, just as needed. The first time the image is processed, that user will have to wait, but any other request will get the processed file.
One implementation that uses this solution in PHP is phpthumb: http://phpthumb.sourceforge.net/
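Not PHP, but purely to illustrate the pattern, here is a minimal Node.js sketch of the same check-the-cache-then-generate approach, assuming Express and the sharp image library; the route shape, paths and file-naming scheme are all illustrative.

// Illustrative sketch: serve cached thumbnails, generating each size on first request.
const express = require('express');
const fs = require('fs');
const path = require('path');
const sharp = require('sharp');

const app = express();

// e.g. GET /image/42/50/50 -> 50x50 version of uploads/42.jpg
app.get('/image/:id/:width/:height', async (req, res) => {
    const { id, width, height } = req.params;
    const original = path.join('uploads', id + '.jpg');
    const cached = path.join('cache', id + '_' + width + 'x' + height + '.jpg');

    // Generate and store the resized copy only the first time it is requested.
    if (!fs.existsSync(cached)) {
        await sharp(original)
            .resize(parseInt(width, 10), parseInt(height, 10))
            .toFile(cached);
    }

    res.sendFile(path.resolve(cached));
});

app.listen(3000);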
