Single bigger POST to a Google Apps Script or several smaller ones?

I have a script which collects some data about certain files on my computer and then makes a POST request to a Google Apps Script published as a web service.
I was wondering which would be better: collect all the data (which wouldn't be more than a few MB, maybe 10) and make a single POST, or make one POST request for each piece (each just a few KB)?
Which is better for performance on both sides, my local computer and Google's servers?
Could it be seen as abuse if I make a hundred POSTs? It will run just once a month.

There are a lot of factors that go into this decision:
In general, I would argue it's better to do one upload, as 10 MB isn't a large amount of data.
Is this asynchronous (or automatic), or is there a user clicking a button? If it's happening automatically, you don't have to worry about reporting progress accurately to the user. If there is a user watching the upload, smaller uploads are better, as you'll be able to measure how many of the units (or chunks) have been uploaded successfully.
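To make the single-upload suggestion concrete, here is a minimal sketch of the client side. The question doesn't say what language the local script uses, so this is Python with the requests library, and the web app URL is a placeholder: collect everything first, then send the whole batch as one JSON POST to the published Apps Script endpoint.

    import json
    import os
    import requests  # third-party: pip install requests

    # Placeholder URL for the published Apps Script web app (its doPost handler).
    WEB_APP_URL = "https://script.google.com/macros/s/XXXXXXXX/exec"

    def collect_file_info(root):
        """Gather the per-file data locally before doing any network I/O."""
        records = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                records.append({"path": path, "size": os.path.getsize(path)})
        return records

    def upload_all(records):
        """One POST with the whole batch instead of one POST per file."""
        resp = requests.post(WEB_APP_URL, json=records, timeout=120)
        resp.raise_for_status()
        return resp.text

    if __name__ == "__main__":
        data = collect_file_info(os.path.expanduser("~/data"))
        print(upload_all(data))
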
Your computer should not be in the picture at all - Google Apps Script runs on Google's servers. Perhaps there is some confusion here?

Related

Rails 6 - Cost risks involved in making a large video public, hosted on AWS S3 and distributed via AWS CloudFront

I have already asked this question on the AWS Developer Forum but don't have an answer yet, hence posting the same question here to get some help.
I have a quite well organized and fast Rails 6 app where users can upload large videos (4 GB) and images and also make them public to others. It uses the AWS SDK for S3 uploads and CloudFront to distribute the content and make it available globally. All uploaded videos are transcoded into MP4 HD/Full HD videos using an Input S3 bucket - MediaConvert - Lambda - Output S3 bucket - CloudFront workflow.
Now my query is:
As users are allowed to upload videos of up to 4 GB and can also make them public, will this feature of making large videos public also increase the cost/billing? Since the video is public, more and more people will watch it, raising concerns about more incoming requests to CloudFront. Can someone correct me here?
If the above point is correct, what are the ways I can make videos public without affecting the billing/cost, for example using caching (the CloudFront cache) or any other way to minimize the increasing cost?
What are the ways I can allow users to share uploaded videos with others without increasing the AWS billing?
There is no legitimate way to avoid the data transfer cost increase for the use case you described in AWS. Even if CloudFront caches your data, you still need to pay CloudFront's outbound data transfer cost.
As users are allowed to upload videos of up to 4 GB and can also make them public, will this feature of making large videos public also increase the cost/billing? Since the video is public, more and more people will watch it, raising concerns about more incoming requests to CloudFront. Can someone correct me here?
You are correct. You will be paying for all these outgoing data transfers.
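To put a rough number on that, here is a back-of-envelope estimate. The per-GB rate below is an assumed, illustrative figure; check the current CloudFront pricing for your regions and volume tiers.

    # Back-of-envelope CloudFront egress cost estimate.
    # The $0.085/GB rate is an assumed, illustrative price tier;
    # actual pricing varies by region and volume.
    video_size_gb = 4.0
    views = 1_000
    price_per_gb = 0.085

    egress_gb = video_size_gb * views   # 4,000 GB transferred out
    cost = egress_gb * price_per_gb     # roughly $340 for this one video
    print(f"Estimated egress: {egress_gb:.0f} GB, cost: ${cost:.2f}")
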
If the above point is correct, what are the ways I can make videos public without affecting the billing/cost, for example using caching (the CloudFront cache) or any other way to minimize the increasing cost?
The only way is to earn money from such a service. So you either charge users to view the videos, charge the uploader for the upload and the subsequent data transfers, or you earn from advertisements on your portal. Uploads and downloads then stay free, but you somehow make your users go through a bunch of advertisement links to compensate for it. There are other ways to monetize a website, but it depends on how popular your website becomes, e.g. collecting user data and selling it.
What are the ways I can allow users to share uploaded videos with others without increasing the AWS billing?
See point two. You have to monetize your website, or simply change its architecture. Instead of storing all the files yourself, let users exchange torrent links. Then the files are not stored on your account, nor do you incur any cost associated with data transfers.
I truly appreciate the time and effort of all those who tried to help me.
I didn't get a definitive answer, though I was able to get quite close to what I needed via the link below - a blog post explaining the parameters and other relevant areas that need to be considered when working with large videos, the cost involved, and how much total bit rate is viewed.
https://aws.amazon.com/blogs/media/frequently-asked-questions-about-the-cost-of-live-streaming/
Hope it helps someone looking for more specific answers.

Firebase Storage: How to reduce requests? (iOS)

I'm developing a chat app with Firebase. I'm currently still in the development phase.
Profile pictures of test users are uploaded to Firebase Storage and are downloaded on the home screen (which shows all the pictures). I realized that with this I very quickly used up storage download requests (I easily hit 3,000 requests in one night and hit the free plan quota!).
What are some best practices I could use to minimize download requests? Just to be sure I'm doing it right - I'm sending a GET request to the Firebase Storage url directly: https://firebasestorage.googleapis.com/... to download the image. Is that the right way to do it?
Two suggestions that might help:
Cache your images! If you keep requesting the same images over and over again over the network, that's going to use up your quota pretty fast, not to mention your users' battery and network traffic. After you retrieve an image from the network, save it locally, and then the next time you need an image, look for it locally before you make another network request (there's a sketch of that check-local-first flow below). Or consider using a library like PINRemoteImage that does most of the work for you, both on the retrieving and the caching side.
Consider uploading smaller versions of your images if you think you might be using them often. If your chat app, for instance, saves profile pictures as 1024x768 images but then spends most of its time showing them as 66x50 thumbnails, you're probably downloading a lot of data you don't need. Consider saving both the original image and a thumbnail, and then grabbing the larger one only when you need it.
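The check-the-cache-first flow from the first suggestion is platform-agnostic; here it is sketched in Python rather than Swift purely to show the control flow. The cache directory and helper name are made up for the example; on iOS you would do the equivalent with URLSession or a library like PINRemoteImage.

    import hashlib
    import os
    import requests  # pip install requests

    CACHE_DIR = "image_cache"  # illustrative local cache directory

    def fetch_image(url):
        """Return image bytes, hitting the network only on a cache miss."""
        os.makedirs(CACHE_DIR, exist_ok=True)
        key = hashlib.sha256(url.encode()).hexdigest()
        path = os.path.join(CACHE_DIR, key)

        if os.path.exists(path):              # cache hit: no download request used
            with open(path, "rb") as f:
                return f.read()

        resp = requests.get(url, timeout=30)  # cache miss: one download request
        resp.raise_for_status()
        with open(path, "wb") as f:
            f.write(resp.content)
        return resp.content
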
Hope that helps...

How to manage millions of tiny HTML files

I have taken over a project which is a Ruby on Rails app that basically crawls the internet (in cronjobs). It crawls selected sites to build historical statistics about their subject over time.
Currently, all the raw HTML it finds is kept for future use. The HTML is saved into a Postgres database table and is currently at about 1.2 million records taking up 16 gigabytes and growing.
The application currently reads the HTML once to extract the (currently) relevant data and then flags the row as processed. However, in the future, it may be relevant to reprocess ALL files in one job to produce new information.
Since the tasks in my pipeline include setting up backups, I don't really feel like doing daily backups of this growing amount of data, knowing that 99.9% of the backup is just static records with a practically immutable column of raw HTML.
I want this HTML out of the PostgreSQL database and into some storage. The storage must be easily manageable (inserts, reads, backups) and relatively fast. I (currently) have no requirement to search/query the content, only to resolve it by a key.
The ideal solution should allow me to insert and get data very fast, easy to backup, perhaps incrementally, perhaps support batch reads and writes for performance.
Currently, I am testing a solution where I push all these HTML files (about 15-20 KB each) to Rackspace Cloud Files, but it takes forever and the access time is somewhat slow and relatively inconsistent (between 100 ms and several seconds). I estimate that accessing all files sequentially from Rackspace could take weeks. That's not an ideal foundation for devising new statistics.
I don't believe Rackspace's CDN is the optimal solution and would like to get some ideas on how to tackle this problem in a graceful way.
I know this question has no code and is more a question about architecture and database/storage solutions, so it may be treading on the edges of SO's scope. If there's a more fitting community for the question, I'll gladly move it.
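To make the "resolve it by a key" requirement concrete, here is one minimal sketch of what that could look like with plain gzipped files on disk - an illustrative assumption, not something suggested in the thread: each page is stored under a hash of its key, giving fast keyed reads and writes, and plain files that incremental backup tools handle well.

    import gzip
    import hashlib
    import os

    STORE_ROOT = "html_store"  # illustrative location for the blob store

    def _path_for(key):
        """Map a key (e.g. the page URL) to a fanned-out file path."""
        digest = hashlib.sha256(key.encode()).hexdigest()
        return os.path.join(STORE_ROOT, digest[:2], digest[2:4], digest + ".html.gz")

    def put(key, html):
        path = _path_for(key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with gzip.open(path, "wt", encoding="utf-8") as f:
            f.write(html)

    def get(key):
        with gzip.open(_path_for(key), "rt", encoding="utf-8") as f:
            return f.read()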

How can I determine the quality of a connection in iOS?

I'm familiar with using Reachability to determine the type of internet connection (if any) being used on an iOS device. Unfortunately, that's not a decent indicator of connection quality. Wi-Fi with low signal strength is pretty sketchy, and 3G with anything less than 3 bars is a disaster (not to mention networks that only allow EDGE connections).
How can I determine the quality of my connection so I can help my users decide if they should be downloading larger files on their current connection?
A pragmatic approach would be to download one moderately sized file hosted on a reliable, worldwide CDN at the start of your application. You know the file size beforehand; you just have to measure the time it takes, do a simple computation, and then you have your estimate of the connection quality.
For example, the jQuery UI source code, unminified but gzipped, weighs roughly 90 kB. Downloading it from http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.14/jquery-ui.js takes 327 ms here on my Mac, so one can assume I have at least a decent connection that can handle approximately 300 kB/s (and in fact it can handle much more).
The trick is to find a good balance between the test file size and the latency of the network, as the full download speed is never reached on a small file like this. On the other hand, downloading 1 MB right after launching your application will surely penalize most of your users, even if it allows you to measure the connection speed more precisely.
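The computation behind this estimate is simple; here it is sketched in Python (the real app would do the equivalent with URLSession, and the test URL is the same public one mentioned above).

    import time
    import urllib.request

    # Public, CDN-hosted test file with a known approximate size.
    TEST_URL = "http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.14/jquery-ui.js"

    def estimate_throughput_kbps():
        """Download the test file once and return an estimate in kB/s."""
        start = time.monotonic()
        with urllib.request.urlopen(TEST_URL, timeout=15) as resp:
            data = resp.read()
        elapsed = time.monotonic() - start
        return (len(data) / 1024.0) / elapsed

    print(f"~{estimate_throughput_kbps():.0f} kB/s")
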
Cyrille's answer is a good pragmatic answer, but is not really in the end a great solution in the mobile context for these reasons:
It involves doing a test "at the start of your application", by which I assume he means when your app launches. But your app may execute for a long while, may go into the background and then back to the foreground, and all the while the user is changing network contexts along with changes in underlying network performance - so that initial test result may bear no relationship to the "current" performance of the network connection.
For the reason he rightly points out: it is "penalizing" your user by making them download a test file over what may already be a constrained network connection.
You also suggest in your original post that you want your users to decide whether they should download based on information you present to them. But I would suggest that this is not a good way to approach interacting with mobile users - you should not be asking them to make complicated decisions. If absolutely necessary, only ask whether they want to download the file if you think it may present a problem, but keep it that simple - "Do you want to download XYZ file (100 MB)?" I personally would avoid even that.
Instead of downloading a test file, the better solution is to monitor and adapt. Measure the performance of the connection as you go along, keep track of the "freshness" of that information you have about how well the connection is performing, and only present your user with a decision to make if based on the on-going performance of the connection it seems necessary.
EDIT: For example, if you determine a patience threshold that in your opinion represents tolerable download performance, keep track of each download that the user does in order to determine if that threshold is being reached. That way, instead of clogging up the users connection with test downloads, you're using the real world activity as the determining factor for "quality of the connection", which is ultimately about the end-user experience of the quality of the connection. If you decide to provide the user with the ability to cancel downloads, then you have an excellent "input" about the user's actual patience threshold, and can adapt your functionality to that situation, by subsequently giving them the choice before they start the download. If you've flipped into this type of "confirmation" mode, but then find that files are starting to download faster, you could dynamically exit the confirmation mode.
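A rough sketch of this monitor-and-adapt idea, in Python purely to show the logic (the freshness window, patience threshold, and function names are illustrative, not from the answer): record the throughput of every real download, and only fall back to asking the user when the fresh samples drop below your threshold.

    import time
    from collections import deque

    FRESHNESS_SECONDS = 120     # ignore samples older than this (illustrative)
    PATIENCE_KBPS = 100.0       # "tolerable" throughput threshold (illustrative)

    _samples = deque(maxlen=20) # (timestamp, kB/s) from real downloads

    def record_download(bytes_received, seconds):
        """Call this after every real download the app performs."""
        _samples.append((time.monotonic(), (bytes_received / 1024.0) / seconds))

    def should_confirm_before_download():
        """True if fresh measurements suggest the connection is too slow."""
        now = time.monotonic()
        fresh = [kbps for ts, kbps in _samples if now - ts <= FRESHNESS_SECONDS]
        if not fresh:
            return False        # no fresh data: don't bother the user
        return sum(fresh) / len(fresh) < PATIENCE_KBPS
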
Rob's answer is very good, but for a more specific implementation, start with Apple's SimplePing example source code: https://developer.apple.com/library/archive/samplecode/SimplePing/Introduction/Intro.html#//apple_ref/doc/uid/DTS10000716
Target the domain of the server you want to monitor connection quality to. Use the ping library to "ping" it on a regular basis (say every 1 or 10 seconds, depending on your UI needs). Measure how long it takes to get a response to each ping (or whether a response ever comes back) to develop an estimate of the connection quality to communicate to your user.
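SimplePing itself is Apple sample code; the measure-on-a-timer idea is language-neutral, so here is a rough sketch in Python that uses TCP connect time as a stand-in for an ICMP ping (an assumption, since raw ICMP needs elevated privileges; the host, port, and interval are illustrative).

    import socket
    import time

    HOST = "example.com"   # the server whose connection quality you care about
    PORT = 443

    def probe_rtt_ms(timeout=2.0):
        """Rough round-trip estimate from a TCP connect; None means unreachable."""
        start = time.monotonic()
        try:
            with socket.create_connection((HOST, PORT), timeout=timeout):
                pass
        except OSError:
            return None
        return (time.monotonic() - start) * 1000.0

    while True:            # run on whatever interval your UI needs
        rtt = probe_rtt_ms()
        print("unreachable" if rtt is None else f"{rtt:.0f} ms")
        time.sleep(10)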

Twitter app development best practices?

Let's imagine an app which is not just another way to post tweets, but something like an aggregator that needs to store and have access to tweets posted through it.
Since Twitter added a limit on API calls, the app should/may use some cache, and it should then periodically check whether a tweet has been deleted, etc.
How do you manage the limits? How do you think high-traffic apps survive without being whitelisted?
To name a few strategies:
Aggressive caching. Don't call out to the API unless you have to.
I generally pull down as much data as I can upfront and store it somewhere. Then I operate off the local store until it runs out and needs to be refreshed.
Avoid doing things in real time. Queue up requests and make them on a timer (there's a sketch of this after the list).
If you're on Linux, cronjobs are the easiest way to do this.
Combine requests as much as possible.
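A minimal sketch of the queue-and-timer idea (the drain interval and per-batch limit are illustrative, not prescribed by the answer):

    import queue
    import threading

    request_queue = queue.Queue()
    DRAIN_INTERVAL_SECONDS = 60       # illustrative; tune to your rate limit

    def enqueue(api_call, *args):
        """Callers queue work instead of hitting the API immediately."""
        request_queue.put((api_call, args))

    def drain():
        """Runs periodically; performs a limited batch of queued API calls."""
        for _ in range(10):           # at most 10 calls per drain (illustrative)
            try:
                api_call, args = request_queue.get_nowait()
            except queue.Empty:
                break
            api_call(*args)
        threading.Timer(DRAIN_INTERVAL_SECONDS, drain).start()

    drain()   # start the timer loop
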
Well, you have 100 requests per hour, so the question is how you balance them between the various types of requests. I think the best option is the way TweetDeck does it, which lets you set the percentage of calls used for each feature and saves the rest for posting (because that is important too):
[screenshot of TweetDeck's API allocation settings - source: livefilestore.com]
As for caching, a database would be good, and I would ignore deleted tweets - once you have downloaded a tweet, it doesn't matter whether it was later deleted. If you wanted to check, you could in theory just try to open the tweet's page, and if you get a 404 then it's been deleted. That costs nothing against the API.
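A rough sketch of the cache-plus-404 idea (the SQLite schema is illustrative, and the public status URL format is an assumption about how you would build the page link):

    import sqlite3
    import requests  # pip install requests

    db = sqlite3.connect("tweets.db")
    db.execute("CREATE TABLE IF NOT EXISTS tweets (id TEXT PRIMARY KEY, body TEXT)")

    def cache_tweet(tweet_id, body):
        """Store a tweet once; later reads never touch the API."""
        db.execute("INSERT OR REPLACE INTO tweets VALUES (?, ?)", (tweet_id, body))
        db.commit()

    def is_deleted(screen_name, tweet_id):
        """HEAD the public status page; a 404 means the tweet is gone.
        This costs nothing against the API rate limit."""
        url = f"https://twitter.com/{screen_name}/status/{tweet_id}"
        return requests.head(url, allow_redirects=True, timeout=10).status_code == 404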
