How to slow down file downloads on local ruby webserver?

I'd like to mock large (>100MB) and slow file downloads locally with a Ruby service - rails, sinatra, rack or something else.
After starting the server and requesting something like http://localhost:3000/large_file.rar, I'd like to slooowly download the file (for testing purposes).
My question is: how do I throttle the local webserver to a certain maximum speed? Because the file is stored locally, it will download very fast by default.

You should use curl for this, which allows you to specify a maximum transfer speed with the --limit-rate option. The following would download a file at about 10KB per second:
curl --limit-rate 10K http://localhost:3000/large_file.rar
From the documentation:
The given speed is measured in bytes/second, unless a suffix is appended. Appending ‘k’ or ‘K’ will count the number as kilobytes, ‘m’ or ‘M’ makes it megabytes, while ‘g’ or ‘G’ makes it gigabytes. Examples: 200K, 3m and 1G.
The given rate is the average speed counted during the entire transfer. It means that curl might use higher transfer speeds in short bursts, but over time it uses no more than the given rate.
More examples here (search for "speed limit"): http://www.cs.sunysb.edu/documentation/curl/index.html
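If you would rather throttle on the server side instead, so that any client sees the slow download, here is a minimal sketch of a Rack app that streams the file in small chunks with a pause between them. The file name, port, and ~10KB/s rate are assumptions for illustration, and it assumes Rack 2.x, where Rack::Handler::WEBrick is available:
# throttle_server.rb - minimal sketch: stream a local file at roughly 10KB/s
require 'rack'

class ThrottledFile
  CHUNK = 10 * 1024 # 10KB per chunk, one chunk per second

  def initialize(path)
    @path = path
  end

  # Rack calls #each on the response body; sleeping between chunks
  # caps the effective transfer rate
  def each
    File.open(@path, 'rb') do |f|
      while (chunk = f.read(CHUNK))
        yield chunk
        sleep 1
      end
    end
  end
end

app = lambda do |env|
  path = 'large_file.rar' # hypothetical local file
  [200,
   { 'Content-Type'   => 'application/octet-stream',
     'Content-Length' => File.size(path).to_s },
   ThrottledFile.new(path)]
end

Rack::Handler::WEBrick.run(app, Port: 3000)
With this running, a plain curl against http://localhost:3000/ (no --limit-rate needed) should crawl at about 10KB/s.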

Related

Total read time in Dataflow job has high variance

I have a Dataflow job that reads log files from my GCS bucket that are split by time and host machine. The structure of the bucket is as follows:
/YYYY/MM/DD/HH/mm/HOST/*.gz
Each job can end up consuming on the order of 10,000+ log files of around 10-100 KB in size.
Normally our job takes approximately 5 minutes to complete. At times we see our jobs spike to 2-3x that, and find that the majority of the increase is in the work items related to reading the data files. How can we reduce this variance in our job execution time? Is there a throughput issue with reading from GCS?
Most likely the variance you see in your jobs is due to GCS network latency. While the latency to retrieve items from GCS is typically rather small, it can spike depending on various factors like network conditions and time of day. There is no SLA around latency when it comes to reading from GCS. Throughput from GCS is probably not the factor, as the data files you are reading are rather small.
If network conditions are such that your latency increases significantly, this effect will grow proportionally by the number of files you are reading.
One way to alleviate the variance in job time is to combine your log files before reading them, so that you have fewer, larger files to read.
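As a hedged sketch of that combination step (the paths below are hypothetical, following the bucket layout above), a small Ruby script could recompress many small logs into a single gzip file:
# combine_logs.rb - sketch: merge many small .gz logs into one larger .gz
require 'zlib'

def combine_gzip(inputs, output)
  Zlib::GzipWriter.open(output) do |out|
    inputs.each do |path|
      Zlib::GzipReader.open(path) { |gz| IO.copy_stream(gz, out) }
    end
  end
end

# e.g. merge all host files for one minute into a single file
combine_gzip(Dir.glob('2015/01/02/03/04/*/*.gz').sort, 'combined-03-04.gz')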
I have a different insight on this. Reading a gzip file implies it is being unzipped first on the worker machine. Decompressing a gzip file is a single-core operation (at best one worker, e.g. an n1-standard-1, per file), because gzip as a compression format is not splittable.
In this scenario, some files may be significantly larger than others; they will most likely create stragglers (workers in the Dataflow job that lag behind), which increases the job execution time.
I can think of two solutions to keep the job execution time as short as possible:
Change to a splittable compression format like bzip2, so that all the files are massively parallelised and the read operation completes as fast as possible.
Make the gzip files as small as possible, so that a large number of workers consume a huge number of files and the total execution time drops. For example, instead of having 10 workers read 10 gzip files of 100KB each, have 100 workers read 100 gzip files of 10KB each. GCP bills per minute, so the cost should most probably remain the same.

Highload rails server with backgrounder architecture

Here is my task.
Every second, a backgrounder task should generate JSON based on some data. This operation is not CPU intensive (mostly network) and it generates 5-10KB of JSON content. Each operation takes about 200ms.
Also I have about 1000 clients asking for this content once every few seconds. Let's say it's about 200 requests/sec.
Server should just output current actual json.
Currently I already have a Rails 4 + nginx + Passenger + Debian server doing other jobs related to this work.
Being a student, I want to build my server in the most cost-effective way, with the ability to scale easily in these ways:
Adding a few more backgrounder jobs, generating more JSONs
Increasing the number of requests to 10,000 per second
Currently I have a Linode 2048 SSD with 2 CPU cores. My questions are:
What gem/solution should I use for my backgrounder tasks (they are currently written in Ruby)?
How do I effectively store the current JSON and pass it from the backgrounder(s) to rails/nginx?
How do I make serving the JSON as fast as possible?
You mentioned "Server should just output current actual json", so I guess the JSON generation may not become a bottleneck, as you can cache the JSON in Memcached and serve it from there directly:
1) Periodically: background process -> dump data to Memcached (even gzipped, to speed it up)
2) User -> Nginx -> Memcached
See the Nginx built-in memcached module: http://nginx.org/en/docs/http/ngx_http_memcached_module.html
The bottleneck is any backend with blocking mechanisms (GIL, IO locks, etc.); try to avoid these problems by splitting the request/response cycle with an intermediate Memcached data point, as sketched below.
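As a hedged sketch of the writer side (the dalli gem, the key name current.json, the 1-second period, and the generate_payload helper are all assumptions for illustration):
# backgrounder sketch: push freshly generated JSON into Memcached every
# second, so nginx's memcached module can serve it without touching Rails
require 'dalli'
require 'json'

cache = Dalli::Client.new('localhost:11211')

loop do
  payload = generate_payload # hypothetical: your ~200ms network-bound step
  cache.set('current.json', JSON.generate(payload))
  sleep 1
end
On the nginx side, a location block using memcached_pass with $memcached_key set to the same key then serves the cached value straight from Memcached, bypassing Ruby entirely.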

Dataflow job takes too long to start

I'm running a job which reads ~70GB of compressed data.
In order to speed up processing, I tried to start the job with a large number of instances (500), but after 20 minutes of waiting it doesn't seem to start processing the data (I have a counter for the number of records read). The reason for the large number of instances is that one of the steps needs to produce an output similar to an inner join, which results in a much bigger intermediate dataset for later steps.
What should be an average delay before the job is submitted and when it starts executing? Does it depend on the number of machines?
While I might have a bug that causes that behavior, I still wonder what that number/logic is.
Thanks,
G
The time necessary to start VMs on GCE grows with the number of VMs you start, and in general VM startup/shutdown performance can have high variance. 20 minutes would definitely be much higher than normal, but it is somewhere in the tail of the distribution we have been observing for similar sizes. This is a known pain point :(
To verify whether VM startup is actually at fault this time, you can look at Cloud Logs for your job ID, and see if there's any logging going on: if there is, then some VMs definitely started up. Additionally you can enable finer-grained logging by adding an argument to your main program:
--workerLogLevelOverrides=com.google.cloud.dataflow#DEBUG
This will cause workers to log detailed information, such as receiving and processing work items.
Meanwhile I suggest enabling autoscaling instead of specifying a large number of instances manually - it should gradually scale to the appropriate number of VMs at the appropriate moment in the job's lifetime.
Another possible (and probably more likely) explanation is that you are reading a compressed file that needs to be decompressed before it is processed. It is impossible to seek in the compressed file (since gzip doesn't support it directly), so even though you specify a large number of instances, only one instance is being used to read from the file.
The best way to approach a solution to this problem would be to split the single compressed file into many files that are compressed separately (see the sketch after this answer).
The best way to debug this problem would be to try it with a smaller compressed input and take a look at the logs.
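As a hedged illustration of that splitting step (the input file name and the shard size are placeholders), a small Ruby script could reshard one big gzip file into independently compressed pieces:
# split_gzip.rb - sketch: reshard one large gzip file into many smaller,
# independently compressed files so readers can process them in parallel
require 'zlib'

def split_gzip(input, lines_per_shard: 1_000_000)
  Zlib::GzipReader.open(input) do |gz|
    gz.each_line.each_slice(lines_per_shard).with_index do |lines, shard|
      Zlib::GzipWriter.open(format('shard-%05d.gz', shard)) do |out|
        lines.each { |line| out.write(line) }
      end
    end
  end
end

split_gzip('big_input.gz') # hypothetical input file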

Concurrent SOAP api requests taking collectively longer time

I'm using the savon gem to interact with a SOAP API. I'm trying to send three parallel requests to the API using the parallel gem. Normally each request takes around 13 seconds to complete, so three sequential requests take around 39 seconds. Using the parallel gem with 3 threads, all three requests complete in around 23 seconds, which is really nice, but I can't figure out why they don't complete in 14-15 seconds. I really need to lower the total time, as it directly affects the response time of my website. Any ideas on why this is happening? Are network requests blocking in nature?
I'm sending the requests as follows
Parallel.map(["GDSSpecialReturn", "Normal", "LCCSpecialReturn"], :in_threads => 3){ |promo_plan| self.search_request(promo_plan) }
I tried using multiple processes too, but it didn't help.
I have 2 theories:
Part of the workload can't run in parallel, so you don't see a 3x speedup, but a bit less than that. It's very rare to see multithreaded tasks speed up 100% proportionally to the number of CPUs used, because there are always a few bits that have to run one at a time. See Amdahl's Law, which provides equations to describe this, and states that:
The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program
Disk I/O is involved, and this runs slower in parallel because of disk seek time, limiting the IO per second. Remember that unless you're on an SSD, the disk has to make part of a physical rotation every time you look for something different on it. With 3 requests at once, the disk is skipping repeatedly over the disk to try to fulfill I/O requests in 3 different places. This is why random I/O on hard drives is much slower than sequential I/O. Even on an SSD, random I/O can be a bit slower, especially if small-block read-write is involved.
I think option 2 is the culprit if you're running your database on the same system. The problem is that when the SOAP calls hit the DB, it gets hit on both of these factors. Even blazing-fast 15000 RPM server hard drives can only manage ~200 IO operations per second. SSDs will do 10,000-100,000+ IO/s. See figures on Wikipedia for ballparks. Though, most databases do some clever memory caching to mitigate the problems.
A clever way to test whether it's factor 2 is to run an in-memory database (e.g. H2) and test the SOAP calls against it. They'll probably complete much faster, and you should see similar execution times for 1, 3, or $CPU-COUNT requests at once.
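As a hedged back-of-envelope check on the first theory, Amdahl's Law can be inverted against the question's own numbers (39s sequentially vs 23s with 3 threads) to estimate how much of each request actually runs in parallel:
# amdahl.rb - estimate the parallelisable fraction from observed timings
# speedup = 1 / ((1 - p) + p/n)  =>  p = (1 - tn/t1) / (1 - 1/n)
t1 = 39.0 # sequential wall time (s)
tn = 23.0 # wall time with n threads (s)
n  = 3
p  = (1 - tn / t1) / (1 - 1.0 / n)
puts format('parallel fraction p ~= %.2f', p) # => ~0.62
That suggests only about 60% of each request overlaps, which is consistent with a substantial sequential fraction somewhere in the pipeline.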
That's actually a big question; it depends on many factors.
1. Ruby language implementation
It could be different between MRI, Rubinius, and JRuby, though I am not sure whether the parallel gem supports Rubinius and JRuby.
2. Your machine
How many CPU cores does your machine have? You can leverage them with parallel processes. Have you tried using processes to do this if you have multiple cores?
Parallel.map(["GDSSpecialReturn", "Normal", "LCCSpecialReturn"]){ |promo_plan| self.search_request(promo_plan) } # by default it uses as many processes as you have CPUs
3. What happens inside self.search_request?
If you run this on MRI, the GIL means your code does not actually run concurrently. To put it more precisely, blocking IO calls release the GIL (in the MRI implementation), so only the network-call part runs concurrently, but nothing else does. That's why I am interested in what other work you do inside self.search_request, because that impacts the overall performance.
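A hedged way to observe this difference on MRI (the URL and the iteration count below are arbitrary stand-ins): the IO-bound batch should take roughly as long as a single call, while the CPU-bound batch takes roughly three times one call.
# gil_demo.rb - on MRI, blocking IO releases the GIL, so IO-bound threads
# overlap in time, while pure-Ruby CPU-bound threads run one at a time
require 'benchmark'
require 'net/http'

io_work  = -> { Net::HTTP.get(URI('http://example.com/')) }  # network-bound
cpu_work = -> { 5_000_000.times { Math.sqrt(rand) } }        # CPU-bound

{ 'io' => io_work, 'cpu' => cpu_work }.each do |name, work|
  secs = Benchmark.realtime do
    Array.new(3) { Thread.new(&work) }.each(&:join)
  end
  puts format('%s x 3 threads: %.2fs', name, secs)
end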
So I recommend testing your code in different environments and on different machines (it can differ between your local machine and the real production machine, so please do tune and benchmark) to get the best result.
Btw, if you want to know more about threads/processes in Ruby, I highly recommend Jesse Storimer's Working with Ruby Threads; he did a pretty good job explaining all these things.
Hope it helps, thanks.

Magento Catalog URL Rewrites - Long time to index

We are using Magento 1.4.1 for our store, with 30+ categories and 2000+ products. Every time I try to reindex, the "Catalog URL Rewrites" index takes a long time to complete. Please suggest how we can improve its speed.
Unfortunately catalog_url_rewrites is the slowest index in Magento when you have a large number of SKUs, and the time is multiplied if you have a large number of store views. If you still have the default French/German store views, be sure to delete them; this will speed things up by a factor of 3.
There are no means to speed up the re-index other than beefing up hardware (or optimising server configuration).
Running the re-index via the command line will relieve the burden of HTTP, but if the php.ini is the same, then it's going to take the same amount of time.
You can compare by running
php -i | grep php.ini
and comparing it to the output of a script accessed via HTTP that calls
phpinfo();
Otherwise, server tuning is everything: improving PHP and MySQL performance (which is a bit beyond the scope of this reply).
I don't know a way to make this process faster. What I would suggest is:
Set up a cronjob which runs like this:
php (mageroot)/shell/indexer.php reindexall
php (mageroot)/shell/indexer.php --reindex catalog_url
I am sure about the first one, but not sure about the second one.
Cron should run every night, for example.
