How to stress-test HTTP Live Streaming - ruby-on-rails

We built a YouTube-like Rails application that serves videos using HTTP Live Streaming; the videos are hosted on our company's S3-like cloud storage service (actually the Ceph Object Gateway S3 API).
It's the first public application on that storage service, and we would like to know beforehand how many concurrent viewers it can handle.
We know that the network connection (10 Gbps) will become the bottleneck at some point, but we have no idea how much load the storage service itself can handle.
How would you stress-test the HTTP Live Streaming?
Is something similar to this (UDP) suggestion an option in this (TCP) case?

You can use either a JMeter SaaS or cloud servers to overcome the network issue, and for JMeter you can use this commercial plugin, which realistically simulates player behaviour and gives useful metrics:
http://www.ubik-ingenierie.com/blog/easy-and-realistic-load-testing-of-http-live-stream-hls-with-apache-jmeter/
The metrics provided by the plugin are:
Buffer fill time (time it took to start playing)
Lag time (how many seconds playback was paused)
Lag ratio (waiting time over watching time)
Disclaimer: we are behind the development of this solution.

If you're testing HTTP streams, you might be able to test them using JMeter, though you'd probably need a hosted JMeter solution to create enough traffic.
I'm not sure if you'd be able to get any helpful response time info, but you would at least be able to easily create and ramp up the load.
Let me know if you need help with the JMeter side.
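If you want a feel for what such a test does before committing to a JMeter setup, a scripted "viewer" is easy to sketch. The following is a minimal sketch in Python (not the plugin mentioned above), assuming a reachable media playlist at a placeholder PLAYLIST_URL and the requests library; it downloads the playlist and the first few segments the way a player fills its buffer, and can be run with many threads to add load:

    # Minimal scripted HLS "viewer" sketch; PLAYLIST_URL is a placeholder.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from urllib.parse import urljoin

    import requests

    PLAYLIST_URL = "https://example.com/videos/demo/index.m3u8"  # placeholder

    def watch_once(viewer_id: int, segments_to_fetch: int = 5) -> float:
        """Download the playlist and the first few segments, like a player filling its buffer."""
        start = time.monotonic()
        playlist = requests.get(PLAYLIST_URL, timeout=10).text
        # Segment URIs are the non-comment lines of the media playlist.
        segment_uris = [line.strip() for line in playlist.splitlines()
                        if line.strip() and not line.startswith("#")]
        for uri in segment_uris[:segments_to_fetch]:
            requests.get(urljoin(PLAYLIST_URL, uri), timeout=30)
        buffer_fill = time.monotonic() - start
        print(f"viewer {viewer_id}: buffer filled in {buffer_fill:.2f}s")
        return buffer_fill

    if __name__ == "__main__":
        concurrent_viewers = 50  # ramp this up to find the breaking point
        with ThreadPoolExecutor(max_workers=concurrent_viewers) as pool:
            list(pool.map(watch_once, range(concurrent_viewers)))

Keep in mind that the machine running such a script has its own network bottleneck, which is exactly why the answers above suggest hosted JMeter or cloud servers: the load has to be generated from outside your own network limits.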

Related

Performance testing of Dockerized application hosted on Kubernetes

Our project involves containerising services and applications that will later be deployed on Kubernetes. My job is to do performance testing with JMeter once the services are hosted on Kubernetes.
I am relatively new to performance testing and have only basic hands-on experience with JMeter. I understand how an app is load/performance tested against plain URLs or APIs, but I want to know how to approach performance testing for Docker containers hosted on Kubernetes.
How could I handle the above scenario?
JMeter doesn't know anything about the technologies used on the backend; it just sends requests via Samplers, waits for responses, and measures the elapsed time of the request plus some other performance metrics (a minimal sketch illustrating this follows the list below). Later on you can generate an HTML Reporting Dashboard to visualise the results.
So your goal is to:
Identify the business use cases you need to implement for the performance testing
Identify network protocols which are being used under the hood of these business use cases
Create a JMeter Test Plan to precisely mimic the real user (or other application) accessing your system and doing what it is supposed to do
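To make the "black box" point above concrete, here is a minimal sketch, assuming a placeholder Ingress/Service URL and the Python requests library; it does what a JMeter HTTP Request sampler does, namely send requests to the exposed endpoint and record the elapsed time, with no awareness of Docker or Kubernetes underneath:

    # Load generators only see the exposed URL; ENDPOINT is a placeholder.
    import statistics
    import time
    from concurrent.futures import ThreadPoolExecutor

    import requests

    ENDPOINT = "http://my-app.example.com/api/health"  # placeholder Ingress/Service URL

    def one_request(_: int) -> float:
        start = time.monotonic()
        resp = requests.get(ENDPOINT, timeout=10)
        elapsed = time.monotonic() - start
        assert resp.ok, f"got HTTP {resp.status_code}"
        return elapsed

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=20) as pool:  # 20 "virtual users"
            timings = list(pool.map(one_request, range(200)))
        print(f"median {statistics.median(timings) * 1000:.0f} ms, "
              f"max {max(timings) * 1000:.0f} ms")

Whether the pods behind that URL scale, restart or move is invisible to the load generator; you would correlate the measured response times with your cluster monitoring data separately.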

How to use Kubernetes to do multiplayer online game with websocket?

If I develop an online real-time game using WebSockets, with players running on different containers, how do I sync data when containers are added or removed while people are playing?
Does Kubernetes have any features that help with this case?
ThatBrianDude already gave an awesome answer, and mine will not be as good. But I think your last comment gave us more hints about the architecture you have in mind. I hope my humble answer sheds light on more ideas for your game. Here are some suggestions:
First, avoid keeping any state in the websocket apps.
As ThatBrianDude put it, "The basic idea with containers is that they should be stateless."
So why not use caches and a messaging layer to help you with that? Imagine the following examples:
Situation 1: If the client sends an action to the websocket server, the server should put it on a queue/topic (some other service will process it later).
Situation 2: The server might also listen to one or more topics for certain types of messages and send them back to the clients that need that information.
Situation 3: When the client asks for information, or the websocket server needs information to send to the client, the server should read it from a cache, as reading from the DB might be too slow for a multiplayer game.
Situation 4: Eventually a container is killed. The clients connected to that server will get a connection error and should reconnect. That means another handshake, and the player might feel it depending on what the game was doing, so killing a container should not happen too often. But that is all; no information is lost.
This way, the websocket server containers are totally stateless, and the messaging topics and caches will help you to provide all the information the containers need, and to keep websockets, persistence and processing isolated and scalable.
Summing up, the information would flow like this (a minimal code sketch of this flow follows at the end of this answer):
clients are showering the websocket server containers with actions
websocket servers just send them to the messaging layer
processing containers (which can be scaled too!) receive those messages, process them, save the result to the database and/or a cache, and eventually send more messages to other topics
(optional) websocket servers receive those messages and send them to the clients.
Or like this:
clients ask for information or websocket servers periodically need to send the world state to clients
websocket servers look up the information in the cache
and send it to the clients.
Or even like this:
Some processing servers are independent of messages, they just read the game/world state (from the cache?) periodically
they process the physics and mechanics of the game
and save the result back in the cache, which the websocket servers periodically send to the clients, or publish it to a topic so the websocket servers can pick it up and send it to the clients.
Lastly, don't forget the suggestion to have one machine responsible for one game/world. It would be nice if each processing server (or each thread of a server) works with one game/world. That would make it easier to persist things without the need to sync stuff.
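To make the flow above concrete, here is a minimal, in-process sketch in Python asyncio, where an asyncio.Queue stands in for the messaging layer (Kafka, Redis Streams, ...) and a plain dict stands in for the cache (Redis, Memcached, ...); the point is only that the "websocket server" coroutine keeps no game state of its own:

    # In-process stand-ins: asyncio.Queue = message topic, dict = shared cache.
    import asyncio

    async def websocket_server(actions, world_cache, player_actions):
        """Stateless front tier: forward client actions, read state from the cache."""
        for action in player_actions:
            await actions.put(action)                # Situation 1: enqueue, don't process here
        await asyncio.sleep(0.1)                     # give the worker a moment
        print("state sent to client:", world_cache)  # Situation 3: read from the cache

    async def processing_worker(actions, world_cache):
        """Scalable back tier: consume actions, update the world, write to the cache."""
        while True:
            action = await actions.get()
            if action == "hit":
                world_cache["score"] += 1            # a real setup would also persist to a DB
            actions.task_done()

    async def main():
        actions = asyncio.Queue()       # stand-in for a message topic/queue
        world_cache = {"score": 0}      # stand-in for a shared cache
        worker = asyncio.create_task(processing_worker(actions, world_cache))
        await websocket_server(actions, world_cache, ["hit", "hit", "miss"])
        worker.cancel()

    asyncio.run(main())

In a real deployment the queue and the cache live outside the websocket containers, which is exactly what lets you add or remove those containers freely.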
The basic idea with containers is that they should be stateless.
This means that any persistent data your game might have (high scores etc.) must be saved to a persistent DB, whereas other temporary data like the current in-game score or nickname can stay in the container's memory and be gone once the container dies.
how do I sync data when containers are added or removed while people are playing?
This sounds like you want to use multiple containers to compute one game world?
That's a whole other beast on its own, but you might want to take a look at SpatialOS, which pretty much allows for massive multiplayer worlds and is designed for games that require more than one machine per world.
If that's not what you are looking for, I would recommend keeping one machine responsible for one game/world, as you will avoid a lot of complexity when you try to sync things later on.

I have a TURN server, but what do I lack?

I am building a WebRTC videoconferencing service for iOS and the web. So far we have used TokBox; they deliver the whole package (client APIs for iOS and the web, plus a TURN server), and their solution also takes care of generating tokens, session IDs, etc. But we want our own setup, and a partner has given us a TURN server. What do we still lack to be able to run a WebRTC video conference between an iOS and a web client? What service will let us just plug in the TURN server address/credentials and have it work from both a web and an iOS client? Are these suitable packages: EasyRTC, SkyLink, AppRTC? We don't need a lot of features, just 1-to-1 video calls with no bugs. Which one is best?
At a minimum you will need a signaling server.
AppRTC is a complete application; it is not suitable for what you want.
TokBox is a PaaS, so you could replace it with another PaaS (SkyLink, Forge, ...).
EasyRTC gives you the code of a signaling server, but not the infrastructure (load balancing, ...); you can use it if you're OK with hosting it yourself.
You might want to go for PeerJS (open source, with both hosted and DIY options) if you really want to do it yourself. Otherwise, just changing PaaS doesn't make a lot of sense; you have to think about everything you're going to lose as well (recording, archiving, media server, ...).
If your use case is well defined, you can ping me offline and I'll point you to additional resources.
You can look at the Kurento media server.
You can use the service provided by the anyconnect SDK. They provide peer-to-peer connectivity between any two nodes, whether on browser, desktop or mobile platforms. They also provide STUN, TURN and signaling (SIP, XMPP) server support. Using their SDK will let you just plug in your server credentials and transfer any type of data.

Cloud computing: Learn to scale server up/down automatically

I'm really impressed with the power of cloud computing when it comes to scaling your infrastructure up and down depending on your load.
How can I shift my paradigm and learn to write my applications that way? Writing them once and forgetting about them (regardless of future load) would be the ideal.
How can I practice my skills in that area?
Should I set up a virtualization environment where I can add more VMs to a private cloud (via the command line?) based on some smart algorithm that forecasts the load for some period of time?
Ideally I want to practice this without buying actual cloud computing services, just on my own hardware.
The only thing I want to practice here is scaling app/web roles and/or message queue systems when the current workers have too many jobs in the queue. So let's rule database scaling out of the question's scope, as that is too big a topic.
One option I will throw out is to use a native cloud execution framework. You might look at the CloudIQ Platform; one component is the CloudIQ Engine. It allows you to develop cloud-native apps in C/C++, Java and .NET. You get scale-up capability by simply adding workers to your cloud: the framework automatically distributes your applications to the new machine(s), and once they are installed it begins sending work to them as requests come in. So in effect the cloud handles the queueing issue for you.
Check out the Download and Community links for more information.
You should try AWS: Amazon offers a free tier that gives you storage, messaging and micro instances (Linux only), so you can start developing small try-outs without paying. Writing an application that scales isn't that hard: try to break your flow into small, concurrent tasks. Client-server applications are even easier: use a load balancer to start and terminate servers on demand.
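As a way to practice the "too many jobs in the queue" scenario from the question on your own hardware, here is a minimal sketch, assuming plain Python threads and an in-process queue as stand-ins for real worker VMs/containers and a real message broker; a small control loop watches the backlog and adds workers when it grows (scale-down is omitted for brevity):

    # Queue-driven scaling sketch: threads stand in for worker VMs/containers.
    import queue
    import threading
    import time

    jobs = queue.Queue()
    JOBS_PER_WORKER = 10          # scaling threshold, tune to your workload

    def worker():
        while True:
            job = jobs.get()
            time.sleep(0.05)      # pretend to do the work
            jobs.task_done()

    def autoscaler(workers):
        while True:
            backlog = jobs.qsize()
            if backlog > JOBS_PER_WORKER * len(workers):
                t = threading.Thread(target=worker, daemon=True)
                t.start()
                workers.append(t)  # in a real cloud this would launch a new VM/instance
                print(f"backlog={backlog}, scaled up to {len(workers)} workers")
            time.sleep(0.5)

    if __name__ == "__main__":
        workers = [threading.Thread(target=worker, daemon=True)]
        workers[0].start()
        threading.Thread(target=autoscaler, args=(workers,), daemon=True).start()
        for i in range(500):       # simulate a burst of incoming jobs
            jobs.put(i)
        jobs.join()                # wait until the (scaled) workers drain the queue

Swapping the threading.Thread for a call to your cloud's or hypervisor's API (launch a VM, add a container) is the only conceptual change needed to turn this into real autoscaling.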

Best practice for rate limiting users of a REST API?

I am putting together a REST API and as I'm unsure how it will scale or what the demand for it will be, I'd like to be able to rate limit uses of it as well as to be able to temporarily refuse requests when the box is over capacity or if there is some kind of slashdotted scenario.
I'd also like to be able to gracefully bring the service down temporarily (while giving clients results that indicate the main service is offline for a bit) when/if I need to scale the service by adding more capacity.
Are there any best practices for this kind of thing? The implementation is Rails with MySQL.
This is all done with an outer web server that listens to the world (I recommend nginx or lighttpd).
Regarding rate limits: nginx can limit requests, e.g. to 50 req/minute per IP; everything over that gets a 503 page, which you can customize.
Regarding expected temporary downtime: in the Rails world this is done via a special maintenance.html page. There is some kind of automation that creates or symlinks that file when the Rails app servers go down. I'd recommend relying not on file presence, but on the actual availability of the app server.
But really you are able to start and stop services without losing any connections at all. That is, you can run a separate instance of the app server on a different UNIX socket/IP port and have the balancer (nginx/lighty/haproxy) use that new instance too. Then you shut down the old instance and all clients are served by only the new one; no connection is lost. Of course this scenario is not always possible; it depends on the type of change you introduced in the new version.
haproxy is a balancer-only solution. It can balance requests to the app servers in your farm extremely efficiently.
For quite a big service you end up with something like:
api.domain resolving to round-robin N balancers
each balancer proxies requests to M web servers for static content and P app servers for dynamic content (though your REST API probably doesn't have static files, does it?)
For quite a small service (under 2K rps) all balancing is done inside one or two web servers.
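For illustration only (nginx's limit_req handles this for you), here is a minimal sketch of the per-IP limit described above in application code, assuming a fixed window of 60 seconds and 50 requests per IP, with anything beyond that getting a 503:

    # Fixed-window per-IP rate limiter sketch; nginx does this for you in production.
    import time
    from collections import defaultdict

    LIMIT = 50           # requests per window
    WINDOW = 60          # seconds
    _counters = defaultdict(lambda: [0.0, 0])   # ip -> [window_start, count]

    def allow(ip: str) -> bool:
        """Return True if this request is within the limit; False means respond with 503."""
        now = time.time()
        window_start, count = _counters[ip]
        if now - window_start >= WINDOW:         # new window: reset the counter
            _counters[ip] = [now, 1]
            return True
        if count < LIMIT:
            _counters[ip][1] = count + 1
            return True
        return False                             # over the limit: send 503 (plus Retry-After)

    # Example: the 51st request within a minute from the same IP is rejected.
    for i in range(51):
        ok = allow("203.0.113.7")
    print("last request allowed?", ok)           # False

A production limiter would keep these counters in something shared like Redis rather than in process memory; otherwise each app server enforces its own separate limit.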
Good answers already. If you don't want to implement the limiter yourself, there are also solutions like 3scale (http://www.3scale.net), which does rate limiting, analytics etc. for APIs. It works using a plugin (see the Ruby API plugin) that hooks into the 3scale architecture. You can also use it via Varnish and have Varnish act as a rate-limiting proxy.
I'd recommend implementing the rate limits outside of your application, since otherwise high traffic will still have the effect of killing your app. One good solution is to implement it as part of your Apache proxy, with something like mod_evasive.
