Load testing using Puppeteer

I am at the design phase for a load test setup using headless Chrome controlled by Puppeteer. What's the best approach to follow? I have thought of the two approaches below.
Say I need to simulate 1000 user logins.
Using the await puppeteer.launch() API, create 1000 headless Chrome instances, then access the login page, type the user / password, and hit the log in button. This looks straightforward, but it may take a lot of system resources, and simulating 1000 user logins may not be possible (will it be?).
Launch only one Chrome instance and create 1000 CDP sessions. However, I am not sure if this approach will work, because of the shared cache (userDataDir) path. Is it possible to set a different cache for each CDP session?
Or is there any better approach for load testing using Puppeteer?

Try puppeteer-loadtest. Another project, puppeteer-cluster, manages browser instances and optimizes performance; see the related post.
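A rough sketch of the puppeteer-cluster route for the 1000-login scenario (the login URL, selectors, and credentials below are placeholders):

```js
const { Cluster } = require('puppeteer-cluster');

(async () => {
  const cluster = await Cluster.launch({
    // CONCURRENCY_CONTEXT gives each queued job its own incognito browser
    // context, so cookies/cache are isolated without 1000 Chrome processes.
    concurrency: Cluster.CONCURRENCY_CONTEXT,
    maxConcurrency: 20, // tune to what the machine can actually sustain
  });

  await cluster.task(async ({ page, data: user }) => {
    await page.goto('https://example.com/login'); // placeholder URL
    await page.type('#username', user.name);      // placeholder selectors
    await page.type('#password', user.password);
    await Promise.all([
      page.waitForNavigation(),
      page.click('#login'),
    ]);
  });

  for (let i = 0; i < 1000; i++) {
    cluster.queue({ name: `user${i}`, password: 'secret' });
  }

  await cluster.idle();
  await cluster.close();
})();
```

Note that a single machine still has to render every page, so the realistic concurrency is bounded by CPU and memory rather than by the 1000 figure.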

Related

Aren't PWAs user unfriendly if the service worker is not immediately active?

I posted another question as a brute-force solution to this one (Angular: fully install service worker before anything else) but I thought I'd make a separate one to discuss the use case for when a service worker is used as intended.
According to the service worker life cycle (https://developers.google.com/web/fundamentals/primers/service-workers/lifecycle), the SW is installed, but it's only active once you then reload the page (you can claim() the page, but that only applies to calls that happen after the service worker is installed). The reasoning is that if an existing version is updated, the old one and the new one do not mix states and caches. I can agree with that decision.
What I have trouble understanding is why it is not immediately active once it is initially installed. Instead, it requires a page reload unless you explicitly define precaching rules in the SW. If you define caching rules with wildcards, it's not possible to precache those so you need the reload.
Given a single-page PWA (like Angular), a user will discover the site and browse around on it, but the page will never be reloaded during that session. If they then want to use the site offline later, they need to have refreshed or re-opened the tab at least one other time. That seems like a pretty big pitfall to me.
Am I missing something here?
Your understanding of the service worker lifecycle is correct but I do not think the pitfall you mentioned is as severe as you think it is.
If I understand you correctly, the user experience will only be negatively affected if the user loses connectivity during the initial browsing of the page (before the service worker is active) and is missing an offline asset. If this is truly a scenario you want to account for, then that offline asset can be pre-cached in the browser-side JavaScript. Alternatively, as you mentioned, you can skipWaiting() and claim() to make the service worker active without the user refreshing the page.
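For reference, a minimal sketch of that skipWaiting()/claim() combination inside the service worker file:

```js
// sw.js: activate the new worker immediately and take control of pages that
// are already open, instead of waiting for the next reload.
self.addEventListener('install', (event) => {
  self.skipWaiting();
});

self.addEventListener('activate', (event) => {
  event.waitUntil(self.clients.claim());
});
```

The trade-off, as described in the lifecycle article, is that the new worker then takes over pages that were loaded under the previous version.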

Keeping users in sync with each other in a social network app?

I am wondering what the best way is to keep users in sync with each other in a social network. The stack concerned is an iOS app with a NodeJS backend. Let me give you an example:
Say X and Y are friends on a social network. Y's posts appear in X's feed, and as such, Y is cached somewhere on X's phone. This morning, however, Y decided to change their profile picture. Everything is well, and the new picture is uploaded to the server, but how do we go about letting X know about the change of profile picture?
My possible solution: Create a route /<UID>/updates that contains a stack of "cookies" which lets the user know what and who changed since the last time they made a GET request to the route.
This seems elegant enough, but what worries me is what happens on the client side (am I supposed to make a GET request every 2 minutes during my app's uptime?). Are there any other solutions?
One solution is indeed to poll the server, but that's not very elegant. A better way is to make use of websockets:
WebSockets is an advanced technology that makes it possible to open an interactive communication session between the user's browser and a server. With this API, you can send messages to a server and receive event-driven responses without having to poll the server for a reply.
They are a 2-way connection between client and server, allowing the server to notify the client of any changes. This is the underlying technology used in the Meteor framework for example.
Take a look at this blogpost for an example of how to use websockets between an iOS client and a NodeJS backend. They make use of the open source SocketRocket iOS library.
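As a rough illustration of the push model (using Node's ws package here, which is an assumption on my part; the linked post uses SocketRocket on the iOS side and may pair it with a different server library):

```js
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });
const clientsByUserId = new Map(); // hypothetical registry, filled on auth

wss.on('connection', (socket) => {
  socket.on('message', (msg) => {
    // Assume the client identifies itself right after connecting.
    const { userId } = JSON.parse(msg);
    clientsByUserId.set(userId, socket);
  });
});

// Called from the profile-picture upload handler: push a small event to
// every connected friend so they can refresh Y's cached profile.
function notifyFriends(friendIds, updatedUserId) {
  for (const id of friendIds) {
    const socket = clientsByUserId.get(id);
    if (socket && socket.readyState === WebSocket.OPEN) {
      socket.send(JSON.stringify({ type: 'user-updated', userId: updatedUserId }));
    }
  }
}
```

Clients that are offline at the time simply miss the event, so you would still want a catch-up query (like the /<UID>/updates route you describe) when the app comes back to the foreground.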

How to properly handle asynchronous database replication?

I'm considering using Amazon RDS with read replicas to scale our database.
Some of our controllers in our web application are read/write, some of them are read-only. We already have an automated way for identifying which controllers are read-only, so my first approach would have been to open a connection to the master when requesting a read/write controller, else open a connection to a read replica when requesting a read-only controller.
In theory, that sounds good. But then I stumbled upon the replication lag concept, which basically says that a replica can be several seconds behind the master.
Let's imagine the following use case then:
The browser posts to /create-account, which is read/write, thus connecting to the master
The account is created, transaction committed, and the browser gets redirected to /member-area
The browser opens /member-area, which is read-only, thus connecting to a replica. If the replica is even slightly behind the master, the user account might not exist yet on the replica, thus resulting in an error.
How do you realistically use read replicas in your application, to avoid these potential issues?
I worked with an application which used pseudo-vertical partitioning. Since only a handful of the data was time-sensitive, the application usually fetched from the slaves, and from the master only in selected cases.
As an example: when the user updated their password, the application would always ask the master for the authentication prompt. When changing non-time-sensitive data (like user preferences), it would display a success dialog along with information that it might take a while until everything is updated.
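A minimal sketch of that routing idea, assuming a Node.js app with two mysql2 connection pools (the hostnames, table names, and queries are made up):

```js
const mysql = require('mysql2/promise');

// One pool pointed at the master, one at a read replica (hypothetical hosts).
const master  = mysql.createPool({ host: 'master.db.example.com',  user: 'app', database: 'app' });
const replica = mysql.createPool({ host: 'replica.db.example.com', user: 'app', database: 'app' });

// Time-sensitive read: authentication always goes to the master.
async function getUserForLogin(email) {
  const [rows] = await master.query('SELECT * FROM users WHERE email = ?', [email]);
  return rows[0];
}

// Non-time-sensitive read: replication lag is acceptable here.
async function getUserPreferences(userId) {
  const [rows] = await replica.query('SELECT * FROM preferences WHERE user_id = ?', [userId]);
  return rows[0];
}
```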
Some other ideas which might or might not work depending on the environment:
After an update, compute an entity checksum, store it in the application cache, and when fetching the data always check it for compliance with that checksum
Use browser storage or a cookie to store the delta, ensuring the user always sees the latest version
Add an "up-to-date" flag and invalidate it synchronously on every slave node before/after an update
Whatever solution you choose, keep in mind that it is subject to the CAP theorem.
This is a hard problem, and there are lots of potential solutions. One potential solution is to look at what Facebook did.
TL;DR: read requests get routed to the read-only copy, but if you do a write, then for the next 20 seconds all your reads go to the writeable master.
The other main problem we had to address was that only our master databases in California could accept write operations. This fact meant we needed to avoid serving pages that did database writes from Virginia because each one would have to cross the country to our master databases in California. Fortunately, our most frequently accessed pages (home page, profiles, photo pages) don't do any writes under normal operation. The problem thus boiled down to, when a user makes a request for a page, how do we decide if it is "safe" to send to Virginia or if it must be routed to California?
This question turned out to have a relatively straightforward answer. One of the first servers a user request to Facebook hits is called a load balancer; this machine's primary responsibility is picking a web server to handle the request but it also serves a number of other purposes: protecting against denial of service attacks and multiplexing user connections to name a few. This load balancer has the capability to run in Layer 7 mode where it can examine the URI a user is requesting and make routing decisions based on that information. This feature meant it was easy to tell the load balancer about our "safe" pages and it could decide whether to send the request to Virginia or California based on the page name and the user's location.
There is another wrinkle to this problem, however. Let's say you go to editprofile.php to change your hometown. This page isn't marked as safe so it gets routed to California and you make the change. Then you go to view your profile and, since it is a safe page, we send you to Virginia. Because of the replication lag we mentioned earlier, however, you might not see the change you just made! This experience is very confusing for a user and also leads to double posting. We got around this concern by setting a cookie in your browser with the current time whenever you write something to our databases. The load balancer also looks for that cookie and, if it notices that you wrote something within 20 seconds, will unconditionally send you to California. Then when 20 seconds have passed and we're certain the data has replicated to Virginia, we'll allow you to go back for safe pages.
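A simplified sketch of the same cookie-based trick in an Express app (masterPool and replicaPool stand for the hypothetical connection pools from the earlier sketch):

```js
const express = require('express');
const cookieParser = require('cookie-parser');

const app = express();
app.use(cookieParser());

const STICKY_WINDOW_MS = 20 * 1000; // mirrors the 20-second window described above

// Decide per request whether reads may safely hit a replica.
app.use((req, res, next) => {
  const lastWrite = Number(req.cookies.last_write || 0);
  req.useMaster = Date.now() - lastWrite < STICKY_WINDOW_MS;
  next();
});

app.post('/create-account', async (req, res) => {
  // ... perform the write against masterPool here ...
  res.cookie('last_write', String(Date.now())); // stick this user to the master
  res.redirect('/member-area');
});

app.get('/member-area', async (req, res) => {
  const db = req.useMaster ? masterPool : replicaPool;
  // ... read the freshly created account from `db`; the cookie keeps reads on
  // the master until the replica has had time to catch up ...
  res.send('ok');
});
```

A signed or server-side timestamp would be more robust than a plain cookie, but the principle is the same.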

Taking a screenshot of the user's current page in Ruby on Rails

This is for debugging purposes; please take into account the following:
The user logs in to his/her account, so manually fetching a URL will not work; the screenshot must be taken at the moment the user accesses his admin pages.
I would love to receive guidelines specific to Ruby on Rails and Heroku (I guess Heroku is not much of an issue; I just dump the screenshot to S3).
So ideally, like I mentioned in #1, when a user accesses a page, my app also takes a screenshot of the entire page and dumps it in a tmp folder.
Can anyone point me to how to handle that?
In order to get a screenshot of what the user is currently seeing, you have to have some code on the user's machine that uses the underlying operating system API to take the screenshot. The API calls involved are different for Windows, Mac OS X and Linux.
Ruby on Rails executes on the remote server and generates HTML and JavaScript etc. that is sent to the user's web browser. The HTML is rendered by the browser and the JavaScript executes within the browser's sandbox, where it has no direct access to the operating system API. The important point is that there is no direct interaction between the server-side code and the OS running on the user's computer. If this were possible then it would be a massive security hole.
Therefore it's not possible to do what you want programmatically unless you can first install a client-side program on the user's computer that can talk to your server-side code. It cannot be done using Ruby on Rails alone because it's a server-side web framework.
You can't do this without the user sending a screenshot themselves.

How can I detect when a user browses to a certain URL?

I'm writing an application which becomes "useful" once the user is browsing a certain URL.
I want to add a feature to my application so that it is automatically launched once the user browses to this URL; I was thinking of writing some sort of watchdog to trigger it.
My question is whether there is a generic way to get notified when the user browses to URLs. I want to support at least IE and Firefox; Chrome and Safari are nice to have.
I read about DDE and WWW_RegisterURLEcho, but from what I understand it's not supported by Firefox, and the little sample I wrote didn't work with IE either.
Thank you in advance.
Some more questions:
Do URL Monikers and Asynchronous Pluggable Protocols help me here? Are they supported by Firefox?
If you have control over the website, you could have it write a cookie to the computer. Then have your application monitor for that cookie.
You can implement this in many ways and at many different layers.
At the highest level, you could implement a browser plugin. There is no cross-browser solution at this layer that will let you write the code once and have it work for every browser. On the easy end of the spectrum, Firefox, you could implement it entirely as a JavaScript + XUL plugin and use built-in XPCOM interfaces (nsIProcess) for launching your helper process. For IE you would need to write a COM, C++ and Win32 BHO that handles DWebBrowserEvents2::BeforeNavigate2. This is the hardest thing to do. There are mechanisms for Safari, Chrome and other web browsers that you could use to achieve this same behavior, with varying degrees of difficulty.
At the next level you could implement an HTTP proxy, similar to Fiddler2, that redirects all HTTP traffic through your local proxy first. Each browser has a different way of configuring its proxy settings, but they're all basically registry settings or config files.
At the most basic level you could just sniff all IP traffic going out of the machine, similar to the way Wireshark does it, and just look for HTTP requests to your URL. This is probably more difficult to code, but would work for all browsers without any special per-browser configuration going on. You may need to write a driver. I dunno, I've never done work at this level in the stack.
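To make the proxy idea a bit more concrete, here is a minimal Node.js sketch of a local forward proxy that watches for a target URL and launches a (hypothetical) helper executable; it handles plain HTTP only, and HTTPS would additionally require CONNECT handling:

```js
const http = require('http');
const { spawn } = require('child_process');

const TARGET = 'example.com/app'; // placeholder for the URL you care about
let launched = false;

http.createServer((clientReq, clientRes) => {
  // Proxy requests carry absolute URLs, e.g. http://example.com/app?x=1
  if (!launched && clientReq.url.includes(TARGET)) {
    launched = true;
    spawn('my-helper.exe', { detached: true }); // hypothetical application
  }

  // Forward the request to the real server and pipe the response back.
  const url = new URL(clientReq.url);
  const upstream = http.request(
    {
      host: url.hostname,
      port: url.port || 80,
      path: url.pathname + url.search,
      method: clientReq.method,
      headers: clientReq.headers,
    },
    (res) => {
      clientRes.writeHead(res.statusCode, res.headers);
      res.pipe(clientRes);
    }
  );
  clientReq.pipe(upstream);
}).listen(8888); // point each browser's HTTP proxy setting at localhost:8888
```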
