Number of users to a webpage - scalability

I have a webpage that reads data from a Microsoft Access file on my website. How many users can visit that page at the same time?
Would the page crash at some point if too many users tried to visit it at the same time? Is it better to use a PHP file that reads data from a text file, or is it just the same?

There are many variables that influence how many people can simultaneously use your website (loosely known as scalability), including your database, hardware, network, caching and more. And yes, at some point your performance will degrade if more and more users access the page.
It would be really hard to say from the information you provided how scalable your website is. PHP could be faster but not necessarily. Always be skeptical about technologies that promise superior performance.
For the moment your best option is to estimate how many concurrent users you are expecting and then use a load testing tool like JMeter, Apache Bench or others to assess whether your website will stand up to the load.
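If you just want a rough feel for concurrent load before setting up JMeter or Apache Bench, a small script like the sketch below can fire a batch of parallel requests and report timings. The URL and user counts are placeholders, and this is only a rough smoke test, not a replacement for a proper load testing tool.

    # Rough concurrency smoke test -- a sketch only, not a substitute for JMeter or ab.
    # The URL and the user/request counts below are placeholders; adjust them to your page.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    URL = "http://example.com/page-that-reads-the-access-file"  # hypothetical URL
    CONCURRENT_USERS = 50
    REQUESTS_PER_USER = 10

    def one_user(_):
        timings = []
        for _ in range(REQUESTS_PER_USER):
            start = time.time()
            with urlopen(URL, timeout=30) as resp:
                resp.read()
            timings.append(time.time() - start)
        return timings

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
            all_timings = [t for user in pool.map(one_user, range(CONCURRENT_USERS)) for t in user]
        all_timings.sort()
        print(f"requests: {len(all_timings)}")
        print(f"median:   {all_timings[len(all_timings) // 2]:.3f}s")
        print(f"p95:      {all_timings[int(len(all_timings) * 0.95)]:.3f}s")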

It turns out that my website was hosted on Domain.com. Domain.com says that I have unlimited bandwidth, but in reality I don't.
My website was crashing because it was hosted with thousands of other websites on the same server, so the bandwidth is limited even though it is advertised as unlimited. My only solution was to host my website on a VPS, basically hosting my website on a server by itself.

Related

Impact of Fiori Apps on Server

We are following the embedded architecture for our S/4HANA 1610 system.
Please let me know what the impact on the server will be if we implement 200+ standard Fiori apps in our system.
Regards,
Sayed
When you say “server”, are you referring to the ABAP backend, consisting of one or more SAP application servers and usually one database server?
In this case, you might get an initial impression using transaction ST03.
Here, you get a detailed analysis of resource consumption on the SAP application server.
You also get information about database access times, as seen from the application server.
This can give you a good hint about resource consumption on the database server.
Usually, the ABAP backend is accessed from Fiori via OData calls.
Not every user interaction causes an OData call; some interactions are handled locally at the frontend.
In general, implemented apps only require some space on the hard disk, as long as nobody is using them.
So the important questions for defining the expected workload are:
How many users are working with these apps, and at what frequency (average think time)?
How many OData calls are sent from these apps to the backend, and how many dialog steps are handled by the frontend itself?
How expensive are these OData calls (see ST03)?
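As a rough illustration of how the answers to these questions combine, here is a back-of-envelope sketch. Every number in it is a made-up placeholder to be replaced with values measured in ST03; this is not an SAP sizing guideline.

    # Back-of-envelope workload estimate -- all figures are hypothetical placeholders,
    # not SAP sizing numbers; replace them with values measured via ST03.
    concurrent_users      = 200    # users active in the Fiori apps
    dialog_steps_per_hour = 60     # steps per user per hour (driven by avg. think time)
    odata_calls_per_step  = 0.5    # share of dialog steps that actually hit the backend
    avg_call_cpu_ms       = 300    # avg. CPU time per OData call on the app server (ST03)

    calls_per_hour = concurrent_users * dialog_steps_per_hour * odata_calls_per_step
    cpu_seconds_per_hour = calls_per_hour * avg_call_cpu_ms / 1000.0

    print(f"OData calls per hour: {calls_per_hour:.0f}")
    print(f"Backend CPU seconds per hour: {cpu_seconds_per_hour:.0f}")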
Every app reflects one or more typical business processes, which need to be defined.
Your specific Customizing also plays an important role, because it controls different internal functionality.
It’s also mandatory to optimize database access, because in productive use tables get bigger over time, which might slow down database access.
Usually, this kind of sizing is done by SAP Hardware and Technology partners.

Rails based hub and spoke distributed web site. Anything out there?

I need to design a system where we have a central Rails website for maintaining product information, some of which is rich media (photos, movies etc.) and we need a way to efficiently access this central information from a series of information kiosks. The central system will be used to update and control access to the information and the kiosks will primarily display this with no editing required. The only traffic which is likely to move back from kiosk to central site is usage information which is not bandwidth constrained.
My initial thoughts are to run separate Rails servers on each kiosk and 'somehow' (e.g. a scheduled rake task) synchronise the relevant content from the central server to each kiosk. Note that the kiosks won't all have the same content on them, as it will be location dependent. We might need to employ something like Amazon S3 storage to host content.
Another option would be to employ some sort of advanced caching (i.e. more advanced than standard browser caching) on each kiosk to minimise network bandwidth requirements and speed things up. I've used Squid before, but only as a general purpose site cache server, and I don't know if it can step up to what I need here.
So, my question is whether anyone out there has attempted anything like this before and what sort of architecture you found to work. I'd be interested in hearing if there are any Rails plugins which are relevant to my requirements and/or any smart caching servers.
Many thanks,
Craig.
I know it's not possible for every application, but you could generate a static cache of the content and use a scheduled task to update each kiosk from that cache. Then you don't have to maintain Rails servers on each one.
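A minimal sketch of that scheduled-sync idea, written in Python for brevity (the same thing could be a rake task). The manifest URL, its JSON layout and the local cache path are assumptions for illustration, not something a particular Rails plugin provides.

    # Sketch of a kiosk-side sync job (run from cron or a scheduled task).
    # The manifest URL, its JSON format, and the local cache path are all assumptions;
    # the central Rails app would need to expose something similar.
    import hashlib
    import json
    import os
    from urllib.request import urlopen

    MANIFEST_URL = "https://central.example.com/kiosks/KIOSK-42/manifest.json"  # hypothetical
    CACHE_DIR = "/var/kiosk/content"

    def sha256_of(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def sync():
        with urlopen(MANIFEST_URL) as resp:
            manifest = json.load(resp)  # e.g. [{"name": "intro.mp4", "sha256": "...", "url": "..."}]
        os.makedirs(CACHE_DIR, exist_ok=True)
        for item in manifest:
            local_path = os.path.join(CACHE_DIR, item["name"])
            if os.path.exists(local_path) and sha256_of(local_path) == item["sha256"]:
                continue  # already up to date
            with urlopen(item["url"]) as resp, open(local_path, "wb") as out:
                out.write(resp.read())

    if __name__ == "__main__":
        sync()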
Depending on what you're running on the kiosks, if you need a bit more interactivity, you can run a Sinatra or a Camping app. Those are a fair bit lighter weight than Rails. You can communicate through XML. If you're running a Flash app on the kiosk, look at the RubyAMF library.

Detecting end-user connection speed problems in Apache for Windows

Our company provides web-based management software (servicedesk, helpdesk, timesheet, etc) for our clients.
One of them has been causing a great headache for some months, complaining about the connection speed to our servers.
In our individual tests, the connection and response speeds are always great.
Some information about this specific client :
They have about 300 PCs on their local network, all using the same bandwidth/server for internet access.
They don't allow us to ping their server, so we can't establish a traceroute.
They claim every other site (Google, blogs, news, etc.) always responds fast. We know for a fact they have no intention to mislead us and know this to be true.
They might have up to 100 PCs simultaneously logged in to our software at any given time. They need to increase that number to as many as 300, so this is a major issue.
They are helpful and collaborative on this issue, which we have been trying to resolve for a long time.
Some information about our server and software :
We have been able to handle more than 400 users at a single time without major speed losses for other clients.
We have gone to extensive lengths to make good use of data caching and opcode caching in the software itself, and we did notice the improvement (from fast to faster).
There are no database, CPU or memory bottlenecks or leaks. Other clients are able to access the server just fine.
We have little to no knowledge of how to analyze specific end-user problems (Apache running under Windows Server), and this is where I could use a lot of help.
Anything that might be related to Apache configuration would also be helpful.
While all signs point to an internal problem in this specific client's network, we are dedicating effort to solving that too, if that is the case, but we do not have professionals trained to deal with network problems (they do, however, and their main argument remains that 'all other sites are fast, only yours is slow').
You might want to have a look at the tools from Google's "Page Speed" family: http://code.google.com/speed/page-speed/docs/overview.html
Your customer could run the Page Speed extension for you; maybe then you can find out what the problem is: http://code.google.com/speed/page-speed/docs/extension.html
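One more server-side angle, assuming you can adjust Apache's LogFormat: mod_log_config can log the time taken to serve each request with the %D directive (microseconds). With that appended to the access log, a small script can break response times down by client IP, which shows whether only this client's requests are slow. The log path and the exact log layout below are assumptions; adjust the parsing to your format.

    # Sketch: summarise per-client response times from an Apache access log.
    # Assumes the access log uses the "combined" format with " %D" (service time in
    # microseconds) appended at the end -- adjust the parsing if yours differs.
    from collections import defaultdict
    from statistics import median

    LOG_PATH = r"C:\Apache24\logs\access.log"  # hypothetical path on a Windows server

    times_by_ip = defaultdict(list)
    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            parts = line.split()
            if len(parts) < 2:
                continue
            ip, micros = parts[0], parts[-1]
            if micros.isdigit():
                times_by_ip[ip].append(int(micros) / 1_000_000.0)  # seconds

    # Print the 20 clients with the slowest median response time.
    for ip, times in sorted(times_by_ip.items(), key=lambda kv: -median(kv[1]))[:20]:
        times.sort()
        p95 = times[int(len(times) * 0.95)] if len(times) > 1 else times[-1]
        print(f"{ip:15s}  requests={len(times):6d}  median={median(times):.3f}s  p95={p95:.3f}s")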

How does a spider in a search engine work?

How does a crawler or spider in a search engine work?
Specifically, you need at least some of the following components:
Configuration: Needed to tell the crawler how, when and where to connect to documents; and how to connect to the underlying database/indexing system.
Connector: This will create the connections to a web page or a disk share or anything, really.
Memory: The pages already visited must be known to the crawler. This is usually stored in the index but it depends on the implementation and the needs. The content is also hashed for de-duplication and updates validation purposes.
Parser/Converter: Needed to be able to understand the content of a document and extract meta-data. Will convert the extracted data to a format usable by the underlying database system.
Indexer: Will push the data and meta-data to a database/indexing system.
Scheduler: Will plan runs of the crawler. Might need to handle a large number of running crawlers at the same time and take into consideration what is currently being done.
Connection algorithm: When the parser finds links to other documents, it needs to analyse when, how, and where the next connections must be made. Also, some indexing algorithms take into consideration the page connection graphs, so it might be necessary to store and sort information related to that.
Policy Management: Some sites require crawlers to respect certain policies (robots.txt, for example).
Security/User Management: The crawler might need to be able to login in some system to access data.
Content compilation/execution: The crawler might need to execute certain things to be able to access what's inside, like applets/plugins.
Crawlers need to be efficient: working together from different starting points, keeping speed and memory usage in check, and using a high number of threads/processes. I/O is key.
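As a small illustration of the "Memory" component listed above (the class name and the in-memory storage are mine for the sketch; a real crawler would persist this in its index or a database), content hashing makes de-duplication and update detection cheap:

    # Sketch of the "Memory" component: it remembers which URLs were visited and hashes
    # page content so re-crawls can detect duplicates and updates. Illustrative only.
    import hashlib

    class CrawlMemory:
        def __init__(self):
            self.content_hash_by_url = {}

        def seen(self, url):
            return url in self.content_hash_by_url

        def record(self, url, content: bytes):
            """Store the page hash. Returns 'new', 'updated' or 'unchanged'."""
            digest = hashlib.sha256(content).hexdigest()
            previous = self.content_hash_by_url.get(url)
            self.content_hash_by_url[url] = digest
            if previous is None:
                return "new"
            return "unchanged" if previous == digest else "updated"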
The World Wide Web is basically a directed graph of web documents, images, multimedia files, etc. Each node of the graph is a component of a web page; for example, a web page consists of images, text, video, etc., and all of them are linked. The crawler traverses this graph using breadth-first search, following the links in web pages.
A crawler initially starts with one (or more) seed points.
It scans the webpage and explores the links in that page.
This process continues until the whole graph is explored (a predefined constraint can be used to limit the search depth).
From How Stuff Works
How does any spider start its travels over the Web? The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.
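A toy breadth-first crawler along the lines described above might look like the sketch below. It is simplified for illustration: no robots.txt handling, throttling or retries, and the seed URL is a placeholder.

    # Toy breadth-first crawler: starts from seed URLs, follows links, stops at a depth
    # limit. Simplified for illustration -- no robots.txt handling, throttling or retries.
    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seeds, max_depth=2):
        visited = set()                        # the crawler's "memory"
        queue = deque((url, 0) for url in seeds)
        while queue:
            url, depth = queue.popleft()
            if url in visited or depth > max_depth:
                continue
            visited.add(url)
            try:
                with urlopen(url, timeout=10) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except OSError:
                continue                       # unreachable or broken link
            print(f"{'  ' * depth}{url}")
            parser = LinkExtractor()
            parser.feed(html)
            for link in parser.links:
                queue.append((urljoin(url, link), depth + 1))
        return visited

    if __name__ == "__main__":
        crawl(["https://example.com/"], max_depth=1)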

How do you build a torrent file indexer?

I am curious about the technology behind a search engine like torrentz.com. From what I could observe, it doesn't host any torrent files, but rather connects you to other servers that do.
You search for keywords, and it brings up a list of potential titles matching your search.
Then you pick one of these, and it provides you with another list of potential servers hosting the corresponding torrent file.
What I'm interested in particularly is the strategy behind gathering and indexing all that content:
How do they collect then aggregate the data?
Is it a submission base service, where each of these servers submits its content for indexing?
Is it a crawling algorithm? If so how do you even start crawling a site like piratebay.org?
Do they have access to these other servers' databases?
My knowledge and understanding of the BitTorrent protocol is not very deep, and the documentation that I found online pointed me more toward the processes involved in building a tracker service, which isn't exactly what I'm interested in. Any insight and recommended reading material is appreciated.
To begin with, start indexing their RSS feeds and gathering data from them. The next step would be indexing the portals' pages (like Mininova, TPB, etc.), but watch out for the fact that you can be banned (IP-based) for doing so, since that would mean a huge amount of data being requested from their servers (I don't think they would be too happy about that).
That said, I doubt that they have access to other servers' databases; rather, it's crawling + RSS.
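A minimal sketch of that RSS-polling idea, with a placeholder feed URL and a deliberately tiny schema:

    # Sketch of polling a torrent portal's RSS feed and storing the items for indexing.
    # The feed URL is a placeholder and the table schema is deliberately minimal.
    import sqlite3
    import xml.etree.ElementTree as ET
    from urllib.request import urlopen

    FEED_URL = "https://example-torrent-portal.org/rss"  # hypothetical feed

    def poll_feed(db_path="torrents.db"):
        db = sqlite3.connect(db_path)
        db.execute("CREATE TABLE IF NOT EXISTS torrents (title TEXT, link TEXT UNIQUE)")
        with urlopen(FEED_URL, timeout=30) as resp:
            root = ET.fromstring(resp.read())
        for item in root.iter("item"):         # standard RSS 2.0 <item> elements
            title = item.findtext("title", default="")
            link = item.findtext("link", default="")
            if link:
                db.execute("INSERT OR IGNORE INTO torrents (title, link) VALUES (?, ?)",
                           (title, link))
        db.commit()
        db.close()

    if __name__ == "__main__":
        poll_feed()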
Another thing you can do: when somebody makes a query for an item which you don't have in your database, you make the query on the main BT portals, cache the result in your DB, and then display the results. Then if another user makes the same query (which is a pretty common scenario) you can show them the cached data plus new data from RSS.
