Do CPUs make mistakes? - memory

Imagine that a regular computer works intensively for 5 years non-stop: the CPU is always at 100% and is constantly reading and writing to memory. Is it true that the computer will not make a single mistake?

Even in the absence of any errors caused by the CPU, storage elements are subject to bit flips (known as Single Event Upsets) from cosmic radiation. More information on that in Compiling an application for use in highly radioactive environments.
Radiation effects are more severe at higher altitudes, where the atmosphere provides less protection, so computers in Denver experience more bit flips than computers in Miami or Los Angeles. Similar considerations apply if you are designing equipment for use in a hospital near an X-ray machine.
Unless your hypothetical computer has an extremely small amount of memory, it is unlikely to work without any mistake for 5 years. Note however that some of the bit flips may occur in parts of the memory that you are not using, in which case they won't affect you.
You may find it interesting to read How to Kill a Supercomputer. Typical ECC (Error Correcting Code) memory can correct any single bit flip in a word, and can detect but not correct any two bit flips in a word. Note also that in some cases radiation can permanently damage memory cells, and those cells will never recover even after a cold start.
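To get a concrete feel for what "correct one bit flip, detect two" means, here is a small toy sketch in C of an extended Hamming (SECDED) code over a 4-bit value. It only illustrates the principle; real ECC DIMMs apply a wider code (typically 8 check bits per 64 data bits) in hardware.
#include <stdio.h>

/* Encode 4 data bits into 8 code bits: Hamming(7,4) plus an overall parity
 * bit, i.e. single-error-correcting, double-error-detecting (SECDED). */
static unsigned char encode(unsigned char d)
{
    unsigned b[8] = {0};               /* b[1..7] = Hamming positions, b[0] = overall parity */
    b[3] = (d >> 3) & 1;
    b[5] = (d >> 2) & 1;
    b[6] = (d >> 1) & 1;
    b[7] = d & 1;
    b[1] = b[3] ^ b[5] ^ b[7];         /* parity over positions 1,3,5,7 */
    b[2] = b[3] ^ b[6] ^ b[7];         /* parity over positions 2,3,6,7 */
    b[4] = b[5] ^ b[6] ^ b[7];         /* parity over positions 4,5,6,7 */
    b[0] = b[1] ^ b[2] ^ b[3] ^ b[4] ^ b[5] ^ b[6] ^ b[7];
    unsigned char w = 0;
    for (int i = 0; i < 8; i++)
        w |= (unsigned char)(b[i] << i);
    return w;
}

/* Returns 0 = word clean, 1 = single bit flip corrected in place,
 * 2 = double bit flip detected but not correctable. */
static int check(unsigned char *w)
{
    unsigned b[8];
    for (int i = 0; i < 8; i++)
        b[i] = (*w >> i) & 1;
    unsigned s = (b[1] ^ b[3] ^ b[5] ^ b[7])
               | ((b[2] ^ b[3] ^ b[6] ^ b[7]) << 1)
               | ((b[4] ^ b[5] ^ b[6] ^ b[7]) << 2);  /* syndrome = error position */
    unsigned overall = 0;
    for (int i = 0; i < 8; i++)
        overall ^= b[i];
    if (s == 0 && overall == 0)
        return 0;                      /* no error */
    if (overall == 1) {                /* odd number of flips: assume one and fix it */
        *w ^= (unsigned char)(1u << s);
        return 1;
    }
    return 2;                          /* even number of flips, nonzero syndrome: give up */
}

int main(void)
{
    unsigned char w = encode(0xB);     /* store the nibble 1011 */
    w ^= 1u << 5;                      /* one cosmic-ray bit flip */
    printf("single flip -> %d (corrected)\n", check(&w));
    w ^= (1u << 2) | (1u << 6);        /* two bit flips in the same word */
    printf("double flip -> %d (detected only)\n", check(&w));
    return 0;
}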

Related

Memory efficiency vs Processor efficiency

In general use, should I bet on memory efficiency or processor efficiency?
In the end, I know it must depend on the software/hardware specs, but I think there's a general rule when there are no such constraints.
Example 01 (memory efficiency):
int n=0;
if(n < getRndNumber())
n = getRndNumber();
Example 02 (processor efficiency):
int n=0, aux=0;
aux = getRndNumber();
if(n < aux)
n = aux;
They're just simple examples, and I wrote them to show what I mean. Better examples will be well received.
Thanks in advance.
I'm going to wheel out the universal performance question trump card and say "neither, bet on correctness".
Write your code in the clearest possible way, set specific measurable performance goals, measure the performance of your software, profile it to find the bottlenecks, and then if necessary optimise knowing whether processor or memory is your problem.
(As if to make a case in point, your 'simple examples' have different behaviour assuming getRndNumber() does not return a constant value. If you'd written it in the simplest way, something like n = max(0, getRndNumber()) then it may be less efficient but it would be more readable and more likely to be correct.)
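Purely for illustration, here is that single-call version in C; getRndNumber() is the asker's hypothetical function, so a stand-in stub is included just to make it compile.
#include <stdlib.h>

static int getRndNumber(void) { return rand() % 201 - 100; }  /* stand-in stub */

static int max_int(int a, int b) { return a > b ? a : b; }

int clearest_version(void)
{
    /* one call, one comparison: no chance of seeing two different random values */
    return max_int(0, getRndNumber());
}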
Edit:
To answer Dervin's criticism below, I should probably state why I believe there is no general answer to this question.
A good example is taking a random sample from a sequence. For sequences small enough to be copied into another contiguous memory block, a partial Fisher-Yates shuffle which favours computational efficiency is the fastest approach. However, for very large sequences where there is not enough memory to make a copy, something like reservoir sampling that favours memory efficiency must be used; this will be an order of magnitude slower.
So what is the general case here? For sampling a sequence should you favour CPU or memory efficiency? You simply cannot tell without knowing things like the average and maximum sizes of the sequences, the amount of physical and virtual memory in the machine, the likely number of concurrent samples being taken, the CPU and memory requirements of the other code running on the machine, and even things like whether the application itself needs to favour speed or reliability. And even if you do know all that, then you're still only guessing, you don't really know which one to favour.
Therefore the only reasonable thing to do is implement the code in a manner favouring clarity and maintainability (taking factors you know into account, and assuming that clarity is not at the expense of gross inefficiency), measure it in a real-life situation to see whether it is causing a problem and what the problem is, and then if so alter it. Most of the time you will not have to change the code as it will not be a bottleneck. The net result of this approach is that you will have a clear and maintainable codebase overall, with the small parts that particularly need to be CPU and/or memory efficient optimised to be so.
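For reference, here is a rough C sketch of the reservoir sampling idea mentioned above (Algorithm R), with the "stream" simulated by a plain array. It keeps only K items in memory no matter how long the sequence is.
#include <stdio.h>
#include <stdlib.h>

#define K 3

int main(void)
{
    int stream[] = { 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 };
    size_t n = sizeof stream / sizeof stream[0];
    int reservoir[K];

    srand(42);
    for (size_t i = 0; i < n; i++) {
        if (i < K) {
            reservoir[i] = stream[i];             /* fill the reservoir first */
        } else {
            size_t j = (size_t)rand() % (i + 1);  /* uniform in 0..i */
            if (j < K)
                reservoir[j] = stream[i];         /* keep item with probability K/(i+1) */
        }
    }
    for (size_t i = 0; i < K; i++)
        printf("%d ", reservoir[i]);
    printf("\n");
    return 0;
}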
You think one is unrelated to the other? Why do you think that? Here are two examples of bottlenecks that often go unconsidered.
Example 1
You design a DB related software system and find that I/O is slowing you down as you read in one of the tables. Instead of allowing multiple queries resulting in multiple I/O operations you ingest the entire table first. Now all rows of the table are in memory and the only limitation should be the CPU. Patting yourself on the back, you wonder why your program becomes hideously slow on memory-poor computers. Oh dear, you've forgotten about virtual memory, swapping, and such.
Example 2
You write a program where your methods create many small objects but possess O(1), O(log n) or at worst O(n) running time. You've optimized for speed but see that your application takes a long time to run. Curiously, you profile to discover what the culprit could be. To your chagrin you discover that all those small objects add up fast. Your code is being held back by the GC.
You have to decide based on the particular application, usage, etc. In your example above, both memory and processor usage are trivial, so it's not a good example.
A better example might be the use of history tables in chess search. This method caches previously searched positions in the game tree in case they are re-searched in other branches of the game tree or on the next move.
However, it does cost space to store them, and space also requires time. If you use up too much memory you might end up using virtual memory which will be slow.
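As a rough illustration (in C, not taken from any particular engine), a history/transposition table is often just a fixed-size, always-replace hash cache, so its memory cost is capped up front:
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE (1u << 16)           /* 65536 entries: a fixed memory budget */

struct tt_entry {
    uint64_t key;                       /* position hash (e.g. a Zobrist key) */
    int      score;                     /* cached search result */
    int      valid;
};

static struct tt_entry table[TABLE_SIZE];

static void tt_store(uint64_t key, int score)
{
    struct tt_entry *e = &table[key & (TABLE_SIZE - 1)];
    e->key = key;                       /* always-replace scheme: newest entry wins */
    e->score = score;
    e->valid = 1;
}

static int tt_probe(uint64_t key, int *score)
{
    struct tt_entry *e = &table[key & (TABLE_SIZE - 1)];
    if (e->valid && e->key == key) {    /* hit: reuse the earlier search result */
        *score = e->score;
        return 1;
    }
    return 0;                           /* miss: the caller has to search this position */
}

int main(void)
{
    int score;
    tt_store(0x0123456789ABCDEFULL, 42);
    if (tt_probe(0x0123456789ABCDEFULL, &score))
        printf("cached score: %d\n", score);
    return 0;
}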
Another example might be caching in a database server. Clearly it is faster to access a cached result from main memory, but then again it would not be a good idea to keep loading and freeing from memory data that is unlikely to be re-used.
In other words, you can't generalize. You can't even make a decision based on the code - sometimes the decision has to be made in the context of likely data and usage patterns.
In the past 10 years, main memory has hardly increased in speed at all, while processors have continued to race ahead. There is no reason to believe this is going to change.
Edit: Incidentally, in your example, aux will most likely end up in a register and never make it to memory at all.
Without context, I think optimising for anything other than readability and flexibility is a mistake.
So, the only general rule I could agree with is "Optimise for readability, while bearing in mind the possibility that at some point in the future you may have to optimise for either memory or processor efficiency".
Sorry it isn't quite as catchy as you would like...
In your example, version 2 is clearly better, even though version 1 is prettier to me, since, as others have pointed out, calling getRndNumber() multiple times requires more knowledge of getRndNumber()'s behaviour to follow the code.
It's also worth considering the scope of the operation you are looking to optimize; if the operation is time sensitive, say part of a web request or GUI update, it might be better to err on the side of completing it faster than saving memory.
Processor efficiency. Memory is egregiously slow compared to your processor. See this link for more details.
Although, in your example, the two would likely be optimized to be equivalent by the compiler.

Defining minimum memory and free disk space requirements?

On page 42 of Code Complete there's a checklist of requirement items you might want to consider during a requirements phase.
One of the items (near the bottom of the list) says: Are minimum machine memory and free disk space specified?
Has this ever been a requirement in any project you did and how did you define such a requirement before even starting to build stuff?
I know this is only a suggestion and frankly I don't think I will ever include that in my requirements, but it got me thinking (and this is the real question)..
How would one ever make an estimation of system requirements...
This is in the requirements phase, so isn't it more about identifying the minimum specification of machine that the application has to run on than estimating the resources your application will use?
I've developed systems for corporate clients where they have standard builds and are able to identify the minimum spec machine that will be used. Often you won't know the minimum specifications of the machines that you will be installing on but you will know the operating systems that you have to support, and may be able to infer them from that.
I have specified this before, but it's always been a ballpark figure using the 'standard' specification of the day. For example, at the moment I would simply say that my app was designed to be deployed to servers with at least 4GB of RAM, because that's what we develop and test on.
For client apps you might need to get a bit more detailed, but it's generally best to decide on the class of machine you are targeting and then make sure that your app fits within those constraints. Only when your application has particularly high requirements in one area (e.g. if it stores a lot of images, or needs a powerful graphics processor) do you need to go into more detail.
These sure are considerations in the early stages of some projects I've worked on. A lot of scientific codes boil down to working with large matrices. It's often possible to identify early on that code X will need to manipulate a dense matrix with, say, 100,000 rows and columns of complex doubles. Do the sums. Sometimes the answer is (a) pack a PC with RAM, sometimes it is (b) we'll have to parallelise this for memory even if it's not necessary for performance.
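To make "do the sums" concrete, here is the arithmetic for that hypothetical matrix as a trivial C snippet:
#include <stdio.h>

int main(void)
{
    unsigned long long rows = 100000ULL, cols = 100000ULL;
    unsigned long long elem = 16ULL;                /* complex double: 2 x 8 bytes */
    unsigned long long bytes = rows * cols * elem;
    printf("%.1f GB\n", bytes / 1e9);               /* about 160 GB: pack RAM or parallelise */
    return 0;
}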
Sometimes our users would like to checkpoint their programs every N iterations. Checkpointing with very large datasets can use a lot of disk space. Get out your calculator again.
I know it's all very niche, but it matters when it matters.
Machine memory is a tricky one with virtual memory being so common, but disk space isn't that hard depending on the system. We've got a system at work that was built to deal with a number of external devices (accepting input, transforming data and delivering to a customer) and that was fairly easy to size given that we knew the current and projected data volumes that the devices were generating.
You can check how much memory is used by your software during testing, and then estimate how much more you may need if you process bigger chunks, i.e. if you process 1000 items in your biggest test suite and you need 4 MB, then you will probably need 4 GB to process 1 million items.
I've seen software in embedded systems have minimum machine memory requirements - often derived from limitations on the custom built hardware. If the box can only be X by Y by Z dimensions, and has to have other physical requirements satisfied, the limitations on available memory for the software can be absolute and the bare minimum should get set up front.
It's never been a big deal for me in the web app world - after all, there will probably be a new model of the target hardware released before I'm done with the code and memory will be cheaper... so why waste time trying to fit a small size when you can just add on?
I've seen large data projects mention free space - you can really gunk up a system if your database doesn't have some amount of slack to move data around. I've seen requirements that specify bells and whistles and emergency measures to make sure that there is always enough room to keep the database humming.

Virtual Memory

Most of the literature on Virtual Memory points out that, as an application developer, understanding Virtual Memory can help me harness its powerful capabilities. I have been involved in developing applications on Linux for some time, but I didn't care about Virtual Memory intricacies while coding. Am I missing something? If so, please shed some light on how I can leverage the workings of Virtual Memory. Otherwise, let me know if I am not making sense with the question!
Well, the concept is pretty simple actually. I won't repeat it here, but you should pick up any book on OS design and it will be explained there. I recommend "Operating System Concepts" by Silberschatz and Galvin - it's what I had to use at university and it's good.
A couple of things I can think of that knowledge of Virtual Memory might give you are:
Learning to allocate memory on page boundaries to avoid waste (applies only to virtual memory, not the usual heap/stack memory);
Locking some pages in RAM so they don't get swapped to HDD (sketched at the end of this answer);
Guard pages;
Reserving some address range and committing actual memory later;
Perhaps using the NX (non-executable) bit to increase security, but I'm not sure about this one;
PAE for accessing >4GB of physical memory on a 32-bit system.
Still, all of these things would have uses only in quite specific scenarios. Indeed, 99% of applications need not concern themselves about this.
Added: That said, it's definitely good to know all these things, so that you can identify such scenarios when they arise. Just beware - with power comes responsibility.
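Here is a small sketch of two of the items above (locking pages and guard pages), assuming Linux/POSIX; error handling is minimal and this is illustrative rather than production code.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);

    /* two pages: the first for data, the second as a guard */
    unsigned char *buf = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* keep the data page resident (e.g. for latency-sensitive or secret data) */
    if (mlock(buf, page) != 0) perror("mlock");

    /* any access to the second page now triggers SIGSEGV instead of silent corruption */
    if (mprotect(buf + page, page, PROT_NONE) != 0) perror("mprotect");

    memset(buf, 0, page);   /* fine: stays within the data page */
    /* buf[page] = 1;  <-- this access would fault on the guard page */

    munmap(buf, 2 * page);
    return 0;
}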
It's a bit of a vague question.
The way you can use virtual memory directly is chiefly through memory-mapped files. See the mmap() man page for more details.
That said, you are probably using them implicitly anyway, as any dynamic library is implemented as a mapped file, and many database libraries use them too.
The interface to use mapped files from higher level languages is often quite inconvenient, which makes them less useful.
The chief benefits of using mapped files are:
No system call overhead when accessing parts of the file (this actually might be a disadvantage, as a page fault probably has as much overhead anyway, if it happens)
No need to copy data from OS buffers to application buffers - this can improve performance
Ability to share memory between processes.
Some drawbacks are:
32-bit machines can run out of address space easily
Tricky to handle file extending correctly
No easy way to see how many / which pages are currently resident (there may be some ways however)
Not good for real-time applications, as a page fault may cause an IO request, which blocks the thread (the file can be locked in memory, however, but only if there is enough physical memory).
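For completeness, a minimal C sketch of the basic mmap() usage described in this answer: the file is mapped read-only and read through a pointer, with no read() calls and no copying into an application buffer. Error handling is kept to a bare minimum.
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    /* the file's pages are faulted in lazily as the pointer is dereferenced */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    long newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)   /* just touch every byte */
        if (data[i] == '\n')
            newlines++;
    printf("%ld lines\n", newlines);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}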
Maybe in 9 out of 10 cases you need not worry about virtual memory management; that's the job of the kernel. Only in some highly specialized applications do you need to tweak things.
I know of one article that talks about computer memory management with an emphasis on Linux [ http://lwn.net/Articles/250967 ]. Hope this helps.
For most applications today, the programmer can remain unaware of the workings of computer memory without any harm. But sometimes -- for example the case when you want to improve the footprint of your program -- you do end up having to manipulate memory yourself. In such situations, knowing how memory is designed to work is essential.
In other words, although you can indeed survive without it, learning about virtual memory will only make you a better programmer.
And I would think the Wikipedia article can be a good start.
If you are concerned with performance -- understanding memory hierarchy is important.
For small data sets which are fully contained in physical memory you need to be concerned with caching (accessing memory from the cache is much faster).
When dealing with large data sets -- which may be paged out due to lack of physical memory -- you need to be careful to keep your access patterns localized.
For example if you declare a matrix in C (int a[rows][cols]), it is allocated by rows. Thus when scanning the matrix, you need to scan by rows rather than by columns. Otherwise you will be paging the same data in and out many times.
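A tiny C illustration of that point: both loops below visit the same elements, but the first walks memory sequentially while the second strides across rows, which is far less cache- and paging-friendly once the matrix is large.
#include <stdio.h>

#define ROWS 4
#define COLS 4

int main(void)
{
    static int a[ROWS][COLS];
    long sum = 0;

    /* good: inner loop moves along a row (contiguous memory) */
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            sum += a[r][c];

    /* bad for large matrices: inner loop strides down a column */
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            sum += a[r][c];

    printf("%ld\n", sum);
    return 0;
}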
Another issue is the difference between dirty and clean data held in memory. Clean data is information loaded from file that was not modified by the program. The OS may page out clean data (perhaps depending on how it was loaded) without writing it to disk. Dirty pages must first be written to the swap file.

Was there something in Cobol intrinsically making it susceptible to Y2K issues?

I know that a lot of the Y2K effort/scare was somehow centered on COBOL, deservedly or not.
(Heck, I saw a minor Y2K bug in a Perl script that broke on 1/1/2000.)
What I'm interested in, was there something specific to COBOL as a language which made it susceptible to Y2K issues?
That is, as opposed to merely the age of most programs written in it, the need to skimp on memory/disk usage driven by old hardware, and the fact that nobody anticipated those programs surviving for 30 years?
I'm perfectly happy if the answer is "nothing specific to COBOL other than age" - merely curious, knowing nothing about COBOL.
It was 80% about storage capacity, pure and simple.
People don't realize that the capacity of their laptop hard drive today would have cost millions in 1980. You think saving two bytes is silly? Not when you have 100,000 customer records, and a hard drive the size of a refrigerator held 20 megabytes and required a special room to keep it cool.
Yes and no. In COBOL you had to declare variables such that you actually had to say how many digits there were in a number; i.e., YEAR PIC 99 declared the variable YEAR such that it could only hold two decimal digits. So yes, it was easier to make that mistake than in C, where you would have int or short or char as the year and still have plenty of room for years greater than 99. Of course that doesn't protect you from printf-ing 19%d in C and still having the problem in your output, or from making other internal calculations based on thinking the year would be less than or equal to 99.
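As a tiny illustration of that C-side pitfall (a made-up snippet, not from any real system):
#include <stdio.h>

int main(void)
{
    int year2digit = 99;                 /* 1999 stored as 99 */
    printf("19%02d\n", year2digit);      /* "1999" -- fine */

    year2digit = (99 + 1) % 100;         /* roll over to 2000 */
    printf("19%02d\n", year2digit);      /* "1900" -- the Y2K bug */
    return 0;
}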
It seemed to be more a problem of people not knowing how long their code would be used, so they chose to use 2 digit years.
So, nothing specific to COBOL, it is just that COBOL programs tend to be critical and old so they were more likely to be affected.
Was there something in Cobol intrinsically making it susceptible to Y2K issues?
Programmers1. And the systems where COBOL programs run2.
1: They didn't design looking 30 years ahead. I can't blame them, really. If I had memory constraints, between squeezing 2 bytes per date and making it work 30 years later, most likely I would make the same decision.
2: The systems could have had the same problem if the hardware stored the year in two digits.
Fascinating question. What is the Y2K problem, in essence? It's the problem of not defining your universe sufficiently. There was no serious attempt to model all dates, because space was more important (and the apps would be replaced by then). So in Cobol at every level, that's important: to be efficient and not overdeclare the memory you need, both at the store and at the program level.
Where efficiency is important, we commit Y2Kish errors... We do this every time we store a date in the DB without a timezone. So modern storage is definitely subject to Y2Kish errors, because we try to be efficient with space used (though I bet it's over-optimizing in many cases, especially at the enterprise overdo-everything level).
On the other hand, we avoid Y2Kish errors on the application level because every time you work with, say, a Date (in Java, let's say) it always carries around a ton of baggage (like timezone). Why? Because Date (and many other concepts) are now part of the OS, so the OS-making smart dudes try to model a full-blown concept of date. Since we rely on their concept of date, we can't screw it up... and it's modular and replaceable!
Newer languages with built-in datatypes (and facilities) for many things like date, as well as huge memory to play with, help avoid a lot of potential Y2Kish problems.
It was two-part: 1) the age/longevity of COBOL software, and 2) the 80-character limit of data records.
First: most software of that age used only 2-digit numbers for year storage, since no one figured their software would last that long! COBOL had been adopted by the banking industry, who are notorious for never throwing away code. Most other software WAS thrown away, while the banks' code wasn't!
Secondly, because COBOL was constrained to 80 characters per record of data (due to the size of punch cards!), developers were under even greater pressure to limit the size of fields. And because they figured "the year 2000 won't be here till I'm long retired!", the 2 characters of saved data were huge!
It was much more related to storing the year in data items that could only hold values from 0 to 99 (two characters, or two decimal digits, or a single byte). That and calculations that made similar assumptions about year values.
It was hardly a Cobol-specific thing. Lots of programs were impacted.
There were some things about COBOL that aggravated the situation.
it's old, so less use of library code, more homegrown everything
it's old, so pre-internet, pre-social-networking, more NIH, fewer frameworks, more custom stuff
it's old, so less abstract, more likely to have open-coded solutions
it's old, so go back far enough and saving 2 bytes might have, sort of, been important
it's old, so it predates SQL. Legacy operating systems even had indexed record-oriented disk files to make rolling-your-own-database-in-every-program a little bit easier.
"printf" format strings and data type declarations were the same thing, so everything had a fixed number of digits
I've seen giant Fortran programs with no actual subroutines. Really, one 3,000-line main program, not a single non-library subroutine, that was it. I suppose this might have happened in the COBOL world, so now you have to read every line to find the date handling.
COBOL never came with any standard date handling library.
So everyone coded their own solution.
Some solutions were very bad vis-a-vis the millennium. Most of those bad solutions did not matter, as the applications did not live 40+ years. The not-so-tiny minority of bad solutions caused the well-known Y2K problem in the business world.
(Some solutions were better. I know of COBOL systems coded in the 1950s with a date format good until 2027 -- must have seemed forever at the time; and I designed systems in the 1970s that are good until 2079).
However, had COBOL had a standard date type....
03 ORDER-DATE PIC DATE.
....industry-wide solutions would have been available at the compiler level, cutting the complexity of any remediation needed.
Moral: use languages with standard libraries.
COBOL 85 (the 1985 standard) and earlier versions didn't have any way to obtain the current century**, which was one factor intrinsic to COBOL that discouraged the use of 4-digit years even after 2 bytes extra storage space was no longer an issue.
** Specific implementations may have had non standard extensions for this purpose.
The only intrinsic issue with COBOL was its original (late 1960s) standard statement for retrieving the current system date, which was:
ACCEPT todays-date FROM DATE
This returned a 6-digit number with the date in YYMMDD format.
However, even this was not necessarily a problem, as we wrote code in the 90's using this statement which just checked whether the year portion was less than 70 and assumed that the date was 20YY; that would have made it a Y2070 problem instead. :-)
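Sketched in C for brevity (the original would have been COBOL), that date "windowing" trick looks like this; it only postpones the problem, breaking again in 2070:
#include <stdio.h>

static int expand_year(int yy)               /* yy is the two-digit year, 0..99 */
{
    return yy < 70 ? 2000 + yy : 1900 + yy;  /* pivot at 70 */
}

int main(void)
{
    printf("%d %d %d\n", expand_year(69), expand_year(70), expand_year(99));
    /* prints: 2069 1970 1999 */
    return 0;
}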
The standard was extended later (COBOL-85, I think) so you could ask for the date in different formats, like:
ACCEPT todays-date FROM CENTURY-DATE
Which gave you an 8-digit number with the date as CCYYMMDD.
As you, and others have pointed out, many other computer programming languages allowed for 'lossy' representation of dates/years.
The problem was really about memory and storage constraints in the late 70s early 80s.
When your quarter-of-a-million-bucks computer had 128K of memory and 4 disks totalling about 6 megabytes, you could either ask your management for another quarter mill for a 256K machine with 12 meg of disk storage, or be very, very efficient about space.
So all sorts of space-saving tricks were used. My favourite was to store a YYMMDD date such as 991231 in a packed decimal field, x'9912310C', then knock off the last byte and store it as x'991231'. So instead of 6 bytes you only took up 3 bytes.
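A small C sketch of that packed-decimal idea (ignoring the sign nibble): two digits per byte, so the six characters of a YYMMDD date fit in three bytes.
#include <stdio.h>

int main(void)
{
    const char ymd[] = "991231";            /* two-digit year: the Y2K seed */
    unsigned char packed[3];

    for (int i = 0; i < 3; i++)             /* high nibble = first digit, low = second */
        packed[i] = (unsigned char)(((ymd[2*i] - '0') << 4) | (ymd[2*i + 1] - '0'));

    printf("%02X %02X %02X\n", packed[0], packed[1], packed[2]);  /* prints: 99 12 31 */
    return 0;
}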
Other tricks included some hokey table-lookup encoding for prices -- e.g. code 12 -> $19.99.

What is the most critical piece of code you have written and how did you approach it?

Put another way: what code have you written that cannot be allowed to fail? I'm interested in hearing from those who have worked on projects dealing with heart monitors, water testing, economic fundamentals, missile trajectories, or the O2 concentration on the space shuttle.
How did you prepare for writing this sort of code: methodologically, intellectually, and emotionally?
Edit
I've marked this wiki in case the rep issue is keeping people from replying. I thought there would be a good deal more perspective on this issue than there has been.
While I am not personally involved in what is described there, this article will hopefully contribute to the spirit of your question: They Write the Right Stuff.
I wrote a driver for a blood pressure measuring device for hospital use. If it "fails", the patient will not have his blood pressure checked at the scheduled time; if his blood pressure is abnormal, no alarm (in the larger system) will be triggered. Such an event could be clinically significant.
My approach was to thoroughly read the spec/documentation in a non-work environment (to avoid the temptation to start coding right away), then read it again at work. After that, I summarized the possible states and actions on paper and "flowcharted" an algorithm, and annotated all the potential real-world "bad events" (cables getting unplugged, batteries dying, etc). Finally, I wrote and rewrote the driver three times, each with different mechanisms (e.g. FSM), and compared their results. Each iteration helped me identify weaknesses I hadn't yet discovered. The third rewrite was the "official" result. I reviewed each iteration with my co-worker.
Emotional preparation consisted of convincing myself that should the unthinkable happen, at least I wasn't willfully negligent -- just incompetent (the old "I'm only human" excuse). ;-)
I have written a computer interface to an MRI machine. It had no chance of hurting the end user, as it was just record management, but it could potentially have given an incorrect diagnosis or omitted important information.
Tests, lots and lots of tests.
Unit tests, mid and high level tests. Simulate all possible input combinations. Also a great deal of testing with the hardware itself. Testing must be done in a complete and methodical way. It should take a great deal more time to test than to write.
Error Reporting
All errors must be reported and be obvious. If it won't hurt the patient to do so, fail fast.
For something that is actively keeping a person alive things are even worse. It must never stop working. If it fails it needs to restart and keep trying. Redundant internals are also a must in case the hardware fails.
At the wrong company it can really be a difficult kind of situation to work in. However, if things are going well, you are well funded and release pressure is not high, it can be a very rewarding space to work in.
Not really an answer, but:
I've got a friend who writes embedded control software for laser eye surgery machines. When he had laser eye surgery himself, he made sure to go to an ophthalmologist who used his company's system. I have great admiration for this guy. I can't think of a piece of software I've ever written whose level of quality was high enough that I'd trust my own eyesight to it.
Right now I'm working on some base code for a system that retrieves medical patient information from clinics and hospitals for a medical billing office. We're starting out with a smaller client and a long break-in period to ensure quality, but eventually this code needs to securely handle a large variety of report formats from a number of clients at different facilities.
It's not quite in the same scale as your examples, but a bad mistake could result in the wrong people being billed or the right person billed to a defunct address (screwing up credit reports) or open people up to identity theft, so it's still pretty critical. Oh yeah, and it could mean doctors don't get paid quite as quick. That's important, too, especially from a business perspective, but not in the same class as data protection and integrity.
I've heard crazy stories about the processes used to write code at NASA for the space shuttles. Every line of code has about 10-20 lines of documentation, along with tests, full revision history, etc. Every time a bug is found, not only is the code evaluated and repaired, but the entire procedure of writing code, the entire command chain, etc. is reviewed to answer the question: "What went wrong in our process that allowed this bug to get in in the first place?"
While nothing quite so important as an MRI machine or a blood pressure monitor, I did get tapped to do a rewrite of Blackjack when I worked for an online gambling provider. Blackjack is by far the most popular online game, and millions of dollars was going to go through this software (and did).
I wrote the game engine separate from the server and the client, and used Test Driven Development to ensure that what I was assuming was coming through in the results. I also had a wrapper "server" that had console output that would allow me to play. This was actually only useful in that it mimicked the real server interface, since playing a text version of blackjack isn't very fun or easy ("You draw a 10. You now have a 10 and a 6, while the dealer has a 6 showing. [bsd] >")
The game is still being run on some sites out there, and to my knowledge, has never had any financial bugs after years of play.
My first "real" software job was writing a GUI app for planning stereotactic brain surgery. Testing, testing, testing... absolutely no formal methods, engineering-style thoughts, just younger programmers cranking it out. When they started talking about using the software to control a robotic arm with a laser, without any serious engineering methods in place, i got a bit worried, left for more officey lands.
I created an information-system application for the local government culture and tourism department on the island of Bali, which was installed at several tourism destinations, providing extensive information about the culture, maps, accommodation, etc.
If it failed, then tourists probably couldn't get the information they needed most, might get cheated by brokers, or get lost somewhere :)
