How does a "stack overflow" occur and how do you prevent it? - memory

How does a stack overflow occur and what are the ways to make sure it doesn't happen, or ways to prevent one?

Stack
A stack, in this context, is the last in, first out buffer you place data while your program runs. Last in, first out (LIFO) means that the last thing you put in is always the first thing you get back out - if you push 2 items on the stack, 'A' and then 'B', then the first thing you pop off the stack will be 'B', and the next thing is 'A'.
When you call a function in your code, the next instruction after the function call is stored on the stack, and any storage space that might be overwritten by the function call. The function you call might use up more stack for its own local variables. When it's done, it frees up the local variable stack space it used, then returns to the previous function.
Stack overflow
A stack overflow is when you've used up more memory for the stack than your program was supposed to use. In embedded systems you might only have 256 bytes for the stack, and if each function takes up 32 bytes then you can only have function calls 8 deep - function 1 calls function 2 who calls function 3 who calls function 4 .... who calls function 8 who calls function 9, but function 9 overwrites memory outside the stack. This might overwrite memory, code, etc.
Many programmers make this mistake by calling function A that then calls function B, that then calls function C, that then calls function A. It might work most of the time, but just once the wrong input will cause it to go in that circle forever until the computer recognizes that the stack is overblown.
Recursive functions are also a cause for this, but if you're writing recursively (ie, your function calls itself) then you need to be aware of this and use static/global variables to prevent infinite recursion.
Generally, the OS and the programming language you're using manage the stack, and it's out of your hands. You should look at your call graph (a tree structure that shows from your main what each function calls) to see how deep your function calls go, and to detect cycles and recursion that are not intended. Intentional cycles and recursion need to be artificially checked to error out if they call each other too many times.
Beyond good programming practices, static and dynamic testing, there's not much you can do on these high level systems.
Embedded systems
In the embedded world, especially in high reliability code (automotive, aircraft, space) you do extensive code reviews and checking, but you also do the following:
Disallow recursion and cycles - enforced by policy and testing
Keep code and stack far apart (code in flash, stack in RAM, and never the twain shall meet)
Place guard bands around the stack - empty area of memory that you fill with a magic number (usually a software interrupt instruction, but there are many options here), and hundreds or thousands of times a second you look at the guard bands to make sure they haven't been overwritten.
Use memory protection (ie, no execute on the stack, no read or write just outside the stack)
Interrupts don't call secondary functions - they set flags, copy data, and let the application take care of processing it (otherwise you might get 8 deep in your function call tree, have an interrupt, and then go out another few functions inside the interrupt, causing the blowout). You have several call trees - one for the main processes, and one for each interrupt. If your interrupts can interrupt each other... well, there be dragons...
High-level languages and systems
But in high level languages run on operating systems:
Reduce your local variable storage (local variables are stored on the stack - although compilers are pretty smart about this and will sometimes put big locals on the heap if your call tree is shallow)
Avoid or strictly limit recursion
Don't break your programs up too far into smaller and smaller functions - even without counting local variables each function call consumes as much as 64 bytes on the stack (32 bit processor, saving half the CPU registers, flags, etc)
Keep your call tree shallow (similar to the above statement)
Web servers
It depends on the 'sandbox' you have whether you can control or even see the stack. Chances are good you can treat web servers as you would any other high level language and operating system - it's largely out of your hands, but check the language and server stack you're using. It is possible to blow the stack on your SQL server, for instance.

A stack overflow in real code occurs very rarely. Most situations in which it occurs are recursions where the termination has been forgotten. It might however rarely occur in highly nested structures, e.g. particularly large XML documents. The only real help here is to refactor the code to use an explicit stack object instead of the call stack.

Most people will tell you that a stack overflow occurs with recursion without an exit path - while mostly true, if you work with big enough data structures, even a proper recursion exit path won't help you.
Some options in this case:
Breadth-first search
Tail recursion, .Net-specific great blog post (sorry, 32-bit .Net)

Infinite recursion is a common way to get a stack overflow error. To prevent - always make sure there's an exit path that will be hit. :-)
Another way to get a stack overflow (in C/C++, at least) is to declare some enormous variable on the stack.
char hugeArray[100000000];
That'll do it.

Aside from the form of stack overflow that you get from a direct recursion (eg Fibonacci(1000000)), a more subtle form of it that I have experienced many times is an indirect recursion, where a function calls another function, which calls another, and then one of those functions calls the first one again.
This can commonly occur in functions that are called in response to events but which themselves may generate new events, for example:
void WindowSizeChanged(Size& newsize) {
// override window size to constrain width
newSize.width=200;
ResizeWindow(newSize);
}
In this case the call to ResizeWindow may cause the WindowSizeChanged() callback to be triggered again, which calls ResizeWindow again, until you run out of stack. In situations like these you often need to defer responding to the event until the stack frame has returned, eg by posting a message.

Usually a stack overflow is the result of an infinite recursive call (given the usual amount of memory in standard computers nowadays).
When you make a call to a method, function or procedure the "standard" way or making the call consists on:
Pushing the return direction for the call into the stack(that's the next sentence after the call)
Usually the space for the return value get reserved into the stack
Pushing each parameter into the stack (the order diverges and depends on each compiler, also some of them are sometimes stored on the CPU registers for performance improvements)
Making the actual call.
So, usually this takes a few bytes depeding on the number and type of the parameters as well as the machine architecture.
You'll see then that if you start making recursive calls the stack will begin to grow. Now, stack is usually reserved in memory in such a way that it grows in opposite direction to the heap so, given a big number of calls without "coming back" the stack begins to get full.
Now, on older times stack overflow could occur simply because you exausted all available memory, just like that. With the virtual memory model (up to 4GB on a X86 system) that was out of the scope so usually, if you get an stack overflow error, look for an infinite recursive call.

I have recreated the stack overflow issue while getting a most common Fibonacci number i.e. 1, 1, 2, 3, 5..... so calculation for fib(1) = 1 or fib(3) = 2.. fib(n) = ??.
for n, let say we will interested - what if n = 100,000 then what will be the corresponding Fibonacci number ??
The one loop approach is as below -
package com.company.dynamicProgramming;
import java.math.BigInteger;
public class FibonacciByBigDecimal {
public static void main(String ...args) {
int n = 100000;
BigInteger[] fibOfnS = new BigInteger[n + 1];
System.out.println("fibonacci of "+ n + " is : " + fibByLoop(n));
}
static BigInteger fibByLoop(int n){
if(n==1 || n==2 ){
return BigInteger.ONE;
}
BigInteger fib = BigInteger.ONE;
BigInteger fip = BigInteger.ONE;
for (int i = 3; i <= n; i++){
BigInteger p = fib;
fib = fib.add(fip);
fip = p;
}
return fib;
}
}
this quite straight forward and result is -
fibonacci of 100000 is : 2597406934722172416615503402127591541488048538651769658472477070395253454351127368626555677283671674475463758722307443211163839947387509103096569738218830449305228763853133492135302679278956701051276578271635608073050532200243233114383986516137827238124777453778337299916214634050054669860390862750996639366409211890125271960172105060300350586894028558103675117658251368377438684936413457338834365158775425371912410500332195991330062204363035213756525421823998690848556374080179251761629391754963458558616300762819916081109836526352995440694284206571046044903805647136346033000520852277707554446794723709030979019014860432846819857961015951001850608264919234587313399150133919932363102301864172536477136266475080133982431231703431452964181790051187957316766834979901682011849907756686456845066287392485603914047605199550066288826345877189410680370091879365001733011710028310473947456256091444932821374855573864080579813028266640270354294412104919995803131876805899186513425175959911520563155337703996941035518275274919959802257507902037798103089922984996304496255814045517000250299764322193462165366210841876745428298261398234478366581588040819003307382939500082132009374715485131027220817305432264866949630987914714362925554252624043999615326979876807510646819068792118299167964409178271868561702918102212679267401362650499784968843680975254700131004574186406448299485872551744746695651879126916993244564817673322257149314967763345846623830333820239702436859478287641875788572910710133700300094229333597292779191409212804901545976262791057055248158884051779418192905216769576608748815567860128818354354292307397810154785701328438612728620176653953444993001980062953893698550072328665131718113588661353747268458543254898113717660519461693791688442534259478126310388952047956594380715301911253964847112638900713362856910155145342332944128435722099628674611942095166100230974070996553190050815866991144544264788287264284501725332048648319457892039984893823636745618220375097348566847433887249049337031633826571760729778891798913667325190623247118037280173921572390822769228077292456662750538337500692607721059361942126892030256744356537800831830637593334502350256972906515285327194367756015666039916404882563967693079290502951488693413799125174856667074717514938979038653338139534684837808612673755438382110844897653836848318258836339917310455850905663846202501463131183108742907729262215943020429159474030610183981685506695026197376150857176119947587572212987205312060791864980361596092339594104118635168854883911918517906151156275293615849000872150192226511785315089251027528045151238603792184692121533829287136924321527332714157478829590260157195485316444794546750285840236000238344790520345108033282013803880708980734832620122795263360677366987578332625485944906021917368867786241120562109836985019729017715780112040458649153935115783499546100636635745448508241888279067531359950519206222976015376529797308588164873117308237059828489404487403932053592935976454165560795472477862029969232956138971989467942218727360512336559521133108778758228879597580320459608479024506385194174312616377510459921102486879496341706862092908893068525234805692599833377510390101316617812305114571932706629167125446512151746802548190358351688971707570677865618800822034683632101813026232996027599403579997774046244952114531588370357904483293150007246173417355805567832153454341170020258560809166294198637401514569572272836921963229511187762530753402594781448204657460288485500062806934811398276016855584079542162057543557291510641537592939022884356120792643705560062367986544382464373946972471945996555795505838034825597839682776084731530251788951718630722761103630509360074262261717363058613291544024695432904616258691774630578507674937487992329181750163484068813465534370997589353607405172909412697657593295156818624747127636468836551757018353417274662607306510451195762866349922848678780591085118985653555434958761664016447588028633629704046289097067736256584300235314749461233912068632146637087844699210427541569410912246568571204717241133378489816764096924981633421176857150311671040068175303192115415611958042570658693127276213710697472226029655524611053715554532499750843275200199214301910505362996007042963297805103066650638786268157658772683745128976850796366371059380911225428835839194121154773759981301921650952140133306070987313732926518169226845063443954056729812031546392324981793780469103793422169495229100793029949237507299325063050942813902793084134473061411643355614764093104425918481363930542369378976520526456347648318272633371512112030629233889286487949209737847861884868260804647319539200840398308008803869049557419756219293922110825766397681361044490024720948340326796768837621396744075713887292863079821849314343879778088737958896840946143415927131757836511457828935581859902923534388888846587452130838137779443636119762839036894595760120316502279857901545344747352706972851454599861422902737291131463782045516225447535356773622793648545035710208644541208984235038908770223039849380214734809687433336225449150117411751570704561050895274000206380497967960402617818664481248547269630823473377245543390519841308769781276565916764229022948181763075710255793365008152286383634493138089971785087070863632205869018938377766063006066757732427272929247421295265000706646722730009956124191409138984675224955790729398495608750456694217771551107346630456603944136235888443676215273928597072287937355966723924613827468703217858459948257514745406436460997059316120596841560473234396652457231650317792833860590388360417691428732735703986803342604670071717363573091122981306903286137122597937096605775172964528263757434075792282180744352908669606854021718597891166333863858589736209114248432178645039479195424208191626088571069110433994801473013100869848866430721216762473119618190737820766582968280796079482259549036328266578006994856825300536436674822534603705134503603152154296943991866236857638062351209884448741138600171173647632126029961408561925599707566827866778732377419444462275399909291044697716476151118672327238679208133367306181944849396607123345271856520253643621964198782752978813060080313141817069314468221189275784978281094367751540710106350553798003842219045508482239386993296926659221112742698133062300073465628498093636693049446801628553712633412620378491919498600097200836727876650786886306933418995225768314390832484886340318940194161036979843833346608676709431643653538430912157815543512852077720858098902099586449602479491970687230765687109234380719509824814473157813780080639358418756655098501321882852840184981407690738507369535377711880388528935347600930338598691608289335421147722936561907276264603726027239320991187820407067412272258120766729040071924237930330972132364184093956102995971291799828290009539147382437802779051112030954582532888721146170133440385939654047806199333224547317803407340902512130217279595753863158148810392952475410943880555098382627633127606718126171022011356181800775400227516734144169216424973175621363128588281978005788832454534581522434937268133433997710512532081478345067139835038332901313945986481820272322043341930929011907832896569222878337497354301561722829115627329468814853281922100752373626827643152685735493223028018101449649009015529248638338885664893002250974343601200814365153625369199446709711126951966725780061891215440222487564601554632812091945824653557432047644212650790655208208337976071465127508320487165271577472325887275761128357592132553934446289433258105028633583669291828566894736223508250294964065798630809614341696830467595174355313224362664207197608459024263017473392225291248366316428006552870975051997504913009859468071013602336440164400179188610853230764991714372054467823597211760465153200163085336319351589645890681722372812310320271897917951272799656053694032111242846590994556380215461316106267521633805664394318881268199494005537068697621855231858921100963441012933535733918459668197539834284696822889460076352031688922002021931318369757556962061115774305826305535862015637891246031220672933992617378379625150999935403648731423208873977968908908369996292995391977217796533421249291978383751460062054967341662833487341011097770535898066498136011395571584328308713940582535274056081011503907941688079197212933148303072638678631411038443128215994936824342998188719768637604496342597524256886188688978980888315865076262604856465004322896856149255063968811404400429503894245872382233543101078691517328333604779262727765686076177705616874050257743749983775830143856135427273838589774133526949165483929721519554793578923866762502745370104660909382449626626935321303744538892479216161188889702077910448563199514826630802879549546453583866307344423753319712279158861707289652090149848305435983200771326653407290662016775706409690183771201306823245333477966660525325490873601961480378241566071271650383582257289215708209369510995890132859490724306183325755201208090007175022022949742801823445413711916298449914722254196594682221468260644961839254249670903104007581488857971672246322887016438403908463856731164308169537326790303114583680575021119639905615169154708510459700542098571797318015564741406172334145847111268547929892443001391468289103679179216978616582489007322033591376706527676521307143985302760988478056216994659655461379174985659739227379416726495377801992098355427866179123126699374730777730569324430166839333011554515542656864937492128687049121754245967831132969248492466744261999033972825674873460201150442228780466124320183016108232183908654771042398228531316559685688005226571474428823317539456543881928624432662503345388199590085105211383124491861802624432195540433985722841341254409411771722156867086291742124053110620522842986199273629406208834754853645128123279609097213953775360023076765694208219943034648783348544492713539450224591334374664937701655605763384697062918725745426505879414630176639760457474311081556747091652708748125267159913793240527304613693961169892589808311906322510777928562071999459487700611801002296132304588294558440952496611158342804908643860880796440557763691857743754025896855927252514563404385217825890599553954627451385454452916761042969267970893580056234501918571489030418495767400819359973218711957496357095967825171096264752068890806407651445893132870767454169607107931692704285168093413311046353506242209810363216771910420786162184213763938194625697286781413636389620123976910465418956806197323148414224550071617215851321302030684176087215892702098879108938081045903397276547326416916845445627600759561367103584575649094430692452532085003091068783157561519847567569191284784654692558665111557913461272425336083635131342183905177154511228464455136016013513228948543271504760839307556100908786096663870612278690274831819331606701484957163004705262228238406266818448788374548131994380387613830128859885264201992286188208499588640888521352501457615396482647451025902530743172956899636499615707551855837165935367125448515089362904567736630035562457374779100987992499146967224041481601289530944015488942613783140087804311431741858071826185149051138744831358439067228949408258286021650288927228387426432786168690381960530155894459451808735197246008221529343980828254126128257157209350985382800738560472910941184006084485235377833503306861977724501886364070344973366473100602018128792886991861824418453968994777259482169137133647470453172979809245844361129618997595696240971845564020511432589591844724920942930301651488713079802102379065536525154780298059407529440513145807551537794861635879901158192019808879694967187448224156836463534326160242632934761634458163890163805123894184523973421841496889262398489648642093409816681494771155177009562669029850101513537599801272501241971119871526593747484778935488777815192931171431167444773882941064615028751327709474504763922874890662989841540259350834035142035136168819248238998027706666916342133424312054507359388616687691188185776118135771332483965209882085982391298606386822804754362408956522921410859852037330544625953261340234864689275060526893755148403298542086991221052597005628576707702567695300978970046408920009852106980295419699802138053295798159478289934443245491565327845223840551240445208226435420656313310702940722371552770504263482073984454889589248861397657079145414427653584572951329719091947694411910966797474262675590953832039169673494261360032263077428684105040061351052194413778158095005714526846009810352109249040027958050736436961021241137739717164869525493114805040126568351268829598413983222676377804500626507241731757395219796890754825199329259649801627068665658030178877405615167159731927320479376247375505855052839660294566992522173600874081212014209071041937598571721431338017425141582491824710905084715977249417049320254165239323233258851588893337097136310892571531417761978326033750109026284066415801371359356529278088456305951770081443994114674291850360748852366654744869928083230516815711602911836374147958492100860528981469547750812338896943152861021202736747049903930417035171342126923486700566627506229058636911882228903170510305406882096970875545329369434063981297696478031825451642178347347716471058423238594580183052756213910186997604305844068665712346869679456044155742100039179758348979935882751881524675930878928159243492197545387668305684668420775409821781247053354523194797398953320175988640281058825557698004397120538312459428957377696001857497335249965013509368925958021863811725906506436882127156815751021712900765992750370228283963962915973251173418586721023497317765969454283625519371556009143680329311962842546628403142444370648432390374906410811300792848955767243481200090309888457270907750873638873299642555050473812528975962934822878917619920725138309388288292510416837622758204081918933603653875284116785703720989718832986921927816629675844580174911809119663048187434155067790863948831489241504300476704527971283482211522202837062857314244107823792513645086677566622804977211397140621664116324756784216612961477109018826094677377686406176721484293894976671380122788941309026553511096118347012565197540807095384060916863936906673786627209429434264260402902158317345003727462588992622049877121178405563348492490326003508569099382392777297498413565614830788262363322368380709822346012274241379036473451735925215754757160934270935192901723954921426490691115271523338109124042812102893738488167358953934508930697715522989199698903885883275409044300321986834003470271220020159699371690650330547577095398748580670024491045504890061727189168031394528036165633941571334637222550477547460756055024108764382121688848916940371258901948490685379722244562009483819491532724502276218589169507405794983759821006604481996519360110261576947176202571702048684914616894068404140833587562118319210838005632144562018941505945780025318747471911604840677997765414830622179069330853875129298983009580277554145435058768984944179136535891620098725222049055183554603706533183176716110738009786625247488691476077664470147193074476302411660335671765564874440577990531996271632972009109449249216456030618827772947750764777446452586328919159107444252320082918209518021083700353881330983215894608680127954224752071924134648334963915094813097541433244209299930751481077919002346128122330161799429930618800533414550633932139339646861616416955220216447995417243171165744471364197733204899365074767844149929548073025856442942381787641506492878361767978677158510784235702640213388018875601989234056868423215585628508645525258377010620532224244987990625263484010774322488172558602233302076399933854152015343847725442917895130637050320444917797752370871958277976799686113626532291118629631164685159934660693460557545956063155830033697634000276685151293843638886090828376141157732003527565158745906567025439437931104838571313294490604926582363108949535090082673154497226396648088618041573977888472892174618974189721700770009862449653759012727015227634510874906948012210684952063002519011655963580552429180205586904259685261047412834518466736938580027700252965356366721619883672428226933950325930390994583168665542234654857020875504617520521853721567282679903418135520602999895366470106557900532129541336924472492212436324523042895188461779122338069674233980694887270587503389228395095135209123109258159006960395156367736067109050566299603571876423247920752836160805597697778756476767210521222327184821484446631261487584226092608875764331731023263768864822594691211032367737558122133470556805958008310127481673962019583598023967414489867276845869819376783757167936723213081586191045995058970991064686919463448038574143829629547131372173669836184558144505748676124322451519943362182916191468026091121793001864788050061351603144350076189213441602488091741051232290357179205497927970924502479940842696158818442616163780044759478212240873204124421169199805572649118243661921835714762891425805771871743688000324113008704819373962295017143090098476927237498875938639942530595331607891618810863505982444578942799346514915952884869757488025823353571677864826828051140885429732788197765736966005727700162592404301688659946862983717270595809808730901820120931003430058796552694788049809205484305467611034654748067290674399763612592434637719995843862812391985470202414880076880818848087892391591369463293113276849329777201646641727587259122354784480813433328050087758855264686119576962172239308693795757165821852416204341972383989932734803429262340722338155102209101262949249742423271698842023297303260161790575673111235465890298298313115123607606773968998153812286999642014609852579793691246016346088762321286205634215901479188632194659637483482564291616278532948239313229440231043277288768139550213348266388687453259281587854503890991561949632478855035090289390973718988003999026132015872678637873095678109625311008054489418857983565902063680699643165033912029944327726770869305240718416592070096139286401966725750087012218149733133695809600369751764951350040285926249203398111014953227533621844500744331562434532484217986108346261345897591234839970751854223281677187215956827243245910829019886390369784542622566912542747056097567984857136623679023878478161201477982939080513150258174523773529510165296934562786122241150783587755373348372764439838082000667214740034466322776918936967612878983488942094688102308427036452854504966759697318836044496702853190637396916357980928865719935397723495486787180416401415281489443785036291071517805285857583987711145474240156416477194116391354935466755593592608849200546384685403028080936417250583653368093407225310820844723570226809826951426162451204040711501448747856199922814664565893938488028643822313849852328452360667045805113679663751039248163336173274547275775636810977344539275827560597425160705468689657794530521602315939865780974801515414987097778078705357058008472376892422189750312758527140173117621279898744958406199843913365680297721208751934988504499713914285158032324823021340630312586072624541637765234505522051086318285359658520708173392709566445011404055106579055037417780393351658360904543047721422281816832539613634982525215232257690920254216409657452618066051777901592902884240599998882753691957540116954696152270401280857579766154722192925655963991820948894642657512288766330302133746367449217449351637104725732980832812726468187759356584218383594702792013663907689741738962252575782663990809792647011407580367850599381887184560094695833270775126181282015391041773950918244137561999937819240362469558235924171478702779448443108751901807414110290370706052085162975798361754251041642244867577350756338018895379263183389855955956527857227926155524494739363665533904528656215464288343162282921123290451842212532888101415884061619939195042230059898349966569463580186816717074818823215848647734386780911564660755175385552224428524049468033692299989300783900020690121517740696428573930196910500988278523053797637940257968953295112436166778910585557213381789089945453947915927374958600268237844486872037243488834616856290097850532497036933361942439802882364323553808208003875741710969289725499878566253048867033095150518452126944989251596392079421452606508516052325614861938282489838000815085351564642761700832096483117944401971780149213345335903336672376719229722069970766055482452247416927774637522135201716231722137632445699154022395494158227418930589911746931773776518735850032318014432883916374243795854695691221774098948611515564046609565094538115520921863711518684562543275047870530006998423140180169421109105925493596116719457630962328831271268328501760321771680400249657674186927113215573270049935709942324416387089242427584407651215572676037924765341808984312676941110313165951429479377670698881249643421933287404390485538222160837088907598277390184204138197811025854537088586701450623578513960109987476052535450100439353062072439709976445146790993381448994644609780957731953604938734950026860564555693224229691815630293922487606470873431166384205442489628760213650246991893040112513103835085621908060270866604873585849001704200923929789193938125116798421788115209259130435572321635660895603514383883939018953166274355609970015699780289236362349895374653428746875
Now another approach I have applied is through Divide and Concur via recursion
i.e. Fib(n) = fib(n-1) + Fib(n-2) and then further recursion for n-1 & n-2.....till 2 & 1. which is programmed as -
package com.company.dynamicProgramming;
import java.math.BigInteger;
public class FibonacciByBigDecimal {
public static void main(String ...args) {
int n = 100000;
BigInteger[] fibOfnS = new BigInteger[n + 1];
System.out.println("fibonacci of "+ n + " is : " + fibByDivCon(n, fibOfnS));
}
static BigInteger fibByDivCon(int n, BigInteger[] fibOfnS){
if(fibOfnS[n]!=null){
return fibOfnS[n];
}
if (n == 1 || n== 2){
fibOfnS[n] = BigInteger.ONE;
return BigInteger.ONE;
}
// creates 2 further entries in stack
BigInteger fibOfn = fibByDivCon(n-1, fibOfnS).add( fibByDivCon(n-2, fibOfnS)) ;
fibOfnS[n] = fibOfn;
return fibOfn;
}
}
When i ran the code for n = 100,000 the result is as below -
Exception in thread "main" java.lang.StackOverflowError
at com.company.dynamicProgramming.FibonacciByBigDecimal.fibByDivCon(FibonacciByBigDecimal.java:29)
at com.company.dynamicProgramming.FibonacciByBigDecimal.fibByDivCon(FibonacciByBigDecimal.java:29)
at com.company.dynamicProgramming.FibonacciByBigDecimal.fibByDivCon(FibonacciByBigDecimal.java:29)
Above you can see the StackOverflowError is created. Now the reason for this is too many recursion as -
// creates 2 further entries in stack
BigInteger fibOfn = fibByDivCon(n-1, fibOfnS).add( fibByDivCon(n-2, fibOfnS)) ;
So each entry in stack create 2 more entries and so on... which is represented as -
Eventually so many entries will be created that system is unable to handle in the stack and StackOverflowError thrown.
For Prevention :
For Above example perspective
Avoid using recursion approach or reduce/limit the recursion by again one level division like if n is too large then split the n so that system can handle with in its limit.
Use other approach, like the loop approach I have used in 1st code sample. (I am not at all intended to degrade Divide & Concur or Recursion as they are legendary approaches in many most famous algorithms.. my intention is to limit or stay away from recursion if I suspect stack overflow issues)

Considering this was tagged with "hacking", I suspect the "stack overflow" he's referring to is a call stack overflow, rather than a higher level stack overflow such as those referenced in most other answers here. It doesn't really apply to any managed or interpreted environments such as .NET, Java, Python, Perl, PHP, etc, which web apps are typically written in, so your only risk is the web server itself, which is probably written in C or C++.
Check out this thread:
https://stackoverflow.com/questions/7308/what-is-a-good-starting-point-for-learning-buffer-overflow

Stack overflow occurs when your program uses up the entire stack. The most common way this happens is when your program has a recursive function which calls itself forever. Every new call to the recursive function takes more stack until eventually your program uses up the entire stack.

Related

FreeRTOS - The reason for the stack increase?

I have created several tasks, the body of which is the same function. Inside the function, there is a delay that is the same for every task. So, when this delay is large enough, the stack of each task is filled with 6 words less than when the delay is less.
As far as I understand, the stack of tasks increases when there is a situation with several tasks with the status Ready.
1.In this situation, some additional 6-word context is written to the task stack?
2.In my example, it turns out 6 words (24 bytes), can this value change somehow?
3.What else can affect the increase in the stack, such as jumping into an interrupt handler?
int main(void){
xTaskCreate(Task_PrintCountString, "Task_1", mySTACK_SIZE, &param1, 1, NUUL);
xTaskCreate(Task_PrintCountString, "Task_2", mySTACK_SIZE, &param2, 1, NUUL);
xTaskCreate(Task_PrintCountString, "Task_3", mySTACK_SIZE, &param3, 1, NUUL);
vTaskStartScheduler();
}
#define myDELAY 100
void Task_PrintCountString(void *pParams){
uint16_t c=0;
for(;;){
if(xSemaphoreTake(WriteCountMutex, portMAX_DELAY) == pdTRUE){
PrintCountString(*(uint8_t *)pParams, c++);
xSemaphoreGive(WriteCountMutex);
}
vTaskDelay(myDELAY/portTICK_PERIOD_MS);//When myDELAY=1, the task stack is 6 words more than when myDELAY= 100!
}
}
Address 0x20000158 is the top of the stack for one of the tasks. Similarly, others!
The only function PrintCountString always the same depth. But at the same time, the stack can grow to its maximum knowledge in several stages, going through dozens of iterations of the task cycle! It turns out that not only the context is saved to the stack, but something else?
P.S.
I use ARM CM0 port and heap_1.
I made an observation:
In the PrintCountString function there was a block for waiting for the SPI flag - while(). I replaced the DMA transfer method and removed the while() loop. The stack of both tasks began to fill up immediately to the maximum value, regardless of delays.
FreeRTOS is just C code so, with one exception, the stack usage is determined by the compiler, including the selected optimisation level. The one exception be that when a task is not running its context is saved to its stack. The context size is fixed in all cases that matter. The size of the context depends on which FreeRTOS port you are using (there are more than 40).
There are a few things in your post I'm not sure about.
when this delay is large enough, the stack of each task is filled with
6 words less than when the delay is less
As above, the C code doesn't change depending on how long a delay you have - the compiler knows nothing about FreeRTOS or the concept of a delay. You may see different stack depths depending on where the code was in the call tree when an interrupt occurs (including interrupts that cause context switches).
I look at the top of the stack and count the remaining number of bytes
to be 0xA5
You don't say which port you are using, so I don't know if the stack grows up or down. In any case the stack is filled with 0xa5 when the task is created but never touched by the kernel again so the number of 0xa5s left shows the maximum amount of stack the task has used since it was created. That value is returned by calling uxTaskGetStackHighWaterMark().
In general the stack memory is used by the microcontroller itself. There are several instructions which writes/reads some data to/from the stack.
The PUSH instruction copies the content of one register to the stack and modifies the stack pointer register
The POP instruction reads a word from the stack into a register and modifies the stack pointer register
Both instructions are managed by the compiler. If a functions is executed the compiler creates some PUSH instructions to "free" some registers to work with. At the end of the function the original register state will be restored using the POP instructions to guarantee the upcoming control flow.
and each branch instruction stores several register values to the stack
The amount of used stack memory depends on the call hierarchy.
According to your example, if the SemaphoreTake function is called and finds a released semaphore the call hierarchy is different from a situation where the semaphore is blocked. This creates a different stack usage.

Which scenes keyword "volatile" is needed to declare in objective-c?

As i know, volatile is usually used to prevent unexpected compile optimization during some hardware operations. But which scenes volatile should be declared in property definition puzzles me. Please give some representative examples.
Thx.
A compiler assumes that the only way a variable can change its value is through code that changes it.
int a = 24;
Now the compiler assumes that a is 24 until it sees any statement that changes the value of a. If you write code somewhere below above statement that says
int b = a + 3;
the compiler will say "I know what a is, it's 24! So b is 27. I don't have to write code to perform that calculation, I know that it will always be 27". The compiler may just optimize the whole calculation away.
But the compiler would be wrong in case a has changed between the assignment and the calculation. However, why would a do that? Why would a suddenly have a different value? It won't.
If a is a stack variable, it cannot change value, unless you pass a reference to it, e.g.
doSomething(&a);
The function doSomething has a pointer to a, which means it can change the value of a and after that line of code, a may not be 24 any longer. So if you write
int a = 24;
doSomething(&a);
int b = a + 3;
the compiler will not optimize the calculation away. Who knows what value a will have after doSomething? The compiler for sure doesn't.
Things get more tricky with global variables or instance variables of objects. These variables are not on stack, they are on heap and that means that different threads can have access to them.
// Global Scope
int a = 0;
void function ( ) {
a = 24;
b = a + 3;
}
Will b be 27? Most likely the answer is yes, but there is a tiny chance that some other thread has changed the value of a between these two lines of code and then it won't be 27. Does the compiler care? No. Why? Because C doesn't know anything about threads - at least it didn't used to (the latest C standard finally knows native threads, but all thread functionality before that was only API provided by the operating system and not native to C). So a C compiler will still assume that b is 27 and optimize the calculation away, which may lead to incorrect results.
And that's what volatile is good for. If you tag a variable volatile like that
volatile int a = 0;
you are basically telling the compiler: "The value of a may change at any time. No seriously, it may change out of the blue. You don't see it coming and *bang*, it has a different value!". For the compiler that means it must not assume that a has a certain value just because it used to have that value 1 pico-second ago and there was no code that seemed to have changed it. Doesn't matter. When accessing a, always read its current value.
Overuse of volatile prevents a lot of compiler optimizations, may slow down calculation code dramatically and very often people use volatile in situations where it isn't even necessary. For example, the compiler never makes value assumptions across memory barriers. What exactly a memory barrier is? Well, that's a bit far beyond the scope of my reply. You just need to know that typical synchronization constructs are memory barriers, e.g. locks, mutexes or semaphores, etc. Consider this code:
// Global Scope
int a = 0;
void function ( ) {
a = 24;
pthread_mutex_lock(m);
b = a + 3;
pthread_mutex_unlock(m);
}
pthread_mutex_lock is a memory barrier (pthread_mutex_unlock as well, by the way) and thus it's not necessary to declare a as volatile, the compiler will not make an assumption of the value of a across a memory barrier, never.
Objective-C is pretty much like C in all these aspects, after all it's just a C with extensions and a runtime. One thing to note is that atomic properties in Obj-C are memory barriers, so you don't need to declare properties volatile. If you access the property from multiple threads, declare it atomic, which is even default by the way (if you don't mark it nonatomic, it will be atomic). If you never access it from multiple thread, tagging it nonatomic will make access to that property a lot faster, but that only pays off if you access the property really a lot (a lot doesn't mean ten times a minute, it's rather several thousand times a second).
So you want Obj-C code, that requires volatile?
#implementation SomeObject {
volatile bool done;
}
- (void)someMethod {
done = false;
// Start some background task that performes an action
// and when it is done with that action, it sets `done` to true.
// ...
// Wait till the background task is done
while (!done) {
// Run the runloop for 10 ms, then check again
[[NSRunLoop currentRunLoop]
runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.01]
];
}
}
#end
Without volatile, the compiler may be dumb enough to assume, that done will never change here and replace !done simply with true. And while (true) is an endless loop that will never terminate.
I haven't tested that with modern compilers. Maybe the current version of clang is more intelligent than that. It may also depend on how you start the background task. If you dispatch a block, the compiler can actually easily see whether it changes done or not. If you pass a reference to done somewhere, the compiler knows that the receiver may the value of done and will not make any assumptions. But I tested exactly that code a long time ago when Apple was still using GCC 2.x and there not using volatile really caused an endless loop that never terminated (yet only in release builds with optimizations enabled, not in debug builds). So I would not rely on the compiler being clever enough to do it right.
Just some more fun facts about memory barriers:
If you ever had a look at the atomic operations that Apple offers in <libkern/OSAtomic.h>, then you might have wondered why every operation exists twice: Once as x and once as xBarrier (e.g. OSAtomicAdd32 and OSAtomicAdd32Barrier). Well, now you finally know it. The one with "Barrier" in its name is a memory barrier, the other one isn't.
Memory barriers are not just for compilers, they are also for CPUs (there exists CPU instructions, that are considered memory barriers while normal instructions are not). The CPU needs to know these barriers because CPUs like to reorder instructions to perform operations out of order. E.g. if you do
a = x + 3 // (1)
b = y * 5 // (2)
c = a + b // (3)
and the pipeline for additions is busy, but the pipeline for multiplication is not, the CPU may perform instruction (2) before (1), after all the order won't matter in the end. This prevents a pipeline stall. Also the CPU is clever enough to know that it cannot perform (3) before either (1) or (2) because the result of (3) depends on the results of the other two calculations.
Yet, certain kinds of order changes will break the code, or the intention of the programmer. Consider this example:
x = y + z // (1)
a = 1 // (2)
The addition pipe might be busy, so why not just perform (2) before (1)? They don't depend on each other, the order shouldn't matter, right? Well, it depends. Consider another thread monitors a for changes and as soon as a becomes 1, it reads the value of x, which should now be y+z if the instructions were performed in order. Yet if the CPU reordered them, then x will have whatever value it used to have before getting to this code and this makes a difference as the other thread will now work with a different value, not the value the programmer would have expected.
So in this case the order will matter and that's why barriers are needed also for CPUs: CPUs don't order instructions across such barriers and thus instruction (2) would need to be a barrier instruction (or there needs to be such an instruction between (1) and (2); that depends on the CPU). However, reordering instructions is only performed by modern CPUs, a much older problem are delayed memory writes. If a CPU delays memory writes (very common for some CPUs, as memory access is horribly slow for a CPU), it will make sure that all delayed writes are performed and have completed before a memory barrier is crossed, so all memory is in a correct state in case another thread might now access it (and now you also know where the name "memory barrier" actually comes from).
You are probably working a lot more with memory barriers than you are even aware of (GCD - Grand Central Dispatch is full of these and NSOperation/NSOperationQueue bases on GCD), that's why your really need to use volatile only in very rare, exceptional cases. You might get away writing 100 apps and never have to use it even once. However, if you write a lot low level, multi-threading code that aims to achieve maximum performance possible, you will sooner or later run into a situation where only volatile can grantee you correct behavior; not using it in such a situation will lead to strange bugs where loops don't seem to terminate or variables simply seem to have incorrect values and you find no explanation for that. If you run into bugs like these, especially if you only see them in release builds, you might miss a volatile or a memory barrier somewhere in your code.
A good explanation is given here: Understanding “volatile” qualifier in C
The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler.
Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time. The system always reads the current value of a volatile object from the memory location rather than keeping its value in temporary register at the point it is requested, even if a previous instruction asked for a value from the same object. So the simple question is, how can value of a variable change in such a way that compiler cannot predict. Consider the following cases for answer to this question.
1) Global variables modified by an interrupt service routine outside the scope: For example, a global variable can represent a data port (usually global pointer referred as memory mapped IO) which will be updated dynamically. The code reading data port must be declared as volatile in order to fetch latest data available at the port. Failing to declare variable as volatile, the compiler will optimize the code in such a way that it will read the port only once and keeps using the same value in a temporary register to speed up the program (speed optimization). In general, an ISR used to update these data port when there is an interrupt due to availability of new data
2) Global variables within a multi-threaded application: There are multiple ways for threads communication, viz, message passing, shared memory, mail boxes, etc. A global variable is weak form of shared memory. When two threads sharing information via global variable, they need to be qualified with volatile. Since threads run asynchronously, any update of global variable due to one thread should be fetched freshly by another consumer thread. Compiler can read the global variable and can place them in temporary variable of current thread context. To nullify the effect of compiler optimizations, such global variables to be qualified as volatile
If we do not use volatile qualifier, the following problems may arise
1) Code may not work as expected when optimization is turned on.
2) Code may not work as expected when interrupts are enabled and used.
volatile comes from C. Type "C language volatile" into your favourite search engine (some of the results will probably come from SO), or read a book on C programming. There are plenty of examples out there.

What are the advantages of the "apply" functions? When are they better to use than "for" loops, and when are they not? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Is R's apply family more than syntactic sugar
Just what the title says. Stupid question, perhaps, but my understanding has been that when using an "apply" function, the iteration is performed in compiled code rather than in the R parser. This would seem to imply that lapply, for instance, is only faster than a "for" loop if there are a great many iterations and each operation is relatively simple. For instance, if a single call to a function wrapped up in lapply takes 10 seconds, and there are only, say, 12 iterations of it, I would imagine that there's virtually no difference at all between using "for" and "lapply".
Now that I think of it, if the function inside the "lapply" has to be parsed anyway, why should there be ANY performance benefit from using "lapply" instead of "for" unless you're doing something that there are compiled functions for (like summing or multiplying, etc)?
Thanks in advance!
Josh
There are several reasons why one might prefer an apply family function over a for loop, or vice-versa.
Firstly, for() and apply(), sapply() will generally be just as quick as each other if executed correctly. lapply() does more of it's operating in compiled code within the R internals than the others, so can be faster than those functions. It appears the speed advantage is greatest when the act of "looping" over the data is a significant part of the compute time; in many general day-to-day uses you are unlikely to gain much from the inherently quicker lapply(). In the end, these all will be calling R functions so they need to be interpreted and then run.
for() loops can often be easier to implement, especially if you come from a programming background where loops are prevalent. Working in a loop may be more natural than forcing the iterative computation into one of the apply family functions. However, to use for() loops properly, you need to do some extra work to set-up storage and manage plugging the output of the loop back together again. The apply functions do this for you automagically. E.g.:
IN <- runif(10)
OUT <- logical(length = length(IN))
for(i in IN) {
OUT[i] <- IN > 0.5
}
that is a silly example as > is a vectorised operator but I wanted something to make a point, namely that you have to manage the output. The main thing is that with for() loops, you always allocate sufficient storage to hold the outputs before you start the loop. If you don't know how much storage you will need, then allocate a reasonable chunk of storage, and then in the loop check if you have exhausted that storage, and bolt on another big chunk of storage.
The main reason, in my mind, for using one of the apply family of functions is for more elegant, readable code. Rather than managing the output storage and setting up the loop (as shown above) we can let R handle that and succinctly ask R to run a function on subsets of our data. Speed usually does not enter into the decision, for me at least. I use the function that suits the situation best and will result in simple, easy to understand code, because I'm far more likely to waste more time than I save by always choosing the fastest function if I can't remember what the code is doing a day or a week or more later!
The apply family lend themselves to scalar or vector operations. A for() loop will often lend itself to doing multiple iterated operations using the same index i. For example, I have written code that uses for() loops to do k-fold or bootstrap cross-validation on objects. I probably would never entertain doing that with one of the apply family as each CV iteration needs multiple operations, access to lots of objects in the current frame, and fills in several output objects that hold the output of the iterations.
As to the last point, about why lapply() can possibly be faster that for() or apply(), you need to realise that the "loop" can be performed in interpreted R code or in compiled code. Yes, both will still be calling R functions that need to be interpreted, but if you are doing the looping and calling directly from compiled C code (e.g. lapply()) then that is where the performance gain can come from over apply() say which boils down to a for() loop in actual R code. See the source for apply() to see that it is a wrapper around a for() loop, and then look at the code for lapply(), which is:
> lapply
function (X, FUN, ...)
{
FUN <- match.fun(FUN)
if (!is.vector(X) || is.object(X))
X <- as.list(X)
.Internal(lapply(X, FUN))
}
<environment: namespace:base>
and you should see why there can be a difference in speed between lapply() and for() and the other apply family functions. The .Internal() is one of R's ways of calling compiled C code used by R itself. Apart from a manipulation, and a sanity check on FUN, the entire computation is done in C, calling the R function FUN. Compare that with the source for apply().
From Burns' R Inferno (pdf), p25:
Use an explicit for loop when each
iteration is a non-trivial task. But a
simple loop can be more clearly and
compactly expressed using an apply
function. There is at least one
exception to this rule ... if the result will
be a list and some of the components
can be NULL, then a for loop is
trouble (big trouble) and lapply gives
the expected answer.

Do recursive sequences leak memory?

I like to define sequences recursively as follows:
let rec startFrom x =
seq {
yield x;
yield! startFrom (x + 1)
}
I'm not sure if recursive sequences like this should be used in practice. The yield! appears to be tail recursive, but I'm not 100% sure since its being called from inside another IEnumerable. From my point of view, the code creates an instance of IEnumerable on each call without closing it, which would actually make this function leak memory as well.
Will this function leak memory? For that matter is it even "tail recursive"?
[Edit to add]: I'm fumbling around with NProf for an answer, but I think it would be helpful to get a technical explanation regarding the implementation of recursive sequences on SO.
I'm at work now so I'm looking at slightly newer bits than Beta1, but on my box in Release mode and then looking at the compiled code with .Net Reflector, it appears that these two
let rec startFromA x =
seq {
yield x
yield! startFromA (x + 1)
}
let startFromB x =
let z = ref x
seq {
while true do
yield !z
incr z
}
generate almost identical MSIL code when compiled in 'Release' mode. And they run at about the same speed as this C# code:
public class CSharpExample
{
public static IEnumerable<int> StartFrom(int x)
{
while (true)
{
yield return x;
x++;
}
}
}
(e.g. I ran all three versions on my box and printed the millionth result, and each version took about 1.3s, +/- 1s). (I did not do any memory profiling; it's possible I'm missing something important.)
In short, I would not sweat too much thinking about issues like this unless you measure and see a problem.
EDIT
I realize I didn't really answer the question... I think the short answer is "no, it does not leak". (There is a special sense in which all 'infinite' IEnumerables (with a cached backing store) 'leak' (depending on how you define 'leak'), see
Avoiding stack overflow (with F# infinite sequences of sequences)
for an interesting discussion of IEnumerable (aka 'seq') versus LazyList and how the consumer can eagerly consume LazyLists to 'forget' old results to prevent a certain kind of 'leak'.)
.NET applications don't "leak" memory in this way. Even if you are creating many objects a garbage collection will free any objects that have no roots to the application itself.
Memory leaks in .NET usually come in the form of unmanaged resources that you are using in your application (database connections, memory streams, etc.). Instances like this in which you create multiple objects and then abandon them are not considered a memory leak since the garbage collector is able to free the memory.
It won't leak any memory, it'll just generate an infinite sequence, but since sequences are IEnumerables you can enumerate them without memory concerns.
The fact that the recursion happens inside the sequence generation function doesn't impact the safety of the recursion.
Just mind that in debug mode tail call optimization may be disabled in order to allow full debugging, but in release there'll be no issue whatsoever.

How does a stackless language work?

I've heard of stackless languages. However I don't have any idea how such a language would be implemented. Can someone explain?
The modern operating systems we have (Windows, Linux) operate with what I call the "big stack model". And that model is wrong, sometimes, and motivates the need for "stackless" languages.
The "big stack model" assumes that a compiled program will allocate "stack frames" for function calls in a contiguous region of memory, using machine instructions to adjust registers containing the stack pointer (and optional stack frame pointer) very rapidly. This leads to fast function call/return, at the price of having a large, contiguous region for the stack. Because 99.99% of all programs run under these modern OSes work well with the big stack model, the compilers, loaders, and even the OS "know" about this stack area.
One common problem all such applications have is, "how big should my stack be?". With memory being dirt cheap, mostly what happens is that a large chunk is set aside for the stack (MS defaults to 1Mb), and typical application call structure never gets anywhere near to using it up. But if an application does use it all up, it dies with an illegal memory reference ("I'm sorry Dave, I can't do that"), by virtue of reaching off the end of its stack.
Most so-called called "stackless" languages aren't really stackless. They just don't use the contiguous stack provided by these systems. What they do instead is allocate a stack frame from the heap on each function call. The cost per function call goes up somewhat; if functions are typically complex, or the language is interpretive, this additional cost is insignificant. (One can also determine call DAGs in the program call graph and allocate a heap segment to cover the entire DAG; this way you get both heap allocation and the speed of classic big-stack function calls for all calls inside the call DAG).
There are several reasons for using heap allocation for stack frames:
If the program does deep recursion dependent on the specific problem it is solving,
it is very hard to preallocate a "big stack" area in advance because the needed size isn't known. One can awkwardly arrange function calls to check to see if there's enough stack left, and if not, reallocate a bigger chunk, copy the old stack and readjust all the pointers into the stack; that's so awkward that I don't know of any implementations.
Allocating stack frames means the application never has to say its sorry until there's
literally no allocatable memory left.
The program forks subtasks. Each subtask requires its own stack, and therefore can't use the one "big stack" provided. So, one needs to allocate stacks for each subtask. If you have thousands of possible subtasks, you might now need thousands of "big stacks", and the memory demand suddenly gets ridiculous. Allocating stack frames solves this problem. Often the subtask "stacks" refer back to the parent tasks to implement lexical scoping; as subtasks fork, a tree of "substacks" is created called a "cactus stack".
Your language has continuations. These require that the data in lexical scope visible to the current function somehow be preserved for later reuse. This can be implemented by copying parent stack frames, climbing up the cactus stack, and proceeding.
The PARLANSE programming language I implemented does 1) and 2). I'm working on 3). It is amusing to note that PARLANSE allocates stack frames from a very fast-access heap-per-thread; it costs typically 4 machine instructions. The current implementation is x86 based, and the allocated frame is placed in the x86 EBP/ESP register much like other conventional x86 based language implementations. So it does use the hardware "contiguous stack" (including pushing and poppping) just in chunks. It also generates "frame local" subroutine calls the don't switch stacks for lots of generated utility code where the stack demand is known in advance.
Stackless Python still has a Python stack (though it may have tail call optimization and other call frame merging tricks), but it is completely divorced from the C stack of the interpreter.
Haskell (as commonly implemented) does not have a call stack; evaluation is based on graph reduction.
There is a nice article about the language framework Parrot. Parrot does not use the stack for calling and this article explains the technique a bit.
In the stackless environments I'm more or less familiar with (Turing machine, assembly, and Brainfuck), it's common to implement your own stack. There is nothing fundamental about having a stack built into the language.
In the most practical of these, assembly, you just choose a region of memory available to you, set the stack register to point to the bottom, then increment or decrement to implement your pushes and pops.
EDIT: I know some architectures have dedicated stacks, but they aren't necessary.
Call me ancient, but I can remember when the FORTRAN standards and COBOL did not support recursive calls, and therefore didn't require a stack. Indeed, I recall the implementations for CDC 6000 series machines where there wasn't a stack, and FORTRAN would do strange things if you tried to call a subroutine recursively.
For the record, instead of a call-stack, the CDC 6000 series instruction set used the RJ instruction to call a subroutine. This saved the current PC value at the call target location and then branches to the location following it. At the end, a subroutine would perform an indirect jump to the call target location. That reloaded saved PC, effectively returning to the caller.
Obviously, that does not work with recursive calls. (And my recollection is that the CDC FORTRAN IV compiler would generate broken code if you did attempt recursion ...)
There is an easy to understand description of continuations on this article: http://www.defmacro.org/ramblings/fp.html
Continuations are something you can pass into a function in a stack-based language, but which can also be used by a language's own semantics to make it "stackless". Of course the stack is still there, but as Ira Baxter described, it's not one big contiguous segment.
Say you wanted to implement stackless C. The first thing to realize is that this doesn't need a stack:
a == b
But, does this?
isequal(a, b) { return a == b; }
No. Because a smart compiler will inline calls to isequal, turning them into a == b. So, why not just inline everything? Sure, you will generate more code but if getting rid of the stack is worth it to you then this is easy with a small tradeoff.
What about recursion? No problem. A tail-recursive function like:
bang(x) { return x == 1 ? 1 : x * bang(x-1); }
Can still be inlined, because really it's just a for loop in disguise:
bang(x) {
for(int i = x; i >=1; i--) x *= x-1;
return x;
}
In theory a really smart compiler could figure that out for you. But a less-smart one could still flatten it as a goto:
ax = x;
NOTDONE:
if(ax > 1) {
x = x*(--ax);
goto NOTDONE;
}
There is one case where you have to make a small trade off. This can't be inlined:
fib(n) { return n <= 2 ? n : fib(n-1) + fib(n-2); }
Stackless C simply cannot do this. Are you giving up a lot? Not really. This is something normal C can't do well very either. If you don't believe me just call fib(1000) and see what happens to your precious computer.
Please feel free to correct me if I'm wrong, but I would think that allocating memory on the heap for each function call frame would cause extreme memory thrashing. The operating system does after all have to manage this memory. I would think that the way to avoid this memory thrashing would be a cache for call frames. So if you need a cache anyway, we might as well make it contigous in memory and call it a stack.

Resources