Different between sympath commands in Windbg - symbols

What is the difference between the following commands?
.sympath cache*c:\CachedSymbols;srv*https://msdl.microsoft.com/download/symbols
.sympath srv*c:\CachedSymbols*https://msdl.microsoft.com/download/symbols

There's no difference betwen the 2 exact commands that you posted.
However, consider the following:
.sympath cache*c:\CachedSymbols;srv*server1;srv*server2
then, symbols of both servers will go into the same cache, whereas
.sympath srv*c:\CachedSymbols*server1;srv*c:\OtherPlace*server2
allows you to define 2 different places.
And a
.sympath cache*c:\CachedSymbols;srv*C:\Server1*server1;srv*C:\Server2*server2
even stores each symbol twice, once per server and both in the cache.

Related

Netlogo Behaviorspace How to save data not per tick but based on reporter

I have a netlogo model, for which a run takes about 15 minutes, but goes through a lot of ticks. This is because per tick, not much happens. I want to do quite a few runs in an experiment in behaviorspace. The output (only table output) will be all the output and input variables per tick. However, not all this data is relevant: it's only relevant once a day (day is variable, a run lasts 1095 days).
The result is that the model gets so slow running experiments via behaviorspace. Not only would it be nicer to have output data with just 1095 rows, it perhaps also causes the experiment to slow down tremendously.
How to fix this?
It is possible to write your own output file in a BehaviorSpace experiment. Program your code to create and open an output file that contains only the results you want.
The problem is to keep BehaviorSpace from trying to open the same output file from different model runs running on different processors, which causes a runtime error. I have tried two solutions.
Tell BehaviorSpace to only use one processor for the experiment. Then you can use the same output file for all model runs. If you want the output lines to include which model run it's on, use the primitive behaviorspace-run-number.
Have each model run create its own output file with a unique name. Open the file using something like:
file-open (word "Output-for-run-" behaviorspace-run-number ".csv")
so the output files will be named Output-for-run-1.csv etc.
(If you are not familiar with it, the CSV extension is very useful for writing output files. You can put everything you want to output on a big list, and then when the model finishes write the list into a CSV file with:
csv:to-file (word "Output-for-run-" behaviorspace-run-number ".csv") the-big-list
)

How to disable core dump in a docker image?

I have a service that uses an Docker image. About a half dozen people use it. However, occasionally containers produces big core.xxxx dump files. How do I disable it on docker images? My base image is Debian 9.
To disable core dumps set a ulimit value in /etc/security/limits.conf file and defines some shell specific restrictions.
A hard limit is something that never can be overridden, while a soft limit might only be applicable for specific users. If you would like to ensure that no process can create a core dump, you can set them both to zero. Although it may look like a boolean (0 = False, 1 = True), it actually indicates the allowed size.
soft core 0
hard core 0
The asterisk sign means it applies to all users. The second column states if we want to use a hard or soft limit, followed by the columns stating the setting and the value.

Apache-camel Xpathbuilder performance

I have following question. I set up an camel -project to parse certain xml files. I have to selecting take out certain nodes from a file.
I have two files 246kb and 347kb in size. I am extracting a parent-child pair of 250 nodes in the above given example.
With the default factory here are the times. For the 246kb file respt 77secs and 106 secs. I wanted to improve the performance so switched to saxon and the times are as follows 47secs and 54secs. I was able to cut the time down by at least half.
Is it possible to cut the time further, any other factory or optimizations I can use will be appreciated.
I am using XpathBuilder to cut the xpaths out. here is an example. Is it possible to not to have to create XpathBuilder repeatedly, it seems like it has to be constructed for every xpath, I would have one instance and keep pumping the xpaths into it, maybe it will improve performance further.
return XPathBuilder.xpath(nodeXpath)
.saxon()
.namespace(Consts.XPATH_PREFIX, nameSpace)
.evaluate(exchange.getContext(), exchange.getIn().getBody(String.class), String.class);
Adding more details based on Michael's comments. So I am kind of joining them, will become clear with my example below. I am combining them into a json.
So here we go, Lets say we have following mappings for first and second path.
pData.tinf.rexd: bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:ReqdExctnDt/text()
pData.tinf.pIdentifi.instId://bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:CdtTrfTxInf[{1}]/bm:PmtId/bm:InstrId/text()
This would result in a json as below
pData:{
tinf: {
rexd: <value_from_xml>
}
pIdentifi:{
instId: <value_from_xml>
}
}
Hard to say without seeing your actual XPath expression, but given the file sizes and execution time my guess would be that you're doing a join which is being executed naively as a cartesian product, i.e. with O(n*m) performance. There is probably some way of reorganizing it to have logarithmic performance, but the devil is in the detail. Saxon-EE is quite good at optimizing join queries automatically; if not, there are often ways of doing it manually -- though XSLT gives you more options (e.g. using xsl:key or xsl:merge) than XPath does.
Actually I was able to bring the time down to 10 secs. I am using apache-camel. So I added threads there so that multiple files can be read in separate threads. Once the file was being read, it had serial operation to based on the length of the nodes that had to be traversed. I realized that it was not necessary to be serial here so introduced parrallelStream and that now gave it enough power. One thing to guard agains is not to have a proliferation of threads since that can degrade the performance. So I try to restrict the number of threads to twice or thrice the number of cores on the operating machine.

Delete variables based on the number of observations

I have an SPSS file that contains about 1000 variables and I have to delete the ones having 0 valid values. I can think of a loop with an if statement but I can't find how to write it.
The simplest way would be to use the spssaux2.FindEmptyVars Python function like this:
begin program.
import spssaux2
spssaux2.FindEmptyVars(delete=True)
end program.
If you don't already have the spssaux2 module installed, you would need to get it from the SPSS Community website or the IBM Predictive Analytics site and save it in the python\lib\site-packages directory under your Statistics installation.
Otherwise, the VALIDATEDATA command, if you have it, will identify the variables violating such rules as maximum percentage of missing values, but you would have to turn that output into a DELETE VARIABLES command. You could also look for variables with zero missing values using, say, DESCRIPTIVES and select out the ones with N=0.
If you've never worked with python in SPSS, here's a way to get the job done without it (not as elegant, but should do the job):
This will count the valid cases in each variable, and select only those that have 0 valid cases. Then you'll manually copy the names of these variables into a syntax command that will delete them.
DATASET NAME Orig.
DATASET DECLARE VARLIST.
AGGREGATE /OUTFILE='VARLIST'/BREAK=
/**list_all_the_variable_names_here = NU(*FirstVarName to *LastVarName).
DATASET ACTIVATE VARLIST.
VARSTOCASES /MAKE NumValid FROM *FirstVarName to *LastVarName/INDEX=VarName(NumValid).
SELECT IF NumValid=0.
EXECUTE.
Pause here to copy the remaining names in the list and complete the syntax, then continue:
DATASET ACTIVATE Orig.
DELETE VARIABLES *paste_here_all_the_remaining_variable_names_from_varlist .
Notes:
* I put stars where you have to replace my text with your variable names.
** If the variables are neatly named like Q1, Q2, Q3 .... Q1000, you can use the "FirstVarName to LastVarName" form (Q1 to Q1000) instead of listing all the variable names.
BTW it is of course possible to do this completely automatically without manually copying those names (using only syntax, no Python), but the added complexity is not worth bothering with for a single use...

Data integrity errors in google cloud dataflow jobs

I've noticed one of my dataflow jobs has produced output with what I could best describe as too many random bit flips. For example a year "2014" (as text) was written as "0007" or "2016" or "0052" or other textual values. In some cases the output line format is valid (which suggests something happened in processing) but few lines seems to have malformed formatting as well (e.g., "20141215-04-25" instead of something like "2014-12-25").
I'm occasionally re-running the jobs with the same code and different date range parameters, and for this specific dates range the job was completing successfully until about a week ago. I have been trying different machine configurations though (4 cpus and 1-cpu instances) and the problems seems to happen more with 4-cpu instances.
Does anybody know what could be leading to this?
Thanks,
G
When using 4-cpu instances, Dataflow runs multiple threads in a single Java process. Data corruption could happen if one of the transforms is thread-hostile, that is, not even separate instances of the class can be safely accessed by multiple threads. This typically happens when the class uses a static non-thread-safe member variable.
A thread safety issue in user code resulted in this type of corruption. This type of errors are likely to occur when using multi-core instances for compute.

Resources