Which of the IDS 11.70 onconfig parameters can be changed to maximize performance for a DSS app?

Informix 11.70.TC5DE,
Windows Vista with a dual-core processor, 8 GB RAM, 1 TB HDD:
During the installation of this server, I specified it was going to be used for a data warehousing application. These are the onconfig parameters the install script generated.
Can any of these parameters be changed to maximize the performance of the server?
#(onconfig.ol_informix1170) - for data warehousing app.
ROOTNAME rootdbs
ROOTPATH C:\PROGRA~1\IBM\Informix\11.70\OL_INF~2\dbspaces\rootdbs.000
ROOTOFFSET 0
ROOTSIZE 312992
MIRROR 0
MIRRORPATH
MIRROROFFSET 0
PHYSFILE 49152
PLOG_OVERFLOW_PATH
PHYSBUFF 512
LOGFILES 6
LOGSIZE 10000
DYNAMIC_LOGS 2
LOGBUFF 256
LTXHWM 70
LTXEHWM 80
MSGPATH C:\PROGRA~1\IBM\Informix\11.70\ol_informix1170_1.log
CONSOLE C:\PROGRA~1\IBM\Informix\11.70\ol_informix1170_1.con
TBLTBLFIRST 0
TBLTBLNEXT 0
TBLSPACE_STATS 1
DBSPACETEMP tempdbs
SBSPACETEMP
SBSPACENAME sbspace
SYSSBSPACENAME
ONDBSPACEDOWN 2
SERVERNUM 6
DBSERVERNAME ol_informix1170_1
DBSERVERALIASES dr_informix1170_1
NETTYPE olsoctcp,1,150,NET
LISTEN_TIMEOUT 60
MAX_INCOMPLETE_CONNECTIONS 1024
FASTPOLL 1
NS_CACHE host=900,service=900,user=900,group=900
MULTIPROCESSOR 0
VPCLASS cpu,num=1,noage
VP_MEMORY_CACHE_KB 0
SINGLE_CPU_VP 1
#VPCLASS aio,num=1
CLEANERS 2
AUTO_AIOVPS 1
DIRECT_IO 0
LOCKS 2000
DEF_TABLE_LOCKMODE page
RESIDENT 0
SHMBASE 0xc000000L
SHMVIRTSIZE 209920
SHMADD 6560
EXTSHMADD 8192
SHMTOTAL 0
SHMVIRT_ALLOCSEG 0,3
#SHMNOACCESS 0x70000000-0x7FFFFFFF
CKPTINTVL 300
AUTO_CKPTS 1
RTO_SERVER_RESTART 60
BLOCKTIMEOUT 3600
CONVERSION_GUARD 2
RESTORE_POINT_DIR $INFORMIXDIR\tmp
TXTIMEOUT 300
DEADLOCK_TIMEOUT 60
HETERO_COMMIT 0
TAPEDEV \\.\TAPE0
TAPEBLK 16
TAPESIZE 0
LTAPEDEV
LTAPEBLK 16
LTAPESIZE 0
BAR_ACT_LOG $INFORMIXDIR\tmp\bar_act.log
BAR_DEBUG_LOG $INFORMIXDIR\tmp\bar_dbug.log
BAR_DEBUG 0
BAR_MAX_BACKUP 0
BAR_RETRY 1
BAR_NB_XPORT_COUNT 20
BAR_XFER_BUF_SIZE 15
RESTARTABLE_RESTORE ON
BAR_PROGRESS_FREQ 0
BAR_BSALIB_PATH
BACKUP_FILTER
RESTORE_FILTER
BAR_PERFORMANCE 0
BAR_CKPTSEC_TIMEOUT 15
ISM_DATA_POOL ISMData
ISM_LOG_POOL ISMLogs
DD_HASHSIZE 31
DD_HASHMAX 10
DS_HASHSIZE 31
DS_POOLSIZE 127
PC_HASHSIZE 31
PC_POOLSIZE 127
PRELOAD_DLL_FILE
STMT_CACHE 0
STMT_CACHE_HITS 0
STMT_CACHE_SIZE 512
STMT_CACHE_NOLIMIT 0
STMT_CACHE_NUMPOOL 1
USEOSTIME 0
STACKSIZE 64
ALLOW_NEWLINE 0
USELASTCOMMITTED NONE
FILLFACTOR 90
MAX_FILL_DATA_PAGES 0
BTSCANNER num=1,threshold=5000,rangesize=-1,alice=6,compression=default
ONLIDX_MAXMEM 188928
MAX_PDQPRIORITY 100
DS_MAX_QUERIES 1
DS_TOTAL_MEMORY 188928
DS_MAX_SCANS 1
DS_NONPDQ_QUERY_MEM 188928
DATASKIP
OPTCOMPIND 2
DIRECTIVES 1
EXT_DIRECTIVES 0
OPT_GOAL -1
IFX_FOLDVIEW 0
AUTO_REPREPARE 1
USTLOW_SAMPLE 0
RA_PAGES 64
RA_THRESHOLD 16
BATCHEDREAD_TABLE 1
BATCHEDREAD_INDEX 1
BATCHEDREAD_KEYONLY 0
EXPLAIN_STAT 1
#SQLTRACE level=low,ntraces=1000,size=2,mode=global
#DBCREATE_PERMISSION informix
#DB_LIBRARY_PATH
IFX_EXTEND_ROLE 1
SECURITY_LOCALCONNECTION
UNSECURE_ONSTAT
ADMIN_USER_MODE_WITH_DBSA
ADMIN_MODE_USERS
PLCY_POOLSIZE 127
PLCY_HASHSIZE 31
USRC_POOLSIZE 127
USRC_HASHSIZE 31
STAGEBLOB
OPCACHEMAX 0
SQL_LOGICAL_CHAR OFF
SEQ_CACHE_SIZE 10
ENCRYPT_HDR
ENCRYPT_SMX
ENCRYPT_CDR 0
ENCRYPT_CIPHERS
ENCRYPT_MAC
ENCRYPT_MACFILE
ENCRYPT_SWITCH
CDR_EVALTHREADS 1,2
CDR_DSLOCKWAIT 5
CDR_QUEUEMEM 4096
CDR_NIFCOMPRESS 0
CDR_SERIAL 0
CDR_DBSPACE
CDR_QHDR_DBSPACE
CDR_QDATA_SBSPACE
CDR_SUPPRESS_ATSRISWARN
CDR_DELAY_PURGE_DTC 0
CDR_LOG_LAG_ACTION ddrblock
CDR_LOG_STAGING_MAXSIZE 0
CDR_MAX_DYNAMIC_LOGS 0
DRAUTO 0
DRINTERVAL 30
DRTIMEOUT 30
HA_ALIAS
DRLOSTFOUND $INFORMIXDIR\etc\dr.lostfound
DRIDXAUTO 0
LOG_INDEX_BUILDS
SDS_ENABLE
SDS_TIMEOUT 20
SDS_TEMPDBS
SDS_PAGING
SDS_LOGCHECK 0
UPDATABLE_SECONDARY 0
FAILOVER_CALLBACK
FAILOVER_TX_TIMEOUT 0
TEMPTAB_NOLOG 0
DELAY_APPLY 0
STOP_APPLY 0
LOG_STAGING_DIR
RSS_FLOW_CONTROL 0
ENABLE_SNAPSHOT_COPY 0
SMX_COMPRESS 0
ON_RECVRY_THREADS 2
OFF_RECVRY_THREADS 5
DUMPDIR $INFORMIXDIR\tmp
DUMPSHMEM 1
DUMPGCORE 0
DUMPCORE 0
DUMPCNT 1
ALARMPROGRAM $INFORMIXDIR\etc\alarmprogram.bat
ALRM_ALL_EVENTS 0
#SYSALARMPROGRAM $INFORMIXDIR\etc\evidence.bat
STORAGE_FULL_ALARM 600,3
RAS_PLOG_SPEED 10982
RAS_LLOG_SPEED 0
EILSEQ_COMPAT_MODE 0
QSTATS 0
WSTATS 0
#VPCLASS MQ,noyield
MQSERVER
MQCHLLIB
MQCHLTAB
#VPCLASS jvp,num=1
#JVPJAVAHOME $INFORMIXDIR\extend\krakatoa\jre
#JVPHOME $INFORMIXDIR\extend\krakatoa
JVPPROPFILE $INFORMIXDIR\extend\krakatoa\.jvpprops
JVPLOGFILE $INFORMIXDIR\jvp.log
#JDKVERSION 1.5
#JVPJAVALIB \bin
#JVPJAVAVM jvm
#JVPARGS -verbose:jni
#JVPCLASSPATH $INFORMIXDIR\extend\krakatoa\krakatoa_g.jar;$INFORMIXDIR\extend\krakatoa\jdbc_g.jar
JVPARGS -Dcom.ibm.tools.attach.enable=no
JVPCLASSPATH $INFORMIXDIR\extend\krakatoa\krakatoa.jar;$INFORMIXDIR\extend\krakatoa\jdbc.jar
BUFFERPOOL default,buffers=10000,lrus=8,lru_min_dirty=50.00,lru_max_dirty=60.50
BUFFERPOOL size=4K,buffers=13108,lrus=16,lru_min_dirty=70.00,lru_max_dirty=80.00
AUTO_LRU_TUNING 1
USERMAPPING OFF
SP_AUTOEXPAND 1
SP_THRESHOLD 0
SP_WAITTIME 30
DEFAULTESCCHAR \
LOW_MEMORY_RESERVE 0
LOW_MEMORY_MGR 0
REMOTE_SERVER_CFG
REMOTE_USERS_CFG
S6_USE_REMOTE_SERVER_CFG 0
GSKIT_VERSION
NETTYPE drsoctcp,1,150,NET

Since you're on a dual-core machine, definitely consider turning on MULTIPROCESSOR by setting it to a non-zero value.
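On this box that might look like the following (a sketch only, to be checked against the manual; SINGLE_CPU_VP must be 0 whenever you configure more than one CPU VP):

MULTIPROCESSOR 1         # enable multiprocessor locking and scheduling
VPCLASS cpu,num=2,noage  # one CPU VP per core is a common starting point
SINGLE_CPU_VP 0          # required once num > 1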
The ONCONFIG parameters of greatest interest to you for DSS are those related to Parallel Data Query, or PDQ: the block that commences with MAX_PDQPRIORITY. It is worth perusing the fine manual on these specifically, because the inter-relationships between them and some other parameters are too complex to go into here.
But in essence, DS_MAX_QUERIES is the maximum number of parallel queries permitted at any time, and DS_MAX_SCANS limits the number of IO threads for scanning your tables. DS_TOTAL_MEMORY determines the amount of memory allocated for PDQ processing, and there is an algorithm in the manual that shows how these variables and the user's PDQPRIORITY setting combine.
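To make that concrete, here is a hedged sketch of DSS-leaning values; the numbers are illustrative rather than recommendations, and DS_TOTAL_MEMORY (specified in KB) has to fit within your shared-memory limits:

MAX_PDQPRIORITY 100      # allow sessions the full PDQ range
DS_MAX_QUERIES 2         # favour a few big queries over many small ones
DS_MAX_SCANS 10          # allow more parallel scan threads
DS_TOTAL_MEMORY 1048576  # 1 GB of decision-support memory on an 8 GB box

A session then opts in with:

SET PDQPRIORITY 100;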
You might also want to consider lifting the RA_PAGES and RA_THRESHOLD values. RA_PAGES sets how many pages are read ahead into memory in one batch, and RA_THRESHOLD sets how few unread pages remain before the next batch is fetched. If you want to favour table scans (which you generally do in DSS), increasing these to something like 256 and 128 might improve performance.
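For example (again a sketch, to be validated against your own workload):

RA_PAGES 256      # pages fetched per read-ahead batch
RA_THRESHOLD 128  # unread pages remaining before the next batch is requested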
My experience is with SMP and MPP unix boxes, rather than Windows, so I'm not sure how much you can wring out of your architecture, but this is where you want to start.
I would recommend identifying a good DSS query that runs for a decent length of time, and changing one parameter at a time to see the effect. SET EXPLAIN ON is your friend here, too.
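For example (the table is hypothetical; the optimizer appends the plan to sqexplain.out, whose location varies by platform):

SET EXPLAIN ON;
SELECT region, SUM(amount) FROM sales_fact GROUP BY region;  -- your long-running DSS query

The output shows the chosen access paths, estimated costs, and the degree of parallelism used.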
One last thing - 11.7 supports table compression, and the tests I've seen show dramatic improvements in a DSS environment with large reads and irregular writes.
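Compression is driven through the SQL admin API. A sketch, with hypothetical table and database names, run from the sysadmin database as an administrative user:

EXECUTE FUNCTION task('table compress repack shrink', 'sales_fact', 'dss_db', 'informix');

The repack and shrink arguments consolidate the freed pages and return the space to the dbspace.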


How does Standard CAN win bus access in arbitration against Extended CAN?

I read some documents and they all say that Standard CAN has higher priority than Extended CAN because the SRR bit is always recessive in an Extended CAN frame when both have the same ID, but from my understanding it depends.
https://copperhilltech.com/blog/controller-area-network-can-bus-tutorial-extended-can-protocol/
To simplify, let's say we have message ID 0x1 (Std CAN) and 0x1 (Ext CAN) being sent simultaneously on the same bus.
The arbitration field of the Std CAN frame compared to the Ext CAN frame looks like this:
Std CAN: 0 0 0 0 0 0 0 0 0 0 1 [0] (the bracketed bit is RTR)
Ext CAN: 0 0 0 0 0 0 0 0 0 0 0 [1] [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 [0] (the bracketed bits are SRR, IDE and RTR)
At the 11th bit, the node sending Std CAN transmits 1 (recessive) while the node sending Ext CAN transmits 0 (dominant), so the Ext CAN frame wins bus access. The node sending Std CAN switches to listen mode and sends nothing after that, so the SRR and IDE bits are never reached to decide whether the message is Ext CAN or Std CAN.
Is my above understanding correct?
Thank you in advance,
Yes, your understanding is correct: in your example the extended frame wins, because the first 11 bits it transmits (the most significant bits of its 29-bit identifier) are all dominant, while the standard frame's 11-bit identifier ends in a recessive bit. A standard frame only reliably beats an extended one when its 11-bit identifier is identical to the first 11 bits of the extended identifier, because the standard frame then sends a dominant bit (RTR for a data frame, IDE for a remote frame) where the extended frame is still recessive. So saying that standard frames have higher priority than extended is a simplification.
RTR frames are a bit of an oddball case overall, as the DLC field can carry a non-zero length even though there is no data at all in the frame.
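If it helps to see the mechanics end to end, here is a toy model in R of the wired-AND arbitration (dominant 0 beats recessive 1; this ignores real CAN timing) replaying the example above:

# Arbitration bits from the example: 0 = dominant, 1 = recessive
std_arb <- c(rep(0, 10), 1, 0)                    # 11-bit ID 0x1, then RTR
ext_arb <- c(rep(0, 11), 1, 1, rep(0, 17), 1, 0)  # 11 base ID bits, SRR, IDE, 18 extension bits, RTR
n <- min(length(std_arb), length(ext_arb))
first_diff <- which(std_arb[1:n] != ext_arb[1:n])[1]  # first bit where the frames disagree
winner <- if (std_arb[first_diff] == 0) "standard" else "extended"
cat("Frames diverge at bit", first_diff, "- the", winner, "frame wins arbitration\n")

It reports bit 11, where the standard frame's recessive 1 loses to the extended frame's dominant 0, exactly as described in the question.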

When using the default 'randomForest' algorithm for classification, why doesn't the number of terminal nodes match the number of cases?

According to https://cran.r-project.org/web/packages/randomForest/randomForest.pdf, classification trees are fully grown, meaning node size = 1.
However, if trees are really grown to a maximum, then shouldn't each terminal node contain a single case (data point, species, etc)?
If I run:
library(randomForest)
data(iris) #150 cases
set.seed(352)
rf <- randomForest(Species ~ ., iris)
hist(treesize(rf),main ="number of nodes")
I can see that most "fully grown" trees only have about 10 nodes, meaning node size can't be equal to 1...Right?
For example, status = -1 below represents a terminal node for the 134th tree in the forest. Only 8 terminal nodes!?
> getTree(rf,134)
left daughter right daughter split var split point status prediction
1 2 3 3 2.50 1 0
2 0 0 0 0.00 -1 1
3 4 5 4 1.75 1 0
4 6 7 3 4.95 1 0
5 8 9 3 4.85 1 0
6 10 11 4 1.60 1 0
7 12 13 1 6.50 1 0
8 14 15 1 5.95 1 0
9 0 0 0 0.00 -1 3
10 0 0 0 0.00 -1 2
11 0 0 0 0.00 -1 3
12 0 0 0 0.00 -1 3
13 0 0 0 0.00 -1 2
14 0 0 0 0.00 -1 2
15 0 0 0 0.00 -1 3
I would be grateful if someone could explain.
"Fully grown" -> "Nothing left to split". A (node of a-) decision tree is fully grown, if all data records assigned to it hold/make the same prediction.
In the iris dataset case, once you reach a node with 50 setosa data records in it, it doesn't make sense to split it into two child nodes with 25 and 25 setosas each.
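You can count the terminal nodes directly in R, reusing the seed from the question:

library(randomForest)
data(iris)
set.seed(352)
rf <- randomForest(Species ~ ., iris)
tr <- getTree(rf, 134)
sum(tr[, "status"] == -1)              # 8 terminal nodes, matching the output above
table(treesize(rf, terminal = TRUE))   # terminal-node counts across the whole forest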

False high values in Grafana causing false alerts

I configured alerting in Grafana yesterday and now get alerts from two servers. It's always the same two servers, reporting high IO, high CPU, or something else.
The thing is, they don't actually have such high load. In fact, they're almost idle. All servers are configured exactly the same via Ansible, so the Telegraf config is identical on all of them.
Also, if I filter the stats in Grafana to the corresponding server, the data displayed in the graph is correct, as you can see in the screenshot below. Still, the rule test results in a false positive.
I checked vmstat, which also displays correct information:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 47100 151152 20948 454556 2 2 16 38 2 1 2 1 96 0 1
0 0 47100 151136 20948 454592 0 0 0 0 125 135 0 1 96 0 2
0 0 47100 150408 20956 454584 0 0 0 84 222 282 1 3 93 0 4
0 0 47100 150424 20956 454592 0 0 0 0 151 225 0 0 97 0 2
0 0 47100 150424 20956 454592 0 0 0 0 115 140 0 0 96 0 4
0 0 47100 150424 20956 454592 0 0 0 0 109 125 0 0 97 0 2
0 0 47100 150424 20956 454592 0 0 0 0 121 131 0 0 98 0 2
0 0 47100 150412 20972 454576 0 0 0 92 139 208 0 1 96 0 3
0 0 47100 150456 20972 454592 0 0 0 0 65 117 0 0 99 0 1
0 0 47100 150876 20972 454592 0 0 0 16 692 705 2 4 88 0 5
And here's the telegraf.log, in case something's wrong there:
2017-07-07T09:22:04Z I! Starting Telegraf (version 1.3.3)
2017-07-07T09:22:04Z I! Loaded outputs: influxdb
2017-07-07T09:22:04Z I! Loaded inputs: inputs.diskio inputs.processes inputs.swap inputs.system inputs.redis inputs.disk inputs.kernel inputs.mem inputs.net inputs.nginx inputs.postgresql inputs.cpu
2017-07-07T09:22:04Z I! Tags enabled: environment=production host=om-1-prod rails_env=production role=telegraf
2017-07-07T09:22:04Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"om-1-prod", Flush Interval:10s
Any ideas what's wrong here?
I kept monitoring the servers manually and found these high peaks for short periods of time.
So the issue here is that these peaks aren't visible in the selected time range within Grafana. The data gets aggregated to a smaller average, which makes it look like there were only about 40 IOPS. If I zoom into the corresponding time range, I see the peaks.
Long story short: there's no issue with Grafana, Telegraf, or InfluxDB. The problem existed between keyboard and chair.
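One practical follow-up: if you want the graph (or an alert preview) to show the same spikes the alert engine reacts to, aggregate each interval with max() instead of mean(). A sketch in InfluxQL, using a field name Telegraf's inputs.cpu emits:

SELECT max("usage_user") FROM "cpu" WHERE $timeFilter GROUP BY time(10s), "host"

With mean() over a wide time range a ten-second spike is averaged away; with max() it stays visible at any zoom level.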

Automatically learning clusters

Hi, complete newbie question here: I have a table consisting of two columns. The first column is a "bin" coded by where the fruit flies live. The second column is either 0 or 1: neutral vs. really likes sugar, respectively. I have two questions:
1) If I suspect there is a single variable, something about where they live, that determines how much they like sugar, is there a way to have the computer group the bins into just 2 clusters: the bins that like sugar vs. the neutral ones? That way we can do further experiments to determine what it is about those bins.
2) Can it automatically determine how many clusters there might be driving this behavior? For example, maybe there are 4 variables (4 clusters) that determine the outcome of sugar preference.
Apologies if this is trivial. The table is listed below. Thanks!
Bin sugar
1 1
1 1
1 0
1 0
2 1
2 0
2 0
3 1
3 0
3 1
3 1
4 1
4 1
4 1
5 1
5 0
5 1
6 0
6 0
6 0
7 0
7 1
7 1
8 1
8 0
8 1
9 1
9 0
9 0
9 0
10 0
10 0
10 0
11 1
11 1
11 1
12 0
12 0
12 0
12 0
13 0
13 0
13 1
13 0
13 0
14 0
14 0
14 0
14 0
15 1
15 0
15 0
16 1
16 1
17 1
17 1
18 0
18 1
18 1
17 1
19 1
20 1
20 0
20 0
20 1
21 0
21 0
21 1
21 0
22 1
22 0
22 1
22 1
23 1
23 1
24 1
24 0
25 0
25 1
25 0
26 1
26 1
27 1
27 1
Okay, assuming I understood what you meant, one approach to problem 1) is Bayesian filtering.
Say event L is "a fly likes sugar", event B is "a fly is in bin B".
So what you have is:
number of flies = 84
size of each bin (e.g. the size of bin 1 is 4)
probability that a fly likes sugar:
P(L) = flies that like sugar / total number of flies = 43/84
probability that a fly doesn't like sugar:
P(notL) = 1 - P(L) = 41/84
probability that a fly is in a given bin:
P(B) = size of the bin / sum of the sizes of all bins = 4/84 (for bin 1)
probability that a fly isn't in a given bin:
P(notB) = 1 - P(B) = 80/84 (for bin 1)
probability that a fly likes sugar, knowing that it's in bin B:
P(L|B) = flies that like sugar in a bin / size of the bin
(eg for bin 1 is 2/4 = 1/2)
probability that a fly likes sugar, knowing that it's not in bin B:
P(L|notB) = (total flies that like sugar - flies that like sugar in the bin) / (total number of flies - size of the bin) = 41/80
You want to know the probability that a fly is in a given bin B, knowing that it likes sugar, which you can obtain with:
P(B|L) = (P(L|B) * P(B)) / (P(L|B) * P(B) + P(L|notB) * P(notB))
If you compute P(B|L) and P(B|notL) for each bin, you will know which bins have the highest probability of containing flies that like sugar. Then you can further study those bins.
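To compute this for every bin at once, here is a small sketch in R; it assumes the table above is loaded into a data frame df with columns Bin and sugar, e.g. via df <- read.table("flies.txt", header = TRUE), where flies.txt is a hypothetical file holding the data:

likes <- tapply(df$sugar, df$Bin, sum)     # sugar-likers per bin
size  <- tapply(df$sugar, df$Bin, length)  # flies per bin
P_L_B    <- likes / size                               # P(L|B)
P_B      <- size / sum(size)                           # P(B)
P_L_notB <- (sum(likes) - likes) / (sum(size) - size)  # P(L|notB)
P_notB   <- 1 - P_B                                    # P(notB)
P_B_L <- (P_L_B * P_B) / (P_L_B * P_B + P_L_notB * P_notB)
sort(P_B_L, decreasing = TRUE)             # bins most likely to contain sugar-likers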
Hope I was clear; my statistics is a bit rusty and I'm not even sure I'm doing everything correctly. Take it as a hint pointing you in the right direction for addressing the problem.
You can refer here to get more accurate reasoning and results.
As for problem 2)... I have to think about it a bit more.

How to interpret the memory usage figures?

Can someone explain this in a practical way? The sample represents usage for one low-traffic Rails site using Nginx and 3 Mongrel clusters. I ask because I'm aiming to learn about page caching and wondering whether these figures have significant meaning to that process. Thank you. Great site!
me@vps:~$ free -m
total used free shared buffers cached
Mem: 512 506 6 0 15 103
-/+ buffers/cache: 387 124
Swap: 1023 113 910
Physical memory is all used up. Why? Because it's there, the system should be using it.
You'll note also that the system is using 113M of swap space. Bad? Good? It depends.
See also that there's 103M of cached disk; this means the system has decided it's better to cache 103M of disk data and swap out those 113M. Maybe you have some processes holding memory that they aren't actively using, and that memory has been paged out to disk.
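The -/+ buffers/cache line makes this explicit, give or take free -m rounding: applications genuinely use about 506 - 15 (buffers) - 103 (cached) = 387M, and 6 (free) + 15 + 103 = 124M is effectively available.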
As the other poster said, you should be using other tools to see what's happening:
Your perception: is the site running appropriately when you use it?
Benchmarking: what response times are your clients seeing?
More fine-grained diagnostics:
top: you can see live which processes are using memory and CPU
vmstat: it produces this kind of output:
alex@armitage:~$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 1 71184 156520 92524 316488 1 5 12 23 362 250 13 6 80 1
0 0 71184 156340 92528 316508 0 0 0 1 291 608 10 1 89 0
0 0 71184 156364 92528 316508 0 0 0 0 308 674 9 2 89 0
0 0 71184 156364 92532 316504 0 0 0 72 295 723 9 0 91 0
1 0 71184 150892 92532 316508 0 0 0 0 370 722 38 0 62 0
0 0 71184 163060 92532 316508 0 0 0 0 303 611 17 2 81 0
which will show you whether swap is hurting you (high numbers in the si and so columns) and gives an easier-to-read view of performance over time.
By my reading of this, you have used almost all your memory, have 6M free, and are about 10% into your swap. A more useful approach is to use top, or perhaps ps, to see how much RAM each of your individual Mongrels is using. Because you're going into swap, you're probably seeing slowdowns. You might find that having only 2 Mongrels rather than 3 actually responds faster, because it likely wouldn't go into swap.
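For example (a sketch; mongrel_rails is the usual Mongrel process name, so adjust the -C pattern to your setup):

ps -C mongrel_rails -o pid,rss,args

The RSS column is the resident memory of each Mongrel, in kilobytes.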
Page caching will help a tonne on response time, so if your pages are cacheable (e.g. they don't have content that is unique to the individual user), I would say definitely check it out.
