How to use ReadFile from kernel32 with NSIS - buffer

I'm trying to open a file and read the content inside a buffer using NSIS Installer.
Unfortunatly, everything works except KERNEL32::ReadFile. I read that a lot of people have some problem with this API, and i can't find a solution.
Here is my code :
StrCpy $2 $2\TOS.TXT
System::Call 'Kernel32::CreateFile(t, i, i, i, i, i, i) i (r2, 0x80000000, 0, 0, 4, 0x80, 0) .r3'
System::Call 'kernel32::GetFileSize(pr3, p0)i.r7' ; Call API to read 32-bit file size
System::Call "kernel32::VirtualAlloc(i0, ir7, i0x3000, i0x40) .r1"
System::Call "KERNEL32::ReadFile(pr3,pr1,ir7,*i,p0)i.r3"
The file is well opened and the buffer is well created with the correct size.
Any help is appreciated,
Thank,
Chris.

Your VirtualAlloc call is missing the output type before .r1.
There is no need to use the System plug-in just to read from a simple file.
!macro MakeTestFile
FileOpen $0 "$temp\nsis_test.txt" w
FileWrite $0 "Hello$\nWorld!"
FileClose $0
!macroend
StrCpy $9 "$temp\nsis_test.txt"
!insertmacro MakeTestFile
FileOpen $0 "$9" r
FileRead $0 $1 ; Line 1
FileRead $0 $2 ; Line 2
FileClose $0
MessageBox MB_OK $1$2
!insertmacro MakeTestFile
FileOpen $0 "$9" r
FileSeek $0 1 SET
FileReadByte $0 $1
FileClose $0
IntFmt $1 "0x%.2X" $1
MessageBox MB_OK "Byte #2 is $1"
!insertmacro MakeTestFile
System::Call 'KERNEL32::CreateFile(t, i, i, p, i, i, p) p (r9, 0x80000000, 0, 0, 4, 0x80, 0) .r3'
System::Call 'KERNEL32::GetFileSize(pr3, p0)i.r7'
System::Call "KERNEL32::VirtualAlloc(p0, pr7, i0x3000, i0x40)p.r1"
System::Call "KERNEL32::ReadFile(pr3,pr1,ir7,*i,p0)i.r3"
System::Call "USER32::MessageBoxA(p$hwndparent,pr1,t 'System::Call',i0)"
System::Call "KERNEL32::VirtualFree(pr1,p0,i0x8000)"
The System plug-in also has System::Alloc/StrAlloc/Free so there is no need to call VirtualAlloc directly if you need memory.

Related

NSIS read value from listbox and write it to registry

I’ve been struggling for a while with NSIS, I need a listbox filled with 300 values like
Value Label
Then take selected value and write it to registry. So far I was able to create the listbox but hadn’t luck getting selected value. Any idea or example. Thanks
!include nsDialogs.nsh
Var ListboxValue
Function .onInit
StrCpy $ListboxValue "xyz" ; Default value used in silent installers
FunctionEnd
Page Custom MyPageCreate MyPageLeave
Page InstFiles
!define LISTITEMSEP ": "
Function MyPageCreate
nsDialogs::Create 1018
Pop $0
${NSD_CreateListBox} 0 0 100% 50% ""
Pop $1
${NSD_LB_AppendString} $1 "Foo${LISTITEMSEP}Bar"
${NSD_LB_AppendString} $1 "Hello${LISTITEMSEP}World"
${NSD_LB_SetSelectionIndex} $1 0 ; Select the first item by default
nsDialogs::Show
FunctionEnd
Function MyPageLeave
${NSD_LB_GetSelection} $1 $2
; Extract the part after the separator
StrLen $3 "${LISTITEMSEP}"
StrCpy $4 0
loop:
StrCpy $5 $2 $3 $4
${If} $5 != "${LISTITEMSEP}"
IntOp $4 $4 + 1
StrCmp $5 "" 0 loop
${EndIf}
IntOp $4 $4 + $3
StrCpy $ListboxValue $2 "" $4
${If} $ListboxValue == ""
MessageBox MB_ICONSTOP "Please select something!"
Abort
${EndIf}
FunctionEnd
Section
MessageBox MB_OK "Write $ListboxValue to the registry here"
SectionEnd
You can also use ${NSD_LB_SetItemData} and ${NSD_LB_GetItemData} if you need to assign a numeric value to each item.

Parsing Lua strings, more specifically newlines

I'm trying to parse Lua 5.3 strings. However, I encountered an issue. For example,
$ lua
Lua 5.3.4 Copyright (C) 1994-2017 Lua.org, PUC-Rio
> print(load('return "\\z \n\r \n\r \r\n \n \n \\x"', "#test"))
nil test:6: hexadecimal digit expected near '"\x"'
>
> print(load('return "\\z\n\r\n\r\r\n\n\n\\x"', "#test"))
nil test:6: hexadecimal digit expected near '"\x"'
Both of these error on line 6, and the logic behind that is pretty simple: eat newline characters (\r or \n) if they're different from the current one (I believe this to be an accurate description of how the lua lexer works, but I may be wrong).
I have this code, which should do it:
local ln = 1
local skip = false
local mode = 0
local prev
for at, crlf in eaten:gmatch('()[\r\n]') do
local last = eaten:sub(at-1, at-1)
if skip and prev == last and last ~= crlf then
skip = false
else
skip = true
ln = ln + 1
end
prev = crlf
end
It decides whether to eat newlines based on the previous char. Now, from what I can tell, this should work, but no matter what I do it doesn't seem to work. Other attempts have made it report 5 lines, while this one makes it report 9(!). What am I missing here? I'm running this on Lua 5.2.4.
This is part of a routine for parsing \z:
local function parse52(s)
local startChar = string.sub(s,1,1)
if startChar~="'" and startChar~='"' then
error("not a string", 0)
end
local c = 0
local ln = 1
local t = {}
local nj = 1
local eos = #s
local pat = "^(.-)([\\" .. startChar .. "\r\n])"
local mkerr = function(emsg, ...)
error(string.format('[%s]:%d: ' .. emsg, s, ln, ...), 0)
end
local lnj
repeat
lnj = nj
local i, j, part, k = string.find(s, pat, nj + 1, false)
if i then
c = c + 1
t[c] = part
if simpleEscapes[v] then
--[[ some code, some elseifs, some more code ]]
elseif v == "z" then
local eaten, np = s:match("^([\t\n\v\f\r ]*)%f[^\t\n\v\f\r ]()", nj+1)
local p=np
nj = p-1
--[[ the newline counting routine above ]]
--[[ some other elseifs ]]
end
else
nj = nil
end
until not nj
if s:sub(-1, -1) ~= startChar then
mkerr("unfinished string near <eof>")
end
return table.concat(t)
end
Compact code for iterating lines of Lua script:
local text = "First\n\r\n\r\r\n\n\nSixth"
local ln = 1
for line, newline in text:gmatch"([^\r\n]*)([\r\n]*)" do
print(ln, line)
ln = ln + #newline:gsub("\n+", "\0%0\0"):gsub(".%z.", "."):gsub("%z", "")
end
Efficient code for iterating lines of Lua script:
local text = "First\n\r\n\r\r\n\n\nSixth"
local sub = string.sub
local ln = 1
for line, newline in text:gmatch'([^\r\n]*)([\r\n]*)' do
print(ln, line)
local pos, max_pos = 1, #newline
while pos <= max_pos do
local crlf = sub(newline, pos, pos + 1)
if crlf == "\r\n" or crlf == "\n\r" then
pos = pos + 2
else
pos = pos + 1
end
ln = ln + 1
end
end

associative arrays in awk challenging memory limits

This is related to my recent post in Awk code with associative arrays -- array doesn't seem populated, but no error and also to optimizing loop, passing parameters from external file, naming array arguments within awk
My basic problem here is simply to compute from detailed ancient archival financial market data, daily aggregates of #transactions, #shares, value, BY DATE, FIRM-ID, EXCHANGE, etc. Learnt to use associative arrays in awk for this, and was thrilled to be able to process 129+ million lines in clock time of under 11 minutes. Literally before I finished my coffee.
Became a little more ambitious, and moved from 2 array subscripts to 4, and now I am unable to process more than 6500 lines at a time.
Get error messages of the form:
K:\User Folders\KRISHNANM\PAPERS\FII_Transaction_Data>zcat
RAW_DATA\2003_1.zip | gawk -f CODE\FII_daily_aggregates_v2.awk >
OUTPUT\2003_1.txt&
gawk: CODE\FII_daily_aggregates_v2.awk:33: (FILENAME=- FNR=49300)
fatal: more_no des: nextfree: can't allocate memory (Not enough space)
On some runs the machine has told me it lacks as little as 52 KB of memory. I have what I think of a std configuration with Win-7 and 8MB RAM.
(Economist by training, not computer scientist.) I realize that going from 2 to 4 arrays makes the problem computationally much more complex for the computer, but is there something one can do to improve memory management at least a little bit. I have tried closing everything else I am doing. The error always has to do only with memory, never with disk space or anything else.
Sample INPUT:
49290,C198962542782200306,6/30/2003,433581,F5811773991200306,S5405611832200306,B5086397478200306,NESTLE INDIA LTD.,INE239A01016,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,591.13,5655,3342840.15,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49291,C198962542782200306,6/30/2003,433563,F6292896459200306,S6344227311200306,B6110521493200306,GRASIM INDUSTRIES LTD.,INE047A01013,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,495.33,3700,1832721,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49292,C198962542782200306,6/30/2003,433681,F6513202607200306,S1724027402200306,B6372023178200306,HDFC BANK LTD,INE040A01018,6/26/2003,1,E745964372424200306,REG_DL_STLD_02,242,2600,629200,REG_DL_INSTR_EQ,REG_DL_DLAY_D,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49293,C7885768925200306,6/30/2003,48128,F4406661052200306,S7376401565200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,44600,5575000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49294,C7885768925200306,6/30/2003,48129,F4500260787200306,S1312094035200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,4,E912851176274200306,REG_DL_STLD_04,125,445600,55700000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49295,C7885768925200306,6/30/2003,48130,F6425024637200306,S2872499118200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,48000,6000000,REG_DL_INSTR_EU,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
Code
BEGIN { FS = "," }
# For each array subscript variable -- DATE ($10), firm_ISIN ($9), EXCHANGE ($12), and FII_ID ($5), after checking for type = EQ, set up counts for each value, and number of unique values.
( $17~/_EQ\>/ ) { if (date[$10]++ == 0) date_list[d++] = $10;
if (isin[$9]++ == 0) isin_list[i++] = $9;
if (exch[$12]++ == 0) exch_list[e++] = $12;
if (fii[$5]++ == 0) fii_list[f++] = $5;
}
# For cash-in, buy (B), or cash-out, sell (S) count NR = no of records, SH = no of shares, RV = rupee-value.
(( $17~/_EQ\>/ ) && ( $11~/1|2|3|5|9|1[24]/ )) {{ ++BNR[$10,$9,$12,$5]} {BSH[$10,$9,$12,$5] += $15} {BRV[$10,$9,$12,$5] += $16} }
(( $17~/_EQ\>/ ) && ( $11~/4|1[13]/ )) {{ ++SNR[$10,$9,$12,$5]} {SSH[$10,$9,$12,$5] += $15} {SRV[$10,$9,$12,$5] += $16} }
END {
{ print NR, "records processed."}
{ print " " }
{ printf("%-11s\t%-13s\t%-20s\t%-19s\t%-7s\t%-7s\t%-14s\t%-14s\t%-18s\t%-18s\n", \
"DATE", "ISIN", "EXCH", "FII", "BNR", "SNR", "BSH", "SSH", "BRV", "SRV") }
{ for (u = 0; u < d; u++)
{
for (v = 0; v < i; v++)
{
for (w = 0; w < e; w++)
{
for (x = 0; x < f; x++)
#check first below for records with zeroes, don't print them
{ if (BNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]] + SNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]] > 0)
{ BR = BNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SR = SNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
BS = BSH[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
BV = BRV[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SS = SSH[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SV = SRV[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
{ printf("%-11s\t%13s\t%20s\t%19s\t%7d\t%7d\t%14d\t%14d\t%18.2f\t%18.2f\n", \
date_list[u], isin_list[v], exch_list[w], fii_list[x], BR, SR, BS, SS, BV, SV) } }
}
}
}
}
}
}
Expected output
6 records processed.
DATE ISIN EXCH FII BNR SNR BSH SSH BRV SRV
6/27/2003 INE239A01016 E9035083824200306 F5811773991200306 1 0 5655 0 3342840.15 0.00
6/27/2003 INE047A01013 E9035083824200306 F6292896459200306 1 0 3700 0 1832721.00 0.00
6/26/2003 INE040A01018 E745964372424200306 F6513202607200306 1 0 2600 0 629200.00 0.00
6/28/2003 INE585B01010 E912851176274200306 F4406661052200306 1 0 44600 0 5575000.00 0.00
6/28/2003 INE585B01010 E912851176274200306 F4500260787200306 0 1 0 445600 0.00 55700000.00
It is in this case that as the number of input records exceeds 6500, I end up having memory problems. Have about 7 million records in all.
For a 2 array subscript problem, albeit on a different data set, where 129+ million lines were processed in clock time of 11 minutes using the same GNU-AWK on the same machine, see optimizing loop, passing parameters from external file, naming array arguments within awk
Question: is it the case that awk is not very smart with memory management, but that some other more modern tools (say, SQL) would accomplish this task with the same memory resources? Or is this simply a characteristic of associative arrays, which I found magical in enabling me to avoid many passes over the data, many loops and SORT procedures, but which maybe work well up to 2 array subscripts, and then face exponential memory resource costs after that?
Afterword: the super-detailed almost-idiot-proof tutorial along with the code provided by Ed Morton in comments below makes a dramatic difference, especially his GAWK script tst.awk. He taught me about (a) using SUBSEP intelligently (b) tackling needless looping, which is crucial in this problem which tends to have very sparse arrays, with various AWK constructs. Compared to performance with my old code (only up to 6500 lines of input accepted on one machine, another couldn't even get that far), the performance of Ed Morton's tst.awk can be seen from the table below:
**filename start end min in ln out lines
2008_1 12:08:40 AM 12:27:18 AM 0:18 391438 301160
2008_2 12:27:18 AM 12:52:04 AM 0:24 402016 314177
2009_1 12:52:05 AM 1:05:15 AM 0:13 302081 238204
2009_2 1:05:15 AM 1:22:15 AM 0:17 360072 276768
2010_1 "slept" 507496 397533
2010_2 3:10:26 AM 3:10:50 AM 0:00 76200 58228
2010_3 3:10:50 AM 3:11:18 AM 0:00 80988 61725
2010_4 3:11:18 AM 3:11:47 AM 0:00 86923 65885
2010_5 3:11:47 AM 3:12:15 AM 0:00 80670 63059**
Times were obtained simply from using %time% on lines before and after tst.awk was executed, all put in a simple batch script, "min" is the clock time taken (per whatever rounding EXCEL does by default), "in ln" and "out lines" are lines of input and output, respectively. From processing the entire data that we have, from Jan 2003 to Jan 2014, we find the theoretical max number of output records = #dates*#ISINs*#Exchanges*#FIIs = 2992*2955*567*82268, while the actual number of total output lines is only 5,261,942, which is only 1.275*10^(-8) of the theoretical max -- very sparse indeed. That there was sparseness, we did guess earlier, but that the arrays could be SO sparse -- which matters a lot for memory management -- we had no way of telling till something actually completed, for a real data set. Time taken seems to increase exponentially in input size, but within limits that pose no practical difficulty. Thanks a ton, Ed.
There is no problem with associative arrays in general. In awk (except gawk for true 2D arrays) an associative array with 4 subscripts is identical to one with 2 subscripts since in reality it only has one subscript which is the concatenation of each of the pseudo-subscripts separated by SUBSEP.
Given you say I am unable to process more than 6500 lines at a time. the problem is far more likely to be in the way you wrote your code than any fundamental awk issue so if you'd like more help, post a small script with sample input and expected output that demonstrates your problem and attempted solution to see if we have suggestions on way to improve it's memory usage.
Given your posted script, I expect the problem is with those nested loops in your END section When you do:
for (i=1; i<=maxI; i++) {
for (j=1; j<=maxJ; j++) {
if ( arr[i,j] != 0 ) {
print arr[i,j]
}
}
}
you are CREATING arr[i,j] for every possible combination of i and j that didn't exist prior to the loop just by testing for arr[i,j] != 0. If you instead wrote:
for (i=1; i<=maxI; i++) {
for (j=1; j<=maxJ; j++) {
if ( (i,j) in arr ) {
print arr[i,j]
}
}
}
then the loop itself would not create new entries in arr[].
So change this block:
if (BNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]] + SNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]] > 0)
{
BR = BNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SR = SNR[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
BS = BSH[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
BV = BRV[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SS = SSH[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
SV = SRV[date_list[u],isin_list[v],exch_list[w],fii_list[x]]
which is probably unnecessarily turning each of BNR, SNR, BSH, BRV, SSH, and SRV into huge but highly sparse arrays, to something like this:
idx = date_list[u] SUBSEP isin_list[v] SUBSEP exch_list[w] SUBSEP fii_list[x]
BR = (idx in BNR ? BNR[idx] : 0)
SR = (idx in SNR ? SNR[idx] : 0)
if ( (BR + SR) > 0 )
{
BS = (idx in BSH ? BSH[idx] : 0)
BV = (idx in BRV ? BRV[idx] : 0)
SS = (idx in SSH ? SSH[idx] : 0)
SV = (idx in SRV ? SRV[idx] : 0)
and let us know if that helps. Also check your code for other places where you might be doing the same.
The reason you have this problem with 4 subscripts when you didn't with 2 is simply that you have 4 levels of nesting in the loops now creating much larger and more sparse arrays when when you just had 2.
Finally - you have some weird syntax in your script, some of which #MarkSetchell pointed out in a comment, and your script isn't as efficient as it could be since you're not using else statements and so testing for multiple conditions that can't possibly all be true and you're testing the same condition repeatedly, and it's not robust as you aren't anchoring your REs (e.g you test /4|1[13]/ instead of /^(4|1[13])$/ so for example your 4 would match on 14 or 41 etc. instead of just 4 on its own) so change your whole script to this:
$ cat tst.awk
BEGIN { FS = "," }
# For each array subscript variable -- DATE ($10), firm_ISIN ($9), EXCHANGE ($12), and FII_ID ($5), after checking for type = EQ, set up counts for each value, and number of unique values.
$17 ~ /_EQ\>/ {
if (!seenDate[$10]++) date_list[++d] = $10
if (!seenIsin[$9]++) isin_list[++i] = $9
if (!seenExch[$12]++) exch_list[++e] = $12
if (!seenFii[$5]++) fii_list[++f] = $5
# For cash-in, buy (B), or cash-out, sell (S) count NR = no of records, SH = no of shares, RV = rupee-value.
idx = $10 SUBSEP $9 SUBSEP $12 SUBSEP $5
if ( $11 ~ /^([12359]|1[24])$/ ) {
++BNR[idx]; BSH[idx] += $15; BRV[idx] += $16
}
else if ( $11 ~ /^(4|1[13])$/ ) {
++SNR[idx]; SSH[idx] += $15; SRV[idx] += $16
}
}
END {
print NR, "records processed."
print " "
printf "%-11s\t%-13s\t%-20s\t%-19s\t%-7s\t%-7s\t%-14s\t%-14s\t%-18s\t%-18s\n",
"DATE", "ISIN", "EXCH", "FII", "BNR", "SNR", "BSH", "SSH", "BRV", "SRV"
for (u = 1; u <= d; u++)
{
for (v = 1; v <= i; v++)
{
for (w = 1; w <= e; w++)
{
for (x = 1; x <= f; x++)
{
#check first below for records with zeroes, don't print them
idx = date_list[u] SUBSEP isin_list[v] SUBSEP exch_list[w] SUBSEP fii_list[x]
BR = (idx in BNR ? BNR[idx] : 0)
SR = (idx in SNR ? SNR[idx] : 0)
if ( (BR + SR) > 0 )
{
BS = (idx in BSH ? BSH[idx] : 0)
BV = (idx in BRV ? BRV[idx] : 0)
SS = (idx in SSH ? SSH[idx] : 0)
SV = (idx in SRV ? SRV[idx] : 0)
printf "%-11s\t%13s\t%20s\t%19s\t%7d\t%7d\t%14d\t%14d\t%18.2f\t%18.2f\n",
date_list[u], isin_list[v], exch_list[w], fii_list[x], BR, SR, BS, SS, BV, SV
}
}
}
}
}
}
I added seen in front of 4 array names just because by convention arrays testing for the pre-existence of a value are typically named seen. Also, when populating the SNR[] etc arrays I created an idx variable first instead of repeatedly using the field numbers every time for both ease of changing it in future and mostly because string concatenation is relatively slow in awk and that's whats happening when you use multiple indices in an array so best to just do the string concatenation once explicitly. And I changed your date_list[] etc arrays to start at 1 instead of zero because all awk-generated arrays, strings and field numbers start at 1. You CAN create an array manually that starts at 0 or -357 or whatever number you want but it'll save shooting yourself in the foot some day if you always start them at 1.
I expect it could be made more efficient still by restricting the nested loops to only values that could exist for the enclosing loop index combinations (e.g. not every value of u+v+w is possible so there will be times when you shouldn't bother looping on x). For example:
$ cat tst.awk
BEGIN { FS = "," }
# For each array subscript variable -- DATE ($10), firm_ISIN ($9), EXCHANGE ($12), and FII_ID ($5), after checking for type = EQ, set up counts for each value, and number of unique values.
$17 ~ /_EQ\>/ {
if (!seenDate[$10]++) date_list[++d] = $10
if (!seenIsin[$9]++) isin_list[++i] = $9
if (!seenExch[$12]++) exch_list[++e] = $12
if (!seenFii[$5]++) fii_list[++f] = $5
# For cash-in, buy (B), or cash-out, sell (S) count NR = no of records, SH = no of shares, RV = rupee-value.
idx = $10 SUBSEP $9 SUBSEP $12 SUBSEP $5
if ( $11 ~ /^([12359]|1[24])$/ ) {
seen[$10,$9]
seen[$10,$9,$12]
++BNR[idx]; BSH[idx] += $15; BRV[idx] += $16
}
else if ( $11 ~ /^(4|1[13])$/ ) {
seen[$10,$9]
seen[$10,$9,$12]
++SNR[idx]; SSH[idx] += $15; SRV[idx] += $16
}
}
END {
printf "d = %d\n", d | "cat>&2"
printf "i = %d\n", i | "cat>&2"
printf "e = %d\n", e | "cat>&2"
printf "f = %d\n", f | "cat>&2"
print NR, "records processed."
print " "
printf "%-11s\t%-13s\t%-20s\t%-19s\t%-7s\t%-7s\t%-14s\t%-14s\t%-18s\t%-18s\n",
"DATE", "ISIN", "EXCH", "FII", "BNR", "SNR", "BSH", "SSH", "BRV", "SRV"
for (u = 1; u <= d; u++)
{
date = date_list[u]
for (v = 1; v <= i; v++)
{
isin = isin_list[v]
if ( (date,isin) in seen )
{
for (w = 1; w <= e; w++)
{
exch = exch_list[w]
if ( (date,isin,exch) in seen )
{
for (x = 1; x <= f; x++)
{
fii = fii_list[x]
#check first below for records with zeroes, don't print them
idx = date SUBSEP isin SUBSEP exch SUBSEP fii
if ( (idx in BNR) || (idx in SNR) )
{
if (idx in BNR)
{
bnr = BNR[idx]
bsh = BSH[idx]
brv = BRV[idx]
}
else
{
bnr = bsh = brv = 0
}
if (idx in SNR)
{
snr = SNR[idx]
ssh = SSH[idx]
srv = SRV[idx]
}
else
{
snr = ssh = srv = 0
}
printf "%-11s\t%13s\t%20s\t%19s\t%7d\t%7d\t%14d\t%14d\t%18.2f\t%18.2f\n",
date, isin, exch, fii, bnr, snr, bsh, ssh, brv, srv
}
}
}
}
}
}
}
}

Get a modules data without an entity statement in VHDL?

I need to get some data from a module I was given, but I don't know if it is even possible or how to approach the problem.
Is it possible to get information from another module if that module doesn't have an entity map? It only has a generic with TIME statements.
Is it at all possible to get anything out of this module?
It writes to a memory, could I pull things out of that?
This is the file I have.
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;
use IEEE.STD_LOGIC_ARITH.all;
use STD.TEXTIO.all;
use IEEE.STD_LOGIC_TEXTIO.all;
entity MIPS is
generic (
MEM_DLY : TIME := 0.5 ns;
CYC_TIME: TIME := 2 ns
);
end entity MIPS;
architecture MIPS of MIPS is
signal PC : STD_LOGIC_VECTOR ( 31 downto 0 ) := X"0000_0010";
signal READ_DATA2 : STD_LOGIC_VECTOR ( 31 downto 0 ) := ( others => '0');
signal HUH : BIT_VECTOR ( 31 downto 0 );
signal HUHINS : STRING ( 1 to 25 );
signal INSTRUC : STD_LOGIC_VECTOR ( 31 downto 0 );
signal M_DATA_IN : STD_LOGIC_VECTOR ( 31 downto 0 ) := ( others => 'Z');
signal M_DATA_OUT : STD_LOGIC_VECTOR ( 31 downto 0 ):= ( others => 'Z');
signal M_ADDR : STD_LOGIC_VECTOR ( 11 downto 0 ) := ( others => '0');
signal CLK : STD_LOGIC := '0';
signal MEMREAD : STD_LOGIC := '0';
signal M_DATA_WHEN : STD_LOGIC := '0';
signal MEMWRITE : STD_LOGIC := '0';
signal CYCLE : INTEGER := 1;
begin
CLOCK_PROC:
process
begin
CLK <= '1';
wait for CYC_TIME/2;
CLK <= '0';
wait for CYC_TIME/2;
CYCLE <= CYCLE + 1;
end process;
TEST_PC_PROC:
process ( CLK ) is
begin
if RISING_EDGE ( CLK ) then
PC <= PC + 4;
end if;
end process;
INSTR_MEM_PROC:
process ( PC ) is -- make subject only to address
type INSTR_STR_ARY is array ( 0 to 1023 ) of STRING ( 1 to 25 );
variable MEMSTRR : INSTR_STR_ARY:=(others => " ");
type MEMORY is array ( 0 to 1023 ) of BIT_VECTOR ( 31 downto 0 );
variable MEM : MEMORY := ( others => X"0000_0000");
variable IADDR : INTEGER; -- integer for address
variable DTEMP : BIT_VECTOR ( 31 downto 0 );
variable INIT : INTEGER := 0; -- when to initialize...
file IN_FILE : TEXT open READ_MODE is "instr_mem.txt";
variable BUF : LINE;
variable ADR_STR : STD_LOGIC_VECTOR ( 31 downto 0 );
variable TADR : INTEGER;
variable TDATA : STD_LOGIC_VECTOR ( 31 downto 0 );
variable BDATA : BIT_VECTOR ( 31 downto 0 );
variable STR_ING : STRING ( 1 to 25 );
begin
if INIT = 0 then
while not (ENDFILE ( IN_FILE )) loop
READLINE ( IN_FILE, BUF );
HREAD ( BUF, ADR_STR ); -- get the address on the line
TADR := CONV_INTEGER ( ADR_STR (14 downto 2));
HREAD ( BUF, TDATA ); -- get the data on the line
BDATA := To_bitvector (TDATA);
MEM ( TADR ) := BDATA; -- put into memory
for J in 1 to 25 loop
STR_ING(J) := ' ';
end loop;
READ ( BUF, STR_ING ); -- get instruction string
MEMSTRR ( TADR ) := STR_ING;
report "iteration of loop";
end loop;
INIT := 1; -- when all data in, set INIT to 1;
end if; -- end of INIT check
IADDR := CONV_INTEGER ( PC ( 14 downto 2 ));
HUH <= MEM ( IADDR );
INSTRUC <= To_StdLogicVector ( MEM ( IADDR )) after MEM_DLY;
HUHINS <= MEMSTRR ( IADDR );
report "should hit INSTRUC";
end process;
M_DATA_IN_STMT:
M_DATA_IN <= READ_DATA2 ;
-- The following is the magic process
-- User must supply:
-- M_ADDR - memory address (data memory) as a 12 bit STD_LOGIC_VECTOR
-- Remember the M_ADDR is a WORD address
-- M_DATA_IN - value going to memory from hardware (data path)
-- Remember that this is 32 bit STD_LOGIC_VECTOR, user supplied
-- READ_DATA2 - this is to be replaced by user's sourceof info for memory
DATA_MEMORY_PROCESS: -- name of process ...
process ( M_ADDR, CLK, MEMREAD ) is -- Sens: M_ADDR, CLK, MEMREAD
file IN_FILE: TEXT open READ_MODE is "data_mem_init.txt"; -- initial data
file OUT_FILE: TEXT open WRITE_MODE is "mem_trans.txt"; -- results
variable BUF : LINE; -- declare BUF as LINE
variable TVAL : STD_LOGIC_VECTOR ( 31 downto 0 ); -- var for temp value
variable TADRHEX : STD_LOGIC_VECTOR ( 31 downto 0 ); -- var for address
variable TADR : INTEGER; -- address as integer
type MEM_TYPE is array ( 0 to 1023 ) of STD_LOGIC_VECTOR ( 31 downto 0 );
variable THE_MEMORY : MEM_TYPE := ( others => X"00000000" ); -- the memory
variable FIRST : BOOLEAN := TRUE; -- flag for first time thru
constant STR : STRING ( 1 to 3 ) := " "; -- 3 spaces - for printing
constant WR_STR : STRING ( 1 to 3 ) := "W "; -- for write
constant RD_STR : STRING ( 1 to 3 ) := "R "; -- for read
variable TSTR2 : STRING ( 1 to 29 ); -- to create a string
type MEMSTR_TYPE is array ( 0 to 1023 ) of STRING ( 1 to 29 ); --
variable INSTRS : MEMSTR_TYPE;
begin -- start here
if FIRST then -- first time thru,
while FIRST loop -- loop on data available - until
if not ( ENDFILE ( IN_FILE )) then -- end of file shows up
READLINE(IN_FILE, BUF); -- read a line from file,
HREAD(BUF, TADRHEX); -- get address from BUF
TADR := CONV_INTEGER ( TADRHEX ); -- turn it into integer
HREAD(BUF, TVAL); -- next, get value from BUF
THE_MEMORY(TADR/4) := TVAL; -- put TVAL into the memory
else -- the 'else' is for end of file
FIRST := FALSE; -- EOF shows up - set FIRST false
end if;
end loop; -- where loop ends...
end if; -- where if FIRST ends ...
if MEMREAD = '1' then -- now, memory function 'read'
M_DATA_OUT <= THE_MEMORY ( CONV_INTEGER ( M_ADDR ) / 4 ); -- get val from
M_DATA_WHEN <= not M_DATA_WHEN; -- and invert M_DATA_WHEN
else -- if not MEMREAD,
M_DATA_OUT <= ( others => 'Z' ); -- set memory out to 'Z's
end if;
if RISING_EDGE ( CLK ) then -- on clock edge...
if MEMREAD = '1' then -- if MEMREAD asserted,
TADR := CONV_INTEGER ( M_ADDR ) / 4; -- set TADR to address as int
TVAL := THE_MEMORY ( TADR ); -- and get contents to TVAL
WRITE (BUF, RD_STR); -- then build BUF; put read indi
HWRITE (BUF, M_ADDR); -- and the address
WRITE (BUF, STR); -- some spaces
HWRITE (BUF, TVAL); -- and the value
WRITE (BUF, STR); -- more spaces
WRITE (BUF, NOW); -- current simulation time
WRITELINE (OUT_FILE, BUF); -- and send line to file.
elsif MEMWRITE = '1' then -- if not read, but it is write
TADR := CONV_INTEGER ( M_ADDR ) / 4; -- set TADR to address as int
TVAL := M_DATA_IN; -- set TVAL as data in value
WRITE (BUF, WR_STR); -- start buffer with write indi
HWRITE (BUF, M_ADDR); -- then the address
WRITE (BUF, STR); -- then some spaces
HWRITE (BUF, TVAL); -- and the value written
WRITE (BUF, STR); -- still more spaces
WRITE (BUF, NOW); -- simulation time
WRITELINE (OUT_FILE, BUF); -- and send line to file
THE_MEMORY ( CONV_INTEGER ( M_ADDR ) / 4) := M_DATA_IN;
-- and finally, value to the mem
end if;
end if;
end process;
end architecture MIPS;
The code you presented simulates the memories that your MIPS processor will interact with - a program memory and a data memory.
Your MIPS will interact with the program memory by providing a value for PC; the corresponding instruction will be handed to your CPU on signal INSTRUCT. You'll probably delete the lines corresponding to the TEST_PC_PROC process, since the actual PC value will come from the MIPS. The program to be run by the CPU is given in file data_mem_init.txt. This program memory is asynchronous.
Your MIPS will interact with the data memory through signals M_ADDR, M_DATA_OUT, M_DATA_IN, and MEMREAD. To read data, your CPU will set M_ADDR and MEMREAD=1, and provide the address in M_ADDR. The given code will set M_DATA_OUT with the requested data. To write data, you will set M_DATA_IN or READ_DATA2 (or replace READ_DATA2 with a signal of your choice). The data will be written on the rising edge of CLK.
Don't be distracted by the WRITE/HWRITE calls, they just keep a log on file mem_trans.txt.
IMO, this interface is much more complicated than it needed to be. You're probably better or if you keep your MIPS implementation in totally separate files, and just add the signals needed to interact with this model to its ports list.
It's not entirely clear from your question what you are hoping to achieve with this mystery module that you have... but here's some ideas which might trigger something:
If you have a component for the module in question, then you can instantiate it within your design and then manipulate its inputs to make its outputs do what they should. Maybe it has some documentation to give you some clues!
If it writes to memory and you have a multi port memory controller within your system connected to the same memory, you could build something which will read data from the memory after your mystery module has written to it.
Or finally, if this is an FPGA, you can embed a logic analyser into the FPGA bitstream to observe the signals going to and from the secret module.

How to perform calculation over a log file

I have a that looks like this:
I, [2009-03-04T15:03:25.502546 #17925] INFO -- : [8541, 931, 0, 0]
I, [2009-03-04T15:03:26.094855 #17925] INFO -- : [8545, 6678, 0, 0]
I, [2009-03-04T15:03:26.353079 #17925] INFO -- : [5448, 1598, 185, 0]
I, [2009-03-04T15:03:26.360148 #17925] INFO -- : [8555, 1747, 0, 0]
I, [2009-03-04T15:03:26.367523 #17925] INFO -- : [7630, 278, 0, 0]
I, [2009-03-04T15:03:26.375845 #17925] INFO -- : [7640, 286, 0, 0]
I, [2009-03-04T15:03:26.562425 #17925] INFO -- : [5721, 896, 0, 0]
I, [2009-03-04T15:03:30.951336 #17925] INFO -- : [8551, 4752, 1587, 1]
I, [2009-03-04T15:03:30.960007 #17925] INFO -- : [5709, 5295, 0, 0]
I, [2009-03-04T15:03:30.966612 #17925] INFO -- : [7252, 4928, 0, 0]
I, [2009-03-04T15:03:30.974251 #17925] INFO -- : [8561, 4883, 1, 0]
I, [2009-03-04T15:03:31.230426 #17925] INFO -- : [8563, 3866, 250, 0]
I, [2009-03-04T15:03:31.236830 #17925] INFO -- : [8567, 4122, 0, 0]
I, [2009-03-04T15:03:32.056901 #17925] INFO -- : [5696, 5902, 526, 1]
I, [2009-03-04T15:03:32.086004 #17925] INFO -- : [5805, 793, 0, 0]
I, [2009-03-04T15:03:32.110039 #17925] INFO -- : [5786, 818, 0, 0]
I, [2009-03-04T15:03:32.131433 #17925] INFO -- : [5777, 840, 0, 0]
I'd like to create a shell script that calculates the average of the 2nd and 3rd fields in brackets (840 and 0 in the last example). An even tougher question: is it possible to get the average of the 3rd field only when the last one is not 0?
I know I could use Ruby or another language to create a script, but I'd like to do it in Bash. Any good suggestions on resources or hints in how to create such a script would help.
Use bash and awk:
cat file | sed -ne 's:^.*INFO.*\[\([0-9, ]*\)\][ \r]*$:\1:p' | awk -F ' *, *' '{ sum2 += $2 ; sum3 += $3 } END { if (NR>0) printf "avg2=%.2f, avg3=%.2f\n", sum2/NR, sum3/NR }'
Sample output (for your original data):
avg2=2859.59, avg3=149.94
Of course, you do not need to use cat, it is included there for legibility and to illustrate the fact that input data can come from any pipe; if you have to operate on an existing file, run sed -ne '...' file | ... directly.
EDIT
If you have access to gawk (GNU awk), you can eliminate the need for sed as follows:
cat file | gawk '{ if(match($0, /.*INFO.*\[([0-9, ]*)\][ \r]*$/, a)) { cnt++; split(a[1], b, / *, */); sum2+=b[2]; sum3+=b[3] } } END { if (cnt>0) printf "avg2=%.2f, avg3=%.2f\n", sum2/cnt, sum3/cnt }'
Same remarks re. cat apply.
A bit of explanation:
sed only prints out lines (-n ... :p combination) that match the regular expression (lines containing INFO followed by any combination of digits, spaces and commas between square brackets at the end of the line, allowing for trailing spaces and CR); if any such line matches, only keep what's between the square brackets (\1, corresponding to what's between \(...\) in the regular expression) before printing (:p)
sed will output lines that look like: 8541, 931, 0, 0
awk uses a comma surrounded by 0 or more spaces (-F ' *, *') as field delimiters; $1 corresponds to the first column (e.g. 8541), $2 to the second etc. Missing columns count as value 0
at the end, awk divides the accumulators sum2 etc by the number of records processed, NR
gawk does everything in one shot; it will first test whether each line matches the same regular expression passed in the previous example to sed (except that unlike sed, awk does not require a \ in fron the round brackets delimiting areas or interest). If the line matches, what's between the round brackets ends up in a[1], which we then split using the same separator (a comma surrounded by any number of spaces) and use that to accumulate. I introduced cnt instead of continuing to use NR because the number of records processed NR may be larger than the actual number of relevant records (cnt) if not all lines are of the form INFO ... [...comma-separated-numbers...], which was not the case with sed|awk since sed guaranteed that all lines passed on to awk were relevant.
Posting the reply I pasted to you over IM here too, just because it makes me try StackOverflow out :)
# replace $2 with the column you want to avg;
awk '{ print $2 }' | perl -ne 'END{ printf "%.2f\n", $total/$n }; chomp; $total+= $_; $n++' < log
Use nawk or /usr/xpg4/bin/awk on Solaris.
awk -F'[],]' 'END {
print s/NR, t/ct
}
{
s += $(NF-3)
if ($(NF-1)) {
t += $(NF-2)
ct++
}
}' infile
Use Python
logfile= open( "somelogfile.log", "r" )
sum2, count2= 0, 0
sum3, count3= 0, 0
for line in logfile:
# find right-most brackets
_, bracket, fieldtext = line.rpartition('[')
datatext, bracket, _ = fieldtext.partition(']')
# split fields and convert to integers
data = map( int, datatext.split(',') )
# compute sums and counts
sum2 += data[1]
count2 += 1
if data[3] != 0:
sum3 += data[2]
count3 += 1
logfile.close()
print sum2, count2, float(sum2)/count2
print sum3, count3, float(sum3)/count3

Resources