How to stop bitcoind's ZeroMQ notifications from publishing unconfirmed block or transaction data - bitcoind

TL;DR
ZMQ sends block data that is not yet confirmed in the blockchain; how can I filter out this unconfirmed data?
The problem
In 2019 our production system used blocknotify to fire an event for each new block, but this year (2020) I was required to switch to ZeroMQ. Changing my code to be triggered by ZeroMQ's zmqpubhashblock publisher was easy. But I found that ZMQ sends block data that is not yet confirmed in the blockchain, and I could not find any config option to disable this behavior.
Current node (a testnet BTC node for development):
Bitcoin Core Daemon version v0.18.0.0
wallet feature of bitcoind disabled
-txindex added in the config
My code (a ZMQ sub demo in a Laravel command):
$context = new \ZMQContext();
$subscriber = new \ZMQSocket($context, \ZMQ::SOCKET_SUB);
$subscriber->connect("tcp://192.168.1.136:28332"); // btc testnet node
$subscriber->setSockOpt(\ZMQ::SOCKOPT_SUBSCRIBE, "hashblock");
// $subscriber->setSockOpt(\ZMQ::SOCKOPT_SUBSCRIBE, "hashtx");
$subscriber->setSockOpt(\ZMQ::SOCKOPT_SUBSCRIBE, "rawblock");
// $subscriber->setSockOpt(\ZMQ::SOCKOPT_SUBSCRIBE, "rawtx");
$this->info("sub btc");
while (true) {
    // Each notification is a three-part message: topic, body, sequence number.
    $multiRec = $subscriber->recvMulti();
    $this->info(date('Y-m-d H:i:s') . ' rec:');
    $topic = $multiRec[0];
    $body = \bin2hex($multiRec[1]);
    $sequence = \bin2hex($multiRec[2]);
    $this->info("topic: " . $topic . " , sequence : " . $sequence);
    $this->info("body " . " , sequence : " . $sequence);
    $this->info($body);
    $this->info('----------------------');
}
It outputs something like this:
sub btc
2020-01-09 07:55:10 rec:
topic: hashblock , sequence : 4c2b0000
body , sequence : 4c2b0000
00000000000119f7061e4de7bc09f7526ad6d03057da7eadb2a8c68260765b20
----------------------
2020-01-09 07:55:10 rec:
topic: rawblock , sequence : 4c2b0000
body , sequence : 4c2b0000
0000ff3fef3ca23d30df507cfa0bbbccb27cbed53e714cf4da599c917fc200000000000035f889f1bb70bac042d7f63dc0056279d16564ef2b5a9ef6ef6e02a122ae3cea67dc165eb08f091b056252a702020000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff2703b038190458dc165e726567696f6e312f46756e3230313931322f010000012700000000000000ffffffff02c40c5402000000001976a9149dbb856bf9bfe4cebc7cc3aec5434c14e540ba7d88ac0000000000000000266a24aa21a9ed0b17f5f2ecfe5fff1fde0d8e23223f2aa9fc2fad5b04c0430bb682c69f4f59dd012000000000000000000000000000000000000000000000000000000000000000000000000001000000000101d266d0eba0eb71e7f4c163caa5ce49d9f9d167cec65f7a1299cda5e6a10b3b2b0000000000ffffffff02e8030000000000001976a914c48ce75ce550d6d13548fe683106facbc0aeee3c88acb0290c0000000000160014f52b799d88c35973313bc2f5110a53f74e4dde810247304402203a001e87cbda9f36570963ff8dbb3f9b086008f34c15c360bcea2417aea6c575022071bf52e8394b54bab9d5a4a87e386c995d3651f9d09dc958e38d5fffeae1910701210339363cbe9f2a914801db907302a6658c28c145b247cb7c0133f70f5db0cdd0aa00000000
----------------------
2020-01-09 07:55:12 rec:
topic: hashblock , sequence : 4d2b0000
body , sequence : 4d2b0000
0000000000017d572741e884f5b6d20a86351dc52f432072eba0f86ac5026cbd
----------------------
2020-01-09 07:55:12 rec:
topic: rawblock , sequence : 4d2b0000
body , sequence : 4d2b0000
00000020205b766082c6a8b2ad7eda5730d0d66a52f709bce74d1e06f719010000000000eb6c4b961cfb08a367efdc0770df9377b5c1ffc7358ecf25870648e99e44d51c5edc165eb08f091bbd17f63801010000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff4c03b13819045edc165e08fabe6d6d00000000000000000000000000000000000000000000000000000000000000000100000000000000180000708b1800000d2f6e6f64655374726174756d2f00000000020000000000000000266a24aa21a9ede2f61c3f71d1defd3fa999dfa36953755c690689799962b48bebd836974e8cf9e40b5402000000001976a914bd3400d71504033fb1e7c947f2e0f55775899f2288ac0120000000000000000000000000000000000000000000000000000000000000000000000000
----------------------
2020-01-09 07:55:12 rec:
topic: hashblock , sequence : 4e2b0000
body , sequence : 4e2b0000
000000000000df593a7a72c65d1ddcb8d7aef78bac448b288b798de256d04db3
----------------------
2020-01-09 07:55:12 rec:
topic: rawblock , sequence : 4e2b0000
body , sequence : 4e2b0000
00000020bd6c02c56af8a0eb7220432fc51d35860ad2b6f584e84127577d010000000000f70bd9ad38658f46e958c11bc3f98e3bc7b5082c72beee27df6672750a2c936c60dc165eb08f091b1b18f5f401010000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff4c03b238190460dc165e08fabe6d6d0000000000000000000000000000000000000000000000000000000000000000010000000000000018000070b40200000d2f6e6f64655374726174756d2f00000000020000000000000000266a24aa21a9ede2f61c3f71d1defd3fa999dfa36953755c690689799962b48bebd836974e8cf9e40b5402000000001976a914bd3400d71504033fb1e7c947f2e0f55775899f2288ac0120000000000000000000000000000000000000000000000000000000000000000000000000
----------------------
2020-01-09 07:55:13 rec:
topic: hashblock , sequence : 4f2b0000
body , sequence : 4f2b0000
00000000000082663372af34f08cb4463bba26bc50849d1f0cf559a6beb52c30
----------------------
2020-01-09 07:55:13 rec:
topic: rawblock , sequence : 4f2b0000
body , sequence : 4f2b0000
00000020b34dd056e28d798b288b44ac8bf7aed7b8dc1d5dc6727a3a59df0000000000003ed7970bbe29d8f94424d80bfd597d11ccb0501a9ca9744bae7ce7e9e68f26a460dc165eb08f091b54bb9f6601010000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff4c03b338190460dc165e08fabe6d6d0000000000000000000000000000000000000000000000000000000000000000010000000000000018000070c90400000d2f6e6f64655374726174756d2f00000000020000000000000000266a24aa21a9ede2f61c3f71d1defd3fa999dfa36953755c690689799962b48bebd836974e8cf9e40b5402000000001976a914bd3400d71504033fb1e7c947f2e0f55775899f2288ac0120000000000000000000000000000000000000000000000000000000000000000000000000
----------------------
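For reference, each notification published by bitcoind's ZMQ interface is a three-part message: topic, body, and a 4-byte little-endian sequence number. A minimal Python sketch of decoding one (the message bytes below are modeled on the first "hashblock" message captured above):

```python
import struct

def decode_zmq_notification(parts):
    """Decode a bitcoind ZMQ multipart message: [topic, body, sequence]."""
    topic = parts[0].decode("ascii")
    body_hex = parts[1].hex()
    # The 4-byte sequence number is little-endian, so "4c2b0000" is 0x2b4c.
    (sequence,) = struct.unpack("<I", parts[2])
    return topic, body_hex, sequence

# Example modeled on the captured output above.
parts = [
    b"hashblock",
    bytes.fromhex("00000000000119f7061e4de7bc09f7526ad6d03057da7eadb2a8c68260765b20"),
    bytes.fromhex("4c2b0000"),
]
topic, body, seq = decode_zmq_notification(parts)
print(topic, seq)  # hashblock 11084
```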
I decoded the raw block data and got the txid a4268fe6e9e77cae4b74a99c1a50b0cc117d59fd0bd82444f9d829be0b97d73e, but it showed zero confirmations (as of 08:12:18 UTC, Thursday, January 9, 2020).

Checking other testnet explorers solved my problem: the explorer I was using was about 200 blocks behind.
By checking other testnet explorers, I found that https://tbtc.bitaps.com/ was about 200 blocks behind the current testnet tip (January 9, 2020).
And the data is correct now (January 10, 2020).
It was neither bitcoind's problem nor my code's problem; just check another testnet explorer.
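If you do still want to act only on confirmed data, a common pattern is to look each notification up over bitcoind's JSON-RPC interface and check its confirmations before processing. A sketch under assumptions: the RPC URL, credentials, and port below are hypothetical placeholders, not values from the question.

```python
import json
import urllib.request

RPC_URL = "http://rpcuser:rpcpass@192.168.1.136:18332"  # hypothetical credentials/port

def rpc_call(method, params):
    """Minimal bitcoind JSON-RPC call (no error handling)."""
    payload = json.dumps({"jsonrpc": "1.0", "id": "zmq-sub",
                          "method": method, "params": params})
    req = urllib.request.Request(RPC_URL, data=payload.encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

def is_confirmed(info, min_conf=1):
    """Results of getrawtransaction/getblockheader carry a 'confirmations'
    field once included in the chain; mempool entries do not."""
    return info.get("confirmations", 0) >= min_conf

# e.g. on a tx notification, process only if already confirmed:
# info = rpc_call("getrawtransaction", [txid, True])
# if is_confirmed(info): handle(info)
```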

Related

Extract values from a JSON object using a Function node

I have a problem, and knowing how to ask the correct question is key! I looked around and found someone asking the same question in a different way; unfortunately the answer was not complete. So I am going to ask it here using my own flows and situation.
I am trying to extract values from a JSON object using a Function node, and nothing I have tried seems to work. I want to extract 7 values to create 7 different messages with 7 different topics, so they can be published via 7 different MQTT nodes.
This is the complete message object:
`
2022-11-11, 1:59:04 p.m.node: 3ec81adb55be266d
v3/rak-wis-blocks#ttn/devices/rak4631-sensor1/up : msg : Object
object
topic: "v3/rak-wis-blocks#ttn/devices/rak4631-sensor1/up"
payload: object
Vbat_Per: 110
Vbat: 4269
Fixed: 1
TemperatureAM: 11.85
HumidityAM: 99.99
TempInside: 12.42
Pressure: 995.44
qos: 0
retain: false
_msgid: "85957ff31d82e49b"
`
I found an example that used a Function node with the code below, and it spits out 7 separate messages with the variables I want. I just don't know how to take each new message and modify its properties with a new, different topic value.
Example: topic: sensor/Vbat_Per, topic: sensor/Vbat, topic: sensor/Fixed, etc.
`
var keys = Object.keys(msg.payload);
var msgs = keys.map(function (key) {
    return { topic: key, payload: msg.payload[key] };
});
return [msgs];
`
This is the output from the Function node with the above code:
`
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
Vbat_Per : msg.payload : number
75
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
Vbat : msg.payload : number
4034
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
Fixed : msg.payload : number
1
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
TemperatureAM : msg.payload : number
12.32
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
HumidityAM : msg.payload : number
99.99
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
TempInside : msg.payload : number
12.49
2022-11-11, 3:09:04 p.m.node: 9bc14535de62758e
Pressure : msg.payload : number
994.3
`
To me it looks like there are 7 new messages, each with only a payload and no topic. Am I understanding this correctly?
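To also give each of those messages its own topic, the map callback just needs to set it. Here is the same transformation sketched in Python to show the intended shape (the `sensor/` prefix comes from the question's example; in the actual Function node this would be a one-line change inside the `map` callback):

```python
def to_messages(payload, prefix="sensor/"):
    """One message per payload key, each carrying its own MQTT topic."""
    return [{"topic": prefix + key, "payload": value}
            for key, value in payload.items()]

payload = {"Vbat_Per": 110, "Vbat": 4269, "Fixed": 1}
for msg in to_messages(payload):
    print(msg["topic"], msg["payload"])
```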

Form data is not correct with Suave web server

I am trying to receive a confirmation from AWS' SNS system.
It sends a message through a POST to a webserver and I'm receiving it using Suave.
When I get the message, the form field is truncated; I am receiving:
"{\n \"Type\" : \"SubscriptionConfirmation\",\n \"MessageId\" : \"13dd68fa-2720-419a-8d8a-edc9f4466ea3\",\n \"Token\" : \"2336412f37fb687f5d51e6e2425e90ccf23f36200405c1942ea85f0874b382c47327c4bfd63203eace5240bb7b253428af362a04a61c8f98ab1718c679e9e27594529615adf5d86729374cce472768b91622851aced957c1dcddf21e3d82b48f5ff528c8dbc911179a3f26126a8d7f00\",\n \"TopicArn\" : \"arn:aws:sns:ap-northeast-1:960544703730:igorbot\",\n \"Message\" : \"You have chosen to subscribe to the topic arn:aws:sns:ap-northeast-1:960544703730:igorbot.\\nTo confirm the subscription, visit the SubscribeURL included in this message.\",\n \"SubscribeURL\" : \"https://sns.ap-northeast-1.amazonaws.com/?Action",
"ConfirmSubscription"
so it's unfinished JSON...
but when I look at the rawForm field, I get the whole message:
{
"Type" : "SubscriptionConfirmation",
"MessageId" : "0645c009-f4fa-4fb7-9c94-127e98f5eb76",
"Token" : "2336412f37fb687f5d51e6e2425e90ccf23f36200406641a9f63b753dad1d31c61da2ebeea8cdaeed2e3c04f701bd08e2d2c9cac65676979e43c1089a96779f9b57a0e0f072013333db51472ca43c1e6a0854cf3af6769d95c7911d74c9e2bec22db93de90e537d480070891ddaaa548",
"TopicArn" : "arn:aws:sns:ap-northeast-1:960544703730:igorbot",
"Message" : "You have chosen to subscribe to the topic arn:aws:sns:ap-northeast-1:960544703730:igorbot.\nTo confirm the subscription, visit the SubscribeURL included in this message.",
"SubscribeURL" : "https://sns.ap-northeast-1.amazonaws.com/?Action=ConfirmSubscription&TopicArn=arn:aws:sns:ap-northeast-1:960544703730:igorbot&Token=2336412f37fb687f5d51e6e2425e90ccf23f36200406641a9f63b753dad1d31c61da2ebeea8cdaeed2e3c04f701bd08e2d2c9cac65676979e43c1089a96779f9b57a0e0f072013333db51472ca43c1e6a0854cf3af6769d95c7911d74c9e2bec22db93de90e537d480070891ddaaa548",
"Timestamp" : "2021-03-13T20:36:21.918Z",
"SignatureVersion" : "1",
"Signature" : "FBnpuGtkZmzox+5ryo1/4k1hgwmoeuvcptQ2dOyyneShVHovmdemMqo9JFTzbBFelTN7FMojX/sIjFs2dZoQYeqEgsQW9WqCiEstDQu0toHn7KKxapzIoGfjfh6Rikfy8Liv88RRNLC2DLtxWW2JWr5Mmwkjtro/pm7vyJhfp5G4qcAB3gtBOtVm+XOAai6rY7obcMojkmMr4jDd9UqutV6imyYDCH+PvUCnc7aKg6p4EmZO33VlRibIPa5PiN1Sj/mmNhyoeR4pGu+0Jci+utvonXaYPgtlDuEyoVcgUQ6lki1xiclIRpDm4FvOL5tvUSq+Jdjz3prlNDNM8AuQpQ==",
"SigningCertURL" : "https://sns.ap-northeast-1.amazonaws.com/SimpleNotificationService-010a507c1833636cd94bdb98bd93083a.pem"
}
At first I thought the output was truncated, but then I narrowed it down. Posting this line:
{
"SubscribeURL" : "https://sns.ap-northeast-1.amazonaws.com/?Action=ConfirmSubscription&TopicArn=arn:aws:sns:ap-northeast-1:960544703730:igorbot&Token=2336412f37fb687f5d51e6e2425e90ccf23f36200406641a9f63b753dad1d31c61da2ebeea8cdaeed2e3c04f701bd08e2d2c9cac65676979e43c1089a96779f9b57a0e0f072013333db51472ca43c1e6a0854cf3af6769d95c7911d74c9e2bec22db93de90e537d480070891ddaaa548"
}
will fail and get truncated at the first '=' sign. Then:
{
"SubscribeURL" : "abc=3"
}
will cause Suave to fail: when sent as form data, it is not converted to a string properly.
Is there any setting related to encoding, etc. that would prevent this? (I can't ask AWS to escape all their '=' signs.)
JSON and form data (application/x-www-form-urlencoded, to be specific) have different syntax. Form data looks like key1=value1&key2=value2, so Suave is breaking the string on & and = in order to parse it.
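That splitting is easy to reproduce outside Suave. Here is a small Python illustration (using the standard library's form-data parser, not Suave itself) of what happens when a JSON body is run through application/x-www-form-urlencoded parsing:

```python
from urllib.parse import parse_qsl

# A JSON body fed through a form-urlencoded parser is split on '&' and '=',
# which is exactly why the form field appears truncated at the first '='.
body = '{ "SubscribeURL" : "https://example.com/?Action=ConfirmSubscription&Token=abc" }'
for key, value in parse_qsl(body):
    print(repr(key), "->", repr(value))
```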
You can parse JSON using mapJson instead. Something like this should work:
open System.Runtime.Serialization
open Suave
open Suave.Json

[<DataContract>]
type Subscription =
    {
        [<field: DataMember(Name = "SubscribeURL")>]
        SubscribeUrl : string;
    }

[<EntryPoint>]
let main argv =
    let app = mapJson (fun sub -> sprintf "URL: %s" sub.SubscribeUrl)
    startWebServer defaultConfig app
    0

Parsing a text file - Lua

I have a text file that contains:
<Response>
<IP>17.178.96.59</IP>
<CountryCode>US</CountryCode>
<CountryName>United States</CountryName>
<RegionCode></RegionCode>
<RegionName></RegionName>
<City>Chicago</City>
<ZipCode></ZipCode>
<TimeZone>America/Chicago</TimeZone>
<Latitude>37.751</Latitude>
<Longitude>-97.822</Longitude>
<MetroCode>0</MetroCode>
</Response>
How can I remove each tag so that I get only:
17.178.96.59,
US,
United States,
nil,
nil,
Chicago,
nil,
America/Chicago,
37.751,
-97.822,
0
Using a Lua script.
You don't show what you have already tried, but there is a recently discussed SO question on how to iterate over XML nodes that should work well for your purposes. Just replace the fixed element names with something like [%w_]+, collect the parsed values into a table, and you'll get the result you are looking for.
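The question's code is Lua, but the walk-the-children idea the answer describes is easiest to see with a real XML parser. A Python sketch for illustration only (substituting "nil" for empty elements, as in the desired output):

```python
import xml.etree.ElementTree as ET

xml_text = """<Response>
<IP>17.178.96.59</IP>
<CountryCode>US</CountryCode>
<RegionCode></RegionCode>
<City>Chicago</City>
</Response>"""

root = ET.fromstring(xml_text)
# Collect each child element's text in document order; empty tags become "nil".
values = [(child.text or "nil") for child in root]
print(",\n".join(values))
```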
This is what I am trying to do:
int = getInternet()
url3 = 'https://freegeoip.app/xml/17.178.96.59'
result = int.getURL(url3)
result = result:gsub("><", ">nil<"):gsub("<.->", ""):gsub("%c%s+", ","):sub(2) .. 'nil'
print(result)
-- = 17.178.96.59,US,United States,nil,nil,nil,nil,America/Chicago,37.751,-97.822,0,nil
geoResult = {}
for word in string.gmatch(result, '([^,]+)') do
    table.insert(geoResult, word)
end
print('IP : '..geoResult[1])
print('Country ID : '..geoResult[2])
print('Country Name : '..geoResult[3])
print('Region Code : '..geoResult[4])
print('Region Name : '..geoResult[5])
print('City : '..geoResult[8])
print('ZIP Code : '..geoResult[7])
print('Time Zone : '..geoResult[8])
print('Latitude : '..geoResult[9])
print('Longitude : '..geoResult[10])
print('Metro Code : '..geoResult[11])
-- result:
17.178.96.59,US,United States,nil,nil,nil,nil,America/Chicago,37.751,-97.822,0,nil
IP : 17.178.96.59
Country ID : US
Country Name : United States
Region Code : nil
Region Name : nil
City : America/Chicago
ZIP Code : nil
Time Zone : America/Chicago
Latitude : 37.751
Longitude : -97.822
Metro Code : 0

Derivative node in kapacitor batch

I am using a derivative node to calculate the bandwidth utilization of network devices; below is the script.
I am using a where clause because I want an alert for specific interfaces on a specific IP.
// database
var database = 'router'
// measurement from where data is coming
var measurement = 'cisco_router'
// RP from where data is coming
var RP = 'autogen'
// which influx cluster to use
var clus = 'network'
// durations
var period = 7m
var every = 10s
// alerts
var crit = 320
var alertName = 'cisco_router_bandwidth_alert'
var triggerType = 'threshold'
batch
    |query(''' SELECT (mean("bandwidth_in") * 8) as "value" FROM "router"."autogen"."cisco_router" where host = '10.1.11.1' and ( interface_name = 'GigabitEthernet0/0/0' or interface_name = 'GigabitEthernet0/0/1') ''')
        .cluster('network')
        .period(7m)
        .every(6m)
        .groupBy(*)
    |derivative('value')
        .unit(1s)
        .nonNegative()
        .as('value')
    |alert()
        .crit(lambda: "value" > crit)
        .stateChangesOnly()
        .message(' {{.Level}} for {{ index .Tags "device_name" }} on Port {{ index .Tags "name" }} {{ .Time.Local.Format "2006.01.02 - 15:04:05" }} ')
        .details('''
<pre>
------------------------------------------------------------------
CLIENT NAME    : XXXXXXXX
ENVIRONMENT    : Prod
DEVICE TYPE    : Router
CATEGORY       : {{ index .Tags "type" }}
IP ADDRESS     : {{ index .Tags "host" }}
DATE           : {{ .Time.Local.Format "2006.01.02 - 15:04:05" }}
INTERFACE NAME : {{ index .Tags "name" }}
VALUE          : {{ index .Fields "value" }}
SEVERITY       : {{.Level}}
------------------------------------------------------------------
</pre>
        ''')
        .log('/tmp/chronograf/cisco_router_interface_alert.log')
        .levelTag('level')
        .idTag('id')
        .messageField('message')
        .email()
        .to('XXXXXXX')
    |influxDBOut()
        .database('chronograf')
        .retentionPolicy(RP)
        .measurement('alerts')
        .tag('alertName', alertName)
But it is not showing anything when I run kapacitor watch, and there are no errors in the logs.
derivative() and some other nodes, like stateDuration(), reset their state on each new batch query, in contrast to stream mode, where their state is kept the whole time.
This is because in batch mode these nodes are designed to track changes only inside the current batch of points.
Since your query returns a single point, there is no result from derivative().
Try moving the derivative into the query. Also, use the |httpOut() node to inspect the results at each step; it is really helpful for understanding Kapacitor's logic.
Here is an example:
dbrp "telegraf"."autogen"

var q = batch
    |query('SELECT derivative(mean("bytes_recv"), 1s) AS "bytes_recv_1s" FROM "telegraf"."autogen"."net" WHERE time < now() AND "interface"=\'eth0\' GROUP BY time(10m) fill(none)')
        .period(10m)
        .every(30s)
        .align()
        .groupBy(time(10m))
        .fill('none')
    |last('bytes_recv_1s')
        .as('value')
    |httpOut('query')
Note that there are bugs associated with query parsing that require specifying GROUP BY in both the query and the TICKscript:
https://github.com/influxdata/kapacitor/issues/971
https://github.com/influxdata/kapacitor/issues/622
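The batch-reset behavior can be illustrated with a small sketch (plain Python, only to mirror what derivative() can see within one batch; nothing here is Kapacitor API):

```python
def batch_derivative(points, unit=1.0):
    """Pairwise rate of change inside a single batch of (time, value)
    points; no state is carried over from previous batches."""
    return [(v2 - v1) / ((t2 - t1) / unit)
            for (t1, v1), (t2, v2) in zip(points, points[1:])]

# A batch with several points produces derivatives ...
print(batch_derivative([(0, 100), (10, 200), (20, 260)]))  # [10.0, 6.0]
# ... but a batch holding a single point produces nothing,
# which is why the alert never fires.
print(batch_derivative([(0, 100)]))  # []
```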

changing text of rule in antlr4 using setText

I want to change every entry in a CSV file to 'BlahBlah'.
For that I have this ANTLR grammar:
grammar CSV;
file : hdr row* row1;
hdr : row;
row : field (',' value1=field)* '\r'? '\n'; // '\r' is optional at the end of a row of a CSV file
row1 : field (',' field)* '\r'? '\n'?;
field
    : TEXT
      {
      $setText("BlahBlah");
      }
    | STRING
    |
    ;
TEXT : ~[,\n\r"]+ ;
STRING : '"' ('""' | ~'"')* '"' ;
But when I run this through ANTLR4 I get:
error(63): CSV.g4:13:3: unknown attribute reference setText in $setText
make: *** [run] Error 1
Why is setText not supported in ANTLR4, and is there another way to replace the text?
There are a couple of problems here.
First, you have to identify the receiver of the setText method. You probably want:
field : TEXT { $TEXT.setText("BlahBlah"); }
      | STRING
      ;
Second, setText is not defined in the Token class.
Typically, you create your own token class extending CommonToken and a corresponding token factory class. Set the TokenLabelType (in the options block) to your token class name. The setText method in CommonToken will then be visible.
tl;dr:
Given the following grammar (derived from the original CSV.g4 sample and the OP's grammar attempt (cf. the question)):
grammar CSVBlindText;
@header {
import java.util.*;
}
/** Derived from rule "file : hdr row+ ;" */
file
locals [int i=0]
: hdr ( rows+=row[$hdr.text.split(",")] {$i++;} )+
{
System.out.println($i+" rows");
for (RowContext r : $rows) {
System.out.println("row token interval: "+r.getSourceInterval());
}
}
;
hdr : row[null] {System.out.println("header: '"+$text.trim()+"'");} ;
/** Derived from rule "row : field (',' field)* '\r'? '\n' ;" */
row[String[] columns] returns [Map<String,String> values]
locals [int col=0]
@init {
$values = new HashMap<String,String>();
}
@after {
if ($values!=null && $values.size()>0) {
System.out.println("values = "+$values);
}
}
// rule row cont'd...
: field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
( ',' field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
)* '\r'? '\n'
;
field
: TEXT
| STRING
|
;
TEXT : ~[',\n\r"]+ {setText( "BlahBlah" );} ;
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote
One has:
$> antlr4 -no-listener CSVBlindText.g4
$> grep setText CSVBlindText*java
CSVBlindTextLexer.java: setText( "BlahBlah" );
Compiling it works flawlessly:
$> javac CSVBlindText*.java
Test data (the users.csv file, just renamed):
$> cat blinded_by_grammar.csv
User, Name, Dept
parrt, Terence, 101
tombu, Tom, 020
bke, Kevin, 008
The test run yields:
$> grun CSVBlindText file blinded_by_grammar.csv
header: 'BlahBlah,BlahBlah,BlahBlah'
values = {BlahBlah=BlahBlah}
values = {BlahBlah=BlahBlah}
values = {BlahBlah=BlahBlah}
3 rows
row token interval: 6..11
row token interval: 12..17
row token interval: 18..23
So it looks as if setText() should be injected before the semicolon of a production, not between alternatives (wild guessing here ;-)).
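As a cross-check, the effect of the working TEXT rule's action can be mimicked outside ANTLR with a regex over one line of input. A Python sketch for illustration only: the character class mirrors the grammar's ~[,\n\r"]+ and does not handle quoted STRING fields.

```python
import re

def blind_fields(line):
    """Replace every unquoted, non-empty field run with 'BlahBlah',
    like the setText action in the TEXT lexer rule."""
    return re.sub(r'[^,\n\r"]+', "BlahBlah", line)

print(blind_fields("User, Name, Dept"))  # BlahBlah,BlahBlah,BlahBlah
```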
Previous iterations below:
Just guessing, as I (1) have no working antlr4 available currently and (2) have not written ANTLR4 grammars for quite some time now: maybe without the dollar sign ($)?
grammar CSV;
file : hdr row* row1;
hdr : row;
row : field (',' value1=field)* '\r'? '\n'; // '\r' is optional at the end of a row of CSV file ..
row1 : field (',' field)* '\r'? '\n'?;
field
: TEXT
{
setText("BlahBlah");
}
| STRING
|
;
TEXT : ~[,\n\r"]+ ;
STRING : '"' ('""' | ~'"')* '"' ;
Update: Now that antlr 4.5.2 (at least via brew), rather than 4.5.3, is available, I dug into this, and to answer a comment from the OP below: the setText() call will be generated in the lexer Java module if the grammar is well defined. Unfortunately, debugging antlr4 grammars for a dilettante like me is ... but it is nevertheless a very nice language construction kit, IMO.
Sample session:
$> antlr4 -no-listener CSV.g4
$> grep setText CSVLexer.java
setText( String.valueOf(getText().charAt(1)) );
The grammar used:
(hacked up from the example code retrieved via curl -O http://media.pragprog.com/titles/tpantlr2/code/tpantlr2-code.tgz)
grammar CSV;
@header {
import java.util.*;
}
/** Derived from rule "file : hdr row+ ;" */
file
locals [int i=0]
: hdr ( rows+=row[$hdr.text.split(",")] {$i++;} )+
{
System.out.println($i+" rows");
for (RowContext r : $rows) {
System.out.println("row token interval: "+r.getSourceInterval());
}
}
;
hdr : row[null] {System.out.println("header: '"+$text.trim()+"'");} ;
/** Derived from rule "row : field (',' field)* '\r'? '\n' ;" */
row[String[] columns] returns [Map<String,String> values]
locals [int col=0]
@init {
$values = new HashMap<String,String>();
}
@after {
if ($values!=null && $values.size()>0) {
System.out.println("values = "+$values);
}
}
// rule row cont'd...
: field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
( ',' field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
)* '\r'? '\n'
;
field
: TEXT
| STRING
| CHAR
|
;
TEXT : ~[',\n\r"]+ ;
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote
/** Convert 3-char 'x' input sequence to string x */
CHAR: '\'' . '\'' {setText( String.valueOf(getText().charAt(1)) );} ;
Compiling works:
$> javac CSV*.java
Now test with a matching, weird CSV file:
a,b
"y",'4'
As:
$> grun CSV file foo.csv
line 1:0 no viable alternative at input 'a'
line 1:2 no viable alternative at input 'b'
header: 'a,b'
values = {a="y", b=4}
1 rows
row token interval: 4..7
So in conclusion, I suggest reworking the logic of the grammar (I presume inserting "BlahBlah" was not essential, but a mere debugging hack).
And citing http://www.antlr.org/support.html :
ANTLR Discussions
Please do not start discussions at stackoverflow. They have asked us to
steer discussions (i.e., non-questions/answers) away from Stackoverflow; we
have a discussion forum at Google specifically for that:
https://groups.google.com/forum/#!forum/antlr-discussion
We can discuss ANTLR project features, direction, and generally argue about
whatever we want at the google discussion forum.
I hope this helps.
