DateType Solr Indexing Error - datastax-enterprise

I'm using DSE 3.2.0. When I try to index a DateType column in Solr (the system locale is GMT+3), I get the following Solr exception:
org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Invalid Date String:'2013-10-10 23:59:59+0300'
at com.datastax.bdp.cassandra.index.solr.CassandraDirectUpdateHandler2.deleteByQuery(CassandraDirectUpdateHandler2.java:230)
at com.datastax.bdp.cassandra.index.solr.AbstractSolrSecondaryIndex.doDelete(AbstractSolrSecondaryIndex.java:628)
at com.datastax.bdp.cassandra.index.solr.Cql3SolrSecondaryIndex.updateColumnFamilyIndex(Cql3SolrSecondaryIndex.java:138)
at com.datastax.bdp.cassandra.index.solr.AbstractSolrSecondaryIndex$3.run(AbstractSolrSecondaryIndex.java:896)
at com.datastax.bdp.cassandra.index.solr.concurrent.IndexWorker.run(IndexWorker.java:38)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.solr.common.SolrException: Invalid Date String:'2013-10-10 23:59:59+0300'
at org.apache.solr.schema.DateField.parseMath(DateField.java:182)
at org.apache.solr.analysis.TrieTokenizer.reset(TrieTokenizerFactory.java:135)
at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:409)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:959)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:574)
at org.apache.solr.parser.SolrQueryParserBase.handleQuotedTerm(SolrQueryParserBase.java:779)
Schema below:
<schema name="mach" version="1.1">
<types>
<fieldType name="string" class="solr.StrField"/>
<fieldType name="int" class="solr.TrieIntField"/>
<fieldType name="date" class="solr.TrieDateField"/>
</types>
<fields>
<field name="snapshot_date" type="date" indexed="true" stored="true"/>
<field name="account_id" type="string" indexed="true" stored="true"/>
<field name="account_type" type="string" indexed="true" stored="true" />
</fields>
<uniqueKey>(snapshot_date, account_id)</uniqueKey>
<defaultSearchField>account_id</defaultSearchField>
</schema>

Solr uses a restricted subset of the ISO 8601 date format: YYYY-MM-DDThh:mm:ssZ, optionally with fractional seconds (ss.tttZ) with any trailing zeros suppressed. Only UTC is supported, indicated by the trailing "Z".
So your value of "2013-10-10 23:59:59+0300" should be expressed as "2013-10-10T20:59:59Z".
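For illustration, the conversion from the offset form to Solr's expected UTC form could be sketched in Ruby (a minimal example; how you actually produce the value depends on your client code):

```ruby
require "time"

# Parse the value as posted (local offset +0300), then render it
# in the UTC form Solr expects.
local = Time.parse("2013-10-10 23:59:59+0300")
solr_value = local.utc.strftime("%Y-%m-%dT%H:%M:%SZ")
# → "2013-10-10T20:59:59Z"
```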

This is a bug affecting reindexing of deleted rows, and will be fixed in DSE 3.2.3.

Related

How Can I use Name as res-id instead of index in a MIB table with two key indexes

[issue description]
I have defined a MIB table with two indexes; the table looks like this:
TerminationEntry OBJECT-TYPE
SYNTAX TerminationEntry
ACCESS not-accessible
STATUS mandatory
DESCRIPTION
"An entry in the terminationTable ."
INDEX {ifIndex, TkId}
::= {terminationTable 1}
The TkName-to-TkId mapping table is:
TkMappingEntry::=
SEQUENCE
{
tkMappingName OCTET STRING,
tkMappingId INTEGER
}
In the CLI, I defined two res-ids mapping to these two indexes. For the TkId, the user should input the TkName, which is then mapped to the TkId. The CLI XML looks like this:
<parameters>
<res-id uname="if-index" parameter-type="Itf::Line">
<help>The unique internal identifier of the termination port</help>
<fields>
<field name="">
<mib-index name="ifIndex"/>
</field>
</fields>
</res-id>
<res-id name="tkgname" parameter-type="Sip::TkName">
<help>The name of Tk.</help>
<fields>
<field name="" access="CommandFieldDefinition::mayBeReadDuringDisplay_c |
CommandFieldDefinition::mayBeWrittenDuringCreate_c">
<mib-var tree-node="NODEterminationTkName" table-name="terminationTable "/>
<mib-index name="tkMappingName"/>
</field>
</fields>
</res-id>
</parameters>
...
<fields>
<field name="index" basic-type="Sip::TkId"
access="CommandFieldDefinition::mayBeReadDuringPrepare_c |
CommandFieldDefinition::mayBeReadDuringModify_c |
CommandFieldDefinition::mayBeReadDuringCommit_c |
CommandFieldDefinition::mayBeReadDuringDelete_c |
CommandFieldDefinition::mayBeReadDuringIn_c |
CommandFieldDefinition::mayBeReadDuringDisplay_c |
CommandFieldDefinition::mayBeReadDuringCreate_c">
<mib-var tree-node="NODEtkMappingId" table-name="tkMappingTable"/>
<mib-index name="terminationTkId"/>
</field>
<field name="next-free" basic-type="Sip::TrunkGroupId" access="CommandFieldDefinition::mayBeReadDuringCreate_c">
<mib-var tree-node="NODE_tkIdNext" table-name="SnmpAgent::localScalarTable_m"/>
<mib-index name="terminationTkId"/>
</field>
</fields>
During testing, I found that when I input a non-existent TkName, the next-free field is called and the free index is stored in the tkIdNext node, but it is not transferred to terminationTkId. So my CLI command fails and I get an error on the CLI: referred instance does not exist.
[note]
Please help check the code and find out why the name/id mapping fails. By the way, I have tried the name/id mapping in a single-index MIB table and there is no problem; I don't know why the same code fails in a MIB table with two indexes.
In the field name="index", the access flag "CommandFieldDefinition::mayBeReadDuringCreate_c" should be removed.
During node creation, the CLI should only call the "next-free" field.
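With that flag removed, the index field definition from the question would look like this (reproduced from the XML above with only the create-time flag dropped):

```xml
<field name="index" basic-type="Sip::TkId"
       access="CommandFieldDefinition::mayBeReadDuringPrepare_c |
               CommandFieldDefinition::mayBeReadDuringModify_c |
               CommandFieldDefinition::mayBeReadDuringCommit_c |
               CommandFieldDefinition::mayBeReadDuringDelete_c |
               CommandFieldDefinition::mayBeReadDuringIn_c |
               CommandFieldDefinition::mayBeReadDuringDisplay_c">
  <mib-var tree-node="NODEtkMappingId" table-name="tkMappingTable"/>
  <mib-index name="terminationTkId"/>
</field>
```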

Defining VSTS Field only when Value of field is not Set

I need to set the value of a date field when the value of the substate field changes, but only if no value already exists in the date field. Is this possible?
<FIELD refname="MyCorp.StateDate" name="Date Of Last State Change" type="DateTime">
<WHENCHANGED field="MyCorp.State">
<COPY from="clock" /> ** AND do this only of MyCorp.StateDate != Empty **
</WHENCHANGED>
</FIELD>
I read https://msdn.microsoft.com/en-us/library/ms194966.aspx but I am not able to find any way to implement what I need from the WIT language definition.
If you are using Visual Studio Team Services, adding rules to work items is not currently supported.
If you are using on-premises TFS, you can add a field rule in a state transition, for example:
<TRANSITION from="New" to="Committed">
<REASONS>
<DEFAULTREASON value="Commitment made by the team" />
</REASONS>
<FIELDS>
<FIELD refname="starain.ScrumStarain.StateChageDate">
<WHEN field="starain.ScrumStarain.StateChageDate" value="">
<COPY from="clock" />
</WHEN>
</FIELD>
</FIELDS>
</TRANSITION>

Assigning Default value in DateTime field in TFS

Is it possible to assign a DEFAULT rule to a DateTime field when adding a Work Item in TFS, but without using the CLOCK value? I want to give a specific default date.
Yes, you can do this by hardcoding the value in the Work Item Type definition. In the appropriate TRANSITION element, under the FIELDS\FIELD element for that field, instead of using SERVERDEFAULT with clock, which you have probably seen like this:
<FIELD refname="Microsoft.VSTS.Common.ActivatedDate">
<SERVERDEFAULT from="clock" />
</FIELD>
try doing this:
<FIELD refname="Microsoft.VSTS.Common.ActivatedDate">
<COPY from"value" value="01-Feb-2014" />
</FIELD>
It is possible, try this:
<FIELD name="Custom Date" refname="Custom.Date" type="DateTime">
<DEFAULT from="value" value="2100-01-01" />
</FIELD>
Note that the date displayed will be a day earlier than whatever date you set (likely due to time-zone conversion when the value is displayed).

Rails Hash.from_xml not giving expected results

I'm trying to process some XML that comes from an application called TeleForm. This is form-scanning software; it grabs the data and puts it into XML. Here is a snippet of the XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Records>
<Record>
<Field id="ImageFilename" type="string" length="14"><Value>00000022000000</Value></Field>
<Field id="Criterion_1" type="number" length="2"><Value>3</Value></Field>
<Field id="Withdrew" type="string" length="1"></Field>
</Record>
<Record>
<Field id="ImageFilename" type="string" length="14"><Value>00000022000001</Value></Field>
<Field id="Criterion_1" type="number" length="2"><Value>3</Value></Field>
<Field id="Withdrew" type="string" length="1"></Field>
</Record>
</Records>
I've dealt with this in another system, probably using a custom parser we wrote. I figured it would be no problem in Rails, but I was wrong.
Parsing this with Hash.from_xml or with Nokogiri does not give me the results I expected. I get:
{"Records"=>{"Record"=>[{"Field"=>["", {"id"=>"Criterion_1", "type"=>"number", "length"=>"2", "Value"=>"3"}, ""]},
{"Field"=>["", {"id"=>"Criterion_1", "type"=>"number", "length"=>"2", "Value"=>"3"}, ""]}]}}
After spending way too much time on this, I discovered that if I gsub out the type and length attributes, I get what I expected (even though it is wrong! I only removed them from the first Record node):
{"Records"=>{"Record"=>[{"Field"=>[{"id"=>"ImageFilename", "Value"=>"00000022000000"},
{"id"=>"Criterion_1", "type"=>"number", "length"=>"2", "Value"=>"3"}, {"id"=>"Withdrew"}]},
{"Field"=>["", {"id"=>"Criterion_1", "type"=>"number", "length"=>"2", "Value"=>"3"}, ""]}]}}
Not being well versed in XML, I assume this style of XML, with its type and length attributes, is trying to describe the data types. In that case, I can understand why the "Withdrew" field showed up as empty, but I don't understand why "ImageFilename" was empty: it is a 14-character string.
I've got the workaround with gsub, but is this invalid XML? Would adding a DTD (which TeleForm should have provided) give me different results?
EDIT
I'll provide a possible answer to my own question with some code as an edit. The code follows some of the features in the one answer I did receive, from Mark Thomas, but I decided against Nokogiri for the following reasons:
The XML is consistent and always contains the same tags (/Records/Record/Field) and attributes.
There can be several hundred records in each XML file, and Nokogiri seems a little slow with only 26 records.
I figured out how to get Hash.from_xml to give me what I expected (it does not like type="string"), and I only use the hash to populate a class.
Here is an expanded version of the XML with one complete record:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Records>
<Record>
<Field id="ImageFilename" type="string" length="14"><Value>00000022000000</Value></Field>
<Field id="DocID" type="string" length="15"><Value>731192AIINSC</Value></Field>
<Field id="FormID" type="string" length="6"><Value>AIINSC</Value></Field>
<Field id="Availability" type="string" length="18"><Value>M T W H F S</Value></Field>
<Field id="Criterion_1" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_2" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_3" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_4" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_5" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_6" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_7" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_8" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_9" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_10" type="number" length="2"><Value>3</Value></Field>
<Field id="Criterion_11" type="number" length="2"><Value>0</Value></Field>
<Field id="Criterion_12" type="number" length="2"><Value>0</Value></Field>
<Field id="Criterion_13" type="number" length="2"><Value>0</Value></Field>
<Field id="Criterion_14" type="number" length="2"><Value>0</Value></Field>
<Field id="Criterion_15" type="number" length="2"><Value>0</Value></Field>
<Field id="DayTraining" type="string" length="1"><Value>Y</Value></Field>
<Field id="SaturdayTraining" type="string" length="1"></Field>
<Field id="CitizenStageID" type="string" length="12"><Value>731192</Value></Field>
<Field id="NoShow" type="string" length="1"></Field>
<Field id="NightTraining" type="string" length="1"></Field>
<Field id="Withdrew" type="string" length="1"></Field>
<Field id="JobStageID" type="string" length="12"><Value>2292</Value></Field>
<Field id="DirectHire" type="string" length="1"></Field>
</Record>
</Records>
I am only experimenting with a workflow prototype to replace an aging system written in 4D and Active4D. This part, processing TeleForm data, was implemented as a batch operation, and it may still revert to that. I am just trying to merge some of the old viable concepts into a new Rails implementation. The XML files are on a shared server and will probably have to be moved into the web root, with some trigger set up to process the files.
I am still in the defining stage, but my module/classes to handle the InterviewForm look like this and may change (there is little error trapping; I'm still trying to get into testing, and my Ruby is not as good as it should be after playing with Rails for about 5 years!):
module Teleform::InterviewForm
  class Form < Prawn::Document
    # Not relevant to this question, but this class generates the forms from a
    # fillable PDF template and the relevant model(s) data. These forms, once
    # completed, are what TeleForm processes to produce the XML.
  end

  class RateForms
    attr_accessor :records, :results

    def initialize(xml_path)
      fields = []
      xml = File.read(xml_path)
      # Hash.from_xml does not like a type of "string"
      hash = Hash.from_xml(xml.gsub(/type="string"/, 'type="text"'))
      hash["Records"]["Record"].each do |record|
        # extract the fields from each record
        fields << record["Field"]
      end
      @records = []
      fields.each do |field|
        # build the records for the form
        @records << Record.new(field)
      end
      @results = rate_records
    end

    def rate_records
      # not relevant to the question, but this is where the data is processed
      # and a bunch of stuff takes place
      "Any errors"
    end
  end

  class Record
    attr_accessor(*[:image_filename, :doc_id, :form_id, :availability, :criterion_1, :criterion_2,
                    :criterion_3, :criterion_4, :criterion_5, :criterion_6, :criterion_7, :criterion_8,
                    :criterion_9, :criterion_10, :criterion_11, :criterion_12, :criterion_13,
                    :criterion_14, :criterion_15, :day_training, :saturday_training, :citizen_stage_id,
                    :no_show, :night_training, :withdrew, :job_stage_id, :direct_hire])

    def initialize(fields)
      fields.each do |field|
        if field["type"] == "number"
          try("#{field["id"].underscore}=", field["Value"].to_i)
        else
          try("#{field["id"].underscore}=", field["Value"])
        end
      end
    end
  end
end
Thanks for adding the additional information that this is a rating for an interviewee. Using this domain information in your code will likely improve it. You haven't posted any code, but generally using domain objects leads to more concise and more readable code.
I recommend creating a simple class representing a Rating, rather than transforming data from XML to a data structure.
class Rating
attr_accessor :image_filename, :criterion_1, :withdrew
end
Using the above class, here's one way to extract the fields from the XML using Nokogiri.
doc = Nokogiri::XML(xml)
ratings = []
doc.xpath('//Record').each do |record|
rating = Rating.new
rating.image_filename = record.at('Field[@id="ImageFilename"]/Value/text()').to_s
rating.criterion_1 = record.at('Field[@id="Criterion_1"]/Value/text()').to_s
rating.withdrew = record.at('Field[@id="Withdrew"]/Value/text()').to_s
ratings << rating
end
Now, ratings is a list of Rating objects, each with methods to retrieve the data. This is a lot cleaner than delving into a deep data structure. You could even improve the Rating class further, for example by creating a withdrew? method that returns true or false.
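That predicate could be sketched like this (a minimal, self-contained version of the Rating class above; treating any non-empty scanned value as a withdrawal is an assumption):

```ruby
class Rating
  attr_accessor :image_filename, :criterion_1, :withdrew

  # True when the scanned Withdrew field contained any value at all.
  def withdrew?
    !withdrew.to_s.strip.empty?
  end
end

rating = Rating.new
rating.withdrew = ""   # empty field from the scanned form
rating.withdrew?       # → false
rating.withdrew = "Y"
rating.withdrew?       # → true
```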
It appears XmlSimple (by Maik Schmidt) is better suited for this task than the unreliable and inconsistent Hash.from_xml implementation.
It is a port of the tried and tested Perl module of the same name and has several notable advantages:
It is consistent whether you find one or many occurrences of a node.
It does not choke and garble the results.
It is able to distinguish between attributes and node content.
Running the same XML document above through the parser:
XmlSimple.xml_in xml
will produce the following result:
{"Record"=>
[{"Field"=>
[{"id"=>"ImageFilename", "type"=>"string", "length"=>"14", "Value"=>["00000022000000"]},
{"id"=>"DocID", "type"=>"string", "length"=>"15", "Value"=>["731192AIINSC"]},
{"id"=>"FormID", "type"=>"string", "length"=>"6", "Value"=>["AIINSC"]},
{"id"=>"Availability", "type"=>"string", "length"=>"18", "Value"=>["M T W H F S"]},
{"id"=>"Criterion_1", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_2", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_3", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_4", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_5", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_6", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_7", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_8", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_9", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_10", "type"=>"number", "length"=>"2", "Value"=>["3"]},
{"id"=>"Criterion_11", "type"=>"number", "length"=>"2", "Value"=>["0"]},
{"id"=>"Criterion_12", "type"=>"number", "length"=>"2", "Value"=>["0"]},
{"id"=>"Criterion_13", "type"=>"number", "length"=>"2", "Value"=>["0"]},
{"id"=>"Criterion_14", "type"=>"number", "length"=>"2", "Value"=>["0"]},
{"id"=>"Criterion_15", "type"=>"number", "length"=>"2", "Value"=>["0"]},
{"id"=>"DayTraining", "type"=>"string", "length"=>"1", "Value"=>["Y"]},
{"id"=>"SaturdayTraining", "type"=>"string", "length"=>"1"},
{"id"=>"CitizenStageID", "type"=>"string", "length"=>"12", "Value"=>["731192"]},
{"id"=>"NoShow", "type"=>"string", "length"=>"1"},
{"id"=>"NightTraining", "type"=>"string", "length"=>"1"},
{"id"=>"Withdrew", "type"=>"string", "length"=>"1"},
{"id"=>"JobStageID", "type"=>"string", "lth"=>"12", "Value"=>["2292"]},
{"id"=>"DirectHire", "type"=>"string", "length"=>"1"}]
}]
}
I am contemplating fixing the problem and providing Hash with a working implementation of from_xml, and I was hoping to find some feedback from others who have reached the same conclusion. Surely we are not the only ones with these frustrations.
In the meantime we may find solace in knowing there is something lighter than Nokogiri and its full kitchen sink for this task.
nJoy!

Solr search with non-standard ASCII characters

I index the following string, "Ordoñez", as:
text :lastname
It is then searched as:
User.solr_search do
keywords 'Ordonez'
end
This returns 0 results.
How can I index the string: Ordoñez using solr and get a match when the search is performed for
keywords 'Ordonez' or keywords 'Ordoñez'
I have tried the ASCIIFoldingFilter at index time but this did not do the job.
Here's what I did to try to make this work.
You probably need to add the handling on the container side as well.
You can check the Solr FAQ entry "Why don't International Characters Work".
My problem was having these 3 fields, which happen to be unused.
<field name="firstname_text" type="textgen" stored="false" multiValued="true" indexed="true"/>
<field name="lastname_text" type="textgen" stored="false" multiValued="true" indexed="true"/>
<field name="specialty_text" type="textgen" stored="false" multiValued="true" indexed="true"/>
Not too sure why, but as soon as I removed them, the ASCII filter started working.
The ASCIIFoldingFilterFactory does do the job.
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>
<filter class="solr.SynonymFilterFactory"/>
</analyzer>
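One common gotcha: folding has to happen at both index time and query time, or indexed "Ordonez" will never match a query for "Ordoñez" (and vice versa). A sketch of a field type applying the filter on both sides (the type name text_folded is illustrative, not from the original schema):

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```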
