How to handle Dates as long numbers - neo4j

I am trying to store dates on my nodes in the database. I am loading data using webadmin and the CSV importer. My problem is that the data is saved as a string and I need it to be a long. I have found some functions to cast to other types, like toInt(), but there is no equivalent for the long type.
I have a node that contains two date fields, ArrivalDate and DepartureDate. Each is a long number in the CSV file, but once the query is executed in Neo4j the field is stored as a string. The problem is that I cannot write a query to compare dates since they are strings. A sample query I want to run looks like this:
match (p:Person)--(s:Stay)
where s.ArrivalDate <= 634924360000000000 and s.DepartureDate >= 634924360000000000
return p
That way I would get all the people staying on that date.
I have done some research and also asked here before, but maybe the question was not explained well.
For reference: I am using webadmin to load CSV files for the bulk load, but my app is in C# and I am using Neo4jClient to work with the DB.

Neo4j 2.1 (which is about to be released) has a Cypher command, LOAD CSV. You can use the toInt function to convert a string to a numeric value during import. Example:
LOAD CSV WITH HEADERS FROM 'file:/mnt/teamcity-work/42cff4ac2707ec23/target/community/cypher/docs/cypher-docs/target/docs/dev/ql/load-csv/csv-files/file.csv'
AS line
CREATE (:Artist { name: line.Name, year: toInt(line.Year)})
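The value 634924360000000000 from the question looks like a .NET DateTime.Ticks count (100-nanosecond intervals since 0001-01-01), which would fit the C# client mentioned there. That reading is an assumption, not something the question states. Under that assumption, a minimal Python sketch for checking that such a value fits in a signed 64-bit integer, and for seeing which date it represents, could look like this (the function names are made up for illustration):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the CSV values are .NET DateTime.Ticks
# (100-nanosecond intervals since 0001-01-01T00:00:00).
TICKS_AT_UNIX_EPOCH = 621355968000000000  # ticks at 1970-01-01T00:00:00Z
TICKS_PER_MILLISECOND = 10_000
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_in_int64(value: int) -> bool:
    """Return True if the value fits in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


def ticks_to_epoch_millis(ticks: int) -> int:
    """Convert .NET ticks to milliseconds since the Unix epoch."""
    return (ticks - TICKS_AT_UNIX_EPOCH) // TICKS_PER_MILLISECOND


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert .NET ticks to a timezone-aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ticks_to_epoch_millis(ticks))


if __name__ == "__main__":
    ticks = int("634924360000000000")  # a string, as read from the CSV
    print(fits_in_int64(ticks))
    print(ticks_to_epoch_millis(ticks))
    print(ticks_to_datetime(ticks).isoformat())
```

If the values do fit in 64 bits, converting them to integers at import time means the range comparison in the question is numeric rather than string-based.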

Related

how to change the format of a field when using parse to select fields in sumologic

I am totally new to the Sumo Logic platform. I am trying to select fields from the log data. The simple code is:
| parse "transactionNumber=*|" as transactionNumber
| parse "message=*|" as message
My transaction number is a very long number, such as 123456789987654321. So when I use 'Export (Display Fields)' to save the result to a CSV file, it is automatically converted to scientific notation, such as 123e+15.
So how can I change the format, say from number to character, so that I get the real numbers in the CSV?
I think the simple way is to save the file as txt instead of csv.
But this is not related to Sumo Logic programming, so I think this is not a very "decent" way.

How to apply a Hive schema to unstructured text?

I have a space-delimited text file representing some log data. For simplicity, the headers would be:
'date', 'time', 'query', 'host'
And a record would look like:
2001-01-01 01:02:04 irfjrifjWt.f=32&ydeyf myhost
A simple Hive table with space delimited fields will read this data correctly. However I want to do further parsing of the query string.
Within this text are tags that I want to parse into Hive columns.
Here’s a de-identified example of a couple of query strings:
ofifnmfiWT.s=12&ifmrinfnWT.df=hello’&oirjfirngirngWT.gh=32&iurenfur
ggfWT.gh=12&WT.ll=12&uyfer3d
Tags have the format WT.xx, followed by an =, followed by the value of the tag, followed by an &.
The order of the tags and the presence of each tag varies from record to record. The only thing I could define ahead is a set of tags I want to parse. In the example above, let’s say I’m interested in tags [WT.gh, WT.s]. Then (making up date time and host), my Hive table would look like:
Date        time      host     WT.s  WT.gh
2011-01-01  05:03:03  myhost1  12    32
2011-01-01  05:03:03  myhost1  NULL  12
I could easily parse the query string with Python and regex, and just create a second .txt file with the original record, plus a series of new values with the parsed tags, but that seems a waste of time and it doesn’t look like it is utilizing schema on read principles.
I might be wrong in my thinking, since I’m new to this, but I was wondering if there is a way to apply a schema on this data that would inherently do the parsing for me.
If not, what solution would you recommend?
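For comparison, the Python-and-regex preprocessing route mentioned above might look roughly like the sketch below. It is not a schema-on-read solution inside Hive; it only illustrates the tag extraction itself. The names parse_tags and parse_record are hypothetical, and the pattern assumes tag names are "WT." followed by letters or digits, with values running up to the next "&".

```python
import re
from typing import Dict, List, Optional

# Matches "WT.xx=value", where the value runs up to the next "&"
# (or to the end of the string).
TAG_PATTERN = re.compile(r"(WT\.[A-Za-z0-9]+)=([^&]*)")


def parse_tags(query: str, wanted: List[str]) -> Dict[str, Optional[str]]:
    """Return the value of each wanted tag, or None when the tag is absent."""
    found = dict(TAG_PATTERN.findall(query))
    return {tag: found.get(tag) for tag in wanted}


def parse_record(line: str, wanted: List[str]) -> List[Optional[str]]:
    """Split a space-delimited log line and append the wanted tag values."""
    date, time, query, host = line.split(" ")
    tags = parse_tags(query, wanted)
    return [date, time, host] + [tags[tag] for tag in wanted]


if __name__ == "__main__":
    record = "2011-01-01 05:03:03 ggfWT.gh=12&WT.ll=12&uyfer3d myhost1"
    print(parse_record(record, ["WT.s", "WT.gh"]))
```

Missing tags come back as None, which corresponds to the NULL cells in the table above.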

history attribute in neo4j

I was reading about Time-Based Versioned Graphs and came across the following example:
CREATE (s1:Shop{shop_id:1})
-[:STATE{from:1388534400000,to:9223372036854775807}]->
(ss1:ShopState{name:'General Store'})
My question: how do I calculate this date? from:1388534400000,to:9223372036854775807
Those two values are timestamps, which in Java are the number of milliseconds since the Epoch (1/1/1970) began. The second value is the maximum Long value, the end of Java time, a long way away.
There are ways in all languages for generating these values for specific dates (beware that some will be based on seconds); there is quite a handy list on this site.
If you are not working in any particular programming language and just want to enter queries then you can use an online date converter like this one.
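As one illustration of generating these values in code, here is a minimal Python sketch (the function names are made up for this example). It shows that 1388534400000 corresponds to 2014-01-01 00:00:00 UTC and that 9223372036854775807 is the largest signed 64-bit integer.

```python
from datetime import datetime, timedelta, timezone

# Largest signed 64-bit integer, the same value as Java's Long.MAX_VALUE.
LONG_MAX_VALUE = 2**63 - 1

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Milliseconds since 1970-01-01T00:00:00Z for a timezone-aware datetime."""
    delta = moment - EPOCH
    # Integer arithmetic avoids float rounding on large values.
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


def from_epoch_millis(millis: int) -> datetime:
    """Inverse conversion, returning a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


if __name__ == "__main__":
    print(to_epoch_millis(datetime(2014, 1, 1, tzinfo=timezone.utc)))
    print(from_epoch_millis(1388534400000).isoformat())
    print(LONG_MAX_VALUE)
```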
If you are working with dates relative to now, you can also calculate timestamps in Cypher using the timestamp() function:
CREATE (s1:Shop{shop_id:1})
-[:STATE{from:timestamp(),to:9223372036854775807}]->
(ss1:ShopState{name:'General Store'})
IIUC, to is just Long.MAX_VALUE, and from can be the result of either calling the timestamp() function in Cypher or setting the property to the value of System.currentTimeMillis() via the Java API.
Take a look at the example: http://console.neo4j.org/?id=43uoyt (Note that you can skip setting rel.to and use coalesce when querying instead).

Reading Excel formulae using Ruby

I'm trying to use the Spreadsheet gem to parse XLS files that store information about school courses. These XLS files are automatically generated, so I cannot change the presentation of data.
Course schedules are saved as a list of characters, with dashes representing days in which the class does not meet. An example would be "3--33--", meaning the class meets during block 3 on days 1, 4, and 5 in the rotation. Excel parses some of these schedules as formulae, meaning that I need to read the formula itself from certain cells.
The problem is that when I try to read the data from a formula cell, using cell.data, the result is a string like \r\x00\x1F\x00\x00\x00\x00\x00\xD0\x84\xC0\x1EB\x00\x04. I'm assuming that this is Ruby's attempt to print the data as ASCII text. After some research, I have learned that Excel stores formulae in RPN format.
In short: I'm not sure how to go about reading a formula (the formula itself, not the formula's calculated value) from an Excel spreadsheet. I can't change the input Excel spreadsheet, and having a purely Ruby solution would be nice, since I'm planning on using this with Rails.
A different approach is:
convert it to csv using xls2csv: http://linux.die.net/man/1/xls2csv
read it using the ruby standard lib: http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html
I hope this can help you.

Cypher java date query

I'm attempting to run a cypher query to return back nodes within a particular date range. When passing in the date object (Java) the query fails to return the correct nodes. I'm currently using the long date value (i.e. getTime()) which works as expected. This is great, but is there a way of just using the actual Date object?
Unfortunately, Neo4j has no support for storing actual date objects as properties. So, right now you have to pass in date.getTime() as a Cypher parameter.
We'll look into that.
It has been integrated in Neo4j 2.0, see timestamp.
