I have a table in BigQuery with a column named timestamp of type TIMESTAMP. When I display it on my console, I can see timestamps as follows: 2015-10-19 21:25:35 UTC
I then query my table using the BigQuery API, and when I display the result of the query, I notice that this timestamp has been converted into some kind of very large number in string format, like 1.445289935E9.
Any idea how I can convert it back to a normal time? Something I can use in my Ruby code?
Time.at("1.468768144014E9".to_f)
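For instance, applied to the value from the question (a quick sketch; it assumes the API hands back the epoch seconds as the scientific-notation string shown above):
epoch_string = "1.445289935E9"            # string returned by the BigQuery API
t = Time.at(epoch_string.to_f).utc        # seconds since the Unix epoch, as a Time in UTC
puts t.strftime("%Y-%m-%d %H:%M:%S %Z")   # prints 2015-10-19 21:25:35 UTC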
Related
I have a column from a CSV that stores a datetime. I need to be able to do some simple calculations on the seconds of that datetime field. I want to know how I can convert the datetime field into a UNIX epoch timestamp using a derived column.
In SQL I would usually just do something like this:
DATEDIFF(second,{d '1970-01-01'},myDateTime)
Which gives me the integer value.
However, I've tried the same in SSIS and get errors. Why is this not parsing correctly?
Have you had a look at the online SSIS documentation for DATEDIFF? There tend to be usable examples in those docs.
I have a database in which the timestamps are all in UTC, but for just this one database I need to convert any (and all) timestamp fields over to CST.
There are 200 tables, so I don't have a mapping of every table/field that needs to be updated. Is there a way to do this without using convert_timezone or dateadd on every query written?
The database instance is set to CST, but that database is in UTC.
You would need to write a stored procedure that reads the ACCOUNT_USAGE.COLUMNS view, identifies the columns that have a date or timestamp datatype, and then constructs SQL statements for each table that update the values using CONVERT_TIMEZONE.
Currently, we are getting a stream of data from a message queue, which contains multiple pieces of information. Among them are the created and updated timestamps of a certain event, in epoch format.
{"ip":"1.1.1.1","name":"abc.com","createtime":1500389719832,"updatetime":1500613413164 },{"ip":"1.1.1.2","name":"xyz.com","createtime":1500389719821,"updatetime":1500613413233}
Currently, my code consumes messages from the queue and pushes all the data to Neo4j in bulk. There would be thousands of rows like this, and each field in this data is stored in Neo4j as an individual property key.
When a user selects a date from the UI, my intention is to get all the "name" values from that specific date and display only those records in the UI. Since the user selects the date in MM/DD/YYYY format, what's the best option for comparing the user-selected date with "createtime", which is in epoch format?
My thinking is to convert "createtime" into a readable MM/DD/YYYY format and store only the date portion as a separate Neo4j property, maybe newCreateTime, but I am not sure how to convert only the createtime and updatetime fields from the entire stream of data. Can someone throw some light on this?
You can use the APOC function apoc.date.format to set the newCreateTime properties.
For example (assuming your data is stored in nodes with the Info label):
MATCH (i:Info)
SET i.newCreateTime = apoc.date.format(i.createTime, 'ms', 'MM/dd/yyyy');
I am building a Rails app where the user picks a date from a date picker and a time from a time picker. Both the date and the time have been formatted using Moment.js to show the date and time in the following way:
moment().format('LL'); //January 23,2017
moment().format('LTS'); //1:17:54 PM
I read this answer, which has guidelines about selecting a proper column type:
Is there documentation for the Rails column types?
Ideally, I should be using :date, :time or :timestamp for this. But since the dates are formatted, should I be using :string instead?
Which would be the correct and appropriate column type to use in this situation?
If you want to store a time reference in your database you should use one of the types the database offers you. I'll explain this using MySQL (which is the one I have used the most) but the explanation should be similar in other database servers.
If you use a timestamp column you will be using just 4 bytes of storage, which is always good news, since it makes for smaller indexes, uses less memory in temporary tables during internal database operations, and so on. However, timestamp has a smaller range than datetime, so you will only be able to store values from 1970 up to 2038, more or less.
If you use datetime you will be able to store a wider range (from year 1000 to year 9999) with the same precision (one second). The downside is that the wider range needs more storage, making it a bit slower.
There are some other differences between these two column types that don't fit in this answer but that you should keep an eye on before deciding.
If you use varchar, which is the default column type for text attributes in Ruby on Rails, you will be forced to convert from text to datetime and vice-versa every time you need to use that field. In addition, ordering or filtering on that column will be very inefficient because the database will need to convert all strings into dates before filtering or sorting, making it impossible to use indexes on that column.
If you need sub-second precision, you can use a bigint column to meet your requirements, as MySQL versions before 5.6.4 do not provide a date-specific type with fractional-second support (newer versions accept, for example, datetime(3) for millisecond precision).
In general, I recommend using timestamp if your application's requirements fit within the timestamp range limitation. Otherwise, use datetime, but I strongly discourage you from using varchar for this purpose.
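To put that in Rails terms, a minimal migration sketch (the events table and starts_at column are hypothetical names, and the Rails 5 migration superclass is an assumption):
class CreateEvents < ActiveRecord::Migration[5.0]
  def change
    create_table :events do |t|
      # :datetime maps to MySQL's DATETIME; :timestamp is an option if the
      # 1970-2038 range is enough and you want the smaller 4-byte column
      t.datetime :starts_at
      t.timestamps
    end
  end
end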
EDIT: Formatting
The way you store dates in the database is completely different from the way you display them to the user. You can create a DateTime object using DateTime.new(year, month, day, hour, minute, second) and assign that object to your model. When you save it, ActiveRecord will be in charge of converting the DateTime object into the appropriate database format.
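For example (a sketch reusing the date and time from the question; Event and starts_at are hypothetical names):
# January 23, 2017 1:17:54 PM, built from the parts the user picked
dt = DateTime.new(2017, 1, 23, 13, 17, 54)
event = Event.new(starts_at: dt)
event.save!   # ActiveRecord converts dt to the column's database format on save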
In order to display a value that is already stored in the database in a specific format (in a view, an API response, etc.) you can have a look at other posts like this one.
You can have a timestamp column in your database, and then parse the request into a Ruby datetime object like this:
require 'time'

d = Time.parse(params[:date])   # e.g. "January 23,2017"
t = Time.parse(params[:time])   # e.g. "1:17:54 PM"
dt = DateTime.new(d.year, d.month, d.day, t.hour, t.min, t.sec, t.zone)
# now simply write dt to your datetime column
On Postgres you can save a Ruby DateTime object straight into a timestamp field, e.g.:
User.first.update_attribute('updated_at', dt )
Another option is to concatenate your date and time strings into one, and then you can do a one-liner:
User.last.update_attribute('created_at', Time.parse('January 23,2017 1:17:54 PM'))
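With the two picker values coming in as request params (a sketch; it assumes params[:date] and params[:time] arrive in the Moment.js formats shown in the question):
require 'time'

combined = "#{params[:date]} #{params[:time]}"   # "January 23,2017 1:17:54 PM"
User.last.update_attribute('created_at', Time.parse(combined))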
I'm pretty sure this will work on MySQL datetime or timestamp as well.
Credit to david grayson Ruby: combine Date and Time objects into a DateTime
I have some eventlog data in HDFS that, in its raw format, looks like this:
2015-11-05 19:36:25.764 INFO [...etc...]
An external table points to this HDFS location:
CREATE EXTERNAL TABLE `log_stage`(
`event_time` timestamp,
[...])
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
For performance, we'd like to query this in Impala. The log_stage data is inserted into a Hive/Impala Parquet-backed table by executing a Hive query: INSERT INTO TABLE log SELECT * FROM log_stage. Here's the DDL for the Parquet table:
CREATE TABLE `log`(
`event_time` timestamp,
[...])
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
The problem: when queried in Impala, the timestamps are 7 hours ahead:
Hive time: 2015-11-05 19:36:25.764
Impala time: 2015-11-06 02:36:25.764
> as.POSIXct("2015-11-06 02:36:25") - as.POSIXct("2015-11-05 19:36:25")
Time difference of 7 hours
Note: The timezone of the servers (from /etc/sysconfig/clock) is set to "America/Denver" on all of them, which is currently 7 hours behind UTC.
It seems that Impala is taking events that are already in UTC, incorrectly assuming they're in America/Denver time, and adding another 7 hours.
Do you know how to sync the times so that the Impala table matches the Hive table?
Hive writes timestamps to Parquet differently than Impala does. You can use the impalad flag -convert_legacy_hive_parquet_utc_timestamps to tell Impala to do the conversion on read. See the TIMESTAMP documentation for more details.
This blog post has a brief description of the issue:
When Hive stores a timestamp value in Parquet format, it converts local time into UTC time, and when it reads the data out, it converts back to local time. Impala, on the other hand, does no conversion when it reads the timestamp field, so UTC time is returned instead of local time.
The impalad flag tells Impala to do the conversion when reading timestamps in Parquet files produced by Hive. The conversion incurs a small cost on read, so if that is an issue for you, consider writing your timestamps with Impala instead (though the cost is likely minimal).
On a related note, as of Hive v1.2, you can also disable the timezone conversion behaviour with this flag:
hive.parquet.timestamp.skip.conversion
"Current Hive implementation of parquet stores timestamps to UTC, this flag allows skipping of the conversion on reading parquet files from other tools."
This was added as part of https://issues.apache.org/jira/browse/HIVE-9482
Lastly, not timezone-related exactly, but for compatibility between Spark (v1.3 and up) and Impala on Parquet files, there is this flag:
spark.sql.parquet.int96AsTimestamp
https://spark.apache.org/docs/1.3.1/sql-programming-guide.html#configuration
Other: https://issues.apache.org/jira/browse/SPARK-12297
Be VERY careful with the answers above due to https://issues.apache.org/jira/browse/IMPALA-2716
For now, the best workaround is to avoid the TIMESTAMP data type and store timestamps as strings.
As mentioned in
https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_timestamp.html
you can use --use_local_tz_for_unix_timestamp_conversions=true and --convert_legacy_hive_parquet_utc_timestamps=true to match Hive's results.
The first flag ensures conversion to the local timezone when you use any datetime function. You can set both as Impala daemon startup options, as described in this document:
https://docs.cloudera.com/documentation/enterprise/5-6-x/topics/impala_config_options.html