My question is whether Twitter's id associated to each tweet is time ordered, i.e. ids of more recent tweets are bigger numbers.
For instance: this tweet
has an id of 623261510727561216, and was published at 12:41 AM - 21 Jul 2015
This other tweet
has an id of 623260219477524481, and was published at 12:36 AM - 21 Jul 2015. IDs difference 623261510727561216−623260219477524481 = 1291250036735, a positive difference for a positive time difference.
The only thing I want to ascertain from this is just an order, which tweet was published first.
Twitter IDs are time ordered. According to the Twitter documentation, the full ID is composed of a timestamp, a worker number, and a sequence number. The timestamp occupies the most significant bits, so sorting by ID also sorts by time. (I am not sure how many bits Twitter uses for the timestamp.)
Extracting timestamp from a tweet ID
Tweet IDs are k-sorted, that is, roughly time ordered to within about a second. We can extract the timestamp from a tweet ID by right shifting the ID by 22 bits and adding the Twitter epoch of 1288834974657 (milliseconds since the Unix epoch).
Python code to get UTC timestamp of a tweet ID
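from datetime import datetime, timezone

def get_tweet_timestamp(tid):
    offset = 1288834974657  # Twitter epoch, in milliseconds
    tstamp = (tid >> 22) + offset  # milliseconds since the Unix epoch
    # Timezone-aware UTC datetime (utcfromtimestamp is deprecated in recent Python)
    utcdttime = datetime.fromtimestamp(tstamp / 1000, tz=timezone.utc)
    print(str(tid) + " : " + str(tstamp) + " => " + str(utcdttime))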
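To tie this back to the two tweets in the question, here is a minimal sketch that splits an ID into its parts and compares the timestamps. The 10-bit machine field and 12-bit sequence field are assumptions based on the published Snowflake layout (they are consistent with the 22-bit shift above), and `split_tweet_id` is a hypothetical helper name.

```python
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # offset quoted in the answer above


def split_tweet_id(tid):
    """Return (utc_datetime, machine_bits, sequence_bits) for a tweet ID."""
    ms = (tid >> 22) + TWITTER_EPOCH_MS   # high bits: milliseconds
    machine = (tid >> 12) & 0x3FF         # assumed: next 10 bits
    sequence = tid & 0xFFF                # assumed: low 12 bits
    when = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    return when, machine, sequence


newer = 623261510727561216  # the "12:41 AM" tweet from the question
older = 623260219477524481  # the "12:36 AM" tweet from the question

t_new, _, _ = split_tweet_id(newer)
t_old, _, _ = split_tweet_id(older)

print(t_new > t_old)   # True: the larger ID is the later tweet
print(t_new - t_old)   # about five minutes, matching the displayed times
```

Since only the order matters for the question, comparing the raw IDs is enough. Decoding the timestamp is useful only when the actual time is needed.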
I need help with a project I am working on in Tableau. I have a CSV file that I can load into Tableau, and I need to calculate the attrition rate by attributes such as gender and age. Could someone please help me with that? I have been trying for hours without any success.
Below is a sample of what the dataset looks like:
Employee ID | date hired | termination date | age | gender | length of service | status     | job title
12          | 02/21/2018 | 04/29/2022       | 38  | F      | 4                 | Terminated | auditor
17          | 08/28/1989 | 01/01/2023       | 52  | M      | 32                | Active     | CEO
41          | 04/21/2013 | 10/21/2014       | 21  | M      | 1                 | Terminated | Cashier
Hi everyone,
I want to insert the current timestamp in Google Sheets. I'm using the keyboard shortcut Ctrl + Shift + ; to insert the current time. The time is correct, but when I change the cell format to date and time, the date is wrong: it shows 30 Dec 1899, as in the screenshot above.
There is another shortcut, Ctrl + Alt + Shift + ;, which inserts the correct date and time, but I would rather not use it because I don't want the output to show the date. Is there any way to change the date from 30 Dec 1899 to the current date when I'm using Ctrl + Shift + ;?
Any help will be greatly appreciated!
Google Sheets stores dates and times as serial numbers: the integer part counts the days elapsed since 30th December 1899 (serial number 0), and the fractional part is the time of day. 1st January 1900 is two days later, at serial number 2. If only the time is known, the integer part is 0, which displays as 30th December 1899.
In the given example you create a value from the current time only, so the date part is left at zero. If you then format that value as a full date and time, the date shows as 30th December 1899.
In conclusion, use Ctrl + Alt + Shift + ; if you want to work with both dates and times, or Ctrl + Shift + ; if you only need times (but keep in mind that the value has no real date attached and will show 30th December 1899 if formatted as a date). Additionally, you could use Apps Script to write Sheets macros that create datetime values fitting your precise needs. Don't hesitate to leave a comment if you need further explanation.
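The serial-number behaviour can be illustrated outside Sheets. The following is a minimal Python sketch, assuming serial number 0 corresponds to 30 December 1899; `serial_to_datetime` is a hypothetical helper, not a Sheets or Apps Script function.

```python
from datetime import datetime, timedelta

SHEETS_EPOCH = datetime(1899, 12, 30)  # assumed: serial number 0


def serial_to_datetime(serial):
    """Convert a spreadsheet serial number (days, possibly fractional)."""
    return SHEETS_EPOCH + timedelta(days=serial)


# A time-only value is just a fraction of a day, so its date part is day 0.
print(serial_to_datetime(0.5))  # 1899-12-30 12:00:00
print(serial_to_datetime(2))    # 1900-01-01 00:00:00
```

A cell filled by Ctrl + Shift + ; behaves like the first case: the time is right, and the date is whatever day 0 happens to be.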
I have created a hypertable water_meter to store the sensor data
It contains following data ordered by timestamp in ascending order
select * from water_meter order by time_stamp;
As can be seen I have data starting from 01 May 2020
if I use time_bucket() function to get aggregates per 1 day as:
SELECT
    time_bucket('1 days', time_stamp) AS bucket,
    thing_key,
    avg(pulsel) AS avg_pulse_l,
    avg(pulseh) AS avg_pulse_h
FROM
    water_meter
GROUP BY thing_key, bucket;
It works fine and I get below data:
Now if I use it to get 15 days aggregates, I get unexpected results where the starting time bucket is shown for 17 April 2020, for which there was no data in the table
SELECT
    time_bucket('15 days', time_stamp) AS bucket,
    thing_key,
    avg(pulsel) AS avg_pulse_l,
    avg(pulseh) AS avg_pulse_h
FROM
    water_meter
GROUP BY thing_key, bucket;
The time_bucket function assigns each timestamp to a bucket that has an implied range. For example, a 15 minute bucket might appear as '2021-01-01 01:15:00.000+00', but it contains timestamps in the range ['2021-01-01 01:15:00', '2021-01-01 01:30:00'), inclusive on the left and exclusive on the right. The same thing happens for days. Buckets are aligned to a fixed origin rather than to your first row (for day-based intervals the default origin is 2000-01-03), so the 15 day bucket that contains 1 May 2020 happens to start on 17 April and covers the range ["2020-04-17 00:00:00+00", "2020-05-02 00:00:00+00").
You can use the experimental function in the TimescaleDB Toolkit extension to get these ranges:
SELECT toolkit_experimental.time_bucket_range('15 days'::interval, '2020-05-01');
You can also use the offset or origin parameters of the time_bucket function to modify the start:
SELECT time_bucket('15 days'::interval, '2020-05-01', origin => '2020-05-01');
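The 17 April start can be reproduced with plain date arithmetic. This is a rough sketch that mimics the alignment rule only, assuming the default origin for day-based buckets is 2000-01-03; `bucket_start` is a hypothetical helper, not part of TimescaleDB.

```python
from datetime import date, timedelta

DEFAULT_ORIGIN = date(2000, 1, 3)  # assumed default origin for day buckets


def bucket_start(day, width_days, origin=DEFAULT_ORIGIN):
    """Start of the fixed-width bucket containing `day`, aligned to `origin`."""
    offset = (day - origin).days % width_days
    return day - timedelta(days=offset)


first_row = date(2020, 5, 1)
print(bucket_start(first_row, 15))                    # 2020-04-17
print(bucket_start(first_row, 15, origin=first_row))  # 2020-05-01
print(bucket_start(first_row, 1))                     # 2020-05-01
```

The 1 day buckets look "right" only because every day boundary is a bucket boundary. With 15 days, the boundaries fall wherever the origin puts them unless an origin is passed explicitly.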
How would I return an aggregated sum of customers' hourly data in a column representing each hour of the day? That question might be a bit vague, so I'll set the context...
I have a data set of Customers, each Customer has a meter, that meter is read every hour (think of an electrical meter for your house). Customers 1-X are assigned to Group 1, Customers Y-Z are assigned to Group 2, etc. I have setup a time tree in Neo4j (Year-->Month-->Day-->Hour) and the hourly meter reads are a separate node with an edge to the appropriate Hour (and an edge to the Customer). I need to return a report that sums up all of the hourly meter reads for all Customers in each Group (by Group), but each hour needs to be a separate column, like this:
GroupName Date H1Sum H2Sum H3Sum…H24Sum
The following query returns the correct format of the report but only for the first hour. How would I create an additional 23 columns of data representing hours 2-24?
MATCH (Group:LMRGroup)<-[:PART_OF_GROUP]-(SubGroup:SubLMRGroup)<-[:PART_OF_GROUP]-(c:Customer)-[:HAS_METER_READ]->(HrlyMR:HourlyMeterRead)-[:METER_READ]->(hr:Hour {hour:1})<-[:HAS_HOUR]-(d:Day {day:5})<-[:HAS_DAY]-(m:Month {month:3})<-[:HAS_MONTH]-(y:Year {year:2015})
RETURN
Group.Name as GroupName,
m.month + '-' + d.day + '-' + y.year as Date,
sum(HrlyMR.Reading) as HE1
Thanks for the help and my apologies if this is still a confusing question.
That's quite a query. I've broken it up to clarify.
MATCH (Group:LMRGroup)<-[:PART_OF_GROUP]-
      (SubGroup:SubLMRGroup)<-[:PART_OF_GROUP]-
      (c:Customer)-[:HAS_METER_READ]->
      (HrlyMR:HourlyMeterRead)-[:METER_READ]->
      (hr:Hour)<-[:HAS_HOUR]-
      (d:Day {day:5})<-[:HAS_DAY]-
      (m:Month {month:3})<-[:HAS_MONTH]-
      (y:Year {year:2015})
RETURN
    distinct(hr.hour) as Hour,
    Group.Name as GroupName,
    m.month + '-' + d.day + '-' + y.year as Date,
    sum(HrlyMR.Reading) as HE1;
I removed your property filter on (hr:Hour {hour: 1}) so that the pattern matches all hours. Then you just return hr.hour alongside the other columns, which groups the records by hour 1, hour 2, hour 3, and so on. The sum aggregates per hour automatically, because Cypher groups by the non-aggregated columns in the RETURN clause; there is no need for a GROUP BY statement as you would use in SQL. Note that this gives one row per hour rather than 24 columns.
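If the report really needs one column per hour, the per-hour rows can be pivoted on the client side. Below is a minimal sketch in Python; it assumes the rows have already been fetched as (group name, date, hour, total) tuples, and `pivot_hourly` and the sample values are made up for illustration.

```python
from collections import defaultdict


def pivot_hourly(rows):
    """Turn (group_name, date, hour, total) rows into one record per group/date.

    Each record has GroupName, Date and H1Sum..H24Sum; missing hours stay 0.
    """
    table = defaultdict(lambda: [0] * 24)
    for group_name, day, hour, total in rows:
        table[(group_name, day)][hour - 1] += total
    return [
        {"GroupName": g, "Date": d,
         **{f"H{h}Sum": sums[h - 1] for h in range(1, 25)}}
        for (g, d), sums in sorted(table.items())
    ]


# Made-up sample rows, not real query output.
sample = [
    ("Group 1", "3-5-2015", 1, 10.0),
    ("Group 1", "3-5-2015", 2, 12.5),
    ("Group 2", "3-5-2015", 1, 7.0),
]
report = pivot_hourly(sample)
print(report[0]["GroupName"], report[0]["H1Sum"], report[0]["H2Sum"])
```

Keeping the query row-oriented and pivoting afterwards avoids writing 24 nearly identical sums in the query itself.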
I am writing a Google Calendar sidebar gadget to keep track of the total hours per event tag (as determined in details of the event i.e. "tags: work").
Users can change the current week, month, day they are viewing in the calendar and I want to be able to count up the hours pertaining to their current view.
I don't see anywhere in the gadget API (or any other Google Calendar API) that allows gadgets to access the currently displayed view. I have noticed that the URL has an anchor tag that looks like
g|week-2+23127+23137+23131
which corresponds to viewing Monday Feb. 23, 2015 - Sunday March 1, 2015 in week mode.
I have also noticed the following relationships:
23127 is the first day in the view
23137 is the last day in the view
23131 is the day selected in the month view (on the left of the calendar)
If there is a way to get the currently displayed view using the API, that would be ideal but I would settle for parsing the anchor tag. Unfortunately I cannot decipher how the numbers work.
Google API
The currently displayed date range can be accessed using the following call:
google.calendar.subscribeToDates(function(d) {
// do something
});
where d is a Google date range whose d.startTime and d.endTime are the beginning and end of the displayed view.
Numbers
The numbers in the URL do not correspond directly to epoch date and time. Rather, each year has 512 days associated with it and each month has 32 days. For example, February has 28 days regularly but every leap year it has 29. The calendar never has to adjust for this since it simply allots each month 32 days and comes out with a nice even number every time.
A careful examination of the date ranges displayed will also show you that if you subtract the number for December 31 from January 1 you get 130. Accounting for the beginning and the end (don't count December 31 and January 1) will give you 128.
12 * 32 + 128 = 512 -- 12 months a year, 32 days a month and a 128 gap per year
Also, January 1, 1970 has the associated number 33 rather than 0, because months and days are counted from 1 (1 * 32 + 1 = 33), so account for that in your calculations when determining dates.
This wouldn't fit in the comments, but here's how the encoding works:
The encoding scheme makes it easy to find the day/month/year from the number.
Take 23131 which yields Feb 27, 2015 (from the example in your question).
Divide by 512 and add 1970 (epoch) for the year.
23131 / 512 = 45.xxx => 45 + 1970 = 2015.
Get the remainder of that division and divide by 32 to find the month.
23131 mod 512 = 91, and 91 / 32 = 2.xxx => month 2 = February
Get the remainder of that division and it's the day.
91 mod 32 = 27
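The steps above can be sketched in Python as follows. The function names are hypothetical, and the scheme is only what was inferred from the URLs in this thread, not a documented Google format.

```python
from datetime import date


def decode_day_number(n):
    """Decode a calendar URL number: 512 slots per year, 32 per month."""
    year = n // 512 + 1970
    month, day = divmod(n % 512, 32)
    return date(year, month, day)


def encode_day_number(d):
    """Inverse of decode_day_number."""
    return (d.year - 1970) * 512 + d.month * 32 + d.day


for n in (23127, 23137, 23131):
    print(n, decode_day_number(n))
# 23127 2015-02-23
# 23137 2015-03-01
# 23131 2015-02-27

print(encode_day_number(date(1970, 1, 1)))  # 33
```

The three numbers from the question decode to the first day of the view, the last day of the view, and the selected day, which matches the week of Feb. 23 to March 1, 2015.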