Joining multiple DataFrames in pandas with overlapping column names?

I have multiple (more than 2) dataframes I would like to merge. They all share the same value column:
In [431]: [x.head() for x in data]
Out[431]:
[ AvgStatisticData
DateTime
2012-10-14 14:00:00 39.335996
2012-10-14 15:00:00 40.210110
2012-10-14 16:00:00 48.282816
2012-10-14 17:00:00 40.593039
2012-10-14 18:00:00 40.952014,
AvgStatisticData
DateTime
2012-10-14 14:00:00 47.854712
2012-10-14 15:00:00 55.041512
2012-10-14 16:00:00 55.488026
2012-10-14 17:00:00 51.688483
2012-10-14 18:00:00 57.916672,
AvgStatisticData
DateTime
2012-10-14 14:00:00 54.171233
2012-10-14 15:00:00 48.718387
2012-10-14 16:00:00 59.978616
2012-10-14 17:00:00 50.984514
2012-10-14 18:00:00 54.924745,
AvgStatisticData
DateTime
2012-10-14 14:00:00 65.813114
2012-10-14 15:00:00 71.397868
2012-10-14 16:00:00 76.213973
2012-10-14 17:00:00 72.729002
2012-10-14 18:00:00 73.196415,
....etc
I read that join can handle multiple dataframes, however I get:
In [432]: data[0].join(data[1:])
...
Exception: Indexes have overlapping values: ['AvgStatisticData']
I have tried passing rsuffix=["%i" % (i) for i in range(len(data))] to join and still get the same error. I can work around this by building my data list so that the column names don't overlap, but maybe there is a better way?

In [65]: pd.concat(data, axis=1)
Out[65]:
AvgStatisticData AvgStatisticData AvgStatisticData AvgStatisticData
2012-10-14 14:00:00 39.335996 47.854712 54.171233 65.813114
2012-10-14 15:00:00 40.210110 55.041512 48.718387 71.397868
2012-10-14 16:00:00 48.282816 55.488026 59.978616 76.213973
2012-10-14 17:00:00 40.593039 51.688483 50.984514 72.729002
2012-10-14 18:00:00 40.952014 57.916672 54.924745 73.196415
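The concat call above leaves four columns with the same name, which makes selecting a single one awkward. A minimal sketch of one way to get unique names, assuming the frames share an index as in the question: pass keys= so each frame gets its own outer column level, then flatten. The small `data` list and the `_0`, `_1` naming are illustrative stand-ins, not the asker's real data.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in `data`: same index, same column name.
index = pd.to_datetime(
    ["2012-10-14 14:00", "2012-10-14 15:00", "2012-10-14 16:00"]
)
index.name = "DateTime"
data = [
    pd.DataFrame({"AvgStatisticData": [39.3, 40.2, 48.3]}, index=index),
    pd.DataFrame({"AvgStatisticData": [47.9, 55.0, 55.5]}, index=index),
    pd.DataFrame({"AvgStatisticData": [54.2, 48.7, 60.0]}, index=index),
]

# keys= adds an outer column level, so every column gets a unique label.
combined = pd.concat(data, axis=1, keys=list(range(len(data))))

# Flatten the two-level columns into names such as "AvgStatisticData_0".
combined.columns = [f"{name}_{key}" for key, name in combined.columns]
print(combined)
```

Keeping the two-level columns instead of flattening them is also an option if you prefer selecting with combined[0] or combined[1].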

I would try pandas.merge using the suffixes= option.
import pandas as pd
import datetime as dt
df_1 = pd.DataFrame({'x' : [dt.datetime(2012,10,21) + dt.timedelta(n) for n in range(10)], 'y' : range(10)})
df_2 = pd.DataFrame({'x' : [dt.datetime(2012,10,21) + dt.timedelta(n) for n in range(10)], 'y' : range(10)})
df = pd.merge(df_1, df_2, on='x', suffixes=['_1', '_2'])
I am interested to see if the experts have a more algorithmic approach to merge a list of data frames.
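One possible way to extend the two-frame merge to a whole list is to rename the non-key columns first and then fold the list with functools.reduce. This is only a sketch: `merge_all` is a hypothetical helper, not a pandas function, and the numbered suffixes are an arbitrary choice.

```python
from functools import reduce
import datetime as dt

import pandas as pd


def merge_all(frames, on):
    """Merge a list of frames on one key column, numbering the other columns."""
    renamed = [
        # Give every non-key column a per-frame suffix so names never collide.
        frame.rename(columns={c: f"{c}_{i}" for c in frame.columns if c != on})
        for i, frame in enumerate(frames, start=1)
    ]
    # Fold the list pairwise: ((f1 merge f2) merge f3) ...
    return reduce(lambda left, right: pd.merge(left, right, on=on), renamed)


days = [dt.datetime(2012, 10, 21) + dt.timedelta(n) for n in range(3)]
frames = [pd.DataFrame({"x": days, "y": range(3)}) for _ in range(3)]

merged = merge_all(frames, on="x")
print(merged.columns.tolist())  # ['x', 'y_1', 'y_2', 'y_3']
```

Renaming up front avoids relying on suffixes=, which pd.merge only applies to names that clash in that particular pairwise merge.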

Related

dateFromString returns nil for some values

I am getting nil for some values while using dateFromString in Swift. I searched a lot, but in vain.
Following is my code:
let strDate = self.sortedDict.valueForKey("TIME").objectAtIndex(indexPath.row).objectAtIndex(0) as? String
print(strDate);
let st_date = frmt.dateFromString(strDate!)
let frmt1:NSDateFormatter = NSDateFormatter()
frmt1.locale = NSLocale(localeIdentifier: localeStr)
frmt1.dateFormat = "MMM, dd yyyy hh:mm a"
if st_date != nil {
print(st_date)
}
Output console:
Optional("September, 20 2015 10:00:00")
Optional(2015-09-20 10:00:00 +0000)
Optional("October, 04 2015 10:00:00")
Optional(2015-10-04 10:00:00 +0000)
Optional("October, 04 2015 14:00:00") // nil
Optional("October, 18 2015 15:00:00") // nil
Optional("September, 20 2015 14:00:00") // nil
Optional("September, 27 2015 10:00:00")
Optional(2015-09-27 10:00:00 +0000)
Optional("September, 27 2015 12:00:00")
Optional(2015-09-27 00:00:00 +0000)
Optional("September, 27 2015 14:00:00")
Optional("October, 03 2015 14:00:00") //nil
Optional("October, 03 2015 16:00:00") //nil
The format is the same for all date strings, yet I still get nil for a few values. Why is that? Where am I going wrong?
The format should use HH (24-hour clock) instead of hh (12-hour clock). hh only accepts hours from 1 to 12, which is why values such as 14:00 return nil:
frmt1.dateFormat = "MMM, dd yyyy HH:mm a"
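The same 12-hour versus 24-hour distinction can be illustrated outside of NSDateFormatter. The sketch below is a Python analogue, not the Swift API: %I plays the role of hh and %H the role of HH, and `try_parse` is a hypothetical helper that returns None where dateFromString would return nil.

```python
from datetime import datetime


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the text does not fit the format."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


sample = "October, 04 2015 14:00:00"

# %I is the 12-hour field (01-12), so an hour of 14 cannot match.
twelve_hour = try_parse(sample, "%B, %d %Y %I:%M:%S")

# %H is the 24-hour field (00-23), so the same string parses.
twenty_four_hour = try_parse(sample, "%B, %d %Y %H:%M:%S")

print(twelve_hour)       # None
print(twenty_four_hour)  # 2015-10-04 14:00:00
```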

IOS Components From Date Changes Timezone

I am using
NSDateComponents *components = [CURRENT_CALENDAR components:DATE_COMPONENTS fromDate:_startDate];
[components setHour:0];
[components setMinute:0];
[components setSecond:0];
_date = [CURRENT_CALENDAR dateFromComponents:components];
With dates I receive from an API.
The _date returns 2 different outputs depending on the calendar day:
2016-01-04 05:00:00 +0000
2015-10-26 04:00:00 +0000
As if there was a change in time zone.
Is there a reason the time of _date changes from 5 to 4? Is there a way to prevent that?
The problem is that this unexpected one-hour offset carries over to all the other dates I create with dateFromComponents:components.
Output for different dates showing the offset
2016-01-04 05:00:00 +0000
2015-12-21 05:00:00 +0000
2015-12-14 05:00:00 +0000
2015-12-07 05:00:00 +0000
2015-11-23 05:00:00 +0000
2015-11-16 05:00:00 +0000
2015-11-09 05:00:00 +0000
2015-11-02 05:00:00 +0000
2015-10-26 04:00:00 +0000
2015-10-19 04:00:00 +0000
2015-10-22 04:00:00 +0000
2015-10-01 04:00:00 +0000
2015-09-24 04:00:00 +0000

Is the Time object suitable to create a calendar?

I want to make a database-backed calendar. Will the Time object make my life easier? It hasn't so far...
The .end_of_year method gives me some strange results. For contemporary dates it works flawlessly:
date = '2012-3-2'.to_time(:utc) #=> 2012-03-02 00:00:00 UTC
date.end_of_year #=> 2012-12-31 23:59:59 UTC
However, if you go back in time things get strange.
date = '1399-3-2'.to_time(:utc) #=> 1399-03-02 00:00:00 UTC
date.end_of_year #=> 1399-12-23 23:59:59 UTC
23rd of December? Shouldn't that be 31st?
It's not even consistent:
date = '0000-3-2'.to_time(:utc) #=> 0000-03-02 00:00:00 UTC
date.end_of_year #=> 0001-01-02 23:59:59 UTC
Um, the 2nd of January? OF THE NEXT YEAR? What is going on?
Also, are leap years taken into account by the object?
You could use DateTime instead:
date = '2012-3-2'.to_datetime #=> Fri, 02 Mar 2012 00:00:00 +0000
date.end_of_year #=> Mon, 31 Dec 2012 23:59:59 +0000
date = '1399-3-2'.to_datetime #=> Sun, 02 Mar 1399 00:00:00 +0000
date.end_of_year #=> Wed, 31 Dec 1399 23:59:59 +0000
date = '0000-3-2'.to_datetime #=> Tue, 02 Mar 0000 00:00:00 +0000
date.end_of_year #=> Fri, 31 Dec 0000 23:59:59 +0000
It's more accurate, and you can format the output.
I did some digging. Here's what I found.
Let's begin with end_of_year:
def end_of_year
  change(:month => 12).end_of_month
end
Which relies on change and end_of_month:
def end_of_month
  last_day = ::Time.days_in_month(month, year)
  last_hour{ days_since(last_day - day) }
end
The most interesting part is happening inside of days_since:
def days_since(days)
  advance(:days => days)
end
The advance method is a bit more complex:
def advance(options)
  unless options[:weeks].nil?
    options[:weeks], partial_weeks = options[:weeks].divmod(1)
    options[:days] = options.fetch(:days, 0) + 7 * partial_weeks
  end
  unless options[:days].nil?
    options[:days], partial_days = options[:days].divmod(1)
    options[:hours] = options.fetch(:hours, 0) + 24 * partial_days
  end
  d = to_date.advance(options)
  time_advanced_by_date = change(:year => d.year, :month => d.month, :day => d.day)
  seconds_to_advance = options.fetch(:seconds, 0) +
    options.fetch(:minutes, 0) * 60 +
    options.fetch(:hours, 0) * 3600
  if seconds_to_advance.zero?
    time_advanced_by_date
  else
    time_advanced_by_date.since(seconds_to_advance)
  end
end
And here is the culprit we're looking for:
# in rails console
time = '0000-01-01'.to_time(:utc) #=> 0000-01-01 00:00:00 UTC
time.advance(days: 1) #=> 0000-01-04 00:00:00 UTC
time.advance(days: 2) #=> 0000-01-05 00:00:00 UTC
time.advance(days: 3) #=> 0000-01-06 00:00:00 UTC
That's all for now. I will continue to dig.

Can not format pubDate to NSDate iOS

I can not figure out why NSDate continues to throw nil.
NSString *copyString = [[self.parseResults objectAtIndex:indexPath.row] objectForKey:@"date"];
NSDateFormatter *df = [[[NSDateFormatter alloc] init] autorelease];
[df setDateFormat:@"EEE, dd MMM yyyy HH:mm:ss zzz"];
NSDate *date = [df dateFromString:copyString];
NSLog(@"%@", copyString);
NSLog(@"%@", date);
Did I set the date format properly?
Output from copyString
2014-01-24 11:17:25.893 Events[32755:70b] Wed, 31 Dec 1969 16:00:00
PST
2014-01-24 11:17:25.895 Events[32755:70b] Fri, 24 Jan 2014 20:00:00
PST
2014-01-24 11:17:25.896 Events[32755:70b] Sat, 25 Jan 2014 10:00:00
PST
2014-01-24 11:17:25.897 Events[32755:70b] Mon, 27 Jan 2014 10:00:00
PST
2014-01-24 11:17:25.899 Events[32755:70b] Mon, 27 Jan 2014 12:15:00
PST
2014-01-24 11:17:25.900 Events[32755:70b] Mon, 27 Jan 2014 19:00:00
PST
Output from date
2014-01-24 11:22:24.707 Events[32827:70b] (null)
2014-01-24 11:22:24.709 Events[32827:70b] (null)
2014-01-24 11:22:24.710 Events[32827:70b] (null)
2014-01-24 11:22:24.712 Events[32827:70b] (null)
2014-01-24 11:22:24.713 Events[32827:70b] (null)
2014-01-24 11:22:24.714 Events[32827:70b] (null)

Sum date object and time object on RoR

sorry for my english...
I have two objects, one Date and one Time, and I want to add them to get a DateTime object. This is what I am trying:
[35] pry(main)> dateobj = Date.today
=> Sun, 06 Oct 2013
[36] pry(main)> timeobj = Time.parse("02:00:00")
=> 2013-10-06 02:00:00 -0600
[37] pry(main)> datetimeobj = dateobj + timeobj
TypeError: expected numeric
from /home/elquick/www/rails/vivsan/http/vendor/bundle/ruby/2.0.0/gems/activesupport-3.2.14/lib/active_support/core_ext/date/calculations.rb:90:in `+'
[38] pry(main)>
Some help?
Thanks!
Try to use
Date.today.to_datetime + Time.parse('02:00:00').seconds_since_midnight.seconds
Here is my result:
2.0.0-p195 :019 > Date.today.to_datetime
=> Mon, 07 Oct 2013 00:00:00 +0000
2.0.0-p195 :020 > Time.parse('02:00:00').seconds_since_midnight.seconds
=> 7200.0 seconds
2.0.0-p195 :021 > Date.today.to_datetime + Time.parse('02:00:00').seconds_since_midnight.seconds
=> Mon, 07 Oct 2013 02:00:00 +0000
I found this...
dateobj = Date.new(2013, 10, 6)
timeobj = Time.parse("02:00:00")
datetime = DateTime.new(dateobj.year, dateobj.month, dateobj.day, timeobj.hour, timeobj.min, timeobj.sec)
