I'm building a Rails app where I want to download historical financial data. I've found this URL that I can use:
Yahoo Finance API - historical
but I haven't found any way to download historical data for multiple symbols at once. The only thing I have found is how to download multiple quotes, like so:
Yahoo Finance API - quotes
Is there a way to download historical data for several symbols simultaneously?
(The reason I ask is that I want to load the data into an SQLite database and use it in my app. Of course I could download the data individually, stock by stock, but that would be quite tedious.
Now, I've found this Ruby script on the internet:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'sqlite3'

START_DATE = ['01', '01', '2014']
END_DATE   = ['01', '05', '2014']
YURL = "http://ichart.finance.yahoo.com/table.csv?a=#{START_DATE[0]}&b=#{START_DATE[1]}&c=#{START_DATE[2]}&d=#{END_DATE[0]}&e=#{END_DATE[1]}&f=#{END_DATE[2]}&g=d&ignore=.csv&s="
DBNAME = "data-hold/sp500-data.sqlite"
DB = SQLite3::Database.new(DBNAME)
SUBDIR = 'data-hold/yahoo-data'

Dir.mkdir(SUBDIR) unless File.exists?(SUBDIR)

DB.execute("SELECT DISTINCT ticker_symbol from companies").each do |sym|
  fname = "#{SUBDIR}/#{sym}.csv"
  unless File.exists?(fname)
    puts fname
    d = open("#{YURL}#{sym}")
    File.open(fname, 'w') do |ofile|
      ofile.write(d.read)
      sleep(1.5 + rand)
    end
  end
end
but when I run it, Rails throws an error:
bad URI (is not URI?):
So my question is basically: What is the best way to solve the problem?)
Most financial data providers limit historical downloads to one ticker per API call. You can imagine that returning multiple time series in the same JSON output would be confusing and would put a heavy load on the servers.
There is a Ruby wrapper for Intrinio's API on GitHub that will make it easier to get historical time series data.
This will pull Apple's price history:
curl "https://api.intrinio.com/prices?ticker=AAPL" -u "APIusername:APIpassword"
This will pull the current price as a single data point, for up to 150 stocks at once:
curl "https://api.intrinio.com/data_point?ticker=AAPL,MSFT,T,XOM&item=last_price" -u "APIusername:APIpassword"
You will, of course, need to substitute your own API credentials in the curl calls; using the GitHub wrapper makes that easy. The API username and password are free.
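If you would rather make the same call from Ruby without the wrapper, here is a minimal sketch using Net::HTTP with basic auth (the credentials are placeholders):

require 'net/http'
require 'json'

uri = URI('https://api.intrinio.com/prices?ticker=AAPL')
request = Net::HTTP::Get.new(uri)
request.basic_auth('APIusername', 'APIpassword')  # replace with your own keys

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => true) do |http|
  http.request(request)
end

prices = JSON.parse(response.body)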
Yahoo historical data does not support downloading more than one symbol at a time; each URL is unique per symbol.
Given Yahoo's limitations, I would not recommend downloading with more than a single thread.
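For what it's worth, here is a minimal single-threaded sketch (reusing the YURL constant from the script above). Note that sqlite3's execute yields each row as an array, so the symbol has to be unpacked before being interpolated into the URL; interpolating the raw row is one likely cause of the "bad URI" error:

require 'open-uri'
require 'sqlite3'

db = SQLite3::Database.new('data-hold/sp500-data.sqlite')

db.execute('SELECT DISTINCT ticker_symbol FROM companies').each do |row|
  sym = row.first                  # each row comes back as an array, e.g. ["AAPL"]
  data = open("#{YURL}#{sym}").read
  File.write("data-hold/yahoo-data/#{sym}.csv", data)
  sleep(1.5 + rand)                # one request at a time, politely throttled
end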
I am trying to pull reports automatically from the Shopify admin portal. From the page source I can see that a JavaScript function makes this call:
var shopifyQL = "SHOW quantity_count, total_sales BY product_type, product_title, sku, shipping_city, traffic_source, source, variant_title, host, shipping_country, shipping_province, day, month, referrer FROM products SINCE xxx UNTIL yyy ORDER BY total_sales DESC";
var options = {"category":"product_reports","id":wwwwww,"name":"Product Report by SKU","shopify_ql":"SHOW quantity_count, total_sales BY product_type, product_title, sku, shipping_city, traffic_source, source, variant_title, host, shipping_country, shipping_province, day, month, referrer FROM products SINCE xxxx UNTIL yyyy ORDER BY total_sales DESC","updated_at":"zzz"};
However, looking at the product API (https://docs.shopify.com/api/product), I do not see most of these attributes. I am assuming there are join tables or separate calls to the model. I also tried to pull information for a single SKU, but it pulls everything:
ShopifyAPI::Product.find(:all, :params => {:variants => {:sku => 'zzzz'}})
Does anybody have any experience working with reports?
You need to grab the data from the API and play with it yourself. The available objects are clearly stated in the Shopify API docs. Admin dashboard data can't be pulled the way you seem to envision unless you resort to JavaScript injection (Tampermonkey and the like), which is strongly discouraged.
It would go like this for you. First off, if you pull products, you have to do so in chunks of 250; the :all symbol gives you up to 250, and supplying page and limit parameters helps there.
Second, you cannot filter by SKU. Instead, download all the products; inside each product are its variants, and a variant has a SKU, so you'd search that way.
Doing that, you could set up your own nice reference data structure, ready to be used in reporting as you see fit, as sketched below.
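Here is a minimal sketch of that approach, assuming the classic page/limit pagination of the shopify_api gem:

# Page through all products, 250 at a time, and index their variants by SKU.
sku_index = {}
page = 1

loop do
  products = ShopifyAPI::Product.find(:all, :params => { :limit => 250, :page => page })
  break if products.empty?

  products.each do |product|
    product.variants.each do |variant|
      sku_index[variant.sku] = { :product => product.title, :variant => variant.title }
    end
  end

  page += 1
end

# A single-SKU lookup is then just a hash access:
sku_index['zzzz']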
I have a class method (placed in /app/lib/) which performs some heavy calculations and sub-HTTP requests until a result is received.
The result isn't too dynamic, and it is requested by multiple users accessing a specific view in the app.
So I want to schedule a periodic run of the method (using cron and the Whenever gem), store the results somewhere on the server in JSON format and, on demand, serve only the stored results to the view.
How can this be achieved? What would be the correct way of doing it?
What I currently have:
def heavyMethod
  response = {}
  # some calculations, eventually building the response
  File.open(File.expand_path('../../../tmp/cache/tests_queue.json', __FILE__), "w") do |f|
    f.write(response.to_json)
  end
end
and also a corresponding method to read this file.
I searched but couldn't find an example of achieving this with the Rails cache conventions (rather than ad-hoc code of my own) for data that isn't backed by ActiveRecord.
Thanks!
Your solution should work fine, but using Rails.cache would be cleaner and a bit faster. The Rails guides provide enough information about Rails.cache and how to get it working with memcached; let me summarize how I would use it in your case.
Heavy method
def heavyMethod
  response = {}
  # some calculations, eventually building the response
  Rails.cache.write("heavy_method_response", response)
end
Request
response = Rails.cache.fetch("heavy_method_response")
The only problem here is that the cache will be empty when your server starts for the first time, and also if/when memcached restarts.
One advantage is that somewhere along the way the data you pass in is marshalled into storage and unmarshalled on the way out, meaning you can pass in complex data structures and don't need to serialize to JSON manually.
Edit: memcached will clear your item if it runs out of memory. That will be very rare, since it uses an LRU (I think) algorithm to expire things, and I presume you will read this key often.
To prevent this:
set expires_in larger than your cron period;
change your fetch code to call the heavy method if the fetch fails (like Rails.cache.fetch("heavy_method_response") { heavy_method }), and change the heavy method to just return the object (see the sketch below);
or use something like Redis, which will not delete items.
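Here is a sketch combining the first two suggestions (the 25-hour expiry is an arbitrary example; anything longer than your cron period works):

def heavy_method
  response = {}
  # some calculations, eventually building the response
  response  # return the object instead of writing a file
end

# The cron task primes the cache:
Rails.cache.write("heavy_method_response", heavy_method, :expires_in => 25.hours)

# The controller falls back to recomputing if the entry expired or was evicted:
response = Rails.cache.fetch("heavy_method_response", :expires_in => 25.hours) do
  heavy_method
end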
This question has been asked here already, but it was quite some time ago. Does anyone know if Rails has any support for Microsoft Access? I'd need to import and export data every few weeks and would really like to avoid exporting/importing CSV files.
Thanks!
It's worth noting that there's an mdb gem for Ruby. It requires mdbtools to be installed.
Add to your Gemfile:
gem 'mdb'
Usage is pretty straightforward; tables are basically lists of hashes:
require 'mdb'
database = Mdb.open('workshops_handouts_inactive_database.mdb')
table = database[:MainData]
results = table.select { |rec| rec[:"Schedule Type"] == "MU1" }
puts results.first
{:"Container Number"=>"17", :Location=>"1f6", :Department=>"tx", ...
I don't think ActiveRecord support exists for MS Access, though.
The WIN32OLE class allows you to retrieve data from Microsoft Access. You can find the docs here:
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/win32ole/rdoc/WIN32OLE.html
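A sketch of reading an Access table through ADO with WIN32OLE (Windows only; the file path and table name here are made-up examples):

require 'win32ole'

# Open the .mdb file through the Jet OLE DB provider.
conn = WIN32OLE.new('ADODB.Connection')
conn.Open('Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\data\\mydb.mdb')

recordset = WIN32OLE.new('ADODB.Recordset')
recordset.Open('SELECT * FROM MainData', conn)

# Walk the recordset row by row.
until recordset.EOF
  puts recordset.Fields.Item(0).Value
  recordset.MoveNext
end

recordset.Close
conn.Close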
I have a question regarding using Nokogiri in my Rails app. I'm trying to collect baseball stats from a website and display the data in a view. I can parse the data successfully; however, I'm not sure where the code should live in a RESTful design.
Currently, I'm collecting the stats into one array and matching them against another array (of rank, team, league, etc.). The two arrays are then zipped into a hash. Is there a more efficient way to do this, i.e. parse the data and assign it directly as hash values, with the rank, team, league, etc. as hash keys?
Lastly, I had placed the Nokogiri call in my controllers, but I believe there is a better way. Ryan Bates's Railscasts suggest putting the Nokogiri call into a rake task (/lib/tasks/). Since I want the website to receive new baseball stats daily, will I have to run the rake task regularly? And how would I best get the data into the view?
Searching online brought up the idea of putting this into config/initializers, but I'm not sure that's a better solution.
The following is the Nokogiri call:
task :fetch_mets => :environment do
  require 'nokogiri'
  require 'open-uri'

  url = "http..."
  doc = Nokogiri::HTML(open(url))

  x = Array.new
  doc.css('tr:nth-child(14) td').each do |stat|
    x << stat.content
  end

  a = %w[rank team league games_total games_won games_lost ratio streak]
  o = Hash[a.zip x]
  statistics = Hash[o.map { |(y, z)| [y.to_sym, z] }]
  # order_stat = statistics.each { |key, value| puts "#{key} is #{value}." }
end
Please let me know if I need to clarify anything. Thanks so much.
Create a table in your db called statistics that includes all the keys in your hash (plus created_on and id). To save your stats, do:
Statistic.new(statistics).save
Then in your view, pull the one with the highest created_on.
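For example, a sketch assuming an ActiveRecord model named Statistic over that table:

# In the controller: grab the most recent snapshot for the view.
@statistics = Statistic.order('created_on DESC').first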
For running rake tasks on a cron schedule, take a look at whenever.
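A sketch of the whenever side (the time of day is just an example; the task name matches the rake task above):

# config/schedule.rb
every 1.day, :at => '4:30 am' do
  rake 'fetch_mets'
end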
Also, it might be cleaner to do it more like this:
keys = %w[rank team league games_total games_won games_lost ratio streak].map(&:to_sym)
values = doc.css('tr:nth-child(14) td').map(&:text)
statistics = Hash[keys.zip values]
I'm writing an app for a company that uses Google Calendar internally and would need to use events they already have in their calendar in the app. So I need to get read only access to their calendars from the app (namely I need the events title, start and end dates and attendee emails for all future events).
What is the simplest way to do this in Ruby? (I would need it to work relatively seamlessly on Heroku.)
I tried using the GCal4Ruby gem, which seemed the least outdated of the ones I found, but I'm unable even to authenticate through the library (HTTPRequestFailed - Captcha required error), let alone get the info I need.
Clarification: What I'm talking about is the Google Apps version of the calendar, not the one at calendar.google.com.
OK, I got the API working via GCal4Ruby. I'm not exactly sure what went wrong the first time. Thanks to Mike and James for their suggestions. This is the sample code I used, for anyone interested:
require "rubygems"
require "gcal4ruby"
serv = GCal4Ruby::Service.new
serv.authenticate "username#example.com", "password"
events = GCal4Ruby::Event.find serv, {'start-min' => Time.now.utc.xmlschema,
:calendar => 'example-cal%40example.com'}
events.each do |event|
puts event.title
puts event.attendees.join ", "
puts event.start_time
puts event.end_time
puts '-----------------------'
end
You should be able to use the Google Calendar private XML address feature to pull out the needed data.
You could then parse it with Hpricot or Nokogiri to extract whatever fields you need, as in the sketch below.
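A minimal sketch with Nokogiri (the feed URL is a placeholder; copy the real private XML address from the calendar's settings page, and use the "full" projection to get structured start/end times):

require 'nokogiri'
require 'open-uri'

feed_url = 'https://www.google.com/calendar/feeds/.../private-.../full'
doc = Nokogiri::XML(open(feed_url))
doc.remove_namespaces!  # lets us address gd:when simply as 'when'

doc.css('entry').each do |entry|
  puts entry.at('title').text
  if (gd_when = entry.at('when'))  # the full projection exposes times as attributes
    puts gd_when['startTime']
    puts gd_when['endTime']
  end
end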