Suds is not caching WSDLs

I'm using Suds for RPC calls over SOAP, and the client refuses to cache between calls (resulting in a 30+ second wait for the client to initialise each time). Can anyone see what needs to be done in addition to the code below in order for caching to be enabled?
client = Client(WSDL_URL)
cache = client.options.cache
cache.setduration(days=10)
cache.setlocation(SUDS_CACHE_LOCATION)

This is probably a bug in the library itself. The cache file needs to be written in binary mode. That can be fixed in cache.py:
1) In FileCache.put(), change this line:
f = self.open(fn, 'w')
to
f = self.open(fn, 'wb')
2) In FileCache.getf(), change this line:
return self.open(fn)
to
return self.open(fn, 'rb')
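Separately, note that in the snippet above the cache options are changed only after Client() has already fetched and parsed the WSDL, so the new location and duration never apply to that initial download. A minimal sketch, assuming the stock suds.cache.ObjectCache API, of passing the cache in up front instead:
from suds.client import Client
from suds.cache import ObjectCache

# Configure the cache before constructing the client so the WSDL fetch itself goes through it.
cache = ObjectCache(SUDS_CACHE_LOCATION, days=10)
client = Client(WSDL_URL, cache=cache)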
For more details, see:
Suds is not reusing cached WSDLs and XSDs, although I expect it to

Related

websocket memory management

I am new to websockets and am pulling some data via the Coinbase websocket. On every message I append to a pandas DataFrame in memory (I know this is bad practice; I'm just trying to get a working version). Every minute I upload this DataFrame to TimescaleDB and clear old data out of it.
I am noticing, though, that on occasion the upload fails, so the old values are never cleared, and eventually the DataFrame consumes all the memory.
Is this a feature of websockets? Or is something off with my scheduler?
This is my scheduler, for reference:
import time
from datetime import datetime

import pandas as pd

# candle_len (candle length in minutes) and upload_timescale() are defined elsewhere.
while True:
    nowtime = datetime.now()
    floornow = pd.Timestamp(nowtime).floor(freq='1S')
    candlefloor = floornow.floor(freq=f'{candle_len}T')
    if floornow == candlefloor:
        try:
            upload_timescale()
            if candlefloor != 'last datetime':
                timetowait = candle_len * 60 - (datetime.now() - candlefloor).total_seconds() - 0.05
                time.sleep(timetowait)
        except Exception:
            raise ValueError('bug in uploading to timescaledb')
    else:
        tts = candle_len * 60 - (nowtime - candlefloor).total_seconds()
        if tts > 2:
            time.sleep(tts)
What is the best way to handle a use case where I need to store websocket data intermittently to process, before cleaning it out?
Thanks!
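One way to keep memory bounded while the upload path is being debugged is to buffer incoming messages in a collections.deque with a maxlen instead of growing a DataFrame, and to build the DataFrame only at flush time. A rough sketch, where upload_timescale() is assumed to accept the batched rows and the message rate is a guess:
import collections

import pandas as pd

# Cap the buffer at roughly 10 minutes of messages (assuming ~50 messages/second);
# once maxlen is reached the oldest rows are dropped automatically, so a failed
# upload can no longer exhaust memory.
BUFFER_MAXLEN = 10 * 60 * 50
buffer = collections.deque(maxlen=BUFFER_MAXLEN)

def on_message(msg):
    buffer.append(msg)  # O(1) append, no per-message DataFrame reallocation

def flush():
    rows = [buffer.popleft() for _ in range(len(buffer))]
    if not rows:
        return
    try:
        upload_timescale(pd.DataFrame(rows))  # assumed to take the batched rows
    except Exception:
        # Put the rows back so the next flush retries them; maxlen still caps
        # total memory if the database stays unreachable.
        buffer.extendleft(reversed(rows))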

How to cache based on size in Varnish?

I've been trying to cache based on response size in Varnish.
Other answers suggest using Content-Length to decide whether or not to cache, but I'm using InfluxDB (Varnish reverse-proxies to it), which responds with Transfer-Encoding: chunked and therefore omits the Content-Length header, so I am not able to figure out the size of the response.
Is there any way I could access the response body size and make that decision in vcl_backend_response?
Cache miss: chunked transfer encoding
When Varnish processes incoming chunks from the origin, it has no idea ahead of time how much data will be received. Varnish streams the data through to the client and stores it byte by byte.
Once the 0\r\n\r\n is received to mark the end of the stream, Varnish will finalize the object storage and calculate the total amount of bytes.
Cache hit: content length
The next time the object is requested, Varnish no longer needs to use Chunked Transfer Encoding, because it has the full object in cache and knows the size. At that point a Content-Length header is part of the response, but this header is not accessible in VCL because it seems to be generated after sub vcl_deliver {} is executed.
Remove objects after the fact
It is possible to remove objects after the fact by monitoring their size through VSL.
The following command looks at the backend request accounting field of the VSL output and checks the total size. If the size is greater than 5 MB (5242880 bytes), it prints the corresponding backend request URL:
varnishlog -g request -i berequrl -q "BereqAcct[5] > 5242880"
Here's some potential output:
* << Request >> 98330
** << BeReq >> 98331
-- BereqURL /
At that point, you know that the / resource is bigger than 5 MB. You can then attempt to remove it from the cache using the following command:
varnishadm ban "obj.http.x-url == / && obj.http.x-host == domain.com"
Replace domain.com with the actual hostname of your service and set / to the URL of the actual endpoint you're trying to remove from the cache.
Don't forget to add the following code to your VCL file to ensure that the x-url and x-host headers are available:
sub vcl_backend_response {
    # Store the URL and Host on the cached object so a later ban can match on them.
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
    # Strip the helper headers before the response reaches the client.
    unset resp.http.x-url;
    unset resp.http.x-host;
}
Conclusion
Although there's no turn-key way to access the size of the response body in VCL, the somewhat hacky approach of removing oversized objects after the fact is the only workaround I can think of.

CherryPy: serve multiple requests per connection

I have this code (on-the-fly compression and streaming):
@cherrypy.expose
def backup(self):
    path = '/var/www/httpdocs'
    zip_filename = "backup" + t.strftime("%d_%m_%Y_") + ".zip"
    cherrypy.response.headers['Content-Type'] = 'application/zip'
    cherrypy.response.headers['Content-Disposition'] = 'attachment; filename="%s"' % (zip_filename,)
    # https://github.com/gourneau/SpiderOak-zipstream/blob/3463c5ccb5d4a53fc5b2bdff849f25bae9ead761/zipstream.py
    return ZipStream(path)
backup._cp_config = {'response.stream': True}
The problem I'm facing is that while the file is downloading, I can't browse any other page or send any other request until the download is done.
I think the problem is that CherryPy can't serve more than one request at a time per user.
Any suggestions?
When you say "per user", do you mean that another request could come in for a different "session" and it would be allowed to continue?
In that case, your issue is almost certainly due to session locking in CherryPy. You can read more about it in the session code. Since sessions are unlocked late by default, the session is not available for use by other threads (connections) while the backup is still being processed.
Try setting tools.sessions.locking = 'explicit' in the _cp_config for that handler. Since you’re not writing anything to the session, it’s probably safe not to lock at all.
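For example, a minimal sketch of that setting applied to the handler from the question:
backup._cp_config = {
    'response.stream': True,
    'tools.sessions.locking': 'explicit',  # don't hold the session lock for the whole download
}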
Good luck. Hope that helps.
Also, from the FAQ:
"CherryPy certainly can handle multiple connections. It’s usually your browser that is the culprit. Firefox, for example, will only open two connections at a time to the same host (and if one of those is for the favicon.ico, then you’re down to one). Try increasing the number of concurrent connections your browser makes, or test your site with a tool that isn’t a browser, like siege, Apache’s ab, or even curl."

Does Ruby's 'open_uri' reliably close sockets after read or on fail?

I have been using open_uri to pull down an ftp path as a data source for some time, but suddenly found that I'm getting nearly continual "530 Sorry, the maximum number of allowed clients (95) are already connected."
I am not sure whether my code is at fault or whether someone else is accessing the server, and unfortunately there's seemingly no way for me to know for sure who's to blame.
Essentially I am reading FTP URI's with:
def self.read_uri(uri)
  begin
    uri = open(uri).read
    uri == "Error" ? nil : uri
  rescue OpenURI::HTTPError
    nil
  end
end
I'm guessing that I need to add some additional error handling code in here...
I want to be sure that I take every precaution to close down all connections so that my connections are not the problem in question; however, I had assumed that open_uri plus read would take care of this automatically, as opposed to using the Net::FTP methods.
The bottom line is that I've got to be 100% sure that these connections are being closed and that I don't somehow have a bunch of open connections lying around.
Can someone please advise as to correctly using read_uri to pull in ftp with a guarantee that it's closing the connection? Or should I shift the logic over to Net::FTP which could yield more control over the situation if open_uri is not robust enough?
If I do need to use the Net::FTP methods instead, is there a read method that I should be familiar with vs pulling it down to a tmp location and then reading it (as I'd much prefer to keep it in a buffer vs the fs if possible)?
I suspect you are not closing the handles. OpenURI's docs start with this comment:
It is possible to open http/https/ftp URL as usual like opening a file:
open("http://www.ruby-lang.org/") {|f|
f.each_line {|line| p line}
}
I looked at the source and the open_uri method does close the stream if you pass a block, so, tweaking the above example to fit your code:
uri = ''
open("http://www.ruby-lang.org/") {|f|
  uri = f.read
}
Should get you close to what you want.
Here's one way to handle exceptions:
# The list of URLs to pass in to check if one times out or is refused.
urls = %w[
  http://www.ruby-lang.org/
  http://www2.ruby-lang.org/
]

# the method
def self.read_uri(urls)
  content = ''
  open(urls.shift) { |f| content = f.read }
  content == "Error" ? nil : content
rescue OpenURI::HTTPError
  retry if urls.any?
  nil
end
Try using a block:
data = open(uri){|f| f.read}

Saving an ActiveRecord non-transactionally

My application accepts file uploads, with some metadata being stored in the DB, and the file itself on the file system. I am trying to make the metadata visible in the application before the file upload and post-processing are finished, but because saves are transactional, I have had no success. I have tried the callbacks and calling create_or_update() instead of save(), all to no avail. Is there a way to do this without re-writing the guts of ActiveRecord::Base? I've even attempted naming the method make() instead of save(), but perplexingly that had no effect.
The code below "works" fine, but the database is not modified until everything else is finished.
def save(upload)
  uploadFile = upload['datafile']
  originalName = uploadFile.original_filename
  self.fileType = File.extname(originalName)
  create_or_update()

  # write the file
  File.open(self.filePath, "wb") { |f| f.write(uploadFile.read) }

  begin
    musicFile = TagLib::File.new(self.filePath())
    self.id3Title = musicFile.title
    self.id3Artist = musicFile.artist
    self.id3Length = musicFile.length
  rescue TagLib::BadFile => exc
    logger.error("Failed to id track: \n #{exc}")
  end

  if (self.fileType == '.mp3')
    convertToOGG();
  end

  create_or_update()
end
Any ideas would be quite welcome, thanks.
Have you considered processing the file upload as a background task? Save the metadata as normal and then perform the upload and post-processing using Delayed Job or similar. This Railscast has the details.
You're getting the meta-data from the file, right? So is the problem that the conversion to OGG is taking too long, and you want the data to appear before the conversion?
If so, John above has the right idea -- you're going to need to accept the file upload, and schedule a conversion to occur sometime in the future.
The main reason is that your Rails thread will be busy processing the OGG conversion and can't respond to any other web requests until it's complete. Blast!
Some servers compensate for this by having multiple Rails threads, but I recommend a background queue (use BJ if you host yourself, or Heroku's background jobs if you host there).
