Can I use fastcgi_finish_request() like register_shutdown_function? - fastcgi

This simple method for caching dynamic content uses register_shutdown_function() to push the output buffer to a file on disk after exiting the script. However, I'm using PHP-FPM, with which this doesn't work; a 5-second sleep added to the function indeed causes a 5-second delay in executing the script from the browser. A commenter in the PHP docs notes that there's a special function for PHP-FPM users, namely fastcgi_finish_request(). There's not much documentation for this particular function, however.
The point of fastcgi_finish_request() seems to be to flush all data and proceed with other tasks, but what I want to achieve, as would normally work with register_shutdown_function(), is basically to put the contents of the output buffer into a file without the user having to wait for this to finish.
Is there any way to achieve this under PHP-FPM, with fastcgi_finish_request() or another function?
$timeout = 3600; // cache time-out
$file = '/home/example.com/public_html/cache/' . md5($_SERVER['REQUEST_URI']); // unique id for this page

if (file_exists($file) && (filemtime($file) + $timeout) > time()) {
    readfile($file);
    exit();
} else {
    ob_start();
    register_shutdown_function(function () use ($file) {
        // sleep(5);
        $content = ob_get_flush();
        file_put_contents($file, $content);
    });
}

Yes, it's possible to use fastcgi_finish_request for that. You can save this file and see that it works:
<?php
$timeout = 3600; // cache time-out
$file = '/home/galymzhan/www/ps/' . md5($_SERVER['REQUEST_URI']); // unique id for this page

if (file_exists($file) && (filemtime($file) + $timeout) > time()) {
    echo "Got this from cache<br>";
    readfile($file);
    exit();
} else {
    ob_start();
    echo "Content to be cached<br>";
    $content = ob_get_flush();
    fastcgi_finish_request();
    // sleep(5);
    file_put_contents($file, $content);
}
Even if you uncomment the line with sleep(5), you'll see that the page still opens instantly, because fastcgi_finish_request() sends the data back to the browser and then proceeds with whatever code is written after it.

First of all, if you cache dynamic content this way, you're doing it wrong. It can certainly be made to work, but the approach itself cripples it. If you want to cache and handle content efficiently, create one class and wrap all the caching functions in it.
Yes, you can use fastcgi_finish_request() like register_shutdown_function(). The only difference is that fastcgi_finish_request() will send the output (if any) and WILL NOT TERMINATE the script, while a register_shutdown_function() callback is invoked on script termination.
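To make that difference concrete, here is a minimal sketch (not from the original answer; the log path is a placeholder, and fastcgi_finish_request() only exists under PHP-FPM):
<?php
// A register_shutdown_function() callback runs at script termination, and
// under FPM the browser keeps waiting until that callback has finished.
// fastcgi_finish_request() instead flushes the response first and lets the
// script carry on:
echo "Response sent to the browser";
fastcgi_finish_request(); // the client stops waiting here
sleep(5);                 // slow work now runs server-side only
file_put_contents('/tmp/after-response.log', date('c') . "\n", FILE_APPEND);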

This is an old question, but I don't think any of the answers correctly address the problem.
As I understand it, the problem is that none of the output gets sent to the client until the ob_get_flush() call that happens during the shutdown function.
To fix that issue, you need to pass a function and a chunk size to ob_start() in order to handle the output in chunks.
Something like this for your else clause:
$content = '';
$write_buffer_func = function ($buffer, $phase) use ($file, &$content) {
    $content .= $buffer;
    if ($phase & PHP_OUTPUT_HANDLER_FINAL) {
        file_put_contents($file, $content);
    }
    return $buffer;
};
ob_start($write_buffer_func, 1024);
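For completeness, here is how that chunked handler might slot into the cache check from the question; a sketch only, reusing the same $timeout and $file as above:
$timeout = 3600; // cache time-out
$file = '/home/example.com/public_html/cache/' . md5($_SERVER['REQUEST_URI']);

if (file_exists($file) && (filemtime($file) + $timeout) > time()) {
    readfile($file);
    exit();
}

$content = '';
ob_start(function ($buffer, $phase) use ($file, &$content) {
    $content .= $buffer;                    // accumulate everything for the cache file
    if ($phase & PHP_OUTPUT_HANDLER_FINAL) {
        file_put_contents($file, $content); // write the cache when output ends
    }
    return $buffer;                         // pass each chunk straight to the client
}, 1024);

// ... generate the page as usual; output is flushed to the client in 1024-byte chunks ...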

Related

PHP variable reverts back to last assigned after intensive curl operation

I'm querying one api and sending data to another. I'm also querying a mysql database. And doing all this about 40 times in one second. Then waiting a minute and repeating. I have a feeling I'm at the limit of what PHP can do.
My question is about two variables that will randomly revert back to their last value, from the previous loop. They only change their value after the call to self::apiCall() (below in the second function). Both $product and $productId will randomly change their value, about once every 40 loops or so.
I upgraded PHP to 7.2, increased the memory limit to 512 MB, and assigned some variables to null to save memory. I'm not getting any official memory warnings, but watching the variables randomly go back to their last value is perplexing. Here's what the code looks like.
/**
 * The initial create products loop which calls the secondary function where
 * the variables can change.
 **/
public static function createProducts() {
    // Create connection
    $conn = new mysqli(SERVERNAME, USERNAME, PASSWORD, DBNAME, PORT);
    // Check connection
    if ($conn->connect_error) {
        die("Connection failed: " . $conn->connect_error);
    }
    // This will go through each row and echo the id column
    $productResults = mysqli_query($conn, "SELECT * FROM product_creation_queue");
    if (mysqli_num_rows($productResults) > 0) {
        $rowIndex = 0;
        while ($row = mysqli_fetch_assoc($productResults)) {
            self::createProduct($conn, $product);
        }
    }
}
/**
 * The second function where I see both $product and $productId changing
 * from time to time, which completely breaks the code. Their values
 * only change after the call to self::api_call(), which is simply a
 * curl function to hit an api endpoint.
 **/
public static function createProduct($mysqlConnection, $product) {
    // convert back to array from json
    $productArray = json_decode($product, TRUE);
    // here the value of $productId is one thing
    $productId = $productArray['product']['id'];
    // here is the curl call
    $addProduct = self::api_call(TOKEN, SHOP, ENDPOINT, $product, 'POST');
    // and randomly here it can revert to its last value from a previous loop
    echo $productId;
}
The problem was that the entire 40-query procedure took more than one minute to complete, and the cron job that started the procedure on the minute would kick off the next run before the first had finished, thereby somehow re-assigning variables on the fly. The queries usually took less than one minute, but when they took longer the conflicts appeared, which is what made it look random.
I reduced the number of queries per minute, so now the process completes in under 60 seconds and no variables are ever overwritten. I still don't understand how the variables could change if two PHP processes run at the same time; it seems like they would be siloed.
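Not part of the original fix, but a common safeguard against overlapping cron runs is a simple lock file; a minimal sketch (the lock path is a placeholder):
<?php
// Skip this run if the previous cron invocation is still working.
$lock = fopen('/tmp/create_products.lock', 'c');
if ($lock === false || !flock($lock, LOCK_EX | LOCK_NB)) {
    exit("Previous run still in progress\n");
}

// ... run the product creation loop here ...

flock($lock, LOCK_UN);
fclose($lock);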

Imagick causing output parse error

I'm using php 7.2 and ImageMagick-7.0.8-12. I'm using it to create thumbnails like so:
function thumbimg($sourcePath, $thumbPath) {
    try {
        if (file_exists($sourcePath)) {
            $imagick = new Imagick();
            $imagick->readImage($sourcePath);
            $imagick->setImageFormat("jpg");
            header('Content-Type: image/jpeg');
            $imagick->writeImage($thumbPath);
            $imagick->clear();
            $imagick->destroy();
            chmod($thumbPath, 0755);
            return;
        }
    } catch (ImagickException $e) {
        echo $this->raiseError('Could not save image to file: ' . $e->getMessage(), IMAGE_TRANSFORM_ERROR_IO);
    }
    return;
}
The PHP script does return the echoed JSON as designed, but when I look at the network preview it shows a blank image with the POST link to that script. This behavior starts at the line $imagick = new Imagick(); prior to that it behaves normally. While I do get the desired JSON, it messes with other functions that produce output.
I would look for another Imagick example, as yours looks a bit of a mess. You have a header() in the middle of your code, and that is used for display. I have no idea why you have a chmod(), and I would have thought that if it were required it would be at the start of the Imagick code. I also do not see any thumbnailing code.
Try this:
$im = new Imagick($input);
$im->resizeImage( 100, 100, imagick::FILTER_LANCZOS, TRUE );
$im->writeImage('resizeImage.jpg');
$im->destroy();
(The filter is optional, as Imagick will pick the best filter to use when increasing or decreasing size.)
I think, as #Mark Setchell says, the destroy() is unnecessary.
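Folding that advice back into the question's helper gives roughly the sketch below (the 100x100 size is an assumption, and destroy() is dropped per the comment above):
function thumbimg($sourcePath, $thumbPath) {
    if (!file_exists($sourcePath)) {
        return false;
    }
    try {
        $imagick = new Imagick($sourcePath);
        $imagick->setImageFormat('jpg');
        // The actual thumbnailing step, which the original function was missing
        $imagick->resizeImage(100, 100, Imagick::FILTER_LANCZOS, true);
        // No header() here: this function only writes the file, so the
        // caller's JSON output is left alone.
        $imagick->writeImage($thumbPath);
        chmod($thumbPath, 0755);
        return true;
    } catch (ImagickException $e) {
        error_log('Could not save thumbnail: ' . $e->getMessage());
        return false;
    }
}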

PhantomJs timeout

I am using Jasmine with PhantomJS to run test cases.
In my typical test case, I make a service call, wait for response and confirm response.
Some requests can return in a few seconds and some can take up to a minute to return.
When run through PhantomJS, the test case fails for the service call that is supposed to take a minute (it fails because the response has not yet been received).
What's interesting is that the test passes when run through Firefox.
I have tried looking at tcpdump and the headers are the same for requests through both browsers, so this looks like a browser timeout issue.
Has anyone had a similar issue? Any ideas as to where the timeout could be configured? Or do you think the problem is something else?
Ah the pain of PhantomJS.
It turned out that I was using JavaScript's bind function, which is not supported in PhantomJS.
This caused a failure that messed up the state of some global variable (my fault), and hence the failing test.
But the root cause was using bind.
Solution: try a shim for bind, like the one from https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Function/bind
if (!Function.prototype.bind) {
    Function.prototype.bind = function (oThis) {
        if (typeof this !== "function") {
            // closest thing possible to the ECMAScript 5 internal IsCallable function
            throw new TypeError("Function.prototype.bind - what is trying to be bound is not callable");
        }
        var aArgs = Array.prototype.slice.call(arguments, 1),
            fToBind = this,
            fNOP = function () {},
            fBound = function () {
                return fToBind.apply(this instanceof fNOP && oThis
                        ? this
                        : oThis,
                    aArgs.concat(Array.prototype.slice.call(arguments)));
            };
        fNOP.prototype = this.prototype;
        fBound.prototype = new fNOP();
        return fBound;
    };
}
I had exactly the same issue. All you have to do is add a setTimeout to exit:
setTimeout(function () { phantom.exit(); }, 20000); // stop after 20 sec (add this before you request your webpage)

page.open('your url here', function (status) {
    // operations here
});

Using get_video on YouTube to download a video

I am trying to get the video URL of any YouTube video like this:
Open
http://youtube.com/get_video_info?video_id=VIDEOID
then take the account_playback_token token value and open this URL:
http://www.youtube.com/get_video?video_id=VIDEOID&t=TOKEN&fmt=18&asv=2
This should open a page with just the video or start a download of the video. But nothing happens; Safari's activity window says 'Not found', so there is something wrong with the URL. I want to integrate this into an iPad app, and the JavaScript method to get the video URL that I use in the iPhone version of the app isn't working, so I need another solution.
YouTube changes all the time, and I think the URL is just outdated. Please help :)
Edit: It seems like the get_video method doesn't work anymore. I'd really appreciate if anybody could tell me another way to find the video URL.
Thank you, I really need help.
Sorry, that is not possible anymore. They limit the token to the IP that got it.
Here's a workaround using the get_headers() function, which gives you an array with the link to the video. I don't know anything about iOS, so hopefully you can rewrite this PHP code yourself.
<?php
if (empty($_GET['id'])) {
    echo "No id found!";
} else {
    function url_exists($url) {
        if (file_get_contents($url, FALSE, NULL, 0, 0) === false) return false;
        return true;
    }

    $id = $_GET['id'];
    $page = @file_get_contents('http://www.youtube.com/get_video_info?&video_id=' . $id);
    preg_match('/token=(.*?)&thumbnail_url=/', $page, $token);
    $token = urldecode($token[1]);
    // $get = $title->video_details; // (leftover line: $title is never defined)
    $url_array = array(
        "http://youtube.com/get_video?video_id=" . $id . "&t=" . $token,
        "http://youtube.com/get_video?video_id=" . $id . "&t=" . $token . "&fmt=18"
    );
    if (url_exists($url_array[1]) === true) {
        $file = get_headers($url_array[1]);
    } elseif (url_exists($url_array[0]) === true) {
        $file = get_headers($url_array[0]);
    }
    $url = trim($file[19], "Location: ");
    echo '<a href="' . $url . '">Download video</a>';
}
?>
I use this and it rocks: http://rg3.github.com/youtube-dl/
Just copy a YouTube URL from your browser and execute this command with the YouTube URL as the only argument. It will figure out how to find the best quality video and download it for you.
Great! I needed a way to grab a whole playlist of videos.
In Linux, this is what I used:
y=http://www.youtube.com;
f="http://gdata.youtube.com/feeds/api/playlists/PLeHqhPDNAZY_3377_DpzRSMh9MA9UbIEN?start-index=26";
for i in $(curl -s $f | grep -o "url='$y/watch?v=[^']*'"); do
    d=$(echo $i | sed "s|url='$y/watch?v=\(.*\)&.*'|\1|");
    youtube-dl --restrict-filenames "$y/watch?v=$d";
done
You have to find the playlist ID from a common Youtube URL like:
https://www.youtube.com/playlist?list=PLeHqhPDNAZY_3377_DpzRSMh9MA9UbIEN
Also, this technique uses the gdata API, which limits results to 25 records per page.
Hence the ?start-index=26 parameter (to get page 2 in my example).
This could use some cleaning, and extra logic to iterate through all sets of 25, too.
Credits:
https://stackoverflow.com/a/8761493/1069375
http://www.commandlinefu.com/commands/view/3154/download-youtube-playlist (which itself didn't quite work)

nsIProtocolHandler: trouble loading image for html page

I'm building an nsIProtocolHandler implementation in Delphi. (more here)
And it's working already. Data the module builds gets streamed over an nsIInputStream. I've got all the nsIRequest, nsIChannel and nsIHttpChannel methods and properties working.
I've started testing and I run into something strange. I have a page "a.html" with this simple HTML:
<img src="a.png">
Both "xxm://test/a.html" and "xxm://test/a.png" work in Firefox, and give above HTML or the PNG image data.
The problem is with displaying the HTML page, the image doesn't get loaded. When I debug, I see:
NewChannel gets called for a.png (when Firefox is processing an OnDataAvailable notice on a.html),
NotificationCallbacks is set (I only need to keep a reference, right?)
RequestHeader "Accept" is set to "image/png,image/*;q=0.8,*/*;q=0.5"
but then, the channel object is released (most probably due to a zero reference count)
Looking at other requests, I would expect some other properties to get set (such as LoadFlags or OriginalURI) and AsyncOpen to get called, from where I can start getting the request responded to.
Does anybody recognise this? Am I doing something wrong? Perhaps with LoadFlags or the LoadGroup? I'm not sure when to call AddRequest and RemoveRequest on the LoadGroup, and peeking at nsHttpChannel and nsBaseChannel I'm not sure whether it's better to call RemoveRequest early or late (before or after OnStartRequest or OnStopRequest).
Update: Checked on the freshly new Firefox 3.5, still the same
Update: To further isolate the issue, I tried "file://test/a1.html" with <img src="xxm://test/a.png" /> and still only get the above sequence of events. If I'm supposed to add this secondary request to a load group to get AsyncOpen called on it, I have no idea where to get a reference to it.
There's more: I find only one instance of the "Accept" string that gets added to the request headers; it queries for nsIHttpChannelInternal right after creating a new channel, but I don't even get this QueryInterface call through... (I posted it here)
Me again.
I am going to quote the same stuff from nsIChannel::asyncOpen():
If asyncOpen returns successfully, the channel is responsible for keeping itself alive until it has called onStopRequest on aListener or called onChannelRedirect.
If you go back to nsViewSourceChannel.cpp, there's one place where loadGroup->AddRequest is called and two places where loadGroup->RemoveRequest is called.
nsViewSourceChannel::AsyncOpen(nsIStreamListener *aListener, nsISupports *ctxt)
{
    NS_ENSURE_TRUE(mChannel, NS_ERROR_FAILURE);

    mListener = aListener;

    /*
     * We want to add ourselves to the loadgroup before opening
     * mChannel, since we want to make sure we're in the loadgroup
     * when mChannel finishes and fires OnStopRequest()
     */
    nsCOMPtr<nsILoadGroup> loadGroup;
    mChannel->GetLoadGroup(getter_AddRefs(loadGroup));
    if (loadGroup)
        loadGroup->AddRequest(NS_STATIC_CAST(nsIViewSourceChannel*, this), nsnull);

    nsresult rv = mChannel->AsyncOpen(this, ctxt);

    if (NS_FAILED(rv) && loadGroup)
        loadGroup->RemoveRequest(NS_STATIC_CAST(nsIViewSourceChannel*, this),
                                 nsnull, rv);

    if (NS_SUCCEEDED(rv)) {
        mOpened = PR_TRUE;
    }

    return rv;
}
and
nsViewSourceChannel::OnStopRequest(nsIRequest *aRequest, nsISupports* aContext,
                                   nsresult aStatus)
{
    NS_ENSURE_TRUE(mListener, NS_ERROR_FAILURE);

    if (mChannel) {
        nsCOMPtr<nsILoadGroup> loadGroup;
        mChannel->GetLoadGroup(getter_AddRefs(loadGroup));
        if (loadGroup) {
            loadGroup->RemoveRequest(NS_STATIC_CAST(nsIViewSourceChannel*, this),
                                     nsnull, aStatus);
        }
    }

    return mListener->OnStopRequest(NS_STATIC_CAST(nsIViewSourceChannel*, this),
                                    aContext, aStatus);
}
Edit:
I have no clue about how Mozilla works, so I have to guess from reading some code. From the channel's point of view, once the original file is loaded, its job is done. If you want to load secondary items linked in the file, like an image, you have to implement that in the listener. See TestPageLoad.cpp. It implements a crude parser and retrieves child items upon OnDataAvailable:
NS_IMETHODIMP
MyListener::OnDataAvailable(nsIRequest *req, nsISupports *ctxt,
                            nsIInputStream *stream,
                            PRUint32 offset, PRUint32 count)
{
    //printf(">>> OnDataAvailable [count=%u]\n", count);
    nsresult rv = NS_ERROR_FAILURE;
    PRUint32 bytesRead = 0;
    char buf[1024];

    if (ctxt == nsnull) {
        bytesRead = 0;
        rv = stream->ReadSegments(streamParse, &offset, count, &bytesRead);
    } else {
        while (count) {
            PRUint32 amount = PR_MIN(count, sizeof(buf));
            rv = stream->Read(buf, amount, &bytesRead);
            count -= bytesRead;
        }
    }

    if (NS_FAILED(rv)) {
        printf(">>> stream->Read failed with rv=%x\n", rv);
        return rv;
    }

    return NS_OK;
}
The important thing is that it calls streamParse(), which looks at the src attribute of img and script elements, and calls auxLoad(), which creates a new channel with a new listener and calls AsyncOpen().
uriList->AppendElement(uri);
rv = NS_NewChannel(getter_AddRefs(chan), uri, nsnull, nsnull, callbacks);
RETURN_IF_FAILED(rv, "NS_NewChannel");
gKeepRunning++;
rv = chan->AsyncOpen(listener, myBool);
RETURN_IF_FAILED(rv, "AsyncOpen");
Since it's passing another MyListener instance in there, that listener can also load further child items, ad infinitum, like Russian dolls.
I think I found it (myself); take a close look at this page. Why it doesn't highlight that the UUID has changed across versions isn't clear to me, but it would explain why things fail when (or just prior to) calling QueryInterface on nsIHttpChannelInternal.
With the new(er) UUID, I'm getting better results. As I mentioned in an update to the question, I've posted this on bugzilla.mozilla.org; I'm curious whether, and what, response I'll get there.
