Web Scraping - Downloading Zip File - post

I'm trying to use Python to download a bunch of PDF files from a website; the files come bundled in a zip file. To download the zip file, I click a download button that makes a popup appear (I assume this is not important to the problem, but I include it for completeness). Chrome shows this when the download button is pressed and the popup appears: [screenshot]
I must then click the download button on the popup to actually begin the download. This is what follows: [screenshot]
I am quite confident that the first request is the only important one. If we look at the headers for this POST request we see this: [screenshot]
All the POST data needed for this request can be scraped from the previous HTML page, with the exception of the downloadedZipToken. This token is only generated and added to the HTML form after I click the download button on the popup, and you can see that it's returned to me in the response header as a cookie.
To summarize: to have a Python script download the zip file for me, I believe I have to mimic this POST request, which I haven't been able to do because the zip token is not initially accessible. I apologize if this was confusing. Please let me know if more information is needed.
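As a rough illustration of scraping the form data from the previous HTML page, here is a minimal sketch using only the standard library. The sample HTML and the field names `docId` and `session` are made up for the example; the real page will have its own fields.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name/value pairs from every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if "name" in attr:
            # Attributes without a value are reported as None; store "" instead.
            self.fields[attr["name"]] = attr.get("value") or ""


# Hypothetical page content, standing in for the HTML fetched from the site.
sample_html = (
    '<form>'
    '<input type="hidden" name="docId" value="123">'
    '<input type="hidden" name="session" value="abc">'
    '</form>'
)

parser = FormFieldParser()
parser.feed(sample_html)
form_fields = parser.fields
print(form_fields)
```

The resulting dictionary can then be used as the body of the POST request once the missing token is known.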

The downloadZipToken POST data that I wasn't able to find in my original question turns out to be a Unix timestamp, which explains why I couldn't find it in the HTML source. I assume it's generated by some JavaScript once the POST request is sent. To write my Python code I just generated a Unix timestamp in milliseconds with
import math, time
timeStamp = math.ceil(time.time()*1000)
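Putting this together, a sketch of building the POST request might look like the following. It uses only the standard library and does not send anything; the URL and the `docId` field are hypothetical placeholders, and the token name is spelled as in this answer, so check the real form for the exact spelling.

```python
import math
import time
from urllib.parse import urlencode
from urllib.request import Request


def build_zip_request(url, form_fields):
    """Build (but do not send) the POST request for the zip download."""
    data = dict(form_fields)
    # Token name as spelled in this answer; the real form may spell it differently.
    data["downloadZipToken"] = math.ceil(time.time() * 1000)
    body = urlencode(data).encode("ascii")
    return Request(url, data=body, method="POST")


# Hypothetical URL and form field, for illustration only.
request = build_zip_request("https://example.com/download", {"docId": "123"})
print(request.get_method(), request.data)
```

To actually download, the request could be passed to `urllib.request.urlopen` and the response bytes written to a `.zip` file. The site may also expect session cookies from the earlier page load, which this sketch does not handle.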

Related

Download or view the files sent as multi-part Request (PNG, PDF) through network proxy tool?

How to download or view the files sent as multi-part Request (e.g. PUT) via a software tool?
Is there any way to accomplish this with a specific tool like Charles Proxy on macOS, that is, to download and view files that were sent as part of a request (a multipart PUT request)? I typically work around this by changing the code to save the file to the sandbox. Ideally, I need something that our QA can use and that doesn't require any code modification.
Charles Proxy on macOS is sufficient for most dev/QA needs, such as:
Throttle network
Device debugging
Download response data
...
However, Charles Proxy 4.x has no option to view or download files sent in an HTTP request: [screenshot]
Charles Proxy 4.x (and earlier) does allow saving response files, for example the PDF in this screenshot: [screenshot]
This can be done by editing the binary file manually. It's a bit tricky, but it recovers the file from a multipart HTTP request without any modification to project code.
Here are the steps (verified on Charles v4.2.8 and macOS v10.12.6):
1. Save request. Right-click the recorded HTTP request (the one that sends the file) and click "Save Request...". This saves the whole HTTP request in binary format.
2. Inspect the hex representation of the request. Left-click that recorded HTTP request and click the "Hex" tab of the "Request" panel. This shows the binary representation of the request, together with some parsed text.
3. Edit the saved request. Open the saved request (step 1) with an editor that supports binary files, such as Sublime Text. Then remove all non-image bytes according to the result of step 2. In particular, remove every byte up to and including the first empty line (0d0a0d0a; HTTP and MIME multipart specify CRLF line endings on every platform, though a non-conforming sender may emit 0a0a), and remove the tail bytes. For example, the following screenshot shows the request bytes from step 2; the selected bytes would be deleted (note the 0d0a bytes):
...
4. Save image file. Save the file after step 3 is finished. Then append a filename extension according to the Content-Type value seen in step 2. In this experiment the Content-Type is image/png, so .png is appended to the filename.
That's it. You can open the xxx.png file now. It's a pure image file.
Note: this experiment only contains one file, but the strategy also works when a request uploads multiple files.
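The manual byte trimming in the steps above could also be scripted. The following is a minimal sketch, not part of the original answer: the function name `extract_first_part` is made up, the boundary string is assumed to be copied by hand from the request's Content-Type header, and the sample bytes stand in for a saved request.

```python
def extract_first_part(raw: bytes, boundary: bytes) -> bytes:
    """Return the payload of the first part in a multipart body."""
    delimiter = b"--" + boundary
    # Skip everything up to and including the first boundary line.
    start = raw.index(delimiter) + len(delimiter)
    # The part's own headers end at the first empty line (CRLF CRLF).
    payload_start = raw.index(b"\r\n\r\n", start) + 4
    # The payload runs until the CRLF that precedes the next boundary.
    payload_end = raw.index(b"\r\n" + delimiter, payload_start)
    return raw[payload_start:payload_end]


# Made-up sample standing in for the bytes of a saved request.
png_bytes = b"\x89PNG\r\n\x1a\n"
saved_request = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="file"; filename="a.png"\r\n'
    b"Content-Type: image/png\r\n"
    b"\r\n" + png_bytes + b"\r\n--XYZ--\r\n"
)

image = extract_first_part(saved_request, b"XYZ")
print(len(image), "bytes extracted")
```

With a real saved request, the bytes would be read with `open(path, "rb").read()` and the result written to a file whose extension matches the part's Content-Type.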

How to download multiple files one after another

I have an AJAX download feature in my MVC web app.
The user can select criteria and click an export button. Internally it fetches the data and returns an Excel file. Up to this point everything works fine.
The issue occurs when one download is still running and the user changes the filter criteria and clicks the export button again. Now two download processes are running. Whichever completes first returns its file, and the user sees the Open/Save/Cancel prompt for that first file. At this stage the second request also completes and returns its file. When I open one file, the download option for the other file is lost.
Initially I thought it might be because both files have the same name, so I changed the code to set a unique file name for every request. But it still offers only a single file to download.
Can anyone help me with this?
Edit:
On other pages where I have two different types of files to download, the above functionality works successfully.
In non-AJAX requests, a page can only wait for one response.
To solve that problem and wait for multiple responses, you should use the target attribute with the value "new", as the following code shows:
<a href="..." target="new">Your download Text</a>
The above code makes each response download in a new tab.

How can I use JSONKit to get data from a website (created in WordPress) that has a JSON plugin enabled

I want to create an iOS project that gets data from a website built with WordPress, with a JSON plugin enabled in WordPress. But I get HTML code as a string when I request any URL of the website. The response should be in JSON format so that I can parse it into the relevant information.
Please help me with how to accomplish this task.

ActionScript FileReference upload onComplete

I am a complete beginner in Flash & Actionscript.
My pet project is this: To provide a www.imageshack.com like service where people could upload single images and later anyone can view it using the generated url.
So far I have gotten to upload an image using Flash and store it in a directory.
http://pixels.guygar.com/
You can check the uploaded image at:
http://pixels.guygar.com/warehouse/
The issue being, I was under the impression when the PHP file is called to store the image in the folder /warehouse the browser would automatically navigate to:
http://pixels.guygar.com/upload.php
Where I can process the image i.e. generate a unique file name and provide the user with a unique URL to later access the resource.
What actually happens is that the image gets uploaded by the PHP script, but the browser stays on the same page even when the PHP script provides a new URL.
So the question is: how do I pass a new URL (linked to the image resource) back to Flash, so that when onComplete is called I can navigate to the image that was just uploaded? Or are there other ways of doing this?
I welcome your perspectives on this issue and thank you for your guidance.
I would store all the values that you need later on in a session on the server (don't forget to pass the session ID to the upload script via GET).
At the end of the PHP script you just return "ok" (or "ko" if something went wrong) to Flash and then, in the callback/listener, call a second PHP script that does the rest and returns a URL to a thumbnail or whatever else you need.
Hope this points you in the right direction.
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/net/FileReference.html?filter_flash=cs5&filter_flashplayer=10.2&filter_air=2.6#event:uploadCompleteData
Shows how data can be returned to flash after an upload.

rails how to know when send_file done

As it takes some time to prepare the content of the data to be downloaded, I want to show a message "Preparing file to download" when the user submits the request
Then when the file is ready, I use send_file to send the data
Once it's done, I need to clear the message
Thanks a lot
Here is what I would do.
1. Create a prepare_file action
First, I would create an action whose job would just be to create the file and rather than rendering HTML, it would render a JSON object with the name of the file that was created.
2. Use AJAX to call the prepare_file action
On the client, when the user clicks to download the file, you display the message, "Preparing download..." and just do an AJAX request to that action. The response you'll get back via AJAX is the name of the file created.
3. Redirect to the file download
Finally, you can hide the preparing download message and redirect the browser to the file download via JavaScript with the name of the file that was created. You would use send_file in this action.
I know that, in the question, you also wanted to be able to display to the user a message when the file is downloading and another message when it is finished. However, this isn't possible unless you write your own client-side download manager. The browser handles file downloads entirely and the user will see in the browser that the file is downloading and what the progress is. So, I understand where you're coming from, but you shouldn't feel like the user isn't being informed of what's happening.
At least with this solution, you're displaying a message to them when the file is being prepared and then once that message disappears, they'll get the download file dialog from the browser.
If you need help with actual code samples of how to do this, let me know.

Resources