Is the input FASTA file required after the local BLAST database is built?

I've downloaded a very large FASTA file and built a local BLAST database from it. I'm trying to conserve storage space and was wondering whether the input FASTA file can be deleted once the local BLAST database has been built.

I removed the FASTA file and BLAST still ran fine. I was initially worried about deleting it because the FASTA files are big, but I was able to find a small BLAST database to test it on first. Thanks Llopis!
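For reference, the database files that makeblastdb writes are self-contained, so a quick way to verify this on a small dataset looks like the following (the file and database names here are hypothetical):

makeblastdb -in seqs.fasta -dbtype nucl -out mydb   # builds the mydb.* index files
rm seqs.fasta                                       # the source FASTA is no longer needed
blastn -db mydb -query query.fasta                  # the search only touches the mydb.* files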

Related

Avoid reading the same file multiple times using Telegraf and file input plugin

I need to read CSV files inside a folder. New CSV files are generated every time a user submits a form. I'm using the "file" input plugin to read the data and send it to InfluxDB. These steps are working fine.
The problem is that the same file is read multiple times every data collection interval. I was thinking of a solution where I could move each file to a different folder after it was read, but I couldn't do that with Telegraf's "exec" output plugin.
PS: I can't change the way the CSV files are generated.
Any ideas on how to avoid reading the same CSV file multiple times?
As you discovered, the file input plugin reads entire files at each collection interval.
My suggestion is to use the directory monitor input plugin instead. It reads files in a directory, monitors the directory for new files, and parses only the ones that have not been picked up yet. The plugin also has configuration settings that make it easier to control when new files are read, as sketched below.
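A minimal config sketch (the paths and the CSV parser options are assumptions about your setup):

[[inputs.directory_monitor]]
  ## Directory to watch for incoming CSV files (assumed path)
  directory = "/data/csv_incoming"
  ## Parsed files are moved here, so they are never read twice
  finished_directory = "/data/csv_done"
  ## Parse each file as CSV with one header row
  data_format = "csv"
  csv_header_row_count = 1

Moving finished files out of the watched directory is exactly the behaviour you were trying to build by hand with the "exec" output plugin.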
Another option is the tail input plugin, which tails a file and only reads new updates as they arrive. However, I think the directory monitor plugin is closer to what you are after in this scenario.
Thanks!

How to handle a data encoding issue when copying data from a CSV file to Parquet using the Azure copy activity?

I have a CSV file that I want to convert to Parquet. The CSV file contains the value Querý in one column.
I am using the copy activity in Azure Data Factory to convert it to Parquet, but the value comes out as Queryý. I can't find any encoding option on the sink. I have seen some documentation, but it all covers the CSV file encoding. Could someone help with this?
There is no way to set the encoding of Parquet in Azure Data Factory; Parquet always stores strings as UTF-8, which is why the sink has no encoding option.
I created a pipeline to test this and it works fine.
Here is some advice for troubleshooting:
Make sure the declared encoding of your CSV source matches the actual file (see the dataset sketch below).
Make sure your Parquet schema is correct.
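Mojibake like Queryý often means the copy activity decoded the CSV with the wrong encoding, so the fix belongs on the source side. A sketch of a DelimitedText source dataset that declares the file's real encoding via encodingName (the dataset, container, and file names, and the windows-1252 guess, are all assumptions):

{
  "name": "SourceCsv",
  "properties": {
    "type": "DelimitedText",
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "data",
        "fileName": "input.csv"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true,
      "encodingName": "windows-1252"
    }
  }
}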

SPSS indicates that the .sav file is empty. Is there any way to recover the file from the Temp folder?

I was just finishing my work. I had been working on this for a month, with more than 100 questionnaires' worth of data to enter into the database, and there were only 3 left. My computer crashed; I turned it off, and after I turned it on again I opened SPSS and couldn't open the .sav file because of an error saying the file is empty (although its size is 68 KB) and to try to open another file. I tried to recover the data through the journal file and through syntax, but I can't.
Is there any way to recover the file from the Temp folder? I noticed that the Temp folder contains some folders named like this: "pasw-cfe-1324076806484081348-tmp", which contain executable jar files, as well as some LCK files with the same names as those folders. After searching about PASW, I got the idea that it may be related to SPSS. Is there any way to open these folders/files? I can also send the file to anyone, in case that is needed.
Any help you can offer me will be truly appreciated. Thanks
Unfortunately, no, there is no way to convert those temporary scratch files back into an SPSS Statistics system file (*.sav). If you do not have a backup of the file, perhaps in another format (*.xlsx, *.csv, etc.) or stored as a table in a database, your only recourse will be to start over.
One other idea: you could try a Windows System Restore.

PhpSpreadsheet load Excel file from memory rather than a file?

I'm downloading an Excel file from an Azure Storage Blob and therefore want to use stream_get_contents to get the file. But PhpSpreadsheet seems to only want to read the file off the filesystem.
For now, I'm saving it to a temp folder and reading it back, but that is less than ideal.
Is there a way to get PhpSpreadsheet to load via something other than a local file?
This is not supported. PhpSpreadsheet will always read from disk.
On a side note, since 1.13.0, PhpSpreadsheet is able to write to memory. See https://github.com/PHPOffice/PhpSpreadsheet/pull/1292
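Until reading from memory is supported, the temp-file round trip you describe is the usual workaround. A minimal sketch (the $blobStream variable is assumed to come from the Azure SDK):

// Persist the blob contents to a temp file, then load from disk
$contents = stream_get_contents($blobStream);
$path = tempnam(sys_get_temp_dir(), 'xlsx');
file_put_contents($path, $contents);
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($path);
unlink($path); // clean up once loaded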

Ruby file copy produces different file

I'm not very familiar with file handling in Ruby. A problem I've come across is that reading and writing a binary file doesn't produce exactly the same file.
# tempfile is the upload Rails hands us; clone it to a second temp file
clone = Tempfile.new(tempfile.original_filename)
FileUtils.copy_stream(tempfile, clone)
clone.flush
Comparing the two files makes it clear that this is not an exact copy: when I try to open the newly created file in an image viewer, it reports that the file is corrupt. I have tried copying the file in different ways, such as clone.write(tempfile.read), etc., without success.
The file viewer also indicates that the original is ANSI DOS/Windows while the clone is ANSI Macintosh, and the file sizes differ by about 200 bytes.
What I'm trying to accomplish is simply using a Tempfile twice. A file is uploaded via Rails and given to me as a Tempfile. I want to submit it to two different RESTful services, but RestClient.post closes the file automatically. Another option would be to submit some sort of in-memory stream clone to RestClient so that it cannot close my file. If I submit File.open(tempfile.path) to RestClient, it produces the same broken file, which indicates that the reading is the problem rather than the writing. If I submit the original Tempfile object to RestClient it works perfectly, but then it is closed and deleted and I cannot send it again.
Please help!
Regards,
Pierre
It would be much more helpful to see a hex view of these files instead of a text editor's interpretation. My guess is that at least one of the files is not opened in binary mode. In Ruby 1.9, try
open(filename, 'rb')                      # open a file for reading in binary mode
open(filename, 'wb')                      # open a file for writing in binary mode
Tempfile.new(filename, :binmode => true)  # create a binary temporary file
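Putting that together, a minimal sketch of a binary-safe copy of the uploaded file (variable names follow the question):

require 'tempfile'

source = File.open(tempfile.path, 'rb')  # read the upload as raw bytes
clone  = Tempfile.new('upload-copy')
clone.binmode                            # write raw bytes too
IO.copy_stream(source, clone)            # byte-for-byte copy
source.close
clone.flush
clone.rewind                             # ready to be posted a second time

With both ends in binary mode, Ruby performs no newline or encoding conversion, so the copy should be byte-identical to the original.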
