I am trying to find a way to read Hyperion EO-1 satellite data (hyperspectral data), which has the .L1R file extension, in Python. Kindly suggest a library to read this data in Python.
Here is a function to read the HDF data from the .L1R file and save it in ENVI format. If you simply want to read the data as a numpy array, modify it to return data and omit the rest of the function definition.
from pyhdf.SD import SD
import spectral as spy
import os

def hdf4_to_envi(hdf4_filename, envi_hdr=None, outdir=None, inhdr=None,
                 **kwargs):
    '''Converts a Hyperion HDF4 file to ENVI format.

    Arguments:

        `hdf4_filename` (str):
            Name of the HDF4 file (often ends with ".L1R").

        `envi_hdr` (str, default None):
            Name of the ENVI header file to create. A ".img" file will also
            be created. If not specified, the header will have the same name
            as the HDF file with '.hdr' appended.

        `outdir` (str, default None):
            Directory in which to create the new file. If not specified, the
            new file will be created in the same directory as the HDF4 file.

        `inhdr` (str, default None):
            Name of an optional ENVI header file from which to extract
            additional metadata, which will be added to the new image header
            file.

    Keyword Arguments:

        All keyword arguments are passed to `spectral.envi.save_image`.
    '''
    (indir, infile) = os.path.split(hdf4_filename)
    if envi_hdr is None:
        header = infile + '.hdr'
    else:
        header = envi_hdr
    if outdir is None:
        outdir = indir
    # Open the HDF4 file and read its first dataset (the image cube).
    fin = SD(hdf4_filename)
    ds = fin.select(0)
    data = ds.get()
    # Reorder the array axes to the (row, column, band) layout expected by spectral.
    data = data.transpose(0, 2, 1)
    if inhdr is None:
        metadata = {}
    else:
        metadata = spy.envi.read_envi_header(inhdr)
    outfile = os.path.join(outdir, header)
    spy.envi.save_image(outfile, data, ext='.img', metadata=metadata, **kwargs)
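For example, a call might look like the following (the file name below is just a placeholder for your own granule). If you only need the numpy array, you can also read the dataset directly, the same way the function does:

# hypothetical file name; substitute your own .L1R granule
hdf4_to_envi('my_scene.L1R', envi_hdr='my_scene.hdr')

# or, to get just the cube as a numpy array without writing ENVI output:
fin = SD('my_scene.L1R')
data = fin.select(0).get().transpose(0, 2, 1)
print(data.shape)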
I'm trying to create a code generator that takes a JSON file as input and generates multiple classes in multiple files.
My question is: is it possible to create multiple files for one input using build from Dart?
Yes, it is possible. There are currently many tools available on pub.dev that do code generation. For creating a simple custom code generator, check out the code_builder package provided by the core Dart team.
You can use dart_style as well to format the output of the code_builder results.
Here is a simple example of the package in use (from the package's example):
import 'package:code_builder/code_builder.dart';
import 'package:dart_style/dart_style.dart';
final _dartfmt = DartFormatter();
// The string of the generated code for AnimalClass
String animalClass() {
  final animal = Class((b) => b
    ..name = 'Animal'
    ..extend = refer('Organism')
    ..methods.add(Method.returnsVoid((b) => b
      ..name = 'eat'
      ..body = refer('print').call([literalString('Yum!')]).code)));
  return _dartfmt.format('${animal.accept(DartEmitter())}');
}
In this example you can use the dart:io API to create a File and write the output from animalClass() (from the example) to the file:
import 'dart:io';

final animalDart = File('animal.dart');
// create the new file on disk
animalDart.createSync();
// write the contents of the generated class to the file
animalDart.writeAsStringSync(animalClass());
You can use the File API to read a .json file from a path, then use jsonDecode on its contents to access the JSON config.
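For instance (the file name and JSON shape here are assumptions, not part of the question), reading such a config could look like:

import 'dart:convert';
import 'dart:io';

void main() {
  // hypothetical config file describing the classes to generate
  final configFile = File('config.json');
  final config = jsonDecode(configFile.readAsStringSync());
  // e.g. iterate over a hypothetical "classes" list in the config
  for (final classDef in config['classes']) {
    print(classDef['name']);
  }
}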
This is a question about NiFi.
I made a NiFi pipeline to convert flowfiles from XML format to CSV format.
Now, I would like to concatenate (union) the converted CSV flowfile onto an existing table by filename (the filename also serves as the table name).
Simply put, my processor flow is the following:
1. GetFile (from a particular directory)
2. Convert XML to CSV
3. Update the flowfile with the table name
4. PutFile (to a different directory)
But at the end of the flow, the PutFile processor throws an error, saying "file with the same name already exists".
I have no idea how a flowfile can be appended to an existing CSV table.
Any advice, tips, ideas are appreciated.
Thank you in advance.
There is no built-in support for appending to a file; however, you could use ExecuteGroovyScript to do it:
def ff = session.get()
if (!ff) return

ff.read().withStream { s ->
    String path = "./out_folder/${ff.filename}"
    // sync on file path to avoid conflict on same file writing (hope)
    synchronized(path) {
        new File(path).append(s)
    }
}
REL_SUCCESS << ff
If you need to work with text (reader) content rather than byte (stream) content, the following example shows how to skip one header line from the flowfile when the destination file already exists:
def ff = session.get()
if (!ff) return

ff.read().withReader("UTF-8") { r ->
    String path = "./.data/${ff.filename}"
    // sync on file path to avoid conflict on same file writing (hope)
    synchronized(path) {
        def fout = new File(path)
        if (fout.exists()) r.readLine()  // skip 1 line (header) only if the out file already exists
        fout.append(r)                   // append the rest of the reader content to the file
    }
}
REL_SUCCESS << ff
i am "playing" with apache beam/dataflow in datalab.
I am trying to read a csv file from gcs.
when i create the pcollection using:
lines = p | 'ReadMyFile' >> beam.io.ReadFromText('gs://' + BUCKET_NAME + '/' + input_file, coder='StrUtf8Coder')
I get the following error:
LookupError: unknown encoding: "THE","NAME","OF","COLUMNS"
It seems the column names are being interpreted as the encoding?
I do not understand what's wrong.
If I do not specify the "coder", I get:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 1045: invalid continuation byte
Outside Apache Beam, I am able to handle this error by reading the file from GCS:
blob = storage.Blob(gs_path, bucket)
data = blob.download_as_string()
data.decode('utf-8', 'ignore')
I read that Apache Beam only supports UTF-8, and the file does not contain only UTF-8.
Should I download the file and then convert it to a PCollection?
Any suggestions?
A possible hack is to create a class that inherits from the Coder class (apache_beam.coders.coders.Coder)
from apache_beam.coders.coders import Coder

class ISOCoder(Coder):
    """A coder used for reading and writing strings as ISO-8859-1."""

    def encode(self, value):
        return value.encode('iso-8859-1')

    def decode(self, value):
        return value.decode('iso-8859-1')

    def is_deterministic(self):
        return True
and pass it as an argument to the ReadFromText IO transform (apache_beam.io.textio.ReadFromText) provided by Beam, like this:
import apache_beam as beam
from apache_beam.io import ReadFromText

with beam.Pipeline(options=pipeline_options) as p:
    new_pcollection = (p | 'Read From GCS' >>
                       ReadFromText('input_file', coder=ISOCoder()))
The logic behind this is detailed here:
https://medium.com/@khushboo_16578/cloud-dataflow-and-iso-8859-1-2bb8763cc7c8
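Putting it together, a minimal end-to-end sketch (the bucket paths below are assumptions) that reads and writes with the same coder might look like:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# hypothetical input/output locations
INPUT = 'gs://my-bucket/input.csv'
OUTPUT = 'gs://my-bucket/output/result'

with beam.Pipeline(options=PipelineOptions()) as p:
    (p
     | 'Read From GCS' >> beam.io.ReadFromText(INPUT, coder=ISOCoder())
     | 'Write To GCS' >> beam.io.WriteToText(OUTPUT, coder=ISOCoder()))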
I would suggest changing the encoding of the actual file. If you save the file with "Save As", you can select UTF-8 encoding for Excel CSVs and regular .txt files. Once you do that, you need to make sure you add a line of code like:
class DoWork(beam.DoFn):
    def process(self, text):
        text = text.encode('utf-8')
        # Do other stuff
        yield text
This isn't how I would like to do it because it isn't code-centric, but it has worked for me before. Unfortunately, I don't have a code-centric solution.
How can you extract metadata for a batch of images? My first thought was to record a macro and then modify it to operate on a list of file names.
In that vein, I tried recording a macro doing something like this:
Ctrl-o                  # Open a file
12.dm3 Enter            # Select file to open
Ctrl-i                  # Open metadata in a new window
Ctrl-s                  # Save file
Info for 12.txt Enter   # Name of file being saved
Ctrl-w                  # Close current window
Ctrl-w                  # Close current window
These steps work when I do them manually. This results in the following macro, which seems to be missing most of what I tried to record:
open("/path/to/file/12.dm3");
run("Show Info...");
run("Close");
run("Close");
Another option is modifying a Jython script that is supposed to extract dimension metadata from an image:
from java.io import File
from loci.formats import ImageReader
from loci.formats import MetadataTools
import glob

# Create output file
outFile = open('./pixel_sizes.txt', 'w')

# Get list of DM3 files
filenames = glob.glob('*.dm3')

for filename in filenames:
    # Open file
    file = File('.', filename)
    # Parse file header
    imageReader = ImageReader()
    meta = MetadataTools.createOMEXMLMetadata()
    imageReader.setMetadataStore(meta)
    imageReader.setId(file.getAbsolutePath())
    # Get pixel size
    pSizeX = meta.getPixelsPhysicalSizeX(0)
    # Close the image reader
    imageReader.close()
    outFile.write(filename + "\t" + str(pSizeX) + "\n")

# Close the output file
outFile.close()
(Gist).
You could use getImageInfo() instead of run("Show Info..."). This gives you a string in the macro containing the run("Show Info...") output, which you can then modify as you like. See http://rsb.info.nih.gov/ij/developer/macro/functions.html#getImageInfo for more information.
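As a rough sketch (untested; the directory prompt and the output file naming here are assumptions), a batch macro built around getImageInfo() might look like:

dir = getDirectory("Choose a directory");
files = getFileList(dir);
for (i = 0; i < lengthOf(files); i++) {
    if (endsWith(files[i], ".dm3")) {
        open(dir + files[i]);
        info = getImageInfo();  // same text as run("Show Info...")
        File.saveString(info, dir + "Info for " + files[i] + ".txt");
        close();
    }
}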
I am downloading an .xls file from the internet. It is in .xls format, but I need 'Sheet1' to be in CSV format. I use xlrd to make the conversion, but I seem to have run into an issue where the file I write to is empty.
import urllib2
import tempfile
import csv
import xlrd

url_2_fetch = ____
u = urllib2.urlopen(url_2_fetch)
wb = xlrd.open_workbook(file_contents=u.read())
sh = wb.sheet_by_name('Sheet1')
csv_temp_file = tempfile.TemporaryFile()
with open('csv_temp_file', 'wb') as f:
    writer = csv.writer(f)
    for rownum in xrange(sh.nrows):
        writer.writerow(sh.row_values(rownum))
That seemed to have worked. But now I want to inspect the values by doing the following:
with open('csv_temp_file', 'rb') as z:
    reader = csv.reader(z)
    for row in reader:
        print row
But I get nothing:
>>> with open('csv_temp_file', 'rb') as z:
...     reader = csv.reader(z)
...     for row in reader:
...         print row
...
>>>
I am using a tempfile because I want to do more parsing of the content and then use SQLAlchemy to store the parsed contents of the CSV in a MySQL database.
I appreciate the help. Thank you.
This is completely wrong:
csv_temp_file = tempfile.TemporaryFile()
with open('csv_temp_file', 'wb') as f:
    writer = csv.writer(f)
The tempfile.TemporaryFile() call returns "a file-like object that can be used as a temporary storage area. The file will be destroyed as soon as it is closed (including an implicit close when the object is garbage collected)."
So your variable csv_temp_file contains a file object, already open, that you can read and write to, and will be deleted as soon as you call .close() on it, overwrite the variable, or cleanly exit the program.
So far so good. But then you proceed to open another file with open('csv_temp_file', 'wb') that is not a temporary file, is created in the script's current directory with the fixed name 'csv_temp_file', is overwritten every time this script is run, can cause security holes, strange bugs and race conditions, and is not related to the variable csv_temp_file in any way.
You should trash the with open statement and use the csv_temp_file variable you already have. You can try to .seek(0) on it before using it again with the csv reader, it should work. Call .close() on it when you are done with it and the temporary file will be deleted.
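For illustration, a minimal sketch of the corrected flow (reusing the names from the question, under the same Python 2 / xlrd setup) would reuse the already-open temporary file object:

import csv
import tempfile
import urllib2
import xlrd

u = urllib2.urlopen(url_2_fetch)  # url_2_fetch defined as in the question
wb = xlrd.open_workbook(file_contents=u.read())
sh = wb.sheet_by_name('Sheet1')

# write into the temporary file object itself, not a file named 'csv_temp_file'
csv_temp_file = tempfile.TemporaryFile()
writer = csv.writer(csv_temp_file)
for rownum in xrange(sh.nrows):
    writer.writerow(sh.row_values(rownum))

# rewind the same object before reading it back
csv_temp_file.seek(0)
reader = csv.reader(csv_temp_file)
for row in reader:
    print row

csv_temp_file.close()  # the temporary file is deleted here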