I have many PCD files, one collected per lidar scan. I want to convert these PCD files into PointCloud2 messages so I can use them in a rosbag. I found pcd_to_pointcloud from the Point Cloud Library, but it only works on a single PCD file. How can I iterate this command over multiple PCD files?
rosrun pcl_ros pcd_to_pointcloud <file.pcd> [ <interval> ]
The files are named scan1.pcd, scan2.pcd, scan3.pcd, etc.
Thank you
You could do the loop in your shell. For example, the bash command would look as follows:
for F in my_pcd_directory/*.pcd; do rosrun pcl_ros pcd_to_pointcloud ${F} 0; done
This loop publishes all pcd files one by one.
Starting a rosbag record --all or rosbag record cloud_pcd in another shell records the published point clouds and stores them in a bag in your current working directory. Of course, you need to start the recording before running the for-loop.
A solution in Python:
import os
from os import listdir
from os.path import isfile, join

path = "/your/path/to/pcd_files/"
# collect all .pcd files in the directory
pcd_files = [f for f in listdir(path) if isfile(join(path, f)) and f.endswith(".pcd")]

for file in sorted(pcd_files):
    cmd = "rosrun pcl_ros pcd_to_pointcloud " + join(path, file)
    os.system(cmd)
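If you prefer to keep everything in one script, a minimal Python sketch could also start the recording itself and then publish the files sequentially. This is only a sketch: it assumes roscore is already running, uses the cloud_pcd topic mentioned above, and writes to a hypothetical scans.bag output name:

import glob
import signal
import subprocess
import time

# start recording the published clouds in the background (topic: cloud_pcd)
recorder = subprocess.Popen(["rosbag", "record", "cloud_pcd", "-O", "scans.bag"])
time.sleep(2)  # give the recorder a moment to subscribe

# publish every pcd file once, in order
for pcd in sorted(glob.glob("/your/path/to/pcd_files/*.pcd")):
    subprocess.call(["rosrun", "pcl_ros", "pcd_to_pointcloud", pcd, "0"])

recorder.send_signal(signal.SIGINT)  # stop the recording cleanly
recorder.wait()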
My task is to create a database of all street and town names in Germany. Since this is a large query, I chose to download the PBF file with the Python pyrosm package. Once I unpack the data with OSM() and use get_network(), I run into memory issues because the loaded DataFrame is too large. See here for roads (this works for smaller areas such as regions of Germany):
from pyrosm import get_data
from pyrosm import OSM
import pandas as pd

# Download the Germany data
de = get_data("germany")
# Turn it into an OSM object
de_osm = OSM(de)
# Extract all driving objects, e.g. roads
roads = de_osm.get_network(network_type="driving")
# Extract all road names and turn them into a list
road_names = pd.Series(roads.name).values
road_names = list(road_names)
I wanted to solve this problem with generator functions, but I can't seem to iterate over the data the way I would with a CSV file. Here are my attempts, which failed:
osm_object = (OSM(obj) for obj in de)
# Extract all driving objects, e.g. roads
roads = osm_object.get_network(network_type="driving")
# Extract all road names and turn them into a list
road_names = pd.Series(roads.name).values
road_names = list(road_names)
Alternative:
def generator_osm():
    for i in OSM(de).get_network(network_type="driving"):
        yield i

res = generator_osm()

# Extract all road names and turn them into a list
road_names = list()
for i in res:
    road_names = road_names.append(pd.Series(i.name).values)
Thank you in advance for any tips you can provide :)
I would suggest using pyosmium. It allows you to analyse OSM files easily without having to deal with the geometry. I tried pyrosm a bit, and I think it tries to build a road network when you call .get_network(…), which is unnecessary if you only want to know what names the road objects in your OSM file have.
I took an example from the pyosmium documentation and adapted it to collect road names:
import osmium
from collections import Counter

# handler that processes your file
class RoadNameHandler(osmium.SimpleHandler):
    def __init__(self):
        super(RoadNameHandler, self).__init__()
        self.road_names = []

    def way(self, o):
        if 'highway' in o.tags and 'name' in o.tags:
            self.road_names.append(o.tags['name'])

# process file
h = RoadNameHandler()
h.apply_file("germany-latest.osm.pbf")

# some examples to print & count the names
print(h.road_names)
print(Counter(h.road_names))
print(len(h.road_names))
This script did not take more than 500-600 MB of memory; the .pbf file has to be downloaded manually from Geofabrik.
P.S.: If you want a distinct list of the road names, you can use either a Counter() or a set instead of the list for self.road_names.
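For example, a minimal variant of the handler that collects distinct names in a set might look like this (same assumptions as the script above, i.e. a manually downloaded germany-latest.osm.pbf):

import osmium

class DistinctRoadNameHandler(osmium.SimpleHandler):
    def __init__(self):
        super(DistinctRoadNameHandler, self).__init__()
        self.road_names = set()  # a set stores each name only once

    def way(self, o):
        if 'highway' in o.tags and 'name' in o.tags:
            self.road_names.add(o.tags['name'])

h = DistinctRoadNameHandler()
h.apply_file("germany-latest.osm.pbf")
print(len(h.road_names))  # number of distinct road names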
Currently, two Avro files are generated for a 10 KB input file; if I do the same with my actual file (30 MB+), I will get n number of files.
So I need a solution that generates only one or two .avro files even if the source file is large.
Also, is there any way to avoid the manual declaration of column names?
Current approach...
spark-shell --packages com.databricks:spark-csv_2.10:1.5.0,com.databricks:spark-avro_2.10:2.0.1
import org.apache.spark.sql.types.{StructType, StructField, StringType}
// Manual schema declaration of the 'ind' and 'co' column names and types
val customSchema = StructType(Array(
StructField("ind", StringType, true),
StructField("co", StringType, true)))
val df = sqlContext.read.format("com.databricks.spark.csv").option("comment", "\"").option("quote", "|").schema(customSchema).load("/tmp/file.txt")
df.write.format("com.databricks.spark.avro").save("/tmp/avroout")
// Note: /tmp/file.txt is input file/dir, and /tmp/avroout is the output dir
Try specifying the number of partitions of your DataFrame when writing the data as Avro (or any format). To do that, use the repartition or coalesce DataFrame functions.
df.coalesce(1).write.format("com.databricks.spark.avro").save("/tmp/avroout")
This writes only a single file to "/tmp/avroout".
Hope this helps!
Tensorboard can visualize several runs of a tensorflow graph, by storing each run in a sub-directory of the logging directory.
For instance, the documentation provides this example:
experiments/
experiments/run1/
experiments/run1/events.out.tfevents.1456525581.name
experiments/run1/events.out.tfevents.1456525585.name
experiments/run2/
experiments/run2/events.out.tfevents.1456525385.name
tensorboard --logdir=experiments
To start the next run (run3), a new directory should then be passed to the SummaryWriter constructor:
summary_writer = tf.train.SummaryWriter('experiments/run3/', sess.graph)
where the directory combines the top-level logging directory (experiments) with a unique run ID (run3).
Is there a way to automatically create a new unique run ID?
Sequential integer IDs would be good, so would time-based IDs.
You can check in Python which directories already exist in experiments and create a new one with the run number incremented.
If the list is empty, we start at run_01.
import os

previous_runs = os.listdir('experiments')
if len(previous_runs) == 0:
    run_number = 1
else:
    run_number = max([int(s.split('run_')[1]) for s in previous_runs]) + 1

logdir = 'run_%02d' % run_number
summary_writer = tf.train.SummaryWriter(os.path.join('experiments', logdir), sess.graph)
I used "%02d" to have names like: run_01, run_02, run_03, ... run_10, run_11.
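If you would rather use the time-based IDs mentioned in the question, a minimal sketch could use a timestamp as the run name instead (same experiments directory as above; names have one-second resolution):

import os
from datetime import datetime

# e.g. 'run_2016-02-26_21-46-25'; timestamps sort chronologically
logdir = datetime.now().strftime('run_%Y-%m-%d_%H-%M-%S')
summary_writer = tf.train.SummaryWriter(os.path.join('experiments', logdir), sess.graph)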
My Python code receives a byte array that represents the bytes of an HDF5 file.
I'd like to read this byte array into an in-memory h5py file object without first writing it to disk. This page says that I can open a memory-mapped file, but it would be a new, empty file. I want to go from byte array to in-memory HDF5 file, use it, discard it, and never write to disk at any point.
Is it possible to do this with h5py? (or with hdf5 using C if that is the only way)
You could try to use Binary I/O to create a File object and read it via h5py:
import io
import h5py
f = io.BytesIO(YOUR_H5PY_STREAM)
h = h5py.File(f, 'r')
You can use io.BytesIO or tempfile to create h5 objects, as shown in the official docs: http://docs.h5py.org/en/stable/high/file.html#python-file-like-objects.
The first argument to File may be a Python file-like object, such as an io.BytesIO or tempfile.TemporaryFile instance. This is a convenient way to create temporary HDF5 files, e.g. for testing or to send over the network.
tempfile.TemporaryFile
>>> tf = tempfile.TemporaryFile()
>>> f = h5py.File(tf)
or io.BytesIO
"""Create an HDF5 file in memory and retrieve the raw bytes
This could be used, for instance, in a server producing small HDF5
files on demand.
"""
import io
import h5py
bio = io.BytesIO()
with h5py.File(bio) as f:
f['dataset'] = range(10)
data = bio.getvalue() # data is a regular Python bytes object.
print("Total size:", len(data))
print("First bytes:", data[:10])
The following example uses PyTables (tables), which can also read and manipulate the HDF5 format, instead of h5py.
import urllib.request
import tables
url = 'https://s3.amazonaws.com/<your bucket>/data.hdf5'
response = urllib.request.urlopen(url)
h5file = tables.open_file("data-sample.h5", driver="H5FD_CORE",
                          driver_core_image=response.read(),
                          driver_core_backing_store=0)
I have .tiff files, each of which contains 25 sections of a stack. Is there a way to use the "Image to Stack" command in batch? Each data set contains 60 TIFFs covering all three color channels.
Thanks
Christine
The general way to discover how to do these things is to use the macro recorder, which you can find under Plugins > Macros > Record .... If you then go to File > Import > Image Sequence... and select the first file of the sequence as normal, you should see something like the following appear in the recorder:
run("Image Sequence...", "open=[/home/mark/a/1.tif] number=60 starting=1 increment=1 scale=100 file=[] or=[] sort");
To make this work for arbitrary numbers of slices (my example happened to have 60), just leave out the number=60 part. So, for example, to convert this directory of files to a single file from the command line, you can do:
imagej -eval 'run("Image Sequence...", "open=[/home/mark/a/1.tif] starting=1 increment=1 scale=100 file=[] or=[] sort"); saveAs("Tiff", "/home/mark/stack.tif");' -batch
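If you need to run this over many data sets (for example one directory of slices per data set or channel), a small wrapper can drive the same headless call per directory. This is only a sketch: the directory layout and the imagej launcher name are assumptions, and the macro string simply reuses the one recorded above:

import glob
import os
import subprocess

# hypothetical layout: one sub-directory of TIFF slices per data set / channel
for dataset_dir in sorted(glob.glob("/home/mark/data/*/")):
    first_slice = sorted(glob.glob(os.path.join(dataset_dir, "*.tif*")))[0]
    out_stack = dataset_dir.rstrip("/") + "_stack.tif"
    macro = ('run("Image Sequence...", "open=[%s] starting=1 increment=1 '
             'scale=100 file=[] or=[] sort"); saveAs("Tiff", "%s");'
             % (first_slice, out_stack))
    subprocess.call(["imagej", "-eval", macro, "-batch"])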