Give tqdm a hint about task duration to improve initial accuracy

I am using tqdm to track progress while processing 6 large files, each of which takes about 10 min. During the first 10 min, tqdm cannot offer an accurate time estimate because it has not collected any data points yet. After the first file is done, the progress report works satisfactorily.
Is there a way to give tqdm my own estimate of how long the task will take? For example, something like tqdm(file_list, hint="10 min") would have tqdm start with "60 min remaining" rather than "? min remaining". As it actually progresses through the iterable, it would then update this initial estimate accordingly.
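As far as I know, tqdm offers no such hint parameter. One workaround is to compute your own blended ETA and display it through set_postfix (a documented tqdm method); the blending heuristic, the est_per_item value, and the placeholder loop body below are all illustrative assumptions, not tqdm API:
import time
from tqdm import tqdm

file_list = ["a", "b", "c", "d", "e", "f"]  # placeholder file names
est_per_item = 600.0  # our prior guess: ~10 min per file, in seconds

start = time.time()
bar = tqdm(file_list)
for done, f in enumerate(bar):
    if done == 0:
        # No observations yet: rely entirely on the prior estimate.
        eta = est_per_item * len(file_list)
    else:
        # Blend observed throughput into the remaining-time estimate.
        per_item = (time.time() - start) / done
        eta = per_item * (len(file_list) - done)
    bar.set_postfix(eta="%.1f min" % (eta / 60))
    time.sleep(1)  # stand-in for actually processing file f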

Related

Calculate the InfluxDB average

I want to process values from InfluxDB in Grafana.
The end goal is to show how many miles the current vehicle has traveled in a certain time frame.
You can use the formula: average velocity * time.
Do more experienced users have any good methods?
What I'm thinking is: I can use the mean function to get the average speed over a fixed period of time and the corresponding mileage, and then add all the mileage together. How do I do that?
What if you only use SQL?
1.) InfluxDB uses InfluxQL, not SQL
2.) Your approach average velocity * time is inaccurate
3.) Use suitable InfluxDB functions; I would say INTEGRAL() is the best function for this case, plus some basic arithmetic. Don't expect 100% accuracy. Accuracy depends heavily on the metric sampling, e.g. with 1-minute sampling: what if the vehicle is driving for 59 seconds but is not moving at the moment the sample is taken? So don't be surprised when even 10-second sampling is inaccurate. A sketch of such a query follows below.
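A minimal sketch of the INTEGRAL() approach, using the Python influxdb client (the database, measurement, and field names here are assumptions; adapt them to your schema). With velocity stored in km/h, integrating against a 1h unit yields distance in km:
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="telemetry")

# INTEGRAL() computes the area under the velocity curve; with velocity
# in km/h and a unit of 1h, the result is distance in km.
query = ('SELECT INTEGRAL("velocity", 1h) AS "distance_km" '
         'FROM "vehicle" WHERE time > now() - 24h')
result = client.query(query)
print(list(result.get_points()))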

Don't estimate total runtime with tqdm

I'm using tqdm to generate the progress bar for a loop where iterations take an increasing amount of time with increasing value of the iterator. The iterations per second and estimated completion metrics are thus not particularly meaningful, as previous iterations cannot (easily) be used to predict the runtime of future iterations.
Is there an easy way to disable displaying the estimation of iterations per second and total runtime with tqdm?
Relevant example code:
from tqdm import tqdm
import time

for t in tqdm(range(10)):
    time.sleep(t)
tqdm's README describes the bar_format argument as follows:
Specify a custom bar string formatting. May impact performance.
[default: '{l_bar}{bar}{r_bar}'], where l_bar='{desc}: {percentage:3.0f}%|' and r_bar='| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]'...
Since the part you don't care about is mostly in {r_bar}, you can just tweak that part of the default value to omit [{elapsed}<{remaining}, {rate_fmt}], as follows:
from time import sleep
from tqdm import tqdm

for time in tqdm(range(10),
                 bar_format="{l_bar}|{bar}| {n_fmt}/{total_fmt}{postfix}"):
    sleep(time)

Setting correct input for RNN

In a database there are time-series data with records:
device - timestamp - temperature - min limit - max limit
device - timestamp - temperature - min limit - max limit
device - timestamp - temperature - min limit - max limit
...
For every device there are 4 hours of time-series data (at 5-minute intervals) before an alarm was raised, and 4 hours of time-series data (again at 5-minute intervals) that didn't raise any alarm. (The original question includes a graph illustrating this layout for every device.)
I need to use an RNN class in Python for alarm prediction. We define an alarm as the temperature going below the min limit or above the max limit.
After reading the official TensorFlow documentation here, I'm having trouble understanding how to set the input to the model. Should I normalise the data beforehand, and if yes, how?
Reading the answers here didn't help me either to get a clear view of how to transform my data into an acceptable format for the RNN model.
Any help on how the X and Y in model.fit should look for my case?
If you see any other issue regarding this problem, feel free to comment on it.
PS. I have already set up Python in Docker with TensorFlow, Keras etc., in case this information helps.
You can begin with the snippet that you mention in the question.
Any help on how the X and Y in model.fit should look like for my case?
X should be a numpy array of shape [num_samples, sequence_length, D], where D is the number of values per timestamp. I suppose D=1 in your case, because you only pass the temperature value. See the sketch below.
y should be a vector of target values (as in the snippet): either binary (alarm/not_alarm) or continuous (e.g. max temperature deviation). In the latter case you'd need to replace the sigmoid activation with something else.
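A minimal sketch of how X and y could be assembled (the window length of 48 follows from 4 hours at 5-minute intervals; the device count and the random data are placeholders for the real database records):
import numpy as np

num_devices = 10   # hypothetical number of devices
seq_len = 48       # 4 hours / 5-minute interval = 48 timestamps
D = 1              # one value (temperature) per timestamp

# Fake temperatures standing in for the database records:
# one alarm window and one no-alarm window per device.
temps = np.random.uniform(20.0, 90.0, size=(num_devices * 2, seq_len))

X = temps.reshape(-1, seq_len, D)    # [num_samples, seq_len, D]
y = np.array([1, 0] * num_devices)   # 1 = alarm window, 0 = no alarm

# Min-max normalisation (see the preprocessing advice below).
X = (X - X.min()) / (X.max() - X.min())
# X and y can now be passed to model.fit(X, y, ...)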
Should I normalise the data beforehand
Yes, it's essential to preprocess your raw data. I see 2 crucial things to do here:
Normalise temperature values with min-max scaling or standardization (wiki, sklearn preprocessing). Plus, I'd add a bit of smoothing.
Drop some fraction of the last timestamps from all of the time series to avoid information leakage.
Finally, I'd say that this task is more complex than it seems. You might want to either find a good starter tutorial on time-series classification, or a course on machine learning in general. I believe you can find a better method than an RNN.
Yes, you should normalize your data. I would look at differencing by day, i.e. a differencing interval of 24 hours expressed in 5-minute steps (288 steps). You can also try a yearly difference, but that depends on your choice of window size (remember RNNs don't do well with large windows). You may possibly want to use a log-transformation like the above user said, but this data also seems to be somewhat stationary, so I could see that not being needed.
For your model.fit, you are technically training the equivalent of a language model, where you predict the next output. So your inputs will be the preceding x values and preceding normalized y values of whatever window size you choose, and your target value will be the normalized output at a given time step t. Just so you know, a 1-D conv net is good for classification, but good call on the RNN because of the temporal aspect of temperature spikes.
Once you have trained a model on the x values and normalized y values and can tell that it is actually learning (converging), you can use model.predict with the preceding x values and preceding normalized y values. Take the output and un-normalize it to get an actual temperature value, or keep the normalized value and feed it back into the model to get the time+2 prediction. A minimal fit/predict sketch follows below.
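A minimal sketch of fitting such a model in Keras (the LSTM architecture, layer sizes, and random stand-in data are my assumptions, not the asker's setup):
import numpy as np
from tensorflow import keras

seq_len, D = 48, 1
X = np.random.rand(20, seq_len, D)    # stand-in for real windows
y = np.random.randint(0, 2, size=20)  # 1 = alarm, 0 = no alarm

model = keras.Sequential([
    keras.Input(shape=(seq_len, D)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1, activation="sigmoid"),  # binary alarm output
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=4)
print(model.predict(X[:2]))  # alarm probabilities for two windows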

Change in value of one frequency bin affects FFT and IFFT values of non-changing bins

I have a 3001x577 matrix. I want to apply an operation to the first 120 samples, which corresponds to 20 Hz. The sampling interval is 2 ms, so Fnyq = 250 Hz. I took out the first 120 samples, applied the filter, and put the result back in place of the older 120 samples. I noticed that after applying an IFFT, the values of bins greater than 120 had changed, and this is evident in my final result. I got the desired filter result, but it ends up changing the values of samples which I want untouched.
Can someone explain why a change in the value of a few frequency bins affects the IFFT or FFT of the non-changing bins? I am using MATLAB. And how can I prevent it?
You took part of the spectrum (the first 120 samples), changed this part somehow, and transformed the outcome back into the time domain using an IFFT. It is to be expected that the signal has changed beyond the 120 samples, since you manipulated frequency components, which alters all samples in the time domain. Think of it this way: you changed the amplitude (and phase) of 120 sinusoids and then expect the outcome to be limited to a certain time extent. Maybe you can post a new question where you describe what you actually want to achieve instead of the experiment you perform to get the job done.
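A small demonstration of this effect in Python/numpy (the 0.5 attenuation stands in for whatever filter was applied; rfft/irfft handle conjugate symmetry automatically): attenuating only the first 120 bins changes every one of the 3001 time-domain samples.
import numpy as np

fs = 500.0                  # 2 ms sampling interval -> fs = 500 Hz
x = np.random.randn(3001)   # one trace, standing in for a column

X = np.fft.rfft(x)          # 1501 non-negative frequency bins
X[:120] *= 0.5              # attenuate the first 120 bins (~20 Hz)
x_mod = np.fft.irfft(X, n=3001)

# The difference is nonzero essentially everywhere, not just in a
# 120-sample region: each bin is a sinusoid spanning the whole trace.
print(np.count_nonzero(np.abs(x - x_mod) > 1e-12))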

Number Generator wave cycle to graph output

I'm looking to generate a waveform from a cycle of numbers that increase and then decrease at a given rate. The frequency can vary between 1 and 40 cycles per minute and the amplitude varies between 100 and 3000. The idea is to form a breathing-like pattern for "breaths per minute" (1-40) and an inhaled volume per breath (100-3000).
I'm new here and I can only find random generators. I have looked at NSTimer and UIGraphs from the Ios-Developer Tesla tutorial app.
Could anyone point me in the right direction?
Many thanks.
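One simple model, sketched in Python since the maths is platform-independent (the rectified-sine shape and the parameter names are my assumptions, not an established breathing model): each breath rises and falls over 60/bpm seconds, peaking at the breath volume.
import numpy as np

def breathing_wave(t, bpm=12, volume=500.0):
    # One breath lasts 60/bpm seconds; |sin| gives a rise-then-fall
    # shape per breath, scaled so the peak equals the breath volume.
    freq_hz = bpm / 60.0
    return volume * np.abs(np.sin(np.pi * freq_hz * t))

t = np.linspace(0.0, 10.0, 1000)             # 10 seconds of samples
y = breathing_wave(t, bpm=12, volume=500.0)  # feed this to any plotter
print(y.max(), y.min())                      # peak ~500, troughs ~0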
