Connecting and accessing data from influxdb(self hosted server) to jupyter notebook - influxdb

I have been trying to access data on a self-hosted InfluxDB server from a Jupyter notebook.
Reference link: https://www.influxdata.com/blog/streaming-time-series-with-jupyter-and-influxdb/
from influxdb import InfluxDBClient
from datetime import datetime
import os
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS
client= InfluxDBClient(url="", token=token, org=org,bucket=bucket,host=influxdb_host)
The code executes successfully up to this point. I then tried the next step suggested in the reference link, but I get an import error for Stream, even though I have successfully imported hvplot.
def source_data(auto_refresh: int, query: str, sink: Stream):
    rx \
        .interval(period=timedelta(seconds=auto_refresh)) \
        .pipe(ops.map(lambda start: f'from(bucket: "my-bucket") '
                                    f'|> range(start: -{auto_refresh}s, stop: now()) '
                                    f'{query}')) \
        .pipe(ops.map(lambda query: client.query_api().query_data_frame(query, data_frame_index=['_time']))) \
        .pipe(ops.map(lambda data_frame: data_frame.drop(columns=['result', 'table']))) \
        .subscribe(observer=lambda data_frame: sink.emit(data_frame), on_error=lambda error: print(error))
    pass
Is this the right way to pull self-hosted InfluxDB data into a Jupyter notebook, or have I been looking at the wrong references?
Please suggest any reference links for reading data from a self-hosted InfluxDB server in a Jupyter notebook.
Thanks,
Deepika
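One possible cause of the import error is that Stream is never imported: in the referenced blog post it comes from the streamz package, and the snippet above also relies on rx, rx.operators (as ops) and datetime.timedelta. The following is a minimal sketch for checking which of these packages are visible to the notebook's kernel. The helper name missing_packages is hypothetical, and the package list is an assumption based on the names used in the snippet.

```python
import importlib.util


def missing_packages(packages):
    """Return the top-level packages from `packages` that cannot be found."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Assumed top-level import names behind the snippet above:
# streamz (Stream), rx (rx and rx.operators), hvplot, influxdb_client.
NEEDED = ["streamz", "rx", "hvplot", "influxdb_client"]

if __name__ == "__main__":
    missing = missing_packages(NEEDED)
    if missing:
        print("Not installed in this kernel:", ", ".join(missing))
    else:
        print("All packages found; remember to import the names, e.g.")
        print("from streamz import Stream")
```

If nothing is reported missing, the error is more likely a missing import line in the notebook cell than a missing installation.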

Related

Flux via cURL will not run if the input data file contains import statement

The following works in REPL mode:
influx -precision rfc3339 -type=flux -path-prefix=/api/v2/query
> bName = "benchmark_db/autogen"
> import "influxdata/influxdb/v1"
> v1.measurements(bucket: bName)
I would like to wrap the query in a script. According to this doc, I created a script at /home/demo.flux, put the above content in it, and ran the following command:
curl -XPOST localhost:8086/api/v2/query -sS -H 'Accept:application/csv' -H 'Content-type:application/vnd.flux' --data-binary "@/home/demo.flux"
However, it returns with an error:
error
loc 3:1-3:7: invalid statement #3:1-3:7: import
import statement must be first, before any other declarations.
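The error message says the import must come before any other statement, whereas in demo.flux it follows the bName assignment. A minimal sketch of reordering such a script so that import lines come first, assuming each import sits on its own line; hoist_imports is a hypothetical helper and not part of any InfluxDB tooling.

```python
def hoist_imports(script: str) -> str:
    """Move Flux `import "..."` lines to the top, keeping their relative order."""
    imports, body = [], []
    for line in script.splitlines():
        if line.strip().startswith("import "):
            imports.append(line.strip())
        else:
            body.append(line)
    return "\n".join(imports + body)


demo = '''bName = "benchmark_db/autogen"
import "influxdata/influxdb/v1"
v1.measurements(bucket: bName)'''

print(hoist_imports(demo))
```

For a three-line script it is simpler to move the import line by hand; the helper only illustrates the ordering rule.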

vscode dev container python interactive (`tkagg`) plots

Expected Behavior (local environment: fresh MacOS 12.4 installation)
With no environment updates except $ pip3 install matplotlib, I can successfully run this simple plot from the Matplotlib documentation:
Example Code:
# testplot.py
import matplotlib.pyplot as plt
import numpy as np
# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)
fig, ax = plt.subplots()
ax.plot(t, s)
ax.set(xlabel='time (s)', ylabel='voltage (mV)',
       title='About as simple as it gets, folks')
ax.grid()
fig.savefig("test.png")
plt.show()
Actual Output (saved to a .png after window opens):
Run $ python3 testplot.py in the terminal:
Observed Behavior (vscode python 3.8 dev container)
Disclaimer: This post does not address notebook-based plots (which work fine but are not always preferred)
However, when I run this in my dev container, I get the following error:
testplot.py:16: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
plt.show()
First Attempted Solution:
Following this previously posted solution, I specified the backend (export MPLBACKEND=TKAgg) before running the interpreter, but the error persists.
Second Attempted Solution:
Following the comments, I added the following lines to the script:
import matplotlib
matplotlib.use('tkagg')
In the v3.8 dev container, this addition changes the error to:
Traceback (most recent call last):
  File "testplot.py", line 5, in <module>
    matplotlib.use('tkagg')
  File "/usr/local/python/lib/python3.8/site-packages/matplotlib/__init__.py", line 1144, in use
    plt.switch_backend(name)
  File "/usr/local/python/lib/python3.8/site-packages/matplotlib/pyplot.py", line 296, in switch_backend
    raise ImportError(
ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running
Note: adding these two lines broke the local script as well. The point of the local example was to show that it plots stuff without installing anything except matplotlib.
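The 'headless' ImportError suggests the container has no display for Tk to attach to. A minimal sketch of choosing the backend from the environment, assuming a Linux container where a non-empty DISPLAY (or WAYLAND_DISPLAY) variable indicates a reachable display server; choose_backend is a hypothetical helper, and forwarding a display into the container is a separate step not shown here.

```python
import os


def choose_backend(env=None):
    """Return 'TkAgg' when a display variable is set, otherwise 'Agg'."""
    env = os.environ if env is None else env
    has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return "TkAgg" if has_display else "Agg"


backend = choose_backend()
print("Selected backend:", backend)

# In testplot.py this would be applied before importing pyplot:
#   import matplotlib
#   matplotlib.use(choose_backend())
#   import matplotlib.pyplot as plt
# With 'Agg', call fig.savefig(...) and skip plt.show().
```

This check does not fit the local macOS case, where a native GUI backend is available without DISPLAY being set.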

cannot import name 'convert_examples_to_features' deeppavlov

I tried to run a code snippet from the DeepPavlov NER tutorial.
Link tutorial
First I ran these commands:
python -m deeppavlov install ner_ontonotes_bert_mult
python -m deeppavlov interact ner_ontonotes_bert_mult [-d]
Then
from deeppavlov import configs, build_model
ner_model = build_model(configs.ner.ner_ontonotes_bert_mult, download=True)
ner_model(['World Curling Championship will be held in Antananarivo'])
I get this error:
ImportError: cannot import name 'convert_examples_to_features'
Also, if I paste the code into VS Code, the line
configs.ner.ner_ontonotes_bert_mult
is flagged with "Instance of 'Struct' has no 'ner' member"
(configs is a Struct).
How can I fix it?
I couldn't find an answer on Google.

Neo4j duplicate input id exception

I am new to Neo4j and I am trying to construct a Bitcoin transaction graph with it. I am following behas/bitcoingraph to do so, and I came across the neo4j-import command for creating the database:
$NEO4J_HOME/bin/neo4j-import --into $NEO4J_HOME/data/graph.db \
--nodes:Block blocks_header.csv,blocks.csv \
--nodes:Transaction transactions_header.csv,transactions.csv \
--nodes:Output outputs_header.csv,outputs.csv \ .......
After executing the above command I encountered this error:
Exception in thread "Thread-1" org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.DuplicateInputIdException: Id '00000000f079868ed92cd4e7b7f50a5f8a2bb459ab957dd5402af7be7bd8ea6b' is defined more than once in Block, at least at /home/nikhil/Desktop/Thesis/bitcoingraph/blocks_0_1000/blocks.csv:409 and /home/nikhil/Desktop/Thesis/bitcoingraph/blocks_0_1000/blocks.csv:1410
Here is blocks_header.csv:
hash:ID(Block),height:int,timestamp:int
Does anyone know how to fix it? I read that ID spaces offer a solution, but I am not quite sure how to use them. Thanks in advance for any help.
The --skip-duplicate-nodes flag will skip import of nodes with the same ID instead of aborting the import.
For example:
$NEO4J_HOME/bin/neo4j-import --into $NEO4J_HOME/data/graph.db \
--nodes:Block blocks_header.csv,blocks.csv --skip-duplicate-nodes \
--nodes:Transaction transactions_header.csv,transactions.csv \
--nodes:Output outputs_header.csv,outputs.csv \ .......
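Skipping duplicates hides them rather than explaining them, so it can help to list the repeated IDs before importing. A minimal sketch, assuming the data file has no header row (the header lives in blocks_header.csv) and the ID is in the first column; find_duplicate_ids and find_duplicates_in_file are hypothetical helpers.

```python
import csv
from collections import defaultdict


def find_duplicate_ids(rows, id_column=0):
    """Map each repeated ID to the 1-based row numbers where it appears."""
    seen = defaultdict(list)
    for row_no, row in enumerate(rows, start=1):
        if row:  # ignore empty lines
            seen[row[id_column]].append(row_no)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


def find_duplicates_in_file(path, id_column=0):
    """Run the check on a CSV file such as blocks.csv."""
    with open(path, newline="") as handle:
        return find_duplicate_ids(csv.reader(handle), id_column)


sample = [["aa", "0", "100"], ["bb", "1", "200"], ["aa", "2", "300"]]
print(find_duplicate_ids(sample))  # → {'aa': [1, 3]}
```

If the reported rows turn out to be exact copies of each other, the CSV was probably written twice, and regenerating it may be cleaner than skipping duplicates at import time.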

how to use the "display" function in a scala 2.11 with Spark 2.0 notebook in dsx

In DSX, is there a way to use "display" in a Scala 2.11 with Spark 2.0 notebook? (I know it can be done in a Python notebook with PixieDust.) For example:
display(spark.sql("SELECT COUNT(zip), SUM(pop), city FROM hive_zips_table
WHERE state = 'CA' GROUP BY city ORDER BY SUM(pop) DESC"))
But I want to do the same in a Scala notebook. Currently I am just using the show command below, which only prints the data in tabular form, with no graphics.
spark.sql("SELECT COUNT(zip), SUM(pop), city FROM hive_zips_table
WHERE state = 'CA' GROUP BY city ORDER BY SUM(pop) DESC").show()
Note:
Pixiedust currently works with Spark 1.6 and Python 2.7.
Pixiedust currently supports Spark DataFrames, Spark GraphFrames and Pandas
Reference:-
https://github.com/ibm-cds-labs/pixiedust/wiki
But if you can use Spark 1.6, here is a quick workaround to get that fancy display function:
You can go the other way around, since PixieDust lets you use Scala and Python in one Python notebook with the %%scala cell magic.
https://github.com/ibm-cds-labs/pixiedust/wiki/Using-Scala-language-within-a-Python-Notebook
Step 1: Create a notebook with Python 2 and Spark 1.6.
Install PixieDust and import it:
!pip install --user --no-deps --upgrade pixiedust
import pixiedust
Define your variables or your DataFrame in Scala under the %%scala magic:
%%scala
import org.apache.spark.sql._
print(sc.version)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val __df = sqlContext.read.json("people.json")
__df.show()
or
create your DataFrame in any other way, for example:
// Assign the DataFrame itself; calling .show() here would make __df a Unit.
val __df = sqlContext.sql("""SELECT COUNT(zip), SUM(pop), city FROM hive_zips_table
WHERE state = 'CA' GROUP BY city ORDER BY SUM(pop) DESC""")
Step 2: In a separate cell, run the following to access the __df variable from your Python shell:
display(__df)
Reference to my sample Notebook:-
IBM Notebook: https://apsportal.ibm.com/analytics/notebooks/095520cb-c9ff-4f4a-a829-f458f20b4505/view?access_token=d4de7944ad7d6bfc179632a3036a7971c130e54d1a30ecf5df34ece8c4f8c3b5
Github: https://github.com/charles2588/bluemixsparknotebooks/blob/master/pixiedust/PixiedustTestCase.ipynb
Thanks,
Charles.
You can get a similar result in Zeppelin:
z.show(dataframe)

Resources