How to monitor a Python application (without Django or another framework) in Instana

I have a simple Python application without any framework or API calls.
How will I monitor the Python application on Instana in Kubernetes?
I want a code snippet to add to the Python application that traces the application
and displays the traces in Instana.

How will I monitor the Python application on Instana in Kubernetes?
There is a publicly available guide that should help you set up the Kubernetes agent.
I have a simple Python application without any framework or API calls.
Well, Instana is built for distributed tracing, meaning distributed components calling each other, predominantly each other's APIs through known frameworks (with registered spans).
Nevertheless, you can make use of an SDKSpan; here is a very simple example:
import os

# Force the tracer into test mode so spans are queued locally
# instead of being sent to an agent.
os.environ["INSTANA_TEST"] = "true"

import instana
import opentracing.ext.tags as ext
from instana.singletons import get_tracer
from instana.util.traceutils import get_active_tracer


def foo():
    tracer = get_active_tracer()
    # Create a child (exit) span under the currently active span.
    with tracer.start_active_span(
            operation_name="foo_op",
            child_of=tracer.active_span
    ) as foo_scope:
        foo_scope.span.set_tag(ext.SPAN_KIND, "exit")
        result = 20 + 1
        foo_scope.span.set_tag("result", result)
        return result


def main():
    tracer = get_tracer()
    # The root (entry) span for this run.
    with tracer.start_active_span(operation_name="main_op") as main_scope:
        main_scope.span.set_tag(ext.SPAN_KIND, "entry")
        answer = foo() + 21
        main_scope.span.set_tag("answer", answer)


if __name__ == '__main__':
    main()
    spans = get_tracer().recorder.queued_spans()
    print('\nRecorded Spans and their IDs:',
          *[(index,
             span.s,
             span.data['sdk']['name'],
             dict(span.data['sdk']['custom']['tags']),
             ) for index, span in enumerate(spans)],
          sep='\n')
This should work in any environment, even without an agent, and it should give you output like this:
Recorded Spans and their IDs:
(0, 'ab3af60079f3ca57', 'foo_op', {'span.kind': 'exit', 'result': 21})
(1, '53b67f7298684cb7', 'main_op', {'span.kind': 'entry', 'answer': 42})
Of course, in production you wouldn't want to print the recorded spans but rather send them to a properly configured agent, so you should remove the INSTANA_TEST setting.
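As a rough sketch of what that could look like (the INSTANA_AGENT_HOST / INSTANA_AGENT_PORT variable names and the host value are assumptions to check against the Instana Python documentation and adapt to your own agent deployment, e.g. the DaemonSet from the Kubernetes guide above):

import os

# Point the Python sensor at a reachable agent instead of test mode.
# Variable names and host are assumptions; adjust for your cluster.
os.environ.setdefault("INSTANA_AGENT_HOST", "instana-agent")  # hypothetical host/service name
os.environ.setdefault("INSTANA_AGENT_PORT", "42699")          # default agent port, if unchanged

import instana  # importing the package activates the tracer
from instana.singletons import get_tracer


def main():
    tracer = get_tracer()
    with tracer.start_active_span(operation_name="main_op") as scope:
        scope.span.set_tag("answer", 42)


if __name__ == "__main__":
    main()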

Related

How to set up a proper printout destination for dask multiprocessing in a Jupyter notebook on Linux

I am using dask in a Jupyter notebook on a Linux server to run Python functions on multiple CPUs. The Python functions contain standard print statements. I would like the printed output to be shown in the Jupyter notebook right below the cell. However, the printout is all shown in the console. Can anyone explain why this happens and how to make the output of dask-executed functions appear in the notebook, or in both the console and the notebook?
The following is a simplified version of the problem:
import dask
import functools
from dask import compute, delayed

iter_list = [0, 1]

def iFunc(item):
    print('Meme', item)
    # calling this function directly prints to the
    # notebook below the cell, as desired.

with dask.config.set(scheduler='processes', num_workers=2):
    func1 = functools.partial(iFunc)
    ret = compute([delayed(func1)(item) for item in iter_list])
    # surprisingly, Meme 0 and Meme 1 only print to the console,
    # not the notebook. Not desired, hard to debug. Any clue?
The whole point of dask is leveraging multiple threads, processes, or nodes/machines to distribute work. The workers you create are therefore not on the same thread as your client, and may not be in the same process, or even on the same machine (or, for that matter, in the same country) as your client, depending on how you set up your cluster.
If you start a LocalCluster from your jupyter notebook, whether you're using threads or processes, you should see printed output appearing as output in the cells which execute jobs on the workers:
In [1]: import dask.distributed as dd

In [2]: client = dd.Client(processes=4)

In [3]: def job():
   ...:     print("hello from a worker!")

In [4]: client.submit(job).result()
hello from a worker!
However, if a different process is spinning up your workers, it is up to that process to decide how to handle stdout. So if you're spinning up workers using the jupyterlab terminal, stdout will appear there. If you're spinning up workers in a kubernetes pod, stdout will appear in the worker logs. Dask doesn't actively manage standard out, so it's up to you to handle this. Note that this also applies to logging - neither stdout nor logs are captured by dask. This is actually a really important design feature - many distributed systems have their own systems for managing the standard out & logging of nodes, and dask does not want to impose its own parallel/conflicting system for handling output. The main focus of dask is executing the tasks, not managing a distributed logging system.
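Since output handling is left to you, one do-it-yourself option (just a sketch, not a built-in dask feature) is to configure Python logging on every worker, for example via client.run, and write to a per-worker file:

import logging
import os

from dask.distributed import Client


def setup_worker_logging():
    # Runs once per worker process: append log records to a file
    # named after the worker's PID so output isn't lost in a console.
    handler = logging.FileHandler(f"worker-{os.getpid()}.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.INFO)


def job(item):
    logging.info("processing item %s", item)
    return item * 2


if __name__ == "__main__":
    client = Client(n_workers=2)      # local cluster, two worker processes
    client.run(setup_worker_logging)  # execute the setup on every worker
    print(client.gather(client.map(job, [0, 1])))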
That said, dask does have the infrastructure for passing around messages, and this is something the package could support. There is an open issue and pull request attempting to add this ability as a feature, but it looks like there are a lot of open design questions that would need to be resolved before this could be added. Many of them revolve around the issues I raised above - how to add a clean distributed logging feature without overburdening the scheduler, complicating the already complex set of configuration options, or overriding the important, existing logging systems users rely on. The dask core team seems to agree that this is a good idea, if the tough design questions can be resolved.
You certainly always have the option of returning messages. For example, the following would work:
In [9]: import time

In [10]: def job():
    ...:     return_blob = {"diagnostics": {}, "messages": [], "return_val": None}
    ...:     start = time.time()
    ...:     return_blob["diagnostics"]["start"] = start
    ...:
    ...:     try:
    ...:         return_blob["messages"].append("raising error")
    ...:         # this causes a ZeroDivisionError
    ...:         return_blob["return_val"] = 1 / 0
    ...:     except Exception as e:
    ...:         return_blob["diagnostics"]["error"] = e
    ...:
    ...:     return_blob["diagnostics"]["end"] = time.time()
    ...:     return return_blob
    ...:

In [11]: client.submit(job).result()
Out[11]:
{'diagnostics': {'start': 1644091274.738912,
  'error': ZeroDivisionError('division by zero'),
  'end': 1644091274.7389162},
 'messages': ['raising error'],
 'return_val': None}

Dart Functions Framework usage

I'm new to the Dart functions framework. My goal is to use this package to create several functions and deploy them to Cloud Run (in combination with Firebase, but I guess that's irrelevant to this question).
I've run the quick starts and read all of the content in the docs.
The quick starts cover just one function at a time (e.g. Hello World, Cloud Events, etc.), like this:
import 'package:functions_framework/functions_framework.dart';
import 'package:shelf/shelf.dart';

@CloudFunction()
Response function(Request request) {
  return Response.ok('Hello, World!');
}
But as you can see in the quick starts, only one function is handled in a project at a time. What if I want to deploy several functions? Should I:
Write several functions in the same project/file, so that the functions framework compiles the `server.dart` by itself,
OR
create a separate functions_framework project for each function?
Let me be more specific. Should I do the following (option 1, which makes more sense to me):
import 'dart:math';
import 'package:functions_framework/functions_framework.dart';
import 'package:shelf/shelf.dart';

@CloudFunction()
Response function(Request request) {
  return Response.ok('Hello, World!');
}

@CloudFunction()
Response function2(Request request) {
  if (Random().nextBool()) {
    return Response.ok('Hello, World!');
  } else {
    return Response.internalServerError();
  }
}
Or should I build a different folder by running a build_runner for each function I need in my project?
Is there a difference and/or a best practice?
Thanks in advance.
EDIT. This question is related to the deployment on Cloud Run itself, not just to testing on my own PC. To test my functions locally I did the following:
Run dart run build_runner build, so that it updates the server.dart file correctly (I can see that the framework does a lot behind the scenes and that _nameToFunctionTarget is basically a router);
Run the server in two different terminals, like this: dart run bin/server.dart --port MYPORT --target MYFUNCTION (where MYPORT and MYFUNCTION are either 8080/8081 or function/function2 respectively).
I guess I'm just confused about how to correctly manage this framework once it is deployed on Cloud Run.
EDIT 2. I just gave up using Dart as a serverless or even backend language. There's just too much jargon even for the basic things. Every backend framework is either dead or maintained by a single enthusiast (props to him!). The language has not yet received enough love from the Google team or the community, and at this moment in time it's basically not possible to go full-stack on Dart alone. It's a dream, but it can't be realized now. Furthermore, Dart lacks proper SDKs for Firestore, etc., so Firebase isn't an option. I find it easier to just learn Node.js and use the Firebase support for Firebase Functions written in Node.js, and I'll wait for more Dart support there in the future, if there ever is any.
The documentation is a bit sparse right now (and I'm new to it too! I couldn't find any good examples, so here goes...).
You can only have a single function that is served. It should be named 'function' (the type and name can be overridden; see the cloudevent example, dartfn generate cloudevent).
You 'could' have many of these deployed so that each one does a specific thing, such as processing cloud events as above, but most people want something more REST-like (see next).
You need to attach a Router() so that the single entry point (function) can be handled by specific logic in your code.
Example for REST:
Add shelf_router: ^1.1.2 to pubspec.yaml (under dependencies:).
Delegate the @CloudFunction() to use the Router().
functions.dart:
import 'package:functions_framework/functions_framework.dart';
import 'package:shelf/shelf.dart';
import 'package:shelf_router/shelf_router.dart';

Router app = Router()
  ..get('/health', (Request request) {
    return Response.ok('healthy');
  })
  ..get('/user/<user>', (Request request, String user) {
    // fetch the user... (probably return as json)
    return Response.ok('hello $user');
  })
  ..post('/user', (Request request) {
    // convert request body to json and persist... (probably return as json)
    return Response.ok('saved the user');
  });

@CloudFunction()
Future<Response> function(Request request) => app.call(request);

How can I keep a PBSCluster running?

I have access to a cluster running PBS Pro and would like to keep a PBSCluster instance running on the headnode. My current (obviously broken) script is:
import dask_jobqueue
from paths import get_temp_dir


def main():
    temp_dir = get_temp_dir()
    scheduler_options = {'scheduler_file': temp_dir / 'scheduler.json'}
    cluster = dask_jobqueue.PBSCluster(cores=24, memory='100GB', processes=1, scheduler_options=scheduler_options)


if __name__ == '__main__':
    main()
This script is obviously broken because after the cluster is created the main() function exits and the cluster is destroyed.
I imagine I must call some sort of execute_io_loop function, but I can't find anything in the API.
So, how can I keep my PBSCluster alive?
I think the Python API (advanced) section of the docs might be a good place to start for solving this issue.
Mind you, this is an example of how to create Schedulers and Workers, but I'm assuming the logic could be used in a similar way for your case.
import asyncio

import dask_jobqueue
from paths import get_temp_dir


async def create_cluster():
    temp_dir = get_temp_dir()
    scheduler_options = {'scheduler_file': temp_dir / 'scheduler.json'}
    cluster = dask_jobqueue.PBSCluster(cores=24, memory='100GB', processes=1, scheduler_options=scheduler_options)


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(create_cluster())
You might have to change the code a bit, but it should keep your create_cluster running until it finishes.
Let me know if this works for you.
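As written, create_cluster still returns right after the cluster object is constructed, so here is a plain synchronous sketch (an assumption-based variation, not taken from the docs) of one way to keep the process, and therefore the cluster, alive, reusing the arguments from the question (paths.get_temp_dir is assumed to exist as in the original script):

import time

import dask_jobqueue
from paths import get_temp_dir


def main():
    temp_dir = get_temp_dir()
    scheduler_options = {'scheduler_file': temp_dir / 'scheduler.json'}
    cluster = dask_jobqueue.PBSCluster(
        cores=24, memory='100GB', processes=1,
        scheduler_options=scheduler_options,
    )
    try:
        # Block forever so the cluster object is never garbage collected
        # and the scheduler keeps running on the headnode.
        while True:
            time.sleep(60)
    except KeyboardInterrupt:
        cluster.close()


if __name__ == '__main__':
    main()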

IBM Cloud Functions - Python Actions

I'm learning how to use serverless functions. I'm trying to connect a Watson Assistant through webhooks using a Python action that processes a small dataset, and I'm still struggling to make it work.
I've done my coding in a Jupyter environment, reading a raw CSV dataset from GitHub and using pandas to handle it. The issue is that when I invoke the action in IBM Cloud Functions, it works only about 10% of the time. I debugged in the Jupyter and Visual Studio environments and the code seems to be fine, but once I move it to the IBM Cloud Functions environment it doesn't perform.
import sys
import csv
import json
import pandas as pd

location = ('Germany')  # Passing country parameter for testing purposes
data = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/03-24-2020.csv')

def main(args):
    location = args.get("location")
    for index, row in data.iterrows():
        currentLoc = row['Country/Region']
        if currentLoc == location:
            covid_statistics = {
                "Province/State": row['Province/State'],
                "Country/Region": row['Country/Region'],
                "Confirmed": row['Confirmed'],
                "Deaths": row['Deaths'],
                "Recovered": row['Recovered']
            }
            return {"message": covid_statistics}
        else:
            return {"message": "Data not available"}

Network Tables C++

I am quite new to C++ socket programming. Since I am in an FRC team, I need to communicate between my application and the Compact RIO via an interface known as "Network Tables". I need to communicate from my C++ vision application to our robot code in Java. How do I implement NetworkTables in regular C++?
So here is what I did in Python, but the concept is the same. The goal would be to move motors based on values (sensor data) received from your driver station. So, how do I accomplish this? Data transfers will be done through NetworkTables.
First, initialize:
from networktables import NetworkTables
# As a client to connect to a robot
NetworkTables.initialize(server='roborio-XXX-frc.local')
By creating the instance you will be able to access NetworkTables connections, configure settings and listeners, and create table objects, which is what is actually used to send data.
Next,
sd = NetworkTables.getTable('SmartDashboard')
sd.putNumber('someNumber', 1234)
otherNumber = sd.getNumber('otherNumber')
Here, we're interacting with the SmartDashboard and calling two methods, to send and receive values.
Another example, from the API docs:
#!/usr/bin/env python3
#
# This is a NetworkTables server (eg, the robot or simulator side).
#
# On a real robot, you probably would create an instance of the
# wpilib.SmartDashboard object and use that instead -- but it's really
# just a passthru to the underlying NetworkTable object.
#
# When running, this will continue incrementing the value 'robotTime',
# and the value should be visible to networktables clients such as
# SmartDashboard. To view using the SmartDashboard, you can launch it
# like so:
#
# SmartDashboard.jar ip 127.0.0.1
#
import time
from networktables import NetworkTables
# To see messages from networktables, you must setup logging
import logging
logging.basicConfig(level=logging.DEBUG)
NetworkTables.initialize()
sd = NetworkTables.getTable("SmartDashboard")
i = 0
while True:
    print("dsTime:", sd.getNumber("dsTime", -1))
    sd.putNumber("robotTime", i)
    time.sleep(1)
    i += 1
