So, I've been using InfluxDB for a while now, but I can't retain data for more than 30 days, and I've been struggling to find out how to change this. I wanted to know whether that kind of retention policy is a paid feature on InfluxDB Cloud and, if it is, which plan has that specific feature.
I searched for the feature, but I did not find anything about it in the plan descriptions.
I would like to store audit logs on our GCP cluster (where our app is). There are different storage/DB options out there; we are looking at a single table, bucket, or similar, without relationships.
Background: we are delivering an enterprise, high-scale SaaS solution.
What I need to do with our audit logs: write them, search them by fields/columns, and combine conditions (AND, OR). Sort options are also important.
I focused on the following options (please let me know if there is something else that matches better):
Cloud Storage
Cloud Firestore
GCP-managed MongoDB Atlas
Our requirements are:
a scalable, high-performance store
data encrypted at rest
search capability (full-text search would be perfect, but I'm fine with a simple search by column/field)
What I've found so far from a requirements point of view:
Mongo has better performance than Firestore. Not sure how Cloud Storage (Standard storage class) compares with Mongo.
Cloud Storage and Cloud Firestore do encrypt data at rest. Not sure about Mongo.
Cloud Firestore and Mongo have search capability out of the box (not full-text search). Cloud Storage can be searched via BigQuery, over permanent/temporary tables.
My gut feeling is that Cloud Storage is not the best choice. I think its search capability is rather cumbersome, and it is designed for large binary objects (images, videos) rather than structured records. Please correct me if I'm wrong.
The last two are closer to a matching solution. From an enterprise standpoint, Mongo looks like the better fit.
Please let me know your thoughts.
Use BigQuery! You can sink the logs directly into BigQuery. In GCP, all data is encrypted at rest. BigQuery is a powerful data warehouse with strong query capabilities. All your requirements are met with this solution.
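To make that concrete, here is a minimal sketch of querying sunk audit logs from Python with the google-cloud-bigquery client. The project, dataset, table, and column names are assumptions; adjust them to however your log sink lays out the data.

    # Minimal sketch: filter, combine (AND/OR) and sort audit-log rows in BigQuery.
    # The my_project.audit.audit_logs table and its columns are hypothetical.
    from google.cloud import bigquery

    client = bigquery.Client(project="my_project")

    query = """
        SELECT ts, actor, action, resource
        FROM `my_project.audit.audit_logs`
        WHERE (actor = @actor OR action = @action)
          AND ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
        ORDER BY ts DESC
        LIMIT 100
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("actor", "STRING", "alice@example.com"),
            bigquery.ScalarQueryParameter("action", "STRING", "DELETE"),
        ]
    )

    for row in client.query(query, job_config=job_config).result():
        print(row.ts, row.actor, row.action, row.resource)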
I want to present release data complexity, which is associated with each node (epic, user story, etc.), in Grafana in the form of charts, but Grafana does not support the Neo4j database. Is there any way, directly or indirectly, to present Neo4j data in Grafana?
I'm having the same issues and found this question among others. From my research I cannot agree with this answer completely, so I felt I should point some things out here.
Just to clarify: a graph database may seem structurally different from a relational or time series database, but it is possible to build Cypher queries that basically return graph data as tables with proper columns as it would be with any other supported data source. Therefore this sentence of the above mentioned answer:
So what you want to do is just not possible.
is not absolutely true, I'd say.
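To illustrate the point about tabular Cypher results, here is a small sketch using the official neo4j Python driver. The connection details and the epic/user-story data model are assumptions, but the output has exactly the rows-and-columns shape a table-oriented datasource would expect.

    # Sketch: a Cypher query that returns plain rows and columns.
    # The labels, relationship and property names below are hypothetical.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    query = """
    MATCH (e:Epic)<-[:BELONGS_TO]-(s:UserStory)
    RETURN e.name AS epic, count(s) AS stories, sum(s.complexity) AS total_complexity
    ORDER BY total_complexity DESC
    """

    with driver.session() as session:
        for record in session.run(query):
            print(record["epic"], record["stories"], record["total_complexity"])

    driver.close()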
The actual problem is, there is no datasource plugin for Neo4j available at the moment. You would need to implement one on your own, which will be a lot of work (as far as I can see), but I suspect it to be possible. For me at least, this will be too much work to do, so I won't use any approach to read data directly from Neo4j into Grafana.
As a (possibly dirty) workaround (in my case), a service will regularly copy relevant portions of the Neo4j graph into a relational database (or a time series database, if the data model is sufficiently simple for that), which Grafana is aware of (see datasource plugins), so I can query it from there. This is basically the replication idea also given in the above mentioned answer. In this case you obviously end up with at least 2 different database systems and an additional service, which is not so insanely great, but at the moment it seems to be the quickest way to resolve the problem with the missing datasource plugin. Maybe this is applicable in your case, too.
Using Neo4j's Graphite metrics you can actually configure data to be sent to Grafana, and from there build whichever dashboards you like.
Up until recently, Graphite/Grafana wasn't supported, but it is now (in the recent 3.4 series releases), along with Prometheus and other options.
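For reference, this is configured in neo4j.conf; a rough sketch, assuming 3.x setting names and a Graphite/Carbon listener on localhost:

    # neo4j.conf - send metrics to a Graphite/Carbon endpoint (host/port and prefix are assumptions)
    metrics.enabled=true
    metrics.graphite.enabled=true
    metrics.graphite.server=localhost:2003
    metrics.graphite.interval=10s
    metrics.prefix=neo4j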
Update July 2021
There is a new plugin called Node Graph Panel (currently in beta) that can visualise graph structures in Grafana. A prerequisite for displaying your graph is to make sure that you have an API that exposes two data frames, one for nodes and one for edges, and that you set frame.meta.preferredVisualisationType = 'nodeGraph' on both data frames. See the Data API specification for more information.
So, one option would be to set up an API around your Neo4j instance that returns the nodes and edges according to the specification above. Note that I haven't tried it myself (yet), but it seems like a viable solution for getting Neo4j data into Grafana.
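As a rough sketch of what such an API could look like (Flask and the neo4j Python driver are my choices here; the endpoint path, Cypher queries, and node/edge field names are assumptions, and the exact frame format to return should follow the Data API specification mentioned above):

    # Sketch: expose Neo4j nodes and edges as two JSON lists for a node graph panel.
    # Endpoint path, queries and field names (id, title, source, target) are assumptions.
    from flask import Flask, jsonify
    from neo4j import GraphDatabase

    app = Flask(__name__)
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    @app.route("/api/graph")
    def graph():
        with driver.session() as session:
            nodes = [
                {"id": r["id"], "title": r["title"]}
                for r in session.run("MATCH (n) RETURN id(n) AS id, n.name AS title")
            ]
            edges = [
                {"id": r["id"], "source": r["source"], "target": r["target"]}
                for r in session.run(
                    "MATCH (a)-[rel]->(b) "
                    "RETURN id(rel) AS id, id(a) AS source, id(b) AS target"
                )
            ]
        return jsonify({"nodes": nodes, "edges": edges})

    if __name__ == "__main__":
        app.run(port=5000)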
Grafana supports these databases, but not Neo4j: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch.
So what you want to do is just not possible.
You can replicate your Neo4j data inside one of those databases, but the data model is really different (time series vs. graph).
If you just want to have some charts, you can use Apache Zeppelin for that.
Apologies in advance if this is trivial, but I only have about a month's worth of experience with Realm and Swift (I'm not new to programming in general), and I've devoted literally dozens of hours to trying to find a solution.
What I am trying to make: An app that retrieves price data from a single, global realm, /Companies, in a Realm Cloud instance, then graphs it. I need exactly two users to have access to the global realm. One "user" who periodically (every few months) updates the data (but does nothing else), and one user who can read and graph anything that the other user puts in /Companies.
Note: The app works as intended if it's using only a local database, but I’m fairly sure this means that it can’t be updated.
Apparently, this is an extremely complex goal, because I cannot find a single example of something like this. I have tried so many "Permissions" and multi-user examples from the realm.io docs and every site multiple pages deep into Google, but everything is about how to create private realms or restrict additional users from seeing an existing realm, or does not explain how to just create a user and let them access a realm in the cloud. There are a few hundred objects in /Companies, so doing it by hand in Realm Studio is not practical. Could someone please explain how to hard-code just a single global realm with two users?
It's really stupid that I'm having so much trouble with this, but I'm completely stumped. I don't need the app to be adaptive, I don't need any fancy UI, security, restrictions, none of that. I just need to be able to update some data from a separate entity (Realm Studio, a macOS app, an iOS app that never leaves the computer, whatever) so that the client iOS app can see and use the new data.
If someone is able to provide a solution, I’ll probably cry out of gratitude, I’m so exhausted and frustrated.
As far as I can tell, you simply want one user to update and one user to retrieve and use the updated data. I suggest the following two cloud solutions:
Firebase
One solution could be to use a backend, say Firebase, that will be used by the one updating the Realm - Jared Davidson has a decent tutorial that will get you started.
Realm Platform Cloud
Furthermore, for a simplified and powerful solution, you could use Realm's own cloud service, Realm Platform Cloud, which has a 30-day free trial followed by $30/month.
On the old pricing page they mention that all the Compute Engine instances used by Cloud Dataflow workers are billed based on sustained use price rules, but the new pricing page does not mention it anymore.
I assume that, since it internally uses the same Compute Engine instances, the discount should probably still apply, but since I couldn't find any mention of it anywhere, I would appreciate it if anyone could confirm this.
Old Pricing
New Pricing
In the new pricing model there is no sustained use discount.
I'm trying to establish whether Amazon SimpleDB is suitable for a subset of data I have.
I have thousands of deployed autonomous sensor devices recording data.
Each sensor device essentially reports a couple of values four times an hour each day, over months and years. I need to keep all of this data for historic statistical analysis. Generally, it is write once, read many times. Server-based applications run regularly to query the data to infer other information.
The rows of data today, in SQL look something like this:
(id, device_id, utc_timestamp, value1, value2)
Our existing MySQL solution is not going to scale up much further, with tens of millions of rows. We query things like "tell me the sum of all value1 yesterday" or "show me the average of value2 over the last 8 hours". We do this in SQL but can happily change to doing it in code. SimpleDB's "eventual consistency" appears fine for our purposes.
I'm reading up all I can and am about to start experimenting with our AWS account, but it's not clear to me how the various SimpleDB concepts (items, domains, attributes, etc.) relate to our domain.
Is SimpleDB an appropriate vehicle for this and what would a generalised approach be?
PS: We mostly use Python, but this shouldn't matter when considering this at a high level. I'm aware of the boto library at this point.
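To make the intent concrete, here is roughly the shape I have in mind with boto's SimpleDB support (the domain name, attribute names and zero-padding scheme are just illustrative); since SimpleDB's select has no aggregates, the sum would happen in code:

    # Rough sketch with boto (2.x) SimpleDB: one domain, one item per reading.
    # Names are illustrative; SimpleDB values are strings, so numbers are
    # zero-padded to keep lexicographic comparisons sane.
    import boto

    conn = boto.connect_sdb()          # uses AWS credentials from the environment
    domain = conn.create_domain('sensor_readings')

    # Write one reading (four of these per hour per device).
    item_name = 'device042|2012-03-05T10:15:00Z'
    domain.put_attributes(item_name, {
        'device_id': 'device042',
        'utc_timestamp': '2012-03-05T10:15:00Z',
        'value1': '0023.50',
        'value2': '0001.80',
    })

    # "Sum of value1 yesterday" for one device: select, then aggregate in code,
    # because SimpleDB select has no SUM/AVG.
    rows = domain.select(
        "select value1 from `sensor_readings` "
        "where device_id = 'device042' "
        "and utc_timestamp between '2012-03-04T00:00:00Z' and '2012-03-04T23:59:59Z'"
    )
    total = sum(float(item['value1']) for item in rows)
    print(total)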
Edit:
Continuing to search for solutions to this, I did come across the Stack Overflow question What is the best open source solution for storing time series data?, which was useful.
Just following up on this one many months later...
I did actually have the opportunity to speak to Amazon directly about this last summer, and eventually got access to the beta programme for what eventually became DynamoDB, but was not able to talk about it.
I would recommend it for this sort of scenario, where you need a primary key and what might be described as a secondary index/range, e.g. timestamps. This gives you much greater confidence in searching, i.e. "show me all the data for device X between Monday and Friday".
We haven't actually moved to this yet for various reasons but do still plan to.
http://aws.amazon.com/dynamodb/
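For illustration, this is roughly what such a hash-key-plus-range-key query looks like with boto3 against DynamoDB (the table and attribute names are assumptions):

    # Sketch: query all readings for one device in a time window, assuming a table
    # with device_id as the partition (hash) key and utc_timestamp as the sort (range) key.
    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource('dynamodb').Table('sensor_readings')

    response = table.query(
        KeyConditionExpression=(
            Key('device_id').eq('device042')
            & Key('utc_timestamp').between('2012-03-05T00:00:00Z', '2012-03-09T23:59:59Z')
        )
    )
    for item in response['Items']:
        print(item['utc_timestamp'], item['value1'], item['value2'])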
In my opinion, Amazon SimpleDB, as well as Microsoft Azure Tables, is a fine solution as long as your queries are quite simple. As soon as you try to do things that are an absolute non-issue on relational databases, like aggregates, you begin to run into trouble. So if you are going to do some heavy reporting, it might get messy.
It sounds like your problem may be best handled by a round-robin database (RRD). An RRD stores time-variable data in such a way that the file size never grows beyond its initial setting. It's extremely cool and very useful for generating graphs and time series information.
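As a rough sketch of that idea with the Python rrdtool bindings (the file name, data source names and one-year retention are assumptions; a 900-second step matches four readings per hour):

    # Sketch: one RRD per device, two GAUGE data sources, a fixed one-year window
    # of raw 15-minute samples; the file never grows beyond this.
    import rrdtool

    rrdtool.create(
        'device042.rrd',
        '--step', '900',                      # one sample every 15 minutes
        'DS:value1:GAUGE:1800:U:U',           # heartbeat 1800s, no min/max bounds
        'DS:value2:GAUGE:1800:U:U',
        'RRA:AVERAGE:0.5:1:35040',            # keep raw samples for ~1 year
        'RRA:AVERAGE:0.5:96:3650',            # plus daily averages for ~10 years
    )

    # Push a reading at the current time ("N"), then fetch averages back for graphing.
    rrdtool.update('device042.rrd', 'N:23.5:1.8')
    (start, end, step), names, rows = rrdtool.fetch('device042.rrd', 'AVERAGE')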
I agree with Oliver Weichhold that a cloud-based database solution will handle the use case you described. You can spread your data across multiple SimpleDB domains (like partitions) and store your data in a way that lets most of your queries execute within a single domain without having to traverse the entire database. Defining your partition strategy will be key to the success of moving towards a cloud-based DB. Data set partitioning is talked about here.