Is there a way in SumoLogic to store some data and use it in queries?

I have a list of IPs that I want to filter out of many queries that I have in Sumo Logic. Is there a way to store that list of IPs somewhere so it can be referenced, instead of copy-pasting it into every query?
For example, in a perfect world it would be nice to define a list of things like:
things=foo,bar,baz
And then in another query reference it:
where mything IN things
Right now I'm just copying/pasting. I think there may be a way to do this by setting up a custom data source and putting the IPs in there, but that seems like a very roundabout way of doing it, and it wouldn't help with re-using parts of a query that aren't data (e.g. re-using statements). Also, their template feature is about parameterizing a single query, not re-use across many queries.

Yes. There's a notion of Lookup Tables in Sumo Logic. Consult:
https://help.sumologic.com/docs/search/lookup-tables/create-lookup-table/
for details.
It lets you store values (either manually, once, or on a schedule as the result of a query) with the | save operator.
You can then refer to those values using | lookup, which is conceptually similar to SQL's JOIN.
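For example, a rough sketch (the source categories, field names, and lookup path are placeholders; check the docs above for the exact syntax on your version):

Build the list once, or on a schedule:

_sourceCategory=firewall
| count by ip
| 1 as blocked
| save myFolder/blocked_ips

Then reuse it from any other query; conceptually a LEFT JOIN followed by a filter:

_sourceCategory=web
| lookup blocked from myFolder/blocked_ips on ip=ip
| where isNull(blocked)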
Disclaimer: I am currently employed by Sumo Logic.

Related

Neo4j Cypher: single-query performance vs. splitting the same query into two separate queries

I think I'm doing something wrong, but I noticed that sometimes it is better to split a single query into two separate queries and pass the result of the first one as a parameter to the second query. Does this make sense? Is there any magic under the hood for external parameters vs. a query variable passed from one part of the same query to another?
Yes, it makes sense, although you can chain parts of your queries together using WITH. When you do that, make sure that you watch the cardinality, especially when you have UNWIND clauses.
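For instance, a small sketch of chaining with WITH while keeping cardinality in check (the labels and properties here are made up):

MATCH (u:User {name: $name})
WITH u
MATCH (u)-[:FRIEND]->(f)
WITH u, collect(f) AS friends  // collapse back to one row before the next stage
UNWIND friends AS friend
RETURN friend.name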
When you decide to use multiple queries, and depending on their type (WRITE vs. READ), it may be smart to make sure that, in the case of WRITE queries, they are all part of the same transaction. See https://neo4j.com/docs/cypher-manual/4.0/introduction/transactions/
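If you do split the work across two queries from a client, a minimal sketch of keeping both in one transaction via the embedded Java API (Neo4j 3.x-era names; the label, property, and queries are made up):

import java.util.Collections;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Result;
import org.neo4j.graphdb.Transaction;

try (Transaction tx = graphDb.beginTx()) {
    Result first = graphDb.execute("MATCH (u:User) RETURN u.id AS id LIMIT 1");
    Object id = first.next().get("id");
    // Pass the first result in as a parameter to the second query.
    graphDb.execute("MATCH (u:User {id: $id}) SET u.seen = true",
            Collections.singletonMap("id", id));
    tx.success(); // both statements commit together, or neither does
}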

How to run complex queries in Tarantool

I've always worked with relational DBs and recently decided to migrate a performance-critical service from SQL Server to Tarantool, hoping to take advantage of fast in-memory search and processing. I've got a couple of questions while planning the migration.
I've got a table with about one million records containing pricing information which means I'm dealing mostly with numbers and uuids. First, I need to run a select containing multiple conditions to get a subset of the data, like
SELECT * FROM rates WHERE SupplierId = #SupplierId AND ProductId = #ProductId AND (LocalDistributionZoneId = #LocalDistributionZoneId OR LocalDistributionZoneId IS NULL)
Q1: What is the strategy for running such a query in Lua? Do I create an index for each field in the predicate, or can I get by with one secondary composite index?
Q2: Will it be more convenient to run such a query in SQL (box.sql.execute) rather than in pure Lua? Will it be considerably slower than running the same query in pure Lua?
Q3: If I use SQL, is it possible to review the execution plan to make sure that the query I run really uses the indexes I've defined on the space?
OK, after I've got the results from the first query, I need to analyse the data and then, based on the results of the analysis, run one more query on the dataset returned by the first query.
Q4: Can Tarantool help me in dealing with the intermediate dataset? More specifically, may I somehow run more queries against the intermediate subset of tuples, leveraging the indexes created on the space? Or would I need to implement alternative strategies, like re-adding the interim results to a temporary space with pre-defined indexes and then doing another select, or implementing further search myself?
Thank you!
Q1: Don't. Use SQL, it's faster: it doesn't create garbage-collected objects for intermediate execution results.
Q2: Yes, please use our SQL features for that.
Q3: Use the EXPLAIN statement (see the sketch below).
Q4: I don't know what exactly you mean by "help". You could try whatever strategy works best: create a more complex query, save the original query in a view to use in the resulting query, or create a temporary table and work with it. To say more, we'd have to look at whether the execution plan Tarantool chooses is good enough, or whether you have to optimize it manually.
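For Q3, a minimal sketch from Lua (assuming a recent Tarantool where box.execute is the successor of box.sql.execute; the bound values are placeholder variables):

box.execute([[
    EXPLAIN QUERY PLAN
    SELECT * FROM rates
    WHERE SupplierId = ?
      AND ProductId = ?
      AND (LocalDistributionZoneId = ? OR LocalDistributionZoneId IS NULL);
]], {supplier_id, product_id, zone_id})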

Is it possible to execute read-only Cypher queries from Java?

I'd like to know just what the title says.
The reason I'd want this is to permit constrained read-only cypher queries to be executed; the data results would later be interpreted and serialized by a separate API layer.
I've seen code that makes basic assumptions in an attempt to mimic this behavior, e.g. the code might filter out any Cypher query that contains certain special words associated with write query structures (merge, create, delete, set, and so on).
This approach tends to be limited and naive though; if it very simply looks for those tokens, it would prevent a query like MATCH n WHERE n.label =~ '.*create.*' RETURN n even though it's a read-only query.
I'd really prefer not to do a full parse on a candidate query and then descend through the AST trying to figure out whether something is read-only or not (although I would gladly accept an answer that shows how to do this easily in Java).
EDIT - I'm aware it's possible to start the entire database in read-only mode via the configuration property read_only=true, but this would be undesirable; no other aspect of the java API would be able to change the database.
EDIT 2 - I found another possible strategy, but I'm not sure of its advisability. Comments welcome on this, and potential downsides:
try (Transaction ignore = graphDb.beginTx()) {
    ExecutionResult result = executionEngine.execute(query);
    // Do nifty stuff with result, then...
    // Force transaction to fail.
    ignore.failure();
}
The idea here is that if queries happen within transactions and the transaction is always force-failed, then nothing can ever be written to the DB no matter what the result.
Read-only Cypher is not (yet) directly supported. However, I can think of two workarounds:
1) Assuming you're running a Neo4j Enterprise cluster: you can set read_only=true on one instance. That instance is then used for the read-only queries, while the other cluster instances are used for read/write. A load balancer in front of the cluster can be set up to send each request to the right instance.
2) Use a TransactionEventHandler that vetoes a transaction if its TransactionData contains write operations. Just for fun I've spent a few minutes implementing that, see https://github.com/sarmbruster/read-only-cypher - feedback is appreciated.
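A minimal sketch of option 2 against the embedded API (Neo4j 2.x-era names; the linked repository is the worked-out version):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.event.TransactionData;
import org.neo4j.graphdb.event.TransactionEventHandler;

// Reject any transaction whose TransactionData contains write operations.
graphDb.registerTransactionEventHandler(new TransactionEventHandler.Adapter<Object>() {
    @Override
    public Object beforeCommit(TransactionData data) throws Exception {
        boolean hasWrites = data.createdNodes().iterator().hasNext()
                || data.deletedNodes().iterator().hasNext()
                || data.createdRelationships().iterator().hasNext()
                || data.deletedRelationships().iterator().hasNext()
                || data.assignedNodeProperties().iterator().hasNext()
                || data.removedNodeProperties().iterator().hasNext();
        if (hasWrites) {
            throw new RuntimeException("Rejected: only read-only queries are allowed");
        }
        return null; // no state needed in afterCommit
    }
});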

Query templating in Cypher? How to avoid repeating myself

My group has many queries that tend to refer to a class of relationship types. So we tend to write a lot of repetitive queries that look like this:
match (n:Provenance)-[r:`input to`|triggered|contributed|generated]->(m:Provenance)
where (...etc...)
return n, r, m
The question has to do with the repetition of the set of different relationship types. Really, we're looking for any relationship in a set of relationship types. Is there a way to enumerate a bunch of relationship types into a set ("foo relationships") and then use that as a variable, to avoid repeating myself over and over in many queries? This repetitive querying of relationship types tends to create problems when we add a new relationship type: many queries distributed through the code base then all need to be updated.
Enumerating all possible relationships isn't such a big deal in an individual query, but it starts to get difficult to manage and update when distributed across dozens (or hundreds) of queries. What's the recommended solution pattern here? Query templating?
This is not currently possible as a built-in feature, but it seems like an interesting idea. I would encourage you to post it to the ideas Trello board here:
https://trello.com/b/2zFtvDnV/public-idea-board
Perhaps suggesting something like allowing parameters for relationship types:
MATCH (n)-[r:{types}]->(p)
Of course, that makes it much harder for the query engine to optimize queries ahead of time. A relationship-type hierarchy could work, but we are incredibly hesitant to introduce new abstractions to the model unless absolutely necessary. Still, suggestions for improvements are very welcome!
For now, yes, something like you suggest with templates would solve it. Ideally, you'd send the query to Neo4j containing all the relationship types you are interested in, with the other items parameterized, to allow optimal planning. To do that, you'd do some string replacement on your side to inject the list of reltypes into the query before sending it off.
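A minimal sketch of that string replacement in Java (the constant name and query text are illustrative, not a fixed API):

// One shared definition of the relationship-type set, spliced into every query.
static final String PROV_REL_TYPES = "`input to`|triggered|contributed|generated";

String cypher = "MATCH (n:Provenance)-[r:" + PROV_REL_TYPES + "]->(m:Provenance) "
        + "WHERE n.name = {name} "  // genuine data values stay parameterized
        + "RETURN n, r, m";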

How to change a model's lazy loading at runtime in Symfony?

I use sfPropelORMPlugin.
Lazy loading is fine if I operate on one object per web page, but if there are hundreds I get hundreds of separate DB queries. I'd like to completely disable lazy loading, or disable it for the needed columns, on those particularly heavy pages, but I couldn't find a way so far.
You should join all your relations when you build your query; that way you'll get all the data in a single query. Note that you have to use joinWithRelation(), where Relation is the related table name.
Elaborating on William Durand's answer: perhaps you should also look at the Propel function doSelectJoinAll(), which should pre-load all of the objects related to your relations. Just keep in mind this can be expensive in terms of memory.
Another technique is to create custom criteria with your needed joins, then use a manual hydrate technique to add on to your base object. I do this often when the data I need involves aggregates or other columns that are not exactly mapped to objects. There are plenty of hydrate() examples around.
I added a utility method to the peer to be able to set which columns I want to load, using "pseudo columns" for this type of DB query. I also overrode hydrate() to understand this "markup". All was good until I found out that even though the data is hydrated, symfony won't understand it and won't let you use it as intended.
PS: a join was never considered an option because the site is under fairly high load.
