How to search multiple records with high performance in MySQL Query Browser? - mysql-5.5

I'm using MySQL Query Browser and have two tables. One is tblbusiness, with 1.4 million records and, among others, a zipcode column; the other is tblbusinessnew, with 300,000 records and the columns pincode, longitude and latitude.
SELECT tblbusiness.*,
tblbusinessnew.pincode, tblbusinessnew.longitude, tblbusinessnew.latitude
FROM tblbusiness, tblbusinessnew
WHERE tblbusiness.zipcode=tblbusinessnew.pincode;
If I search for a few records it is fine, but MySQL Query Browser suddenly disappears (crashes) when I run this query, because one zipcode in tblbusiness matches multiple pincode rows in tblbusinessnew. How can I fix this?
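A minimal sketch of what usually helps here, assuming the zipcode and pincode columns have compatible types (the index names below are illustrative): index both join columns, write the join explicitly, and add a LIMIT while testing so Query Browser does not try to render millions of rows at once.
-- index the join columns first (index names are illustrative)
ALTER TABLE tblbusinessnew ADD INDEX idx_pincode (pincode);
ALTER TABLE tblbusiness ADD INDEX idx_zipcode (zipcode);
-- explicit join, limited while testing
SELECT b.*, n.pincode, n.longitude, n.latitude
FROM tblbusiness b
JOIN tblbusinessnew n ON b.zipcode = n.pincode
LIMIT 1000;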

Related

Why do rows get "duplicated" in Tableau when the tables are integrated?

I cannot find any helpful information about this for Tableau. For context, I have to work with generic tables.
I use a particular data source, Snowflake, as the database behind Tableau. Through Tableau's data source I can edit the connection for each endpoint's environment, such as sandbox, production, etc.
To check that the data is accurate, I ran some queries directly in Snowflake. The simplified tables and queries below illustrate the problem. The USA table and the England table each contain 300 rows (it's just an example).
SELECT name, gender, age
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female';
The query above technically produces duplicated rows: it returns 200 of them. I need to avoid the duplicates, so I have to take care of the join and add a GROUP BY:
SELECT name, gender, age, COUNT(*)
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female'
GROUP BY 1,2,3;
As a result, the data is accurate: the query returns 100 rows in Snowflake.
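For what it's worth, a sketch of an equivalent way to deduplicate in Snowflake without the COUNT(*), using DISTINCT on the same example tables:
SELECT DISTINCT name, gender, age
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female';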
I created the dashboard in Tableau by integrating the two connections from different environments, such as Production and Sandbox. This screenshot may help; it shows the different colors in the connections section (its source is here). Since making logical and physical tables in the data source, I have identified the inaccurate data: 200 rows. I also set up a relationship and an index for the USA and England tables.
My goal in Tableau is 100 rows for the dashboard. How can I fix this and avoid the duplicated rows in Tableau's data source?
Information:
Tableau version: 2021.4

How can I quickly query SQLite given a list of rowids?

How can I quickly query for records that match a list of rowids? I have a query in my iOS app that looks like this:
SELECT rowid, category_id
FROM items
WHERE rowid in (2, 4, 89, 4243, 44, 555, ...)
The list of rowids can be somewhat long - a few hundred items would be a typical example.
The odd thing is that this query takes several seconds to run - as much as 12 seconds in some cases. It's slow whether I run it in the SQLite shell or whether I run it in my app.
However, if I replace this query with just:
SELECT rowid, category_id
FROM items
so I retrieve EVERY item in the table (in my test case, around 1000 rows) and just have my app ignore the rowids it doesn't need, the query executes in just a few hundred milliseconds. It also responds quickly at the SQLite shell.
What's happening here? rowid is a primary key, so this should be a fast, indexed lookup. Is there a faster way to run queries like this? I would have thought parsing the query string was what makes the difference, but profiling shows almost all the time is spent in sqlite3_step.
A SQL query without a WHERE clause is generally faster than a query with a WHERE clause. You mention rowid is the primary key (so a clustered index is created for it by default) in your table.
Try replacing WHERE ... IN with WHERE ... BETWEEN, or with WHERE rowid > X AND rowid < Y, if your ids happen to fall within a range.
Another solution is to move the list of rowids into a temp table / view and perform an INNER JOIN to get the results faster, as sketched below.
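A minimal sketch of the temp-table approach in SQLite (items and category_id are taken from the question; wanted_ids is an illustrative name):
-- collect the ids you need into a temp table
CREATE TEMP TABLE wanted_ids (id INTEGER PRIMARY KEY);
INSERT INTO wanted_ids (id) VALUES (2), (4), (89), (4243), (44), (555);
-- join against it instead of using a long IN (...) list
SELECT i.rowid, i.category_id
FROM items i
INNER JOIN wanted_ids w ON i.rowid = w.id;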

Is it necessary to add an index to latitude and longitude fields?

I am using Rails and the geocoder gem with a Postgres database, so I have to add latitude and longitude fields to the database. Generally speaking, isn't it better to add indexes to those fields for faster querying?
If you end up querying the database for records using latitude and longitude, you'll definitely benefit from adding an index. Indexes will be used not only for exact-match queries, but also for comparison queries, such as select * from table_name where latitude between 30 and 40 and longitude > 50.
Depending on the queries and the number of records, the Postgres query planner will choose the most efficient way to find matching records (either a sequential scan or an index scan).
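As a sketch, assuming a table named table_name (the index name is illustrative), a composite index covering both columns could look like this:
-- composite B-tree index; also helps queries that filter on latitude alone
CREATE INDEX index_table_name_on_lat_lng ON table_name (latitude, longitude);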

InfluxDB goes down for huge data

I am building a dashboard using InfluxDB. I have a source which generates approx. 2000 points per minute. Each point has 5 tags and 6 fields. There is only one measurement.
Everything works fine for about 24 hours, but as the data size grows I am no longer able to run any queries against Influx. For example, right now I have approx. 48 hours of data, and even a basic select brings InfluxDB down:
select count(field1) from measurementname
It times out with the error:
ERR: Get http://localhost:8086/query?db=dbname&q=select+count%28field1%29+from+measuementname: EOF
Configuration:
InfluxDB version: 0.10.1 (default configuration)
OS version: Ubuntu 14.04.2 LTS
Hardware: 30 GB RAM, 4 vCPUs, 150 GB HDD
Some Background:
I have a dashboard and a web app querying the InfluxDB instance. The web app lets a user query the DB based on tag1 or tag2.
Tags:
tag1 - unique for each record. Used in a where clause in the web app to get the record based on this field.
tag2 - unique for each record. Used in a where clause in the web app to get the record based on this field.
tag3 - used in group by. Think of it as departmentid tying a bunch of employees.
tag4 - used in group by. Think of it as departmentid tying a bunch of employees.
tag5 - used in group by. Values 0 or 1 or 2.
Pasting an answer from the influxdb@googlegroups.com mailing list: https://groups.google.com/d/msgid/influxdb/b4fb503e-18a5-4bd5-84b1-632dc4950747%40googlegroups.com?utm_medium=email&utm_source=footer
tag1 - unique for each record.
tag2 - unique for each record.
This is a poor schema. You are creating a new series for every record, which puts a punishing load on the database. Each series must be indexed, and the entire index currently must reside in RAM. I suspect you are running out of memory after 48 hours because of series cardinality, and the query is just the last straw, not the actual cause of the low RAM situation.
It is very bad practice to use a unique value in tags. You can still use fields in the WHERE clause; they just aren't as performant, but the damage to your system is much less than having a unique series for every point.
https://docs.influxdata.com/influxdb/v0.10/concepts/schema_and_data_layout/
https://docs.influxdata.com/influxdb/v0.10/guides/hardware_sizing/#when-do-i-need-more-ram
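A hedged sketch of the suggested schema change, with record_id as a hypothetical field name. On write (line protocol), keep the low-cardinality values as tags and move the unique identifier into a field:
measurementname,tag3=dept1,tag4=dept2,tag5=0 record_id="abc-123",field1=42
The web app can still filter on it in the WHERE clause (string fields are compared with single quotes):
SELECT field1 FROM measurementname WHERE record_id = 'abc-123'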

Same Cypher Query has different performance on different DBs

I have a full DB (a graph clustered by country) that contains ALL countries, and I have various single-country test DBs that have exactly the same schema but contain only one given country.
My query's "start" node is identified via a match on a given value for a property, e.g.
match (country:Country{name:"UK"})
and then proceeds to the main query from the variable country. So I am expecting the query times to be similar, given that we start from the same known node and traverse the same number of nodes related to it in both DBs.
But I am getting very different performance when I run the query against the full DB versus a single-country DB.
I immediately thought that I must have some kind of "cartesian relationship" issue going on, so I profiled the query in the full DB and in a single-country DB, but the profile is exactly the same for each step in the plan. I was assuming the profile would reveal a marked increase in db hits at some point in the plan, but the values are the same. Am I mistaken about what PROFILE displays?
Some sizing:
The full DB has about 70k nodes and the test DB 672 nodes; the query takes 218,764 ms to complete in the full DB versus circa 3,407 ms in the test DB.
While writing this I realised that certain nodes will have more outgoing relationships in the full DB (suppliers can supply different countries), which I think is probably the cause, but the question remains why I am not seeing any indication of this in the profiling.
Any thoughts welcome.
What version are you using?
Both query times are way too long for your dataset size, so you might check your configuration / disk.
Did you create an index/constraint for :Country(name) and is that index online?
And please share your query and your query plans.
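For reference, a minimal sketch of creating that index, using the Cypher syntax from the Neo4j 2.x/3.x era this question appears to date from (newer versions use CREATE INDEX ... FOR (c:Country) ON (c.name)):
CREATE INDEX ON :Country(name);
You can then run :schema in the Neo4j browser to check that the index shows as ONLINE.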
