Swift & Firebase - Is Cloud Firestore scalable? - ios

I'm really new to Cloud Firestore, so structuring the database still feels a bit strange to me.
I would like to save my workouts. If I were on the Realtime Database, I would do something like this:
WorkoutResults
|
+--AutoID
| |
| +--date
| +--userID
| +--result
AND
UserWorkoutResult
|
+--UserID
| |
| +--WodResultGeneratedID
|
That way, I only have to fetch one node for a specific user.
But if I understand correctly, it's not possible to query on subcollections in Cloud Firestore.
So my question is: do you think this structure is good enough to scale?
WorkoutResults
|
+--AutoID
| |
| +--date
| +--userID
| +--result
I would query it with something like:
.whereField("userID", isEqualTo: "userIDString").whereField("date", isEqualTo: theDateIWant)

Your query looks fine to me. And as Firestore promises, query performance depends only on the number of matching WorkoutResults, not on the size of the collection.
But you could get the exact same result by querying collection("Users").document("userIDString").collection("WorkoutResults").whereField("date", isEqualTo: theDateIWant) in your first data structure. The only thing that isn't possible there is querying the WorkoutResults of multiple users at once, since querying across multiple collections isn't possible.
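To make the comparison concrete, here is a small in-memory sketch in Python. It is not Firestore client code: the sample documents and the helper names (query_flat, query_nested, query_flat_all_users) are made up for illustration, and plain list scans stand in for Firestore's indexed queries.

```python
from datetime import date

# Flat layout: one top-level WorkoutResults collection; every document carries userID.
# The sample documents are invented for this sketch.
flat_results = [
    {"userID": "alice", "date": date(2018, 1, 5), "result": "12:30"},
    {"userID": "alice", "date": date(2018, 1, 6), "result": "11:58"},
    {"userID": "bob", "date": date(2018, 1, 5), "result": "13:10"},
]

# Nested layout: Users/{userID}/WorkoutResults, modelled as a dict of lists.
nested_results = {}
for doc in flat_results:
    nested_results.setdefault(doc["userID"], []).append(
        {"date": doc["date"], "result": doc["result"]}
    )


def query_flat(user_id, day):
    """Stands in for the two chained whereField filters on the flat collection."""
    return [d["result"] for d in flat_results
            if d["userID"] == user_id and d["date"] == day]


def query_nested(user_id, day):
    """Stands in for a date filter on one user's WorkoutResults subcollection."""
    return [d["result"] for d in nested_results.get(user_id, [])
            if d["date"] == day]


def query_flat_all_users(day):
    """A cross-user question that only the flat layout answers in one query."""
    return sorted(d["result"] for d in flat_results if d["date"] == day)
```

Both per-user helpers return the same result for the same user and date; only the flat layout has a single place to ask about all users at once.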

Related

Odata breaking with unconventional association

I'm trying to make an unconventional join, like this:
builder.HasOne(x => x.MATERIAL_OBJ)
.WithMany()
.HasForeignKey(c => c.MATERIAL)
.HasPrincipalKey(p => p.MATERIAL_CODE);
because the data from one of my tables comes from an external source, and I need to make a join with another table by a non-PK (VARCHAR) field.
My tables are as follows:
Transits table
+----+----------+
| ID | MATERIAL |
+----+----------+
| 1  | ABC      |
| 2  | HIJ      |
+----+----------+
Material table:
+---------------+---------------+
| MATERIAL_CODE | SUPPLIER_NAME |
+---------------+---------------+
| ABC           | SUP 1         |
| DEF           | SUP 2         |
+---------------+---------------+
The transits table always comes filled, sometimes with materials we don't have available. If we have the material, the object comes filled correctly. The problem I'm facing is that whenever the material doesn't exist in the table, my OData response doesn't work properly: the returned object is cut off partway through (see the edit below).
Is there any way for OData to return null instead of breaking the response?
EDIT: below is the return value:
{"#odata.context":"http://MYAPI/odata/$metadata#TRANSIT(Id,MATERIAL,MATERIAL_OBJ,MATERIAL_OBJ()","value":[{"Id":12567,"MATERIAL":"REDACTED"
Also, this seems to be an OData issue, as the objects are filled correctly inside the API.
I figured out that it was a problem with EF Core, caused by the unconventional mapping I did. I decided to create a view instead and mapped that in EF.
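The view itself isn't shown above. As a rough sketch of the idea, assuming the view is a LEFT JOIN from transits to materials, this is how it could look using Python's sqlite3 with the sample rows from the question. The lowercase table names and the view name transit_with_material are made up here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transits (id INTEGER PRIMARY KEY, material TEXT);
    CREATE TABLE materials (material_code TEXT, supplier_name TEXT);
    INSERT INTO transits VALUES (1, 'ABC'), (2, 'HIJ');
    INSERT INTO materials VALUES ('ABC', 'SUP 1'), ('DEF', 'SUP 2');

    -- Hypothetical view: LEFT JOIN keeps transits that have no matching material.
    CREATE VIEW transit_with_material AS
    SELECT t.id, t.material, m.supplier_name
    FROM transits t
    LEFT JOIN materials m ON m.material_code = t.material;
""")

# Transit 2 ('HIJ') has no material row, so its supplier_name comes back as None.
rows = conn.execute(
    "SELECT id, material, supplier_name FROM transit_with_material ORDER BY id"
).fetchall()
```

With a view like this, a missing material shows up as a NULL column rather than as a broken navigation property.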

Rails using Views instead of Tables

I need to create a Rails app that will show and use data from our current CRM system. I could just take Rails and use the current DB as the backend, but the table and column names are the exact opposite of what Rails expects.
Table names:
+-------------+----------------+--------------+
| Resource    | Expected table | Actual table |
+-------------+----------------+--------------+
| Invoice     | invoices       | Invoice      |
| InvoiceItem | invoice_items  | InvItem      |
+-------------+----------------+--------------+
Column names:
+-------------+-----------------+---------------+
| Property    | Expected column | Actual column |
+-------------+-----------------+---------------+
| ID          | id              | IniId         |
| Invoice ID  | invoice_id      | IniInvId      |
+-------------+-----------------+---------------+
I figured I could use Views to:
Normalize all table names
Normalize all column names
Make it possible to not use column aliases
Make it possible to use scaffolding
But there's a big but:
If I do it at the database level, Rails will probably not be able to build SQL properly
The app will probably be read-only, unless I skip views, create a separate DB instead, and sync the two eventually
Those disadvantages are probably even worse when you compare them to just plain aliasing.
And so I ask: is Rails able to somehow transparently know that the column it calls id is in fact InvId in the database, and vice versa? I'm talking about complete abstraction. Simple aliases just don't cut it when using joins etc., as you still need to use the actual DB name.

Ruby on Rails: Join Tables Concept

So I have been out of the coding game for a while and recently decided to pick up Rails. I have a question about the concept of join tables in Rails. Specifically:
1) Why are these join tables needed in the database?
2) Why can't I just JOIN two tables on the fly like we do in SQL?
A join table allows a clean linking of association between two independent tables. Join tables reduce data duplication while making it easy to find relationships in your data later on.
For example, compare a table called users:
| id | name    |
----------------
| 1  | Sara    |
| 2  | John    |
| 3  | Anthony |
with a table called languages:
| id | title   |
----------------
| 1  | English |
| 2  | French  |
| 3  | German  |
| 4  | Spanish |
You can see that both truly exist as separate concepts from one another. Neither is subordinate to the other in the way that orders are subordinate to a user: a single user may have many orders, and each order row might store a foreign key (user_id) pointing at the user that made it.
When a language can have many users, and a user can have many languages, we need a way to join them.
We can do that by creating a join table, such as user_languages, to store every link between a user and the language(s) that they speak, with each row holding one user/language pairing:
| id | user_id | language_id |
------------------------------
| 1  | 1       | 1           |
| 2  | 1       | 2           |
| 3  | 1       | 4           |
| 4  | 2       | 1           |
| 5  | 3       | 1           |
With this data we can see that Sara (user_id: 1) is trilingual, while John (user_id: 2) and Anthony (user_id: 3) only speak English.
By creating a join table in-between both tables to store the linkage, we preserve our ability to make powerful queries in relation to data on other tables. For example, with a join table separating users and languages it would now be easy to find every User that speaks English or Spanish or both.
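As a rough illustration of that last query, here is a sketch using Python's sqlite3 with the three tables above. The helper name speakers_of is made up for this example; in Rails you would normally express the same thing through the model associations instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE languages (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE user_languages (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        language_id INTEGER REFERENCES languages(id)
    );
    INSERT INTO users VALUES (1, 'Sara'), (2, 'John'), (3, 'Anthony');
    INSERT INTO languages VALUES
        (1, 'English'), (2, 'French'), (3, 'German'), (4, 'Spanish');
    INSERT INTO user_languages VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 4), (4, 2, 1), (5, 3, 1);
""")


def speakers_of(*titles):
    """Return the names of users who speak at least one of the given languages."""
    placeholders = ", ".join("?" for _ in titles)
    sql = f"""
        SELECT u.name
        FROM users u
        JOIN user_languages ul ON ul.user_id = u.id
        JOIN languages l ON l.id = ul.language_id
        WHERE l.title IN ({placeholders})
        GROUP BY u.id, u.name  -- one row per user, even if several languages match
        ORDER BY u.id
    """
    return [name for (name,) in conn.execute(sql, titles)]
```

Calling speakers_of("English", "Spanish") walks users -> user_languages -> languages, which is exactly the path the join table opens up.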
But where join tables get even more powerful is when you add new tables. If in the future we wanted to link languages to a new table called schools, we could simply create a new join table called school_languages. Even better, we can add this join table without needing to make any changes to the languages SQL table itself.
As Rails models, the data relationship between these tables would look like this:
User --> user_languages <-- Language --> school_languages <-- School
By default, every school and user would be linked to Language using the same language_id(s).
This is powerful: because both join tables (user_languages and school_languages) reference the same unique language_id, it is easy to write queries about how the two sides relate. For example, we could find all schools that speak the language(s) of a user, or find all users who speak the language(s) of a school. As our tables expand, we can ride the joins to find relations about pretty much anything in our data.
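Here is a sketch of the "schools for a user" query, again with Python's sqlite3. The user_languages rows are the ones from the table earlier in this answer; the schools, their names, and the school_languages rows are invented purely for illustration, as is the helper name schools_for_user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_languages (user_id INTEGER, language_id INTEGER);
    CREATE TABLE schools (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE school_languages (school_id INTEGER, language_id INTEGER);

    -- Same user/language pairs as in the user_languages table shown earlier.
    INSERT INTO user_languages VALUES (1, 1), (1, 2), (1, 4), (2, 1), (3, 1);

    -- Hypothetical schools: one French (2), one Spanish (4), one German (3).
    INSERT INTO schools VALUES
        (1, 'French School'), (2, 'Spanish School'), (3, 'German School');
    INSERT INTO school_languages VALUES (1, 2), (2, 4), (3, 3);
""")


def schools_for_user(user_id):
    """Schools sharing at least one language with the user, via both join tables."""
    sql = """
        SELECT s.name
        FROM schools s
        JOIN school_languages sl ON sl.school_id = s.id
        JOIN user_languages ul ON ul.language_id = sl.language_id
        WHERE ul.user_id = ?
        GROUP BY s.id, s.name
        ORDER BY s.id
    """
    return [name for (name,) in conn.execute(sql, (user_id,))]
```

Note that the two join tables meet directly on language_id, so the languages table itself doesn't even need to be touched for this query.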
tl;dr: Join tables preserve relations between separate concepts, making it easy to make powerful relational queries as you add new tables.

Join between Streaming data vs Historical Data in spark

Let's say I have transaction data and visit data:
visit
| userId | Visit source | Timestamp |
| A | google ads | 1 |
| A | facebook ads | 2 |
transaction
| userId | total price | timestamp |
| A | 100 | 248384 |
| B | 200 | 43298739 |
I want to join transaction data and visit data to do sales attribution, and I want to do it in real time whenever a transaction occurs (streaming).
Is it scalable to join a single incoming record against very large historical data using the join function in Spark?
The historical data is the visit data, since a visit can happen at any time (e.g. one year before the transaction occurs).
I joined historical data with streaming data in my project. The catch is that you have to cache the historical data in an RDD, and then do the join whenever streaming data arrives. In practice this is a long process.
If you are updating the historical data, you have to keep two copies and use an accumulator to work with one copy at a time, so that updating one won't affect the other.
For example:
transactionRDD is the stream RDD, which you process at some interval.
visitRDD is the historical RDD, which you update once a day.
So you have to maintain two databases for visitRDD. While you are updating one database, transactionRDD can work with the cached copy of visitRDD, and once visitRDD is updated, you switch to that copy. This is very complicated.
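The two-copy switch can be sketched in plain Python. This is not Spark code and not what my project ran: the class name VisitSnapshot, its methods, and the record fields are all made up to show the pattern of building the new copy on the side and then swapping it in.

```python
import threading


class VisitSnapshot:
    """Holds the currently active copy of the historical visit data."""

    def __init__(self, visits):
        self._lock = threading.Lock()
        self._active = self._index(visits)

    @staticmethod
    def _index(visits):
        # Group visits by userId so the join is a dictionary lookup per transaction.
        by_user = {}
        for visit in visits:
            by_user.setdefault(visit["userId"], []).append(visit)
        return by_user

    def refresh(self, visits):
        # Build the second copy off to the side; readers keep using the old one.
        fresh = self._index(visits)
        with self._lock:
            self._active = fresh  # switch to the updated copy

    def join(self, transactions):
        # Grab one reference so the whole batch sees a single consistent copy.
        with self._lock:
            active = self._active
        return [
            {**tx, "visits": active.get(tx["userId"], [])}
            for tx in transactions
        ]
```

Each micro-batch of transactions would call join, while the daily update calls refresh. A batch that started before the switch keeps working against the old copy.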
I know this question is very old, but let me share my viewpoint. Today, this can easily be done in Apache Beam, and the job can run on the same Spark cluster.

Cassandra cql kind of multiget

I want to make a query for two column families at once... I'm using the cassandra-cql gem for Rails, and my column families are:
users
following
followers
user_count
message_count
messages
Now I want to get all messages from the people a user is following. Is there some kind of multiget in cassandra-cql, or is there another way to get this kind of data by changing the data model?
I would call your current data model a traditional entity/relational design. This would make sense to use with an SQL database. When you have a relational database you rely on joins to build your views that span multiple entities.
Cassandra does not have any ability to perform joins. So instead of modeling your data based on your entities and relations, you should model it based on how you intend to query it. For your example of 'all messages from the people a user is following' you might have a column family where the rowkey is the userid and the columns are all the messages from the people that user follows (where the column name is a timestamp+userid and the value is the message):
  RowKey   Columns
-------------------------------------------------------------------
|        | TimeStamp0:UserA | TimeStamp1:UserB | TimeStamp2:UserA |
| UserID |------------------|------------------|------------------|
|        | Message          | Message          | Message          |
-------------------------------------------------------------------
You would probably also want a column family with all the messages a specific user has written (I'm assuming that the message is broadcast to all users instead of being addressed to one particular user):
  RowKey   Columns
-------------------------------------------------
|        | TimeStamp0 | TimeStamp1 | TimeStamp2 |
| UserID |------------|------------|------------|
|        | Message    | Message    | Message    |
-------------------------------------------------
Now when you create a new message you will need to insert it in multiple places. But when you need to list all messages from people a user is following, you only need to fetch from one row (which is fast).
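To show the write and read paths side by side, here is an in-memory sketch in Python. Plain dictionaries stand in for the two column families, and every name in it (followers, timelines, user_messages, follow, post_message, read_timeline) is made up for illustration; it does not use cassandra-cql or any Cassandra API.

```python
from collections import defaultdict

followers = defaultdict(set)        # author -> users who follow that author
timelines = defaultdict(dict)       # user -> {(timestamp, author): message}
user_messages = defaultdict(dict)   # author -> {timestamp: message}


def follow(user, author):
    followers[author].add(user)


def post_message(author, timestamp, message):
    # Write once to the author's own row...
    user_messages[author][timestamp] = message
    # ...and once more to the timeline row of every follower.
    for user in followers[author]:
        timelines[user][(timestamp, author)] = message


def read_timeline(user):
    # One row fetch; sorting by column name mimics Cassandra's ordered columns.
    row = timelines[user]
    return [row[key] for key in sorted(row)]
```

The cost moves to post_message, which does one write per follower, while read_timeline never has to look at more than one row.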
Obviously if you support updating or deleting messages you will need to do that everywhere that there is a copy of the message. You will also need to consider what should happen when a user follows or unfollows someone. There are multiple solutions to this problem and your solution will depend on how you want your application to behave.
