I'm implementing a search feature in my app. I would like the user to look up a word simultaneously in multiple attributes of a given Entity.
Here is an example for an Entity with 3 String attributes: Person
(firstName, lastName, notes)
Let's use a mock dataset with 3 people:
"Emily", "Bridges", "She will be in town real soon."
"Johnny", "Williams", "This dude is really cool."
"Will", "Smith", "He does not remember anything for some reason."
Now, let's assume the user is looking up the occurrence "will" and that we run a case-insensitive search. All three people described above will match the word "will" thanks to the use of an orPredicateWithSubpredicates.
Ideally I would like the results to be displayed in this order for relevancy purposes:
"Will", "Smith", "He does not remember anything for some reason."
"Johnny", "Williams", "This dude is really cool."
"Emily", "Bridges", "She will be in town real soon."
For this search feature "firstName" is more relevant than "lastName" which are both more relevant than the "notes" attribute.
Since I'm using a UISearchDisplayController, I also use an NSFetchedResultsController, which requires an NSSortDescriptor. The problem now is: which attribute/key should I use to initialize the NSSortDescriptor?
I've been through many posts already and thought a transient property could help me with this issue, but I can't figure out how/when to set up this transient property which could be named something like "sortKey" and be set to these values:
1: For a match on "firstName"
2: For a match on "lastName"
3: For a match on "notes"
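The ranking described above can be sketched independently of Core Data. This is a plain Python illustration, not NSFetchedResultsController code; the names Person, sort_key and search are hypothetical, and it assumes the rank is simply the position of the first attribute that contains the search term.

```python
from dataclasses import dataclass


@dataclass
class Person:
    first_name: str
    last_name: str
    notes: str


def sort_key(person, term):
    """Return 1, 2 or 3 for the most relevant matching attribute, or None if nothing matches."""
    needle = term.lower()
    fields = (person.first_name, person.last_name, person.notes)
    for rank, value in enumerate(fields, start=1):
        if needle in value.lower():  # case-insensitive "contains" match
            return rank
    return None


def search(people, term):
    """Keep only matching people, ordered by the rank of their best match."""
    ranked = [(sort_key(p, term), p) for p in people]
    matches = [(rank, p) for rank, p in ranked if rank is not None]
    matches.sort(key=lambda pair: pair[0])  # stable sort keeps original order within a rank
    return [p for _, p in matches]


people = [
    Person("Emily", "Bridges", "She will be in town real soon."),
    Person("Johnny", "Williams", "This dude is really cool."),
    Person("Will", "Smith", "He does not remember anything for some reason."),
]
results = search(people, "will")
```

With the mock dataset, results lists Will, then Johnny, then Emily. Note that the key depends on the search term, so it has to be recomputed for every query rather than stored once.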
Eventually I guess I could try to run three different requests but then I'd have to give up using NSFetchedResultsController and all its magic...
I don't know whether I'm hitting the limits of NSFetchedResultsController or something but any pointer would be great, thanks!
Joss.
Say you have a Core Data model with Entity Subjects and Attributes Algebra, Biology, Calculus, Chemistry, Physics, like this:
Subjects
======
Algebra (Boolean)
Biology (Boolean)
Calculus (Boolean)
Chemistry (Boolean)
Physics (Boolean)
Let's say that Algebra, Calculus, and Physics are true while Biology and Chemistry haven't yet been assigned values. How can I get the array: ["Algebra", "Calculus", "Physics"]? I would think it involves NSPredicate but I'm really not sure how to do it.
And as a side question, would default values matter here? i.e., should I make everything start off as false?
Thank you!
You propose an entity named Subjects, with attributes named Algebra, Biology, etc, each of which is a boolean that represents whether the user has "learned" that subject. From that, you wish to compose an array comprising all the names of the subjects that the user has learnt.
Although that is possible, it's quite difficult. I recommend instead defining your Subject entity as:
Subject
=======
name (String)
learnt (Boolean)
(Note the convention that entity names are usually singular and start with uppercase, whereas attribute names begin with lowercase.) In the first run of your app you might then create a number of instances of the Subject entity, with the required default values:
name learnt
==== ======
"Algebra" false
"Biology" false
"Calculus" false
... ...
This has the advantage that you can add further subjects at a later date, without needing to modify the structure of your data: it is a simple matter of creating an additional instance with the appropriate name.
As your user progresses through their learning in the app, you can set the value for the learnt attribute to true for the relevant Subject instance. To obtain an array containing only those Subject instances which the user has successfully learnt, you can then use a fetch request with a predicate:
fetchRequest.predicate = NSPredicate(format:"learnt == true")
The array that is returned contains the relevant instances of the Subject entity. It is then simple to obtain the names for display purposes.
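As a rough illustration of that model, here is a plain Python sketch with no Core Data involved. Subject mirrors the entity above, while mark_learnt and learnt_names are hypothetical helper names; the list comprehension plays the role of the "learnt == true" predicate.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    name: str
    learnt: bool = False  # default value: nothing has been learnt yet


# First run of the app: create one instance per subject.
subjects = [Subject(n) for n in ("Algebra", "Biology", "Calculus", "Chemistry", "Physics")]


def mark_learnt(subjects, name):
    """Set learnt to True on the subject with the given name."""
    for subject in subjects:
        if subject.name == name:
            subject.learnt = True


def learnt_names(subjects):
    """Equivalent of fetching with the predicate and then reading the name attribute."""
    return [s.name for s in subjects if s.learnt]


for name in ("Algebra", "Calculus", "Physics"):
    mark_learnt(subjects, name)

names = learnt_names(subjects)
```

Here names ends up as the list of learnt subject names, in the order the subjects were created.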
You mention in comments having further information about all the subjects. If you model all that in Core Data (e.g. having Topic, Lesson, Test entities, etc.; I speculate, that's for you to design), you can create relationships to that data from your Subject entity. For example, a Subject might have a relationship to many Topics, each of which has many Lessons and many Tests, etc. That way, when your app starts, you can fetch all the Subject instances and display them in a tableView (showing the name attribute). When the user taps a Subject to begin or continue their learning, you have the relevant Subject instance, which provides the link (via the relevant relationship) for you to display the material for the chosen subject.
For the simplicity of my question, this is my Core Data model (doesn't make perfect sense, only for the example):
Book
-------
- title
- readers (to-many relationship to Reader)
Reader
------
- name
- book (to-one relationship to Book)
Currently, a book with the same title can have multiple instances in the db,
but I want to change that: I want to merge all the books with the same title into one instance (deleting all the rest) and merge their readers.
For example, if my db looks like this:
1. Book title "A" readers: "1", "2", "3"
2. Book title "B" readers: "4", "5", "3", "7"
3. Book title "A" readers: "4", "1"
the new db will be:
1. Book title "A" readers: "1", "2", "3", "4"
2. Book title "B" readers: "4", "5", "3", "7"
As you can see, both books with title "A" were merged into one record, including the readers.
So my question is how to do this effectively.
I'm thinking of some kind of query that will bring me all the books that have more than one instance with the same title, and then maybe group them by title.
Not sure if this is the right solution here,
Any help will be appreciated
I do not have enough time for a complete answer, but here is the pointer:
You can fetch duplicates with a fetch request of dictionary result type. You have to group it by title (property description) and add a count column (expression description). You can filter the result with a having argument, so you get the titles with count>1 only.
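The group-by, count and "having count > 1" idea can be sketched outside Core Data as follows. This is plain Python over a list of titles, not a fetch request; duplicate_titles is a hypothetical name.

```python
from collections import Counter


def duplicate_titles(titles):
    """Group titles, count each group, and keep only groups with count > 1."""
    counts = Counter(titles)
    return {title: n for title, n in counts.items() if n > 1}


# Titles of the three Book records from the question.
duplicates = duplicate_titles(["A", "B", "A"])
```

For the example data this reports that only title "A" occurs more than once.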
Hope that helps.
Why are you keeping multiple instances of a book with the same title?
Even if two objects have the same contents, they are still different objects.
Why not add readers to the already created book object instead?
If you still want to merge them, you could do the following:
First, fetch all books with the same title (e.g. 'A') using a predicate.
Then, treating the first object as the one to merge everything into, iterate over the other objects and add their readers to it, but only those whose names are not already present in the first one.
Delete all the book objects with title 'A' except the first one.
Now you have only one object per unique book title, holding all the readers. The next time you have to add a reader:
Fetch the already created book instance from the store.
Add the reader object to it.
UPDATE
Get only the first Book object with a specific title.
Then query for all the readers in the store whose book title is the same, and set their book object to this one.
Delete all other Book objects with that title except the first one you used.
Remove multiple occurrences of readers in the book object.
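The merge steps above can be sketched in plain Python. This is only an illustration of the logic and not Core Data code: books are modelled as (title, readers) pairs, merge_books is a hypothetical name, and readers are compared by name.

```python
def merge_books(books):
    """Merge (title, readers) pairs into one reader list per title.

    The first book seen for a title is the one that is kept; readers of later
    duplicates are appended only if their name is not already present.
    """
    merged = {}
    for title, readers in books:
        kept = merged.setdefault(title, [])
        for reader in readers:
            if reader not in kept:
                kept.append(reader)
    return merged


# The example database from the question.
books = [
    ("A", ["1", "2", "3"]),
    ("B", ["4", "5", "3", "7"]),
    ("A", ["4", "1"]),
]
merged = merge_books(books)
```

For the example data, title "A" ends up with readers "1", "2", "3", "4" and title "B" is unchanged, which matches the expected result in the question.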
Fetch all the books. Then find the unique titles:
NSArray *uniqueBooks = [allBooks valueForKeyPath:@"@distinctUnionOfObjects.title"];
Then loop over the uniqueBooks and for each book fetch all the matching entities. Convert the entities into an array and do another unique filtering.
NSArray *uniqueReader = [allReaders valueForKeyPath:@"@distinctUnionOfObjects.name"];
Now that you have a book and all the readers for that book, insert them into a new store. When you have finished building the new store, you can use it and delete the old one. You could also insert them into the same store, but then it gets complicated to keep the new objects while deleting the old ones, and when deleting you would have to be careful with the delete rules on the relationships.
I am relatively new to MongoDB and am still getting used to schema design.
In a project that I am currently working on, users can tag files that they upload. There are three types of tags: descriptive, brand, and store_department. They are presented as three fields to the user but in reality they are merged together and saved as tags, i.e.:
"tags" : [
{
"type" : "descriptive",
"tag" : "this is my tag"
},
{
"type" : "brand",
"tag" : "this is another tag"
}
]
This is to make searching very easy. By using a type, I can present the user three distinct fields to encourage them to provide the information as well as then allow for more advanced queries such as search by brand or store department. A default search will just search for matching tags.
The issue is that I provide autocomplete functionality in all of the fields. As the user types in the "brand" field, all created tags of type "brand" that match their input are displayed. This is easily accomplished by having a standalone tag collection. New tag documents are created and updated when the file document is saved. The autocomplete queries against the standalone tag collection instead of the embedded tags for performance.
Something feels wrong with this design. It is a duplication of efforts in some regard but seems to work great as far as the user experience is concerned. I use Mongoid and, to accommodate this design, have had to create two models for my tag collections: one that defines the two attributes and a second that inherits from the first but adds the embedded_in macro.
I could see this pattern being useful in other instances as well: products and shopping carts, products and purchase orders, etc. Is there a better way?
Something feels wrong with this design. It is a duplication of efforts in some regard but seems to work great as far as the user experience is concerned.
In a NoSQL database, you have to denormalize sometimes. This will lead to some amount of data duplication. But as it can greatly improve performance (and works great for user experience), this should be worth it.
So a collection with distinct tag names to be used for autocompletion might make sense. It will be much smaller than the non-distinct tags in the embedded documents. Nothing wrong with that approach.
This "master" collection would be a good place to add additional metadata for the tag, for example the description and wiki that Stack Overflow has for tags here.
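A minimal sketch of that idea, using plain Python dictionaries in place of MongoDB collections. The function names build_tag_index and autocomplete and the sample tags are made up for illustration; only the embedded tag shape (type and tag fields) comes from the question.

```python
def build_tag_index(files):
    """Collect the distinct tags per type from the tags embedded in each file document."""
    index = {}
    for doc in files:
        for entry in doc.get("tags", []):
            index.setdefault(entry["type"], set()).add(entry["tag"])
    return index


def autocomplete(index, tag_type, prefix):
    """Return the distinct tags of one type that start with the typed prefix."""
    prefix = prefix.lower()
    return sorted(t for t in index.get(tag_type, ()) if t.lower().startswith(prefix))


# Hypothetical file documents; "Acme" is stored twice but indexed once.
files = [
    {"tags": [{"type": "descriptive", "tag": "red shoes"}, {"type": "brand", "tag": "Acme"}]},
    {"tags": [{"type": "brand", "tag": "Acme"}, {"type": "brand", "tag": "Acorn"}]},
]
index = build_tag_index(files)
suggestions = autocomplete(index, "brand", "ac")
```

The index holds each tag once per type no matter how many files carry it, which is why the separate collection stays much smaller than the embedded data.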
Also, it might be better to have a separate field for every type of tag if there are only a few types. This way, you can index them separately.
"tags" : { "descriptive": [ "this is my tag" ],
"brand": [ "this is another tag" ] }
I'm new to MongoDB and I've used RDBMS for years.
Anyway, let's say I have the following collections:
Realtors
many :bookmarks
key :name
Houses
key :address, String
key :bathrooms, Integer
Properties
key :address, String
key :landtype, String
Bookmark
key :notes
I want a Realtor to be able to bookmark a House and/or a Property. Notice that Houses and Properties are stand-alone and have no idea about Realtors or Bookmarks. I want the Bookmark to be sort of like a "join table" in MySQL.
The Houses/Properties come from a different source so they can't be modified.
I would like to be able to do this in Rails:
r = Realtor.first
r.bookmarks would give me:
House1
House2
PropertyABC
PropertyOO1
etc...
There will be thousands of Houses and Properties.
I realize that this is what RDBMS were made for. But there are several reasons why I am using MongoDB so I would like to make this work.
Any suggestions on how to do something like this would be appreciated.
Thanks!
OK, first things first. You've structured your data as if this were an RDBMS. You've even run off and created a "join table" as if such a thing were useful in Mongo.
The short answer to your question is that you're probably going to have to redefine "first" to load the given "Bookmarks", either "server-side" with an $in clause or "client-side" with a big for loop.
So two Big Questions about the data:
If Bookmarks completely belong to a Realtor, why are they in their own collection?
If Realtors can Bookmark Houses and Property, then why are these in different collections? Isn't this needless complication? If you want something like Realtor.first on bookmarks why put them in different collections?
The Realtors collection should probably be composed of items that look like this:
{"name":"John", "bookmarks": [
{"h":"House1","notes":["Nice location","High Ask"] },
{"p":"PropertyABC","notes":["Haunted"] }
] }
Note how I've differentiated "h" and "p" for ID of the house and ID of the property? If you take my next suggestion you won't need even that.
Taking this one step further, you probably want Houses and Properties in the same collection, say "Locations". In the "Locations" collection, you're just going to stuff all Houses and Properties and mark them with "type":"house" or "type":"property". Then you'll index on the "type" field.
Why? Because now when you write the "first" method, your query is pretty easy. All you do is loop through "bookmarks" and grab the appropriate key ("House1", "PropertyABC") from the "Locations" collection. Paging is straightforward: you query for 10 items and then return.
I know that at some level it seems kind of lame. "Why am I writing a for loop to grab data? I tried to stop doing that 15 years ago!" But Mongo is a "document-oriented" store, so it's optimized for loading individual documents. You're trying to load a bunch of documents, so you have to jump through this little hoop.
Fortunately, it's not all bad. Mongo is really fast at loading individual docs. Running a query to fetch 10 items at once is still going to be very quick.
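The lookup described above might look roughly like this. It is a plain Python sketch with dictionaries standing in for the Realtors and "Locations" collections, not MongoMapper or driver code; the key name "location", the addresses and the function load_bookmarks are all hypothetical.

```python
# Stand-in for the merged "Locations" collection, keyed by ID.
locations = {
    "House1": {"type": "house", "address": "1 Example St", "bathrooms": 2},
    "House2": {"type": "house", "address": "2 Example St", "bathrooms": 1},
    "PropertyABC": {"type": "property", "address": "3 Example Rd", "landtype": "farm"},
}

# A Realtor document with embedded bookmarks that reference locations by ID.
realtor = {
    "name": "John",
    "bookmarks": [
        {"location": "House1", "notes": ["Nice location", "High Ask"]},
        {"location": "PropertyABC", "notes": ["Haunted"]},
    ],
}


def load_bookmarks(realtor, locations, limit=10):
    """Loop through one page of bookmarks and grab each referenced location."""
    ids = [b["location"] for b in realtor["bookmarks"][:limit]]
    # Against a real database this loop would be a single query for all IDs on the page.
    return [locations[i] for i in ids if i in locations]


bookmarked = load_bookmarks(realtor, locations)
```

Because houses and properties live in one collection, the loop does not need to know which kind of record each bookmark points to.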
How do you handle real name conflicts? Is there an established best practice or UI design pattern for disambiguating records like this? If authors can have many articles but more than one author can possibly have the same name how would you enable users to select the author they actually want when creating articles?
I can't dictate the author names be unique. The authors may have some other information that could individuate them (their articles or other optional fields).
To make this clearer - users are not authors. Users are people entering information about authors and articles. The only guaranteed information present for an author is the author's name. Other details are optional.
So if a user is creating a new record for an article they will have to either select or create an author for the many-to-many relationship between authors & articles.
With unambiguous Rails examples such as the blog post category dropdown, like Ryan Bates uses in his RailsCasts, it is easy to create or update: if the category exists, link the blog post to it; if it doesn't, create it and link the blog post to it.
My case is much messier. "If it exists" isn't that meaningful, but I don't want to create a separate author entry for every article the author writes.
Presumably you have a key that means you know which user authored which records, so it comes down to how you can best disambiguate them for your users.
Perhaps you need to ask your authors for a brief summary of themselves in their profile that you can use to disambiguate them on their terms. Alternatively depending on the type of article you might choose to describe them in terms of geography ("John Biggs, Florida", "John Biggs, California" ) or perhaps by the subject areas they choose to write about: "John Biggs, Java Expert", "John Biggs, Indonesia Specialist" and so on.
You could even just have "John Biggs (1)", "John Biggs (2)" and so on. I seem to recall this works alright for IMDB, who are a good example of a site that has had to sort this problem.
The important thing in usability, where this kind of thing is concerned, is consistency. You need to always identify your authors in the same way, so you don't have "John Biggs, Florida" alongside "John Biggs (2)". You also need to make sure that the identity you give an author doesn't change once it is set up, so "John Biggs (2)" never becomes "John Biggs (5)", and your users can recognize the disambiguated name as the same person who had that name previously.
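One way to keep such labels stable is to assign the number once, when the author record is created, and store it. A minimal Python sketch follows; AuthorRegistry and its method names are hypothetical, and it assumes each author record already has its own internal ID.

```python
class AuthorRegistry:
    """Hands out "Name (n)" labels that never change once assigned."""

    def __init__(self):
        self._counts = {}  # name -> how many authors with that name exist so far
        self._labels = {}  # author ID -> label assigned at creation time

    def add(self, author_id, name):
        """Return the existing label for a known author, or assign the next number."""
        if author_id in self._labels:
            return self._labels[author_id]
        number = self._counts.get(name, 0) + 1
        self._counts[name] = number
        label = f"{name} ({number})"
        self._labels[author_id] = label
        return label


registry = AuthorRegistry()
first = registry.add(10, "John Biggs")
second = registry.add(11, "John Biggs")
again = registry.add(10, "John Biggs")
```

Here first and again are the same label, so the first John Biggs keeps his number even after a second one is added.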
One thing that worked for me on a past project is to have a text box in which users can type in the author's name. As they type, I update a div with possible matches - similar to Stack Overflow when you type a tag in the ignored or interesting box.
Users can click on a name in the div which opens the record in a new window - new window has a button, "select this author," which takes you back to the original page with that author in the textfield as Author Name (id).
If they submit the form with an ambiguous name, we have an extra step where we display matches, and they choose which one they mean.
I imagine you'd want something a little more streamlined if this is a data-entry type application, but on that project adding an author was an infrequent operation.
Several things to think about:
Can you filter by subject matter first?
For instance, if John Jones (1) writes articles about genetics and John Jones (2) writes articles about computer networking, then by having the user select the general subject matter first, you may be able to filter out many of the less applicable possible duplicate names.
(I would, however, have a button to see the unfiltered list, because sometimes people write articles in a new subject area.) If you don't want to limit the choices, perhaps a sort by subject matter or location could make it easier to find the right one.
When you show the list of possible duplicate names, show general information about the author, including address and university affiliation and possibly the name of one article. Have a button to click to show existing articles for any one of them. That way, if you know the John Jones you want is located in FL, you only need to check out the three in FL for articles, not all 37 John Jones who wrote genetics articles.
Be aware that users are often lazy; they would rather just insert a new name than choose from a long list of existing names. So make it harder to insert a new name than to pick one: they have to go through the pick process first before they can enter a new name. We have an application which doesn't even show the button to add a new person until after you have done a search. Since names can have variations, consider whether you want to use fuzzy logic for your search. You might want to display J. Jones, Johnny Jones and Jon Jones as well as John Jones in your pick results.
Now a lot of this depends on how much knowledge your users have about the author ahead of time. If they know nothing beyond the name, they have no basis to judge between the 37 John Jones you have in the database. In this case it might be better just to accept the duplicates and return results based on a filtering by keywords or whatever you are storing about the article. Is it really necessary to make sure that the articles are ascribed to the correct John Jones, if you really know nothing about the author other than his name? Are you more concerned with the subject matter and name of the article or with having a list of all articles written by John Jones from UVA who is a professor of Political Science?
You don't! Names are a bad method of identification as you're finding out. You have a number of methods around this:
Add some form of unique identifier. With normal users this would be a username that you check for uniqueness. In your case, the "name (1)" method described above might have to do, if you really have no information other than the name.
An alternative would be to use multiple attributes to make a composite key (e.g. name + dob)
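A small sketch of the composite-key idea in Python. It assumes a date of birth is actually available for each author, which the question says is not guaranteed; find_or_create and the in-memory authors dictionary are hypothetical stand-ins for a database lookup.

```python
from datetime import date

# Stand-in for the authors table, keyed by (normalized name, date of birth).
authors = {}


def find_or_create(name, dob):
    """Return the author with this name and date of birth, creating it if needed."""
    key = (name.strip().lower(), dob)  # composite key: name + dob
    return authors.setdefault(key, {"name": name.strip(), "dob": dob, "articles": []})


a = find_or_create("John Jones", date(1970, 1, 1))
b = find_or_create("john jones ", date(1970, 1, 1))  # same person, sloppier typing
c = find_or_create("John Jones", date(1980, 5, 5))   # different person, same name
```

Two authors with the same name and the same date of birth would still collide, so this narrows the problem rather than removing it.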