Core data migrating from rdbms


I am trying to migrate from plain SQLite (with the FMDB wrapper) to Core Data.
My main reason is the ICU problem: I have some multilingual projects (German, Spanish, Greek, Chinese) whose text is difficult to search in SQLite, whereas Core Data has ICU support built in (NSDiacriticInsensitive | NSCaseInsensitive).
Generally I have my data (a coded book) in the following structure:
id
parentId
content
contentType
nContent
vieworder
where nContent is a diacritic-insensitive, case-insensitive field that I need to ditch, since it slows my database down considerably (I have added indexes and tried other optimizations, but I can't find anything that speeds up the search).
I am baffled by the Core Data concept -- I can understand it in a master-detail project, but I can't understand how to model a self-referencing item object.
Typical data stored with the above structure looks like this:
Chapter A
    Chapter A.1
        Title 1
            Content #1
        Title 2
            Content #2
    Chapter A.2
    [...]
Where "Chapter/Title/Content" is the content field (so it varies from a short string of under 256 characters to a large block of text).
So my questions are:
* How do I model this structure as a Core Data entity/class? (I know that it will need a self-referencing relationship.)
* How do I find the items of each level? (For example, I would like to find all Title types; that's why I have the contentType field.)
* Will indexing this in Core Data give me better indexing and faster search times than plain SQL? (I use LIKE '%...%' patterns on the nContent field.)
* Is it better to stay on SQLite and try to find a different indexing strategy?
Please feel free to answer any of these questions or give me at least an insight.
UPDATE
here is another more "realistic" example of what I mean:
Beginning HTML (Type: chapter, parentid: 0, id: 1)
    The fundamental pieces (Type: chapter, parentid: 1, id: 2)
        How to begin (Type: title, parentid: 2, id: 3)
            [content] (Type: content, parentid: 3, id: 4)
        Using paragraphs (Type: title, parentid: 2, id: 5)
            [content] (Type: content, parentid: 5, id: 6)
    Using Forms (Type: chapter, parentid: 1, id: 7)
    ... (and so on)

EDIT
Considering your clarification...
You probably want to revisit your design to see what works best. However, a simple approach to start with would be something like...
ContentObject
    title: NSString
    type: whatever
    content: NSString
    subcontent: 1-to-many relationship to ContentObject
In the Xcode model editor, you would just control-drag from ContentObject to itself. A self-referencing relationship will be created.
Then make it to-many and name it "subcontent" or whatever you like. Name the inverse relationship "parent."
Now you have a list of objects, you can add sub-objects to each object, and Core Data will automatically manage the pointers back to each other. You can also add an index on any attribute for faster searching.
If your actual content may grow large, you may want to make it an entity of its own, with a relationship to it.
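The shape of that model can be sketched outside Core Data. The following is a plain-Python illustration, not Core Data API: the `ContentObject` class mirrors the entity above, while the `add`, `walk`, and `fold` helpers are hypothetical names introduced only for this example. It shows the parent/subcontent pair being kept in sync, a lookup of every item of one type, and a diacritic- and case-insensitive match done by folding text at query time instead of storing an extra normalized column.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


def fold(text: str) -> str:
    """Strip combining marks and case so accented and plain text compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


@dataclass(eq=False)
class ContentObject:
    title: str
    type: str                       # e.g. "chapter", "title", "content"
    content: str = ""
    parent: Optional["ContentObject"] = field(default=None, repr=False)
    subcontent: List["ContentObject"] = field(default_factory=list)

    def add(self, child: "ContentObject") -> "ContentObject":
        """Set both sides of the relationship (Core Data does this for inverses)."""
        child.parent = self
        self.subcontent.append(child)
        return child

    def walk(self) -> Iterator["ContentObject"]:
        """Yield this object and every descendant, depth first."""
        yield self
        for child in self.subcontent:
            yield from child.walk()


book = ContentObject("Beginning HTML", "chapter")
basics = book.add(ContentObject("The fundamental pieces", "chapter"))
begin = basics.add(ContentObject("How to begin", "title"))
begin.add(ContentObject("", "content", content="Ein Absatz über Überschriften"))
basics.add(ContentObject("Using paragraphs", "title"))

# All items of one level, regardless of depth.
titles = [o.title for o in book.walk() if o.type == "title"]

# Diacritic- and case-insensitive substring search over the content field.
hits = [o for o in book.walk() if fold("uberschrift") in fold(o.content)]
```

In Core Data itself the type lookup would be a fetch request with a predicate on the type attribute, and the insensitive match would use the predicate language's own comparison modifiers rather than a helper like `fold`.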

Related

database design for dictionary of words

(my reason for asking this question is based on having read this answer, which made me rethink my current setup)
I am currently developing a Ruby on Rails application in which there are many languages, each of which has a dictionary of base words attached to it, as well as a list of the words that map to each base word. The way I currently have it set up, there is a base_words table that contains the base_word as a string, along with the language_id as a foreign key. There is also a words table, each row of which contains a word string, along with the base_word_id as a foreign key. Each words row also has an indexed language_id, although I'm almost positive that this is superfluous due to the language_id on base_words, so I'm planning to take it off (although this could be a bad assumption on my part).
In sum, contrary to the answer I mentioned at the beginning, the tables are not separated by language, because I've reasoned that I can simply pull out the words of a language programmatically when the time comes. However, my application will also have translation(s) associated with each base word (as did the answer I referenced), and so I'm doubting my structure due to the realization that each translation will itself be a base_word in the same table, which would mean that a translation is really just the id of another base word in that table. This may be completely fine, or it may not be; I have no clue (this is my first ever programming project).
Is this ok? Do I need to separate my base_words into separate tables for each language, or can I leave it all in one table?
Another example: I also need to store many phrases for each language, along with their translations. Should I have one table where each row has the appropriate translation of the phrase, or one table where each row contains simply one phrase and a language_id, or multiple tables (one for each language)?
Regards,
Michael

As in the other scenario, you'll have a translations table. There is no technical reason it couldn't have multiple foreign keys to base_words (a source_word_id and target_word_id, perhaps). So yes, you can absolutely store all your words in one table. There are some minor side effects involved with translations being directional relationships: it becomes possible to have translations which only work one way, and there will be many pairs of entries with opposite source and target. Neither of these is much of a worry: the first is even potentially desirable in order to represent words with double meanings in one language but not the other, and as for the second, space is cheap and indexing is easy.
You are correct that you do not need words.language_id, so long as you always join base_words when you're querying words and the language matters. This obviously changes if you have a use case where it makes sense to leave base_words out, but that scenario sounds unlikely based on what you describe.
As for phrases: why should they be handled any differently than base_words?
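A minimal sketch of that single-table layout, using SQLite from Python rather than Rails migrations. The `base_words` table and the `source_word_id` / `target_word_id` columns come from the discussion above; the `languages` table, the sample words, the index name, and the `translate` helper are assumptions made for illustration.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE base_words (
    id INTEGER PRIMARY KEY,
    base_word TEXT NOT NULL,
    language_id INTEGER NOT NULL REFERENCES languages(id)
);
-- Two foreign keys into the same table: one row per directional translation.
CREATE TABLE translations (
    source_word_id INTEGER NOT NULL REFERENCES base_words(id),
    target_word_id INTEGER NOT NULL REFERENCES base_words(id),
    PRIMARY KEY (source_word_id, target_word_id)
);
CREATE INDEX idx_translations_target ON translations(target_word_id);

INSERT INTO languages VALUES (1, 'English'), (2, 'Spanish');
INSERT INTO base_words VALUES (1, 'house', 1), (2, 'casa', 2), (3, 'home', 1);
-- house <-> casa is stored in both directions; home -> casa only one way.
INSERT INTO translations VALUES (1, 2), (2, 1), (3, 2);
""")


def translate(word: str, source_lang: str, target_lang: str) -> List[str]:
    """Return the target-language base words linked from the given word."""
    rows = conn.execute("""
        SELECT tgt.base_word
        FROM base_words AS src
        JOIN languages AS sl ON sl.id = src.language_id
        JOIN translations AS t ON t.source_word_id = src.id
        JOIN base_words AS tgt ON tgt.id = t.target_word_id
        JOIN languages AS tl ON tl.id = tgt.language_id
        WHERE src.base_word = ? AND sl.name = ? AND tl.name = ?
        ORDER BY tgt.base_word
    """, (word, source_lang, target_lang)).fetchall()
    return [row[0] for row in rows]


to_spanish = translate("home", "English", "Spanish")
to_english = translate("casa", "Spanish", "English")
```

Here "home" translates to "casa" but "casa" only translates back to "house", which is the one-way case mentioned above. A phrases table could reuse the same pattern.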

Database design for book structure (table of contents) and content

I have a list of entries, which can be thought of as paragraphs from a book, stored as separate objects of the same class. These objects have a ‘num’ property, along with the actual text, so that I know their order and can later display them as a list in the correct order (1,2,3, …).
Now I want to bring this one step further and be able to ‘record’ the structure of the book, like the table of contents. In other words, say the book is divided into chapters, and each chapter is further divided into sections. The first few paragraphs are found under Ch.1 Sec.1, then Ch.1 Sec. 2, and so on all the way to Ch. n, S. m. What I’m not sure of is what’s a good way to record this information? I've been told that I should use a database with SQL but I'm not sure where to begin.
The implementation must allow me to ‘quickly’ determine the following two things at any point: (1) Given a chapter and section #, what paragraphs are contained within this section? (2) Given a paragraph #, which chapter and section is it under? It must also be flexible enough that I could use the same platform in the future with few edits if the structure (depth-wise) of the book changes (e.g. sections are divided into subsections, etc.). Finally, it should be able to handle optional divisions (i.e. some sections have subsections while others do not).
This is for an iOS app and my code is written in Objective-C so far.

SQL would certainly be one possibility. If you follow this route, there is a trade-off between flexibility and ease of coding, which affects maintainability. For example, if you build a fixed structure, say with some additional levels attempting to cater for the future, such as:
Book
    Chapter
        Section
            Sub-section
                Paragraph
you will have code with unambiguous references, such as section.fk_chapter, paragraph.fk_subSection, etc. This will make it easier to troubleshoot and build queries. However, you will have to refactor your code a fair amount if you want to add, say, sub-paragraphs or sub-sub-sections. Your UI will be simpler to code with this approach, as you always know which "level" you are working at. Alternatively, you can go for a hierarchical approach:
Book
    Chapter
        Content Item
            Content Item
                Content Item
                    ....
where the contentItem table has a self-referencing foreign key. This has the considerable advantage of allowing any number of levels. Some attribute on the Content Item could tell you the name and "type" of level you are at if needed. It is definitely much more flexible, but it comes with some complexity in implementation and UI presentation. Columns such as contentItem.fk_contentItem, which refer to the parent level, do not tell the coder where they are in the hierarchy. Queries will be a bit more difficult to write. The UI will have to cater for "any" number of levels. On the other hand, these problems are not insurmountable, and many have gone before you on this route.
Your question is quite broad, so opinions will vary on the approach and the above is admittedly very general.
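As a rough sketch of the hierarchical option, the two lookups from the question can be written as recursive queries. This assumes SQLite (which the iOS SDK ships) and is run here through Python's sqlite3 module for brevity; the table layout, the `level_type` and `num` columns, the sample rows, and both helper functions are illustrative assumptions, with only the `fk_content_item` self-reference taken from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content_item (
    id INTEGER PRIMARY KEY,
    fk_content_item INTEGER REFERENCES content_item(id),  -- parent, NULL at the top
    level_type TEXT NOT NULL,   -- 'chapter', 'section', 'subsection', 'paragraph'
    num INTEGER NOT NULL,       -- number within its level (paragraph # for paragraphs)
    body TEXT
);
CREATE INDEX idx_content_parent ON content_item(fk_content_item);
INSERT INTO content_item VALUES
    (1, NULL, 'chapter',    1, NULL),
    (2, 1,    'section',    1, NULL),
    (3, 1,    'section',    2, NULL),
    (4, 2,    'paragraph',  1, 'First paragraph'),
    (5, 2,    'paragraph',  2, 'Second paragraph'),
    (6, 3,    'subsection', 1, NULL),   -- only section 2 has a subsection
    (7, 6,    'paragraph',  3, 'Third paragraph');
""")


def paragraphs_under(item_id):
    """Paragraph numbers anywhere below the given chapter/section/subsection."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT ?
            UNION ALL
            SELECT c.id FROM content_item AS c
            JOIN subtree AS s ON c.fk_content_item = s.id
        )
        SELECT c.num FROM content_item AS c
        JOIN subtree AS s ON s.id = c.id
        WHERE c.level_type = 'paragraph'
        ORDER BY c.num
    """, (item_id,)).fetchall()
    return [row[0] for row in rows]


def ancestors_of(item_id):
    """(level_type, num) pairs from the top level down to the item's parent."""
    return conn.execute("""
        WITH RECURSIVE chain(id, parent, level_type, num, depth) AS (
            SELECT id, fk_content_item, level_type, num, 0
            FROM content_item WHERE id = ?
            UNION ALL
            SELECT c.id, c.fk_content_item, c.level_type, c.num, chain.depth + 1
            FROM content_item AS c JOIN chain ON c.id = chain.parent
        )
        SELECT level_type, num FROM chain WHERE depth > 0 ORDER BY depth DESC
    """, (item_id,)).fetchall()


in_section_1 = paragraphs_under(2)
path_to_third = ancestors_of(7)
```

Because both queries walk the parent column rather than naming levels, adding a deeper level later means inserting rows with a new `level_type`, not changing the schema.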

Find changes quickly in larger SQL database?

There is a Java Swing application which uses an Informix database. I have user rights granted for the Swing application (i.e. no source code), and read only access to a mirror of the database.
Sometimes I need to find the database column that backs a GUI element (TextBox, TableField, Label...). What would be the best approach to find out which database table and column hold the data shown in, e.g., a TextBox?
My general approach is to capture the state of the database. Commit a change using the GUI and then capture the state of the database again. Then I need to examine the difference. I've already tried:
* Use the nrows field of systables: didn't work, because the number in nrows does not seem to be a real-time representation of the row count.
* Create a script with SELECT COUNT(*) ... for all tables: didn't work, because there are too many tables (> 5000). I also tried to optimize by removing empty tables, but there are still too many left.
Is there a simple solution that I'm missing?

Please look at the Change Data Capture API and check if this suits your needs

There probably isn't a simple solution.
You probably need to build yourself a map of the database, or a data dictionary for it. It sounds as though you can eliminate many of the tables from consideration since they're empty, at least for a preliminary pass. If you're dealing with information in a text box, the chances are it is some sort of character data; you can analyze which (non-empty) tables contain longer character strings, and they'd be the primary targets of your searches. If the schema is badly designed with lots of VARCHAR(255) columns even though the columns normally only hold short strings, life is more difficult. Over time, you can begin to classify tables and columns so that you end up knowing where to look for parts of the application.
One problem to beware of: the tabid in informix.systables isn't necessarily as stable as you'd like. Your data dictionary needs to record its own dd_tabid for the table it describes, and can store the last known tabid from informix.systables, but it needs to be ready to find a new tabid value on occasion. You should probably only mark data in your dictionary for logical deletion.
To some extent, this assumes you can create a database in which to record this information. If you can't create an Informix database, you may have to use something else (MySQL, or SQLite, perhaps) to store the data dictionary. Alternatively, go to your DBA team and ask them for the information. Unless you're trying something self-evidently untoward, they're likely to help (but politics can get in the way — I've no idea how collegial your teams are).
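One way to combine the preliminary filtering with the before/after comparison from the question is sketched below. It is only an illustration: SQLite stands in for the Informix mirror, the catalog lookups (`sqlite_master`, `PRAGMA table_info`) are SQLite-specific and would have to be replaced with queries against the Informix system catalog, and the table names and all three helper functions are made up for the example.

```python
import hashlib
import sqlite3


def candidate_tables(conn):
    """Non-empty tables that have at least one character-typed column."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    result = []
    for name in names:
        columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        types = [(col[2] or "").upper() for col in columns]   # col[2] is the declared type
        has_text = any("CHAR" in t or "TEXT" in t for t in types)
        non_empty = conn.execute(f'SELECT 1 FROM "{name}" LIMIT 1').fetchone() is not None
        if has_text and non_empty:
            result.append(name)
    return result


def snapshot(conn, tables):
    """Map each table name to a digest of its rows (row order does not matter)."""
    digests = {}
    for name in tables:
        digest = hashlib.sha256()
        for row in sorted(conn.execute(f'SELECT * FROM "{name}"'), key=repr):
            digest.update(repr(row).encode("utf-8"))
        digests[name] = digest.hexdigest()
    return digests


def changed_tables(before, after):
    """Names of tables whose digest differs between two snapshots."""
    return sorted(name for name in after if before.get(name) != after[name])


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER, name VARCHAR(40));
CREATE TABLE audit (id INTEGER, stamp INTEGER);
CREATE TABLE empty_notes (id INTEGER, note VARCHAR(255));
INSERT INTO customer VALUES (1, 'Smith');
INSERT INTO audit VALUES (1, 100);
""")

tables = candidate_tables(conn)
before = snapshot(conn, tables)
conn.execute("UPDATE customer SET name = 'Smythe' WHERE id = 1")  # stands in for the GUI edit
after = snapshot(conn, tables)
changed = changed_tables(before, after)
```

Note the limitation inherited from the preliminary pass: a table that was empty before the change is not in the candidate list, so an insert into it would go unnoticed until the list is rebuilt.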

SQL SELECT with table aliases in Core Data

I have the following SQL query that I want to do using Core Data:
SELECT t1.date, t1.amount + SUM(t2.amount) AS importantvalue
FROM specifictable AS t1, specifictable AS t2
WHERE t1.amount < 0 AND t2.amount < 0 AND t1.date IS NOT NULL AND t2.date IS NULL
GROUP BY t1.date, t1.amount;
Now, it looks like CoreData fetch requests can only fetch from a single entity. Is there a way to do this entire query in a single fetch request?

The best way I know is to create an abstract parent entity for the entities you wish to fetch together.
So if you have 'Meat', 'Vegetables' and 'Fruits' entities, you can create an abstract parent entity 'Food' and then fetch all the sweet objects from the 'Food' entity.
This way you will get all the sweet 'Meat', 'Vegetables' and 'Fruits'.
See "Entity Inheritance" in the Apple documentation.

Nikolay,
Core Data is not a SQL system. It has a more primitive query language. While this appears to be a deficit, it really isn't. It forces you to bring things into RAM and do your complex calculations there instead of in the DB. The NSSet/NSMutableSet operations are extremely fast and effective. This also results in a faster app. (This is particularly apparent on iOS where the flash is slow and, hence, big fetches are to be preferred.)
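To make the "calculate in RAM" idea concrete, here is a language-neutral sketch in Python of the arithmetic the SQL query from the question performs, applied to rows that have already been fetched. It is not Core Data code; the `important_values` function and the sample rows are hypothetical, and each row is assumed to be a `(date, amount)` pair with `None` for a missing date.

```python
from collections import Counter


def important_values(rows):
    """Mirror the self-join: each dated negative amount plus the sum of undated negatives."""
    rows = list(rows)
    undated = [amount for date, amount in rows if date is None and amount < 0]
    if not undated:
        return []   # the SQL cross join produces no rows when t2 is empty
    undated_total = sum(undated)
    # GROUP BY t1.date, t1.amount: n identical t1 rows contribute the t2 sum n times.
    groups = Counter((date, amount) for date, amount in rows
                     if date is not None and amount < 0)
    return sorted((date, amount + n * undated_total)
                  for (date, amount), n in groups.items())


rows = [
    ("2024-01-01", -10),
    ("2024-01-02", -5),
    ("2024-01-02", 7),    # positive, ignored
    (None, -3),
    (None, -4),
    (None, 9),            # positive, ignored
]
result = important_values(rows)
```

In an app, the two groups of rows would come from fetches with simple predicates on the amount and date attributes, and the loop above would run over the resulting objects.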
In answer to your question, yes, a fetch request operates on a single entity. No, you are not limited to data on that entity. One uses key paths to traverse relationships in the predicate language.
Shannoga's answer is one good way to solve your problem. But I don't know enough about what you are actually trying to accomplish with your data model to judge whether using entity inheritance is the right path for your app. It may not be.
Your SQL schema from a server may not make sense in a CD app. Both the query language and how the data is used in the UI probably force a different structure. (For example, using a fetched results controller on iOS can force you to denormalize your data differently than you would on a server.)
Entity inheritance, like inheritance in OOP, is a stiff technology. It is hard to change. Hence, I use it carefully. When I do use it, I gain performance in some fetches and some simplification in other calculations. At other times, it is the wrong answer, performance-wise.
The real answer is a question: what are you really trying to do?
Andrew

User-adjustable data structures

Assume a data structure Person used for a contact database. The fields of the structure should be configurable, so that users can add user-defined fields to the structure and even change existing fields. So basically there should be a configuration file like
FieldNo FieldName DataType DefaultValue
0 Name String ""
1 Age Integer "0"
...
The program should then load this file, manage the dynamic data structure (dynamic not in a "change during runtime" way, but in a "user can change via configuration file" way) and allow easy and type-safe access to the data fields.
I have already implemented this, storing information about each data field in a static array and storing only the changed values in the objects.
My question: Is there any pattern describing that situation? I guess that I'm not the first one running into the problem of creating a user-adjustable class?
Thanks in advance. Tell me if the question is not clear enough.

I've had a quick look through "Patterns of Enterprise Application Architecture" by Martin Fowler and the Metadata Mapping pattern describes (at a quick glance) what you are describing.
An excerpt...
"A Metadata Mapping allows developers to define the mappings in a simple tabular form, which can then be processed by generic code to carry out the details of reading, inserting and updating the data."
HTH

I suggest looking at the various object-relational patterns in Martin Fowler's Patterns of Enterprise Application Architecture.
The best fit for your problem appears to be Metadata Mapping. There are other patterns as well, such as Mapper.

The normal way to handle this is for the class to have a list of user-defined records, each of which consists of a list of user-defined fields. The configuration information for this can easily be stored in a database table containing a type id, field type, etc. The actual data is then stored in a simple table with the data represented only as (objectid + field index)/string pairs; you convert the strings to and from the real type when you read or write the database.
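A small sketch of that idea, assuming the configuration format shown in the question. The `FieldDef` and `Record` classes, the `load_fields` function, and the `CONVERTERS` table are hypothetical names for illustration. Values are kept as strings keyed by field number and converted on access, and only fields that differ from the default are stored.

```python
import shlex

CONFIG = '''\
0 Name String ""
1 Age Integer "0"
'''

# Maps the DataType column of the configuration to a Python type.
CONVERTERS = {"String": str, "Integer": int}


class FieldDef:
    def __init__(self, number, name, data_type, default):
        self.number = number
        self.name = name
        self.py_type = CONVERTERS[data_type]
        self.default = default          # kept as a string, like stored values


def load_fields(text):
    """Parse 'FieldNo FieldName DataType DefaultValue' lines into FieldDef objects."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        number, name, data_type, default = shlex.split(line)
        fields[name] = FieldDef(int(number), name, data_type, default)
    return fields


class Record:
    def __init__(self, fields):
        self._fields = fields
        self._values = {}               # field number -> string, changed fields only

    def get(self, name):
        definition = self._fields[name]
        raw = self._values.get(definition.number, definition.default)
        return definition.py_type(raw)  # convert the stored string to the real type

    def set(self, name, value):
        definition = self._fields[name]
        if not isinstance(value, definition.py_type):
            raise TypeError(f"{name} expects {definition.py_type.__name__}")
        self._values[definition.number] = str(value)


person = Record(load_fields(CONFIG))
default_age = person.get("Age")
person.set("Age", 42)
stored_age = person.get("Age")
```

The `_values` dictionary corresponds to the (objectid + field index)/string pairs described above, with the object id implied by the `Record` instance.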
