I have a Windows 8 Store application that needs to use an extremely large custom dictionary of words (1.2 MB). We've been able to add words to the Microsoft dictionary in a user's roaming profile, located at:
C:\Users\{username}\AppData\Roaming\Microsoft\Spelling\en-US\default.dic
but it looks like it stops working once that file grows larger than 64 KB.
Does anyone have any suggestions to remedy this?
Local travel cards in Saint Petersburg, Russia have huge ID numbers that aren't easy to read and type into a web page when topping up the card online. So I want to build a small app that takes a photo of a travel card and parses the number out.
The task is a bit easier than free-form recognition:
the card is of a well-known size
the ID numbers have a known length, sit in a well-known location on the card, and contain digits only, no letters (okay, I think there are two variations, and maybe they will add one or two more in the future)
even the font is known in advance
even the first several digits are the same for most of the cards (so far only two prefixes are used)
How would you do it? Are there any libraries tuned not for general OCR, but for the "hinted" OCR I need?
Best regards,
Artem.
P.S.
Actually, a free or cheap web service for this task would also be good enough.
Yes. Tesseract is an open-source OCR engine whose development Google has sponsored, and there is an iOS SDK for it on GitHub that you can import into your application. Its documentation explains how to set it up in your app. It has methods that return a string containing the recognized text. BUT that string will contain ALL of the text from the card. So the best thing to do would be to:
1. "Clip" the original image to extract a sub-image that shows only the portion of the card you want to read the numbers from.
2. Process this sub-image through Tesseract to retrieve the string you are looking for.
3. Then parse the string and pick out the data that you need.
But be warned, it can be a bit quirky. This SDK tends to recognize text best in images that were scanned, not photographed. Although it is an advanced piece of technology, it isn't perfect. So to get the best results, try to work from scanned copies of the originals.
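The third step can be sketched roughly as follows. This is a language-agnostic illustration written in Python rather than iOS code, and the prefixes and the 19-digit length are made-up placeholders, since the real card format isn't stated in the question.

```python
import re

# Hypothetical values: the question does not give the real prefixes or length.
KNOWN_PREFIXES = ("9643", "9644")
NUMBER_LENGTH = 19


def extract_card_number(ocr_text):
    """Return the first digit run with a known prefix and length, or None."""
    # OCR output often contains spaces inside the number, so drop spaces and
    # tabs (but keep line breaks) before looking for digit runs.
    compact = re.sub(r"[ \t]+", "", ocr_text)
    for run in re.findall(r"\d+", compact):
        if len(run) == NUMBER_LENGTH and run.startswith(KNOWN_PREFIXES):
            return run
    return None
```

Because the length and prefixes are known in advance, anything that doesn't match can be rejected outright and the user asked to retake the photo, instead of accepting a misread number.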
Best of luck.
The ideal solution for you would have three components:
1) Detection of the card. This is useful because, with detection in place, end users have a much easier time actually using the scanner: they can hold the phone above the card in an arbitrary orientation.
2) An accurate OCR component. Ideally, it is customizable for the exact font used on the card and for the exact position on the card.
3) A parsing mechanism. This lets you obtain the exact string written on the card without writing a huge amount of OCR parsing code.
BlinkID SDK has all this. It has a preset for detecting cards in the ID-1 format. It has an integrated OCR engine. And it provides RegexParser, where you can define the exact format of the text you're trying to extract from the document.
BlinkID was initially built for scanning ID documents, which have properties very similar to those of the problem you're trying to solve.
Note: I'm one of the developers working on BlinkID.
If I store important values in a plist in Xcode, is that less secure than if they were hard-coded in a class? Could jailbroken devices mess with those values easily? I know there's a certain level of risk with everything, but can someone explain the relative risks of a flat file vs. hard-coded values (in a MyClass.m file)?
Sub question:
How do you go about storing large amounts of initial data for a game/app to run on? It's fine if the values are readable, I just don't want them easily writable.
As for reading data:
plist data is not secure at all. Getting at plist content takes virtually no time! (And as the ipa is just a renamed zip, you don't even need a device ;))
Extracting values from compiled code is 'harder', but in the case of plain-text strings only by a small margin.
(Again: no need for a device.)
As for writing to it:
Data that you deliver in the app bundle is never writable without breaking the code signature, so any method is fine. People often ship Core Data databases when they use Core Data, but I also use XML, JSON, and plists to deliver my content, whatever suits the need best.
Note: breaking the code signature makes the app unusable on a stock iOS device, but I think it would remain usable on a jailbroken phone, as the kernel doesn't really check the signature there.
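To illustrate how little effort the reading side takes, here is a minimal Python sketch that treats an .ipa as the zip archive it is and parses every plist inside. The archive built by `make_fake_ipa` is a stand-in with an invented app name and key, used only so the example is self-contained.

```python
import io
import plistlib
import zipfile


def read_bundled_plists(ipa_bytes):
    """Return {archive path: parsed plist} for every .plist inside an .ipa."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(ipa_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(".plist"):
                # plistlib reads both XML and binary property lists.
                found[name] = plistlib.loads(archive.read(name))
    return found


def make_fake_ipa():
    """Build a tiny stand-in archive; the app name and key are invented."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "Payload/Demo.app/Settings.plist",
            plistlib.dumps({"startingGold": 100}),
        )
    return buffer.getvalue()
```

Calling `read_bundled_plists(make_fake_ipa())` returns the settings dictionary keyed by its path in the archive, with no device or jailbreak involved.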
Values stored in your source files (.m) are relatively safe: they are compiled into the binary, so they are harder to get at. On the other hand, accessing an app's plist, images, and other bundled files is quite easy. There are programs for this (for example, iExplorer), and the device doesn't have to be jailbroken at all.
So if you have sensitive information stored in your plist, it is worth encoding the file or moving the values into your source code.
Anyone can access a .plist file. A value hard-coded in a class is much more secure, so use the second option. Nothing is 100% secure, but if someone wants to get at a hard-coded value, it takes more work.
You can store your data in an NSDictionary, convert it to NSData, apply some simple crypto (re-ordering the bytes, for example), and then write it to your application folder. When you want to read it back, take the contents of the file, decrypt them, and re-create the NSDictionary.
convert NSDictionary to NSData:
NSData *someDatas = [NSKeyedArchiver archivedDataWithRootObject:aDictionary];
convert NSData to NSDictionary:
NSDictionary *aDictionary = (NSDictionary*) [NSKeyedUnarchiver unarchiveObjectWithData:someDatas];
The data is secure in the sense that the user cannot easily modify the content in a valid way: tampered data won't decode correctly when the application reads it.
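The same round trip can be sketched outside of Cocoa. The Python version below is only an illustration of the idea: JSON stands in for NSKeyedArchiver, and reversing the bytes stands in for the "re-order bytes" step. Byte re-ordering is obfuscation, not real encryption, so it only deters casual edits.

```python
import json


def scramble(data):
    """Re-order bytes by reversing them; applying it twice restores the input."""
    return data[::-1]


def save_values(path, values):
    # Serialize first (the NSDictionary -> NSData step), then scramble.
    raw = json.dumps(values, sort_keys=True).encode("utf-8")
    with open(path, "wb") as handle:
        handle.write(scramble(raw))


def load_values(path):
    with open(path, "rb") as handle:
        raw = scramble(handle.read())
    # A file that was edited carelessly usually no longer parses,
    # in which case this raises ValueError.
    return json.loads(raw.decode("utf-8"))
```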
If you're looking to store sensitive values that you don't want exposed on jailbroken devices or in a reverse-engineered app, you could consider using UAObfuscatedString.
As quoted:
When you write code that has a string constant in it, this string is saved in the binary in clear text. A hacker could potentially discover exploits or change the string to affect your app's behavior.
UAObfuscatedString only ever stores single characters in the binary, then combines them at runtime to produce your string. It is highly unlikely that these single letters will be discoverable in the binary as they will be interjected at random places in the compiled code. Thus, they appear to be randomized code to anyone trying to extract strings.
Either way, keeping sensitive values hard-coded in code or in a plist file is considered risky.
The company I work for has a program called QADisplay that is no longer supported. Inside this program is a tool for annotating images. It's very similar to Photoshop in that it takes a layer-based approach to the annotations, with each annotation as its own class in Delphi 7. The annotations are stored as the base image plus a text file with the information describing the contents of each annotation.
The issue is that the text that is displayed in the annotations is somehow encoded in the text file. For example, if the annotation displays as "Arial" (without the quotes), the text file will be written as:
TEXT (Type of annotation)
5 (Length of the literal string, in this case: Arial)
07)I86P (The encoded string)
What I need to do is extract all of the text from the annotations in preparation for the installation of our new software system.
I am not familiar with Delphi and do not have access to the source code. I have tried to disassemble the executable but haven't had much luck there. Does anyone have any ideas on how to approach decoding this? I've googled around a bit (Arial "07)I86P") and found some results relating to virus scan error logs and things of that nature but no dice on anything that I found helpful in relation to the issue I'm having.
That is not a standard text encoding. Maybe it is encrypted?
Without documentation or contact with the original developers, you will have to reverse engineer the app. Using a disassembler/debugger like IDA, if you can pause the app after it loads 07)I86P into memory, you can follow the code as it processes the characters, which will help you reconstruct the decode algorithm.
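Before reaching for a disassembler, there is one cheap hypothesis worth testing. The single sample in the question is consistent with uuencode-style packing (every three bytes split into four 6-bit values, each offset by 32) with the usual leading length character left out. The Python sketch below checks that hypothesis against the sample; whether QADisplay uses this scheme for every annotation is an assumption that would need verifying on more files.

```python
def decode_uu_body(encoded, length):
    """Decode uuencode-style 6-bit characters and keep the first `length` bytes."""
    # Each character carries 6 bits as (value + 32). Pad to a multiple of four
    # characters with spaces, which decode to zero bits.
    padded = encoded.ljust(-(-len(encoded) // 4) * 4, " ")
    out = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = ((ord(ch) - 32) & 0x3F for ch in padded[i:i + 4])
        out += bytes([
            (a << 2 | b >> 4) & 0xFF,
            (b << 4 | c >> 2) & 0xFF,
            (c << 6 | d) & 0xFF,
        ])
    # The length line from the annotation file says how many bytes are real.
    return bytes(out[:length]).decode("latin-1")


print(decode_uu_body("07)I86P", 5))  # prints "Arial"
```

If other annotations decode to sensible text the same way, the extraction can be scripted without touching the executable.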
I'm developing an app which needs to show some logos. These logos are just 8 KB PNG files, and I'm only going to handle a small number of them (10-20 at most). However, they are downloaded from the Internet because they might change. So what I'm trying to achieve is to have the app download them (done), store them in the file system, and only download them again when they change (which might be months apart).
Everyone seems to use Core Data, which in my opinion is designed for bigger and more complex things; my files will always have the same names and have no relations between them.
Is the file system the way to go? Any good tutorial?
Yes, the file system is probably your best option for this. You say that you've already implemented the downloading. How have you done so? With NSURLConnection? If so, then at some point, you have an NSData object. This has a couple of write... methods you can use to save the data to a file on the filesystem. Be sure to save the files in the right place, as your app is sandboxed and you can't write anywhere you like.
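The caching logic itself is small. Here is a rough sketch in Python rather than Objective-C, just to show the shape of it. The `fetch` callable is a hypothetical stand-in for the existing download code, and deciding when a cached logo is stale (for example, via a version number from the server) is left out.

```python
import os


def load_logo(cache_dir, name, fetch):
    """Return the logo bytes, downloading via `fetch` only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read()
    data = fetch(name)
    # Write to a temporary file first so a crash never leaves a
    # half-written logo under the final name.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, path)
    return data
```

To force a refresh when a logo changes, delete the cached file and call `load_logo` again.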
The advantage Core Data brings is efficiency. Using NSFetchedResultsController to display your logos in a tableview gets you optimized object loading and memory management. It will automatically load only the items which can be displayed on one screen, and as the user flicks through the table it will handle releasing items which move offscreen. Implementing that on your own is not a simple task.
If you want to build and display your data without Core Data, you'll probably want to use NSKeyedArchiver, which lets you easily write an array or dictionary of objects (including nested arrays, dictionaries, and images).
By best I mean most efficient.
So don't go on about subjectiveness.
I have a list of websites and I want to store the list locally on the iPhone. Each entry needs a URL, a title, and a small image (32x32 max). I don't think I should be using Core Data for this. Should I be using a plist?
EDIT:
I thought the definition of efficient was obvious: take up the least amount of room and use the least memory/CPU.
Sorry, I forgot to say: about 10-15 items max. They just get loaded into a table view when the app first loads or when that view is brought back by a nav controller.
If you can, leave the images in the app resources and put the URL, title, and image name in a plist. Alternatively, you could just create a "Site" class with the three properties and generate an array of Sites in code (or an array of dictionaries).
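For a sense of what such a plist holds, here is a small sketch using Python's plistlib, which reads and writes the same property-list format. The entries and key names are hypothetical.

```python
import plistlib

# Hypothetical entries; the image files themselves stay in the app resources.
sites = [
    {"title": "Example", "url": "https://example.com", "imageName": "example.png"},
    {"title": "Example Org", "url": "https://example.org", "imageName": "example-org.png"},
]

plist_bytes = plistlib.dumps(sites)   # XML property list, as bytes
loaded = plistlib.loads(plist_bytes)  # back to a list of dictionaries

titles = [site["title"] for site in loaded]
```

With 10-15 entries of three short strings each, the whole file is at most a few kilobytes, so the format is unlikely to matter for storage or load time.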
You say not to "go on about subjectiveness" but you don't provide your definition of efficient for this.
You don't specify how many websites you want to store or how you want to use them or what is important to you - storage size, i/o perf, ability to query in specific ways etc.
It doesn't sound like a plist would be a bad fit, but my earlier point is that the way you are going to read and write the data is generally just as important, or more important, in setting the context for questions like this.