Advanced hex editor - binary-data

I need to edit some binary data to create some complex register values.
It would be nice to have a hex editor with some "advanced" features beyond the standard ones:
Ability to enter bit-reversed data words (automatic reversal)
Ability to enter a data word at any position in the file (i.e. not byte-aligned)
Cross-platform (at least Linux and Windows).
Any suggestions?
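To make the two requested features concrete, here is a minimal Python sketch of what they amount to. The function names `reverse_bits` and `write_bits` are hypothetical, and the MSB-first bit numbering is an assumption; a real register layout may number bits differently.

```python
def reverse_bits(value, width):
    """Return `value` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def write_bits(buf, bit_offset, value, width):
    """Overwrite `width` bits of bytearray `buf` starting at `bit_offset`.

    Assumption: bit 0 is the most significant bit of byte 0 (MSB-first).
    """
    if bit_offset < 0 or bit_offset + width > len(buf) * 8:
        raise ValueError("bit range falls outside the buffer")
    for i in range(width):
        bit = (value >> (width - 1 - i)) & 1   # walk the word from MSB to LSB
        pos = bit_offset + i
        mask = 0x80 >> (pos % 8)
        if bit:
            buf[pos // 8] |= mask
        else:
            buf[pos // 8] &= ~mask & 0xFF


# Usage: place a bit-reversed 8-bit word 4 bits into a 2-byte buffer.
data = bytearray(2)
write_bits(data, 4, reverse_bits(0x01, 8), 8)
```

A script like this is no substitute for an interactive editor, but it can serve as a stopgap for generating register images.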

Related

Random mobile app icon generation algorithms [duplicate]

This question already has an answer here:
How to generate the random default "gravatars" like on Stack Overflow?
(1 answer)
Closed 8 years ago.
What is a suitable algorithm that can be used to generate random, but likely humanly distinguishable, graphic square icons?
Icons, from 57x57 up to 1024 square, such as used for mobile apps, preferably using something like Core Graphics commands/operations? (or an equivalent)
I tried filling square bitmaps with rand(), but they all look like mud, very hard to distinguish between by sight.
Identicon
The kind of random icon you are describing is called an Identicon.
Identicons are icons that are generated from some form of user information.
An Identicon is a visual representation of a hash value, usually of an
IP address, that serves to identify a user of a computer system as a
form of avatar while protecting the users' privacy. The original
Identicon was a 9-block graphic, and the representation has been
extended to other graphic forms by third parties. – Wikipedia
Sample Implementation
You can have a look at:
NIdenticon - a C# library that helps create simple Identicons. Examine the IdenticonGenerator class, which has only one method, Create(). You should be able to extract the algorithm or general idea from it.
Contact-Identicons source - Android app source code. The app generates Identicons. This blog post includes a sample of the Java code used to generate a 5x5 pixel, horizontally symmetrical identicon much like the ones GitHub uses.
IGIdenticon source - an Objective-C identicon generator. A port of an identicon library written in Java.
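The general idea behind these libraries can be sketched in a few lines of Python. This is not the algorithm of any of the libraries listed above; it is a minimal illustration that assumes SHA-256 as the hash and a 5x5 grid mirrored left to right, and the function names are made up for the example.

```python
import hashlib


def identicon_grid(seed, size=5):
    """Derive a size x size on/off grid, mirrored left to right, from a hash of `seed`."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    half = (size + 1) // 2                      # columns filled before mirroring
    grid = []
    for row in range(size):
        left = [bool(digest[(row * half + col) % len(digest)] & 1)
                for col in range(half)]
        # Mirror the left half; the centre column is not repeated when size is odd.
        grid.append(left + left[: size // 2][::-1])
    return grid


def identicon_color(seed):
    """Pick a foreground RGB colour from the last three hash bytes."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return digest[-3], digest[-2], digest[-1]


def render(grid):
    """Render the grid as text, two characters per cell, for a quick visual check."""
    return "\n".join("".join("##" if cell else "  " for cell in row) for row in grid)


print(render(identicon_grid("user@example.com")))
```

To produce an actual bitmap, each grid cell would be scaled up to a block of pixels (for example 200x200 pixels per cell for a 1000-pixel icon) and filled with the derived colour.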
Good luck!
One way to approach this is similar to a random sentence generator: Rather than a random sequence of letters or words, you can use simple grammar templates like "The (adjective) (noun) (transitive verbed) a (adjective) (noun)." Then pick random nouns etc. to fill it in.
So here, you could compose an icon by randomly selecting some small image pieces like a document icon, a person icon, a right arrow, a question mark, etc. Randomly colorize the pieces, using a randomly chosen color scheme. Randomly arrange the pieces together. Add a shadow. Stuff like that.
For avatars, this could work similar to Mr. Potato Head.

How do I store and view graphically formatted data?

I have an app (written in D2010) which is similar to a text retrieval app. It has a list of questions with their corresponding answers. Most answers are strictly text, but some answers have graphics and formatting. My dilemma has to do with the formatted answers. The user should be able to copy such an answer (formatting and graphics) in order to paste it into another app. I have tried using a Word OCX, but this is a little problematic: the user has to have Word installed, it gives random errors when used inside a virtual machine, etc. I am now experimenting with a built-in browser component and viewing the data as a PDF. This is nice and easy, but when I copy and paste it, I lose all formatting, and the graphic shows up as a large, totally black box.
I can store the data in whatever format I choose. It is stored as a BLOB in a DB file. I write it to a temp file and then I call some type of viewing routine, so I have flexibility there. My issue is really, what viewer mechanism is simple to implement, and allows copying/pasting, while maintaining text formatting (bullets, indents, etc) and graphics.
Thanks,
GS
TRichEdit (or any of its descendants or similar classes) lets users view formatted text and images, and when the content is copied, the RTF representation of the data is placed on the clipboard.
When the clipboard data is pasted into an RTF-compatible editor (like WordPad or Word), all the formatting, bullets and images are preserved.

General question - copy, cut, paste

How do applications transfer copied strings to each other? Is this done through the clipboard? If so, how can I access the clipboard in a program?
Edit: I'm interested in Windows systems, I know a bit of C#, and C++.
Yes, cut-and-paste is usually done using the system-wide clipboard.
In both Windows Forms and WPF applications, there are (different) classes called 'Clipboard', which contain the stuff you need to access the system clipboard.
Basically, the clipboard allows you to put pretty much anything on to it, along with markers that say what format the data is in. You can put the same data on in lots of different formats. That's how, for example, you can cut and paste a part of a spreadsheet in Excel into Notepad - Excel has put the data onto the clipboard in both a native Excel format and a plain text format.
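The multi-format idea can be illustrated with a toy model in Python. This is not a real operating system API: the class `ToyClipboard` and the format names are invented for the example, and a real program would use the platform's clipboard classes instead.

```python
class ToyClipboard:
    """Toy model of a multi-format clipboard (hypothetical, not an OS API)."""

    def __init__(self):
        self._formats = {}

    def set_data(self, formats):
        """Replace the clipboard contents with a {format_name: payload} mapping."""
        self._formats = dict(formats)

    def get_best(self, preferred):
        """Return (format_name, payload) for the first preferred format present."""
        for name in preferred:
            if name in self._formats:
                return name, self._formats[name]
        return None


clipboard = ToyClipboard()

# The "spreadsheet" copies the same cells in two representations.
clipboard.set_data({
    "native-sheet": [["Item", "Qty"], ["Apples", 3]],
    "plain-text": "Item\tQty\nApples\t3",
})

# A plain-text editor only understands text, so it asks for that format.
print(clipboard.get_best(["plain-text"]))

# Another spreadsheet would prefer the richer format and fall back to text.
print(clipboard.get_best(["native-sheet", "plain-text"]))
```

The source application decides which formats to offer, and each destination picks the richest one it understands.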

Terminology and concepts surrounding the use of code pages

I'm in the process of researching code pages and have come across many conflicting uses of terminology, even amongst different Wikipedia entries. I just can't find a source of information that spells out the entire character handling process from start to finish. Could someone well versed in this field suggest ways in which the following information is inaccurate or incorrect:
The process of character representation as far as I understand:
We start with sets of symbols (not sure of the correct terminology here, possibly 'scripts') that are not associated with any specific platform. 'The Cyrillic alphabet' is understood to refer to the same entity in the context of Windows as in Linux, for example.
Members of these sets are selected, generally in bunches, by vendors to form a platform specific character set. The platform might assign these various codes such as GDI values on Windows (eg. 0 for ANSI_CHARSET and the other codes mentioned here: http://asa.diac24.net/wiki/index.php?title=ASS:fe&printable=yes). I cannot find much information on these sets such as whether they are in fact coded character sets or if they are simply unordered and abstract.
From these sets, individual code pages are developed that appear to have a one to one mapping with GDI values. Since these GDI values appear to represent sets that are platform dependent, does this mean Windows code pages are essentially a coded version of each individual set?
I've been having trouble reconciling this idea with a link shown to me earlier (which I've lost) that showed a one to many mapping between these GDI charsets and code pages across different platforms. Is this accurate, do these GDI values point to sets from which different code pages across different platforms can be developed?
Each code page maps a member of an abstract character set onto an integer to represent its position in the set. In the case of the 'simpler' code pages mentioned on the above webpage, these can be referred to using the more precise 'character map' term. Is this term worth considering or is the distinction too subtle and unimportant?
A font resolves a code point to a glyph if it contains one for that code point, otherwise it reports a failure. I've also read that a font may return its own blank glyph for those code points which it doesn't support. Can an application distinguish between this blank glyph and a successful resolution, ie. does the font return an error code of sorts with this blank glyph?
I believe that's the extent of my confusion. Any clarification in this regard would be invaluable. Thanks in advance.
You are essentially correct:
Start with the set of all known characters.
Select a subset of these characters (a character set).
Map these to bit patterns (code page and encoding)
Render these to an output device by combining the character with a glyph (ie. using a font, a bit pattern, and a codepage/encoding that maps bit pattern to character).
Across platforms, there are similar code pages. And even across many code pages there are similar mappings of value to character. For example, Windows Latin, Mac Roman and Unicode share characters for the first 128 values (0 to 127, the ASCII range). There is some standardization of code pages (e.g. http://en.wikipedia.org/wiki/Shift_JIS for Japanese) so that machines can interact.
Generally, for new development you should be using Unicode with one of the popular encodings. UTF-8 is popular on most modern systems. UTF-16LE is used for Windows system calls ending in W.
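A short Python sketch shows the difference between a bit pattern and the character it stands for. It uses Python's built-in `cp1252` (Windows Latin 1) and `mac_roman` codecs; the sample bytes are chosen purely for illustration.

```python
raw = bytes([0x41, 0xE9])              # two bytes: 0x41 and 0xE9

as_windows = raw.decode("cp1252")      # interpret as Windows Latin 1
as_mac = raw.decode("mac_roman")       # interpret as Mac Roman

# 0x41 is in the shared ASCII range, so both code pages agree on "A".
# 0xE9 is above 127, so the two code pages map it to different characters.
print(repr(as_windows), repr(as_mac))

# One character, two Unicode encodings, two different byte sequences.
text = "é"
utf8_bytes = text.encode("utf-8")
utf16le_bytes = text.encode("utf-16-le")
print(utf8_bytes.hex(), utf16le_bytes.hex())
```

The same bytes yield different text depending on the code page assumed, which is why the encoding has to be known or agreed on when data moves between systems.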
This might be a good match: http://mihai-nita.net/2006/08/06/basic-lingo/

How does "cut and paste" affect character encoding and what can go wrong?

I have a document A in encoding A displayed in tool A and a document B in encoding B displayed in tool B. If I cut and paste (part of) B into A what might be the resultant character encoding? I realise this depends on tool A and tool B and the information held in the paste buffer (which presumably can contain an encoding?) and the operating system.
What should high-quality tools do? and in practice how many of the common tools (e.g. Word, TextPad, various IDEs, etc.) do a good job?
First of all, a text editor's internal representation of text has no bearing on how the text is encoded (serialized) when you save the file. So a document is not "in" an encoding; it's a sequence of abstract characters. When the document is saved to a file (or transmitted over the network) then it gets encoded.
It's up to each application to decide what it puts on the clipboard. Typically, a Windows app that knows what it's doing will put a number of different representations on the clipboard. When you paste in the other app, that app will look for the representation that best suits its needs.
In your case, a text editor (that knows what it's doing) will put a Unicode representation of a selected string onto the clipboard (where Unicode, in Windows, is typically moved around as UTF-16, but that's not important). When you paste in the other app, it will insert that sequence of Unicode characters into the document at the selection point.
There's an app floating around called "ClipSpy" that will help you see what I'm talking about, interactively.
I observed the following behavior when I looked into Unicode normalization: When copying a canonically decomposed string (NFD) in Firefox in macOS 10.15.7, the string is normalized to NFC when pasting it in Chrome. What's weird is that the pasting affects the content of the clipboard: When pasting the string in Firefox again, it's then also canonically composed there. If I don't paste it anywhere else before pasting it in Firefox again, the NFD form survives. Interestingly, the problem doesn't occur in the other direction: When copying a canonically decomposed string in Chrome, it's pasted in NFD form anywhere I can tell. My conclusion is that Firefox stores text to the clipboard differently from other applications. One way to play around with this yourself is to copy 'mañana' === 'mañana' to your JavaScript console. The statement returns false if the NFD form of the string on the right survived the copy & paste.
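The two forms can also be told apart outside the browser. The following Python sketch uses the standard `unicodedata` module to build both forms of the same word explicitly, so the result does not depend on what survived a copy and paste.

```python
import unicodedata

composed = unicodedata.normalize("NFC", "ma\u00f1ana")    # ñ as one code point
decomposed = unicodedata.normalize("NFD", composed)       # n + combining tilde

# The strings look identical when displayed but compare as different.
print(composed == decomposed)
print(len(composed), len(decomposed))

# Listing the code points shows where they differ.
print([hex(ord(ch)) for ch in decomposed])

# Normalizing back to NFC restores equality.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Inspecting the length or the code points of a pasted string in this way shows which form an application actually delivered.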
This is a very good question. When you copy/paste, exactly what is copied/pasted: CHARACTERS or BYTES? And if BYTES, what encoding are they in?
From the answers, it sounds like the answer is "it depends". Different programs will put different things in the clipboard, sometimes placing multiple representations.
Then the pasting program needs to pick the best one and "do the right thing" with it.
Following my conversation with @Kaspar Etter, I did some testing. Here is what I found:
Copy from and Paste to:
Firefox:
Firefox to Firefox: NO normalization
Other apps to Firefox: NO normalization
Firefox to other apps: normalization
Even if we use AppleScript, JXA, or Python to directly read the system clipboard that contains the text copied from Firefox, the text is still normalized. Since copying and pasting from Firefox to Firefox does not involve normalization, Firefox probably does not normalize the text during the copy process. I have no idea when the normalization happens.
Safari (macOS, not iOS):
Safari to Safari: normalization
Other apps to Safari: normalization
Safari to other apps: NO normalization
For Safari (macOS), the normalization also happens at least on Canvas by instructure.com. In the fill-in-the-blank questions of Classic Quizzes, when students typed Hebrew words and hit "submit", the input was normalized but the answer key was not. In New Quizzes, however, both the input and the answer key are normalized. It's a mystery to me.
Chrome:
Chrome to Chrome: NO normalization
Other apps to Chrome: NO normalization (Firefox overrides)
Chrome to other apps: NO normalization (Safari overrides)
Conclusion: Firefox and Safari behave in the opposite way. Chrome behaves normally and consistently (except when it is overridden by Firefox and Safari).
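Given that behavior differs between browsers, code that compares pasted or typed text against stored text can sidestep the problem by normalizing both sides first. Here is a minimal Python sketch; the helper name `answers_match` is hypothetical, and choosing NFC as the common form is an assumption (any single form works as long as both sides use it).

```python
import unicodedata


def answers_match(submitted, answer_key, form="NFC"):
    """Compare two strings after bringing both to the same normalization form."""
    return (unicodedata.normalize(form, submitted)
            == unicodedata.normalize(form, answer_key))


# Precomposed é versus e followed by a combining acute accent.
print(answers_match("caf\u00e9", "cafe\u0301"))

# Hebrew shin with the same two marks typed in a different order.
print(answers_match("\u05e9\u05c1\u05b8", "\u05e9\u05b8\u05c1"))
```

This would explain a quiz mismatch of the kind described above: if only the submitted input is normalized, a plain string comparison against an unnormalized answer key can fail even though the text looks identical.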

Resources