I have a lot of music chords that can have alternate names, rather than having to create longer lines with a lot of == and or's for each alternate name, and if chord=="Maj" or "maj" or.. don't work with Lua:
if chord=="Maj9" or chord="maj9" or chord=="M9" or chord=="Maj7(add9)" or chord=="M7(add9)" then notes="0,4,7,11,14" end
I need a simpler way to do it, maybe just reformat the lines in Notepad++ to use an array,
at the moment each of the 200+ chords on one line each:
if chord=="Maj9,maj9,M9,Maj7(add9),M7(add9)" then notes="0,4,7,11,14" end
if chord=="mMaj7,minmaj7,mmaj7,min/maj7,mM7,m(addM7),m(+7),-(M7)" then notes="0,3,7,11" end
The correct way is to normalize your input. For example, take whatever chord value comes in and use Lua’s string.lower() function to make the string all lowercase. By normalizing your input, you simplify the logic you need to write to work with that data. Consider other ways as well to normalize the data. You might, for example, write a method that converts all notes into an enumerated list (C = 1, C# = 2, etc.). That way equivalent notes get the same in-memory values.
Those are just a few ideas to get you on track. You should not try to think up and then hard-code every possible way a user may input a chord name.
Related
So I want to read a text file with the following format:
Bob, G92f22f, Fggggfdff32
Rob, f3h9123, fdsgfdsg3
Sally, f2g4g, g3g3hgdsd
I want a simple Lua program that can filter out say "bob" and then say throw the data into a variable to use in a program.
a = Bob
b = G92f22f
c = Fggggfdff32
I assume then I could do
print(a,b,c)
Still quite new to Lua having a heck of a time with anything read / variable though.
You'll want to look at the io and string modules; they handle things like reading from / writing to files and doing pattern matching on strings.
Luas pattern matching is a bit simple compared to the regular expressions most modern languages use, but from what I see in your example, you could probably match a "word" as [^, ]+, that is, one or more characters that aren't a comma or a space
Being a novel on SPSS I am struggling with finding duplicate cases based on a string-variable in a dataset containing approx 33,000 cases.
I have a variable named "nr" that is supposed to be unique id for every case. However, it turns out that some cases might have two different values in "nr" entered,the only difference being the last character. Resulting in a case being shown as two separate rows.
The structure of the var "nr" is a as follows: XX-XXXXXXX-X or X-XXXXXXX-X i.e 2-7-1 characters or 1-7-1 characters.
I would like to sort out all cases that have a "nr" equal to another case except for the last character.
To illustrate, with a succesfull syntax I would hopefully be able to sort cases like these out from the whole dataset:
20-4026988-2
20-4026988-3
5-4026992-5
5-4026992-8
20-4027281-2
20-4027281-3
Anyone have an idea on how to make a syntax for this? Would be so grateful for any input!
I suggest to create a new variable without that last character, and then look for the doubles:
* first creating some sample data to play with.
data list list/ID (a15).
begin data.
20-4026988-2
12-2345678-7
20-4026988-3
5-4026992-5
5-4026992-8
12-1234567-1
20-4027281-2
6-1234567-1
20-4027281-3
end data.
* now creating the new variable and counting the occurrences of each shortened ID.
string ShortID (a15).
compute ShortID=char.substr(ID,1,char.rindex(ID,"-")).
* also possible: compute ShortID=char.substr(ID,1,char.length(rtrim(ID))-1).
aggregate out=* mode=add /break=ShortID/occurrences=n.
* at this point you can filter based on the number or `occurrences` or sort them.
sort cases by occurrences (d) ShortID.
After removing the last character, you can use Data > Identify Duplicate Cases to find the dups. It as a number of useful options for this.
Looking at the book Mining of Massive Datasets, section 1.3.2 has an overview of Hash Functions. Without a computer science background, this is quite new to me; Ruby was my first language, where a hash seems to be equivalent to Dictionary<object, object>. And I had never considered how this kind of datastructure is put together.
The book mentions hash functions, as a means of implementing these dictionary data structures. This paragraph:
First, a hash function h takes a hash-key value as an argument and produces
a bucket number as a result. The bucket number is an integer, normally in the
range 0 to B − 1, where B is the number of buckets. Hash-keys can be of any
type. There is an intuitive property of hash functions that they “randomize”
hash-keys
What exactly are buckets in terms of a hash function? it sounds like buckets are array-like structures, and that the hash function is some kind of algorithm / array-like-structure search that produces the same bucket number every time? What is inside this metaphorical bucket?
I've always read that javascript objects/ruby hashes/ etc don't guarantee order. In practice I've found that keys' order doesn't change (actually, I think using an older version of Mozilla's Rhino interpreter that the JS object order DID change, but I can't be sure...).
Does that mean that hashes (Ruby) / objects (JS) ARE NOT resolved by these hash functions?
Does the word hashing take on different meanings depending on the level at which you are working with computers? i.e. it would seem that a Ruby hash is not the same as a C++ hash...
When you hash a value, any useful hash function generally has a smaller range than the domain. This means that out of a large list of input values (for example all possible combinations of letters) it will output any of a smaller list of values (a number capped at a certain length). This means that more than one input value can map to the same output value.
When this is the case, the output values are refered to as buckets.
Consider the function f(x) = x mod 2
This generates the following outputs;
1 => 1
2 => 0
3 => 1
4 => 0
In this case there are two buckets (1 and 0), with a bunch of input values that fall into each.
A good hash function will fill all of these 'buckets' equally, and so enable faster searching etc. If you take the mod of any number, you get the bucket to look into, and thus have to search through less results than if you just searched initially, since each bucket has less results in it than the whole set of inputs. In the ideal situation, the hash is fast to calculate and there is only one result in each bucket, this enables lookups to take only as long as applying the hash function takes.
This is a simplified example of course but hopefully you get the idea?
The concept of a hash function is always the same. It's a function that calculates some number to represent an object. The properties of this number should be:
it's relatively cheap to compute
it's as different as possible for all objects.
Let's give a really artificial example to show what I mean with this and why/how hashes are usually used.
Take all natural numbers. Now let's assume it's expensive to check if 2 numbers are equal.
Let's also define a relatively cheap hash function as follows:
hash = number % 10
The idea is simple, just take the last digit of the number as the hash. In the explanation you got, this means we put all numbers ending in 1 into an imaginary 1-bucket, all numbers ending in 2 in the 2-bucket etc...
Those buckets don't really exists as data structure. They just make it easy to reason about the hash function.
Now that we have this cheap hash function we can use it to reduce the cost of other things. For example, we want to create a new datastructure to enable cheap searching of numbers. Let's call this datastructure a hashmap.
Here we actually put all the numbers with hash=1 together in a list/set/..., we put the numbers with hash=5 into their own list/set ... etc.
And if we then want to lookup some number, we first calculate it's hash value. Then we check the list/set corresponding to this hash, and then compare only "similar" numbers to find our exact number we want. This means we only had to do a cheap hash calculation and then have to check 1/10th of the numbers with the expensive equality check.
Note here that we use the hash function to define a new datastructure. The hash itself isn't a datastructure.
Consider a phone book.
Imagine that you wanted to look for Donald Duck in a phone book.
It would be very inefficient to have to look every page, and every entry on that page. So rather than doing that, we do the following thing:
We create an index
We create a way to obtain an index key from a name
For a phone book, the index goes from A-Z, and the function used to get the index key, is just getting first letter from the Surname.
In this case, the hashing function takes Donald Duck and gives you D.
Then you take D and go to the index where all the people with Surnames starting with D are.
That would be a very oversimplified way to put it.
Let me explain in simple terms. Buckets come into picture while handling collisions using chaining technique ( Open hashing or Closed addressing)
Here, each array entry shall correspond to a bucket and each array entry (if nonempty) will be having a pointer to the head of the linked list. (The bucket is implemented as a linked list).
The hash function shall be used by hash table to calculate an index into an array of buckets, from which the desired value can be found.
That is, while checking whether an element is in the hash table, the key is first hashed to find the correct bucket to look into. Then, the corresponding linked list is traversed to locate the desired element.
Similarly while any element addition or deletion, hashing is used to find the appropriate bucket. Then, the bucket is checked for presence/absence of required element, and accordingly it is added/removed from the bucket by traversing corresponding linked list.
I come from the strongly typed world and I want to write some Lua code. How should I document what type things are? What do Lua natives do? Hungarian notation? Something else?
For example:
local insert = function(what, where, offset)
It's impossible to tell at a glance whether we're talking about strings or tables here.
Should I do
local sInsert = function(sWhat, sWhere, nOffset)
or
-- string what, string where, number offset, return string
local insert = function(what, where, offset)
or something else?
What about local variables? What about table entries (e.g. someThing.someProperty)?
For a reference on thoughts and opinions on Lua style in the community (or a particular community?), read this: LuaStyleGuide.
The closest one could get to an enforced style would be the format used by LuaDoc, as it's a fairly popular documentation generator used by high profile projects such as LuaFileSystem.
There are only seven types in Lua.
Here are some conventions (some of them might sound a bit obvious; sorry):
Anything that sounds like a string, should be a string: street_address, request_method. If you are not sure you can add _name (or any other suffix that makes clear it's a substantive) to it: method_name
Anything that sounds like a number, should be a number: mass, temperature, percentage. When in doubt, add number, amount, coefficient, or whatever fits : number_of_children, user_id. The names n and i are usually given to numbers. If a number must be positive or natural, make assertions at the top of the function.
Boolean parameters are either an adjective (cold, dirty) or is_<adjective> (is_wet, is_ready).
Anything that sounds like a verb should be a function: consume, check. You can add _function, _callback or _f if you need to clarify it further: update_function, post_callback. The single letter f represents a function quite often. And usually you should only have one parameter of type function (recommended to put it at the end)
Anything that sounds like a collection should be a table: children, words, dictionary. People typically don't differentiate between array-like tables and dictionary-like tables, since both can be parsed with pairs. If you need to specify that a table is an array, you could add _array or _sequence at the end of the name. The letter t typically means table.
Coroutines are not used quite often; you can follow the same rules as with functions, and you can also add _cor to their names.
Any value can be nil.
If it's an optional value, initialize it at the top of the function: options = options or {}
If it's a mandatory value, make an assertion (or return error): assert(name, "The name is mandatory")
I've done some Google searching but couldn't find what I was looking for.
I'm developing a scrabble-type word game in rails, and was wondering if there was a simple way to validate what the player inputs in the game is actually a word. They'd be typing the word out.
Is validation against some sort of English language dictionary database loaded within the app best way to solve this problem? If so, are there any libraries that offer this kind of functionality? If not, what would you suggest?
Thanks for your help!
You need two things:
a word list
some code
The word list is the tricky part. On most Unix systems there's a word list at /usr/share/dict/words or /usr/dict/words -- see http://en.wikipedia.org/wiki/Words_(Unix) for more details. The one on my Mac has 234,936 words in it. But they're not all valid Scrabble words. So you'd have to somehow acquire a Scrabble dictionary, make sure you have the right license to use it, and process it so it's a text file.
(Update: The word list for LetterPress is now open source, and available on GitHub.)
The code is no problem in the simple case. Here's a script I whipped up just now:
words = {}
File.open("/usr/share/dict/words") do |file|
file.each do |line|
words[line.strip] = true
end
end
p words["magic"]
p words["saldkaj"]
This will output
true
nil
I leave it as an exercise for the reader to make it into a proper Words object. (Technically it's not a Dictionary since it has no definitions.) Or to use a DAWG instead of a hash, even though a hash is probably fine for your needs.
A piece of language-agnostic advice here, is that if you only care about the existence of a word (which in such a case, you do), and you are planning to load the entire database into the application (which your query suggests you're considering) then a DAWG will enable you to check the existence in O(n) time complexity where n is the size of the word (dictionary size has no effect - overall the lookup is essentially O(1)), while being a relatively minimal structure in terms of memory (indeed, some insertions will actually reduce the size of the structure, a DAWG for "top, tap, taps, tops" has fewer nodes than one for "tops, tap").