Output a Prolog path list, avoiding certain routes given by user input.
Hi, I'm working on a project. A building contains zones, each zone has an exit, and we want to evacuate people through the zones to the exits. The user inputs two parameters: the first is the "infected zone", and the other is the "zone of the people we want to evacuate".
The output should be all the safe routes from the "zone of the people we want to evacuate" to the exits, avoiding the infected zone.
For example:
user input: (z11, z12) // z11 is infected; the people we want to evacuate are in z12.
output: z12->z22->exit3, z12->z21->exit2, and z12->elevators
The facts are:
path(z11,z12).
path(z12,z11).
path(z12,z22).
path(z12,z21).
path(z22,z12).
path(z22,z21).
path(z21,z22).
path(z11,exit1).
path(z12,elevators).
path(z21,exit2).
path(z22,exit3).
Please help me write the code.
It's inconvenient that you've chosen to name your predicate path/2 since we'd probably want to call the thing that generates a path to the exit with that name. So first I'd rename all your facts from path/2 to connected/2. Then you're going to want to annotate the exits:
exit(exit1). exit(exit2). exit(exit3).
exit(elevators).
Otherwise you'd have to hard-code them somewhere else.
A simple thing to do would be to solve the general path question and then check to ensure the path doesn't contain an infected site. That would look like this:
path(Start, Path) :- path(Start, Path, []).

path(Start, [Exit], Seen) :-
    exit(Exit),
    connected(Start, Exit),
    \+ memberchk(Exit, Seen).
path(Start, [Next|Rest], Seen) :-
    connected(Start, Next),
    \+ memberchk(Next, Seen),
    path(Next, Rest, [Next|Seen]).

safe_path(Start, Avoid, Path) :-
    path(Start, Path),
    \+ memberchk(Avoid, Path).
This easily generalizes to handle sets of avoid zones:
safe_path(Start, AvoidList, Path) :-
    path(Start, Path),
    forall(member(Avoid, AvoidList), \+ memberchk(Avoid, Path)).
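For instance, assuming the connected/2 facts from the question (renamed from path/2) and the exit/1 facts above, a query for people in z12 while z11 is infected might behave like this (the exact order of solutions depends on the order of the facts):

?- safe_path(z12, z11, Path).
Path = [elevators] ;
Path = [z22, exit3] ;
Path = [z22, z12, elevators] ;
...

With the generalized version (keeping only that clause), the same query becomes safe_path(z12, [z11], Path).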
The bulk of what's interesting and fun to do in Prolog is accomplished with a generate/test paradigm. The simplest and most direct formulation is usually one in which you generate too much (too generally, you might say) and put all the restrictions in the test. Generally speaking, you achieve better performance by making the generator more intelligent about generating possibilities--moving code from the "test" part into the "generate" part of "generate and test."
Usually the first problem you face is generating an infinite tree. This is particularly true with graphs. The memberchk/2 in path/3 with the Seen list serves to prevent looping back and is necessary to make the set of paths finite. Using exit/1 in the base case of path/3 also helps performance because we're not generating intermediate paths. It's nice that with your particular situation you can get away with this.
Doing the avoidance at the end is winnowing out chaff last. The generation doesn't know to avoid these nodes so all of the poisoned paths will get generated and removed by the test. If performance isn't sufficient this way, you can move that code into path/2 directly, doing a similar kind of check to the one done with the Seen list.
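As a minimal sketch of that idea (this clause is not part of the answer above, and the name safe_path_gen is made up): since path/3 already refuses to visit anything on its Seen list, seeding Seen with the zones to avoid makes the generator skip poisoned paths entirely instead of filtering them out afterwards.

safe_path_gen(Start, AvoidList, Path) :-
    path(Start, Path, AvoidList).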
I'm currently working on a personal writing project, and I've ended up maintaining a few different versions of it because the platforms and output formats I want to support differ in ways that aren't trivially solved. After glancing at pandoc several times (and the sheer forest it represents), I've concluded that mere templates don't do what I need and, worse, that I seem to need a combination of a custom filter and writer. Suffice to say: messing with the AST is where I feel way out of my depth. Enough so that, rather than asking specific "how do I do X" questions here, this is a question of "is X the right way to go about it, or what is the proper way to do it, and can you give an example of how it ties together?" So if this question is rather lengthy: my apologies.
My current goal is to have custom markup like the following which is supposed to 'track' which character says something:
<paul|"Hi there">
If I convert to HTML, I'd want something similar to:
<span class="speech paul">"Hi there"</span>
to pop out (and perhaps the <p> tags), whereas if it is just pure markdown / plain text, I'd want it to silently disappear:
"Hi there"
Looking at the JSON AST structures I've studied, it would make sense that I'd want a new structure type similar to the 'Emph' tag called 'Speech' which allows whole blobs of text to be put inside of it with a bit of extra information attached (the person speaking). So something like this:
{"t":"Speech","speaker":"paul","c":[ ... ] }
Problem #1: At the point a lua-filter sees the document, it is obviously already distilled to an AST. This means replacing the items in the way most macro-expander samples do cannot really work, since it would require reading forward. With this method, I would just replace bits and pieces in place (<NAME| becomes a StartSpeech and the first solitary > that follows becomes an EndSpeech), but that would make malformed input a bigger potential problem because of silent-ish failures. Additionally, these tags would be completely at odds with how an AST is supposed to look.
To complicate matters even further, some of my characters end up learning a secondary language throughout the story, for which I apply a different format that contains a simplified rendering of the spoken text reflecting the perspective character's understanding of what was said. Example:
<paul|"Heb je goed geslapen?"|"Did you ?????">
I could probably add a third 'UnderstoodSpeech' group to my filter, but (problem #2) at this point, the relationship between the speaker, the original speech, and the understood translation is completely gone. As long as the final documents need these values in these respective orders and only in these orders, it is fine... but what if I want my HTML version to look like
"Did you?????"
with a tool-tip / hover-over effect containing the original speech? That would be near impossible to achieve because the AST does not contain that kind of relational detail.
Whatever kind of AST I create in the filter is what I need to understand in my custom writer. Ideally, I want to re-use as much stock functionality of pandoc as possible for the writer, but I don't even know if that is feasible at this point.
So now my question: could someone with a solid understanding of pandoc please give me an example of how to keep the relevant bits of data together and apply them in the correct manner? By this I mean a basic example of what needs to go in the lua-filter and lua-writer scripts in the following toolchain:
[CUSTOMIZED MARKDOWN INPUT] -> lua-filter -> lua-writer -> [CUSTOMIZED HTML5 OUTPUT]
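Not from the original question, but as a hedged sketch of how the filter half could tie together: if the speech markup were written as a pandoc bracketed span, e.g. ["Hi there"]{.speech speaker="paul"}, instead of the custom <paul|...> syntax, the speaker and the spoken text stay together in one Span element, and a Lua filter can decide per output format what to emit. The markup choice and the class/attribute names here are assumptions, not the asker's format:

-- Sketch only: relies on pandoc's bracketed_spans markdown extension and the
-- Lua filter API (Span elements, the FORMAT global). Class and attribute
-- names are hypothetical.
function Span(el)
  if el.classes:includes("speech") then
    local speaker = el.attributes["speaker"]
    if FORMAT:match("html") then
      -- HTML output: keep the span and add the speaker as an extra class,
      -- giving <span class="speech paul">...</span> for CSS to style.
      if speaker then el.classes:insert(speaker) end
      return el
    else
      -- Plain text / markdown output: drop the wrapper, keep only the speech.
      return el.content
    end
  end
end

A secondary "understood speech" could be carried the same way as an extra attribute on the span, which keeps the speaker, the original line, and the translation in one element so a writer or later filter can still relate them.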
I'm reading it again and again but I can't understand it.
http://awesomescreenshot.com/09c45nhted
A few things I don't understand:
The meaning of epsilon, aside from "empty string".
The meaning of $.
How is R3 possible? It has term, which would go to factor, which would go to something that does not exist in the input stream.
The 3rd bullet point on the second page.
I appreciate any help. Thank you!
Epsilon meaning, aside from "empty string"?
Put simply, the ϵ symbol means "nothing": the production produces no symbols at all.
$ meaning ?
$ can mark either the start of the input OR the end of the input. Here it means the end of the input, since the input can't start with $: the start symbol of this CFG is stmt.
How is R3 possible? It has term, which would go to factor, which would go to something that does not exist in the input stream.
Beginners often have trouble with this kind of thing; that's normal. This is a recursive production, but it resolves easily while parsing the input. Notice the next production, R4: term_tail ---> ϵ. Whenever the substitution of term_tail should not consume any input, this production can be used at that stage. So there is no infinite recursion, contrary to what you might have been thinking.
3rd bullet point on second page?
It is the input character that can follow term_tail in the grammar. This statement answers the question raised in the second bullet point, "So what input character can be consumed if we apply R4?" The input string derived for term_tail can be produced in two ways:
EITHER term_tail ---> add_op term term_tail OR term_tail ---> ϵ
With those bullet points, the author is highlighting the practical significance of the FOLLOW() function in top-down parsing. The intent is to work out the conditions under which R4 can be applied during top-down parsing, as asked at the top of the 2nd page: "the possible input characters for which R4 can be applied?"
The FOLLOW() set of term_tail comes out to be { ')', '$' }. You will be able to calculate this once you study the rules for the FOLLOW() function.
NOTE (VERY IMPORTANT):
FOLLOW() shows us the terminals that can come after a derived non-terminal. Note, this does not mean the last terminal derived from a non-terminal. It's the set of terminals that can come after it. We define FOLLOW() for all the non-terminals in the grammar.
How do we figure out FOLLOW()? Instead of looking at the first terminal of each phrase on the right-hand side of the arrow, we find every place our non-terminal appears on the right-hand side of any arrow, and then look at which terminals can appear immediately after it.
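For concreteness, here is how that computation might go, assuming the linked notes use the usual expression grammar suggested by the names above (this grammar is a reconstruction, not a copy of the page):

stmt ---> id := expr
expr ---> term term_tail
term_tail ---> add_op term term_tail | ϵ
term ---> factor factor_tail
factor_tail ---> mult_op factor factor_tail | ϵ
factor ---> ( expr ) | id | number

term_tail only ever appears at the very end of a right-hand side (in expr ---> term term_tail and in term_tail ---> add_op term term_tail), so FOLLOW(term_tail) = FOLLOW(expr). expr appears inside factor ---> ( expr ), which puts ')' into FOLLOW(expr), and at the end of stmt ---> id := expr, which adds FOLLOW(stmt) = { $ } since stmt is the start symbol. Hence FOLLOW(term_tail) = { ')', '$' }, matching the set given above.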
Hashtags sometimes combine two or more words, such as:
content marketing => #contentmarketing
If I have a bunch of hashtags assigned to an article, and the words appear in that article (e.g. content marketing), how can I take that hashtag and detect the word(s) that make it up?
If the hashtag is a single word, it's trivial: simply look for that word in the article. But what if the hashtag is two or more words? I could split the hashtag at every possible index and check whether the two words produced are in the article.
So for #contentmarketing, I'd check for the words:
c ontentmarketing
co ntentmarketing
con tentmarketing
...
content marketing <= THIS IS THE ANSWER!
...
However, this fails if there are three or more words in the hashtag, unless I split it recursively, but that seems very inelegant.
Again, this assumes the words in the hashtag are in the article.
You can use a regex with an optional space between each character to do this:
your_article =~ /#{hashtag.chars.to_a.join(' ?')}/
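For example, assuming hashtag holds "contentmarketing" (without the leading #), the interpolated pattern is:

/c ?o ?n ?t ?e ?n ?t ?m ?a ?r ?k ?e ?t ?i ?n ?g/

which matches both "contentmarketing" and "content marketing" in the article text. Note that each space is optional independently, so oddly spaced variants such as "conten tmarketing" would match too.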
I can think of two possible solutions depending on the requirements for the hashtags:
Assuming hashtags must be made up of words and can't be non-words like "#abfgtest":
Do a test similar to the approach in your question, but only test the first part of the string. If the test fails, add another character and try again until you have a word. Then repeat this process on the remaining string until you have found each word. Using your example, it would first test:
- c
- co
- ...
- content <- Found a word, start over with rest
- m
- ma
- ...
- marketing <- Found a word, no more string so exit
If you can have garbage, then you will need to do the same thing as option 1, with an additional step: whenever you reach the end of the string without finding a word, go back to the starting position plus one character. Using the #abfgtest example, you'd first run the above procedure on "abfgtest", then on "bfgtest", then on "fgtest", and so on.
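As a rough illustration of the first option (this code is not from the answer; the helper name and word list are made up), a recursive Ruby version that tries every prefix that is a known word from the article might look like this:

require 'set'

# Return every way to split `tag` into words that appear in `words`.
def split_hashtag(tag, words)
  return [[]] if tag.empty?
  results = []
  (1..tag.length).each do |i|
    head = tag[0...i]
    next unless words.include?(head)
    split_hashtag(tag[i..-1], words).each { |rest| results << [head, *rest] }
  end
  results
end

article_words = Set.new(%w[content marketing])
split_hashtag("contentmarketing", article_words)
# => [["content", "marketing"]]

Unlike the greedy left-to-right scan described above, this variant returns every valid segmentation, which sidesteps cases where the greedy choice eats too much of the string.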
I am looking to write a basic profanity filter in a Rails-based application. This will use a simple search-and-replace mechanism whenever the appropriate attribute gets submitted by a user. My question, for those who have written these before: is there a CSV file or some database out there from which a list of profanity words can be imported into my database? We are supplying the replacement words ourselves. We more or less need a database of profanities, racial slurs, and anything that's not exactly rated PG-13 to trigger on.
As the Tin Man suggested, this problem is difficult, but it isn't impossible. I've built a commercial profanity filter named CleanSpeak that handles everything mentioned above (leet speak, phonetics, language rules, whitelisting, etc.). CleanSpeak is capable of filtering 20,000 messages per second on a low-end server, so it is possible to build something that works well and performs well. I will mention that CleanSpeak is the result of about 3 years of ongoing development, though.
There are a few things I tell everyone who is looking to tackle a language filter:
Don't use regular expressions unless you have a small list and don't mind a lot of things getting through. Regular expressions are relatively slow overall and hard to manage.
Determine if you want to handle conjugations, inflections and other language rules. These often add a considerable amount of time to the project.
Decide what type of performance you need and whether or not you can make multiple passes on the String. The more passes you make, the slower your filter will be.
Understand the Scunthorpe and clbuttic problems and determine how you will handle them (see the sketch after this list). This usually requires some form of language intelligence and whitelisting.
Realize that whitespace has a different meaning now. You can't use it as a word delimiter any more (b e c a u s e of this)
Be careful with your handling of punctuation because it can be used to get around the filter (l.i.k.e th---is)
Understand how people use ASCII art and Unicode to replace characters (\/ = v, those are slashes). There are a lot of Unicode characters that look like English characters, and you will want to handle those appropriately.
Understand that people make up new profanity all the time by smashing words together (likethis) and figure out if you want to handle that.
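As a tiny illustration of the Scunthorpe/clbuttic point above (the word list and replacement here are invented for the example; this is not how CleanSpeak works), a naive substring search-and-replace misfires like this:

# Naive search-and-replace: every occurrence of a listed word is swapped,
# even when it is only a substring of an innocent word.
REPLACEMENTS = { "ass" => "butt" }

def naive_filter(text)
  REPLACEMENTS.reduce(text) { |t, (word, replacement)| t.gsub(word, replacement) }
end

naive_filter("a classic assessment")  # => "a clbuttic buttessment"

Anchoring the match on word boundaries avoids this particular false positive, but then the obfuscation tricks listed above (punctuation, spacing, character substitution) slip through, which is why real filters end up needing the language intelligence and whitelisting mentioned earlier.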
You can search around StackOverflow for my comments on other threads as I might have more information on those threads that I've forgotten here.
Here's one you could use: Offensive/Profane Word List from CMU site
Based on personal experience, you do understand that it's an exercise in futility?
If someone wants to inject profanity, there's a slew of words that are innocent in one context and profane in another, so you'll have to write a context parser to avoid black-listing clean words. A quick glance at CMU's list shows words I'd never consider rude/crude/socially unacceptable. You'll see there are many words that could be proper names or nouns, countries, terms of endearment, etc. And there are myriad ways to throw your algorithm off using L33T speak and such. Search Wikipedia and the internets and you can build tables of letter variations.
Look at CMU's list and imagine how long the list would be if, in addition to the correct letter, every a could also be 4, o could be 0 or p, e could be 3, s could be 5. And, that's a very, very, short example.
I was asked to do a similar task and wrote code to generate L33T variations of the words, and generated a hit-list of words based on several profanity/offensive lists available on the internet. After running the generator, and being a little over 1/4 of the way through the file, I had over one million entries in my DB. I pulled the plug on the project at that point, because the time spent searching, even using Perl's Regexp::Assemble, was going to be ridiculous, especially since it'd still be so easy to fool.
I recommend you have a long talk with whoever requested this, and ask if they understand the programming issues involved and the low likelihood of accuracy and success, especially over the long term, as well as the possible customer backlash when they realize you're censoring them.
I have one that I've added to (obfuscated a bit) but here it is: https://github.com/rdp/sensible-cinema/blob/master/lib/subtitle_profanity_finder.rb
A pet peeve of mine is the use of double square brackets for Part rather than the single characters \[LeftDoubleBracket] and \[RightDoubleBracket]. I would like to have these automatically replaced when pasting plain-text code (from Stack Overflow, for example) into a Mathematica Notebook, but I have been unable to configure this.
Can it be done with ImportAutoReplacements or another automatic method (preferred), or will I need use a method like the "Paste Tabular Data Palette" referenced here?
Either way, I am not good with string parsing, and I want to learn the best way to handle bracket counting.
Sjoerd gave Defer and Simon gave Ctrl+Shift+N which both cause Mathematica to auto-format code. These are fine options.
I am still interested in a method that is automatic and/or preserves as much of the original code as possible. For example, maintaining prefix f@1, infix 1 ~f~ 2, and postfix 1 // f functions in their original forms.
A subsection of this question was reposted as Matching brackets in a string and received several good answers.
Not really an answer, but a thread on entering the double [[ ]] pair (with the cursor between both pairs) using a single keystroke occurred a couple of weeks ago on the mathgroup. It didn't help me, but for others this was a solution apparently.
EDIT
To make good on my slightly off-topic first response, here's a pattern replacement that seems to do the job (although I have difficulty understanding myself why it should be b and not b_; the latter doesn't work):
Defer[f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]] /.
HoldPattern[Part[b, a_]] -> HoldPattern[b\[LeftDoubleBracket]a\[RightDoubleBracket]]
I leave the automation part to you.
EDIT 2
I discovered that if you add the above rule to ImportAutoReplacements, paste your SO code into a Defer[] in a notebook, and evaluate it, you end up with a usable form with double-bracket characters that can be used as input somewhere else.
EDIT 3
As remarked by Mr.Wizard invisibly below in the comments, the replacement rule isn't necessary. Defer does it on its own! Scientific progress goes "Boink", to cite Bill Watterson.
EDIT 4
The jury is still out on Defer. It has some peculiar side effects and doesn't work well on all expressions. Try the "Paste Tabular Data Palette" in the toolbag question, for instance. Pasting that block of code into Defer and executing gives me this:
It worked much better in another code snippet from the same thread:
The second part is how it looks after turning it into input by editing the output of the first block (basically, I inserted a couple of returns to restore the format). This turns it into Input. Notice that all double brackets turned into the correct corresponding symbol, but notice also the changed position of ReleaseHold.
Simon wrote in a comment, but declined to post as an answer, something fairly similar to what I requested, though it is not automatic on paste, and is not in isolation from other formatting.
(One can) select the text and press Ctrl+Shift+N to translate to StandardForm