Lucene does not index some words? - asp.net-mvc

I use Lucene.Net for my site. It indexes most words correctly, but it doesn't index some words, such as "الله".
I inspected the index with Luke, and it shows that "الله" is not indexed.
I used ArabicAnalyzer for indexing.
You can see my site at www.qoranic.com; if you search for "مریم" it works, but if you search for "الله" it returns nothing.
Any ideas are appreciated. Thanks in advance.

The ArabicAnalyzer does some transformation to that input; it will transform the input الله to له. This is due to the usage of the ArabicStemFilter (and ArabicStemmer) which is documented with ...
Stemming is defined as:
Removal of attached definite article, conjunction, and prepositions.
Stemming of common suffixes.
This shouldn't be an issue since you should be parsing the user provided query through the same analyzer when searching, producing the same tokens.
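To illustrate why using the same analyzer on both sides matters, here is a small toy sketch in Java. It does not use Lucene at all: the analyze method is a hypothetical stand-in that only strips a leading definite article, which is an assumption made for illustration and not the real ArabicStemmer algorithm. The point is just that a raw query term misses the stored token, while the analyzed query term finds it.

```java
import java.util.HashSet;
import java.util.Set;

public class SameAnalyzerDemo {
    // The Arabic definite article (alef + lam), written with Unicode escapes.
    static final String DEFINITE_ARTICLE = "\u0627\u0644";

    // Toy stand-in for an analyzer: strips a leading definite article.
    // This is NOT the real ArabicStemmer, only an illustration.
    public static String analyze(String term) {
        boolean hasArticle = term.startsWith(DEFINITE_ARTICLE)
                && term.length() > DEFINITE_ARTICLE.length();
        return hasArticle ? term.substring(DEFINITE_ARTICLE.length()) : term;
    }

    public static void main(String[] args) {
        String word = "\u0627\u0644\u0644\u0647"; // the word from the question

        // Index time: only the analyzed (stemmed) form is stored.
        Set<String> index = new HashSet<>();
        index.add(analyze(word));

        // Query time: the raw term misses, the analyzed term matches.
        System.out.println(index.contains(word));          // false
        System.out.println(index.contains(analyze(word))); // true
    }
}
```

So if a search returns nothing, it is worth checking that the query string is actually passed through the same analyzer that was used at index time.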
Here's the sample code I used to see what terms an analyzer produced from a given input.
using System;
using Lucene.Net.Analysis.AR;
using Lucene.Net.Analysis.Tokenattributes;
using System.IO;
namespace ConsoleApplication {
    public static class Program {
        public static void Main() {
            var luceneVersion = Lucene.Net.Util.Version.LUCENE_30;
            var input = "الله";
            var analyzer = new ArabicAnalyzer(luceneVersion);
            var inputReader = new StringReader(input);
            var stream = analyzer.TokenStream("fieldName", inputReader);
            var termAttribute = stream.GetAttribute<ITermAttribute>();
            while(stream.IncrementToken()) {
                Console.WriteLine("Term: {0}", termAttribute.Term);
            }
            Console.WriteLine("Done.");
            Console.ReadLine();
        }
    }
}
You can overcome this behavior (remove the stemming) by writing a custom Analyzer which uses the ArabicNormalizationFilter, just as ArabicAnalyzer does, but without the call to ArabicStemFilter.
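using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.AR;
public class CustomAnalyzer : Analyzer {
    public override TokenStream TokenStream(String fieldName, TextReader reader) {
        // Same chain as ArabicAnalyzer, minus the ArabicStemFilter.
        TokenStream result = new ArabicLetterTokenizer(reader);
        result = new LowerCaseFilter(result);
        result = new ArabicNormalizationFilter(result);
        return result;
    }
}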


How to get my object (Generator) from a Map<UUID, List<Generator>> with streams?

I've been wanting to check the location of my Generator and use streams to check if the location is valid.
The idea was as follows;
public Generator getGeneratorFromLocation(final Location location) {
for (List<Generator> generator : playerGeneratorMap.values()) {
for (Generator generator1 : generator) {
if (generator1.getGenLocation().equals(location)) {
return generator1;
}
}
}
return null;
}
I'm wanting to return a Generator from this using streams instead to try and learn more ways of doing it.
Current map:
public final Map<UUID, List<Generator>> playerGeneratorMap = new HashMap<>();
Any help would be greatly appreciated.
You can use an AtomicReference to hold the result and assign the matching Generator to it inside the lambda. Local variables captured by a lambda must be final or effectively final, so a plain local variable cannot be reassigned there.
This method should solve the problem:
public Generator getGeneratorFromLocation(final Location location) {
    AtomicReference<Generator> retVal = new AtomicReference<>(null);
    playerGeneratorMap.values().stream().forEach(generators -> {
        generators.forEach(generator -> {
            // getGenLocation() is the accessor used in the question's code
            if (generator.getGenLocation().equals(location)) {
                retVal.set(generator);
            }
        });
    });
    return retVal.get();
}
By the way, a stream is unnecessary here, because Collection.forEach already does what Stream.forEach does. Streams are more useful for operations such as filter, anyMatch, allMatch, and reduce.
The Stream API is documented on Oracle's website and is worth reading if you want to get into functional programming in Java.
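For comparison, here is a sketch of how the lookup could be written with stream operations instead of forEach: flatMap merges every player's list into one stream, filter keeps the generators at the requested location, and findFirst stops at the first match. The Location and Generator classes below are hypothetical stand-ins for the question's types (only getGenLocation() and playerGeneratorMap come from the question), so treat this as an illustration rather than a drop-in replacement.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

public class GeneratorLookup {
    // Hypothetical stand-in for the question's Location type.
    public static final class Location {
        final int x, y, z;
        public Location(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Location)) return false;
            Location other = (Location) o;
            return x == other.x && y == other.y && z == other.z;
        }
        @Override public int hashCode() { return 31 * (31 * x + y) + z; }
    }

    // Hypothetical stand-in for the question's Generator type.
    public static final class Generator {
        public final String name;
        private final Location genLocation;
        public Generator(String name, Location genLocation) {
            this.name = name;
            this.genLocation = genLocation;
        }
        public Location getGenLocation() { return genLocation; }
    }

    public final Map<UUID, List<Generator>> playerGeneratorMap = new HashMap<>();

    // Flatten all lists into one stream, keep generators at the given
    // location, and stop at the first match.
    public Optional<Generator> findGeneratorAt(Location location) {
        return playerGeneratorMap.values().stream()
                .flatMap(List::stream)
                .filter(g -> g.getGenLocation().equals(location))
                .findFirst();
    }

    // Same contract as the original loop: null when nothing matches.
    public Generator getGeneratorFromLocation(final Location location) {
        return findGeneratorAt(location).orElse(null);
    }
}
```

Returning the Optional directly (findGeneratorAt) is usually the more idiomatic choice, since it makes the "not found" case explicit for callers; the orElse(null) wrapper is only there to keep the original method's behaviour.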

Read from multiple Pubsub subscriptions using ValueProvider

I have multiple Cloud Pub/Sub subscriptions to read from, selected by a prefix pattern, using Apache Beam. I extend the PTransform class and implement expand() to read from each subscription, then apply a Flatten transform to the PCollectionList (one PCollection per subscription). My problem is passing the subscription prefix as a ValueProvider into expand(), since expand() is called at template creation time, not when the job is launched. However, if I use only one subscription, I can pass a ValueProvider into PubsubIO.readStrings().fromSubscription().
Here's some sample code.
public class MultiPubSubIO extends PTransform<PBegin, PCollection<PubsubMessage>> {
private ValueProvider<String> prefixPubsub;
public MultiPubSubIO(@Nullable String name, ValueProvider<String> prefixPubsub) {
super(name);
this.prefixPubsub = prefixPubsub;
}
@Override
public PCollection<PubsubMessage> expand(PBegin input) {
List<String> myList = null;
try {
// prefixPubsub.get() will return error
myList = PubsubHelper.getAllSubscription("projectID", prefixPubsub.get());
} catch (Exception e) {
LogHelper.error(String.format("Error getting list of subscription : %s",e.toString()));
}
List<PCollection<PubsubMessage>> collectionList = new ArrayList<PCollection<PubsubMessage>>();
if(myList != null && !myList.isEmpty()){
for(String subs : myList){
PCollection<PubsubMessage> pCollection = input
.apply("ReadPubSub", PubsubIO.readMessagesWithAttributes().fromSubscription(this.prefixPubsub));
collectionList.add(pCollection);
}
PCollection<PubsubMessage> pubsubMessagePCollection = PCollectionList.of(collectionList)
.apply("FlattenPcollections", Flatten.pCollections());
return pubsubMessagePCollection;
} else {
LogHelper.error(String.format("No subscription with prefix %s found", prefixPubsub));
return null;
}
}
public static MultiPubSubIO read(ValueProvider<String> prefixPubsub){
return new MultiPubSubIO(null, prefixPubsub);
}
}
So I'm wondering how to read from a ValueProvider in the same way that PubsubIO.read().fromSubscription() does. Or am I missing something?
Searched links:
extract-value-from-valueprovider-in-apache-beam - Answer talked about using DoFn, while I need PTransform that receives PBegin.
Unfortunately this is not possible currently:
It is not possible for the value of a ValueProvider to affect transform expansion - at expansion time, it is unknown; by the time it is known, the pipeline shape is already fixed.
There is currently no transform like PubsubIO.read() that can accept a PCollection of topic names. Eventually there will be (it is enabled by Splittable DoFn), but it will take a while - nobody is working on this currently.
If you are using the Python SDK, you can use MultipleReadFromPubSub from the apache_beam.io.gcp.pubsub module: https://beam.apache.org/releases/pydoc/2.27.0/_modules/apache_beam/io/gcp/pubsub.html
topic_1 = PubSubSourceDescriptor('projects/myproject/topics/a_topic')
topic_2 = PubSubSourceDescriptor(
'projects/myproject2/topics/b_topic',
'my_label',
'my_timestamp_attribute')
subscription_1 = PubSubSourceDescriptor(
'projects/myproject/subscriptions/a_subscription')
results = pipeline | MultipleReadFromPubSub(
[topic_1, topic_2, subscription_1])

EPiServer 7 namespace for Locate() isn't resolving

I'm new to EPiServer, and am attempting to retrieve child news article pages from a news list that I created. Examples that I've found online used the Locate() method, but when I attempt to apply Locate to my code, it's not being found.
This is one of the articles that I looked at.
http://world.episerver.com/Blogs/Johan-Bjornfot/Dates1/2012/8/EPiServer7-Working-with-IContentRepositoryDataFactory/
Essentially, I just need to return a list of child articles for a list of news items, so it's possible that the approach that I'm attempting is not right in the first place.
At any rate, this is my current model with the using statements.
using EPiServer;
using EPiServer.Core;
using EPiServer.DataAbstraction;
using EPiServer.DataAnnotations;
using EPiServer.ServiceLocation;
using EPiServer.SpecializedProperties;
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
namespace EPiServerExercise.Models.Pages
{
[ContentType(DisplayName = "News List", GUID = "ac3287b1-4d78-4eb3-bad2-6b5c43530b33", Description = "")]
public class NewsList : BasePage
{
private IEnumerable<NewsArticle> getNewsArticles(NewsList currentPage)
{
//var contentLoader = ServiceLocator.Current.GetInstance<IContentLoader>
//IEnumerable<NewsArticle> newsArticles = new List<NewsArticle>();
//PageReference pageLink = currentPage.ParentLink;
//IEnumerable<NewsArticle> newsArticles = Locate.ContentRepository().GetChildren<IContent>(pageLink);
//IEnumerable<NewsArticle> newsArticles = ServiceLocationHelperExtensions.
//var serviceLocationHelper = ServiceLocator.Current.GetInstance();
//serviceLocationHelper.ContentLoader
}
}
}
What reference am I missing to get the Locate() method to resolve? We are using EPiServer 7 and MVC. Thanks for your help.
Update on 11/18/2014
This is the eventual solution that I put into the model. It's almost identical to what Vsevolod Goloviznin suggested. Thanks.
public string showNewsArticles()
{
IEnumerable<NewsArticle> newsArticles = getNewsArticles(this);
// Code to loop through the articles
}
private IEnumerable<NewsArticle> getNewsArticles(NewsList currentPage)
{
var repository = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentRepository>();
IEnumerable<NewsArticle> newsArticles = repository.GetChildren<NewsArticle>(currentPage.ContentLink);
return newsArticles;
}
It looks like he just uses a factory to get the IContentRepository and forgot to mention it. To get the same functionality, you can use the ServiceLocator to obtain the IContentRepository and then get all children for your page:
var service = EPiServer.ServiceLocation.ServiceLocator.Current.GetInstance<IContentRepository>();
var pages = service.GetChildren<NewsArticle>(currentPage.ParentLink).ToList();

NewtonSoft json Contract Resolver with MVC 4.0 Web Api not producing the output as expected

I am trying to create a conditional ContractResolver so that I can control the serialization differently depending on the web request/controller action.
For example, in my User controller I want to serialize all properties of my User, but for some of the related objects I might only serialize the primitive types. If I went to my Company controller, I would want to serialize all the properties of the company but maybe only the primitive ones of the user (because of this I don't want to use DataAnnotations or ShouldSerialize functions).
So, looking at the custom ContractResolver page, I created my own.
http://james.newtonking.com/projects/json/help/index.html?topic=html/ContractResolver.htm
It looks like this
public class IgnoreListContractResolver : DefaultContractResolver
{
private readonly Dictionary<string, List<string>> IgnoreList;
public IgnoreListContractResolver(Dictionary<string, List<string>> i)
{
IgnoreList = i;
}
protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
{
List<JsonProperty> properties = base.CreateProperties(type, memberSerialization).ToList();
if(IgnoreList.ContainsKey(type.Name))
{
properties.RemoveAll(x => IgnoreList[type.Name].Contains(x.PropertyName));
}
return properties;
}
}
And then in my Web API controller action for GetUsers I do this:
public dynamic GetUsers()
{
List<User> Users = db.Users.ToList();
List<string> RoleList = new List<string>();
RoleList.Add("UsersInRole");
List<string> CompanyList = new List<string>();
CompanyList.Add("CompanyAccesses");
CompanyList.Add("ArchivedMemberships");
CompanyList.Add("AddCodes");
Dictionary<string, List<string>> IgnoreList = new Dictionary<string, List<string>>();
IgnoreList.Add("Role", RoleList);
IgnoreList.Add("Company", CompanyList);
GlobalConfiguration
.Configuration
.Formatters.JsonFormatter
.SerializerSettings
.ContractResolver = new IgnoreListContractResolver(IgnoreList);
return new { List = Users, Status = "Success" };
}
So when debugging this I see my contract resolver run and it returns the correct properties but the Json returned to the browser still contains entries for the properties I removed from the list.
Any ideas what I am missing or how I can step into the Json serialization step in webapi controllers.
UPDATE
I should add that this is in an MVC4 project that has both MVC controllers and webapi controllers. The User, Company, and Role objects are objects (created by code first) that get loaded from EF5. The controller in question is a web api controller. Not sure why this matters but I tried this in a clean WebApi project (and without EF5) instead of an MVC project and it worked as expected. Does that help identify where the problem might be?
Thanks
UPDATE 2
In the same MVC4 project I created an extension method for the Object class called ToJson. It uses Newtonsoft.Json.JsonSerializer to serialize my entities. It's this simple.
public static string ToJson(this object o, Dictionary<string, List<string>> IgnoreList)
{
JsonSerializer js = JsonSerializer.Create(new Newtonsoft.Json.JsonSerializerSettings()
{
Formatting = Formatting.Indented,
DateTimeZoneHandling = DateTimeZoneHandling.Utc,
ContractResolver = new IgnoreListContractResolver(IgnoreList),
ReferenceLoopHandling = ReferenceLoopHandling.Ignore
});
js.Converters.Add(new Newtonsoft.Json.Converters.StringEnumConverter());
var jw = new StringWriter();
js.Serialize(jw, o);
return jw.ToString();
}
And then in an MVC action I create a JSON string like this.
model.jsonUserList = db.Users.ToList().ToJson(IgnoreList);
Where the ignore list is created exactly like my previous post. Again I see the contract resolver run and correctly limit the properties list but the output json string still contains everything (including the properties I removed from the list). Does this help? I must be doing something wrong and now it seems like it isn't the MVC or web api framework. Could this have anything to do with EF interactions/ proxies /etc. Any ideas would be much appreciated.
Thanks
UPDATE 3
Process of elimination and a little more thorough debugging made me realize that EF 5 dynamic proxies were messing up my serialization and ContractResolver check for the type name match. So here is my updated IgnoreListContractResolver. At this point I am just looking for opinions on better ways or if I am doing something terrible. I know this is jumping through a lot of hoops just to use my EF objects directly instead of DTOs but in the end I am finding this solution is really flexible.
public class IgnoreListContractResolver : CamelCasePropertyNamesContractResolver
{
private readonly Dictionary<string, List<string>> IgnoreList;
public IgnoreListContractResolver(Dictionary<string, List<string>> i)
{
IgnoreList = i;
}
protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
{
List<JsonProperty> properties = base.CreateProperties(type, memberSerialization).ToList();
string typename = type.Name;
if(type.FullName.Contains("System.Data.Entity.DynamicProxies.")) {
typename = type.FullName.Replace("System.Data.Entity.DynamicProxies.", "");
typename = typename.Remove(typename.IndexOf('_'));
}
if (IgnoreList.ContainsKey(typename))
{
//remove anything in the ignore list and ignore case because we are using camel case for json
properties.RemoveAll(x => IgnoreList[typename].Contains(x.PropertyName, StringComparer.CurrentCultureIgnoreCase));
}
return properties;
}
}
I think it might help if you used Type instead of string for the ignore list's key type. So you can avoid naming issues (multiple types with the same name in different namespaces) and you can make use of inheritance. I'm not familiar with EF5 and the proxies, but I guess that the proxy classes derive from your entity classes. So you can check Type.IsAssignableFrom() instead of just checking whether typename is a key in the ignore list.
private readonly Dictionary<Type, List<string>> IgnoreList;
protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
{
List<JsonProperty> properties = base.CreateProperties(type, memberSerialization).ToList();
// look for the first dictionary entry whose key is a superclass of "type"
Type key = IgnoreList.Keys.FirstOrDefault(k => k.IsAssignableFrom(type));
if (key != null)
{
//remove anything in the ignore list and ignore case because we are using camel case for json
properties.RemoveAll(x => IgnoreList[key].Contains(x.PropertyName, StringComparer.CurrentCultureIgnoreCase));
}
return properties;
}
Then the ignore list must be created like this (I also used the short syntax for creating the list and dictionary):
var CompanyList = new List<string> {
"CompanyAccesses",
"ArchivedMemberships",
"AddCodes"
};
var IgnoreList = new Dictionary<Type, List<string>> {
// I just replaced "Company" with typeof(Company) here:
{ typeof(Company), CompanyList }
};
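Be aware that, with the code above, adding typeof(object) as a key to the ignore list can cause that entry to be matched for every type, so that none of your other entries are ever used. This happens because a variable of type object is assignable from every other type, and FirstOrDefault returns whichever matching key it encounters first. Note also that Dictionary does not guarantee any particular enumeration order, so you should not rely on the order in which keys were added.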
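The assignability lookup and its pitfall are not specific to .NET. As an illustration only, here is the same idea sketched in Java, where Class.isAssignableFrom plays the role of Type.IsAssignableFrom. Company and CompanyProxy are hypothetical classes standing in for an entity and its generated proxy subclass, and a LinkedHashMap is used so that the iteration order is the insertion order.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class IgnoreListLookup {
    // Hypothetical entity, and a subclass standing in for a generated proxy.
    public static class Company {}
    public static class CompanyProxy extends Company {}

    // Returns the ignore list of the first key that is a supertype of "type",
    // or an empty list when no key matches.
    public static List<String> ignoredFor(Map<Class<?>, List<String>> ignoreList,
                                          Class<?> type) {
        return ignoreList.entrySet().stream()
                .filter(e -> e.getKey().isAssignableFrom(type))
                .map(Map.Entry::getValue)
                .findFirst()
                .orElse(Collections.emptyList());
    }
}
```

With a map containing only Company.class, looking up CompanyProxy.class returns the Company entry, which is the behaviour you want for proxies. If Object.class is inserted first into a LinkedHashMap, every lookup returns the Object entry instead, which is the pitfall described above.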

How to Unit Test JsonResult and Collections in MSTest

I am very new to unit testing even though I have been coding for a very long time. I want to make it part of the way I develop. I get stuck on how to unit test things like a collection. I generally have my jQuery script calling ASP.NET server-side methods to get data and populate tables and the like. They look like
Get_*Noun*()
which generally returns a JsonResult. Any ideas on what and how to test these using Unit tests using MSTest?
You should be able to test this just like anything else, provided you can extract the values from the JsonResult. Here's a helper that will do that for you:
private T GetValueFromJsonResult<T>(JsonResult jsonResult, string propertyName)
{
var property =
jsonResult.Data.GetType().GetProperties()
.Where(p => string.Compare(p.Name, propertyName) == 0)
.FirstOrDefault();
if (null == property)
throw new ArgumentException("propertyName not found", "propertyName");
return (T)property.GetValue(jsonResult.Data, null);
}
Then call your controller as usual, and test the result using that helper.
var jsonResult = yourController.YourAction(params);
bool testValue = GetValueFromJsonResult<bool>(jsonResult, "PropertyName");
Assert.IsFalse(testValue);
(I am using NUnit syntax, but MSTest shouldn't be far off.)
You could test your JsonResult like this:
var json = Get_JsonResult();
dynamic data = json.Data;
Assert.AreEqual("value", data.MyValue);
Then in the project that contains the code to be tested, edit AssemblyInfo.cs file to allow the testing assembly access to the anonymous type:
[assembly: InternalsVisibleTo("Tests")]
This is so that the dynamic binder can access the anonymous type of the object returned in json.Data.
This is the best blog I've found on this subject.
My favorite was the 4th approach using dynamics. Note that it requires you to ensure that the internals are visible to your test project using [assembly:InternalsVisibleTo("TestProject")] which I find is a reasonably good idea in general.
[TestMethod]
public void IndexTestWithDynamic()
{
//arrange
HomeController controller = new HomeController();
//act
var result = controller.Index() as JsonResult;
//assert
dynamic data = result.Data;
Assert.AreEqual(3, data.Count);
Assert.IsTrue(data.Success);
Assert.AreEqual("Adam", data.People[0].Name);
}
You could use PrivateObject to do this.
var jsonResult = yourController.YourAction(params);
var success = (bool)(new PrivateObject(jsonResult.Data, "success")).Target;
Assert.IsTrue(success);
var errors = (IEnumerable<string>)(new PrivateObject(jsonResult.Data, "errors")).Target;
Assert.IsTrue(!errors.Any());
It uses reflection, similar to David Ruttka's answer, but it'll save you a few keystrokes.
See http://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.unittesting.privateobject.aspx for more info.
Here's a small extension to easily convert a Json ActionResult into the object it represents.
using System.Web.Mvc;
public static class WebExtensions
{
public static T ToJson<T>(this ActionResult actionResult)
{
var jsonResult = (JsonResult)actionResult;
return (T)jsonResult.Data;
}
}
With this, your 'act' in the test becomes smaller:
var myModel = myController.Action().ToJson<MyViewModel>();
My suggestion would be to create a model for the data returned and then cast the result into that model. That way you can verify:
the structure is correct
the data within the model is correct
// Assert
var result = action
.AssertResultIs<JsonResult>();
var model = (UIDSearchResults)result.Data;
Assert.IsTrue(model.IsValid);
Assert.AreEqual("ABC", model.UIDType);
Assert.IsNull(model.CodeID);
Assert.AreEqual(4, model.PossibleCodes.Count());
