How to index sub-content in Sitecore with Lucene? - asp.net-mvc

I'm using Sitecore 7.2 with MVC and a component approach to page building. This means that pages are largely empty and the content comes from the various renderings placed on the page. However, I would like the search results to return the main pages, not the individual content pieces.
Here is the basic code I have so far:
public IEnumerable<Item> GetItemsByKeywords(string[] keywords)
{
    var index = ContentSearchManager.GetIndex("sitecore_master_index");
    var allowedTemplates = new List<ID>();
    IEnumerable<Item> items;
    // Only Page templates should be returned
    allowedTemplates.Add(new Sitecore.Data.ID("{842FAE42-802A-41F5-96DA-82FD038A9EB0}"));
    using (var context = index.CreateSearchContext(SearchSecurityOptions.EnableSecurityCheck))
    {
        var keywordsPredicate = PredicateBuilder.True<SearchResultItem>();
        // Start the OR chain from False; True.Or(...) would match every template
        var templatePredicate = PredicateBuilder.False<SearchResultItem>();
        SearchResults<SearchResultItem> results;
        // Only return results from allowed templates
        templatePredicate = allowedTemplates.Aggregate(templatePredicate, (current, t) => current.Or(p => p.TemplateId == t));
        // Add keywords to predicate
        foreach (string keyword in keywords)
        {
            keywordsPredicate = keywordsPredicate.And(p => p.Content.Contains(keyword));
        }
        results = context.GetQueryable<SearchResultItem>().Where(keywordsPredicate).Filter(templatePredicate).GetResults();
        // Materialize the items before the search context is disposed
        items = results.Hits.Select(hit => hit.Document.GetItem()).ToList();
    }
    return items;
}

You could create a computed field in the index which looks at the renderings on the page and resolves each rendering's data source item. Once you have each of those items you can index their fields and concatenate all of this data together.
One option is to do this in the built-in "content" computed field, which is the field that full-text search uses.

An alternative solution is to make an HttpRequest back to your published site and essentially scrape the HTML. This ensures that all renderings are included in the index.
You probably will not want to index common parts, like the menu and footer, so make use of HtmlAgilityPack or FizzlerEx to only return the contents of a particular parent container. You could get more clever and remove inner containers if you needed to. Just remember to strip out the HTML tags as well :)
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;
//get the page
var web = new HtmlWeb();
var document = web.Load("http://localsite/url-to-page");
var page = document.DocumentNode;
var content = page.QuerySelector("div.main-content");

Related

How to embed html in Google Forms question's description

I am trying to adapt Zack Akil's script to generate a Google Form from a Google Sheet using Apps Script, but I am struggling to get the sheet's input parsed as HTML. When I generate a form from my sheet, all the cell text is placed in the form as plain text; the HTML is not parsed (see figure below).
I have pasted Zack's script below and kindly ask you to point out what I should modify so that the HTML is parsed in the form.
function getSpreadsheetData(sheetName) {
// Return a list of objects (one for each row) containing the sheet's data.
var arrayOfArrays = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName || 'Sheet1').getDataRange().getValues();
var headers = arrayOfArrays.shift();
return arrayOfArrays.map(function (row) {
return row.reduce(function (memo, value, index) {
if (value) {
memo[headers[index]] = value;
}
return memo;
}, {});
});
}
function create_ranges_for_data(form, data, data_section_name){
// loop through each row
data.forEach(function (row) {
// create a new question page
form.addPageBreakItem()
// add page title
form.addSectionHeaderItem()
.setTitle(data_section_name);
// create number range input with the title being the document to be labeled
form.addScaleItem()
.setTitle(row[data_section_name])
.setBounds(1, 10)
.setRequired(true);
});
}
function make_form_using_column(column_name) {
// create a new Google Form document
var form = FormApp.create('Data labelling - ' + column_name);
var desc = "Thank you for taking the time to label this data!";
form.setDescription(desc);
form.setProgressBar(true);
form.setShowLinkToRespondAgain(false)
var data = getSpreadsheetData();
create_ranges_for_data(form, data, column_name);
}
function gen_form(){
var COLUMN_TO_USE = 'Input text'
make_form_using_column(COLUMN_TO_USE);
}
You can't use HTML text formatting. Most sites block it because it poses a security risk. You might need to install an add-on or, as fullfine said, use bold text.
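Since the form shows markup as literal text, one workaround is to reduce each cell to plain text before it becomes a question title. The following is only a minimal sketch: htmlToPlainText is a hypothetical helper, not part of Zack Akil's script, and regex-based stripping is crude, so it only suits simple, trusted markup.

```javascript
// Hypothetical helper (an assumption, not part of the original script):
// reduce an HTML cell value to plain text.
function htmlToPlainText(html) {
  return String(html)
    .replace(/<br\s*\/?>/gi, "\n") // keep explicit line breaks as newlines
    .replace(/<[^>]*>/g, "")       // drop every remaining tag
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")        // decode &amp; last so "&amp;lt;" becomes "&lt;", not "<"
    .trim();
}
```

In create_ranges_for_data this could be applied as .setTitle(htmlToPlainText(row[data_section_name])).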

Calling multiple list of item in View using Tuple in MVC

When I use four lists in the view, it works properly; in my example, everything up to PreClearanceDetail works for all four lists. After adding a fifth one, FAQ, I get the compilation error below.
Kindly advise, because as far as I know a tuple can take any number of arguments.
My controller code is :
[HttpGet]
public ActionResult AdminDashboard()
{
DB_Entities entities = new DB_Entities();
List<Menu> MenuVM= entities.Menu.ToList();
List<TradeClose> tradeclose = entities.TradeCloses.ToList();
List<NonComplianceCas> cases = entities.NonComplianceCases.ToList();
List<PreClearanceDetail> data = entities.PreClearanceDetails.Where(a => a.Flag == null).ToList();
List<FAQ> faqs = entities.FAQs.ToList();
List<Annoucement> Annoucements = entities.Annoucements.ToList();
return View(Tuple.Create(MenuVM, tradeclose, cases, data, faqs));
}
My View code is :
@model Tuple<List<MenuVM>, List<TradeClose>, List<NonComplianceCas>, List<PreClearanceDetail>, List<FAQ>>

Partial CellFeed load

Happy new year, folks,
Currently, I'm accessing and loading a Google Sheets worksheet using the following, default way:
URL metafeedUrl = new URL(SPREADSHEET_URL);
SpreadsheetEntry spreadsheet = service.getEntry(metafeedUrl, SpreadsheetEntry.class);
URL cellFeedUrl = ((WorksheetEntry) spreadsheet.getWorksheets().get(0)).getCellFeedUrl();
// Get entries as cells
feed = (CellFeed) service.getFeed(cellFeedUrl, CellFeed.class);
Then I work with it, etc. Everything works just fine.
The problem:
I'm about to deploy the application and have it work with a Worksheet that has several hundred, if not thousand rows of cells. To me, the only relevant rows are usually the 100-200 bottom ones.
Is there a way to partially load a CellFeed, preferably from the bottom up? Does the API provide such a way?
Looking at the API itself, you can do this with either the cell feed or the list feed.
For the cell feed, see https://developers.google.com/google-apps/spreadsheets/#fetching_specific_rows_or_columns
There you can specify the minimum and maximum rows and columns to fetch, and the page also includes a Java example.
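As a rough sketch of the cell feed approach, the row limits can be appended to the cell feed URL as the min-row and max-row query parameters. bottomRowsCellFeedUrl is a hypothetical helper name, and the sketch assumes you already know the number of the last row; it only builds the URL and does not perform the request.

```javascript
// Hypothetical helper: append min-row/max-row to a cell feed URL so that
// only the bottom `rowCount` rows are requested. `lastRow` is assumed known.
function bottomRowsCellFeedUrl(cellFeedUrl, lastRow, rowCount) {
  var minRow = Math.max(1, lastRow - rowCount + 1); // rows are 1-based
  var separator = cellFeedUrl.indexOf("?") === -1 ? "?" : "&";
  return cellFeedUrl + separator + "min-row=" + minRow + "&max-row=" + lastRow;
}
```

For a sheet with 1000 rows, asking for the bottom 200 yields min-row=801 and max-row=1000.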
A more efficient way to get your data is the list (row) feed, as it sends fewer bytes in return: https://spreadsheets.google.com/feeds/list/
Use it with the undocumented "start-index" parameter so that it only reads starting at that row.
I use this, and it works for both the "old" and the "new" sheets.
The first time, you will need to get all rows (or attempt some sort of binary search to find the last spreadsheet row).
I have not used the Java API library; it probably does not support that undocumented parameter. You can always issue an HTTP GET directly from Java or any other language and call the spreadsheet API over HTTPS.
I got this tip a long time ago from here:
https://groups.google.com/forum/#!topic/google-spreadsheets-api/dSniiF18xnM
and I use it in this GitHub project (JavaScript AJAX call example):
https://github.com/zmandel/Plus-for-Trello/blob/master/source/sql.js
I don't know of any CellFeed option for this, but you can create an HTML feed that retrieves a number of rows from any spreadsheet and then process that HTML. Would that work for you? What are your goals when retrieving the information?
Eg.
Code.gs
function doGet() {
return HtmlService.createTemplateFromFile("form").evaluate().setSandboxMode(HtmlService.SandboxMode.NATIVE);
}
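function getLastLines( numLines, ssId, sheetName ){
var sheet = SpreadsheetApp.openById(ssId).getSheetByName(sheetName);
// URL parameters arrive as strings, so convert before doing row arithmetic
var count = Number(numLines);
// Start at lastRow - count + 1 so that the last row itself is included
return JSON.stringify(sheet.getRange(sheet.getLastRow() - count + 1, 1, count, sheet.getLastColumn()).getValues());
}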
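The "last N rows" arithmetic is easy to get off by one, and it also breaks when more rows are requested than the sheet contains. As a small sketch, the calculation can be isolated in a pure function; lastRowsRange is a hypothetical helper, not part of the answer's script.

```javascript
// Hypothetical helper: 1-based start row and row count for the last
// `numLines` rows of a sheet whose last populated row is `lastRow`.
function lastRowsRange(lastRow, numLines) {
  // Accept strings (URL parameters), treat invalid input as 0, never exceed the sheet.
  var count = Math.min(Math.max(0, Number(numLines) || 0), lastRow);
  return { startRow: lastRow - count + 1, numRows: count };
}
```

The result would feed sheet.getRange(r.startRow, 1, r.numRows, sheet.getLastColumn()); a numRows of 0 should be handled before calling getRange, since an empty range is not valid.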
in form.html
<div id="arrayPlace"></div>
<script>
function changeDiv( res ){
document.getElementById("arrayPlace").innerHTML = res;
}
function getUrlParameter(sParam)
{
var sPageURL = window.location.search.substring(1);
var sURLVariables = sPageURL.split('&');
for (var i = 0; i < sURLVariables.length; i++)
{
var sParameterName = sURLVariables[i].split('=');
if (sParameterName[0] == sParam)
{
return sParameterName[1];
}
}
}
var numLines = getUrlParameter("numLines");
var ssId = getUrlParameter("ssId");
var sheetName = getUrlParameter("sheetName");
google.script.run.withSuccessHandler( changeDiv ).getLastLines( numLines, ssId, sheetName );
</script>
And the URL would have the additional parameters:
https://script.google.com/macros/s/AKfycbwFLN-qqTcXXAVgR-aDa9h61yTa39kVhE2MwX9htRbIm2NN5I4/exec?numLines=5&ssId=1dJlNmtvcsWixEDnUz7GxnyLMZKXHwA-9uopYPUC8I4E&sheetName=Sheet1

MVC3 EF Search DB fields but strip HTML

I am using Entity Framework with MVC3 and am trying to search a description field, but the field contains HTML, e.g. "< div class="section" />". Can I run a search that only looks at the text outside the HTML tags?
return context.Categories
.Where(i =>
i.Name.Contains(searchText)
&& i.Description.Contains(searchText)
)
Thanks in advance!
Give HtmlAgilityPack a go. It has methods for extracting the text out of an HTML Document.
You basically just need to do the following:
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);
var node = doc.DocumentNode;
var textContent = node.InnerText;
Or the much less awesome method:
public static string StripHTML(string htmlString)
{
string pattern = @"<(.|\n)*?>";
return Regex.Replace(htmlString, pattern, string.Empty);
}
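All together (StripHTML cannot be translated to SQL, so the filter runs in memory after the rows are loaded):
return context.Categories
.AsEnumerable()
.Where(i => i.Name.Contains(searchText) && StripHTML(i.Description).Contains(searchText));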

JQGrid Loading lots of data

SITUATION
I am using Trirand jqGrid for MVC (server side) in my project.
I have more than five hundred thousand records in a single table.
I load the data by calling this piece of code, which produces a collection of 500,000 records.
IEnumerable<myIndexViewModel> myviewmodel= _allincidents.Select(x => new myIndexViewModel
{
IncidentRequestStatus = x.RequestStatus,
RequestByUserName = x.RequestByUserName,
Subject = x.Subject
});
gridModel.JqGrid.DataBind(myviewmodel.AsQueryable());
jqGrid handles the JSON-based AJAX requests very nicely for every next page I click.
PROBLEM
I don't want to load five hundred thousand records all at once on the page load event, as it kills jqGrid.
If I write a stored procedure in the DB that returns a specific page, then only that page will be loaded into the myviewmodel collection.
How do I get pages on the fly from the DB when the next page is clicked? Is this even possible in jqGrid?
SITUATION 2
Based on the answers from VIJAY and MARK, the approach they have shown is absolutely correct, but here the jqGrid for MVC sets up the DataUrl property for making the method call. In this case it is IncidentGridRequest.
How do I send in the page number when the grid's next page or previous page is clicked?
incidentModel.IncidentGrid.DataUrl = Url.Action("IncidentGridRequest")
public JsonResult IncidentGridRequest()
{
}
Your controller action that will provide your grid with results can accept some extra information from jqGrid.
public ActionResult GetGridData(string sidx, string sord, int page, int rows, bool _search, string filters)
The main parts you are interested in are page and rows (sidx is the column to sort by, sord is the sort order, _search indicates whether a search was done on the grid, and if so, filters contains the search information).
When you generate your results, you should then be able to page them like this:
IEnumerable<myIndexViewModel> myviewmodel = allincidents.Select(x => new myIndexViewModel
{
IncidentRequestStatus = x.RequestStatus,
RequestByUserName = x.RequestByUserName,
Subject = x.Subject
}).Skip((page - 1) * rows).Take(rows);
PS. I'm not sure whether using IEnumerable will move a large amount of data from your DB, but you might want to use IQueryable when you generate this subset of data for the jqGrid.
Edit: To deal with your paging issues, you should calculate the total number of records in your query (before Skip/Take is applied) and pass that value to the grid, e.g.
int totalRecords = myviewmodel.Count();
and then later you would pass that to your grid as a JSON value, e.g.
var jsonData = new
{
total = (totalRecords + rows - 1) / rows,
page = page,
records = totalRecords,
userdata = new {SearchResultsFound = searchResultsFound},
rows = (
......
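To make the paging arithmetic concrete, here is a small sketch on an in-memory array. It is written in JavaScript only for illustration; in the C# above, the same slicing is done by Skip/Take against the query. pageOf is a hypothetical helper, and the field names total, page, records and rows are taken from the JSON object shown above.

```javascript
// Hypothetical helper: build a jqGrid-style response for one page.
// `page` is 1-based, `rows` is the page size.
function pageOf(items, page, rows) {
  var totalRecords = items.length;          // count BEFORE slicing
  var start = (page - 1) * rows;            // same offset as Skip((page - 1) * rows)
  return {
    total: Math.ceil(totalRecords / rows),  // number of pages
    page: page,
    records: totalRecords,
    rows: items.slice(start, start + rows)  // same as Take(rows)
  };
}
```

With 25 records and a page size of 10, page 3 holds records 21 to 25 and total is 3.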
Yes. For example, if you accept the page number you want to turn to in a variable named page and have the page size in a variable named pageSize, then:
IEnumerable<myIndexViewModel> myviewmodel = allincidents.Select(x => new myIndexViewModel
{
IncidentRequestStatus = x.RequestStatus,
RequestByUserName = x.RequestByUserName,
Subject = x.Subject
}).Skip((page-1)*pageSize).Take(pageSize);
will give you a single page of pageSize records.
The Trirand jqGrid for ASP.NET MVC uses the IQueryable interface inside the JqGrid.DataBind() method to implement paging, sorting and filtering.
So the key here is to use a data source that handles these operations at the database level (by crafting SQL queries so that only the required data is fetched). All major ORMs support this, including LINQ to SQL, Entity Framework, NHibernate and LLBLGen.
You just need to use one of these technologies and pass the required context directly to the JqGrid.DataBind() method (without extracting the data manually as you do in your sample).
An easier approach is to use the PagedList library (from NuGet). There is a useful blog post by Joseph Schrag.
public JsonResult Users(int PageNo, int Rows)
{
var UserList = db.Users.Select(t => new
{
t.UserId,
t.Username,
t.Firstname,
t.Lastname,
t.Designation,
t.Country,
t.Email
}).OrderBy(t => t.UserId);
var pagedUserList = UserList.ToPagedList(PageNo, Rows);
var results = new
{
total = pagedUserList.PageCount, //number of pages
page = pagedUserList.PageNumber, //current page
records = UserList.Count(), //total items
rows = pagedUserList
};
return new JsonResult() { Data = results, JsonRequestBehavior = JsonRequestBehavior.AllowGet };
}
