Searching through pdf files on SharePoint

Searching through pdf files on SharePoint - sharepoint-2007

I upload documents to a SharePoint list and can search their contents when they are .doc or .txt files. However, the search results do not include .pdf or .docx files.
Is there something I need to install or configure? Thank you!

SharePoint uses IFilter plugins to index the contents of non-trivial and non-Office files. Try installing the Adobe PDF IFilter. There's also documentation explaining how to configure it.

SharePoint uses IFilters to index the content of any type of document uploaded to it. For Office products such as Word, Excel, and PowerPoint, the filter is provided with the installation. For any other document type, such as PDF, ZIP, or TIFF, you have to install the respective IFilter. For PDF documents, the latest IFilter can be downloaded from the Adobe site. After installing the IFilter, you also need to add the file extension under File Types in your SSP search settings.

Related

Not able to create an edit link for an excel file using microsoft graph API

I want to create an editable link for an Excel file stored in my personal OneDrive account. According to the documentation (Microsoft-Graph-Onedrive-docs), I should be able to use the createLink API with the payload {"type": "edit", "view": "anonymous"}. But even after doing that, the link opens a read-only Excel file in Excel Online.
How do I open an editable Excel file in Excel Online?
I am using personal OneDrive and calling the APIs via Microsoft Graph Explorer.

If you have a .doc or .xls document, you cannot edit it. You'll need one of the newer Word or Excel formats, such as .docx or .xlsx.
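
For comparison, here is a minimal sketch of the createLink request as the Graph documentation describes it. Note that the documented body field is `scope`, not `view`; whether that has anything to do with the read-only behaviour in the question is not established. The helper name `build_create_link_request` and the placeholder item ID and token are assumptions for illustration. The request is only built, not sent.

```python
import json
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_create_link_request(item_id, access_token, link_type="edit", scope="anonymous"):
    """Build (but do not send) a createLink request for a OneDrive item.

    item_id and access_token are placeholders supplied by the caller.
    """
    # Documented body fields are "type" and "scope".
    body = json.dumps({"type": link_type, "scope": scope}).encode("utf-8")
    return urllib.request.Request(
        url=f"{GRAPH_BASE}/me/drive/items/{item_id}/createLink",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Graph Explorer does the same thing once you pick POST and paste in the URL and body.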

Impossible to upload documents on folder of document libraries with custom path

We are connecting to a SharePoint site that has multiple document libraries. Some of the document libraries are created with a custom path. To do this we use the "url" property as described in the documentation (https://msdn.microsoft.com/EN-US/library/office/microsoft.sharepoint.client.listcreationinformation_members.aspx).
As a result, the document libraries have the following paths in our SharePoint site:
/sites/customers/CUSTOM_PATH/DocLib1/
/sites/customers/CUSTOM_PATH/DocLib2/
/sites/customers/CUSTOM_PATH/DocLib3/
...
Those document libraries also have multiple folders.
When uploading a document, for instance, to the folder ABC of DocLib1, the documents are not uploaded to the ABC folder. Instead, we have new folders "DocLib1/ABC" being created in the document library.
To upload the document via the Graph API we use the following endpoint:
https://graph.microsoft.com/v1.0/drives/DRIVE_ID/items/ABC_FOLDER_ID:/FILE_NAME:/createUploadSession
Reproduction scenario:
1. Create a document library DocLib1 using a custom URL path (see the URL property in https://msdn.microsoft.com/EN-US/library/office/microsoft.sharepoint.client.listcreationinformation_members.aspx).
2. Create a folder ABC in that document library.
3. Create an upload session for that folder using https://graph.microsoft.com/v1.0/drives/DRIVE_ID/items/ABC_FOLDER_ID:/FILE_NAME:/createUploadSession
4. Upload the document to the returned uploadUrl.
Expected result: The document is uploaded in the ABC folder of document library DocLib1
Actual result: The document is uploaded in new folders "DocLib1/ABC" created in document library DocLib1
Have you faced this issue before? Are you aware of any workaround?
Best regards,
Cyril.
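
For anyone trying to reproduce this, the last two steps can be sketched roughly as below. This only builds the session URL in the form used above and computes the Content-Range header for each chunk sent to the returned uploadUrl. It does not explain or fix the folder behaviour. The function names are made up for the example. The 320 KiB multiple is the chunk-size rule from the Graph upload session documentation.

```python
from urllib.parse import quote

# Graph asks for chunk sizes that are a multiple of 320 KiB (327,680 bytes).
CHUNK_MULTIPLE = 320 * 1024


def upload_session_url(drive_id, folder_id, file_name):
    """Return the createUploadSession URL for a file inside a folder item."""
    return (
        "https://graph.microsoft.com/v1.0"
        f"/drives/{drive_id}/items/{folder_id}:/{quote(file_name)}:/createUploadSession"
    )


def content_ranges(total_size, chunk_size=10 * CHUNK_MULTIPLE):
    """Return the Content-Range header value for each chunk, in upload order."""
    if chunk_size <= 0 or chunk_size % CHUNK_MULTIPLE != 0:
        raise ValueError("chunk_size must be a positive multiple of 320 KiB")
    ranges = []
    start = 0
    while start < total_size:
        # The end offset is inclusive, so the last byte index is size - 1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes {start}-{end}/{total_size}")
        start = end + 1
    return ranges
```

Each chunk is then sent with a PUT to the uploadUrl, using the matching Content-Range value and a Content-Length equal to the chunk's byte count.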

Convert doc to pdf programmatically without using WORD / thirdparty tools

Is it possible to convert a .doc file to a .pdf file programmatically, without using the Word application or third-party tools? Preferably in Delphi XE4. If so, how?

Yes, you can convert .doc/.docx files to .pdf without Word and without third-party controls. The specifications are publicly available - [simply] read and parse the .doc/.docx file according to the specification and generate the content according to the .pdf specification.
Here is the specification for MS-DOC (.doc) file format :
MS-DOC Specification (622 pages) -- Word97 through 2007
MS-DOCX Extensions Specification (105 pages) -- Word2010 through 2013
See also - Open Document and OpenXML Format
And the specification for the .pdf format :
PDF Reference (1310 pages)
Really, I think you'll find you probably want to use a third party component...
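
To give a feel for the "read and parse" half, here is a minimal sketch for the .docx case only, in Python rather than Delphi. It relies on the fact that a .docx file is a ZIP archive whose main body lives in word/document.xml. It pulls out plain paragraph text and ignores styles, tables, images, and everything else a real converter would need. The binary .doc format and the PDF-writing half are far more work. The function name `docx_paragraphs` is made up for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx file.

    source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph; its visible text sits in <w:t> elements inside runs.
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Even this small piece shows why the last line of the answer above is probably right: getting the text out is the easy part.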

Apache Tika Office to PDF conversion

I am trying to convert Office files to PDF using POI and iText. I am able to do the basic conversion, where I read the Word file using WordExtractor and write the contents to a PDF file using PdfWriter.
However, this does not retain the structure (tables, styles, etc.). I have read on this forum that you can retain the formatting using Tika. Are there any working examples of this?

Grabbing text from various document formats in Ruby on Rails

I'm new to Rails but am developing a web app that requires taking text from a large collection of text files and displaying the text in HTML. The files are in .doc, .docx, .wps, and .pages, and are currently just sitting on a hard drive. There are few enough .wps and .pages files that I could convert them to .doc manually, but the question remains: how do I get to the text inside a .doc or .docx file so that I can save it into a SQLite database for later use?
Thanks!

Take a look at Yomu. It's a gem that acts as a wrapper for Apache Tika, and it supports a variety of document formats, including the following:
Microsoft Office OLE 2 and Office Open XML Formats (.doc, .docx, .xls, .xlsx, .ppt, .pptx)
OpenOffice.org OpenDocument Formats (.odt, .ods, .odp)
Apple iWorks Formats
Rich Text Format (.rtf)
Portable Document Format (.pdf)

It's a long, roundabout way, but OpenOffice can convert files, and there are programmatic ways to do that: http://railstech.com/2010/08/convert-open-office-document-to-another-open-office-format/
That may not be the best way, but maybe it will grease the wheels a bit.
