Posted on July 1st, 2009 by wilson
Following up on a tip from Steven Sofian, I decided to test including PDFs in the SharePoint search index. Before this, only Microsoft Word, Excel and other Office documents were searchable in SharePoint search results.
The best instructions on how to do this came from: http://workerthread.wordpress.com/2008/01/03/configure-pdf-ifilter-in-wss-30/
First, you need to download the Adobe PDF IFilter 6.0, which you can find at this URL. You should also get hold of a suitable Icon to use with PDFs, so that when they are listed in a document library they are easily recognisable. There is a 17 x 17 one available on the Adobe web site here.
Once you’ve downloaded the IFilter, install it on your WSS 3.0 server, and then follow the instructions on registry settings in Microsoft KB Article 927675. I’ve always found that providing the Adobe IFilter installed properly, the only setting I need to add is the Search Extensions one listed in step 2.
Also note step 5 re stopping and re-starting the search service.
Now you need to set up the icon file. If you downloaded the icon file in step 1 above, you will have a file called pdficon_small.gif. You need to copy this onto your WSS 3.0 server, into drive:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES.
Next you need to edit the XML file which WSS uses to link file extensions to icons. This file is called DOCICON.XML and is located at drive:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\XML. Navigate to that folder and locate the file. I would suggest making a backup copy first, then opening the file in Notepad.
You need to add a mapping key for PDFs at the bottom of the file, above the closing </ByExtension> tag. The new key will be <Mapping Key="pdf" Value="pdficon_small.gif"/>
(note that XML is case sensitive, so make sure you use the same case as the previous entries). Then save the file.
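For anyone who would rather script the DOCICON.XML edit than do it by hand in Notepad, here is a minimal Python sketch. It assumes the file keeps its extension mappings as Mapping elements with Key and Value attributes inside a ByExtension section, so check that against your own copy first. The function name add_pdf_mapping and the SAMPLE text are made up for illustration and are not part of WSS.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for DOCICON.XML (hypothetical; the real file has many more entries).
SAMPLE = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif"/>
  </ByExtension>
</DocIcons>"""


def add_pdf_mapping(docicon_xml: str, icon_file: str = "pdficon_small.gif") -> str:
    """Return the XML text with a pdf mapping appended to the ByExtension section."""
    root = ET.fromstring(docicon_xml)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> section found")

    # Leave the text untouched if a pdf mapping is already there.
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            return docicon_xml

    # Append the new entry as the last child, i.e. just above the closing tag.
    ET.SubElement(by_ext, "Mapping", {"Key": "pdf", "Value": icon_file})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_pdf_mapping(SAMPLE))
```

Note that ElementTree drops comments and the XML declaration when it rewrites a file, so keep the backup copy around and compare the result before putting it on the server.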
That’s pretty much it, but if you already have PDFs uploaded to your WSS server I would recommend starting a full crawl.
You can do this with STSADM; the command syntax is stsadm -o spsearch -action fullcrawlstart. More on this on TechNet here.
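The restart-then-crawl sequence can be wrapped in a small script. This is only a rough Python sketch: the stsadm arguments are the ones quoted above, but using net stop / net start with the service name spsearch for the WSS search service is my assumption, and the helper names are made up. It defaults to a dry run that only lists the commands.

```python
import subprocess


def reindex_commands(stsadm="stsadm"):
    """Commands to restart the search service and then kick off a full crawl."""
    return [
        ["net", "stop", "spsearch"],   # assumed service name for WSS 3.0 search
        ["net", "start", "spsearch"],
        [stsadm, "-o", "spsearch", "-action", "fullcrawlstart"],
    ]


def run_reindex(dry_run=True, stsadm="stsadm"):
    """Run the commands in order (or just list them when dry_run is True)."""
    executed = []
    for cmd in reindex_commands(stsadm):
        if not dry_run:
            # check=True stops the sequence if any command fails.
            subprocess.run(cmd, check=True)
        executed.append(" ".join(cmd))
    return executed


if __name__ == "__main__":
    for line in run_reindex(dry_run=True):
        print(line)
```

Pass the full path to stsadm.exe as the stsadm argument if it is not on the PATH, and call run_reindex(dry_run=False) from an administrator prompt on the server to actually run it.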
Filed under: Computing, Sharepoint
Posted on July 1st, 2009 by wilson
On another SharePoint site that I manage, search returned no results for an item that I know exists in the SharePoint database. A look at the event logs showed the following error:
Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to crawl this content. (0x80041205)
Further research turned up this SharePoint post.
Needless to say, it also worked for me. Thanks!! I am reproducing the post below:
1.- Since I was using custom host headers for the web sites, I disabled the loopback check (security feature that is designed to help prevent reflection attacks on your computer, included in the Microsoft Windows XP Service Pack 2 or Microsoft Windows Server 2003 Service Pack 1) by following this article (method 1 worked for me):
You receive error 401.1 when you browse a Web site that uses Integrated Authentication and is hosted on IIS 5.1 or IIS 6
http://support.microsoft.com/kb/896861
Method 1: Disable the loopback check
1. Click Start, click Run, type regedit, and then click OK.
2. In Registry Editor, locate and then click the following registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa
3. Right-click Lsa, point to New, and then click DWORD Value.
4. Type DisableLoopbackCheck, and then press ENTER.
5. Right-click DisableLoopbackCheck, and then click Modify.
6. In the Value data box, type 1, and then click OK.
7. Quit Registry Editor, and then restart your computer.
2.- After I restarted the server I went to the Sharepoint Admin site and started a full crawl. Some seconds later I went to the crawl log and found a warning this time with the following entry:
Content for this URL is excluded by the server because a no-index attribute.
So I “Reset All Crawled Content” and tried the full crawl again. This time it worked.
I hope this helps someone else.
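If the same registry change has to go onto more than one server, the Method 1 steps above can be turned into a .reg file instead of being clicked through in regedit. Here is a minimal Python sketch that only builds the file text; the key path and value name are the ones from the KB steps, while the function name loopback_reg_file and the output file name are made up.

```python
def loopback_reg_file(disable: bool = True) -> str:
    """Build .reg file text that sets DisableLoopbackCheck under the Lsa key."""
    value = 1 if disable else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]",
        # DWORD values are written as eight hex digits in .reg files.
        f'"DisableLoopbackCheck"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical output file name; import it on the server by double-clicking it.
    with open("disable_loopback_check.reg", "w", newline="") as handle:
        handle.write(loopback_reg_file())
```

The restart from step 7 is still needed after importing the file, and since this switches off a security check, it is worth keeping a copy generated with disable=False to undo it.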
Filed under: Computing, Sharepoint
Posted on June 30th, 2009 by wilson
After being convinced by Steve S at the Singapore SharePoint User's Group, I decided to test out Search Server Express on one of my existing SharePoint sites.
As Steve said, my existing SharePoint search could not search inside PDFs. (I confirmed this by uploading a PDF document and searching for content inside it.) I also made sure to make a backup of the entire server before starting the upgrade. Since the existing site was still on version 2.0, I wanted to upgrade it to version 3.0 before using Search Server Express on it.
So I'm watching Joel Oleson's video guide first: http://www.slideshare.net/joeloleson/sharepoint-upgrade-wss-20-to-wss-30-and-sps-2003-to-moss-2007-by-joel-oleson-and-shane-young-presentation
First problem encountered: LinkId=103318. This was right after we ran the preparationtool (everything complete). I am stuck at this point.

The system already has WSS v3.0 SP2, but it is running on SQL 2008, not SQL 2005. Could this be the cause?
Update: We set up a new virtual server with Windows 2003 and no SQL Server installed, so that SQL Express (MSDE) is used.
We were able to make it work and I love this!!
Below is a screen shot of the Search Server Admin Page/Report

Filed under: Sharepoint
Posted on June 12th, 2009 by wilson
I was fortunate enough to arrive early for the Singapore SharePoint User's Group meeting held on the 22nd floor of Microsoft Singapore in One Marina. I was met by the man himself, Steve Sofian, who chatted with me and shared some things that I never knew about SharePoint!
1. Instead of using WSS search, Steve suggested that I look into Search Express, and the cool iFilter features that will do some OCR on PDF, TIFF and similar document types. http://www.microsoft.com/enterprisesearch/en/us/default.aspx
2. Adding another database, by using Site Collections, to work around the 2TB size limit of SharePoint!
Thanks Steve for the cool tips. Steve Sofian is the solutions architect for NicheField.com.
Below are pictures of the event:


Filed under: Computing, Sharepoint
Posted on May 26th, 2009 by wilson
The Inquirer (www.inquirer.net) reported today that the GSIS intends to sue IBM over its woes with a crashing DB2 server! On the surface, it looks like IBM's DB2 product did not scale well enough to handle the GSIS transaction volume.
IBM's defense is that it was not part of the evaluation process in the selection of the GSIS software solution. But some due diligence on IBM's part, in the form of project management and risk analysis, might have let IBM warn GSIS about this potential problem. On the other hand, GSIS should also have made sure that the software developers did what they could to reduce the transactional demands on the system.
I am eagerly awaiting more details on this problem. It is an interesting project management debacle as well.
Filed under: Computing, Project Management