Tuesday, June 7, 2011

FS4SP Pipeline Performance and CustomerExtensibility

Topic: SharePoint 2010 and FAST Search 2010 for SharePoint (FS4SP) Performance
Subject:  FS4SP Pipeline Performance and “CustomerExtensibility” stage
Problem: My FAST Content Recent crawl rate is very low.  I have added more Document Processors, memory and even an additional FAST Server but I can’t seem to get my Crawl Performance any faster.  What is the problem?
Response: This is a very common scenario I run into.  The answer to the performance problem can be one of many areas in SharePoint, SQL Server, and FS4SP but there are 2 specific problem areas I look at 1st and these are the questions I first ask: 
1.      Are the disks fast enough? 
                                                    i.     For FS4SP the recommended IOPS are 2000-3000/sec with a low latency.  If I get a server with 1000+ IOPS or 750MB/sec throughput I can usually get pretty good performance.  In this case we will answer that the disks are fast enough and focus on question #2

2.      Have you implemented any “CustomerExtensibility” solutions? 
                                                    i.     The answer is usually “yes”. 
                                                   ii.     The follow-up question is what did you write it in?
I have seen many blogs, TechNet articles, and heard many people insist on how “CustomerExtensibility” needs to be implemented.  Instead of telling anyone the issue or how to do this I will use the Solution\Example section to show how to look at pipeline performance and the reader can draw their own conclusions. 
For ease of setting up a repetitive test I am limiting my FS4SP farm to have Document Processors on only one server. A stand-alone Farm is acceptable and probably the easiest.  If you are using a Multiple Server Farm you would want to perform the “CustomerExtensibility” steps on any server in the FS4SP farm which has Document Processors defined.
1.      As any executable program can be called from the “CustomerExtensibility” stage, I will use a series of programs to perform the same function and look at the effects on the pipeline. I hear, erroneously, a lot of times that the only program I can call is .net C# which is probably because it is the most common example.

The 4 examples I will use:
a.      Batch File
b.      VBScript
c.      C# - (the notorious blog and TechNet example)
d.      C++.

2.      FS4SP pipeline background.
a.      The pipeline is a sandbox solution.
                                                    i.     This means regardless of the user permissions of the FAST Service user the user will have limited capability within the pipeline. 

                                                   ii.     There is a single folder location accessible to write files to from within the “CustomerExtensibilty” stage of the pipeline. “C:\users\<fast service user>\AppData\LocalLow”.

b.      Each Document Processor (or DP) in the FS4SP Farm is an individual instance of the Pipeline
                                                    i.     1 Item can pass through a DP at a time.  A DP may be working on a batch of 100 items but only 1 item is being worked on at a time

                                                   ii.     If I have 4 Document Processors the Farm could potentially be processing 4 items at a point in time.
1.      With this knowledge it becomes critical to have the pipeline running as efficient as possible.

2.      If it takes 1 sec to process an item, the most I could get passing through the pipeline at any point in time is 4 per/sec.  (In actuality you probably won’t get 4 from SharePoint’s perspective)

3.      Open Windows Explorer
a.      Create a folder called “TEST” on the same root drive as the FASTSearch install
                                                    i.     Example: (for the purpose of this test)
1.      C:\FASTSearch
2.      C:\TEST

4.      Expose hidden folders to Windows Explorer
a.      From “Organize” sub-menu of choose Folder and search options
b.      From the View Tab
                                                    i.     Check “Show hidden files, folders, and drives” radio option

5.      Using windows explorer navigation to:
a.      C:\Users\<your FAST service user>\AppData\LocalLow
b.      Create a folder called “TEST”
c.      My service user is “svc_fast”
                                                    i.     Example: C:\Users\svc_fast\AppData\LocalLow\TEST

6.      Create Test Programs and place in the “C:\TEST” folder
a.      All the Examples are kept as simply as possible.  No error trapping, tracing, etc.

b.      Place a random file in the folder for testing
                                                    i.     Example: C:\Test\TEST.txt

c.      Use Notepad to create the following batch file
                                                    i.     CmdTest.cmd
    copy %1 C:\users\%USERNAME%\AppData\LocalLow\TEST\

                                                   ii.     Command Line Test> CmdTest C:\TEST\TEST.txt
                                                  iii.     Look in  …\LocalLow\TEST directory to verify success

d.      VBScript. This program will be comprised of 2 Files. We can’t launch the VB Script directly in the pipeline so we will need a command file to execute the script
                                                    i.     VBTest.cmd
cscript C:\FASTSearch\bin\CopyToLocalLow.vbs %1 %2

                                                   ii.     VBTest.vbs
option explicit
dim objNet, objFS, tmpInFile, userProfilePath

tmpInFile = wscript.arguments(0)
set objNet = CreateObject("WScript.NetWork")
userProfilePath = "C:\users\" + objNet.UserName + "\appdata\locallow\TEST\"
set objFS = createobject("scripting.filesystemobject")
objFS.copyfile tmpInFile, userProfilePath

set objFS = nothing
set objNet = nothing

                                                  iii.     Command Line Test> VBTest.cmd C:\TEST\TEST.txt
                                                  iv.     Look in  …\LocalLow\TEST folder to verify success

**Side Note: Executing scripts within the FS4SP pipeline can be tricky.  I have some servers work flawlessly and others I need to fix settings for scripting.  Make sure the FAST service user has explicit execute permission on the vbs file.  If you have issues you can skip this program.  I will show the results.

e.      C#.  Create Console App
                                                    i.     CSharpTest.exe (Visual Studio 2008 .net 3.5)
using System.IO;
using System.Security.Principal;
namespace CSharpTest
    class Program
        static void Main(string[] args)
            string[] userContext = WindowsIdentity.GetCurrent().Name.Split('\\');
            string outputFolder = @"C:\Users\" + userContext[1] + @"\AppData\LocalLow\TEST";
            FileInfo fileInfo = new FileInfo(args[0]);
            File.Copy(args[0], Path.Combine(outputFolder, fileInfo.Name));

                                                   ii.     Command Line Test> CSharpTest.exe C:\TEST\TEST.txt
                                                  iii.     Look in …\LocalLow\TEST folder to verify success

**Side Note: I also tested with a Visual Studio 2010 .net 4.0 and will show the results.

f.       C++. Created Console App
                                                    i.     CPlusTest.exe
#include "stdafx.h"
#include "fstream"
#include "windows.h"
#include "Lmcons.h"

std::string ExtractFilename( const std::string& path )
  return path.substr( path.find_last_of( '\\' ) +1 );

int _tmain(int argc, _TCHAR* argv[])
  TCHAR name [ UNLEN + 1 ];
  DWORD size = UNLEN + 1;
  GetUserName ( (TCHAR*)name, &size);

  char outputPath[100];
  std::string ff = ExtractFilename(argv[1]);

  const char *fileName[1] = { ff.c_str() };
  strcpy_s(outputPath, "C:\\Users\\");
  std::ifstream src;
  std::ofstream dest;
  src.open (argv[1], std::ios::binary);
 dest.open (outputPath, std::ios::binary);

  dest << src.rdbuf ();
  dest.close ();
  src.close ();

  return EXIT_SUCCESS;

                                                   ii.     Command Line Test> CPlusTest.exe C:\TEST\TEST.txt
                                                  iii.     Look in …\LocalLow\TEST folder to verify success.

**Side Note: FS4SP requires Microsoft Visual C++ 2008 Redistributable – x64 runtime so it is easiest to use VS 2008 and compile in x64 mode.   I loaded the C++ 2010 x86 and x64 for fun and tested with these versions also.   For this examples analysis I went with 2010 x64.    

7.      Time to test performance in the Pipeline

8.      Create a File share Folder and place 1000 documents in the folder
a.      I used a mixed of office and PDF documents

9.      From your FAST Content Search Service Application
a.      Setup a new File Share Content Source

10.   We will repeat the next steps for each of the 4 test programs

11.   Open Windows Explorer on the FAST Server
a.      Navigate to <FAST Install Driver>\FASTSearch\etc
b.      Edit the File “pipelineextensibility”
c.      Add the Follow and Save the File (there is a sample already in the file just make sure you place the new call outside of the comments)

   <Run command="C:\TEST\CmdTest.cmd %(input)s">
                              <CrawledProperty propertySet="11280615-f653-448f-8ed8-2915008789f2" varType="31" propertyName="url"/>
              <CrawledProperty propertySet="7262a2f9-30d0-488f-a34a-126584180f74" varType="31" propertyName="mime"/>

d.      We are going to ask the FS4SP pipeline to pass 2 Crawled Properties to the program
                                                    i.     url          - Access Url generated by the SharePoint Crawler
                                                   ii.     mime     - Crawled Property Set by the FAST pipeline

e.      Reset pipeline
                                                    i.     Open the FS4SP PowerShell as Administrator
1.      Execute: psctrl reset
a.      Resets the pipeline to pick up the pipeline changes we implemented in pipelineextensibility.xml

                                                   ii.     From FAST Content SSA
1.      Issue Index Reset

                                                  iii.     Open the FS4SP PowerShell as Administrator
1.      Execute: Clear-FASTSearchContentCollection <your collection> 
a.      “sp” the default collection

b.      If the Index does not clear it is because we have a typo in the “pipelineextensibility” format.  This is a trick for testing your “pipelineextensibility” changes.  This also works for any pipeline configuration changes and is an easy way to make sure changes are implemented with the correct format.

2.      Repeat Step 3 – 6 for each FS4SP server in the FARM.

3.      Execute: nctrl stop
a.      Stops the FAST Node
b.      We are going to collect statistics from the pipeline and Stopping and Starting the Node will reset all statistics

4.      Optional: I usually remove all logs from “<fast install drive>/FASTSearch/var/logs” between tests.  To do this you would also need to stop the “FAST Search for SharePoint Monitoring” service.

5.      Execute: nctrl start
a.      Starts the FAST Node

6.      Execute: nctrl status
a.      It will wait for a response from all Node Processes.  This will let you know when FAST is ready

12.   From the FAST Content SSA kick off a Full Crawl of the new File Share Content source
a.      Note the Crawl time upon completion.

13.   From Windows Explorer on any FS4SP server with DPs.
a.      Navigate to “C:\Users\<your fast  service user>\AppData\LocalLow\Test”

b.      Get the total number of documents created. You should get 1001
                                                    i.     1 Folder

                                                   ii.     1000 Items which passed through the pipeline.

                                                  iii.     Open an item in Note Pad
1.      You will have pure XML of the Crawl Properties values we requested in Step 12
                                                  iv.     If you end up with more than 1001 files I will comment on that further at the end of the post.
**Side Note:  If you read my blog “Spy Stage in the FAST Search (FS4SP) pipeline” (http://fs4sp.blogspot.com/2011/04/spy-stage-in-fast-search-for-sharepoint_05.html) the Spy stage is a great way to look at Crawled properties but is a little limiting and you have to control the amount of data being crawled.  The copy to locallow allows you to view selected crawled properties for large quantities of items.  I used it constantly for many different debugging reasons. It also has an advantage over the FDDumper as you can control which crawled properties are produced instead of a full dump of the pipeline and then searching for items.
14.   From the FS4SP Command Shell as Administrator (on the FS4SP Admin Server if more than 1 server in the Farm)
a.      Execute: infotool

b.      infotool will zip all the log files and statistics from all FS4SP Servers in the Farm.
                                                    i.     This is why I usually delete all logs between test runs.  It is optional in this case as we are only interested in the statistics. If we were debugging a different issue I would want only the logs which are created during the test run.

c.      FYI - The more FS4SP servers and the more logs on the servers the longer it will take for infotool to complete.

d.      Infotool will provide a location of the zip file when it is completed.

e.      Use Windows Explorer to navigate to the location and rename the infotool folder to be associated with the Command type tested and move the folder to our program location.
1.      On my server I will end up with:
a.      C:\test\logs\Cmd
b.      C:\test\logs\CSharp
c.      C:\test\logs\CPlus
d.      C:\test\logs\VBScript

f.       If you remove logs between runs and don’t move the infotool log make sure you don’t delete the infotool folders.

15.   Repeat steps 11 – 14 for each of the other programs.
a.      The only difference will be the run command in the pipelineextensibility.xml
                                                    i.     <Run command="C:\TEST\CSharpTest.exe %(input)s">
                                                   ii.     <Run command="C:\TEST\CPlusTest.exe %(input)s">
                                                  iii.     <Run command="C:\TEST\VBTest.cmd %(input)s">

16.   Analyzing the infotool logs
a.      Open the infotool_default.zip file associated with each run.
b.      On the root level open the file “rc-dump.txt” in note pad or an advanced editor
c.      The “rc-dump.txt” will have statistics for all services from the last run. Including statistics for each Document Processor by stage.
d.      Search for “CustomerExtensibility”. In my case I have 4 DPs running so I will have stats for all 4.
e.      Save into a Notepad
f.       Repeat for all 4 tests clearly marking which command the statistics came from.

17.   The following are my results:

CmdTest                             Crawl Time: 09:04            #Files in Local Low: 1001
Scopes:                                                                        Cur   Total   Min   Max      Avg
Processor::CustomerExtensibility                              :   0  ( 250)  30ms 472ms  69ms
Processor::CustomerExtensibility                              :   0  ( 211)  30ms 456ms  71ms
Processor::CustomerExtensibility                              :   0  ( 270)  30ms 299ms  67ms
Processor::CustomerExtensibility                              :   0  ( 270)  30ms 489ms  72ms

VBScript                             Crawl Time: 9.43              #Files in Local Low: 1001
Scopes:                                                                        Cur   Total   Min   Max      Avg
Processor::CustomerExtensibility                              :   0  (  210)  62ms    1s      137ms
Processor::CustomerExtensibility                              :   0  (  229)  61ms 977ms 152ms
Processor::CustomerExtensibility                              :   0  (  272)  61ms    1s      141ms
Processor::CustomerExtensibility                              :   0  (  290)  61ms 884ms 146ms

C++ VS 2010 x64               Crawl Time: 9.05              #Files in Local Low: 1001
Scopes:                                                                        Cur   Total   Min   Max       Avg
Processor::CustomerExtensibility                              :   0  (  250)  30ms 389ms  73ms
Processor::CustomerExtensibility                              :   0  (  249)  30ms 343ms  73ms
Processor::CustomerExtensibility                              :   0  (  251)  30ms 467ms  75ms
Processor::CustomerExtensibility                              :   0  (  251)  30ms 436ms  78ms
C# 2008 .net 3.5               Crawl Time: 9.15              #Files in Local Low: 1001
Scopes:                                                                        Cur   Total   Min   Max       Avg
Processor::CustomerExtensibility                              :   0  ( 289)  46ms 812ms  145ms
Processor::CustomerExtensibility                              :   0  ( 252)  62ms 593ms  149ms
Processor::CustomerExtensibility                              :   0  ( 230)  77ms 641ms  149ms
Processor::CustomerExtensibility                              :   0  ( 230)  77ms 546ms  145ms

C# 2010 .net 4.0               Crawl Time: 9.25              #Files in LocalLow: 1001
Scopes:                                                                        Cur   Total   Min   Max       Avg
Processor::CustomerExtensibility                              :   0  (  231)  62ms    1s      131ms
Processor::CustomerExtensibility                              :   0  (  261)  61ms    1s      129ms
Processor::CustomerExtensibility                              :   0  (  270)  61ms 561ms 135ms
Processor::CustomerExtensibility                              :   0  (  239)  61ms 825ms 135ms

18.   How to read the Numbers
a.      Total:  Number of items which pass through the Document process since the last time statistics where reset.

b.      Min Time:  The shortest amount of time a single item passed through a DP.

c.      Max Time:  The longest amount of time a single item passed through a DP. 

d.      Avg Time:  The cumulative average time for all items which passed through a DP.

19.   Max time is important but it is a little limiting.  If the Max time was 10 sec (A totally unacceptable number) I can’t tell how many passed through in 7, 8, or 9 seconds.  If we had an item take 10 second it would mean that the particular DP could do no other work for 10 secs until that item finished processing.

20.   Average time is really the number to focus on.  The higher the Average time the slower the performance of the FS4SP farm.  Remember the “CustomerExtensibility” stage is only one stage of many which make up the FS4SP pipeline.

21.   In our particular testing, performing very simple functionality, all the programs all pretty fast ( < 150ms) but are they really.  Let’s run the numbers for 2 fictitious customers who each implemented this example with one choosing C# 2008 .net 3.5 and the other choosing C++ VS 2010 x64.

a.      The average times for all document processes for each
                                                    i.     C#   = ((145+149+149+145) / 4)  = 147ms
                                                   ii.     C++ = ((73+73+75+78) / 4) = 74.75ms

b.      How much longer will the C# customer’s crawl take versus the C++ customer for 10M items
                                                    i.     Additional average time = 147 – 74.75 = 72.75ms
                                                   ii.     Total Milliseconds for 10M = 72.25 * 10,000,000 = 722,500,000ms
                                                  iii.     Total Seconds =722,500,000 / 1000 = 722,500 sec
                                                  iv.     Total Minutes = 722,500 / 60 = 12041.6667 mins
                                                   v.     Total Hours = 12041.6667 / 60 = ~50   hours
                                                  vi.     For Farm = 50 hours / 4 DPs = ~12.5 hours

22.   For those who got more than 1001 files in the LocalLow:  The particular servers I used are very poor performing VMs but I never got more than 1001 files because I tuned my SP/FS4SP farms to only crawl as fast as the FS4SP farm can handle.  I choose these servers because I was using very little functionality in the pipeline “CustomerExtensibility” and wanted to make sure the differences where apparent.  In a good farm my average times would have been lower but the differences would still exist.  There a several issues which can cause the increased number of files but disk IOPS and latency is the most common.  When batches are processed and submitted to the index dispatcher, the index dispatcher has 300 seconds to get the batch into the index.  If the batch times out it re-sends the items back through the pipeline.   If I had added 2 more DPs on this Farm my implementation would have started thrashing and re-processing items decreasing performance.  If you had more than 1001 items, it is because some items where re-processed.Of interesting note, if you had looked at your crawl logs you will see 1001 successes.  The SP Crawl log is concerned about how many items made it into the index successfully and does not care about how many times FS4SP had to process the items before they were successful.   The CmdTest.cmd is a great/quick way to look at overall performance. It takes 30 seconds to setup, no real code involved, and can tell you pretty quickly if a FS4SP farm is thrashing.  It will add overhead but it is minimal. 

Conclusions:  When we are looking at performance on a small scale the numbers < 150ms appear to be ok until we extrapolate the numbers to a larger scale.  The difference in this example is simply the choice of programming language.  The infotool and repeatable testing are a great way to look at the overall performance of the FS4SP pipeline.   The OOB CompanyExtractors are notoriously sluggish. They are worth taking a look at for performance.  If the customer is not going to take advantage of the stage remove it from the pipeline.
When using “CustomerExtensibility” always check the performance impact.



  1. An excellent post!

    The average loading time for .Net programs being twice that of C++ is not unexpected. Even if you run ngen.exe on your program it will still have to load quite a few other dll's, both .Net and native ones.

    I think the best way to use .Net for pipelines when performance really is an issue, is to have a C++ stub loader which signals a .Net service via a mutex or similar, and then waits for the service to finish, and get a signal back.

    This way the .Net app is only loaded once, and you get the benefit of C++ for the startup, and in my case, the benefit of .Net as a programming language to create the actual module logic.

    With this approach you could even chain your modules in the .Net service instead of having several registered modules in the pipeline. Sort of like a pipeline within the pipeline.

    But if you really need speed, then you have to write python stages instead as mentioned at http://support.microsoft.com/kb/2455437 (they mention it as a native pipeline extensibility program).

  2. Hi, good work.
    Do you try to test C# app passed trough the Native Image Generator (Ngen.exe)? http://msdn.microsoft.com/en-us/library/6t9t5wcf(v=vs.80).aspx
    Do you have such results?