Topic: SharePoint 2010, FS4SP, FAST Search Server 2010, FIXML
Subject: Index corruption and rebuilding the FS4SP index from FIXML
Problem: My FS4SP index has become corrupt. Our original full crawls took a week to complete. Is there anything we can do?
Response: Though this situation is rare, there may come a time when you need to rebuild an FS4SP index. When FS4SP indexes items, it stores not only the physical index itself (<FASTInstallDrive>\FASTSearch\data\data_index) but also FIXML (<FASTInstallDrive>\FASTSearch\data\data_fixml) for each item indexed. The FIXML contains all the information that becomes that item within the index. Search results are retrieved from the index, so why keep the FIXML around? One reason is exactly this problem: rebuilding an index. There are two ways of fixing a corrupt index, and the Solution\Example section covers both. (Rebuilding an index is also an important part of promoting a backup column to a primary column.) In this example I am using a three-server, single-column FS4SP farm.
FS4SP Farm: 1 Admin, 1 Primary Column, and 1 Backup Column.
Note: “fast3.mydomain.local” is the Primary Column Server.
Starting FS4SP Farm Deployment
<?xml version="1.0" encoding="utf-8" ?>
<deployment version="14">
<instanceid>FAST Search Server POC</instanceid>
<connector-databaseconnectionstring></connector-databaseconnectionstring>
<!-- Admin Node -->
<host name="fast1.mydomain.local">
<admin />
<webanalyzer server="true" max-targets="2" link-processing="true" lookup-db="true" redundant-lookup="true" />
</host>
<!-- Backup Column (row 1) -->
<host name="fast2.mydomain.local">
<content-distributor id="0" />
<searchengine row="1" column="0" />
<indexing-dispatcher />
<query />
</host>
<!-- Primary Column (row 0) -->
<host name="fast3.mydomain.local">
<document-processor processes="4"/>
<content-distributor id="1" />
<indexing-dispatcher />
<searchengine row="0" column="0" />
<query />
</host>
<searchcluster>
<row id="0" index="primary" search="true" />
<row id="1" index="secondary" search="true" />
</searchcluster>
</deployment>
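To confirm which server hosts the primary indexer row for a column, the deployment file can also be read with a short script. The following is a minimal Python sketch and not part of FS4SP: the function name primary_hosts and the trimmed sample XML are my own illustration, and it assumes the file is well-formed XML that uses the element and attribute names shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the deployment above (illustrative only).
SAMPLE_DEPLOYMENT = """<deployment version="14">
  <host name="fast1.mydomain.local">
    <admin />
  </host>
  <host name="fast2.mydomain.local">
    <searchengine row="1" column="0" />
  </host>
  <host name="fast3.mydomain.local">
    <searchengine row="0" column="0" />
  </host>
  <searchcluster>
    <row id="0" index="primary" search="true" />
    <row id="1" index="secondary" search="true" />
  </searchcluster>
</deployment>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}host' down to 'host'."""
    return tag.rsplit("}", 1)[-1]


def primary_hosts(xml_text):
    """Map each column id to the host whose search row is marked index="primary"."""
    root = ET.fromstring(xml_text)
    # Row ids that the <searchcluster> section declares as primary indexer rows.
    primary_rows = {
        el.get("id")
        for el in root.iter()
        if local_name(el.tag) == "row" and el.get("index") == "primary"
    }
    result = {}
    for host in root.iter():
        if local_name(host.tag) != "host":
            continue
        for child in host:
            if local_name(child.tag) == "searchengine" and child.get("row") in primary_rows:
                result[child.get("column")] = host.get("name")
    return result


print(primary_hosts(SAMPLE_DEPLOYMENT))
```

For the farm in this post the result maps column 0 to fast3.mydomain.local, which matches the note above.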
Solution\Example:
Repairing a corrupt index using FIXML
1. Optional (To allow for ease of observation)
a. Clear Event Viewer entries on the Primary Column Server
b. Clear FS4SP Logs
i. Open a FAST Command Shell as Administrator
ii. Execute: nctrl stop
iii. Execute: net stop fastsearchmonitoring
iv. Delete all files under <FASTInstallDrive>\FASTSearch\var\log
v. Execute: nctrl start
2. Resetting the Index
a. An index reset can be performed from any server in the FS4SP farm against any column or row by specifying the row and column in the reset command. In this example I run the command from the server that hosts the Primary Column and specify the row and column, even though the FS4SP farm has only a single column.
b. An index reset does not rebuild the column from scratch. The indexer validates each item in the FS4SP column against the original FIXML, and any item found to be out of sync or corrupt is updated in the column's index. This takes less time than rebuilding the index column from scratch.
3. Open the FAST Command Shell as Administrator on the FS4SP Admin Server
a. In this example: fast1.mydomain.local
b. Make sure all crawls are stopped or paused and the FS4SP Column is idle
i. Execute: indexerinfo --row=0 --column=0 status
ii. Check the status of each partition; every partition should report idle
<indexer hostname="fast3.mydomain.local" port="13050" cluster="webcluster" column="0" row="0"
<documents size="2022136316.000000" total="7046" indexed="7046" not_indexed="0"/>
<column_role state="Master" backups="1"/>
<index_frequence min="0.000000" max="0.000000"/>
<partition id="0" index_id="1315841940082287000" status="idle" type="dynamic"
<documents active="7046" total="7046"/>
</partition>
<partition id="1" index_id="1315328545925813000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="2" index_id="1315328542666333000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="3" index_id="1315328539447778000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="4" index_id="1315328535885114000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<document_api number_of_elements="0" last_sequence="99610" frequence
<queue_size current="0"/>
<operations_processed api="0"/>
</document_api>
</indexer>
c. Stop the Web Analyzer and Relevancy Admin
i. The WebAnalyzer runs on a schedule. To keep its updates and processing from changing the index during the reset, we suspend its processing.
ii. Logon to the FS4SP server which hosts the WebAnalyzer.
1. In this example: fast1.mydomain.local
iii. Open FAST Command Shell As Administrator
iv. Execute: waadmin showstatus
1. The Overall Status needs to be running before we can suspend it.
v. If the Status is paused
a. Execute: waadmin enqueueview
b. Repeat step iv.
vi. Execute: waadmin AbortProcessing
vii. Execute: spreladmin AbortProcessing
4. Issue an Index reset
a. Logon to the Primary Index Column Server
i. In this example: fast3.mydomain.local
b. From the FAST Command Shell as Administrator
c. Execute: indexeradmin --row=0 --column=0 resetindex
d. Execute: indexerinfo --row=0 --column=0 status
i. You may have to issue the command several times to see the work being performed.
ii. The index reset works through each partition of the FS4SP column. If you can, keep executing the command to watch the progress. Note: in this example all the items start out in partition 0 of the status output but do not end up there, because of the number of items I have indexed.
iii. When the index reset finishes, all the items end up in partition 4. (This will vary depending on the partition spread; it is just an interesting observation of how the reset is performed under the covers.)
<indexer hostname="fast3.mydomain.local" port="13050" cluster="webcluster" column="0" row="0"
<documents size="2022136316.000000" total="7046" indexed="7046" not_indexed="0"/>
<column_role state="Master" backups="1"/>
<index_frequence min="0.000000" max="0.000000"/>
<partition id="0" index_id="1315841940082287000" status="idle" type="dynamic"
<documents active="7046" total="7046"/>
</partition>
<partition id="1" index_id="1315328545925813000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="2" index_id="1315328542666333000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="3" index_id="1315328539447778000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="4" index_id="1315328535885114000" status="indexing (6%)" type="dynamic"
<documents active="0" total="0"/>
</partition>
<document_api number_of_elements="0" last_sequence="99610" frequence
<queue_size current="0"/>
<operations_processed api="0"/>
</document_api>
</indexer>
e. Alternative ways to watch the process (especially useful if you experiment with very little data in the system, as the index reset may finish before you see the status change):
i. From Event Viewer
1. Open the Windows Event Viewer
2. Expand the Applications and Services Logs Node
3. Open the FAST Search Logs
4. You will find several entries similar to the following:
indexer_admin_servant: Reset index requested.
state::runtime: Indexing suspended
state::runtime: Indexing resumed
work_order 4_1315848480044895000: Index State Reset - Not using incremental indexing
master_indexing_thread (p:4,j:1): Completed Index State Reset. Normal indexing enabled.
f. From FAST Logs
i. Using Windows Explorer Navigate to <FASTInstallDrive>\FASTSearch\var\log\indexer folder
ii. Open indexer.txt
iii. Search for “Reset index requested” and you will see something similar to the following:
INFO indexer indexer_admin_servant: Reset index requested.
VERBOSE indexer rts::indexing::util: Suspending indexing
INFO indexer state::runtime: Indexing suspended
INFO indexer state::runtime: Indexing resumed
VERBOSE indexer percentage_file_distributor: Partition 4 should have 100% of the docs. Num docs 7046 target : 6000000. New range: 1-714
INFO indexer work_order 4_1315848480044895000: Index State Reset - Not using incremental indexing
VERBOSE indexer index_producer (4): Index 1315848480044895000 completed. OK docs: 7046, failed docs: 0, errors: 0, range: 1-714, 0 exclusionlisted
VERBOSE indexer percentage_file_distributor: Partition 3 has empty range.
INFO indexer work_order 3_1315848737366889000: Index State Reset - Not using incremental indexing
VERBOSE indexer index_producer (3): Index 1315848737366889000 completed. OK docs: 0, failed docs: 0, errors: 0, range: 0-0, 0 exclusionlisted
VERBOSE indexer percentage_file_distributor: Partition 2 has empty range.
INFO indexer work_order 2_1315848740067419000: Index State Reset - Not using incremental indexing
VERBOSE indexer index_producer (2): Index 1315848740067419000 completed. OK docs: 0, failed docs: 0, errors: 0, range: 0-0, 0 exclusionlisted
VERBOSE indexer percentage_file_distributor: Partition 1 has empty range.
INFO indexer work_order 1_1315848743251859000: Index State Reset - Not using incremental indexing
VERBOSE indexer index_producer (1): Index 1315848743251859000 completed. OK docs: 0, failed docs: 0, errors: 0, range: 0-0, 0 exclusionlisted
VERBOSE indexer percentage_file_distributor: Partition 0 has empty range.
INFO indexer work_order 0_1315848746358249000: Index State Reset - Not using incremental indexing
VERBOSE indexer index_producer (0): Index 1315848746358249000 completed. OK docs: 0, failed docs: 0, errors: 0, range: 0-0, 0 exclusionlisted
VERBOSE indexer search_controller_holder: Activating index set '0_1315848746358249000,1_1315848743251859000,2_1315848740067419000,3_1315848737366889000,4_1315848480044895000', 7046 active docs, 0 exclusionlisted.
INFO indexer master_indexing_thread (p:4,j:1): Completed Index State Reset. Normal indexing enabled.
VERBOSE dictionary_producer Loading rc-file: 'C:\FASTSE~1\var/etc/findexrc'
VERBOSE dictionary_producer Reading index configuration from C:\FASTSearch\data\data_index\fast3.mydomain.local.normalized.temp\index.cf
VERBOSE indexer dictionary_builder: New dictionaries ready (fast3.mydomain.local.normalized.1315848819)
g. Resume WebAnalyzer and Relevancy Admin once the index reset completes.
i. Open FAST Command Shell As Administrator on the FS4SP Admin node or the Server with the WebAnalyzer service enabled.
ii. Execute: waadmin EnqueueView
iii. Execute: spreladmin Enqueue
h. Test your Search Center
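Re-running indexerinfo by hand and reading the partition statuses can be tedious, so here is a minimal Python sketch of the same check. It is my own illustration, not an FS4SP tool: the function names are hypothetical, it assumes indexerinfo is on the PATH (as it is inside a FAST Command Shell), and it reads the partition lines with a regular expression so that it does not depend on the full output being well-formed XML.

```python
import re
import subprocess

# Matches the id and status attributes of each <partition ...> line.
PARTITION_RE = re.compile(r'<partition\s+id="(\d+)"[^>\n]*?\bstatus="([^"]*)"')


def fetch_status(row=0, column=0):
    """Run the same indexerinfo command used in the steps above.

    Assumption: indexerinfo is on PATH, as in a FAST Command Shell.
    """
    result = subprocess.run(
        ["indexerinfo", f"--row={row}", f"--column={column}", "status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def partition_statuses(status_text):
    """Return {partition id: status string} parsed from indexerinfo output."""
    return {int(pid): status for pid, status in PARTITION_RE.findall(status_text)}


def column_is_idle(status_text):
    """True only if at least one partition was found and all report 'idle'."""
    statuses = partition_statuses(status_text)
    return bool(statuses) and all(s == "idle" for s in statuses.values())


# Two partition entries in the style of the status output during the reset.
SAMPLE_STATUS = '''<partition id="0" index_id="1315841940082287000" status="idle" type="dynamic">
<documents active="7046" total="7046"/>
</partition>
<partition id="4" index_id="1315328535885114000" status="indexing (6%)" type="dynamic">
<documents active="0" total="0"/>
</partition>'''

print(partition_statuses(SAMPLE_STATUS))
print(column_is_idle(SAMPLE_STATUS))
```

Calling column_is_idle(fetch_status()) in a loop with a short sleep would give a simple way to wait for the reset to finish before resuming the WebAnalyzer.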
Rebuilding an index from FIXML
1. Open the FAST Command Shell as Administrator on the FS4SP Admin Server
a. In this example: fast1.mydomain.local
b. Make sure all crawls are stopped or paused and the FS4SP Column is idle
i. Execute: indexerinfo --row=0 --column=0 status
ii. Check the status of each partition; every partition should report idle
<indexer hostname="fast3.mydomain.local" port="13050" cluster="webcluster" column="0" row="0"
<documents size="2022136316.000000" total="7046" indexed="7046" not_indexed="0"/>
<column_role state="Master" backups="1"/>
<index_frequence min="0.000000" max="0.000000"/>
<partition id="0" index_id="1315841940082287000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="1" index_id="1315328545925813000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="2" index_id="1315328542666333000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="3" index_id="1315328539447778000" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="4" index_id="1315328535885114000" status="idle" type="dynamic"
<documents active="7046" total="7046"/>
</partition>
<document_api number_of_elements="0" last_sequence="99610" frequence
<queue_size current="0"/>
<operations_processed api="0"/>
</document_api>
</indexer>
c. Stop the Web Analyzer and Relevancy Admin
i. The WebAnalyzer runs on a schedule. To keep its updates and processing from changing the index during the rebuild, we suspend its processing.
ii. Logon to the FS4SP server which hosts the WebAnalyzer.
iii. Open FAST Command Shell As Administrator
iv. Execute: waadmin showstatus
1. The Overall Status needs to be running before we can suspend it.
v. If the Status is paused
a. Execute: waadmin enqueueview
b. Repeat step iv.
vi. Execute: waadmin AbortProcessing
vii. Execute: spreladmin AbortProcessing
2. Rebuild the Primary Column Index
a. Logon to the Primary Index Column Server
b. In this example: fast3.mydomain.local
c. From the FAST Command Shell as Administrator
d. Execute: nctrl stop
e. Using Windows Explorer, navigate to <FASTInstallDrive>\FASTSearch\data\
f. Delete the data_index folder
g. Execute: nctrl start
h. Execute: indexerinfo --row=0 --column=0 status
i. Notice total="7046", indexed="0" and not_indexed="7046"; these numbers are generated from the FIXML.
ii. Much like the index reset, you will see the status of the partitions change
iii. Keep re-issuing indexerinfo --row=0 --column=0 status to watch the progress
iv. Note the difference between an index reset and rebuilding the index. The index reset moved the items from one partition to another while keeping the index populated. The rebuild from scratch started with an indexed count of zero: indexed="0".
<indexer hostname="fast3.mydomain.local" port="13050" cluster="webcluster" column="0" row="0"
ndex="0">
<documents size="2022136316.000000" total="7046" indexed="0" not_indexed="7046"/>
<column_role state="Master" backups="0"/>
<index_frequence min="0.000000" max="0.000000"/>
<partition id="0" index_id="0" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="1" index_id="0" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="2" index_id="0" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="3" index_id="0" status="idle" type="dynamic"
<documents active="0" total="0"/>
</partition>
<partition id="4" index_id="0" status="indexing (8%)"
<documents active="0" total="0"/>
</partition>
<document_api number_of_elements="0" last_sequence="99610" frequence="0.000000"
<queue_size current="0"/>
<operations_processed api="0"/>
</document_api>
</indexer>
i. Much like the index reset above the alternative ways to watch the progress are through the Event Viewer and the indexer.txt log. The messages will differ but the same end result will occur.
j. Resume WebAnalyzer and Relevancy Admin once the index rebuild completes.
i. Open FAST Command Shell As Administrator on the FS4SP Admin node or the Server with the WebAnalyzer service enabled.
ii. Execute: waadmin EnqueueView
iii. Execute: spreladmin Enqueue
k. Test your Search Center
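The indexer.txt log used in both procedures can also be summarized per partition instead of being read line by line. The sketch below is a minimal Python illustration under stated assumptions: the function name summarize_index_log is hypothetical, and the pattern is taken from the "index_producer ... completed" lines in the index reset excerpt above. The rebuild writes different messages, so the pattern may need adjusting for that case.

```python
import re

# Pattern based on the reset log excerpt, for example:
# "index_producer (4): Index 1315848480044895000 completed. OK docs: 7046, failed docs: 0, errors: 0, ..."
COMPLETED_RE = re.compile(
    r"index_producer \((\d+)\): Index (\d+) completed\. "
    r"OK docs: (\d+), failed docs: (\d+), errors: (\d+)"
)


def summarize_index_log(lines):
    """Return {partition: {'ok', 'failed', 'errors'}} for completed index runs.

    If a partition appears more than once, the latest matching line wins.
    """
    summary = {}
    for line in lines:
        match = COMPLETED_RE.search(line)
        if match:
            partition, _index_id, ok, failed, errors = match.groups()
            summary[int(partition)] = {
                "ok": int(ok),
                "failed": int(failed),
                "errors": int(errors),
            }
    return summary


# Two lines copied from the reset log excerpt earlier in this post.
SAMPLE_LOG = [
    "VERBOSE indexer index_producer (4): Index 1315848480044895000 completed. "
    "OK docs: 7046, failed docs: 0, errors: 0, range: 1-714, 0 exclusionlisted",
    "VERBOSE indexer index_producer (3): Index 1315848737366889000 completed. "
    "OK docs: 0, failed docs: 0, errors: 0, range: 0-0, 0 exclusionlisted",
]

for partition, counts in sorted(summarize_index_log(SAMPLE_LOG).items()):
    print(partition, counts)
```

Against a real log you would pass the open file object for <FASTInstallDrive>\FASTSearch\var\log\indexer\indexer.txt as the lines argument; any partition with a non-zero failed or errors count deserves a closer look.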
Conclusion:
For each item indexed into FS4SP, a FIXML file is created that represents the item within the index. The FIXML stored on the FS4SP servers does take up additional storage, but it serves several useful functions: validating security, debugging crawled/managed properties and, as in this example, repairing a corrupted index or even rebuilding an index.
Special Note: Rebuilding the index from scratch requires a lot more free disk space than an index reset. Temporary files are created and released along the way, and you may consume twice as much disk space as the final index. If you need to rebuild from scratch, watch the amount of free disk space. If free space runs out (the minimum is about 2 GB), the process fails and immediately starts again, and it keeps cycling until the lack of free storage space is addressed.
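The disk space rule of thumb above can be turned into a quick pre-flight check. This is a rough Python sketch and only an approximation of the note: the function names, the factor of two and the 2 GB floor are my reading of the paragraph above, not documented FS4SP limits.

```python
import shutil

MIN_FREE_BYTES = 2 * 1024**3  # the roughly 2 GB floor mentioned above


def required_free_bytes(final_index_bytes, factor=2):
    """Estimate free space needed before a rebuild.

    Assumption: peak usage is about `factor` times the final index size,
    and free space must stay above MIN_FREE_BYTES throughout.
    """
    return int(final_index_bytes * factor) + MIN_FREE_BYTES


def has_room_for_rebuild(data_path, final_index_bytes):
    """Compare the free space on the drive holding data_path with the estimate."""
    free = shutil.disk_usage(data_path).free
    return free >= required_free_bytes(final_index_bytes)


# Example: a final index of about 10 GB needs roughly 22 GB free by this estimate.
print(required_free_bytes(10 * 1024**3) / 1024**3)
```

The size of the old data_index folder, measured before deleting it, is a reasonable stand-in for final_index_bytes.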
Eric, your blogs have been very useful for my troubleshooting. I want to completely reset my index and start from scratch. We have decided that we could reindex everything. How do you reset the index without the system automatically rebuilding everything? Running clear-fastsearchcontentindex twice now has resulted in the process timing out and the index automatically rebuilding. Is this as simple as just stopping everything and deleting the Data folder?
Eric, Great Post!
I am having the issue mentioned here:
http://social.technet.microsoft.com/Forums/en-US/sharepointgeneralprevious/thread/8adac159-8bbe-4ba7-adf7-3abd3bf0789f
Will index rebuilding/resetting help ?
Hi Eric, thanks for your posts. This one came in handy recently when after a power outage we started receiving the Error: [18] No engine available for partition 0. We have a two index column FAST farm and one index column had become corrupt. I ended up having to rebuild the index column from the existing FIXML with the help of your post.
Regards
Pete