Saturday, September 28, 2013

Avoid corrupted site columns due feature re-activation

One of our clients reported an issue that provisioned site columns get corrupted after deinstall of the provisioning feature. The feature deinstallation is a step within the repair of erroneous situation in which sandbox solution per accident is also deployed as global farm-based solution. This incorrect deployment requires a fixture because an effect is that features now appear as duplicate entries in the list of site(collection) features: one installed via the sandbox solution (correct, administrated in the sitecollection's content database), and one installed via the farm solution (incorrect, administrated in the farm configuration database).
On first thought the simple fix is to deactivate the provisioning feature that originates from the farm deployment, remove the feature, then retract the global farm-solution, and remove it from the farm solution store. Next activate the feature deployed via sandbox-solution to arrive at the correct deployment situation. However, in a test execution we experienced that this approach gives an error upon the re-provisioning of the site columns: The local device name is already in use.
Error details in ULS log:
Unable to locate the xml-definition for FieldName with FieldId '<GUID>', exception: Microsoft.SharePoint.SPException: Catastrophic failure (Exception from HRESULT: 0x8000FFFF (E_UNEXPECTED)) ---> System.Runtime.InteropServices.COMException (0x8000FFFF): Catastrophic failure (Exception from HRESULT: 0x8000FFFF (E_UNEXPECTED)) at Microsoft.SharePoint.Library.SPRequestInternalClass.GetGlobalContentTypeXml(String bstrUrl, Int32 type, UInt32 lcid, Object varIdBytes) at Microsoft.SharePoint.Library.SPRequest.GetGlobalContentTypeXml(String bstrUrl, Int32 type, UInt32 lcid, Object varIdBytes)
To come with a solution, I started with a root-cause analysis. Why are the provisioned site columns not completely deleted from the site upon its feature deactivation? The explanation is that the provisioning feature also performs a contenttype binding to the Pages library, and that in our testcase a page was created based on that contenttype. This effectually results in the contenttype being kept ‘in usage’ by the Pages library. On feature deactivation the site columns can still be removed at sitecollection level, but the contenttype not anymore due the descendant sibling binded to the Pages library.
The real problem however lies in the 'removed' site columns. They are deleted from sitecollection level, but due the preserved contenttype (Pages library) their definition has remained in the sitecollection's content database, with the same ID as on sitecollection level (note this is different for a contenttype binded to a list, that gets a new ID based on the ID of the source/parent contenttype at sitecollection). SharePoint administrates per provisioned artifact whether the origin is a feature, and if so effectively couples the artifact to that feature. SharePoint disallows these artifacts by automatically deleted or modified by another feature. As result the feature re-activation halts with an error when trying to (re)provision the sitecolumns that are still present deep down in the content database, coupled to the Pages library.
With this SharePoint-internal insight, I was able to come up with a faultproof approach to fix the 'duplicate features' issue. The trick is to initial leave the feature definition that originated from the erroneous farm solution in the configuration database, deploy the sandbox-solution (stores the feature definition in the sitecollection's content database), activate the feature with the same feature id from that sandbox solution. Now the feature activation proceeds completely without errors, and restores the site columns. Ultimately the farm-based solution can then be retracted from the farm solution store.
Note: I came to this insight by inspecting on SQL level. We all know it is not allowed to perform changes on SharePoint content database level (or loose your Microsoft support), but it is perfectly ‘SharePoint’-legal to review and monitor on SharePoint content database level.
Used/useful SQL statements:
SELECT * FROM ( SELECT *, Convert(varchar(512), CONVERT(varbinary(512), ContentTypeId), 2) As key FROM [ContentTypeUsage]) as T where key like '%< contenttypeid >%'

SELECT tp_Title FROM ( SELECT *, Convert(varchar(512), CONVERT(varbinary(512), ContentTypeId), 2) As key FROM [ContentTypeUsage] Join AllLists on ContentTypeUsage.ListId = AllLists.tp_ID ) as T where key like '%< contenttypeid >%'