I am currently architecting a large SharePoint deployment.
This deployment has the potential to grow to petabytes in size over the course of several years.
One of the current issues we are discussing is the option of storing our data in SharePoint using InfoPath Forms. Some of these forms contain hundres of fields and require a lot of mapping to content types for persistence and search. Our search requirement is primarily a singular identifier and NOT the contents of the forms, although I am told I should preempt the "want" to search in the future.
We require our information to be utilised for secondary purposes (such as reporting etc). The information MUST be accessible instantly after persisting to the system.
My core questions therefore are:
- What are the benefit/risks of this approach compared to storing our data in a singular relational store using web-service persistence?
- If we decided on this approach what would be the impact of changing the forms, content-types over time?
- What happens when our farm grows beyond a single web-application / site collection how accessible will the information be? Will I know where it is and how portable will the information be overtime?
1.)
Benefit:
Risks:
2.) Well this is a tough one. Changing the form template and re-deploying is easy and only takes a few minutes. However changing the structure (underlying XML) of the template can get you in trouble very easily, because older (filled out) forms will be invalid - there is an option to "upgrade" older forms out-of-the-box, but in my experience it never worked as it supposed to.
Content Types behave very similar, say you want to delete a column from a content type because it's no longer needed - you'll have to remove all references to it, which means removing all items so you can delete the column.
3.) Well portability is definitly an issue with InfoPath, because it heavily relies on the corresponding URL structure. You absolutely can add more site collections, but this means you have to deploy your form template to each site collection. Information (filled out forms) can't easily be shared between site collection's because each form contains the SourceURL (where it came from) and the Namespace of the template (which changes constantly once you deploy).
Considering your requirements, i would strongly recommend a relational store instead of InfoPath - simply because it is not designed to be a data storage.
I would use a SQL database to store the data and a custom UI (WebPart or Application Page) to perform CRUD operations. This means that the information is not actually stored in SharePoint - just displayed (which also means that it can't be searched with the builtin SharePoint Search). There is also the possibilty to use the Business Connectivity Services (which basically does all of the above without you needing to create a custom UI - however very slow with large amount of data).
If you do need the information just in SharePoint, why not just make all this happen with Lists only?