Interesting research regarding Records Management in SharePoint 2010

Most of the results of my research below are from Gimmal.com, Adam Harmetz of Microsoft and the Microsoft site itself.

MOSS 2007 is the first version of SharePoint to include out-of-the-box records management functionality.

What is SharePoint and MOSS Records Management
At its core, SharePoint is a web-based portal that provides enterprise collaboration and document management functionality. It is not an application but rather a platform, and it includes a template for creating a dedicated records repository site called the Records Center. Until 2007, any organization using SharePoint for document management had to integrate a third-party records management application to manage its records. At any point in a document’s lifecycle, users can declare it a record. Typically, this is done in one of two ways: the end user manually declares the document a record by right-clicking it and selecting an option to send it to the third-party records repository, or the record declaration takes place as part of an automated process that is completely transparent to the solution’s end users.

Records Routing – Because training end users to properly classify records can be difficult, the Records Center can be configured to classify records automatically based on the record’s content type. This, for example, would allow an end user to send a document with a content type of ‘Financial Statement’ to the ‘Accounting’ records library in the Records Center without having to manually navigate the Records Center libraries and folders.
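To make the routing idea concrete, here is a minimal sketch in plain Python. It is not SharePoint code and does not use any SharePoint API; the table contents (apart from the ‘Financial Statement’ to ‘Accounting’ pair from the example above), the fallback library name and the function name are all assumptions for illustration.

```python
# Hypothetical routing table: content type name -> records library name.
# Only the 'Financial Statement' -> 'Accounting' pair comes from the text.
ROUTING_TABLE = {
    "Financial Statement": "Accounting",
    "Contract": "Legal",  # assumed example entry
}

# Assumed fallback for content types that have no routing entry.
DEFAULT_LIBRARY = "Unclassified Records"


def route_record(content_type: str) -> str:
    """Return the records library a submitted document should land in."""
    return ROUTING_TABLE.get(content_type, DEFAULT_LIBRARY)
```

The point of the lookup is that the end user only supplies the content type; the destination is decided by the table, not by the person submitting the record.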

Litigation Holds – A ‘hold’ on a record suspends destruction practices and procedures as necessary to comply with preservation obligations related to actual or reasonably anticipated litigation, governmental investigation, or audit. (It should be noted that holds only suspend the expiration of records in the records repository; they do not prevent someone from deleting records manually.)
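The caveat about holds is easy to miss, so here is a small Python sketch of the behaviour as described: a hold blocks automatic expiration but does nothing to stop a manual delete. The class, field and function names are hypothetical and only model the rule; this is not how SharePoint implements holds.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Record:
    name: str
    expires_on: date
    holds: set = field(default_factory=set)  # names of holds on this record


def due_for_expiration(record: Record, today: date) -> bool:
    """A record expires only if its date has passed and no hold is active."""
    return today >= record.expires_on and not record.holds


def manual_delete(repository: dict, name: str) -> None:
    """Manual deletion ignores holds, mirroring the caveat noted above."""
    repository.pop(name, None)
```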

The Records Center provides an excellent set of core records management features, but the standard MOSS 2007 Records Center does not have the records management features that a true enterprise solution would require. Some of the functionality missing includes:

  • A multilevel, hierarchical file plan
  • Metadata propagation
  • Event-based retention scheduling
  • Vital records review processing
  • Record relationships
  • Transfer disposition
  • Multiphase disposition processing
  • Electronically stored records destruction processing
  • Comprehensive email records management

The lack of a true hierarchical file plan, event-based disposition, vital records review processing, and other records management application features may also limit the standard MOSS 2007 Records Center as the sole solution for an organization’s records management requirements.

DoD 5015.2 Resource Kit

While not without its faults, the MOSS 2007 DoD 5015.2 Resource Kit significantly enhances SharePoint’s records management capabilities. Multi-phase retention schedules can be set, file plans can be created and records can be managed throughout their full life cycle with much more comprehensive records management functionality than the features included in the standard Records Center.

The Resource Kit includes the following:
File Plan Creation – enables system administrators to create a file plan structure within the Records Center. The file plan consists of the primary level, called Categories, and the secondary level, called Folders. This file plan structure mirrors the one used in the DoD 5015.2 Test procedures. Records can be declared at both the category and folder levels. The retention applied to the records is inherited from the parent category.
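A rough Python model of the two-level file plan may help here. It assumes retention is stored once on the category and read through by every record, whether the record was declared on the category or on one of its folders. All class and function names are made up for illustration and are not part of the Resource Kit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    name: str
    retention_years: int  # retention is defined on the category
    folders: List["Folder"] = field(default_factory=list)


@dataclass
class Folder:
    name: str
    parent: Category


@dataclass
class Record:
    title: str
    category: Category
    folder: Optional[Folder] = None  # None means declared at category level

    @property
    def retention_years(self) -> int:
        # Retention is inherited from the parent category in both cases.
        return self.category.retention_years


def declare(title: str, container) -> Record:
    """Declare a record at either the category or the folder level."""
    if isinstance(container, Folder):
        return Record(title, container.parent, container)
    return Record(title, container)
```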

Metadata Propagation – Metadata values assigned to the categories and folders in the DoD 5015.2 Resource Kit file plan can be inherited by the records they contain. For instance, one of the metadata properties assigned to records in the repository is the record’s parent category. This inherited property allows the solution to identify where the record is classified within records reports and records search result lists.

Record Relationships – The DoD 5015.2 Resource Kit ships with records linking capability. Any record can be associated with any other record in the file plan. This relationship is reciprocal and is updated as records are moved or deleted. Additionally, the Resource Kit allows an administrator to create site-specific relationships that may not exist with the standard implementation. New records relationships can be either hierarchical or peer-to-peer.

Folder Closing – Folders in the DoD 5015.2 Resource Kit file plan are typically used for case-based records retention, meaning all the records in the folder qualify for disposition at the same time. Case-based retention normally requires an event-driven retention period. When an event occurs that starts retention on the records classified into a folder, the Resource Kit provides a feature that allows the folder to be ‘closed’, which prevents any new records from being added to it. If necessary, the folder can be reopened at any time by a system administrator with the proper permission.

Vital Records Review Processing – A small percentage of records in an organization’s repository are considered ‘vital’. This means they are essential to the organization’s continuity of operations if the organization suffers a catastrophic disaster. Most organizations want a means to periodically review their vital records to ensure they are accurate and up-to-date. The DoD 5015.2 Resource Kit provides the ability to assign a vital review process to categories and folders in the file plan. In the Resource Kit, the category or folder is designated as containing vital records and assigned a Vital Review Period (the time between each review) and a Vital Record Reviewer. The Vital Record Reviewer is a user or group of users who will receive an email notification when the records are due for review. The email has a link to the category or folder to be reviewed and a link to a workflow that the reviewer must complete as part of the review process. Once the reviewer has examined the records and completed the workflow, the system will change the ‘Last Reviewed Date’ to the current date and the process will begin again.
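The review cycle boils down to simple date arithmetic. The sketch below assumes the Vital Review Period is expressed in days, which the text does not actually specify, and the function names are hypothetical.

```python
from datetime import date, timedelta


def next_review_due(last_reviewed: date, review_period_days: int) -> date:
    """Date on which the reviewer should next be notified."""
    return last_reviewed + timedelta(days=review_period_days)


def is_review_due(last_reviewed: date, review_period_days: int, today: date) -> bool:
    """True once the review period has elapsed since the last review."""
    return today >= next_review_due(last_reviewed, review_period_days)


def complete_review(today: date) -> date:
    """Completing the workflow resets 'Last Reviewed Date' to the current date."""
    return today
```

After `complete_review` the new ‘Last Reviewed Date’ feeds back into `next_review_due`, which is the "process will begin again" part of the description.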

Record Disposition Cutoff Processing – A cutoff date is simply the date that a record’s retention period begins. This could be the day the record is declared, the day of a particular event or a calendar date, such as the end of the month. Typically, authorization is required to approve when a record or group of records is assigned a cutoff date. The DoD 5015.2 Resource Kit automates the cutoff approval process by initiating a workflow whenever a record or group of records is due to be cut off.
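As a worked example of the end-of-month case, the Python sketch below computes a cutoff date and then the date the retention period ends. It assumes retention is counted in whole years from the cutoff; the function names and the leap-day handling are my own choices, not Resource Kit behaviour.

```python
import calendar
from datetime import date


def end_of_month_cutoff(declared_on: date) -> date:
    """Cutoff on the last day of the month in which the record was declared."""
    last_day = calendar.monthrange(declared_on.year, declared_on.month)[1]
    return declared_on.replace(day=last_day)


def disposition_date(cutoff: date, retention_years: int) -> date:
    """Retention runs from the cutoff date, not from the declaration date."""
    try:
        return cutoff.replace(year=cutoff.year + retention_years)
    except ValueError:
        # Feb 29 cutoff landing in a non-leap year: fall back to Feb 28.
        return cutoff.replace(year=cutoff.year + retention_years, day=28)
```

So a record declared on 10 February 2010 with a seven-year retention would be cut off on 28 February 2010 and become eligible for disposition on 28 February 2017.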

Enhanced Disposition Processing – The DoD 5015.2 Resource Kit adds significant disposition functionality to the Records Center. The Resource Kit automates all records disposition processing through workflows. Multi-phase retention cycles can now be applied to the same set of records. And final records disposition can now include transfers as well as destruction.

Enhanced Records Search – The DoD 5015.2 Resource Kit adds functionality to the Records Center Search page. The search page is now configurable, so the system administrator can select the default properties available for searching. The columns displayed in a search result list, and the order of those columns, can also be configured to suit the end users’ preferences.

What is a File Plan

The file plan is the primary records management planning document in SharePoint Server 2010. Although file plans can differ across organizations, they typically:

  • Describe the kinds of items the organization acknowledges to be records.
  • Describe what broader category of records the items belong to.
  • Indicate where records are stored.
  • Describe retention periods for records.
  • Delineate who is responsible for managing the various kinds of records.

Introducing SharePoint 2010 Records Management Capabilities

A new trend in records management

SharePoint 2010 provides an alternative to the traditional process of copying or moving records to another location and then applying security and retention policies. You can manage records “in place”, which means that you can leave a document in its current location on a site, declare it as a record, and apply the appropriate security, retention and disposition properties to the record. You can still use the Records Center site template, but now you have the option of managing records in any site.

There are essentially three ways to manage records. You can:

  • Manage records in an archive such as the Records Center.
  • Manage records in the same document repository (“in place”) as active records (def: records in frequent use, regardless of their date of creation, required for current business relating to the administration or function of an organization).
  • Use a combination of the two methods above. For example, you could keep records in place with active documents for a specified period of time, and then move the records to an archive (Records Center).
Create and configure a Records Center site

This section provides guidance on the major steps you need to take to create and configure a Records Center site.

  1. Create the Records Center site using the Records Center site template.
  2. Create record libraries or lists to manage and store each record type that is specified in your file plan (def: A file plan describes the types of documents or items that an organization acknowledges as official business records. It indicates where these records are stored, and it provides information that differentiates one type of record from another).
  3. Associate content types to your libraries and lists.
  4. Create and add site columns to the relevant content types to contain and display the metadata for each record type that is specified in your file plan.
  5. Add an information management policy to a content type on the Records Center site.
  6. Configure the Content Organizer to route each record type to the appropriate location.
Configure in-place records management

When you use the Records Center, you are working in a locked-down repository and can use a Send To operation to get records into that repository. However, any site that is enabled for in-place records management can be configured as a records management system. In this type of system, unlike with the Records Center, you can store records along with active documents in a collaborative space. Some additional benefits of using an in-place records management system are:

  • Records can exist and be managed across multiple sites.
  • With versioning enabled, maintaining versions of records is automatic.
  • eDiscovery search can be executed against both records and active documents at the same time.
  • Broader control over what a record is in your organization and who can create a record.

There are three major steps to configure in-place records management:

  1. Activate in-place records management at the site collection level.
  2. Configure record declaration settings at the site collection level.
  3. Configure record declaration settings at the list or library level.
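
The three steps above form a small hierarchy: the feature has to be active, the site collection supplies a default, and an individual list or library can override that default. Here is a Python sketch of how such a resolution could work. The enum values and the function are a model I made up to illustrate the layering, not SharePoint's actual object model.

```python
from enum import Enum


class LibrarySetting(Enum):
    USE_SITE_COLLECTION_DEFAULT = "default"
    ALWAYS_ALLOW = "always"
    NEVER_ALLOW = "never"


def can_declare_manually(
    feature_active: bool,
    site_collection_allows: bool,
    library_setting: LibrarySetting = LibrarySetting.USE_SITE_COLLECTION_DEFAULT,
) -> bool:
    """Resolve whether manual record declaration is available in a library."""
    if not feature_active:  # step 1: the feature must be activated first
        return False
    if library_setting is LibrarySetting.ALWAYS_ALLOW:  # step 3: library override
        return True
    if library_setting is LibrarySetting.NEVER_ALLOW:
        return False
    return site_collection_allows  # step 2: site collection default applies
```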

Differences between a records archive and in-place records

Factor: Managing record retention
  • Records archive: The content organizer automatically puts new records in the correct folder in the archive’s file plan, based on metadata.
  • In-place records: There may be different policies for records and active documents based on the current content type or location.

Factor: Restrict which users can view records
  • Records archive: Yes. The archive specifies the permissions for the record.
  • In-place records: No. Permissions do not change when a document becomes a record. However, you can restrict which users can edit and delete records.

Factor: Ease of locating records (for records managers)
  • Records archive: Easier. All records are in one location.
  • In-place records: Harder. Records are spread across multiple collaboration sites.

Factor: Maintain all document versions as records
  • Records archive: The user must explicitly send each version of a document to the archive.
  • In-place records: Automatic, assuming versioning is turned on.

Factor: Ease of locating information (for team collaborators)
  • Records archive: Harder, although a link to the document can be added to the collaboration site when the document becomes a record.
  • In-place records: Easier.

Factor: Clutter of collaboration site
  • Records archive: Collaboration site contains only active documents.
  • In-place records: Collaboration site contains active and inactive documents (records), although you can create views to display only records.

Factor: Ability to audit records
  • Records archive: Yes.
  • In-place records: Dependent on audit policy of the collaboration site.

Factor: Scope of eDiscovery
  • Records archive: Active documents and records are searched separately.
  • In-place records: The same eDiscovery search includes records and active documents.

Factor: Administrative security
  • Records archive: A records manager can manage the records archive.
  • In-place records: Collaboration site administrators have permission to manage records and active documents.

The following table describes differences between the two records management approaches that might affect how you manage IT resources.

Resource differences between a records archive and in-place records

Factor: Number of sites to manage
  • Records archive: More sites; that is, there is a separate archive in addition to collaboration sites.
  • In-place records: Fewer sites.

Factor: Scalability
  • Records archive: Relieves database size pressure on collaboration sites.
  • In-place records: Maximum site collection size reached sooner.

Factor: Ease of management
  • Records archive: Separate site or farm for records.
  • In-place records: No additional site provisioning work beyond what is already needed for the sites that have active documents.

Factor: Storage
  • Records archive: Can store records on different storage medium.
  • In-place records: Active documents and records stored together.

Other Major Functionality

  • Remote Blob Storage
  • Configure the Content Organizer to route documents
  • Configuring in place records management
  • Create a hold to suspend documents or items
  • Create and apply information management policies
  • Create Content Organizer rules to route documents
  • Declare any list or library item as a record
  • Implement Records Management

Introduction to Remote Blob Storage (RBS)

In SharePoint Server 2010, a binary large object (BLOB) is a large block of data stored in a database that is known by its size and location instead of by its structure — for example a Microsoft Office 2010 document or a video file. By default, these BLOBs, also known as unstructured data, are stored directly in the SharePoint content database along with the associated metadata, or structured data. Because these BLOBs can be very large, it might be better to store BLOBs outside the content database. BLOBs are immutable. Accordingly, a new copy of the BLOB must be stored for each version of that BLOB. Because of this, as a database’s usage increases, the total size of its BLOB data can expand quickly and grow larger than the total size of the document metadata and other structured data that is stored in the database. BLOB data can consume lots of space and uses server resources that are optimized for database access patterns. Therefore, it can be helpful to move BLOB data out of the SQL Server database, and onto commodity or content addressable storage. To do this, you can use RBS.

RBS is a Microsoft SQL Server library API set that is incorporated as an add-on feature pack for Microsoft SQL Server 2008 R2, SQL Server 2008 or Microsoft SQL Server 2008 R2 Express. The RBS feature enables applications, such as SharePoint Server 2010, to store BLOBs in a location outside the content databases. Storing the BLOBs externally can reduce how much SQL Server database storage space is required. The metadata for each BLOB is stored in the SQL Server database and the BLOB is stored in the RBS store.

SharePoint Server 2010 uses the RBS feature to store BLOBs outside of the content database. SQL Server and SharePoint Server 2010 jointly manage the data integrity between the database records and contents of the RBS external store on a per-database basis.

RBS is composed of the following components:

  • RBS client library
    An RBS client library consists of a managed library that coordinates the BLOB storage with Microsoft SharePoint Server, SQL Server, and RBS provider components.
  • Remote BLOB Storage provider
    An RBS provider consists of a managed library and, optionally, a set of native libraries that communicate with the BLOB store.
    An example of an RBS provider is the SQL FILESTREAM provider. The SQL FILESTREAM provider is a feature of SQL Server 2008 that enables storage of and efficient access to BLOB data by using a combination of SQL Server 2008 and the NTFS file system. For more information about FILESTREAM, see FILESTREAM Overview (http://go.microsoft.com/fwlink/?LinkID=166020&clcid=0x409) and FILESTREAM Storage in SQL Server 2008 (http://go.microsoft.com/fwlink/?LinkID=165746&clcid=0x409).
  • BLOB store
    A BLOB store is an entity that is used to store BLOB data. This can be a content-addressable storage (CAS) solution, a file server that supports Server Message Block (SMB), or a SQL Server database.
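
To picture the split between structured and unstructured data, here is a toy Python model. It is emphatically not the RBS API: two in-memory classes stand in for the content database and the external BLOB store, and the sketch assumes a content-addressable store keyed by a SHA-256 hash. All names are hypothetical.

```python
import hashlib


class BlobStore:
    """Stand-in for an external BLOB store (content-addressable in this sketch)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The hash of the content doubles as the BLOB's identifier.
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


class ContentDatabase:
    """Stand-in for the content database: metadata plus a BLOB reference."""

    def __init__(self, store: BlobStore):
        self._store = store
        self._rows = {}

    def save_document(self, name: str, data: bytes) -> None:
        # Only the metadata and the reference stay in the database.
        self._rows[name] = {"size": len(data), "blob_id": self._store.put(data)}

    def read_document(self, name: str) -> bytes:
        return self._store.get(self._rows[name]["blob_id"])
```

The database row stays small no matter how large the document is, which is the storage saving the text describes.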

Q: Should I consider using RBS all the time?

No. RBS may be used when:

  • You have huge content databases for document archiving (terabytes of data) and want to reduce storage cost.
  • You have large media files to stream to the audience.
  • You need RBS to integrate third-party storage or archiving solutions (for example, EMC Documentum) with SharePoint.

By contrast, if you only have 100 GB of data spread across several content databases, and most of the content is documents, RBS will not benefit your server farm.

I found two very good one-hour presentations on ECM and records from the 2009 SharePoint Conference in Vegas. The 2007 version was certainly not enterprise ready. According to many out there, 2010 is enterprise ready and flexible, but you have to use folders in the Records Center (archive) to handle flexible retention requirements. Metadata and the Content Organizer can make this happen behind the scenes, though, so it is hidden from the user.

The other revelation for me is that a records center is really an archive without a lot of I/O, so the native SharePoint content database will scale to 100 million items out of the box (OOB). Getting 100 million items in there requires a fully automated process with careful metadata design, file plans and so on. The only reason to go to BLOB storage outside of SharePoint is cost reduction, not scalability or performance, so we need to stick to OOB. To help scale, they have a hub approach in which the hub receives content and distributes it to the right folder in the right database. Their example is the fiscal year: there is a database for each fiscal year, so the hub sits on top with multiple OOB SharePoint content databases below it, one per year, and scalability is unlimited. With search and managed metadata features set up right, one can actually find a record in a pile of millions of items.
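
The hub idea can be sketched as a function that maps a record's date to a per-fiscal-year database. This is only an illustration in Python: the 1 July fiscal year start and the `Records_FY…` naming scheme are assumptions of mine, and the presentations did not spell out either.

```python
from datetime import date

FISCAL_YEAR_START_MONTH = 7  # assumption: the fiscal year starts on 1 July


def fiscal_year(record_date: date) -> int:
    """Fiscal year, named after the calendar year in which it ends."""
    if record_date.month >= FISCAL_YEAR_START_MONTH:
        return record_date.year + 1
    return record_date.year


def target_database(record_date: date) -> str:
    """Hypothetical naming scheme: one content database per fiscal year."""
    return f"Records_FY{fiscal_year(record_date)}"
```

Because each new fiscal year gets its own database, no single database keeps growing indefinitely, which is where the scalability claim comes from.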

Note: The OOB records management feature sends only the latest version to the archive (Records Center); all other versions are discarded. If you wish to keep every version in the records archive, a SharePoint workflow can be deployed to copy each version to the archive whenever a new version is created. To avoid this copying scenario, they recommend using in-place records management for any objects whose versions are critical to the record.
