The content library is a new concept that was introduced in System Center 2012 Configuration Manager.  In a nutshell, the content library stores all Configuration Manager content efficiently on disk.  If the same file is part of two different packages, only one copy is stored in the content library.  However, references are kept to indicate that the file is part of both packages.
The focus of this blog is to provide more insight into what happens behind the scenes and to help Configuration Manager users and administrators understand the concept better.
Note: The content library is also known as the “single-instance store”, referring to the single instance of any particular file.

Rationale for the Content Library

The rationale for the content library is to optimize disk storage and to avoid distributing a file that already exists on the distribution points.
If two different packages each contain a particular file that is identical (even if the file names are different), only one copy of this file will be stored by the content library.  This minimizes the disk space consumption.
When distributing a package, we first analyze all the files in that package. If a file to be distributed is already present on the distribution point as part of another package or part of a previous version of the same package, that file is not copied to the distribution point.  Instead, we add a mapping reference between that file and the new package that we are distributing. This helps reduce the network traffic by not copying files that already exist.  Additionally, it allows for more rapid provisioning of packages on the distribution point.
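To make the single-instancing idea concrete, here is a minimal in-memory sketch in Python.  It is only an illustration and not how Configuration Manager is implemented: the class name SingleInstanceStore, its methods, and the use of SHA-256 as the file hash are all assumptions made for this example.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory model of a single-instance store (not the real on-disk format)."""

    def __init__(self):
        self.files = {}  # hash -> file bytes, stored exactly once
        self.users = {}  # hash -> set of content IDs that reference the file

    def add_content(self, content_id, files):
        """Add a content's files; return the names that actually had to be copied."""
        copied = []
        for name, data in files.items():
            # Assumption: SHA-256 stands in for whatever hash the product uses.
            digest = hashlib.sha256(data).hexdigest().upper()
            if digest not in self.files:
                self.files[digest] = data  # first occurrence: store the bytes
                copied.append(name)
            # Always record the mapping between the file and this content.
            self.users.setdefault(digest, set()).add(content_id)
        return copied

    def remove_content(self, content_id):
        """Drop a content's references; delete files that nothing else uses."""
        for digest in list(self.users):
            self.users[digest].discard(content_id)
            if not self.users[digest]:
                del self.users[digest]
                del self.files[digest]
```

In this sketch, adding a second content that contains the same bytes under a different file name copies nothing; it only adds a reference.  The stored file disappears only after every content that uses it has been removed.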

Location of the Content Library

A copy of the content library (containing all packages) is housed on the site server (as the source for distribution points).  Moreover, each distribution point will have a copy (as the source for clients), containing the packages distributed to the distribution point.  The content library is designed to optimize both network and disk usage in the distribution process.  This helps to keep our customers’ costs lower and efficiency higher.
The content library is typically stored on the root of a drive in a folder called “SCCMContentLib”.  This folder is shared and has restricted permissions to prevent accidental damage.  Within this are the Package Library (“PkgLib” folder), the Data Library (“DataLib” folder), and the File Library (“FileLib” folder).  The Package Library contains information about what packages are present on the distribution point.  The Data Library contains information about the original structure of the packages.  The File Library contains the original files in the package; this is typically where the bulk of the storage is used.
For instance, in the screenshot below, the content library is located on the root of the C: drive, in C:\SCCMContentLib.  It is shared as “SCCMContentLib$”.  Regardless of which drive the content library is located on, the primary share location will always be “SCCMContentLib$”.

Screenshot

 

Diagram

 

Package Library

The starting point for exploring the content library is the Package Library folder, “PkgLib”.  Within this folder will be several files, one for each package distributed to the distribution point.  Each file is named after the package ID, e.g. ABC00001.INI.  This file contains a list of content IDs (under the “[Packages]” section) that are part of the package, as well as other information, such as the version.  Using these content IDs, we can continue exploring the contents of the content library.  Let us assume that ABC00001 is a legacy-style package, at version 1.  Thus, the content ID in this file will be ABC00001.1.
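As a rough illustration, a Package Library file can be read with any INI parser.  The sketch below uses Python's configparser and assumes that every key under “[Packages]” is a content ID.  The exact layout of the real file is not described in this post, so treat the sample text and the function name read_content_ids as hypothetical.

```python
import configparser


def read_content_ids(ini_text):
    """Return the keys of the [Packages] section, assumed here to be content IDs."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # tolerate keys that have no value
        delimiters=("=",),     # only "=" separates key and value
        interpolation=None,    # do not treat "%" specially
    )
    parser.optionxform = str   # keep the original case of the content IDs
    parser.read_string(ini_text)
    if not parser.has_section("Packages"):
        return []
    return list(parser["Packages"].keys())


# Hypothetical sample for package ABC00001 at version 1.
SAMPLE = "[Packages]\nABC00001.1=\n"
content_ids = read_content_ids(SAMPLE)
```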

Data Library

Once we have found the content IDs we are interested in, we can continue exploring the content library.  In the Data Library folder, “DataLib”, there will be one file and one folder for each of the contents in each package.  This file and folder will be named, for example, ABC00001.1.INI and ABC00001.1, respectively.  The file contains information for validation.  Inside the folder, the folder structure from the original package is recreated. 
However, the files in the Data Library are replaced by INI files that have the name of the original file in the package, e.g. MyFile.exe.INI.  These files contain information about the original file, such as the size, time modified, and the hash.  The first four characters of the hash will help us find where the original file is in the File Library.  Suppose the hash in MyFile.exe.INI is “DEF98765”.  Thus, the first four characters are “DEF9”.
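The naming rules above can be sketched as a small path helper.  This is only an illustration built from the names used in this post; the function name datalib_paths and its parameters are hypothetical, and PureWindowsPath is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def datalib_paths(library_root, content_id, original_relative_path):
    """Build the Data Library paths that belong to one file of one content."""
    data_lib = PureWindowsPath(library_root) / "DataLib"
    content_folder = data_lib / content_id
    return {
        # Per-content file that holds validation information.
        "validation_ini": str(data_lib / f"{content_id}.INI"),
        # Per-content folder that mirrors the original package structure.
        "content_folder": str(content_folder),
        # Stand-in for the original file: same name with ".INI" appended.
        "file_ini": str(content_folder / (original_relative_path + ".INI")),
    }


paths = datalib_paths(r"C:\SCCMContentLib", "ABC00001.1", "MyFile.exe")
```

Here paths["file_ini"] points at the MyFile.exe.INI file whose hash value leads on to the File Library.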

File Library

The last step in exploration is locating the file in the File Library, “FileLib”.  If the content library is spanned across multiple drives, the files could be in the File Library on any of these drives.
Using the first four characters from the hash found in the Data Library, we can locate the file.  Inside the File Library folder(s) will be many folders, each with a four-character name.  Find the folder that matches the first four characters from the hash.  Remember that the folder may be in the File Library on a different drive.
Once this folder is found, it will contain one or more sets of three files.  These three will share the same name, but one will have the extension INI, one will have the extension SIG, and one will not have any extension.  The file with no extension whose name is equal to the hash found above is the original file.
Using the example above, we would look for folder “DEF9”, containing “DEF98765.INI”, “DEF98765.SIG”, and “DEF98765”.  Here, “DEF98765” is MyFile.exe.  Additionally, in the INI file, there will be a list of “users”; these are content IDs that share the file.  The file will never be removed unless all of these contents are also removed.
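The lookup described in this section can be sketched as follows.  Because the File Library may span drives, the helper returns one candidate set of paths per drive and leaves the existence check to the caller.  The function name file_set_candidates and the list of drive letters are hypothetical; the folder and file names come from the example above.

```python
from pathlib import PureWindowsPath


def file_set_candidates(file_hash, drive_letters):
    """List where the three files for a hash would live on each drive."""
    candidates = []
    for letter in drive_letters:
        # The folder name is the first four characters of the hash.
        folder = (
            PureWindowsPath(f"{letter}:\\")
            / "SCCMContentLib"
            / "FileLib"
            / file_hash[:4]
        )
        candidates.append({
            "content": str(folder / file_hash),       # the original file, no extension
            "ini": str(folder / f"{file_hash}.INI"),  # includes the list of "users"
            "sig": str(folder / f"{file_hash}.SIG"),
        })
    return candidates


candidates = file_set_candidates("DEF98765", ["C", "D"])
```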

Difference Between Distribute, Update, and Redistribute Actions

The first major action pertaining to content distribution is the Distribute action.  This refers to the initial distribution of a package to a distribution point.  This is triggered with Distribute Content in the Configuration Manager console.  This will transfer all files in a package to the target distribution points, excluding those which are already present as part of another package—these will become shared.
The second major action is the Update action.  This is typically used when a package has been changed and all distribution points to which it is distributed need the updated content.  This is triggered with Update Distribution Points in the console.  This will transfer the changed files to all distribution points.  Unchanged files will not be transferred.  If a file is removed from the package in the updated version, it will be deleted from the package on the distribution point (as long as no other packages are sharing it).
The third major action is the Redistribute action, triggered with Redistribute in the Configuration Manager console.  This will transfer the entire content to a specific distribution point.  Files will still be transferred and overwritten even if they are already present on the distribution point.  The chief purpose of the Redistribute action is to correct any inconsistencies that may exist in the content library.
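The difference between the three actions comes down to which files are sent.  The following sketch models that decision with plain sets; it is a simplification under stated assumptions, not the product's logic.  The function names, the action strings, and the idea of representing a package as a mapping of file name to hash are all hypothetical.

```python
def files_to_transfer(action, package, hashes_on_dp):
    """Return the file names that would be sent to the distribution point.

    package      -- {file name: hash} for the version being sent
    hashes_on_dp -- hashes already present in the distribution point's library
    """
    if action == "redistribute":
        # Everything is sent and overwritten, even if it is already present.
        return sorted(package)
    if action in ("distribute", "update"):
        # Only files whose content is not already present are sent.
        return sorted(name for name, h in package.items() if h not in hashes_on_dp)
    raise ValueError(f"unknown action: {action}")


def hashes_to_delete(old_version, new_version, hashes_shared_elsewhere):
    """After an update, return hashes dropped by the package and used nowhere else."""
    removed = set(old_version.values()) - set(new_version.values())
    return sorted(removed - set(hashes_shared_elsewhere))
```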

Content Library Explorer

There is a new tool available in the System Center 2012 R2 Configuration Manager toolkit called “Content Library Explorer”.  This tool facilitates user-friendly exploration of the contents of the content library.  This tool cannot be used to modify the contents, but can provide insight into what is present, as well as allowing validation and redistribution.  Please refer to the toolkit documentation for more information. This documentation is installed with the toolkit.  There is typically a shortcut for it in the start menu called "System Center 2012 R2 Configuration Manager Toolkit Help".

Drive Spanning

The content library can be spanned across multiple drives.  These drives can be manually chosen by the administrator at the time the distribution point is created.  Alternatively, they can be chosen automatically by Configuration Manager (this is the default setting).
If they are chosen by the administrator, a primary and secondary drive can be chosen.  On the primary drive, all metadata will be stored.  Only the File Library will be spanned across to the secondary drive.  On secondary drives, the folder’s share name includes the drive letter.  For instance, if D: and E: are secondary drives for the content library, the share names would be “SCCMContentLibD$” and “SCCMContentLibE$”, respectively.
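The share naming rule can be written down in a couple of lines.  The helper name share_name is hypothetical; the share names themselves are the ones given above.

```python
def share_name(drive_letter, is_primary):
    """Return the content library share name for a drive."""
    if is_primary:
        # The primary share is always the same, whatever the drive letter.
        return "SCCMContentLib$"
    # Secondary drives include the drive letter in the share name.
    return f"SCCMContentLib{drive_letter.upper()}$"


shares = [share_name("C", True), share_name("D", False), share_name("E", False)]
```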
If “Automatic” is chosen, Configuration Manager selects the drive with the most available free space as its primary drive.  All of the metadata will be stored on this drive.  Only the File Library will be spanned across to secondary drives.  The administrator selects a reserve space amount; Configuration Manager attempts to use a secondary disk once the best available disk has only this reserve space amount left free.  Each time a new drive is selected for use, the drive with the most available free space is selected.
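A simplified model of the automatic selection is sketched below.  It assumes free space is known per drive in megabytes and that a drive stays in use until only the reserve is left free; what happens when no other drive is available is not covered in this post, so the fallback in the sketch is an assumption.  The function name next_drive and its parameters are hypothetical.

```python
def next_drive(free_mb, current, reserve_mb):
    """Pick the drive that new File Library data should go to.

    free_mb    -- {drive letter: free space in MB} for every eligible drive
    current    -- drive in use now, or None if nothing has been selected yet
    reserve_mb -- the reserve space amount chosen by the administrator
    """
    # Keep using the current drive until only the reserve is left free.
    if current is not None and free_mb[current] > reserve_mb:
        return current
    # Otherwise select the drive with the most available free space.
    others = {drive: mb for drive, mb in free_mb.items() if drive != current}
    if not others:
        return current  # assumption: no other drive to fall back to
    return max(others, key=others.get)
```

For example, with 500 MB free on C:, 2000 MB on D:, and 900 MB on E:, the first selection is D:.  Once D: is down to the reserve, the next selection is E:.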
Currently, it is not possible to specify from the console that a distribution point should use all drives except for a specific set.  However, a drive can be excluded by creating an empty file on the root of that drive, named exactly “NO_SMS_ON_DRIVE.SMS”.  This file must be present before the drive is selected for use by Configuration Manager.  If Configuration Manager detects this file on the root of the drive, it will not use the drive for the content library.  Additionally, there is a command-line tool in the toolkit for permanently moving the content library to a different drive, called “ContentLibraryTransfer.exe”.
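Creating the marker file can be scripted.  The sketch below shows one way to do it in Python and how a drive list could be filtered by the presence of the file.  The function names are hypothetical, and the demo uses temporary folders as stand-ins for drive roots so that it does not touch any real drive.

```python
import tempfile
from pathlib import Path

MARKER = "NO_SMS_ON_DRIVE.SMS"


def exclude_drive(drive_root):
    """Create the empty marker file on the root of a drive."""
    marker = Path(drive_root) / MARKER
    marker.touch(exist_ok=True)
    return marker


def usable_drives(drive_roots):
    """Return the drive roots that do not carry the marker file."""
    return [root for root in drive_roots if not (Path(root) / MARKER).exists()]


def demo():
    """Try the helpers on two temporary folders that stand in for drive roots."""
    with tempfile.TemporaryDirectory() as d, tempfile.TemporaryDirectory() as e:
        exclude_drive(d)
        return usable_drives([d, e]) == [e]
```

On a real server, the call would be something like exclude_drive("E:\\"), made before the drive is selected for use.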

Azure Distribution Points

Azure distribution points do not use single instancing.  This is because packages are encrypted before they are sent, and each package has a unique encryption key.  So, even if two files were identical, the encrypted versions would not be.

Troubleshooting

Here are a few tips for troubleshooting issues with the content library:
  • Check the logs on both the site server (distmgr.log and PkgXferMgr.log) and the distribution point (smsdpprov.log) for any pointers to the failures.
  • Use the Content Library Explorer tool to gain any additional insights.
  • Check if there are any file locks by other processes (e.g. antivirus) or other software.  It is a good practice to exempt the content library on all drives from automatic antivirus scans, as well as the temporary staging directory, “SMS_DP$”, on each drive.
  • If you still have issues, you can try validating the package from the Configuration Manager console to see if there are any hash mismatches.
  • As a last option, you can redistribute the content, which should resolve the issue.
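To illustrate what a hash mismatch means, here is a small sketch that compares expected hashes against the bytes actually stored.  It is not the validation performed by the console: SHA-256 is an assumption, and the function name find_hash_mismatches and its inputs are hypothetical.

```python
import hashlib


def find_hash_mismatches(expected_hashes, stored_files):
    """Return the names whose stored bytes are missing or do not match.

    expected_hashes -- {file name: expected hash as a hex string}
    stored_files    -- {file name: bytes found on the distribution point}
    """
    mismatches = []
    for name, expected in expected_hashes.items():
        data = stored_files.get(name)
        # Assumption: SHA-256 stands in for the hash the product uses.
        actual = None if data is None else hashlib.sha256(data).hexdigest().upper()
        if actual != expected.upper():
            mismatches.append(name)
    return sorted(mismatches)
```

Any file reported by a check like this would be a candidate for the redistribute step in the last tip.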

Orphaned Contents

On occasion, it is possible for files, contents, or packages to be orphaned on a distribution point: they are no longer distributed to the distribution point, but due to an error during deletion, they were not deleted.  Often this is due to external programs (such as antivirus) having file locks on the content when deletion is attempted by Configuration Manager.
This is a known issue.  The operability of Configuration Manager is not affected; the distribution point will still function as intended.  However, this orphaned content can waste disk space.  We are investigating ways to clean up the existing orphaned content in future releases of the toolkit, as well as fixing the product to ensure that no future content is orphaned.

http://blogs.technet.com/b/configmgrteam/archive/2013/10/29/understanding-the-configuration-manager-content-library.aspx