Reporting From SharePoint Lists

Introduction
Overview
One of my occasional tasks involves taking data that is stored in one, or more, SharePoint lists and presenting it in a formatted table or chart.  The requirements vary as does the infrastructure and there are many options available.  In this article I’m going to take a while to talk through the options available and how to make some of them work in different scenarios.

Scenarios to consider
SharePoint is a big technology, deployed in a wide variety of circumstances and situations.  Part of the job of a consultant is to determine the best way to use the available technology in any given circumstance.  Some of the things that need to be considered when choosing how to report on a SharePoint list include:

  1. SharePoint Version (3/2007 or 2010)
  2. SharePoint Edition (Services/Foundation, Server Standard, Server Enterprise)
  3. SQL Version (2005/2008 or 2008R2)
  4. SQL Edition (Express Advanced Services, Workgroup, Standard or above)
  5. Report access from LAN only or across an Internet Facing Web App
  6. Whether Reporting Services is in Integrated or Native Mode
  7. Size and complexity of the SharePoint farm
  8. Complexity of the reports required
  9. Whether unsupported techniques are allowed

I’ll try to clarify some of these a bit further later on.

Report Mechanism
There are several mechanisms available for displaying the list data.  I’ll list out the ones that I can think of here and then try to talk about each one.  Some I’ll cover in detail.  These include:

  1. SharePoint Designer DataView (with/without jQuery)
  2. Google Charts
  3. ChartPart for SharePoint by Victor Wilen
  4. SharePoint built-in Charts
  5. Reporting Services using XML Connection Type
  6. Reporting Services using SharePoint List Connection Type
  7. Reporting Services using a SQL view of the SharePoint Database
  8. Codeplex SSIS Extension that connects Integration Services to SharePoint List and warehouse the data in a separate SQL table
  9. Third party SharePoint List connection type
  10. Chargeable, third-party chart web parts

Discussion
SharePoint Designer DataView

Although it takes a bit of effort and doesn’t have a good deployment story SharePoint Designer can produce some impressive looking tabular data especially when augmented by a bit of jQuery or JavaScript.  There are then some examples of using these to produce simple graphs.  Natalya has a great example of this: Vertical Bar Graph in Data View Web Part (DVWP).  The trouble with these is how quickly the SharePoint UI gives up and you move into hand crafting XSLT to display your data.  The story gets a bit better in some ways with 2010, you can take any existing list view and just add conditional formatting, fonts and other tweaks to it; often without even unghosting things.  Your authentication type and current zone (default/intranet etc) doesn’t prevent things within dataviews from working the same way.

ChartPart for SharePoint by Victor Wilen
You can find this on codeplex http://chartpart.codeplex.com/.  I found this in use on one of our customer sites recently but have no personal experience of it, however it works for them.  The customer in question is currently running Windows SharePoint Services v3 and is talking to us about an upgrade to Foundation 2010.  unfortunately Victor hasn’t yet released a version that works in 2010.

SharePoint built-in Charts
You are running SharePoint Server 2010 with enterprise features; congratulations you now have access to use it’s built-in charts.  I’ve only tried these on one site so far where I wanted to do a pivot chart on a list.  Unfortunately I couldn’t find pivot chart so had to move on and present the data a different way.

Reporting Services using XML Connection Type
David Wise wrote a good article on: Connecting SQL Reporting Services to a SharePoint List.  David’s approach is to use the XML data type in SQL Reporting Services and connect to SharePoint’s lists web service.  The XML data type has been there for a few SQL versions so will work on an older Server as well.  You then construct an XML query that uses the GetListItems method to retrieve a bunch of data from the list.  You can then use the layout tools in reporting services to create a report or chart and finally show this chart inside a SharePoint site with either Page Viewer Web Part (PVWP) or Report Viewer Web Part.  As part of your web service query you can supply parameters that filter the amount of data that is retrieved from the list: How to filter a sharepoint list with report
parameters.  That way you get back the 100 results you want, not the 6000 items in the list.  However because the query is XML there is no useful data typing and you will find yourself using cDate and the like in the report designer to constantly turn things into the correct data type for you to display or sum etc.  Field names also come through in need of attention that adds more time to writing the report.

Immediately we’ve opened up a bunch of other caveats including: how do you authenticate to the lists web service, is Reporting Services in integrated or native mode, are reporting services and SharePoint co-located or on separate Servers, are you in the default zone or across the Internet.

Firstly, the XML data type supports Windows Integrated authentication only.  This means that your users will have access in a report only to lists and items that they have access to in the SharePoint User interface.  If SharePoint and Reporting Services are on the same Server (meaning also that you have a singe Web Front End) then this authentication can be relatively simple whether in Native mode or Integrated mode.  If SharePoint is on one Server and Reporting Services is on another then you need to come up with something else and it varies between modes and your location.

You can have Reporting Services in Native mode and configure SharePoint and Reporting Services to support Kerberos authentication.  This, on a really good day, allows your user’s authentication to double hop between Servers and arrive at the Reporting Services Server.  Remember though that Kerberos requires direct access to the Domain Controllers so even if you publish SharePoint to the Internet (https://sharepoint.example.com) it won’t work from there.  Remember also that you put a PVWP into a Web Part (or 2010 Wiki page) in order to display the report, you then put the internal URL of your report server with some render URL parameters, thus it’s not going to work when you are across the Internet anyway.  Internally you may choose to do away with the need for Kerberos by editing the Data Source once published and using Reporting Services to securely store the credentials for you.  Congratulations your users no longer need access rights to the SharePoint list that you are showing them in the report (maybe this is what you wanted anyway).

You can have Reporting Services in Integrated mode, set its Authentication Mode to Windows Authentication.  You are back to relying on Kerberos for the double hop.  Now you upload your reports to a document library, from the LAN you can click on a report and render it.  However reports only render in the default zone and will fail, if you are in another zone.  You can though create a web part page and put the Reporting Services Report Viewer web part on the page and then point that at the report in the document library.  This trick then allows the report to render from the Internet Zone and it doesn’t matter that we jump to the other Server for which SharePoint has an internal URL.  Oh no, we just failed again – Kerberos won’t work for us because we don’t have direct DC access from the external location.

You can have Reporting Services in Integrated mode, set its Authentication Mode to Trusted Account.  This should solve our need for Kerberos and allow us to render those reports from the Internet without a double hop to a second box.  Fail again, you can’t use data sources that only support Windows Integrated on a Report Server that is Integrated Mode when Authentication Mode is set to Trusted Account.   Oh and you have to look pretty closely into your choice of data source to find out what its authentication options are.

Reporting Services using SharePoint List Connection Type
If you are lucky enough to be using SQL 2008 R2 Reporting Services then you have access to the SharePoint List Connection Type.  Remember though, you won’t get this in Express AS or even Workgroup, its Standard or above (check the comparison matrix on Microsoft).  This connection type gives you a GUI for choosing the columns that you want to pull through, these then arrive with decent names and proper data types.  However there is no filtering in the query; you’ll get back 6000 items and need to use the report to through away all but the 10 you want to show to the user.

The SharePoint List connection type only supports Windows Integrated authentication so all the stuff above about Kerberos and Trusted Accounts applies here as well, meaning that it is also only suitable for the default zone across the LAN or a VPN.

Reporting Services using a SQL view of the SharePoint Database
Be warned this is officially not supported.  You will find people on the Internet warning you off this technique but lets explore it anyway.  SharePoint stores all the data in every list in the alluserdata table of the content database.  If you have control over the SQL Server then you can directly query that table to extract the data in a list.  My preference is to create another database and then create a series of views within it.  Each view looks at the SharePoint content database to retrieve results for a specific list.  I can then use those views to build reports.

Before I talk about how I build the queries, I want to first consider this in comparison to the systems above.  Firstly, here is another example of why you might create a new Site Collection instead of a new Site; consider using STSADM to create a new site collection in a new content database.  On a recent project, the main content database was 8GB so I created an additional one for the new application that I was going to report on, thus my unsupported queries can’t affect the bulk of the data.

Now I’m in TSQL I can do parameterised queries on the table and retrieve just the 10 rows that represent the 10 items in a list of 6000.  How can this not be better than using the SharePoint List connection type, pulling everything back and throwing most of it away.  Oh and check the clustered index on the AllUserData table, it looks like it should help with these queries as well.

Now I’m in TSQL I can use either Windows Authentication or SQL Authentication.  So if I’m in SharePoint Integrated mode and my report Server is on a separate machine, I can now use Trusted Account, store my SQL username and password and solve all my authentication issues.  With one more small trick I can finally make my reports available in ALL SharePoint Zones.

What is that small trick, well if you publish a report to a SharePoint library and click on it, to run it across the Internet you get a message telling you that this can only work in the default zone.  Instead I create a Web Part Page and put a Reporting Services Report Viewer Web Part on the page, I point that web part at the report’s RDL and the report can then render in the non default zone and I finally have my solution.  If it’s not supported, why is it the only way that works.

*****Update March 2012 Relating to SharePoint 2010/SQL 2008R2 Reporting Services**********
If you are using SharePoint 2010 you will have the SQL 2008R2 Report Viewer, you may have this with an older SharePoint version anywhere.  There are two tricks with this to force it to work:
1 – If you use the browse button to find your report in the list then it will put a relative URL in place.  You need to prefix this with the http:// stuff that turns it into an absolute URL for your default zone.
2-You need to expand the View section and untick Show Report Builder Menu
*************End Update

OK, so how do I query the SharePoint Content database.  In my new Site collection, I create a list called Regions:

SharePoint Regions List

SharePoint Regions List

This simple list just has the title column that I’ve renamed to Region.  If you use List Settings on the ribbon and inspect the URL, you will be able to see the list ID.  You need to take the %2Ds and turn them into dashes.  Then in SQL Management Studio you can use a query on the database similar to the following

SELECT      tp_ID AS ID, nvarchar1 AS [Region]
FROM         WSS_Content.dbo.AllUserData
WHERE     (tp_ListId = ‘{9045D3AA-475B-48DA-8043-631D4DC79421}’)
  AND (tp_IsCurrentVersion = 1) AND (tp_DeleteTransactionId = 0)
  AND (AllUserData.tp_RowOrdinal = 0)
ORDER BY ID

  • tp_ListId indetifies the list that we want data for
  • tp_IsCurrentVersion makes sure we only get the latest version of the item
  • tp_DeleteTransactionId makes sure we don’t get items that are in the recycle bin

The table has a bunch (64 I think) Varchar columns, 16 Int columns, some date columns, some SQL Variant columns etc.  The data type you pick in SharePoint effects which type of column will store the actual data.  The Title column is always stored in varchar1.  The next Text column you add to the list will be in varchar2.  Calculated columns are in the SQLVariant columns.  Lookups are in the Int columns (you need to watch for that one).  On a simple list, each item will only take one row in the all user data table (tp_RowOrdinal=0).  If you have more than 16 Integers then your list will start using RowOrdinal 0 and RowOrdinal 1 to fully represent each list item.  I don’t know if there is a limit on the number of RowOrdinals that a list can use, however each time it adds one, you take a performance hit reading or writing data to that list.  SharePoint 2010 has a default limit of 6 rows per list, but this can be increased.

Integers can be the biggest cause of a rollover onto another row, they are used for every lookup, also each time you add a workflow to a list there is a workflow status column created, which uses the Integer columns as well.  So 25 lookups, 10 workflows and you are onto three rows.  Remember to consider if you should normalise your design; this gets a chunk easier in 2010 anyway.

You generally need to look at the data in the list, look at a query with all columns in the database and match the two to see which database column/Row Ordinal, matches which list column.  I have also seen a Bamboo Solutions web part that will show this information without you having to work it out yourself.

If you run the query above, you’ll get a result similar to the one below:

SQL Select SharePoint Regions List

I’ll take the query that I’ve worked out above and use it to create the RegionsView or Vw_Regions, if you prefer.  If I have to do a view on another list that uses regions as a lookup I can then either join to the RegionsView, or use a sub select to turn the integer ID into the text of the lookup.

Conclusion
Like most things in SharePoint, there is more than one way to meet the customer’s requirements.  I have seen, in production, sites using: dataviews, reports based on SharePoint List connection type, reports based on SQL Views of the SharePoint database and sites using bespoke/third-party charting web parts.  I think that you have to consider each situation on its merits and decide which option is going to be the best fit for the customer.

Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

One Response to Reporting From SharePoint Lists

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s