Too Much Information

Too Much InformationLike many businesses, my current employer uses Microsoft Dynamics GP for our main  accounting/inventory software. From a recent trip to Connections (the big convention for Dynamics), I learned that there were some cool data mining utilities available from Microsoft that we might be able to use. I love free stuff, and the opportunity to give our executives something to chew on for a bit (keeping them out of my hair) was one I couldn't pass up.

The Analysis Services install I managed to dig up (after much searching on Microsoft's Customer Source) went pretty easily. There are a couple of SSIS jobs included that populate an summary database with relevant data pulled from the objects you want to throw into your data warehouse. Not too shabby, even if it's an extra aggregating step. In fact, I kind of appreciate the added encapsulation and I know I'll have to restrain myself from using this database for other reports.

Anyway, the only thing left at that point was to "process" the data warehouse. That's where the data warehouse reads the "factual" information and chunks it up into nice little cubes that should have precise business-person meaning. It can take a while, so I was prepared to leave the process running a bit.

Five hours later...

We're just not so big a company that this thing should have taken five+ hours. Digging into the process that was taking so long revealed that at least one of the dimensions it was trying to process included a table join that seemed a bit fishy. It included a master-detail relationship that should have ended up with something around 500,000 rows. Checking the "Estimated Execution Plan" on the generated select statement showed that it was estimated to return closer to 2 billion rows. That's a messed-up join is what that is.

Fortunately, Visual Studio 2005 gives you some cool tools that work with Analysis Services. One of them is a wizard that you can point at a data warehouse and have it reconstruct it as a project. Checking the Data Source View "GPDataWarehouse.dsv" revealed what the problem was. Roughly half the relations in that view had split the key so that there was one for the actual relation minus the CompanyId and another with the CompanyId all by its lonesome. The effect of this is that the data warehouse would process once for the real relationship and again for the spare. Since CompanyId is the same value for all my records, what I ended up with was a data set getting ready to relate 147,698 records with every one of 11,937 records. I'll spare you digging out a calculator; that's 1,763,071,026 rows total.

Now frankly, I could have simply removed the CompanyId entirely from all relationships—CompanyId is only really relevant if your Dynamics install is trying to manage multiple companies. We aren't doing that. Still, I can be a little, uh, rigid with stuff like that so I went ahead and changed the relevant relationships to include CompanyId and simply deleted the spares. Once I deployed the project back to the data warehouse, processing it completed in a mere 15 minutes.

I suspect that the cause of this disjointed relationship is that the original warehouse was configured for SQL Server 2000 and we're using SQL Server 2005. At any rate, things are a lot happier now. I have an SSIS package that kicks off the summary database update and then processes the cube and the whole thing runs at night in 20 minutes or so.

For those who might be experiencing similar issues, I've uploaded my working Data Source View. You should be able to replace yours using SQL Server Management Studio. I selected all the options on install so it includes the Financials, Inventory, Payables, Purchases, Receivables, and Sales cubes. I'd do a backup before attempting it, but you were probably going to do that anyway...

4. April 2007 22:25 by Jacob | Comments (0) | Permalink

Out-Cleverring Yourself

Have you ever hacked a product to do something it wasn't intended to do in order to "simplify" things for your users and have that blow up in your face? This is an account of my experiences doing just that with MS Reporting Services.

If you've used Reporting Services at all, you'll know that there are two virtual directories that are created on IIS when you first install it to a server: ReportServer actually serves up the reports by passing the requested data to external applications via whatever protocol you have configured and Reports (aka ReportManager) which serves as a user interface for reports on the server.

ReportManager is, by far, the most visible of the two. Using ReportManager, you can organize your reports and data sources and set permissions on who can view and change them. Often, this is enough of a user interface that it is a viable deployment mechanism to simply point your users to the Url for a report under the ReportManager directory.

Ingenious people, like myself, will often designate a new Web instance on IIS just for Reporting Services, generally naming it something clever like "reports". When you do this, it is very tempting to point the root directory of that web instance to the ReportManager directory. This means that you can point your users to "http://reports/Invoice" instead of "http://reports/Reports/Invoice". You can see why you'd want to make this change (assuming you are similarly obsessive).

There is, however, an unintended side-effect to this change. Once you do this, your ReportServer will begin throwing errors if you ever decide you want to use direct Url access to display reports. Not a lot of people do this unless they're using a Reporting Services "Integration" that uses this functionality for showing reports. Personally, I ran into this situation when trying to tie our Reporting Services forms into Great Plains. Since Great Plains integrations (both native and third party) expect to use ReportServer for the report display, I was shocked and dismayed to find myself staring at this error:

The type Microsoft.ReportingServices.UI.WebControlConnection, ReportingServicesWebUserInterface, Version=, Culture=neutral, PublicKeyToken=89845dcd8080cc91 does not implement IReportServerConnection or could not be found

Hitting Google for this error is more than a little bit frustrating because the vast majority of issues where it shows up is in mis-configuring a ReportViewer object in an ASP.Net web application. This is manifestly not the case here.

This was a dome-scratcher for a lot longer than I like to admit. All of the advice pieces that I could find for this error are simply inapplicable. Fortunately, I ran across a Usenet post by Dave Green from last July. Dave reported that he could fix ReportServer if he removed a couple of settings from the <appSettings> section of the ReportManager. This is most odd because the two simply shouldn't be linked--at least, in that direction. If they were linked, I'd expect that link to go the other way (i.e. something in ReportServer breaking ReportManager). After all, ReportManager is essentially a UI on top of ReportServer.

I tested his findings and sure enough, removing the appSettings fixed ReportServer for me. Since removing those settings broke ReportManager, this was an unappetizing solution.

It was then that I remembered that IIS does an interesting little trick with web.config files and subdirectories. You see, each subdirectory on a website inherits the configuration from parent directories (also machine.config, but that's not relevant to our story). It doesn't matter how those directories are physically arranged, what is important is that IIS uses the Url to determine inheritance. So in a situation where "http://reports" is pointing at ReportManager and "http://reports/ReportServer" is pointing to ReportServer, that means that ReportServer is inheriting the web.config settings from ReportManager. Thus, the appSettings for ReportManager are being read by ReportServer and misapplied (I've no idea why the presence of an appSettings entry would break ReportServer)

Fortunately, the fix for this issue is pretty simple. The <appSettings> element includes a spec for a clear directive that will wipe out all inherited values. Adding an appropriate entry to the web.config in ReportServer cleared the problem right up. Here's the section for the curious (and/or lazy):

    <clear />

So there you go. I hoisted myself by my own petard. Fortunately, I was able to figure out how to get myself unhoisted. I now have both ReportManager and ReportServer humming along nicely without having to undo my clever directory mapping. Since I haven't been able to find this solution anywhere, I figured I'd post this for those in similar situations.

21. February 2007 21:47 by Jacob | Comments (0) | Permalink


<<  September 2017  >>

View posts in large calendar