Multi-blog Obsession

by Jacob 6. April 2009 05:26

TwoBlogsThe multi-blog data provider for BlogEngine.Net has been taking up a lot of my brain space lately—to the point that I’m able to announce that it is installed and working “in the wild” on a hosted site (though not in anything like a heavy-load situation). I now have a copy of both my dev site and my personal site up and running from the same directory (and the same database). Frankly, I didn’t think it’d be as easy as it was. This success prompted me to create a 2.0 release (that is now up on the CodePlex site).

Getting Static

My main fear was with the heavy use of static variables in BlogEngine.Net. You see, BE.Net loads all the data into memory using static List variables. I found this out when I went looking for the best way to store a BlogId (so that it didn’t have to parse against an Url every time a request came through).

While there are pros and cons to keeping your entire blog in memory (pro: speed and ease, con: memory bloat and a large delay on any request that triggers a data load), my concern was how an application would react when it had to serve two sets of data. Fortunately, it seems that even when two sites share an application pool on IIS, they still keep their static spaces separate. I’m not sure what I was going to do if it didn’t but I was spared the tragedy.

Configuration

Installing the blog provider mainly involves copying the binary into the /bin directory and then updating the web.config to point to the right driver. There are three providers in your web.config that are affected.

Blog Provider

The blog provider handles the blog data. Settings, posts, categories and suchlike. Add the provider and update the “defaultProvider” tag and you’re ready to go.

<BlogEngine>
  <blogProvider defaultProvider="SQLBlogProvider">
    <providers>
      <add name="SQLBlogProvider" type="BlogEngine.SQLServer.SqlBlogProvider, BlogEngine.SQLServer" connectionStringName="BE"/>
    </providers>
  </blogProvider>
</BlogEngine>

Membership Provider

The membership provider handles user authentication and management (stuff like changing passwords and such). Technically, you don’t need to change this, but if you don’t the users will be the same across blogs (not a problem if you aren’t multi-blogging). I frankly haven’t tested if a mixed-configuration actually works but it should. Again, add the provider and update the “defaultProvider” tag and you’re ready to go.

<membership defaultProvider="LinqMembershipProvider">
  <providers>
    <clear/>
    <add name="LinqMembershipProvider" type="BlogEngine.SQLServer.LinqMembershipProvider, BlogEngine.SQLServer" passwordFormat="Hashed" connectionStringName="BE"/>
  </providers>
</membership>

Role Provider

The role provider handles authorization and what users are assigned to which roles. Again, you don’t technically have to change this if you don’t need it. Also again, it’s simply a matter of adding the provider and changing the “defaultProvider” tag.

<roleManager defaultProvider="LinqRoleProvider" enabled="true" cacheRolesInCookie="true" cookieName=".BLOGENGINEROLES">
  <providers>
    <clear/>
    <add name="LinqRoleProvider" type="BlogEngine.SQLServer.LinqRoleProvider, BlogEngine.SQLServer" connectionStringName="BE"/>
  </providers>
</roleManager>

Multiple-blog Configuration

To set stuff up for multiple blogs, you’ll need to run a script or two in your database and add a tag to all the providers. There are two script files (included in both the binary and source files), one for setting up the initial database changes (DatabaseSchemaChanges.sql—mostly adds tables) and another for adding the base values for a new blog (AddNewBlog.sql).

I wanted to make this easier by having the driver do the updates for you. That may still happen in the future, but since BlogEngine.Net itself requires manually running a script if you want to use the database provider I decided not to sweat it too hard. Presumably, anyone running in a database has to be running scripts manually anyway so this isn’t going to be a show stopper.

The provider will run just fine after running either script, even if you aren’t using multiple blogs. In other words, just because the database changed doesn’t mean that the single-blog installation is hosed. The exception to this is the “be_Settings” table. If you’re going to run for a while with a single-blog after running the first script, you’ll want to add a default to the BlogId column so it doesn’t choke when you insert and update settings.DefaultBlogId

Both scripts are “templated” so you can change key factors (a table prefix on the first and a couple of blog values in the second). Filling in the template is a matter of hitting ctrl-shift-M in Query Analyzer or SQL Server Management Studio. That’ll bring up a prompt for what values you want those template variables to have.TemplatePrompt

The final thing to setup is to add a multiblog attribute on the providers. That’ll make your providers look something like this.

<add name="SQLBlogProvider" type="BlogEngine.SQLServer.SqlBlogProvider, BlogEngine.SQLServer" connectionStringName="BE" multiblog="true"/>

The provider selects the blog it wants to deliver based on three configured values.

  • Host is the base address. The provider matches the Host value against the end of the host (so rabidpaladin.com will match “rabidpaladin.com”, “www.rabidpaladin.com” and “blog.rabidpaladin.com”).
  • Path is the rest of the Url. The provider matches the Path value against the start of the requested path.
  • Port is the port (if any) in the Url. Honestly, I threw this one in there as much for my testing as for any real-world use I expect it to see.

Tags

One thing I added (at the provider level) is that when a post comes in without any tags, the provider takes a moment to scan for tags in the post body. This is a feature I did the initial work for in Subtext so porting it over was a matter of a couple minutes. Any time a post is inserted into the database, the provider checks if it has tags yet. If no tags are present, it will scan the content for appropriate anchor markup (like those produced for Technorati tags). That means that on import, my posts all had their tags correctly populated—saving me a lot of extra work (or face losing tags on imported posts). That I was able to avoid the brain-damaged tag handling of BlogEngine.Net is just a bonus (they lower-case tags on creation and then re-capitalize them when serving them up).

Other Stuff

As I said, this should get you set up. Since I used this blog provider from the start on both my blogs, I can verify that the import tool works just fine in a multi-blog configuration. As far as BlogEngine.Net is aware, it’s doing the same stuff it always has. Indeed, the only change I made from BlogEngine.Net’s standard v1.4.5 release was in UrlRewrite.cs to allow links produced by Subtext to still work (so I don’t throw errors on old links).

else if (url.Contains("/POST/") || url.Contains("/ARCHIVE/"))

I submitted a patch at one time to have this hit the base source code but apparently it wasn’t deemed worthy.

Also, I found that running the provider in IIS7 is a bit tricky. Since BlogEngine.Net loads extensions from the database on application start you’ll get errors if you are configured for “Integrated” mode. That’s because “Integrated” mode (quite properly) fires the application start event before the HttpContext.Request is populated (which is what I’m using to determine what blog is being requested). Setting the application pool to “Classic” mode will solve this “problem”.IIS7ClassicMode

Looking Forward

My blogs are still running Subtext at their base addresses. I’m still not quite ready to take the plunge on BlogEngine.Net.  I am, however, undoubtedly one step closer.

Tags: , , , , , , ,

Programming

Multiple Blog Data

by Jacob 29. March 2009 09:01

PartialSchema So I have a working LINQ to SQL provider for BlogEngine.Net. Now what? Given a little spare time, how about I see if I can’t use it to support running multiple blogs from the same installation? More importantly, see if I can use it to support running multiple blogs from the same database?

Doing just that turns out not to be all that difficult.

Scheming

The current architecture for BlogEngine.Net’s data already has a bit more cohesion than it technically needs. All the objects have their own individual Ids and those Ids are used to relate objects to each other (though there is one exception). Since every object already has its own Id (usually a Guid), splitting objects into separate blogs isn’t the chore it might otherwise have been.

There are two options when it comes to dividing items up into multiple blogs. First, each object can have a column added to its table to indicate which blog it is associated with. Second, you can create a cross-reference table that associates a blog Id with the object Id for the blog.

Columns
My initial impulse in most cases would be to add a BlogId column to the tables where it is needed. The reason is simple: objects belonging to the blog are in a true parent-child relationship and that relationship is generally best expressed as a field on the child indicating its parent. The relationship can (and really should) be enforced with a foreign key constraint on the column to ensure that the relationship is intact.

Cross-references
Having cross-reference tables is a bit more problematic and carries with it some maintenance and performance concerns. Not only does it force a join when you want to read the objects for a specific blog, but it means that insert, update, and delete commands now have to involve two tables instead of just one. One advantage of cross-reference tables is that they’re easier to extract back out if you need to devolve your data. Additionally, foreign key constraint integrity is triggered when the cross-reference entry is created instead of on your blog objects themselves—making your touch a bit lighter if you have other actors in the system.

Complicating Things 
No decision is best for every occasion, and when it came time to design how I wanted multiple blogs to work, I was really reluctant to mess with the native tables of BlogEngine.Net. I’m not sure if my hesitation is a matter of respect for a project I’m not involved in or if I’m just being unreasonably squeamish, but I eventually chose to go the cross-reference route. My main reasoning is that I wanted my intrusions to remain light and easily devolved.

I ♥ Linq

Now, normally, adding a super-structure on top of an existing infrastructure is a real pain. Editing all your SQL statements manually becomes an exercise in precision string manipulation and if you’re working through stored procedures… ugh. Linq made this really easy.

Here’s an example from the FillProfiles method of the blog provider.

var profileData = from p in context.Profiles
                  select p;
if (isMultiBlog)
{
    profileData = from p in profileData
                  join bp in context.BlogProfiles on p.ProfileID equals bp.ProfileId
                  where bp.BlogId == Utils.GetBlogId()
                  select p;
}

The initial select is good for the general case. It pulls all the objects from the Profiles table. Adding a filter when we have multiple blogs is added in the if clause. Note that the second select references the first (“from p in profileData”). Linq knows that the second “from” is a refinement of the first and puts them together logically. Since Linq defers execution of the query until it’s actually used, the query sent to the server includes the full constraint (i.e. filtering happens on the database). Here’s the statement that’s actually sent.

SELECT [t0].[ProfileID], [t0].[UserName], [t0].[SettingName], [t0].[SettingValue]
FROM [dbo].[be_Profiles] AS [t0]
INNER JOIN [dbo].[be_BlogProfiles] AS [t1] ON [t0].[ProfileID] = [t1].[ProfileId]
WHERE [t1].[BlogId] = @p0

This method ensures that you only take the hit of the join if you are in a multi-blog setup. And without pulling everything to the client.

Settings

I had some fun with the Settings table because it is an exception to BlogEngine.Net’s Id rigor. It has interesting impact on the Linq situation, but I think I’ll give it its own (short) post later.

Beta Available

So I tested this in my own home-grown environment and it seems to work as expected. In consequence, I’ve created a new release at the project homepage. I’m calling it a beta, though it barely warrants the label. I worry that it has only been tested in a single environment. If you’re a hearty soul and a BE.Net user, please give it a go. I’ll be spending some time getting it set up and tested in an actual public setting with my personal blogs here shortly. As always, I welcome feedback either at codeplex or comments or via email.

Tags: , , , , , , ,

Programming

Gratuitous Use of Linq

by Jacob 7. March 2009 06:10

PowerShovel Every now and then I get to doing something just because... well, because I can. These projects usually atrophy before becoming anything usable and serve more as a way to explore and practice than anything else. Usually. My latest tangent actually got to a state where I can let it loose in the wild and it'll probably actual do what it is supposed to do.

BlogEngine.Net

Let me be perfectly clear up front: I don't actually use BlogEngine.Net at all. Anywhere. I'm still a Subtext guy when it comes to blogging software. BlogEngine.Net still lacks critical features and that prevents me from using it as yet (primarily running multiple blogs from a single installation/database).

That said, BlogEngine.Net is a lovely little product with a lot to like about it. The extensibility model is extremely easy to use and flexible. The theming doesn't suck. And its architecture is easy to navigate/understand even while it makes investments in areas I consider likely to payoff.

Data Access

While I like the product, the data access bugs me more than a little. It uses a provider model and includes built-in XML and database providers. These are good things. For flexibility, the database provider uses System.Data.Common and the DbProviderFactory with string-built commands. This structure allows BlogEngine.Net to be able to use any database that has a .Net data provider (including things like MySQL, SQLite, or VistaDB). Incidentally, they downplay (unintentionally?) this feature on their website saying in their FAQ that they support XML and “the SQL Server provider”.

At any rate, here's their SelectPage implementation as an example

public override Page SelectPage(Guid id)
        {
            Page page = new Page();
 
            string connString = ConfigurationManager.ConnectionStrings[connStringName].ConnectionString;
            string providerName = ConfigurationManager.ConnectionStrings[connStringName].ProviderName;
            DbProviderFactory provider = DbProviderFactories.GetFactory(providerName);
 
            using (DbConnection conn = provider.CreateConnection())
            {
                conn.ConnectionString = connString;
 
                using (DbCommand cmd = conn.CreateCommand())
                {
                    string sqlQuery = "SELECT PageID, Title, Description, PageContent, DateCreated, " +
                                        "   DateModified, Keywords, IsPublished, IsFrontPage, Parent, ShowInList " +
                                        "FROM " + tablePrefix + "Pages " +
                                        "WHERE PageID = " + parmPrefix + "id";
 
                    cmd.CommandText = sqlQuery;
                    cmd.CommandType = CommandType.Text;
 
                    DbParameter dpID = provider.CreateParameter();
                    dpID.ParameterName = parmPrefix + "id";
                    dpID.Value = id.ToString();
                    cmd.Parameters.Add(dpID);
 
                    conn.Open();
                    using (DbDataReader rdr = cmd.ExecuteReader())
                    {
                        if (rdr.HasRows)
                        {
                            rdr.Read();
 
                            page.Id = rdr.GetGuid(0);
                            page.Title = rdr.GetString(1);
                            page.Content = rdr.GetString(3);
                            if (!rdr.IsDBNull(2))
                                page.Description = rdr.GetString(2);
                            if (!rdr.IsDBNull(4))
                                page.DateCreated = rdr.GetDateTime(4);
                            if (!rdr.IsDBNull(5))
                                page.DateModified = rdr.GetDateTime(5);
                            if (!rdr.IsDBNull(6))
                                page.Keywords = rdr.GetString(6);
                            if (!rdr.IsDBNull(7))
                                page.IsPublished = rdr.GetBoolean(7);
                            if (!rdr.IsDBNull(8))
                                page.IsFrontPage = rdr.GetBoolean(8);
                            if (!rdr.IsDBNull(9))
                                page.Parent = rdr.GetGuid(9);
                            if (!rdr.IsDBNull(10))
                                page.ShowInList = rdr.GetBoolean(10);
                        }
                    }
                }
            }
 
            return page;
        }

I want to reiterate that I'm not ripping on their choice to use this data access methodology. If you want the flexibility to use any .Net supported data provider without a third-party dependency, this how you get it. You could optimize some of the stringiness, but with trade-offs.

Linq, Linq, Baby

That said, I don't mind being tied to SQL Server and if I'm going to muck with the data layer (like, say, if I'm going to attempt multiple blogs from a single application instance) I want something simple that I can use without all that extra goo. I looked for any hint that this might have been done already, but I couldn't find anything. It looks like I'm the only one with this particular manifestation of brain damage, however.

Since configurable table prefixes are a desirable feature and since that's easier to do in Linq to SQL I figured it'd be best to go that route. It was good to get the table prefix stuff into a working project and have it work about as I expected it to.

Implementing the BlogEngine.Net blog provider turns out to be pretty easy in Linq. Ditto the Role and Membership providers. I tried to stay as close as possible to the DbBlogProvider. Even so, I found that some of the admin components are picky enough that even little things could bite me (the category editing page blows up if you leave category descriptions null, for example).

For compare and contrast, here's my SelectPage method using Linq to SQL

public override Page SelectPage(Guid id)
{
    Page page = null;
    using (Data.BlogDataContext context = getNewContext())
    {
        Data.Page pageData = (from p in context.Pages
                              where p.PageID == id
                              select p).FirstOrDefault();
        if (pageData != null)
        {
            page = new Page()
            {
                Id = pageData.PageID,
                Title = pageData.Title,
                Description = pageData.Description,
                Content = pageData.PageContent,
                Keywords = pageData.Keywords,
                DateCreated = pageData.DateCreated.HasValue ? pageData.DateCreated.Value : DateTime.MinValue,
                DateModified = pageData.DateModified.HasValue ? pageData.DateModified.Value : DateTime.MinValue,
                IsPublished = pageData.IsPublished.HasValue ? pageData.IsPublished.Value : false,
                IsFrontPage = pageData.IsFrontPage.HasValue ? pageData.IsFrontPage.Value : false,
                Parent = pageData.Parent.HasValue ? pageData.Parent.Value : Guid.Empty,
                ShowInList = pageData.ShowInList.HasValue ? pageData.ShowInList.Value : false
            };
        }
    }
    return page;
}

BlogEngine.Net for SQL Server

So I got the thing working and thought I'd open it for “the community”. Since BlogEngine.Net is on CodePlex, that was a natural choice. Anyone sharing my peculiar proclivity is invited to head on over, take a poke and let me know what can change for the better. Or better yet, submit a patch. Or better, better yet, join the project. (also: CodePlex's recent transparent svn compatibility is awesome! When did that happen?)

If you do poke at the project, some forewarning. First, I don't have unit tests for this and don't plan on any (I'm not against using them, I just don't want the headache of creating them myself). Data access is more in the realm of “integration testing”, so I'm not sure there's much you can really do that's actually useful. It might be different if BlogEngine.Net had a suite of tests I could use to validate my providers against, but...

Second, I haven't gone out of my way to do a lot of commenting. This is deliberate. These methods are short, should be self-explanatory, and since they should be hidden behind a provider I didn't even bother with the typical XDoc stuff. Anything directly accessing them such that intellisense comes to bear is doing something wrong...

Room to Improve

In the end, this is the ground-work to remove an (admittedly trivial) barrier in working with BlogEngine.Net. I hope to help move as much as possible into the database in order to support things like multi-blog configurations. BlogEngine.Net hasn't been very disciplined in its storage layer and XML really is the default medium. Things like referrer tracking don't use a provider at all so you really can't (yet) get away from XML files in your App_Data directory (and hence, read/write permission configuration). Since I want that to change, I'm going to see if I can't get that going a bit.

Things to do
  • Decide on a database update methodology (so far, I'm using the already-existing tables and hence piggy-backing on the BlogEngine.Net scripts)
  • Replace the built-in ReferrerModule (use/create provider model access for referrers?)
  • Comb the project for any other errant XML dependencies
  • Work on multi-blogging configuration

Tags: , , , , , ,

Programming

Changing Table Names in an OR/M

by Jacob 27. August 2008 11:33

Empty Sign I spent some quality time googling this and even went and asked the nascent Stack Overflow community and didn't come up with a satisfactory answer. Being the intrepid sort, I opened up a test project and started poking around, compiling information from a number of sources and playing until I got something that worked. For your amusement and/or edification, I'll document what I found.

What I Want to Do

The basic scenario is that many typical “commodity” web applications use databases to store their information. Since most web hosting services come with a single database but charge extra for additional databases, it is common for web-type products to add identifier text to their table and stored procedure names*. The .Net blog software I sometimes contribute to, Subtext, is an example. Take the table I added to hold tags associated to posts a while back, “subtext_Tag”. Using the “subtext_” prefix means that we won't run into naming collisions on our tables if someone has a wiki or forum application that also contains a table for tag entities named “Tag”.

 

* And yes, as one user on Stack Overflow suggested, you could use schema as a differentiator so we could as easily have used a “subtext” schema and our tables would be “subtext.Tag” instead of “dbo.Tag”. That solution is much trickier to setup than a simple table naming convention, though, so I think I prefer the table prefix for now.

 

The .Net Way

While I was initial open to any .Net OR/M the fact of the matter is that the only one I know anything about is SubSonic. Now I like SubSonic a lot, but their tools are geared towards code generation and I couldn't find a handle into runtime manipulation of table names (such might exist, but I was unable to find it).

Since I'm even less familiar with the other third party .Net OR/M data tools (like LLBLGen or NHibernate) and nobody on Stack Overflow (who tend to be knowledgeable about these things) spoke up, I decided to check out what it would take to monkey with the table names at runtime using stuff I have actually used. Namely the Entity Framework and LINQ to SQL. It turns out to be possible in either, though I have to admit to being surprised that it is easier in LINQ to SQL than in EF.

My Test Database

To keep things simple, I created a test database. Since I am a highly creative professional, I named it “Test” and created a table there named “TestTable”.

Test Schema 

LINQ to SQL

Instantiating a new DataContext object in LINQ to SQL includes a constructor that allows you to feed in a MappingSource derivative. By default, L2S uses an attribute mapping object that pulls the metadata from attributes on your classes—in this case the “Name” property of a TableAttribute.

[Table(Name="dbo.TestTable")]
public partial class TestTable : INotifyPropertyChanging, INotifyPropertyChanged
 

Since attributes are immutable at runtime, using the default isn't an option here. Fortunately there's an XmlMappingSource object that can use an XML file (or fragment) to do what I need. Unfortunately, generating an initial mapping file is a touch cumbersome and requires use of the SQLMetal.exe tool provided with Visual Studio.

Here's how (after adding a TestLINQ.dbml file and dragging the table onto it—pre-name-change of course):

  1. I opened the VS Command Prompt (it's in a Visual Studio Tools folder in the start menu).
  2. Changed the directory to my project.
  3. Entered the command “sqlmetal /map:TestLINQ.map /code TestLINQ.dbml”. This generates a TestLINQ.map file.
  4. Right-clicked on the generated file in my project and selected “Include in Project”.
  5. Set the file's “Copy to Output Directory” property to “Copy Always”.

The mapping file is pretty simple. The relevant bit is the Name attribute of the Table element:

<Table Name="dbo.test_TestTable" Member="TestTables">
  <Type Name="LINQ.TestTable">
    <Column Name="TestId" Member="TestId" Storage="_TestId" DbType="Int NOT NULL IDENTITY" IsPrimaryKey="true" IsDbGenerated="true" AutoSync="OnInsert" />
    <Column Name="TestOne" Member="TestOne" Storage="_TestOne" DbType="VarChar(50)" />
    <Column Name="TestTwo" Member="TestTwo" Storage="_TestTwo" DbType="VarChar(50)" />
  </Type>
</Table>
 

Make sure the name is what you want it to be and you're golden. Here's the code I used to test it out after the name change.

XmlMappingSource source = XmlMappingSource.FromUrl("TestLINQ.map");
using (LINQ.TestLINQDataContext context = new LINQ.TestLINQDataContext(Properties.Settings.Default.TestConnectionString, source))
{
    LINQ.TestTable table = new LINQ.TestTable()
    {
        TestOne = "firstLINQ",
        TestTwo = "secondLINQ"
    };
    context.TestTables.InsertOnSubmit(table);
    context.SubmitChanges();
 
    table.TestOne = "thirdLINQ";
    context.SubmitChanges();
 
    context.TestTables.DeleteOnSubmit(table);
    context.SubmitChanges();
}
 

XmlMappingSource even has a .FromXml() method that will create your map from an Xml fragment string.

Entity Framework

As I mentioned before this one is harder. This is a surprise because EF is ostensibly created to make it easier to keep your object definitions separate from your storage definitions. The reason it isn't easier is understandable once you realize that EF is made to be highly configurable and thus its definition files are much more complex than the L2S mapping.

The first problem with EF, though, is that the documentation is schizophrenic. Also confusing. That's because MS rolled the original three configuration files into the .edmx definition file so references on the web imply that those files are easily seen and edited. Even more confusing is that EF actually still uses the .csdl, .ssdl, and .msl files at runtime—it just generates those files from the .edmx and either packs them in the assembly as a resource (by default) or as files in your output directory.

Well, to monkey with the tables at runtime, you have to have access to the configuration files. To do so you need to change the default in the “Metadata Artifact Processing” property of your ConceptualEntityModel and rebuild the project. That'll put perfectly good .csdl, .ssdl, and .msl files in your output bin directory (You don't have to leave it at "Copy to Output Directory" once you have saved these files off for your own use and abuse.)

EF did another funky thing by deviating from the norm in what it pulls from the connection string that you feed it. If you look at your EF connection string in app.config (or web.config) you'll see something like this:

<add name="TestEntities" 
connectionString="metadata=res://*/TestEF.csdl|res://*/TestEF.ssdl|res://*/TestEF.msl;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=localhost;Initial Catalog=Test;Integrated Security=True;MultipleActiveResultSets=True&quot;"
providerName="System.Data.EntityClient" />
 

Notice that there's a “provider connection string” embedded in the connectionString attribute—that's the “normal” connection string that tells EF what database to use for storage. Also notice that the metadata property tells EF where to go for those configuration files (in this case, it's telling EF to look in the assembly for resources TestEF.csdl, TestEF.ssdl and Test.msl).

Armed with this information, I was able to add a set of generated config files to the project stolen from those generated in the output directory. Once there, you have to edit the storage file and the mapping file to use the altered names. The table name used by EF is taken from the Name attribute of the EntitySet element in the .ssdl file. This is unfortunate because the Name attribute is a reference used by things like associations. Which means that you have to make sure you alter all the references to that EntitySet as well (fortunately, these are generally referenced using an EntitySet attribute on the relevant association and thus are relatively easy to find).

<EntitySet Name="test_TestTable" EntityType="TestModel.Store.TestTable" store:Type="Tables" Schema="dbo" />
 

It's not necessary to alter the EntityType, though I found that if you alter it consistently across the file it will work if you do.

The mapping configuration in the .msl file just has to be updated so that the objects use the correct EntitySet for storage.

<EntitySetMapping Name="TestTable">
  <EntityTypeMapping TypeName="IsTypeOf(TestEF.TestTable)">
    <MappingFragment StoreEntitySet="test_TestTable">
      <ScalarProperty Name="TestId" ColumnName="TestId" />
      <ScalarProperty Name="TestOne" ColumnName="TestOne" />
      <ScalarProperty Name="TestTwo" ColumnName="TestTwo" />
    </MappingFragment>
  </EntityTypeMapping>
</EntitySetMapping>
 

Once those changes are made your code will work with the altered table name, though you need to alter the connection string to look for the correct configuration files. Here's the final code I used to test it out.

string connection = "metadata=TestEF.csdl|TestEF.ssdl|TestEF.msl;provider=System.Data.SqlClient;provider connection string=\"Data Source=localhost;Initial Catalog=Test;Integrated Security=True;MultipleActiveResultSets=True\"";
using (TestEntities entities = new TestEntities(connection))
{
    TestTable table = new TestTable()
    {
        TestOne = "firstEF",
        TestTwo = "secondEF"
    };
    entities.AddToTestTable(table);
    entities.SaveChanges();
 
    table.TestOne = "thirdEF";
    entities.SaveChanges();
 
    entities.DeleteObject(table);
    entities.SaveChanges();
}

So Which is Better?

Well, that depends on what you want to accomplish but then, doesn't it always? For me, the LINQ to SQL solution is much cleaner because I'm simply not going to use all the other goo that the Enterprise Framework includes. Plus, the LINQ to SQL solution can use an XML fragment so I can bury that mapping piece wherever I want to, including in inline code. EF requires a file reference so those files have to be either in the assembly resources or on the file system. EF also allows you to leverage Asp.Net Data Services but that's a topic for another post entirely...

Tags: , , , , , , ,

Programming

MS DAAB Is Eeeeviiiiil

by Jacob 30. May 2007 22:30

Phil Haack just posted an update about the benefits of moving to Subsonic as a way to get strongly-typed stored procedures. He includes this bit:

I don’t know about you, but I find it a pain to call stored procedures from code. Either I end up writing way too much code to specify each SqlParameter explicitly, or I use a tool like Microsoft’s Data Access Application Block’s SqlHelper class to pass in the parameter values, which requires me to remember the correct parameter order (it actually supports both methods of calling a stored procedure). What a pain!

Now, I'm not upset that Subtext is moving away from the Microsoft Data Access Application Block. In fact, I couldn't be happier. Subsonic makes things even easier than ADO.Net because you can point it at a connection string and it auto-generates this stuff (and can regenerate appropriately in the event of changes). So Subsonic is the bees knees as far as I'm concerned.

Still, these kinds of statements make me crazy—mostly because I know that Phil is a sharp guy and representative of clued-in developers in general. Here's the thing: ADO.Net has strongly-typed stored procedure access already! Which is why I hate the MS DAAB. If it weren't for this piece of misbegotten code, more developers would have taken the time to actually learn ADO.Net and use it. Subtext could have had this functionality with the move to the .Net Framework 2.0.

And just to be absolutely clear, even if we had done so, we'd still want to implement Subsonic now. I mentioned the auto-generating and updating part, right?

Anyway, here is what it'd take to add a single strongly-typed stored procedure using ADO.Net's built-in adapters.

First, add a DataSet object to your project.

 Next, drag the stored procedure onto the design surface.

 Finally, add the code.

QueriesTableAdapter qa = new QueriesTableAdapter();

return (int) qa.InsertEntry(entry.Title
                          , authorId
                          , entry.Body
                          , (int)entry.PostType
                          , entry.Description
                          , BlogId
                          , entry.DateCreated 
                          , (int) entry.PostConfig
                          , entry.EntryName
                          , entry.DateSyndicated
                          , 0);

Note that this is truly strongly-typed.

Adding additional stored procedures is a matter of dragging and dropping them onto the designer surface.

Easy Schmeasy.

I have no idea why developers find it easier to use the MS DAAB than simply using ADO.Net itself. You don't have to add any new dependencies and you have some pretty sophisticated designers that are a) way more paranoid than anything a developer is going to write on their own and b) designed for extensibility with all the methods marked virtual and the classes marked partial.

Now, there's some quirks of this code that may need to be hashed out for something that's using the provider model to do data access. Altering the connection string, for example, takes some care if you aren't going to leave it fully referenced in the app.config (or web.config).

<add name="Subtext.Framework.Properties.Settings.SubtextData"
       connectionString="..." />

Instead of:

<add name="subtextData" connectionString="..." />

 

Even so, the partial class structure makes it a simple matter to add an initializer method that overrides the generated connection string if you truly need the complexity. And you only have to do that once to cover all your stored procedures.

Technorati tags: , , , ,

Tags: , , , ,

Programming

DataSets and Business Logic

by Jacob 23. November 2006 08:23

Whoa, that was fast. Udi Dahan responded to my post on DataSets and DbConcurrencyException. Cool. Also cool: he has a good point. Two good points, really.

Doing OLTP Better Out of the Box

I'll take his last point first because it's pure conjecture. Why don't DataSets handle OLTP-type functions better? My first two suggestions would, indeed, be better if they were included in the original code generated by the ADO.NET dataset designer. I wish that they were. Frankly, the statements already generated by the "optimistic" updates option are quite complex as-is and adding an additional "OR" condition per field wouldn't really be adding that much in either complexity or readability (which are both beyond repair anyway) and would add to reliability and reduce error conditions.

My guess is that it has to do with my favorite gripe about datasets in general: nobody knows quite what they are for. I suspect that this applies as much to the folks in Redmond as anywhere else. Datasets are obviously a stab at an abstraction layer from the server data and make it easier to do asynchronous database transactions as a regular (i.e. non-database, non-enterprise guru) developer. But that doesn't really answer the question of what they are useful for and when you should use them.

DataSets are, essentially, the red-headed step child of the .NET framework. They get enough care and feeding to survive, but hardly the loving care they'd need to thrive. And really, I think that LINQ pretty much guarantees their eventual demise. Particularly with some of the coolness that is DLINQ.

Datasets Alone Make Lousy Business Objects

As much as I am a fan of DataSets in general, you have to admit that they aren't a great answer in the whole business layer architecture domain.

I mean, you can (if you are sufficiently clever) implement some rudimentary data validation by setting facets on your table fields (not that most people do this--or even know you can). You can encode things like min/max, field length, and other relatively straight-forward data purity limitations. Anything beyond this, however, (like, say, when orders in Japan have to have an accompanying telephone number to be valid) would involve either some nasty derived class structures (if you even can--are strongly-typed DataTables inheritable? I've never tried. It'd be a mess to do so, I think), or wrapping the poor things in real classes.

One solution to this is to use web services as your business layer and toss DataSets back and forth as the "state" of a broader, mostly-conceptual object. This is something of a natural fit because DataSet objects serialize easily as XML (and do so much better--i.e. less buggy--in .NET 2.0). This de-couples methods from data, so isn't terribly OO. It can work in an environment where complex rules must work in widely disparate environments (like a call center application and a self-serve web sales application) when development speed is a concern (as in, say, a high-growth environment).

I think this leads to the kind of complexity Udi says he has seen with datasets. The main faultline is that what methods to call (and where to find them) are in design documents or a developer's head. This can easily lead to a nasty duplication of methods and chaos--problems that functionally don't exist in a stronger object paradigm.

That Said...

Here is where I stick my neck out and reveal my personal preferences (and let all the "real" developers write me off as obviously deluded): although DataSets make admittedly lousy business objects, most non-enterprise level projects just don't need the overhead that a true object data layer represents. For me, it's a case of serious YAGNI.

Take any number of .NET open source software projects I've played with: not one uses DataSets, yet not one needs all the complexity of their custom created classes, either. They aren't doing complex data validation and their CRUD operations are less robust than those produced automatically from the dataset designer. All at a higher expense of resources to produce.

Or take my current place of gainful employ. We have five ASP.NET applications that all have an extremely complex n-tier architecture--all implemented separately in each web application (and nowhere else--they're not even in a separate library). Each of the business objects has a bunch of properties implemented that are straight get/set from an internal field. And that is all they are. Oh, there's a couple of "get" routines that populate the object for different contexts using a separate Data Access Layer object. And an update routine that does the same. And a create... you get the point. It's three layers of abstraction that don't do anything. I shudder to think how much longer all that complexity took to create when a strongly-typed DataSet would have done a much better job and taken a fraction of the time. It makes me want to call the development police to report ORM abuse.

Which is to Say

Don't let all that detract from Udi's point, though. He's right that for seriously complex enterprise-level operations, you can't really get around the fact that you need good architecture for which datasets will likely be inadequate. Relying wholly on DataSets in that case will get you into trouble.

I personally think that you could get away with datasets being the communication objects between web services in most cases even so, but I also realize that there are serious weaknesses in this approach. It works best if the application is confined to a single enterprise domain (like order processing or warehouse inventory management). Once you cross domains with your objects, you incur some serious side-effects, not least of which is that the meaning of your objects (and the operations you want to perform on them) can change with context (sometimes without you knowing it--want an exercise in what I mean? Ask your head of marketing and your head of finance what the definition of a "sale" is--then go ask your board of directors).

So yeah, DataSets aren't always the answer. I'd just prefer if more developers would make that judgement from a standpoint of knowing what DataSets are and what they can do. Too often, their detractors are operating more from faith than from knowledge.*

*Not that this is the case for Udi. For all he has admitted that he isn't personally terribly familiar with datasets, his examples are pretty good at delineating their pressure points and that tends to indicate that he's speaking from some experience with their use in the wild.

 

Tags: , , , , , , , ,

Programming

4 Solutions to DbConcurrencyException in DataSets

by Jacob 21. November 2006 11:35

Following links the other day, I ran across this analysis of DataSets vs. OLTP from Udi Dahan. His clincher in favor of coding OLTP over using datasets is this:

The example that clinched OLTP was this. Two users perform a change to the same entity at the same time – one updates the customer’s marital status, the other changes their address. At the business level, there is no concurrency problem here. Both changes should go through.When using datasets, and those changes are bundled up with a bunch of other changes, and the whole snapshot is sent together from each user, you get a DbConcurrencyException. Like I said, I’m sure there’s a solution to it, I just haven’t heard it yet.

I thought about this for a minute and came up with four solutions for DbConcurrencyException in this scenario using DataSets (though the first two are essentially the same and differ only by who actually implements it). I'm sure there are others, but this should do for starters.

  1. Use stored procedures created by a competent DBA that utilizes parameters for the original and new column state. This means that you check each field with a "OR (<ds.originalValue> = <ds.updateValue>)". This solution passes the same two parameters per field as an "optimistic" pre-generated update statement but it makes the update statement larger by adding this new "OR" condition for each field.
  2. You can do the same by altering a raw update generated from the DataSet designer. This means sending a longer select to the database each update though this can be offset by setting your batch size higher if you have lots of updates you're sending (uh, you'd need ADO.NET 2.0 for that). I'd hesitate to use this method but that's mainly a personal taste issue than anything else (because I'd prefer using stored procedures and recognize that internal network traffic generally isn't the bottleneck in these kinds of transactions, though on-the-fly statement execution plan creation could be).
  3. Override the OnUpdating for the adapter to alter the command sent based on which fields have actually changed. This is probably the closest in effect to the OLTP solution envisioned by Udi. This solution is problematic for me simply because I've never actually tried to do it and I'm not sure you can hook into the base adapter updates each execution. If you can't, an alternative (in ADO.NET 2.0) would be to create a base class for the table adapters and create an alternative Update function in derived partial classes. In this case, you'd have "AcceptFineGrainedChanges" or some such function that you'd call. Once the alternative base class was created, custom programming per table adapter would be a matter of a couple moments. I've done something similar for using the designer for SyBase table adapters and it worked out pretty well. I'd have to actually try this to make sure it'd work though. Call this two half-solutions if you're feeling stern about it.
  4. This last would be useful if I have a relatively well-defined use case that isn't going to morph much or require stringent concurrency resolution. In this one, you deliberately break the one-for-one relationship from your dataset and database (i.e. one database table can be represented by multiple dataset tables). In Udi's concurrency example, the dataset would have a CustomerAddress table and a CustomerStatus table. Creating the dataset with custom selects would generate the tables pretty painlessly with appropriate paranoia. Now, this only really pushes his concern down a little, making it less likely to be an issue. It doesn't eliminate it. It'd probably handle most of the concurrency problems people are likely to run into. Or at least, push them out beyond where most people will ever experience it (not quite the same thing). It could be taken to a rediculous extreme where each field was it's own datatable (which is just silly, but I've seen sillier things happen) so a little balance and logical separation would be needed.

OLTP may seem more natural as a solution for many, but that's likely an issue of preference and sunken costs (because they've done it before and are comfortable with that solution space). It certainly isn't the only solution, though, nor is it a stumper for datasets.

Finally, I’ll add a caveat that I'm not saying that datasets are necessarily to be preferred over stronger object models. I just know that they get pretty short shrift from "real" developers in these kinds of discussions and want to make sure that the waters remain appropriately muddied. There may be a universal stumper for datasets I don't know about. There are certainly environments where a formal OLTP or ORM tool would be a legitimately preferred solution.

 

Technorati tags: , , , , ,

Tags: , , , , ,

Programming

scruffylookingcatherder.com

Information

    Recent Posts

    Calendar

    <<  September 2010  >>
    MoTuWeThFrSaSu
    303112345
    6789101112
    13141516171819
    20212223242526
    27282930123
    45678910

    View posts in large calendar
    Disclaimer
    The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

    © Copyright 2010 Scruffy-looking Cat Herder