Spammers Are Vermin

by Jacob 19. July 2009 09:32

My apologies if you’ve tried to access my personal blogs recently. I’ve been inundated by comment spammers and it has been a tremendous pain in the buttocks getting them straightened out. For a while, I was getting only a half dozen or so a day: short comments about what an amazing blog/post it was and that they’d definitely be back and/or bookmark/subscribe.

I could manually delete them without too much inconvenience for a while. Lately, though, there’s been a staggering increase in these weasels so I’ve adopted measures a little more… drastic.

A Comment Filter BlogEngine.Net Extension

I noticed that most of these spammers shared some distinctive characteristics. Many of them put down the same email address, for example. I also noticed that there were only three or four websites generally involved. Since the spam exists for the purpose of Google pagerank manipulation, the website is probably the important thing to note.

Now, I looked for a BE.Net extension that’d do this already. Unfortunately, most of the comment filters I found were tied into Akismet or some other blog filter service. That’s more overhead than I really want (configuration, registration, and general complexity). All I really need is something to check the email address, website, and maybe IP address against a known blacklist I can maintain myself. That shouldn’t be difficult, right?
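To make the idea concrete, here is a minimal sketch of that kind of self-maintained blacklist check. It is written in Python purely for illustration (the real extension would be .NET code), and the `Comment` class, the blacklist entries, and the `normalize` helper are all assumptions of mine, not anything from BlogEngine.Net.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical stand-in for a blog comment; not a BlogEngine.Net type."""
    author_email: str
    website: str
    ip_address: str


# Hand-maintained blacklists. Every entry here is a placeholder for illustration.
BLOCKED_EMAILS = {"spammer@example.com"}
BLOCKED_WEBSITES = {"http://spam.example.com/"}
BLOCKED_IPS = {"203.0.113.7"}


def normalize(value: str) -> str:
    """Trim, lowercase, and drop trailing slashes so near-identical entries compare equal."""
    return value.strip().lower().rstrip("/")


def is_blacklisted(comment: Comment) -> bool:
    """Return True if any identifying field matches a blacklist entry."""
    return (
        normalize(comment.author_email) in {normalize(e) for e in BLOCKED_EMAILS}
        or normalize(comment.website) in {normalize(w) for w in BLOCKED_WEBSITES}
        or normalize(comment.ip_address) in {normalize(i) for i in BLOCKED_IPS}
    )
```

Normalizing both sides is a design choice in this sketch: spammers tend to vary capitalization and trailing slashes, and an exact string comparison would miss those.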

Adventures in Comment Filtering

On the surface, these things weren’t that hard to accomplish. BlogEngine.Net has some quirks, though, that got in my way until I figured them out. For those interested, I’m going to explain them here. If you want to skip the gory details, head down to the next section. Or if you just want the extension, download it, pop it into the App_Code/Extensions folder and season to taste.

Finding the Right Event

My first impulse was to look at the Comment object for useful events to extend. Comment.Validating looked like a good candidate so I tried that one out. Unfortunately, that event never got hit on my blog. It took me a bit to realize that this is because I don’t actually validate comments. Validating comments is a setting where a comment doesn’t show up until it is approved. Since I only do blog maintenance once a day or so, I don’t want to prevent comments from showing up for that long. Validating comments would pretty much stop discussions in their tracks and I don’t want that.

Once I remembered that comments are managed on the Page object, things went much better. The Page.AddingComment event turned out to be the one I wanted.
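For anyone who hasn’t worked with this style of event, here is a rough Python sketch of the general pattern: an "adding" event fires before the save, and any handler can veto it. The `Page`, `CancelEventArgs`, and `reject_spam_links` names below are hypothetical stand-ins that I made up for the sketch; they are not the BlogEngine.Net API.

```python
from typing import Callable, List


class CancelEventArgs:
    """Carries a flag that any handler can set to veto the pending action."""

    def __init__(self) -> None:
        self.cancel = False


class Page:
    """Hypothetical page that raises an 'adding comment' event before saving."""

    def __init__(self) -> None:
        self.adding_comment: List[Callable[[str, CancelEventArgs], None]] = []
        self.saved_comments: List[str] = []

    def add_comment(self, text: str) -> bool:
        """Run every handler, then save only if none of them cancelled."""
        args = CancelEventArgs()
        for handler in self.adding_comment:
            handler(text, args)
        if args.cancel:
            return False
        self.saved_comments.append(text)
        return True


def reject_spam_links(text: str, args: CancelEventArgs) -> None:
    """Example handler: veto any comment mentioning a placeholder spam domain."""
    if "spam.example.com" in text:
        args.cancel = True


page = Page()
page.adding_comment.append(reject_spam_links)
page.add_comment("Nice write-up, thanks!")             # saved
page.add_comment("Visit http://spam.example.com/ now")  # vetoed by the handler
```

The point of hooking the "adding" stage rather than a later one is that the filter runs before anything is written, so a rejected comment never reaches storage.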

ExtensionParameter Fun

This is the one that held me up the longest. ExtensionParameters can be assigned types that include things like “DropDown” and “ListBox”. That seemed like exactly the kind of thing I could use for my filters. You see, each filter will be of a limited number of valid types: “Website”, “Email”, “IP Address”, or “Length” (I added Length when I noticed that all these messages are really short and I might want to account for that in my filter).

Unfortunately, these ParamType values are a complete red herring for tabular data storage. I noticed that BE.Net wasn’t actually storing my selection when I tried to add filter entries. The thing is that BE.Net stores tabular values on each parameter in the DataStore and only maintains a link to them by the order in which they appear. So my parameters in the DataStore look like this once saved:

<Parameters>
  <Name>Filter</Name>
  <Label>Filter</Label>
  <MaxLength>100</MaxLength>
  <Required>true</Required>
  <KeyField>true</KeyField>
  <Values>http://www.sonicity.com/</Values>
  <Values>http://www.unlockprivateprofiles.com/</Values>
  <Values>http://www.lastminutejoy.de/</Values>
  <Values>http://www.mooladays.com/</Values>
  <Values>http://www.dbpclan.com/</Values>
  <Values>200</Values>
  <Values>email002545@hotmail.com</Values>
  <Values>http://www.ramshyam.com/</Values>
  <ParamType>String</ParamType>
  <SelectedValue/>
</Parameters>
<Parameters>
  <Name>FilterType</Name>
  <Label>Filter Type</Label>
  <MaxLength>100</MaxLength>
  <Required>true</Required>
  <KeyField>false</KeyField>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Length</Values>
  <Values>Email</Values>
  <Values>Website</Values>
  <ParamType>String</ParamType>
  <SelectedValue/>
</Parameters>

It looks to me like list types (DropDown, ListBox, etc.) were mainly implemented with scalar settings in mind rather than tabular settings like the ones this extension needs. This is unfortunate, but I can’t see an easy way to alter the architecture to enable list types. I could create my own custom admin page for the extension (and I still may) but that’s more work than I wanted to do to get this running.
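Since the only link between a filter and its type is position, reading the settings back amounts to zipping the two value lists together. Here is a small Python sketch of that, using a trimmed copy of the stored XML above. The `<Settings>` wrapper element and the `load_filters` function are assumptions I added so the sample parses on its own; they are not part of the BE.Net DataStore format.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Trimmed copy of the stored parameters, wrapped in an assumed root element.
SAMPLE = """
<Settings>
  <Parameters>
    <Name>Filter</Name>
    <Values>http://www.sonicity.com/</Values>
    <Values>200</Values>
    <Values>email002545@hotmail.com</Values>
  </Parameters>
  <Parameters>
    <Name>FilterType</Name>
    <Values>Website</Values>
    <Values>Length</Values>
    <Values>Email</Values>
  </Parameters>
</Settings>
"""


def load_filters(xml_text: str) -> List[Tuple[str, str]]:
    """Pair each filter value with its type by position in the two columns."""
    root = ET.fromstring(xml_text)
    columns = {
        param.findtext("Name"): [v.text or "" for v in param.findall("Values")]
        for param in root.findall("Parameters")
    }
    values, types = columns["Filter"], columns["FilterType"]
    if len(values) != len(types):
        # Order is the only link between columns, so a mismatch means corrupt rows.
        raise ValueError("Filter and FilterType columns are out of step")
    return list(zip(types, values))


for filter_type, value in load_filters(SAMPLE):
    print(filter_type, value)
```

The length check matters more than it looks: with nothing but order tying the columns together, one missing value silently shifts every row after it.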

The Extension

So my comment extension has been up and working for a day or two now and things have calmed down a lot. This is a good thing. I can’t say that it is extensively tested for the simple reason that I don’t get many legitimate comments on a regular basis.

Configuration is pretty simple as long as you don’t typo the Filter Type value. Each filter is its own entry in the tabular list on top.

[Image: CommentFilterConfiguration]

Talking Back to Spammers

When I noticed that it still looks to the user like their comment is saved (the comment is still part of the page object; it just isn’t saved to the DataStore), I had an inspiration. Since the comment is still displayed to the person who posted it (though not to anyone else), that’s an opportunity to make sure that someone running afoul of my length requirement doesn’t end up wondering what happened. Plus, it gives me a chance to tell spammers that they’ve been noticed (yeah, that’s of dubious value and I may rethink this, but for now, it just makes me feel better). The configuration image above shows templated values that are used to replace the comment content. I can be as nasty as I want and the only ones who see it will be the spammers, though you’ll probably want to take it easy on those who stumble on your length filter (if any).
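The replacement step might look something like the Python sketch below. The template wording, the `REPLY_TEMPLATES` table, and the `screen_comment` function are all invented for illustration, and I’m assuming here that the Length filter is a minimum (comments shorter than the configured value get rejected).

```python
from string import Template
from typing import List, Optional, Tuple

# Hypothetical reply templates keyed by filter type; the wording is invented.
REPLY_TEMPLATES = {
    "Length": Template("Comment not saved: it needs at least $limit characters."),
    "Website": Template("Comment not saved: $value is on the blacklist."),
}


def screen_comment(
    content: str, website: str, filters: List[Tuple[str, str]]
) -> Optional[str]:
    """Return replacement text if a filter trips, or None if the comment is fine."""
    for filter_type, value in filters:
        # Assumption: the Length value is a minimum number of characters.
        if filter_type == "Length" and len(content) < int(value):
            return REPLY_TEMPLATES["Length"].substitute(limit=value)
        # Compare websites without trailing slashes so both spellings match.
        if filter_type == "Website" and website.rstrip("/") == value.rstrip("/"):
            return REPLY_TEMPLATES["Website"].substitute(value=value)
    return None


filters = [("Website", "http://www.sonicity.com/"), ("Length", "200")]
reply = screen_comment("Great post!", "http://www.sonicity.com", filters)
if reply is not None:
    print(reply)  # shown only to the person who posted the comment
```

Keeping one template per filter type is what lets the spammer message be as rude as you like while the length message stays polite.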

Spammers Should Die

A day or so after this filter went into effect I started to get new messages. These are clever little plays for sympathy saying things like “my comment got eaten but anyway… <regular spiel here>”. Or another “my blog is getting lots of comment spam, do know any way to help?” The website links were still classic spam sites so these weren’t real users looking for help. Cheeky little locusts, aren’t they? Seriously, someone with the right skills needs to hunt these bastards down and rearrange key organs into innovative new patterns.

Programming | Software

Is Microsoft Evil?

by Jacob 4. June 2007 16:10

I've been considering this post for a while now, but have been afraid to actually write it. So here's the thing: I've noticed that most of those who talk about how evil Microsoft is don't bother supporting that assertion. They tend to assume the rightness of their position and hence the wrongness of whatever it is that Microsoft has done. Microsoft stifles technology! Microsoft is a monopoly! Microsoft engages in unfair business practices!

Do they really?

No Consumers Were Harmed in Making This Software

There are a couple of problems with the whole monopoly thing. For one, at least in the U.S., being or having a monopoly isn't itself illegal; using the position of a monopoly to harm consumers is the illegal part. Now, some courts, and popular opinion, assume that the fact of a monopoly is, itself, harmful to consumers, but that has never been proven to my satisfaction. Indeed, many of those who testified against Microsoft in past years rested on this assumption by equating harm to themselves with harm to the consumer.

Here's a tip: the fact that a company cannot compete and goes out of business isn't really evidence that consumers were, in any way, harmed. Let's make this concrete with an example. The fact that Microsoft started giving Internet Explorer away for free and that doing so tanked their competitors in the browser market doesn't actually harm consumers. After all, consumers are now getting something for free that companies wanted to charge money for. If Microsoft began charging money for browsers after their competitors tanked, well, that'd be a different story. At that point, you'd have to ask if the new price for browsers was higher than it would have been with competitors still in business. The thing is, Microsoft didn't do so. Indeed, if those whining about Microsoft got their way, we'd be charged money for browsers today and that, in my opinion, is far more harmful to the consumer than Microsoft's decision that something should be free.

But here's the thing: Microsoft isn't even a monopoly. Seriously. Let's take the primary definition of a monopoly from reference.com.

mo·nop·o·ly      /məˈnɒpəli/ [muh-nop-uh-lee]
–noun, plural -lies.

1. exclusive control of a commodity or service in a particular market, or a control that makes possible the manipulation of prices.

See that? The key to having a monopoly is having exclusive control, or enough control to make the manipulation of prices possible. It's obvious that Microsoft doesn't have exclusive control, but the fact of the matter is that Microsoft doesn't have the power to manipulate prices, either. While Microsoft can determine the prices they charge for their own products, doing so does nothing to control the prices of anybody else's product.

That's because software is inherently uncontrollable.

The reason is that you cannot control the supply of software in any compulsory way. Unlike any other product, software can be reproduced at will by anybody who owns the rights to the program (and by many who don't). You could gain a monopoly over practically any other product if you could somehow control the supply of a key component. Software doesn't have a key component crucial for its replication. If every owner of Windows Vista wanted to migrate to OS X tomorrow, there's nothing that Microsoft could do to hamper Apple from creating as many copies of OS X as they wanted to create and charging whatever price Apple wanted to charge (including no price at all) for those copies.

Note that I'm relying on the distinction that controlling the price for a product is not the same as controlling how much you can charge successfully. Microsoft gave IE away for free. That rather hampered people being able to charge money for comparable software. It did not alter the ability of those companies to charge whatever price they wanted to for competing browsers. You cannot actually be said to be capable of manipulating price until you can move it up or down at your whim. The ability to move prices down is inherent in the marketplace and your ability to compete in it. The ability to move prices up is the key to being an actual monopoly.

It Isn't Fair!

The second largest complaint is that Microsoft engages in unfair practices to privilege their software because they "own" the OS. Now, I haven't been a fan of the "it's not fair" defense since my kids grew up enough to employ it. I personally stopped expecting life to be fair a long time ago. The thing is, I'm enough of a libertarian that as long as all parties to a transaction are informed and consenting, I don't have much problem with them working out whatever deals they think they can.

Still, you can't deny that Microsoft likes its shady deals. I certainly wouldn't dream of denying it. In fact, I'm all for exposing those deals as soon as they're known, and the sooner the better. Does Microsoft have a deal with Dell that includes Dell anteing up for every PC purchased? Doesn't matter to me, but by all means, get the news out if you discover it to be true. I mean, as far as I can see, Dell wouldn't be doing so if the net cost to them weren't cheaper than doing it the other way. As long as Dell is able to compete in its markets for computers, I'm not really that interested. After all, if Dell raises the price of computers that don't have some version of a Microsoft OS, you have to know that they'll get hammered by their competitors who aren't trying to recoup such costs.

That's the magic of capitalism. You only get to set the price, you don't get to set the demand. If somebody else can do it cheaper, then they'll come in and prove it in the only way that matters—by offering their product at a lower price.

Does Microsoft sometimes fail to, ahem, document their complete APIs for external sources? That certainly appears to have been the case in the past. Insofar as they might have claimed to have documented the entire API, they have violated the law and should be held accountable for doing so. Outside of such a claim, I don't see how we have any standing to demand otherwise. Not that we shouldn't ask for better, but there's no cause to be slinging charges of moral depravity. You can't simply decide that a company has to release their full API when they haven't agreed to do so and expect to be taken seriously. Certainly, nobody expects the same from Apple. Fortunately, one consequence of all those shady deals is that Microsoft is the most scrutinized software company on the planet, so its ability to hide things is, shall we say, limited.

Scrabbling for the Top

Microsoft's dominance of the software market makes it seem like there must be a monopoly in there somewhere. The fact that we can't detect it doesn't mean it isn't there, right? After all, random chance should dictate that some companies would successfully compete with Microsoft and their dominance would wane.

Ah. But since when did random chance have anything to do with markets, let alone software? There is, in fact, one thing that all successful companies do to become (and stay) successful: they attack #1. Indeed, those companies that rise to the first position and later fall always do so because they stop attacking #1. You see this with Sun. There was a period when they owned the corporate server market. The thing is, they stopped attacking themselves. Their resting gave their competitors an opportunity to come in and steal their lunch. Dell and HP saw that Sun's prices hadn't dropped even though the cost of hardware was falling steadily. They saw an opportunity and Sun is left wondering what happened. The same thing is happening with Oracle in the database server market and Sun (again) with the hot development language, Java.

And that's what has allowed Microsoft to dominate the OS and Office Application spaces for so long—they haven't stopped attacking #1, even when it's them. Microsoft, for some reason, has mastered the paranoia and internalized the lesson that they are only a couple of motivated geeks in a garage away from the #2 slot. Witness Office 2007 and the ribbon control. Microsoft could easily have sat on their dominance in the Office Application space. They didn't. Time will tell if that innovation makes their product better, but so far, it seems that it has.

The Beauty of Creative Destruction

Which brings me to the software development space. Microsoft has done here what they do in all the markets they come to dominate: stake out some initial territory and then expand to become the best value in that space (note that I said value, not software, or price, or technology). That's how they continue to dominate in business programming even though they charge money and the new kids on the block don't. Yeah, you can do some interesting things with Ruby on Rails, Java, or Eclipse. I'm not denying the achievements of others.

All I'm saying is that for me, the business developer, Microsoft makes the development decision an extremely easy one. For a paltry $2k a year, I can own everything Microsoft produces in one package: OS, IDE, servers, and Office Applications. I don't have to find out which distro is most popular. I don't have to research GNOME vs. KDE vs. Xfce. And I don't have to browse a single man page or HowTo.

Not that I wouldn't move if Microsoft stopped anteing up, but they don't seem to be doing so right now. Whether it's IronPython or Silverlight, or even Ruby, Microsoft shows no signs of letting others come in and eat their lunch. This is part of what makes it easy to be a Microsoft developer. If a good idea crops up in a space not currently dominated by Microsoft, you can bet that it won't be long before it's available in my dev environment either as a third-party add-in or from Microsoft itself.

The Wise Use of Power

All of that euphoria aside, Microsoft does have its problems. With great power comes great responsibility and Microsoft doesn't have a monopoly on ethical people in positions of power (uh, the "duh" is understood there, right?). Undercutting NUnit was a waste of community effort and good will. And what's happened with TestDriven.net is a bigger one. Indeed, with the details we have about the TestDriven.net case, it's obvious that Jason Weber at Microsoft behaved like an arrogant jerk and he deserves to take heat for it and Microsoft does as well.

Being a complacent consumer leads to complacent companies and products that never improve. So by all means, give Microsoft hell when they ask for it.

All I'm saying is that Microsoft isn't the dominant force it is in the markets it dominates because people are stupid. Any position that concludes that people don't know what is best for them is a position that I'm becoming increasingly impatient with. Brow-beating developers who don't kowtow to your party line isn't going to actually win you converts in whatever crusade you've decided to embark upon. By all means, give me your best pitch, I want to hear it. But don't assume that I don't have perfectly valid reasons for the choices I've made and even if I am a lazy bastard who couldn't program my way out of a wet paper bag, you are probably not best served by mocking me for it. Although, come to that, I sort of ask for it when I call OSS folk cry-babies with no justification...

Programming | Software

Secret Geek Gets it Right

by Jacob 1. June 2007 11:57

Since I stuck my foot in it on the NUnit issue, I've been paying particular attention to the most recent Microsoft controversy with TestDriven.net. By far, the best comment I've seen is the brief analysis by Leon Bambrick at Secret Geek where he separates Microsoft from the individual at Microsoft who made a tough job worse than it had to be through arrogance and ego.

Winning the In-House QA Argument

by Jacob 25. May 2007 15:52

I wrote last month about winning arguments in IT. Earlier this week, Phil Haack asked a question (through Twitter) about things he could do to help convince a company to create an in-house QA department. Well, it turns out that I did exactly that at XanGo—successfully pushed for and oversaw the installation of an in-house QA department. I thought it might be a useful follow-up to the previous post to use this as an example of how I "won" that argument.

Concentrate on What is Best for the Company

This is the key, the whole key and nothing but the key to winning corporate arguments. You should approach any problem from this perspective. In this case, I have seen what software development looks like with a competent QA team and I wanted that again. Simple developer testing was leaving holes and we didn't have a consistent deployment model. I was sure that having an in-house QA team would benefit the company.

Now, since I was the Software Development Manager, instantiating a QA team was likely out of the scope of my responsibilities. Fortunately, I had a good relationship with my boss and he was the functional IT Director. It was well within his scope. This gave me my first temptation: the desire to expand my personal empire by creating a "QA Team" within my department. After all, it was my idea, it would take a lot of effort on my part to bring about, and QA is associated with development, right?

Tempting as it is, though, it would definitely not be to the company's benefit. A QA team should have as much power as they can get because they are the last rational review before stuff goes into production. What that means in practical terms is that they should be a first-class IT member. If other departments can override QA, you are headed for a lot of pain.

Which is when I realized that the first thing I had to do was marshal all the reasons I could think of that an in-house QA team would benefit XanGo. This is probably really what Phil was looking for. It's easy to come up with reasons it'd be good from your own perspective. It's a lot harder to broaden that perspective to the company as a whole.

Here's what a list of personal benefits might look like:

  1. Backstop development so that buggy code doesn't hit production.
  2. Help train developers by pointing out errors early enough that it is still in the mental cache of the developer who created it.
  3. Give me a blame-shield if something broke.

Here are some things that are more useful from the company's perspective:

  1. Backstop development so that buggy code doesn't hit production.
  2. Help train developers by providing timely feedback.
  3. Ensure that customers don't encounter broken processes.
  4. Ensure that employee bug reports and feature requests are tracked, prioritized, and scheduled.
  5. Develop and own a formal build/deploy process that is repeatable and automated.
  6. Increase executive confidence that software releases do what they promise to do.
  7. Help identify training opportunities for development.
  8. Help identify training opportunities for non-development staff.
  9. Create a manager who can concentrate on identifying people with QA skills and provide a career path for them.
  10. We have a Software Development Manager who knows what a good QA team looks like and how to create one and we can easily leverage his expertise without bringing in expensive consultants.

This list was my external list of arguments. Which ones I'd present depended heavily on the audience. With executives, for example, I'd start with #3, glide through #4, #1 and #7 and then slam into #6. I'd pause there and ask if there was interest. If I had their interest (and I guarantee that I did), I could cover the other points in more detail, generally in a formal follow-up meeting.

Show Me the Money

Reason is all well and good, but without something concrete and measurable, we're still in the realm of faith. Yeah, it makes sense, but it's still just a framework of belief. Now, there's nothing wrong with belief and sometimes you have to have a holy war in order to keep everyone pulling the same direction. But proof is better so find it whenever you can.

The only proof that counts in business is money. The closer the money is to the company, the better. In other words, find examples that people can relate to. Real examples. I hate that I have to stipulate "Real", but unfortunately, I do. Too often process advocates go looking for best (or worst) case examples or even pull them out of their hat. I've gotten to the point where if it didn't happen internally, I'll check sources. I've learned to derive a certain glee from demolishing someone who was, uh, less than careful. Be aware that people like me are out there and nothing will tank you faster than being shown to be faking your proof. You stand to lose more than simply the argument when that happens.

The best source of examples is your own company. In the case of XanGo, I was able to trace the cost of some bugs that happened there. A bug doesn't have to tank the company to impose a cost. It is always good policy to do a complete autopsy of significant bugs that hit production. Don't be satisfied with simply fixing the error. Dig as deeply as you can. That's just good practice. In this case, it also let me attach a dollar figure to bugs that actually had hit production despite our best vigilance in development. Adding a reasonable cost-benefit analysis is almost like magic: it clears barriers away like they're made of tissue paper.

Make no mistake, though: a cost-benefit list isn't going to do you any good if you cannot articulate the reasons it will work the way you predict. A money-based analysis without solid reasons that you can articulate clearly isn't going to work any better than reasons without a money-based analysis. Which is to say that it might work, but you're essentially rolling the dice. I prefer not to dice with my career.

More Than Winning

In addition to the above, I made a couple of resolutions to myself. These are things that I knew from experience were important in helping a QA department work but that are easy to overlook from the development side of the fence.

  1. The eventual QA team should be at least as high up on the IT totem pole as my team. They would need all the power they could get when crunch time comes and they're still finding bugs. It is the unfortunate reality in business that the ones who had the code last tend to be seen as the barrier to making a deadline.
  2. Educate as many people as possible that QA is a wholly different skill set from development. A skill set, by the way, that is rarer (and hence more valuable). This includes making sure that the eventual QA manager (who I would have a part in interviewing) believed the same way. Too often, companies use their QA department as a back-door to a development career. While this is flattering to developers because it means they are the final product (i.e. the pinnacle, end-point, and better than), it is also a recipe for tragedy because the skill sets aren't simply unrelated--they are, to an extent, opposed.
  3. Resolve to give the QA team my support in any future dust-ups. Dust-ups that are, by the way, inevitable. Developers are going to resent having bugs found by someone not them. They aren't going to like that they didn't actually reach a deadline they busted hump to hit. Further, too often when things go wrong, people start whacking around with the blame stick and they'll start in the QA department. It's really easy to keep a low profile when that happens because you want to avoid getting whacked yourself. It is, nonetheless, a bad idea to give in to this cowardice.

Now, all these things share a common feature: they're all things that require developer pain for the benefit of the QA team. Each one is also for the benefit of the company. My biggest point in this and the previous post is that concentrating on what is best for the company is most effective when it is sincere. We shouldn't be merely willing to do what is best for the company when it means pain for us (and possibly benefit for someone else), we should actively hunt down that pain with the intent of embracing it.

Having a QA department actually makes my job as Software Development Manager harder. It means imposing discipline, slowing down production, formalizing development processes, and giving some of my current corporate power to someone not me. It means inviting another manager into my department who is at least as high on the corporate ladder as I am and who won't always agree with me.

That's a lot of downside to implement something that would be a lot of extra work and take some of my personal corporate political capital to make happen. The meager personal benefit list above doesn't really stack up well to the short term downside. The only reason I took it on is because I knew that doing so was in the best interest of the company and I knew that I could make that case in terms that others would understand.

Management | Programming

Steve McConnell's Blog

by Jacob 25. May 2007 01:46

Oh my. Steve McConnell has started a blog. If you do not know who he is, you are either young in the ways of software development or aren't paying attention.

via Haacked

Software | Management

Driving Development

by Jacob 14. May 2007 14:09

I listened to a recent .Net Rocks podcast about Domain Driven Design. Now, despite reminding me of a post last week by Secret Geek about how everything is driving design these days, I think a lot of what was said makes sense to me. Eric Evans' point on the podcast seemed to revolve around learning a given domain and letting experience with the domain drive development decisions.

From the contents of his book at Amazon, it looks like Eric's idea is a modification or extension of Model Driven Design—i.e. an attempt to unify development efforts within a descriptive framework approachable by non-technical entities. In this case, letting the needs of the domain determine the focus of development. For example, he talked near the end about how it is often more valuable to leave legacy code alone (maybe wrapping it with a translation layer) in order to spend resources on the things that will provide better value than you can achieve by replacing the legacy application.

Good Enough

Eric also used a phrase that I listen for as a flag that identifies someone to take seriously: "good enough". Good enough is a phrase used by people who are actually trying to apply YAGNI as more than a way to win an argument. Good enough is a limiting term, a cut-off, a phrase that gets you from the isolation of the silicon tower to actually developing software.

Now, I have to be careful here because "good enough" is also useful for burying dangerous assumptions. I think Eric's usage is an example because he never elaborates on what something is good enough for. Given that he's explaining Domain-Driven Design, you could assume he's saying that it's good enough for the domain, but that doesn't map. Nobody works for a domain.

Pandering to Business Managers

Which got me thinking. Eric is talking about developers and domain experts creating a common framework for communication, but really, starting at the domain glosses over the most important consideration in software development: the business. The context for "good enough" should be rooted in tangible benefit for your company.

Naturally, I thought I would be all clever-like and create another "driven". Unfortunately, IBM already mucked this one up. In what looks like naked pandering to clueless business managers, IBM proposed "Business-Driven Development" in December of 2005. I mean, can you take any paper seriously that contains this quote:

"The inherent problem with the enterprise software development process is that it suffers from a lack of agility to match the pace at which the business needs to change in order to keep up with the market trends and competition."

Now, you could make a case for market trends and competition being pretty malleable—even to the point of outpacing the ability of software development to keep up—but that's kind of beside the point. The fact of the matter is that of all the things a business needs to change to adopt a new business process, software is the most adaptable.

Functionally, IBM's proposal is a way to isolate software development and assert management control. Under scrutiny, it looks to me like an attempt to force agile iterative principles into a BDUF format that IBM's Rational products can actually handle.

Dollar-Driven Development

What I'd like to see is Eric's domain drivers better focused as a responsibility towards whoever is underwriting development. This is hard to name because his principles are broadly applicable for business, government or even open source.

Still, somebody is fiscally responsible for software development and it is that responsibility that should drive development decisions--i.e. the best interests of the entity underwriting it. For open source projects, that would be the interests of the benevolent dictator in charge. For business development, that would be the best interest of the business you work for.

This is easier said than done. In order to work for something's best interest, you really have to know its interests well. I think that may be the heart of what Eric was saying about learning the concerns of a domain and letting those concerns drive development. It means developing expertise in a domain that may or may not be of your choosing. This takes time, effort, and access to domain experts. It means delving deeply into the reasons things are done now and the problems current processes were meant to solve.

Mainly, it means learning about things that aren't computers and likely have nothing to do with them. Whether it's a bank, talent agency, or bail-bondsman, it's not going to be something that you learned in school. It's yet another thing you have to learn on top of all the technology stuff that keeps changing.

Good thing you're good at learning things.  Right?


Management | Programming

Subtext Release 1.9.5

by Jacob 11. May 2007 09:42

Phil Haack just announced the official release of Subtext 1.9.5. There are some cool new features in this release, and I made significant contributions to one of them. I'm inordinately proud of that. I've been running 1.9.5 for a while now and it's been pretty solid.

Anyway, you can download the latest release here. Subtext has a pretty responsive community and I've enjoyed working on the project.


Software

Arguing Data

by Jacob 27. February 2007 00:33

People have a lot of different reasons for posting blog entries. These reasons vary from financial, to personal, to professional, to I'm afraid to know more. For me, one reason I take the time when I could be doing something else is that I like to put my ideas out there to be tested. I don't really care if a majority of people agree with me so much as I want to see what other people have to say for or against certain things. The downside to this is that I'll sometimes find that an idea isn't as good as I had originally thought it was. The upside is the opportunity to refine something to be better or to discard an idea that turns out simply to be bad.

Which is why I'm glad to see Karl Seguin's response to a post I had made about DataSets. Karl's a bright guy and he has a good background in the problem domain associated with DataSet objects. He displays class, too, even when he feels I've been a bit rough in a point or two.

The School of Hard Knocks

I empathize with his experience where DataSet misuse caused much pain and suffering. I've been in similar situations and it's no fun. In a full-blown business transaction environment, DataSets have some liabilities that make them ill-suited for business-layer usage. The thing is, the opposite problem exists as well, and it's one that is more serious than people want to give it credit for: a layer of specialized, hand-crafted business objects that don't actually do anything.

I'm currently working at a place that has an extreme case of this problem. We have four entirely separate ASP.Net applications for our internal invoice processing. All four of these applications have their own set of substantially similar custom objects that are completely unique for that application. Each object doesn't do anything more than contain a group of properties that are populated from a database and write changes back to it.

I shudder to think how many hours were wasted on this travesty. It's over-complex, can't leverage any type of automated binding, doesn't track row state, and testing and debugging changes is an unmitigated pain. It's like someone attended an n-tier lecture somewhere and never bothered understanding what the point of having tiers actually was. Frankly, I'd prefer if the previous developers had simply put all the data access right in each individual page--at least that'd be easier to fix when something blew up.

Learning Your Craft

The thing is, my experience no more proves custom business objects wrong than Karl's experience proves DataSets wrong. That's the trouble with anecdotal experience: it feels more important than it is (it doesn't help that pain is such an efficient teacher).

The trick of learning a craft is in gaining experience that is both specific and broad. This can be tricky in a field that is as immense as software development. You really have no choice but to specialize at some point. Even narrowing it down to ".Net Framework" isn't nearly enough to constitute adequate focus for competence.

Unfortunately, Karl's point that there are a lot of lazy programmers out there is true. Anyone who has had to hire or manage programmers will confirm this. Too many developers don't bother learning enough of their craft to be considered actually competent. Faced with the need to specialize carefully, many simply give up and learn only enough to get by (and sometimes not even that much). They're content to learn the bare minimum needed to get hired. They'll learn enough of the "how" to create a program without ever bothering to learn any of the "why".

Teaching Others

I have a minor problem with Karl's explanation, though. He says, "I advocate against the use of DataSets as a counterbalance to people who blindly use them." While I understand this position, I'm not sure I can be said to appreciate it. It smacks a little of the "for your own good" school of learning, which works well enough in a parent-child or even teacher-student relationship. I'm not sure it works so well in public or general discourse.

It is hard to correct bad habits, particularly habits as widespread as DataSet misuse seems to be. As one who often has the bad habits to be corrected, though, I think that I'd prefer having the problem explained and given the context so I can understand the trade-offs being made. That would give me the opportunity to know why something is wrong, not just that something is wrong.

That'd require discussing DataSets in specific instead of general terms. I'm not sure if Karl would really want to do that, though. I mean, his specialty at CodeBetter is really ASP.Net. Expecting him to tackle ADO.Net is not just unrealistic, it could have the effect of diluting his blog posts and alienating his regular readers or getting him embroiled in things he's less interested in.

I would like to see someone respectable and wider-read than I am take on Strongly-typed DataSets in a more complete fashion, though.

Professor Microsoft

Which is why I have to agree with Karl that the blame for DataSet misuse lies squarely in Microsoft's court. I stopped counting how many official articles and examples from Microsoft included egregious misuse or abuse of DataSets. And I have yet to see any that describe how to do it right or what kinds of things to look for in determining the trade-offs between a Strongly-typed DataSet and a more formal OR/M solution, let alone ameliorating factors for each. The only articles about DataSets that I can remember that don't actually teach bad habits are articles about how bad they are. Which isn't helpful. It'd be nice to have something, somewhere that talks about using them wisely and what their strengths actually are. Maybe that should be a future blog post here...


Programming

Are We There Yet?

by Jacob 29. January 2007 17:19

"So when will you be done with this development project?"

I don't know about you, but I hate this question. There simply is no good answer for it. It seems like such a simple question with a simple DateTime-valued answer. One of these days I swear I'll answer with, "Oh, I'll be done next Tuesday at 2:34pm," just to see what happens.

And seriously, businesses hate that we have such difficulty answering the question. It seems perfectly reasonable for them to want to know when they can plan to have the new processes that they know they desperately need. Developers demand high salaries and are ostensibly professionals; they should be able to give a professional answer, right?

The Road is Well Paved

The thing is, software development is a lot harder than people expect it to be--and this includes software professionals. Even simple software projects can run afoul of hidden complexities that can destroy well-meaning estimates and make everyone unhappy. And no matter how you hedge your answers, people simply don't remember all the caveats, maybes, and what-ifs that you use to indicate uncertainty.

The end result is that developers seldom make their ship-by dates and companies become disillusioned and impatient with all software development. That's not helpful for anybody, but it's pretty much the rule these days.

And the fact of the matter is that the vast majority of developers (and development managers) never learn how to answer the estimate question. They'll move from company to company, repeating the cycle of hope, suspicion, and disappointment over and over again. Which works well enough for the developers in the boom times when the demand for development is so high that mildly talented house plants can get hired as developers.

So a lot of people are making the same mistakes over and over. Businesses can be excused for assuming that this is simply the way things are and feel confident in their distrust of software professionals. They've been there, done that, bought the t-shirt.

Paying the Toll

This environment causes developers who care about these kinds of things a lot of heartburn. Everyone pays for the ongoing cycle of disillusionment. I believe that this is what really prompts posts like the recent ones from Ted Neward talking about professional ethics. And I've been known to throw my own hat into the ring as well.

We get tired of paying for the sins of those who have gone before. And I'm not referring to the messed up legacy code we stumble into, either. Frankly, messed up code is the least of your problems coming into a situation with a client who has been burned by previous developer promises. Companies that have had deadline after deadline missed have a degree of mistrust that is very hard to overcome.

We pay for this distrust in a hundred different ways. The thing is, trust is a paying commodity in business. Working with partners you trust means a whole lot of overhead you can simply skip. An analogy: if I trust a plumber to fix my sink quickly and professionally, I can go get a burger and leave him to it. It's only when I don't have that trust that I have to pay the additional overhead of having someone I do trust watching to make sure he's not napping under the sink.

Want to see a business manager go into a dreamy fantasy? Ask them what it'd be like to be able to trust their software developers (in house or not). The more experience they've had with developers the more intense the fantasy.

The Rubber Meets the Road

We have a couple of areas of friction in businesses that exacerbate this situation. The main disconnect with business managers is that we have borrowed terminology and tools from other disciplines without understanding that our processes are fundamentally different. It's tricky because the temptation to use manufacturing terminology is immense. After all, we are creating a product of sorts. This makes so much sense on an intuitive level that it's hard to realize that the comparison is misleading and potentially dangerous.

I wish we could retrain everyone to make analogies to other business specialties. Scientific research or law come to mind as potentially useful analogies because both are similarly plagued by the impact of unique situations, changing ground rules, and unforeseen complexities. It would be interesting to investigate how managing software development like a patent application or drug research would change how we look at the problems involved. We might have stumbled onto iterative cycles and responding to altered requirements a whole lot sooner, for example.

Paying Attention

The real problem, though, is that most developers (and even most development managers) don't take the time to learn about common friction points. Nor do they take the time to build relations with their business counterparts so that they have some political capital (aka trust) to use when it is needed. It's easy to forget that much of the progress in software development practices is pretty recent in terms of business processes. After all, business managers don't move at the speed of light and changes tend to take time to penetrate those layers.

Which means that a whole lot of industry advances aren't even theory yet in the board room.

And the fact of the matter is that you cannot expect a business manager to understand what makes Agile practices work. Or the reason that strong unit testing saves time over the long run even though it takes more time up front. Learning to communicate at a level that is sufficiently detailed for smart business decisions without getting bogged down in the jargon inherent in any specialty is an invaluable skill, and one best learned sooner rather than later. That means thoroughly understanding those theories yourself--not just on the surface or in buzzword compliance. It also means learning to communicate that understanding from orbit, 30,000 ft, 5,000 ft, and right on the ground. This is hard to do. It takes practice. It also takes exposure to business manager types. I'm not sure which is harder...

Something to think about, though: not learning this skill leaves you at the mercy of those who do learn it.

My point, though, is that it takes both. You have to learn your profession so thoroughly that you can deconstruct its "best practices" ("design patterns", whatever) and rebuild them from basic principles on the fly. AND you have to learn to communicate that understanding comfortably to people of varying familiarity with software development in a business environment.

That's what it takes to be a true professional. It's easy to let those two skills fall out of balance. Individuals who understand both are invaluable to a company. Also rare. Companies who discover someone capable of both are often surprised at how much smoother things run with that person placed where they can do the most good--a point Jeff Atwood's latest on becoming a better programmer drives home.

So I don't have a formula for quick and accurate estimates. Just a lot of hard work. Still, here's a tip for free: anyone asking for a firm delivery date is inherently assuming BDUF. Once you know that, you know where to start your answer.


Management

Trackbacks Are Dead

by Jacob 21. December 2006 19:59

 

Jeff Atwood has a recent post on why he finally gave up and disabled Trackbacks on his blog. My blog is the tiniest fraction of his and I had to disable trackbacks just for sheer spam volume back in October (inspiring an anti-spam rant of my own).

Jeff lays the blame for Trackbacks' demise on Six Apart--the outfit that created the standard in 2002. Ah, those heady glory days when you still had to explain to people what a blog was. Trackbacks were a great idea. They still are a great idea. But Jeff is right: the simplicity of the standard has left it wide open to abuse, and that abuse has killed them dead.

So my question is (to Jeff or anyone else), how would you alter the design to make the standard more robust?

My initial take on it was to alter the standard to incorporate a public key exchange and a signature. But then I realized that, hey, spammers can create asymmetric keys as well as anyone else can. In other words, the problem isn't being able to authenticate the link--it's being able to evaluate the linked post.

Jeff's current stop-gap of finding links to his posts through Technorati seems like a reasonable short-term solution. Introducing a third party is problematic, though, because it leads to inevitable issues in finding a trustworthy third party that will carry the authentication burden for you (as well as traffic and processing costs as people ping them for link verification). Indeed, Akismet (a popular stab at trackback filtering) has those same third-party screening issues and isn't substantively different from Jeff's use of Technorati.

I suspect that the problem might not even be in bad design by Six Apart, however. The thing that makes Trackbacks so popular and led them to be so widely adopted is that it allows the creation of inter-post linkages from unaffiliated sources with very little effort. I'm afraid that any solution to the trackback problem is going to necessarily involve increasing the effort of unaffiliated linking to a point where it becomes much less attractive.

A Stab in the Dark

That said, here are two thoughts about potentially rewarding avenues for solving the problem, both picked up from spam solutions in other realms.

First, from email spam: recognizing that I'm pretty thoroughly ignorant of the underlying mechanisms involved in Bayesian content analysis, I wonder if there might be some useful application for content analysis here. Spammers are increasingly sophisticated in overcoming content analysis, but trackbacks may be easier to analyze because they have an easily available comparison text (the originating post). It may be easier to compare your post with that of the linker and come up with a tougher analysis than you can in, say, a lone email. I don't know about that.
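To make the "compare your post with that of the linker" idea a little more concrete, here is a minimal sketch in Python. It is not Bayesian analysis at all; it swaps in a much cruder word-overlap (Jaccard) score, just to show the shape of the comparison. The function names and the 0.1 threshold are made up for illustration and would need tuning against real trackbacks.

```python
import re


def tokenize(text):
    """Lowercase the text and return the set of its words of 3+ characters."""
    return set(re.findall(r"[a-z0-9']{3,}", text.lower()))


def overlap_score(my_post, linking_post):
    """Jaccard similarity of the two word sets: 0.0 (disjoint) to 1.0 (same words)."""
    mine = tokenize(my_post)
    theirs = tokenize(linking_post)
    if not mine or not theirs:
        return 0.0
    return len(mine & theirs) / len(mine | theirs)


def looks_related(my_post, linking_post, threshold=0.1):
    """Hypothetical policy: accept the trackback only if enough words are shared."""
    return overlap_score(my_post, linking_post) >= threshold
```

A linking post that never mentions anything the original talks about scores at or near zero, which is the cheap signal this sketch relies on. A determined spammer could defeat it by quoting the target post, so treat it as a first filter at best.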

Second, the key to the success of spamming is that spammers have such a very low cost per "signal" (email, comment, trackback, what have you). Their only incremental cost is bandwidth to find blog posts and to send trackback signals. Raising those costs can have a significant effect on spam. This is essentially the key to Captcha's success in curbing comment spam. If a trackback request prompted a user-interactive Captcha-like query, that alone may well be enough to stem the vast majority of trackback spam. Perhaps a design amendment that included a short interaction on a trackback ping would be successful in cutting spam back to manageable levels.

In looking at those two ideas, it occurs to me that the main problem with a Bayesian solution is that it places the burden (both in implementing the Bayesian algorithms and in processing the incoming links) squarely on the target of the spam. This can lead to an unwanted side-effect by leaving your blog much more open to another Internet dirty trick--denial of service attacks. Frankly, you don't even have to deny service to affect a lot of private bloggers--attacks that increase their bandwidth usage would be as unwelcome to many as a full-on denial might be. After all, it doesn't cost you extra hosting fees when your blog goes down.

So maybe that means I only really have one thought/solution/suggestion. Bayesian analysis would be cool for the AI geeks, but not terribly practical in the constrained environment confronting most bloggers. I wonder what it'd take to create a Captcha mechanism in trackback notification?
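As a rough answer to that question, here is a sketch of what a challenge step might look like on the receiving blog. Nothing like this exists in the Trackback standard; the two-step flow, the function names, and the ten-minute window are all assumptions for illustration. The blog hands back a question plus a signed token, and only accepts the ping when it is resubmitted with the right answer.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-blog secret, never sent out
MAX_AGE_SECONDS = 600                 # assumed lifetime of a challenge


def _sign(answer, issued_at):
    """HMAC over the expected answer and issue time, so no server state is needed."""
    message = f"{answer}:{issued_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_challenge(now=None):
    """Return (question, token) to send back in reply to a trackback ping."""
    issued_at = int(time.time() if now is None else now)
    a = secrets.randbelow(9) + 1
    b = secrets.randbelow(9) + 1
    question = f"What is {a} plus {b}?"
    token = f"{issued_at}:{_sign(a + b, issued_at)}"
    return question, token


def verify_response(token, answer, now=None):
    """Accept the ping only if the token is fresh and the answer matches it."""
    now = int(time.time() if now is None else now)
    try:
        issued_at_text, signature = token.split(":", 1)
        issued_at = int(issued_at_text)
        answer = int(answer)
    except (ValueError, TypeError):
        return False
    if now - issued_at > MAX_AGE_SECONDS:
        return False
    return hmac.compare_digest(signature, _sign(answer, issued_at))
```

The arithmetic question is a stand-in for a real Captcha and would not slow a script down for long, and a token can be replayed until it expires. The point is only that the work of answering lands on the sender, while the target does little more than one hash per ping.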


Programming | Software

scruffylookingcatherder.com

    Disclaimer
    The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

    © Copyright 2010 Scruffy-looking Cat Herder