Thursday, 15 August 2013

How do they do it – the right way to write software

 

columbo

[Concrete / Interesting] Sitting at my desk and looking at the plans for the next scheduled release of our enterprise application and, oh boy, this looks like it’s gonna take forever. Not an unusual thought to pass through the mind of a development manager. But wait, does it really have to be this difficult? Is there a better way.

How do they do it?

On another subject, one of the guys in the team shows me Trello (www.trello.com) – 5 minutes in and it looks like the right tool for the job.

It’s a line of business application and it does what it says on the tin. Then my mind wanders and I find myself asking “how do these guys make any money? how do they write this stuff and sell it for nothing? how do they do it?”. So I dig a little deeper and the clue to how is to be found at the following URL:

https://trello.com/docs/

Yes, behind the web client, iOS, Android and Windows 8 clients, there is a business level REST API. And take a look at the structure, API nouns based on key business entities like ‘board’ and ‘card’ with REST based verbs like GET / PUT / DELETE and where it makes sense, a search API.

What about the client?

Well looking more closely at the web client with web kit tools and Fiddler I discover how simple it is. The web client implements the user interface directly against the REST API. A very appealing UX is provided by jQuery and a few extension libraries like jquery-ui for UI widgets, drag & drop.

XHR – The final bit of the puzzle falls into place

But is the client really doing all the work or is there a server side? Well watching the page load I can see a bunch of XHR requests – that is XMLHttpRequests or AJAX. The first one gets the summary level details like a list of the boards, summary attributes and status and then there are further calls for board details and other board status. So there is no server side as you would find in an ASP.Net class of application, just the web service API supporting a client.

Putting it all together

So Trello and applications like it can be classed as API based client applications, whether it is a web client written in HTML, CSS and jQuery or mobile app written in mobile native (iOS, Android or similar). With these applications the following features seem to come up again and again:

1) REST API

Behind the app is a REST API with business specific APIs with entity based nouns and REST verbs. And typically the API provides one or more search 2.0 methods.

2) Fixed and limited scope to the app.

The app is typically limited to a single core concept – for Trello this is ‘Organize anything together’ or collaborative project management. Even if this is not the case after a few generations of the application, the early generations are typically single paged with singular business focus and a limited set of business entities.

3) Rich UI / UX

Being very focused and limited in terms of scope allows the developers to invest in a rich user interface and user experience.

4) Multi-platform

Again, because of tight scope and especially because all the hard work is being done by the API, the developers can invest effort in native ports to multiple platforms like iOS and Android.

Testing the theory

Ok, so that all sounds easy enough. To get an app out fast I need to write an excellent but business focused REST API and chuck a rich user interface on top. To test the theory I put together a simple one page application for performing ‘Sentiment Analysis’ on twitter terms.

Sentiment Analysis simply tells you whether a population ‘likes’ or ‘hates’, ‘agrees’ or ‘disagrees’ with something. I built a Bayesian filter with corpora weighted for ‘like’ and ‘dislike’ based on a large number of Tweets. I then used this filter against a search of recent tweets matching a given search term. So for a term like ‘TFL’, 100 tweets are analysed as either positive or negative.

In this app, a simple web service is responsible for returning the resulting analysis for a given search term. You can see this API at:

http://sourcethecrowd.net/sourcethecrowdapi/sourcethecrowdservice.svc/help

For the client, I started with a template website from ThemeForest (www.themeforest.net). Lots of templates to choose from, many of them containing rich user interface component, HTML5 and responsive web designs.

Just add a little AJAX to an input field and voila – a single page application is born. Total development time – about 6 hours:

http://www.sourcethecrowd.net

image

P.S Why’s it so slow

The site is slow but the development was really quick. The site and REST API are both hosted on a slow platform and would benefit from deployment to an enterprise platform. There are a bunch of improvements that can be performed server side and this is where performance improvements should be considered so that all the client benefit.

More can be done on the client, like some graphics to better illustrate the response from the search. But this is version 1. And remember, all done in less than a day.

Performance Boost for XML Stored Procedures

xml[Concrete / Mildly Interesting] Sometimes SQL Server makes you want to pull your hair out. If you are using XML types in your stored procedures and seeing some  very poor performance, I may have an answer for you in this post.

 

 

XML is working, but maybe a little too hard

Today I was looking at the performance profile of one of our applications and after a little digging I noticed that a stored procedure was performing rather poorly. We pass a lot of parameters to some of our stored procedures and when we do this we use XML. This technique allows you build parameters using LINQ XElement types and with LINQ to SQL, simply pass the XElement as a parameter to the stored procedure. It’s a great technique so it’s even more disappointing to see it perform so badly. And what’s worse, it seems it’s the simple operation of fetching out the data using nodes() that causes all the problems.

To illustrate this, take a look at the following XML:

<Transaction>
<Parameter01>1234</Parameter01>
<Parameter02>14.99</Parameter02>
<Parameter03>2012-01-31T00:00:00Z</Parameter03>
<Parameter04>0</Parameter04>
<Parameter05>14.99</Parameter05>
<Parameter06>false</Parameter06>
<Parameter07>23</Parameter07>
<Parameter08>14.99</Parameter08>
<Parameter09>2012-01-31T21:51:35.0498047Z</Parameter09>
<Parameter10>0</Parameter10>
</Transaction>

Ok, not very exciting, but you get the idea. Here I have 10 parameters in my XML. And here is a simple stored procedure that is passed the XML and fetches out the parameters:

CREATE PROCEDURE usp_XMLTest
@doc XML
AS
BEGIN
SET NOCOUNT ON;

DECLARE
@Param01 INT, @Param02 MONEY, @Param03 DATETIME, @Param04 INT, @Param05 MONEY,
@Param06 BIT, @Param07 INT, @Param08 MONEY, @Param09 DATETIME, @Param10 INT

SET @Param01 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter01') AS doc(col))
SET @Param02 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter02') AS doc(col))
SET @Param03 = (SELECT doc.col.value('.', 'DATETIME') FROM @doc.nodes('/Transaction/Parameter03') AS doc(col))
SET @Param04 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter04') AS doc(col))
SET @Param05 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter05') AS doc(col))
SET @Param06 = (SELECT doc.col.value('.', 'BIT') FROM @doc.nodes('/Transaction/Parameter06') AS doc(col))
SET @Param07 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter07') AS doc(col))
SET @Param08 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter08') AS doc(col))
SET @Param09 = (SELECT doc.col.value('.', 'DATETIME') FROM @doc.nodes('/Transaction/Parameter09') AS doc(col))
SET @Param10 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter10') AS doc(col))
END

Of course, in the real world the parameters would have meaningful names and you’d want to do more than just set a parameter with values from the XML, but even in this simple example the problem raises it’s ugly head.

Testing the stored procedure

I’ve put together a console application to gather average timings for 100 calls to the stored procedure. I’m using XElement to represent the data and LINQ to SQL to call the stored procedure:

Ok, time to light the fuse and see what develops….

The Results

The results are very unexpected and a big cause for concern:

  • TotalExecution = 6573ms
  • AverageExecution = 65ms

The Explanation

Well, that’s going to cause any program to run like a dog. So why so slow?

After some digging into MSDN articles I found http://msdn.microsoft.com/en-US/library/ms345118. This talks about ’Performance Optimizations for the XML Data Type in SQL Server 2005’ and deep in the article is discusses problems with un-typed XML and this is the root of the problem. With un-typed element based XML, it appears that selecting a node selects the  value of the node and all the children under that node. Even if there are no children as in the example above, the operation is costly because there is no XML index to work with so the XML is parsed to match the XQuery expression.

However, there are ways to improve on this situation:

Solution 1 – Use attributes, not elements

Solution 2 – Use .text() accessor

If you can’t use attributes because you don’t own the schema of the XML, use a .text() accessor get the node as text and stop any further parsing of the XML. To achieve this, alter the .nodes() expression to include the accessor:

Conclusion

XML in SQL is powerful and extremely useful. But you need to check your implementation for performance. I’ve touched on one pitfall but be warned, there are more out there.

Tip of the day - where possible, let SQL Server know as much as possible about your XML.