<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: SOA and the N + 1 Selects Problem</title>
	<atom:link href="http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/</link>
	<description>Summa Blog</description>
	<pubDate>Thu, 11 Mar 2010 13:07:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: SOA and Authorization (Part 1): What’s so hard about it anyway? &#124; Summa Blog</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-1368</link>
		<dc:creator>SOA and Authorization (Part 1): What’s so hard about it anyway? &#124; Summa Blog</dc:creator>
		<pubDate>Thu, 30 Jul 2009 20:38:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-1368</guid>
		<description>[...] I&#8217;ve written about before, SOA puts a new twist on old problems, and it&#8217;s the same for authorization. What was a [...]</description>
		<content:encoded><![CDATA[<p>[...] I&#8217;ve written about before, SOA puts a new twist on old problems, and it&#8217;s the same for authorization. What was a [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vinay Rajadhyaksha</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-300</link>
		<dc:creator>Vinay Rajadhyaksha</dc:creator>
		<pubDate>Fri, 27 Mar 2009 05:18:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-300</guid>
		<description>This is probably an extension of the solution put forth by Ryan. How about developing two services? Service 1 retrieves customers based on filter criteria; service 2 retrieves orders for a set of customers. Note that service 2 itself would be responsible for grouping the orders for a given customer. Additionally, the service itself need not be a web service: if the customer service and order service are based on the same underlying technology, the same could be leveraged for defining the service interface. Alternatively, I would look at hosting the two services on the same box, allowing me to leverage in-VM calls for local clients and use the service interface for remote clients. The advantage here is that we have a right-grained service, which is sufficiently loosely coupled and provides options for improving performance. It does not retrieve information directly from the database, thus allowing service-specific business logic wrappers to be retained; it does not need a caching solution (in a typical scenario, for performance reasons, the order and customer databases would be independent); and it addresses the N+1 select issue.

Let me know your thoughts.</description>
		<content:encoded><![CDATA[<p>This is probably an extension of the solution put forth by Ryan. How about developing two services? Service 1 retrieves customers based on filter criteria; service 2 retrieves orders for a set of customers. Note that service 2 itself would be responsible for grouping the orders for a given customer. Additionally, the service itself need not be a web service: if the customer service and order service are based on the same underlying technology, the same could be leveraged for defining the service interface. Alternatively, I would look at hosting the two services on the same box, allowing me to leverage in-VM calls for local clients and use the service interface for remote clients. The advantage here is that we have a right-grained service, which is sufficiently loosely coupled and provides options for improving performance. It does not retrieve information directly from the database, thus allowing service-specific business logic wrappers to be retained; it does not need a caching solution (in a typical scenario, for performance reasons, the order and customer databases would be independent); and it addresses the N+1 select issue.</p>
<p>Let me know your thoughts.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben Northrop</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-222</link>
		<dc:creator>Ben Northrop</dc:creator>
		<pubDate>Thu, 26 Feb 2009 22:22:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-222</guid>
		<description>Thanks Steve!  (and good to hear from you)

Great points.  Agreed...the central database option definitely seems to be the best in terms of performance and complexity.  One way, maybe, to preserve some separation of concerns is to enforce some access rules at the DB level to, say, only allow read access to the Customer service, but not write access.

And good point about staleness being an issue not just with the caching option, but also with the n+1 (especially since global transactions aren't all that viable in SOA, so while orders are being retrieved for n customers, new orders could be coming in).</description>
		<content:encoded><![CDATA[<p>Thanks Steve!  (and good to hear from you)</p>
<p>Great points.  Agreed&#8230;the central database option definitely seems to be the best in terms of performance and complexity.  One way, maybe, to preserve some separation of concerns is to enforce some access rules at the DB level to, say, only allow read access to the Customer service, but not write access.</p>
<p>And good point about staleness being an issue not just with the caching option, but also with the n+1 (especially since global transactions aren&#8217;t all that viable in SOA, so while orders are being retrieved for n customers, new orders could be coming in).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Ayers</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-221</link>
		<dc:creator>Steve Ayers</dc:creator>
		<pubDate>Thu, 26 Feb 2009 00:38:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-221</guid>
		<description>This is a cool post, at least to me, since I have come across this issue twice in the past six months, each presenting one of the above predicaments.

In one scenario, I have a central database, which I'm guessing is going to occur the lion's share of the time this problem presents itself.  In this case, a simple join of the two tables was all it took.  I understand the tight coupling of the two services, but there are always going to be tradeoffs in any solution and obviously the correct approach is contingent on the circumstances of the problem.  In this instance, a little tight coupling in my opinion is far better than the n+1 performance issue.  I attempted the solution incorporating n+1 and the time it took was exponentially higher.  

In the other scenario, we have more of an According-to-Hoyle SOA environment in that one set of data resides in a different repository from another set.  In this instance, caching seems to be a viable option.  The configuration overhead might be an issue (albeit slight in my opinion) as would staleness, but you run the risk of staleness in any scenario; the only variable is how stale we are talking.

For instance, if I go with an n+1-esque scenario or even the common service that pairs both together, by the time this logic completes, the data could be completely invalid anyway.  n+1 as I've said is an insane performance hit and a common 'matching' service dictates that essentially ALL orders be retrieved, since you have no way of applying additional customer criteria to the order table.  In my mind, the solution that makes the most sense is always the one that offers high performance gains while at the same time keeping some level of maintainability and minimal overhead.  Mr. Obvious, I know, but people do not always think that way.

Also, I do not think the granularity of these services is far off by any means.  This is a perfectly acceptable level of granularity and is very similar to many I've run across.

Cool posts, Ben.  Love the website too, keep up the good work.  Hope all is well.</description>
		<content:encoded><![CDATA[<p>This is a cool post, at least to me, since I have come across this issue twice in the past six months, each presenting one of the above predicaments.</p>
<p>In one scenario, I have a central database, which I&#8217;m guessing is going to occur the lion&#8217;s share of the time this problem presents itself.  In this case, a simple join of the two tables was all it took.  I understand the tight coupling of the two services, but there are always going to be tradeoffs in any solution and obviously the correct approach is contingent on the circumstances of the problem.  In this instance, a little tight coupling in my opinion is far better than the n+1 performance issue.  I attempted the solution incorporating n+1 and the time it took was exponentially higher.  </p>
<p>In the other scenario, we have more of an According-to-Hoyle SOA environment in that one set of data resides in a different repository from another set.  In this instance, caching seems to be a viable option.  The configuration overhead might be an issue (albeit slight in my opinion) as would staleness, but you run the risk of staleness in any scenario; the only variable is how stale we are talking.</p>
<p>For instance, if I go with an n+1-esque scenario or even the common service that pairs both together, by the time this logic completes, the data could be completely invalid anyway.  n+1 as I&#8217;ve said is an insane performance hit and a common &#8216;matching&#8217; service dictates that essentially ALL orders be retrieved, since you have no way of applying additional customer criteria to the order table.  In my mind, the solution that makes the most sense is always the one that offers high performance gains while at the same time keeping some level of maintainability and minimal overhead.  Mr. Obvious, I know, but people do not always think that way.</p>
<p>Also, I do not think the granularity of these services is far off by any means.  This is a perfectly acceptable level of granularity and is very similar to many I&#8217;ve run across.</p>
<p>Cool posts, Ben.  Love the website too, keep up the good work.  Hope all is well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben Northrop</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-214</link>
		<dc:creator>Ben Northrop</dc:creator>
		<pubDate>Mon, 23 Feb 2009 12:48:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-214</guid>
		<description>Thanks for the comments, Ariel and Ryan.

Ariel - good point...the example is a bit fabricated, so in the real world, two such services might not make sense if these were indeed the requirements.  In general, however, I believe the problem is common to most service-oriented architectures - most likely you will encounter a requirement that necessitates the merging of data across services.

Ryan - good catch!  This is a fifth solution. However, it seems like it too would have some performance impact, and would also be fairly cumbersome from an implementation perspective, since the "join" logic would need to exist in your application code rather than your database.  Your first query would return a set of customers.  Presumably, your second query would then use the customer IDs from the first query in the WHERE clause (like "WHERE customer_id IN (1, 5, 23, ...)") and find all the orders associated with those customers.  The business tier, then, would be responsible for merging the order lists with the customer list, which could hurt performance (and maintainability).</description>
		<content:encoded><![CDATA[<p>Thanks for the comments, Ariel and Ryan.</p>
<p>Ariel - good point&#8230;the example is a bit fabricated, so in the real world, two such services might not make sense if these were indeed the requirements.  In general, however, I believe the problem is common to most service-oriented architectures - most likely you will encounter a requirement that necessitates the merging of data across services.</p>
<p>Ryan - good catch!  This is a fifth solution. However, it seems like it too would have some performance impact, and would also be fairly cumbersome from an implementation perspective, since the &#8220;join&#8221; logic would need to exist in your application code rather than your database.  Your first query would return a set of customers.  Presumably, your second query would then use the customer IDs from the first query in the WHERE clause (like &#8220;WHERE customer_id IN (1, 5, 23, &#8230;)&#8221;) and find all the orders associated with those customers.  The business tier, then, would be responsible for merging the order lists with the customer list, which could hurt performance (and maintainability).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ryan</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-199</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Fri, 20 Feb 2009 14:08:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-199</guid>
		<description>There's a very simple fifth solution, which is to add a "loadOrdersByCustomers" call to the Order service. Then you're down to two selects: load customers by criteria + load orders for a set of customers. As within a single service/project/application, loose coupling is a great design goal until it begins to impede something else, like performance in this case. Then you start adding specialized code for specific use cases. Your scenario assumes that the Order service is written without consideration of outside requirements, and you've shown that that's a bad idea.</description>
		<content:encoded><![CDATA[<p>There&#8217;s a very simple fifth solution, which is to add a &#8220;loadOrdersByCustomers&#8221; call to the Order service. Then you&#8217;re down to two selects: load customers by criteria + load orders for a set of customers. As within a single service/project/application, loose coupling is a great design goal until it begins to impede something else, like performance in this case. Then you start adding specialized code for specific use cases. Your scenario assumes that the Order service is written without consideration of outside requirements, and you&#8217;ve shown that that&#8217;s a bad idea.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ariel</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-192</link>
		<dc:creator>ariel</dc:creator>
		<pubDate>Thu, 19 Feb 2009 21:41:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-192</guid>
		<description>If those are the requirements from the application, then it means that those services are not loosely coupled and they should not be in different services.</description>
		<content:encoded><![CDATA[<p>If those are the requirements from the application, then it means that those services are not loosely coupled and they should not be in different services.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben Northrop</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-177</link>
		<dc:creator>Ben Northrop</dc:creator>
		<pubDate>Wed, 18 Feb 2009 14:32:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-177</guid>
		<description>Thanks for the comments!  

Chris - agreed that local caching is typically the best option for performance and perhaps complexity, but like you said, the costs of implementing this are non-trivial and there are big decisions to make (e.g. cache in memory or in a database?, is staleness an issue?, how to implement the pub-sub mechanism?, etc.)

Todd - A service layer like this could work, especially if there are big reuse opportunities.  Good point about the batch interface.</description>
		<content:encoded><![CDATA[<p>Thanks for the comments!  </p>
<p>Chris - agreed that local caching is typically the best option for performance and perhaps complexity, but like you said, the costs of implementing this are non-trivial and there are big decisions to make (e.g. cache in memory or in a database?, is staleness an issue?, how to implement the pub-sub mechanism?, etc.)</p>
<p>Todd - A service layer like this could work, especially if there are big reuse opportunities.  Good point about the batch interface.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: todd</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-171</link>
		<dc:creator>todd</dc:creator>
		<pubDate>Wed, 18 Feb 2009 08:31:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-171</guid>
		<description>You could create a service layer that was in charge of matching the customers and orders so everyone didn't have to create the same logic. And of course there should be batch interfaces so more than one item can be returned at a time.</description>
		<content:encoded><![CDATA[<p>You could create a service layer that was in charge of matching the customers and orders so everyone didn&#8217;t have to create the same logic. And of course there should be batch interfaces so more than one item can be returned at a time.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris Burnley</title>
		<link>http://www.summa-tech.com/blog/2009/02/17/soa-and-the-n-1-selects-problem/comment-page-1/#comment-169</link>
		<dc:creator>Chris Burnley</dc:creator>
		<pubDate>Wed, 18 Feb 2009 02:36:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.summa-tech.com/blog/?p=119#comment-169</guid>
		<description>I've come across this problem many times where I work (a bank). The solution for this is usually to cache locally with pub/sub updates. The overhead is obviously creating your own storage for the cached data.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve come across this problem many times where I work (a bank). The solution for this is usually to cache locally with pub/sub updates. The overhead is obviously creating your own storage for the cached data.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
