<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Intel Software Network Blogs &#187; Kevin Farnham</title>
	<atom:link href="http://softwareblogs.intel.com/author/kevin-farnham/feed/" rel="self" type="application/rss+xml" />
	<link>http://softwareblogs.intel.com</link>
	<description></description>
	<pubDate>Fri, 29 Aug 2008 22:50:49 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
	<language>en</language>
			<item>
		<title>Poll Result: More Developers Are Learning about TBB</title>
		<link>http://softwareblogs.intel.com/2008/03/31/poll-result-more-developers-are-learning-about-tbb/</link>
		<comments>http://softwareblogs.intel.com/2008/03/31/poll-result-more-developers-are-learning-about-tbb/#comments</comments>
		<pubDate>Tue, 01 Apr 2008 00:22:43 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/31/poll-result-more-developers-are-learning-about-tbb/</guid>
		<description><![CDATA[
The March Threading Building Blocks poll suggests that the developer community is learning about TBB, but not that many developers are actively applying TBB in actual projects. The poll asked:



At what project level are you currently applying TBB?



81 people participated in the poll, making the following selections:



75.3% (61 votes) - Just getting started (learning about [...]]]></description>
			<content:encoded><![CDATA[<p>
The March <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> poll suggests that the developer community is learning about TBB, but not that many developers are actively applying TBB in actual projects. The poll asked:
</p>
<blockquote>
<p>
At what project level are you currently applying TBB?
</p>
</blockquote>
<p>
81 people participated in the poll, making the following selections:
</p>
<blockquote>
<ul>
<li>75.3% (61 votes) - Just getting started (learning about TBB)</li>
<li>6.2% (5 votes) - Developing new software that applies TBB</li>
<li>6.2% (5 votes) - Modifying existing software to use TBB</li>
<li>4.9% (4 votes) - Designing new software that will apply TBB</li>
<li>4.9% (4 votes) - Working at multiple levels on multiple TBB projects</li>
<li>2.5% (2 votes) - Maintaining software that uses TBB</li>
</ul>
</blockquote>
<p>
While it's clear that most people who participated are investigating TBB, it's also interesting to note the breakdown for the developers who are actively using (or planning to use) TBB in actual projects. Over 11% of respondents reported that they are designing or developing new software that applies Threading Building Blocks. Almost 9% of respondents are either modifying existing software to apply TBB or maintaining software that already applies Threading Building Blocks. And another 5% of the responding developers are working at multiple levels on multiple TBB projects.
</p>
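<p>
For anyone who wants to check the grouped percentages, here is a small sketch that recomputes them from the vote counts listed above. The <em>share</em> helper and the variable names are my own, introduced only for illustration.
</p>

```python
# Vote counts copied from the poll results listed above.
votes = {
    "Just getting started (learning about TBB)": 61,
    "Developing new software that applies TBB": 5,
    "Modifying existing software to use TBB": 5,
    "Designing new software that will apply TBB": 4,
    "Working at multiple levels on multiple TBB projects": 4,
    "Maintaining software that uses TBB": 2,
}

total = sum(votes.values())  # number of poll participants


def share(*options):
    """Return the combined percentage of the vote for the given options."""
    return 100.0 * sum(votes[option] for option in options) / total


# The three groupings discussed in the text.
new_software = share("Developing new software that applies TBB",
                     "Designing new software that will apply TBB")
existing_software = share("Modifying existing software to use TBB",
                          "Maintaining software that uses TBB")
multiple_levels = share("Working at multiple levels on multiple TBB projects")

print(f"designing or developing new software: {new_software:.1f}%")
print(f"modifying or maintaining existing software: {existing_software:.1f}%")
print(f"working at multiple levels: {multiple_levels:.1f}%")
```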
<p>
The poll results show that, eight months after TBB was launched as an open source project, there is a group of developers who are deploying TBB in new and existing applications. Meanwhile, there is a much larger group of developers who are interested in the Threading Building Blocks technology. When these projects that are applying TBB are completed to a degree that they can be made publicly available, they will provide a template that can be studied by other developers, as they design and develop their own projects that apply TBB for multithreading and scaling. It will be interesting to repeat this poll after some of these projects are completed and made public.
</p>
<p>
<strong>New poll: your OS for TBB development</strong>
</p>
<p>
The April <a href="http://www.threadingbuildingblocks.org">Threading Building Blocks poll</a> has been posted. This poll asks:
</p>
<blockquote>
<p>
On what Operating System(s) do you develop your TBB applications?
</p>
</blockquote>
<p>
The response options are:
</p>
<blockquote>
<ul>
<li>FreeBSD</li>
<li>Linux</li>
<li>MacOS</li>
<li>Microsoft Windows</li>
<li>Unix</li>
<li>Unix on Windows (Cygwin, MinGW, UWIN)</li>
<li>Other</li>
<li>More than one OS</li>
</ul>
</blockquote>
<p>
Even if you're not working on an actual TBB-related project yet, feel free to vote. Just select the operating system you're using for experimenting with TBB as you learn about it.
</p>
<p>
To vote, go to the <a href="http://www.threadingbuildingblocks.org">TBB home page</a> and scroll down a bit; you'll see the poll on the right side of the page.
</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a>
</p>
<p>
<a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/31/poll-result-more-developers-are-learning-about-tbb/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Threading Building Blocks on Wikipedia</title>
		<link>http://softwareblogs.intel.com/2008/03/28/threading-building-blocks-on-wikipedia/</link>
		<comments>http://softwareblogs.intel.com/2008/03/28/threading-building-blocks-on-wikipedia/#comments</comments>
		<pubDate>Sat, 29 Mar 2008 05:24:30 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/28/threading-building-blocks-on-wikipedia/</guid>
		<description><![CDATA[ I just finished adding new information to the Threading Building Blocks entry on Wikipedia (http://en.wikipedia.org/wiki/Threading_Building_Blocks). I added information on what's happened since TBB became an open source project, and I also added two new sections:

Open Source Operating Systems that Offer TBB Packages
Open Source Projects that Apply TBB

I consider Wikipedia one of the greatest new [...]]]></description>
			<content:encoded><![CDATA[<p> I just finished adding new information to the <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> entry on Wikipedia (<a href="http://en.wikipedia.org/wiki/Threading_Building_Blocks">http://en.wikipedia.org/wiki/Threading_Building_Blocks</a>). I added information on what's happened since TBB became an open source project, and I also added two new sections:</p>
<ul>
<li>Open Source Operating Systems that Offer TBB Packages</li>
<li>Open Source Projects that Apply TBB</li>
</ul>
<p>I consider Wikipedia one of the greatest new informational resources the Web has produced. It provides information to those who want to find out about something, but it also provides an open venue for anyone who wants to contribute information to the world in their area of expertise.</p>
<p>I encourage the Threading Building Blocks community to extend the <a href="http://en.wikipedia.org/wiki/Threading_Building_Blocks">TBB Wikipedia page</a> by supplementing the information about TBB itself, or adding new information about your own TBB-related projects, and about Linux and other operating systems that are making TBB available through their package manager systems.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/28/threading-building-blocks-on-wikipedia/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Threading Building Blocks and Linux Distributions, Part 2</title>
		<link>http://softwareblogs.intel.com/2008/03/26/threading-building-blocks-and-linux-distributions-part-2/</link>
		<comments>http://softwareblogs.intel.com/2008/03/26/threading-building-blocks-and-linux-distributions-part-2/#comments</comments>
		<pubDate>Wed, 26 Mar 2008 17:29:00 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/26/threading-building-blocks-and-linux-distributions-part-2/</guid>
		<description><![CDATA[ In my last post I talked about the availability of Threading Building Blocks packages in Debian Linux, Ubuntu Linux, and the Fedora Project. In this post, I'll investigate TBB's availability in other Linux distributions and also in FreeBSD.
Commercial TBB supported Linux distros
The Commercial TBB site includes a System Requirements section that identifies several Linux [...]]]></description>
			<content:encoded><![CDATA[<p> In my <a href="http://softwareblogs.intel.com/2008/03/24/threading-building-blocks-and-linux-distributions-part-1/">last post</a> I talked about the availability of <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> packages in <a href="http://www.debian.org">Debian Linux</a>, <a href="http://www.ubuntu.com/">Ubuntu Linux</a>, and the <a href="http://fedoraproject.org/">Fedora Project</a>. In this post, I'll investigate TBB's availability in other Linux distributions and also in FreeBSD.</p>
<p><strong>Commercial TBB supported Linux distros</strong></p>
<p>The <a href="http://www.threadingbuildingblocks.com">Commercial TBB</a> site includes a <a href="http://www.intel.com/cd/software/products/asmo-na/eng/threading/294797.htm#sysreq">System Requirements</a> section that identifies several Linux distributions on which TBB has been tested. These distributions are:</p>
<ul>
<li>Red Hat Enterprise Linux* 3, 4, or 5</li>
<li>Red Hat Fedora Core* 4, 5, or 6 (not supported on Itanium-based systems)</li>
<li>Asianux* 2.0</li>
<li>Red Flag DC Server* 5.0</li>
<li>Haansoft Linux* Server 2006</li>
<li>Miracle Linux v4.0</li>
<li>SuSE Linux Enterprise Server* 9 or 10</li>
<li>SGI Propack* 4.0 (supported on Itanium-based systems only)</li>
<li>SGI Propack 5.0 (not with IA-32 architecture processors)</li>
<li>Mandriva/Mandrake Linux 10.1.06 (not with Intel Itanium processors)</li>
<li>Turbolinux GreatTurbo* Enterprise Server 10 SP1 (not with Intel Itanium processors)</li>
</ul>
<p>This list tells us that commercial TBB has been installed and tested on these distributions, but it doesn't tell us which of these distributions offers or plans to offer TBB open source packages. In July 2007 (when TBB Open Source was announced), people associated with several of these distributions commented on TBB open source (see the <a href="http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30238853.aspx">"TBB's status with Operating System Vendors (OSVs)"</a> TBB forum post), suggesting that they planned to make TBB "more easily assessible" on their system (Novell/OpenSUSE), or that they are bundling TBB with their distribution (Asianux and Turbolinux). So, I did some searching to see if I could find out the current status of TBB and these distributions.</p>
<p><strong>Red Hat Enterprise Linux:</strong> no open source TBB package information found...</p>
<p><strong>Fedora 8:</strong> As I reported in my last post, the <a href="http://threadingbuildingblocks.org/ver.php?fid=84">tbb20_20070927oss</a> stable release has apparently been packaged into Fedora 8 and is available as an <a href="http://download.fedora.redhat.com/pub/fedora/linux/updates/8/SRPMS/tbb-2.0-4.20070927.fc8.src.rpm">RPM file</a> on the Fedora Download Server.</p>
<p><strong>Asianux, Red Flag, Haansoft Linux, Miracle Linux:</strong> the <a href="http://www.asianux.com/aboutAX.do">Asianux About</a> page suggests that these are all essentially the same Linux distribution, just distributed by different vendors. I wasn't able to find any references to TBB in the Asianux bug list. I created an account so I could access the technical support site, but there is a new account approval procedure that blocked my access. Searching on the web, I found a description of a purchasable <a href="http://translate.google.com/translate?hl=en&amp;sl=ko&amp;u=http://www.haansoftlinux.com/product/server/server_value.php&amp;sa=X&amp;oi=translate&amp;resnum=9&amp;ct=result&amp;prev=/search%3Fq%3Dasianux%2B%2522Threading%2BBuilding%2BBlocks%2522%26start%3D10%26hl%3Den%26client%3Diceweasel-a%26rls%3Dorg.debian:en-US:unofficial%26sa%3DN">Value Pack</a> of Intel software for Haansoft Linux. So, perhaps the way you get TBB onto Asianux and the related distributions is through purchasing and installing a commercial set of Intel software. I found no information about open source TBB packages for Asianux.</p>
<p><strong>SuSE:</strong> no open source TBB package information found...</p>
<p><strong>SGI Propack:</strong> this is a purchasable software package designed for SGI Altix computers.</p>
<p><strong>Mandriva/Mandrake:</strong> For Mandriva, I searched the forums and mailing list archives, and did general web searches; no open source TBB package information turned up.</p>
<p><strong>TBB in FreeBSD</strong></p>
<p><a href="http://www.freebsd.org">FreeBSD</a> has embraced open source TBB since soon after its inception. The latest update to the <a href="http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/tbb/">FreeBSD TBB port</a> was made on February 7, 2008. I believe this means that the <a href="http://softwareblogs.intel.com/2008/02/13/threading-building-blocks-early-2008-development-releases/">tbb20_20080207oss</a> development release of TBB is available in FreeBSD. That release is no longer listed on the <a href="http://threadingbuildingblocks.org/download.php">TBB Downloads</a> site. Because of this, I'm not sure what will happen if you try to install TBB using the FreeBSD package manager. It depends on whether the FreeBSD package relies on a link to the original tbb20_20080207oss download, and on whether that download link still exists (even though we can no longer navigate to it using a browser).</p>
<p><strong>Conclusion</strong></p>
<p>Threading Building Blocks is used by many developers on Linux and FreeBSD systems. Packaging of TBB is actively under way for <a href="http://www.debian.org">Debian Linux</a>, <a href="http://www.ubuntu.com/">Ubuntu Linux</a>, and the <a href="http://fedoraproject.org/">Fedora Project</a>; TBB is also available through the <a href="http://www.freebsd.org">FreeBSD</a> package manager.</p>
<p>It's likely that work is being done to package Threading Building Blocks for other Linux distributions; but, at the moment, there does not appear to be publicly available information about these efforts.</p>
<p>Fortunately, it's relatively easy to install TBB onto any Linux system. You install the source and build it; or you can download one of the prebuilt Linux binary downloads that are delivered with the TBB <a href="http://threadingbuildingblocks.org/file.php?fid=78">Commercial Aligned</a> releases. Once you set your environment variables (see the <em>tbbvars.*</em> files), you'll be set to go!</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/26/threading-building-blocks-and-linux-distributions-part-2/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Threading Building Blocks and Linux Distributions, Part 1</title>
		<link>http://softwareblogs.intel.com/2008/03/24/threading-building-blocks-and-linux-distributions-part-1/</link>
		<comments>http://softwareblogs.intel.com/2008/03/24/threading-building-blocks-and-linux-distributions-part-1/#comments</comments>
		<pubDate>Mon, 24 Mar 2008 18:52:29 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/24/threading-building-blocks-and-linux-distributions-part-1/</guid>
		<description><![CDATA[ Most of the people I've conversed with who are developing applications using Threading Building Blocks are working on Linux platforms. On the #tbb IRC channel, I've talked with people who are working with TBB on Gentoo, Debian, and Ubuntu Linux, and I'm sure several other distributions are represented as well.
Threading Building Blocks can be [...]]]></description>
			<content:encoded><![CDATA[<p> Most of the people I've conversed with who are developing applications using <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> are working on Linux platforms. On the #tbb IRC channel, I've talked with people who are working with TBB on <a href="http://www.gentoo.org/">Gentoo</a>, <a href="http://www.debian.org">Debian</a>, and <a href="http://www.ubuntu.com/">Ubuntu</a> Linux, and I'm sure several other distributions are represented as well.</p>
<p>Threading Building Blocks can be installed on any Linux system by downloading the source and directly building TBB. I did it this way for my Gentoo system (see my <a href="http://softwareblogs.intel.com/2007/08/02/threading-building-blocks-amd-athlon-64-x2-and-gentoo-linux/">"Threading Building Blocks, AMD Athlon 64 X2, and Gentoo Linux"</a> post). But in the eight months since TBB became an open source project, a significant amount of work has also been done by developers in packaging TBB for specific Linux distributions, enabling installation of TBB using each distribution's package manager.</p>
<p>In this series of posts I'll provide an overview of the status of TBB on various Linux distributions. As far as I know, TBB will be included in several upcoming Linux releases, but is not yet available as a package in current stable Linux releases. Intel thoroughly tested TBB on several Linux distributions before TBB became an open source project, so these distributions would seem to be excellent candidates for packaging open source TBB. I'll investigate all of these as well, over the course of these blogs.</p>
<p><strong>Ubuntu and Debian</strong></p>
<p>As I reported in January, Threading Building Blocks is being <a href="http://softwareblogs.intel.com/2008/01/11/threading-building-blocks-packaged-into-ubuntu-hardy-heron/">packaged into Ubuntu Hardy Heron</a>, which is expected to be released in April (the beta can be downloaded now). This came about through a <a href="https://bugs.launchpad.net/ubuntu/+bug/181137">request</a> submitted to the Ubuntu bug list by Sadiq Jaffer, a regular visitor to the #tbb IRC channel. Though the period for syncing new packages from Debian had expired, the Ubuntu team made an exception for TBB. So, TBB will be <a href="http://packages.ubuntu.com/hardy/libdevel/libtbb-dev">available to Ubuntu Hardy Heron users</a> using the Ubuntu package manager.</p>
<p>TBB being in Ubuntu Hardy Heron was of course possible only because of the efforts of people at <a href="http://www.athenacr.com/">Athena Capital Research</a> who packaged TBB into <a href="http://packages.debian.org/source/sid/tbb">Debian Sid</a>. For Debian users who are working with the current stable Etch version, Athena's Roberto Sanchez has created a non-official set of <a href="http://people.connexer.com/~roberto/debian/">Etch TBB packages</a>. I installed TBB onto my Debian Etch machine using Debian's APT package manager and Roberto's Etch TBB packages, and described that process in my <a href="http://softwareblogs.intel.com/2008/01/04/threading-building-blocks-debian-linux-packages/">"Threading Building Blocks Debian Linux Packages"</a> post.</p>
<p><strong>Fedora</strong></p>
<p>The status of Threading Building Blocks on Fedora is a bit confusing. The Fedora Package Database site indicates that TBB has been <a href="https://admin.fedoraproject.org/pkgdb/packages/name/tbb">approved for inclusion in Fedora 8</a>, but the package has not yet been implemented. The <a href="http://blog.fedoramd.org/">FedoraMD.org blog</a> documented <a href="http://blog.fedoramd.org/2008/02/13/tbb-20-420070927fc8src/">TBB's approval</a> for Fedora 8 on February 13, 2008.</p>
<p>But further searching located a downloadable <a href="http://download.fedora.redhat.com/pub/fedora/linux/updates/8/SRPMS/tbb-2.0-4.20070927.fc8.src.rpm">tbb-2.0-4.20070927.fc8.src.rpm</a> file on the Fedora Download Server. This package is based on the <a href="http://threadingbuildingblocks.org/ver.php?fid=84">tbb20_20070927oss</a> release, which is now classified as a stable release. You can see an overview of the features that were new in this release in my <a href="http://softwareblogs.intel.com/2007/12/07/threading-building-blocks-open-source-release-versions-matrix/">"Threading Building Blocks Open Source Release Versions Matrix"</a> post. The 20070927 release also includes changes that were initially released in the earlier 20070815 and 20070719 TBB releases.</p>
<p><strong>Conclusion</strong></p>
<p>Debian, Ubuntu, and Fedora are three of the most widely used Linux distributions. In all three cases, TBB is (or will soon be) available using the respective distribution package managers.</p>
<p>In my next post I'll look into TBB availability on some other popular Linux distributions, and I'll also look at TBB on FreeBSD.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/24/threading-building-blocks-and-linux-distributions-part-1/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Superlinearity and Algorithmic Complexity; or, My Interesting Conversation with Herb Sutter</title>
		<link>http://softwareblogs.intel.com/2008/03/19/superlinearity-and-algorithmic-complexity-or-my-interesting-conversation-with-herb-sutter/</link>
		<comments>http://softwareblogs.intel.com/2008/03/19/superlinearity-and-algorithmic-complexity-or-my-interesting-conversation-with-herb-sutter/#comments</comments>
		<pubDate>Wed, 19 Mar 2008 16:55:51 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/19/superlinearity-and-algorithmic-complexity-or-my-interesting-conversation-with-herb-sutter/</guid>
		<description><![CDATA[ In my recent "Superlinearity Is Impossible; We Just Don't Always Think Correctly" I argued that algorithmic processing superlinearity is impossible. It might appear that a parallel application was achieving superlinearity, but that appearance was due to factors other than the algorithm itself engendering a more efficient computation. For example, memory access might be more [...]]]></description>
			<content:encoded><![CDATA[<p> In my recent <a href="http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/">"Superlinearity Is Impossible; We Just Don't Always Think Correctly"</a> I argued that algorithmic processing superlinearity is impossible. It might appear that a parallel application was achieving superlinearity, but that appearance was due to factors other than the algorithm itself engendering a more efficient computation. For example, memory access might be more efficient in the parallel algorithm. But this would not mean the limits defined by <a href="http://en.wikipedia.org/wiki/Amdahl's_law">Amdahl's Law</a> had been surpassed. I agreed with the statement Herb Sutter makes in his <a href="http://www.ddj.com/cpp/206100542">"Going Superlinear"</a> article in the February <a href="http://www.ddj.com">Dr. Dobb's Journal</a>:</p>
<blockquote><p> "But wait," someone could complain, "your example so far is unfair because you've stacked the deck. The truth is that, when you find a superlinear speedup, what you've really found is an inefficiency in the sequential algorithm."</p></blockquote>
<p><strong>When parallelism is simpler</strong></p>
<p>I had the good fortune to be able to speak with Herb Sutter today about these issues. We found ourselves in agreement in most areas. We agree, for example, that memory issues can make an enormous performance difference when you compare an algorithm run in parallel with a serial algorithm that retraces the data processing of the parallel algorithm in an element-by-element manner.</p>
<p>But even when I asked Herb to ignore memory issues (which is where I believe the greatest advantages will lie for some parallel algorithms), he remained quite adamant that superlinearity is possible. How so? Well, Herb mentioned things like algorithm complexity, and the known or unknown characteristics of the data. As I pondered this, Herb asked that I re-look at the last section of his "Going Superlinear" article. Here we find (among other things) a comparison between a simple parallel algorithm and the serial algorithm that retraces the parallel algorithm's data processing element-by-element. The serial algorithm, in this case, is:</p>
<blockquote><p> more complex. It has to do more bookkeeping than a simple linear traversal. This additional work can be a small additional source of performance overhead ...</p></blockquote>
<p>This is indeed correct. The "bread-slicing" example in my <a href="http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/">"Superlinearity Is Impossible"</a> post would involve slightly more complicated programming than the simple incremented loop that could be easily parallelized using something like <a href="http://www.ThreadingBuildingBlocks.org">TBB</a>.</p>
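<p>To make the extra bookkeeping concrete, here is a minimal sketch of the two serial traversals being compared: a plain linear scan, and a serial scan that visits elements in the order a set of parallel workers would. This is my own illustration, not code from Herb's article; the function names and the choice of equal-sized contiguous ranges are assumptions.</p>

```python
def linear_search(data, target):
    """Plain left-to-right scan; returns (index, elements_examined)."""
    for steps, value in enumerate(data, start=1):
        if value == target:
            return steps - 1, steps
    return -1, len(data)


def interleaved_search(data, target, chunks):
    """Serial scan that mimics the visiting order of `chunks` parallel workers.

    The data is split into `chunks` contiguous ranges, and each pass examines
    the next element of every range in turn. Returns (index, elements_examined).
    """
    n = len(data)
    size = -(-n // chunks)  # ceiling division: length of each range
    steps = 0
    for offset in range(size):
        for c in range(chunks):
            i = c * size + offset
            if i >= n:
                continue  # extra bookkeeping: the last range may be shorter
            steps += 1
            if data[i] == target:
                return i, steps
    return -1, steps


data = list(range(1000))
print(linear_search(data, 760))
print(interleaved_search(data, 760, chunks=4))
```

<p>With 1000 elements split into four ranges, a target at index 760 sits near the start of the last range, so the interleaved scan examines 44 elements where the linear scan examines 761. Move the target to index 240 and the comparison reverses, which is why the expected data distribution matters so much in this discussion.</p>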
<p>Here's Herb's conclusion about the serial algorithm that mimics the parallel processing and the parallel algorithm:</p>
<blockquote><p> when we're comparing the proposed algorithm with simple parallel search, we're not really comparing apples with apples. We are comparing:</p>
<ul>
<li>A complex sequential algorithm that has been designed to optimize for certain expected data distributions, and</li>
<li>A simple parallel algorithm that doesn't make assumptions about distributions, works well for a wide range of distributions, and naturally takes advantage of the special ones the optimized one is trying to exploit ...</li>
</ul>
</blockquote>
<p><strong>So, is superlinearity possible?</strong></p>
<p>Have I been proven wrong? I don't think so. When we talk about "superlinear" speedups, I think it all comes down to a matter of definition: what do we each mean by "superlinear performance"? The comments posted to my "Superlinearity Is Impossible" post revealed that different people think about this quite differently.</p>
<p>My background is physics, and mathematical modeling and simulation. I'm accustomed to thinking in terms of an ideal world that doesn't really exist when I think of equations (a world full of infinitely-long wires, which can be approximated as having an infinitesimal thickness, etc.). I view <a href="http://en.wikipedia.org/wiki/Amdahl's_law">Amdahl's Law</a> as existing in this type of realm. Hence, I asked Herb to pretend memory access is instantaneous as we discussed whether or not superlinearity is possible.</p>
<p>In this "purist" theoretical way of looking at superlinearity and Amdahl's Law, I still consider superlinearity impossible. I view Amdahl's Law as akin to the "law" of gravity. Just because I see an airplane go up into the air doesn't mean the law of gravity has been broken. Other factors were involved.</p>
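<p>For reference, here is a small sketch of Amdahl's Law in its usual textbook form, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The function and parameter names are my own, and the model assumes exactly the idealized world described above: no memory effects and no scheduling overhead.</p>

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Print the ideal speedup for a few parallel fractions and worker counts.
for p in (0.5, 0.9, 1.0):
    print(p, [round(amdahl_speedup(p, n), 2) for n in (1, 2, 4, 8)])
```

<p>Under this formula the speedup reaches n only when the whole program is parallelizable (p = 1) and is lower otherwise. A measured speedup greater than n therefore has to come from something the formula does not model, such as memory behavior or a difference between the two algorithms being compared.</p>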
<p>The fact that duplication of the parallel algorithm's element-by-element processing can require a more complex serial algorithm means you really aren't comparing apples with apples, as Herb says. Nor can you; this situation seems inescapable.</p>
<p>Indeed, I ran several actual tests tonight on my quad-core system, writing some very simple programs where I eliminated advantages due to memory cache as much as possible. I found many cases where the more complex serial algorithm was slower than the simple algorithm that would have been parallelized; but surprisingly, sometimes my compiler's optimization actually made the more complex serial algorithm run faster than its simpler cousin! If I turned the optimizer off, then the more complex algorithm always took more time to complete.</p>
<p>So, this is a complex realm. Looking at it all through a practical (i.e., non-purist, not rigidly theoretical) lens, I certainly agree that there are software engineering techniques which, when applied to software that is intended for a parallel environment, will produce programs that will complete their tasks in an amount of time that implies superlinear performance when compared with an equivalent serial algorithm. I do not, however, believe that this means that the computational limits represented by Amdahl's Law are being violated.</p>
<p><strong>The implications for programming education</strong></p>
<p>All this complexity implies that parallel programming might be best considered as being its own unique realm when it comes to performance optimization techniques. Even discounting the standard problems involving thread-safety, race conditions, deadlocks, etc., optimized parallel programming is a very different creature from optimized serial programming. To consider parallel programming as being a "next natural stage" that is somehow in sync with the serial programming most developers already understand is quite possibly a mistake, in varying ways. Software performance optimization within the parallel realm can be <em>very</em> different, as Herb Sutter and others are showing.</p>
<p>So, once again: is superlinear performance through parallelization possible? Can Amdahl's Law be broken? You tell me...</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/19/superlinearity-and-algorithmic-complexity-or-my-interesting-conversation-with-herb-sutter/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Multicore Race Continues: Who, How, and Why</title>
		<link>http://softwareblogs.intel.com/2008/03/18/the-multicore-race-continues-who-how-and-why/</link>
		<comments>http://softwareblogs.intel.com/2008/03/18/the-multicore-race-continues-who-how-and-why/#comments</comments>
		<pubDate>Tue, 18 Mar 2008 16:42:31 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/18/the-multicore-race-continues-who-how-and-why/</guid>
		<description><![CDATA[ An article in this past Saturday's Wall Street Journal (WSJ) titled "Racing to Gain Edge On Multicore Chips" talks about an effort being jointly funded by Intel and Microsoft for research on programming techniques suited for multi/many core computers. The article talks about the predicted many-core future:
 Intel and rival Advanced Micro Devices Inc. [...]]]></description>
			<content:encoded><![CDATA[<p> An article in this past Saturday's <a href="http://www.wsj.com">Wall Street Journal</a> (WSJ) titled <a href="http://online.wsj.com/article/SB120572280352740819.html">"Racing to Gain Edge On Multicore Chips"</a> talks about an effort being jointly funded by Intel and Microsoft for research on programming techniques suited for multi/many core computers. The article talks about the predicted many-core future:</p>
<blockquote><p> Intel and rival Advanced Micro Devices Inc. now have four-processor offerings, but companies predict the advent of "many-core" chips with dozens or even hundreds of microprocessors.</p></blockquote>
<p>and about the problems such computers present to software developers:</p>
<blockquote><p> "Everybody is madly racing toward multicore technology, and they don't have a clue about how to program it," said William Dally, a professor of computer science at Stanford University.</p></blockquote>
<p>Of course, this is precisely the problem that <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> is designed to address. I've worked on the TBB open source project since it was launched at OSCON last July, and TBB continues to look to me like an excellent solution to the multicore problem. I wish TBB had been around in 1993 when I first started developing mathematical modeling and simulation software for operation on multiprocessor Sun machines!</p>
<p><strong>But, who needs multithreaded programs???</strong></p>
<p>Over and over, I've heard the question: "who really needs multithreaded programs on their home (or office) desktop computer?" This question reminds me so much of questions from the past, like "Who really needs more than 640 KBytes of memory on a PC? How could you ever use so much memory???"</p>
<p>The WSJ article includes some answers to the question of how all those dozens or hundreds of cores might be used. Here are a few snippets:</p>
<blockquote>
<ul>
<li>jobs such as three-dimensional graphics</li>
<li>automatically picking out faces from large databases of photographs</li>
<li>helping a handheld device photograph and analyze the calorie content of each meal</li>
<li>in the financial sector, ... millions of transactions need to be swiftly analyzed for patterns that can make investors money</li>
</ul>
</blockquote>
<p>Now, some of these applications may sound non-critical for the future survival of society. But that doesn't mean people won't want these apps if they become available. "Find me all the pictures in my thousands of digital images that include Sally" or "find me all the pictures on my hard drive that include both Sally and Toby"--I can easily imagine that application becoming one that is taken for granted in the relatively near future. "What? You don't have that app? How do you live without it? Huh? You still <em>manually</em> search through your digital images???"</p>
<p>It doesn't have to be a life-critical application in order for it to be an app that everyone wants to have.</p>
<blockquote><p> "The future is really around media-rich applications," said Phil Hester, Advanced Micro's chief technology officer, who argued those jobs are particularly suited to chips with multiple types of microprocessor cores.</p></blockquote>
<p>I've long agreed with this. And I think the investment by Microsoft and Intel is further evidence that the day when highly computational multithreaded applications are a standard on every home and office desktop (and even on many hand-held devices) is on its way.</p>
<p>I also expect that TBB will play an important role as all of this unfolds...</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/18/the-multicore-race-continues-who-how-and-why/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Superlinearity Is Impossible; We Just Don't Always Think Correctly</title>
		<link>http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/</link>
		<comments>http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 17:05:10 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/</guid>
		<description><![CDATA[ I just received the April 2008 edition of Dr. Dobb's Journal (in my physical mailbox), and in this issue Herb Sutter continues his discussion of super linearity (where parallelizing a function produces a greater performance improvement than the number of applied processing cores). Here's Herb's article: "Super Linearity and the Bigger Machine".
I've thought a [...]]]></description>
			<content:encoded><![CDATA[<p> I just received the April 2008 edition of <a href="http://www.ddj.com">Dr. Dobb's Journal</a> (in my physical mailbox), and in this issue Herb Sutter continues his discussion of super linearity (where parallelizing a function produces a greater performance improvement than the number of applied processing cores). Here's Herb's article: <a href="http://www.ddj.com/architect/206903306">"Super Linearity and the Bigger Machine"</a>.</p>
<p>I've thought a lot about Herb's last article, which I commented on in my <a href="http://softwareblogs.intel.com/2008/02/15/can-parallelism-achieve-superlinear-performance-gains/">"Can Parallelism Achieve Superlinear Performance Gains"</a> post. In that article, Herb used the example of searching for data in a situation where the data is clumped as representing a case where superlinear performance is possible when an application is multithreaded.</p>
<p>My further thoughts about this situation brought me to the conclusion that a multithreaded search compared with the single-threaded search Herb posited would indeed achieve apparent superlinear performance. But, that is because the sensible approach for searching for clumped data would not be an element-by-element linear search through an array of data. In other words, the superlinearity would be achieved only because the proposed single-threaded methodology was not the ideal approach to the problem.</p>
<p><strong>What if you lost your ring in a loaf of bread?</strong></p>
<p>Think about it this way. You are baking a loaf of bread. You suddenly realize that your favorite ring is no longer on your finger! You search all over the counters. Finally, you conclude your ring must be inside that loaf of bread that's baking in the oven.</p>
<p>So, you wait until the bread is fully baked, then begin your search for the ring. Now, your ring is a relatively large object that is located somewhere inside that loaf of bread. It's "clumped data" within the background of solid bread. So, if your objective is to find that ring as quickly as possible, what will your approach be? Will you start from one end of the loaf, and cut almost infinitely tiny slices until you reach the ring? Or, would common sense suggest that you chop the loaf into slices that are more ring-sized? After all, you only have to touch the ring to have found it.</p>
<p>The single-threaded method of starting at one end of the loaf and slicing tiny slices will clearly not be the most efficient method. The parallel method of chopping the loaf into eighths, so that you, your family, and several friends can each search 1/8th of the loaf, will in most cases find the ring more than 8 times faster than the tiny-slices-starting-at-one-end method.</p>
<p>But would this really be superlinear performance? I think not. Because you yourself could have (should have) cut the loaf into eighths, then into 16ths, etc. That would have been the most efficient single-threaded method of finding that clumped data (i.e., your lost ring).</p>
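<p>To make the bread analogy concrete, here is a minimal single-threaded sketch. It is my own illustration, not code from Herb's article: the function names, the <code>make_clumped</code> helper, and the sizes used below are all hypothetical. It assumes the clump is contiguous and that we know roughly how wide it is, so probing once per clump-width is guaranteed to touch it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where the clump was hit (or data.size() if it was not found) and how many
// elements were examined along the way.
struct SearchResult {
    std::size_t index;
    std::size_t probes;
};

// Element-by-element scan: the "tiny slices from one end" approach.
SearchResult linear_search(const std::vector<bool>& data) {
    std::size_t probes = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        ++probes;
        if (data[i]) return {i, probes};
    }
    return {data.size(), probes};
}

// Probe every `stride`-th element: the "ring-sized slices" approach.
// Guaranteed to hit a contiguous clump whose length is at least `stride`.
SearchResult strided_search(const std::vector<bool>& data, std::size_t stride) {
    assert(stride > 0);
    std::size_t probes = 0;
    for (std::size_t i = 0; i < data.size(); i += stride) {
        ++probes;
        if (data[i]) return {i, probes};
    }
    return {data.size(), probes};
}

// Hypothetical test data: `size` elements with one clump of `clump_len`
// set elements starting at `clump_start`.
std::vector<bool> make_clumped(std::size_t size, std::size_t clump_start,
                               std::size_t clump_len) {
    std::vector<bool> data(size, false);
    for (std::size_t i = clump_start; i < clump_start + clump_len && i < size; ++i)
        data[i] = true;
    return data;
}
```

<p>With a one-million-element array whose 1,000-element clump starts at index 900,000, <code>linear_search</code> examines 900,001 elements while <code>strided_search</code> with a stride of 1,000 examines 901. Both run on a single thread, which is the point: the gap comes from the choice of algorithm, not from adding cores.</p>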
<p><strong>Conclusion</strong></p>
<p>Superlinear performance is impossible. However, when we're programming, our brains don't always think of strategies that would be common sense if we were facing a "real world" problem (like finding a relatively large hard object inside a loaf of bread). Hence, parallelism will sometimes appear to produce superlinear results. But if we apply the most efficient method to the problem in the first place, the illusion of superlinear performance will never appear.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/17/superlinearity-is-impossible-we-just-dont-always-think-correctly/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Threading Building Blocks Early March 2008 Development Releases</title>
		<link>http://softwareblogs.intel.com/2008/03/14/threading-building-blocks-early-march-2008-development-releases/</link>
		<comments>http://softwareblogs.intel.com/2008/03/14/threading-building-blocks-early-march-2008-development-releases/#comments</comments>
		<pubDate>Sat, 15 Mar 2008 03:32:23 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/14/threading-building-blocks-early-march-2008-development-releases/</guid>
		<description><![CDATA[ The Threading Building Blocks development team has moved to a weekly development release schedule recently: there have already been two early March releases following the February 26 release that I blogged about in my last post. In this post, I aim to catch up with the development team!
2008 March 4 TBB Development release
The CHANGES [...]]]></description>
			<content:encoded><![CDATA[<p> The <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> development team has moved to a weekly development release schedule recently: there have already been two early March releases following the February 26 release that I blogged about in my <a href="http://softwareblogs.intel.com/2008/03/13/threading-building-blocks-20080226-development-release/">last post</a>. In this post, I aim to catch up with the development team!</p>
<p><strong>2008 March 4 TBB Development release</strong></p>
<p>The <em>CHANGES</em> file for the <a href="http://threadingbuildingblocks.org/ver.php?fid=99">tbb20_20080304oss</a> development release highlights these changes:</p>
<ul>
<li>Task-to-thread affinity support, previously kept under a macro, now fully legalized.</li>
<li>Work-in-progress on cache_aligned_allocator improvements.</li>
<li>Pipeline really supports parallel input stage; it's no more serialized.</li>
<li>Various improvements to code, tests, examples and Makefiles.</li>
</ul>
<p>and these bug fixes:</p>
<ul>
<li>119 - fix for scalable_malloc sometimes failing to return a big block.</li>
<li>TR575 - fixed a deadlock occurring on Windows in startup/shutdown under some conditions.</li>
</ul>
<p>An important thing to note here is the Bug 119 fix. This is a bug that was entered by "uj" into the <a href="http://threadingbuildingblocks.org/bugzilla_search.php">public TBB bug database</a>. I've been telling people for a while that, while the online TBB bug database doesn't contain every bug (the internal bug list hasn't yet been merged into the public bug database), it's still very worthwhile for TBB users to post bugs into the public database, because the TBB development team pays attention to everything that's posted there. If you find what you believe may be a bug, please do take the time to post the bug, and help improve TBB!</p>
<p>A quick look at a <code>diff</code> of the 20080226 and 20080304 releases shows significant changes (mostly additions) to the <em>cache_aligned_allocator.cpp</em> and <em>pipeline.cpp</em> files in <em>src/tbb</em>, and many changes in <em>src/tbb/task.cpp</em> related to affinity. A lot of other files have a few altered lines, many of which are again related to the affinity changes mentioned in the <em>CHANGES</em> file.</p>
<p>The <em>test_pipeline.cpp</em> and <em>test_task.cpp</em> files in the <em>src/test</em> directory were significantly modified to better test the new pipeline and task changes.</p>
<p><strong>2008 March 11 TBB Development release</strong></p>
<p>The <a href="http://threadingbuildingblocks.org/ver.php?fid=100">tbb20_20080311oss</a> development release <em>CHANGES</em> file lists the following changes:</p>
<ul>
<li>An enumerator added for pipeline filter types (serial vs. parallel).</li>
<li> New task_scheduler_observer class introduced, to observe when threads start and finish interacting with the TBB task scheduler.</li>
<li>task_scheduler_init reverted to not use internal versioned class; binary compatibility guaranteed with stable releases only.</li>
<li>Various improvements to code, tests, examples and Makefiles.</li>
</ul>
<p>To me, the new <code>task_scheduler_observer</code> class is quite interesting. It would seem that this class will give developers the ability to tune their applications for greater efficiency and performance. I'd like to investigate this new class more thoroughly in a future post.</p>
<p>Looking at a <code>diff</code> between the 20080304 and 20080311 development releases, changes in <em>src/tbb/task.cpp</em> stand out in terms of number of changed lines (most of them additions). This backs up my expressed interest in the new <code>task_scheduler_observer</code> class, since much of the code for the observer is in the <em>task.cpp</em> file. Further investigation of this is promised!</p>
<p><strong>Conclusion</strong></p>
<p>It's clear that the TBB development team is hard at work. The team is working toward an upcoming significant release--maybe not a 3.0, but a 2.1 anyway. I encourage those who are experimenting with TBB to download and work with the later development releases. There's a lot there that is not available in the latest stable or commercial-aligned releases. And, from what I'm told, much of what's in these recent development releases--all the new features and improvements to the base code, the tests, the examples--is likely to solidify into a highly-stable commercial-aligned release sometime in the next few months. I heartily encourage TBB experimenters to work with the latest TBB development releases.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/14/threading-building-blocks-early-march-2008-development-releases/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Threading Building Blocks 20080226 Development Release</title>
		<link>http://softwareblogs.intel.com/2008/03/13/threading-building-blocks-20080226-development-release/</link>
		<comments>http://softwareblogs.intel.com/2008/03/13/threading-building-blocks-20080226-development-release/#comments</comments>
		<pubDate>Thu, 13 Mar 2008 19:51:44 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/13/threading-building-blocks-20080226-development-release/</guid>
		<description><![CDATA[The tbb20_20080226oss development release of Threading Building Blocks includes the following changes (as listed in the CHANGES file included with the release):

Introduced tbb_allocator to select between standard allocator and tbb::scalable_allocator when available.
Removed spin-waiting in pipeline and concurrent_queue.
Improved performance of concurrent_hash_map by using tbb_allocator.
Improved support for Intel® Thread Checker.
Various improvements to code, tests, examples and Makefiles.

I [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://threadingbuildingblocks.org/ver.php?fid=97">tbb20_20080226oss</a> development release of <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> includes the following changes (as listed in the <em>CHANGES</em> file included with the release):</p>
<ul>
<li>Introduced <code>tbb_allocator</code> to select between standard allocator and <code>tbb::scalable_allocator</code> when available.</li>
<li>Removed spin-waiting in <code>pipeline</code> and <code>concurrent_queue</code>.</li>
<li>Improved performance of <code>concurrent_hash_map</code> by using <code>tbb_allocator</code>.</li>
<li>Improved support for Intel® Thread Checker.</li>
<li>Various improvements to code, tests, examples and Makefiles.</li>
</ul>
<p>I differenced the 20080226 code with the previous TBB development release (tbb20_20080207oss), to look more closely at the new changes.</p>
<p><strong>concurrent_queue</strong></p>
<p>The <code>concurrent_queue</code> changes involved lots of new lines. Most of the changes appear to be related to the elimination of spin-waiting. This is one of the most significant changes in the 20080226 version. Many of the changes are related to pthreads and pthread mutexes. There is also some Windows-specific new code.</p>
<p>If you've studied the TBB <code>concurrent_queue</code> source code previously, you'll want to take a look at the new code, since the changes are substantial.</p>
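<p>For readers unfamiliar with the distinction, here is a minimal sketch of a queue whose consumers block instead of spin-waiting. This is <em>not</em> TBB's implementation: it uses the C++11 standard library (<code>std::mutex</code>, <code>std::condition_variable</code>) rather than the pthreads and Windows primitives mentioned above, and the names <code>BlockingQueue</code> and <code>sum_through_queue</code> are hypothetical. It only illustrates the general idea of sleeping until work arrives rather than polling in a loop.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// A minimal blocking queue: a consumer sleeps on a condition variable
// instead of spinning while the queue is empty.
template <typename T>
class BlockingQueue {
public:
    void push(const T& value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(value);
        }
        not_empty_.notify_one();  // wake one sleeping consumer, if any
    }

    // Blocks, without burning CPU cycles, until an item is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = items_.front();
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

// Hypothetical demo: a producer thread pushes 1..count and the calling
// thread pops and sums them.
long sum_through_queue(int count) {
    assert(count >= 0);
    BlockingQueue<int> queue;
    std::thread producer([&queue, count] {
        for (int i = 1; i <= count; ++i) queue.push(i);
    });
    long total = 0;
    for (int i = 0; i < count; ++i) total += queue.pop();
    producer.join();
    return total;
}
```

<p>A spin-waiting consumer would instead loop on "is the queue empty?" and keep a core busy while doing so. Blocking trades a little wake-up latency for leaving that core free for other work.</p>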
<p><strong>pipeline</strong></p>
<p>As with the <code>concurrent_queue</code>, spin-waiting was eliminated from the TBB <code>pipeline</code> component. The changes to accomplish this in <code>pipeline</code> were much smaller than was the case for <code>concurrent_queue</code>.</p>
<p><strong>Changes in other TBB components</strong></p>
<p>The <code>atomic</code>, <code>cache_aligned_allocator</code>, <code>concurrent_hash_map</code>, <code>concurrent_vector</code>, <code>pipeline</code>, <code>scalable_allocator</code>, and <code>spin_rw_mutex</code> TBB components were changed, but the number of new or changed lines of code for these components was relatively small.</p>
<p>Quite a few <code>ASSERT</code>s were added to the <em>task</em> source files. There were quite a lot of changes to the <em>tbb_misc.*</em> files, many related to the <code>__TBB_WEAK_SYMBOLS</code> defined variable. There were changes to <em>itt_notify.cpp</em>, and scattered changes in some other files.</p>
<p><strong>Build changes</strong></p>
<p>There were a lot of changes to the files in the <em>build</em> subdirectory: the <em>*.inc</em> files for the supported operating systems and the makefiles were modified. Changes were also made to many of the example problem makefiles, and the test makefiles. In addition, there were platform-related changes to the <em>*_tbb_export.def</em> files located in the <em>src/tbb</em> subdirectory.</p>
<p>There's probably not a lot of need for developers to look closely at the build changes, unless you've made your own changes to the build files and want to see how to merge your changes with the latest TBB build files.</p>
<p><strong>Example and test program changes</strong></p>
<p>There are some fairly small changes in the <em>parallel_reduce/convex_hull</em> and <em>test_all/fibonacci</em> examples (in the <em>examples</em> subdirectory). Several of the <em>src/test</em> programs included minor changes.</p>
<p><strong>Conclusion</strong></p>
<p>The <a href="http://threadingbuildingblocks.org/ver.php?fid=97">tbb20_20080226oss</a> development release embodies almost three weeks of development work (the prior development release is dated 20080207). The new release includes a lot of small improvements in TBB components, build and platform improvements, and fairly major performance improvements for the <code>concurrent_queue</code> and <code>pipeline</code> components.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/13/threading-building-blocks-20080226-development-release/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Hacking Threading Building Blocks into Cygwin, Part 3</title>
		<link>http://softwareblogs.intel.com/2008/03/10/hacking-threading-building-blocks-into-cygwin-part-3/</link>
		<comments>http://softwareblogs.intel.com/2008/03/10/hacking-threading-building-blocks-into-cygwin-part-3/#comments</comments>
		<pubDate>Mon, 10 Mar 2008 23:29:32 +0000</pubDate>
		<dc:creator>Kevin Farnham</dc:creator>
		
		<category><![CDATA[Multicore]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Threading Building Blocks]]></category>

		<guid isPermaLink="false">http://softwareblogs.intel.com/2008/03/10/hacking-threading-building-blocks-into-cygwin-part-3/</guid>
		<description><![CDATA[ My last post about hacking Threading Building Blocks into Cygwin ended with an "Unknown OS" error in file src/tbb/tbb_misc.h. This was a good sign because it meant I had things configured correctly enough for my Cygwin GCC compiler to actually start building TBB.
To reiterate, what I'm trying to do is indeed a "hack"; I'm [...]]]></description>
			<content:encoded><![CDATA[<p> My <a href="http://softwareblogs.intel.com/2008/03/05/hacking-threading-building-blocks-into-cygwin-part-2/">last post</a> about hacking <a href="http://www.ThreadingBuildingBlocks.org">Threading Building Blocks</a> into <a href="http://cygwin.com/">Cygwin</a> ended with an "Unknown OS" error in file <em>src/tbb/tbb_misc.h</em>. This was a good sign because it meant I had things configured correctly enough for my Cygwin GCC compiler to actually start building TBB.</p>
<p>To reiterate, what I'm trying to do is indeed a "hack"; I'm not at this point doing what could be considered a nicely rendered port of TBB into the Cygwin environment. My objective is simply to find and make the modifications necessary so that I can launch Cygwin and apply TBB within that environment.</p>
<p><strong>Unknown OS redux, redux, ...</strong></p>
<p>The TBB source code in many areas is tailored for individual operating systems (<code>_WIN32</code>, <code>_WIN64</code>, <code>__linux__</code>, or <code>__APPLE__</code>). Where the code cannot be compiled due to an undefined OS, the compiler stops and displays messages of this type:</p>
<blockquote>
<pre>
In file included from ../../src/tbb/concurrent_queue.cpp:33:
../../src/tbb/tbb_misc.h:86:2: #error Unknown OS
make[1]: *** [concurrent_queue.o] Error 1</pre>
</blockquote>
<p>I had defined my operating system to be Cygwin. Hence, everywhere in the TBB code where specific code was defined for individual operating systems, I had to modify the code to provide a set of code to be compiled when the defined OS is Cygwin. Since I was building TBB using GCC, and since Cygwin is in many ways similar to Linux, I often applied the code assigned for <code>__linux__</code> for my Cygwin TBB build. In most cases, this worked.</p>
<p>Taking <em>tbb_misc.h</em> as an example, the original code included this set of OS-specific source:</p>
<blockquote>
<pre>
namespace tbb {

static volatile int number_of_workers = 0;

#if defined(__TBB_DetectNumberOfWorkers)
static inline int DetectNumberOfWorkers() {
    return __TBB_DetectNumberOfWorkers();
}
#else
#if _WIN32||_WIN64

static inline int DetectNumberOfWorkers() {
    if (!number_of_workers) {
        SYSTEM_INFO si;
        GetSystemInfo(&amp;si);
        number_of_workers = static_cast&lt;int&gt;(si.dwNumberOfProcessors);
    }
    return number_of_workers;
}

#elif __linux__ 

static inline int DetectNumberOfWorkers( void ) {
    if (!number_of_workers) {
        number_of_workers = get_nprocs();
    }
    return number_of_workers;
}

#elif __APPLE__

static inline int DetectNumberOfWorkers( void ) {
    if (!number_of_workers) {
        int name[2] = {CTL_HW, HW_AVAILCPU};
        int ncpu;
        size_t size = sizeof(ncpu);
        sysctl( name, 2, &amp;ncpu, &amp;size, NULL, 0 );
        number_of_workers = ncpu;
    }
    return number_of_workers;
}

#else

#error Unknown OS

#endif /* os kind */

#endif</pre>
</blockquote>
<p>My first attempt was to copy the <code>__linux__</code> definition of the <code>DetectNumberOfWorkers()</code> function into the <code>#error Unknown OS</code> section, and comment out the error message. In this case, it didn't work (it didn't compile). So, just to be able to move on, I hard-coded the <code>number_of_workers</code> variable to be 4 (since I'm doing this on my quad-core system). That's <em>not</em> how you'd do it if you were making a true port, mind you!</p>
<p>So, here's what my modified code segment looks like:</p>
<blockquote>
<pre>
#else

//cygwin: use known setting (quad core)
static inline int DetectNumberOfWorkers( void ) {
    //if (!number_of_workers) {
    //    number_of_workers = get_nprocs();
    //}
    number_of_workers = 4;
    return number_of_workers;
}
//#error Unknown OS

#endif /* os kind */</pre>
</blockquote>
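<p>For a true port, one option would be to query the C library instead of hard-coding the core count. The sketch below is an assumption on my part, not TBB code and not something I have verified under this Cygwin release: it relies on <code>sysconf()</code> accepting <code>_SC_NPROCESSORS_ONLN</code>, which is a widely supported extension rather than a guaranteed part of POSIX. The function name <code>DetectNumberOfWorkersPosix</code> is hypothetical.</p>

```cpp
#include <cassert>
#include <unistd.h>

// Sketch only, not part of TBB: ask the C library how many processors are
// online, and fall back to 1 if the query is unsupported or fails.
static inline int DetectNumberOfWorkersPosix() {
    static int number_of_workers = 0;  // cached after the first call
    if (!number_of_workers) {
        long n = -1;
#ifdef _SC_NPROCESSORS_ONLN
        n = sysconf(_SC_NPROCESSORS_ONLN);
#endif
        number_of_workers = (n > 0) ? static_cast<int>(n) : 1;
    }
    assert(number_of_workers >= 1);
    return number_of_workers;
}
```

<p>Falling back to 1 keeps the build usable on a platform where the query isn't available, at the cost of running single-threaded there.</p>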
<p>Next followed similar changes in response to "Unknown OS" errors in <em>src/tbb/itt_notify.cpp</em> and <em>src/tbb/cache_aligned_allocator.cpp</em>. In both cases, I replaced the "Unknown OS" error line with the code that was defined for <code>__linux__</code>.</p>
<p><strong>Issues of various kinds...</strong></p>
<p>Next, my build produced an error in <em>src/tbb/task.cpp</em>:</p>
<blockquote>
<pre>
src/tbb/task.cpp:400: error: invalid conversion from 'int*' to 'int32_t*'</pre>
</blockquote>
<p>The code in question is checking a set of multibyte characters for the string "GenuineIntel". This didn't seem critical to successful operation of TBB within the Cygwin environment, so I commented out the entire area of code that was producing the error and set the <code>result</code> to <code>true</code>:</p>
<blockquote>
<pre>
//! True if running on genuine Intel hardware
static inline bool IsGenuineIntel() {
    bool result = true;
#if defined(__TBB_cpuid)
    char info[16];
    char *genuine_string = "GenuntelineI";
//cygwin: comment out line that caused error; set result to true
//    __TBB_x86_cpuid( reinterpret_cast&lt;int *&gt;(info), 0 );
    // The multibyte chars below spell "GenuineIntel".
    //if( info[1]=='uneG' &amp;&amp; info[3]=='Ieni' &amp;&amp; info[2]=='letn' ) {
    //    result = true;
    //}
//    for (int i = 4; i &lt; 16; ++i) {
//        if ( info[i] != genuine_string[i-4] ) {
//            result = false;
//            break;
//        }
//    }
    result = true;  //cygwin
#elif __TBB_ipf
    result = true;
#else
    result = false;
#endif
    return result;
}</pre>
</blockquote>
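<p>The odd-looking <code>"GenuntelineI"</code> string in that listing is worth a note. CPUID leaf 0 returns the vendor string in the registers EBX, EDX, ECX (in that reading order), but the commented-out comparison above suggests the buffer is filled in the order EAX, EBX, ECX, EDX. The following sketch, which is my own illustration with hypothetical function names, rebuilds both byte sequences from the well-known Intel register values without executing CPUID at all.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append the four bytes of a 32-bit register value, least significant byte
// first, which is how the value sits in memory on a little-endian x86 machine.
static void append_register_bytes(std::string& out, std::uint32_t reg) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<char>((reg >> shift) & 0xFF));
}

// Bytes 4..15 of a buffer filled in register order EAX, EBX, ECX, EDX.
std::string vendor_bytes_in_buffer_order(std::uint32_t ebx, std::uint32_t ecx,
                                         std::uint32_t edx) {
    std::string s;
    append_register_bytes(s, ebx);
    append_register_bytes(s, ecx);
    append_register_bytes(s, edx);
    assert(s.size() == 12);
    return s;
}

// The conventional reading order of the vendor string: EBX, EDX, ECX.
std::string vendor_string(std::uint32_t ebx, std::uint32_t ecx,
                          std::uint32_t edx) {
    std::string s;
    append_register_bytes(s, ebx);
    append_register_bytes(s, edx);
    append_register_bytes(s, ecx);
    return s;
}

// Register values that CPUID leaf 0 reports on Intel processors.
const std::uint32_t kIntelEbx = 0x756e6547;  // "Genu"
const std::uint32_t kIntelEcx = 0x6c65746e;  // "ntel"
const std::uint32_t kIntelEdx = 0x49656e69;  // "ineI"
```

<p>Passing the three Intel constants to <code>vendor_bytes_in_buffer_order</code> yields "GenuntelineI", while <code>vendor_string</code> yields "GenuineIntel". So the scrambled literal is simply the vendor string as it lands in memory.</p>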
<p>My next problem was related to <em>build/version_info_linux.sh</em>. I had created my <em>cygwin.inc</em> file using <em>linux.inc</em> as a template. <em>version_info_linux.sh</em> applies Linux commands to produce a version information string. Many scripting commands that are valid in Linux are not valid in Cygwin. So, I created a new file, <em>build/version_info_cygwin.sh</em>, and had my <em>cygwin.inc</em> file call that instead of calling the Linux version. I hard-coded variables as necessary simply to be able to move on quickly (again, this is a hack, not a port). Here's the core section of my <em>version_info_cygwin.sh</em> file:</p>
<blockquote>
<pre>
#cygwin - based on version_info_linux.sh - many changes

echo "#define __TBB_VERSION_STRINGS \\"
#cygwin echo '"TBB: ' "BUILD_HOST\t\t"`hostname -s`" ("`arch`")"'" ENDL \'
echo '"TBB: ' "BUILD_HOST\t\tQUAD_CORE ("`arch`")"'" ENDL \'
echo '"TBB: ' "BUILD_OS\t\t"`head -1 /etc/issue | sed -e 's/\\\\//g'`'" ENDL \'
#cygwin echo '"TBB: ' "BUILD_KERNEL\t"`uname -rv`'" ENDL \'
echo '"TBB: ' "BUILD_KERNEL\t1.5.25"'" ENDL \'
echo '"TBB: ' "BUILD_GCC\t\t"`g++ -v 2&gt;&amp;1 | grep 'gcc.*version'`'" ENDL \'
[ -z "$COMPILER_VERSION" ] || echo '"TBB: ' "BUILD_COMPILER\t"$COMPILER_VERSION'" ENDL \'
echo '"TBB: ' "BUILD_GLIBC\t2.3.5"'" ENDL \'
echo '"TBB: ' "BUILD_LD\t\t"`ld -v | grep 'version'`'" ENDL \'
echo '"TBB: ' "BUILD_TARGET\t$arch on $runtime"'" ENDL \'
echo '"TBB: ' "BUILD_COMMAND\t"$*'" ENDL \'
echo ""
echo "#define __TBB_DATETIME \""`date -u`"\""</pre>
</blockquote>
<p><strong>Almost there!</strong></p>
<p>My next execution of <code>make</code> brought me all the way to the <code>ld</code> process. There, I was told that the <code>-lrt</code> option was not valid. The fix for this was simple: I edited my <em>cygwin.gcc.inc</em> file and removed the <code>-lrt</code> option:</p>
<blockquote>
<pre>
LIB_LINK_FLAGS = -shared
#cygwin LIBS = -lpthread -lrt -ldl
LIBS = -lpthread -ldl</pre>
</blockquote>
<p>The next error message was:</p>
<blockquote>
<pre>
src/tbbmalloc/MemoryAllocator.cpp:339:
   #error highestBitPos() not implemented for this platform</pre>
</blockquote>
<p>This was somewhat similar to the earlier "Unknown OS" errors. But in this case, the base TBB code already defines a fallback for <code>__ARCH_unknown</code>. I chose to reuse that code for Cygwin, to avoid having to get into the assembler instructions that are defined for the other platforms. So, this area of my <em>MemoryAllocator.cpp</em> looks like this:</p>
<blockquote>
<pre>
static inline unsigned int highestBitPos(unsigned int number)
{
    unsigned int pos;
#if __ARCH_x86_32||__ARCH_x86_64

# if __linux__||__APPLE__
    __asm__ ("bsr %1,%0" : "=r"(pos) : "r"(number));
# elif (_WIN32 &amp;&amp; (!_WIN64 || __INTEL_COMPILER))
    __asm
    {
        bsr eax, number
        mov pos, eax
    }
# elif _WIN64 &amp;&amp; _MSC_VER&gt;=1400
    _BitScanReverse((unsigned long*)&amp;pos, (unsigned long)number);
# else
//cygwin
//#   error highestBitPos() not implemented for this platform
    static unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
    MALLOC_ASSERT( number&gt;=64 &amp;&amp; number&lt;1024, ASSERT_TEXT );
    pos = bsr[ number&gt;&gt;6 ];
# endif

#elif __ARCH_ipf || __ARCH_unknown
    static unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
    MALLOC_ASSERT( number&gt;=64 &amp;&amp; number&lt;1024, ASSERT_TEXT );
    pos = bsr[ number&gt;&gt;6 ];
#else
#   error highestBitPos() not implemented for this platform
#endif
    return pos;
}</pre>
</blockquote>
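<p>To see what the table-based fallback computes, it can be compared against a plain loop that finds the index of the highest set bit. This is a sketch of my own, not TBB code. It assumes the 16-entry table is indexed by the number shifted right six bits, which is what its size suggests for inputs from 64 up to (but not including) 1024. The names <code>highestBitPosTable</code> and <code>highestBitPosLoop</code> are hypothetical.</p>

```cpp
#include <cassert>

// Table lookup in the style of the listing: valid only for 64 <= number < 1024.
static inline unsigned int highestBitPosTable(unsigned int number) {
    static const unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9};
    assert(number >= 64 && number < 1024);
    return bsr[number >> 6];
}

// Portable loop: index of the highest set bit, for any nonzero number.
static inline unsigned int highestBitPosLoop(unsigned int number) {
    assert(number != 0);
    unsigned int pos = 0;
    while (number >>= 1) ++pos;  // count how many times we can halve
    return pos;
}
```

<p>The two functions agree over the whole range the table covers. For example, both return 6 for 64 and 9 for 1023. The table avoids the loop, which presumably matters in an allocator's hot path, but the loop works for any nonzero input and needs no platform-specific instructions.</p>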
<p><strong>Success!</strong></p>
<p>I re-executed <code>make release</code>, and it was apparently successful: release versions of libtbb.so and libtbbmalloc.so were created!</p>
<p>Next I tried running <em>tbbvars.sh</em>, then making the <code>sub_string_finder</code> "Getting Started" example problem. The TBB <em>*.h</em> files were not being found. I tried lots of different things, and it became apparent that the TBB shared object files I'd just created weren't being noticed by the <code>sub_string_finder</code> build process. No matter what I did with my path definitions, nothing worked.</p>
<p>Finally, I tried copying the shared object files to new names, with the file extension changed from <em>.so</em> to <em>.dll</em>. The <code>make</code> worked!</p>
<p>I ran <code>sub_string_finder_extended</code> and got the following results:</p>
<blockquote>
<pre>
$ ./sub_string_finder_extended.exe
 Done building string.
 Done with serial version.
 Done with parallel version.
 Done validating results.
Serial version ran in 6.291 seconds
Parallel version ran in 1.627 seconds
Resulting in a speed up of 3.86663</pre>
</blockquote>
<p>TBB is working under Cygwin! I see a speedup of roughly 3.87 on my quad-core Windows system.</p>
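<p>For reference, the speedup figure is just the serial time divided by the parallel time. Here is a minimal sketch of that arithmetic, with hypothetical helper names that are not part of the TBB example, plus the parallel efficiency it implies on a given number of cores:</p>

```cpp
#include <cassert>

// Speedup: serial wall-clock time divided by parallel wall-clock time.
inline double speedup(double serial_seconds, double parallel_seconds)
{
    assert(serial_seconds > 0.0 && parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency: the fraction of ideal linear speedup achieved on `cores` cores.
inline double parallel_efficiency(double speedup_value, unsigned int cores)
{
    assert(cores > 0);
    return speedup_value / cores;
}
```

<p>With the timings above, <code>speedup(6.291, 1.627)</code> comes to about 3.87, and dividing by four cores gives an efficiency of about 0.97.</p>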
<p>Not all of the TBB examples worked. For example, tachyon and the other examples that involve graphics aren't working. This could be due to missing components in my Cygwin installation, so I'm not too worried about it. These TBB example problems do work under my current Cygwin TBB build:</p>
<ul>
<li>concurrent_hash_map/count_strings</li>
<li>GettingStarted/sub_string_finder</li>
<li>parallel_reduce/primes</li>
<li>parallel_while/parallel_preorder</li>
<li>pipeline/textfilter</li>
<li>task/tree_sum</li>
<li>test_all/fibonacci</li>
</ul>
<p><strong>Conclusion</strong></p>
<p>I'm quite thrilled to have TBB built and running using GCC under Cygwin, such that it fully utilizes all of my system's four cores. Next up: <a href="http://www.mingw.org/">MinGW</a> (Minimalist GNU for Windows); and ultimately a return to my <a href="http://softwareblogs.intel.com/2007/12/08/building-threading-building-blocks-on-uwin-part-1/">"Building Threading Building Blocks on UWIN"</a> effort.</p>
<p><strong>Kevin Farnham, O'Reilly Media</strong> <a href="http://www.ThreadingBuildingBlocks.org">TBB Open Source Community</a>, Freenode IRC #tbb, <a href="http://sourceforge.net/mail/?group_id=200923">TBB Mailing Lists</a></p>
<p><a href="http://threadingbuildingblocks.org/download.php">Download TBB</a></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareblogs.intel.com/2008/03/10/hacking-threading-building-blocks-into-cygwin-part-3/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
