Monday, December 15, 2008

Quick Review of Microsoft's Seadragon

Just got around to installing Seadragon on my iPhone and took it for a quick spin. First impressions are that it rocks. In the map views, for example, zooming seems far, far smoother than in Google Maps. Zooming in on the articles in the Library of Congress Set was amazingly smooth as well.

Herding Code: Episode 29 w/ Miguel de Icaza (Part 2)

Part 2 of the Herding Code podcast w/ Miguel de Icaza is up: http://herdingcode.com/?p=114

I'll be listening to this tonight when I get home.

Sunday, December 14, 2008

Fun with Tries

A number of years ago, Michael Zucchi introduced me to the Aho-Corasick algorithm, which he implemented in Evolution for the "Find in Message" feature. Later, I extracted that logic and refactored it a bit into the ETrie class, which I then used to replace the regex logic (common among most GNOME apps that do this) for scanning message bodies for URLs (in order to insert hyperlinks, etc.). This turned out to be an order of magnitude faster.

For those who aren't in the know, the Aho-Corasick algorithm is for searching for multiple strings in a given input. In other words, it searches for one of multiple needles in a single haystack.

The pseudocode for the search itself looks something like this, where g() is the goto function, h() is the failure function, and final is the set of accepting states:

q = root
FOR i = 1 TO n
  WHILE q != fail AND g(q, text[i]) == fail
    q = h(q)
  ENDWHILE
  IF q == fail
    q = root
  ELSE
    q = g(q, text[i])
  ENDIF
  IF isElement(q, final)
    RETURN TRUE
  ENDIF
ENDFOR
RETURN FALSE

Of course, ETrie has a slight modification to the above algorithm: instead of returning TRUE or FALSE, it returns a pointer to the beginning of the match, or NULL if it finds no match.

The problem with this particular implementation, though, is that it simply returns a pointer to the first match found in the string. But what if you had multiple pattern strings that started with identical characters?

Like the ETrie implementation, the traditional Aho-Corasick algorithm (if I am remembering correctly) stops searching as soon as it finds any match. The difference, of course, is that given access to the current state / tree structure, a programmer could choose to continue matching to see if there is a greedier match. With ETrie, however, in the interest of simplicity, it just returned the first (shortest) match it found.

In case I'm explaining this badly, say our search patterns are "the", "there", and "therefor". In the case of ETrie, it would always return that it matched "the" even when the "the" that it matched was the starting substring of the larger string, "therefor".

In something I was working on more recently (in C++), I wanted a greedier match (e.g. I wanted it to match "therefor" instead of "the").

In order to achieve this, I modified the implementation a slight bit:

const char *
Trie::Search (const char *buf, size_t buflen, int *matched_id)
{
    const char *inptr, *inend, *prev, *pat;
    size_t matched = 0;
    TrieMatch *m = 0;
    TrieState *q;
    char c;

    inend = buf + buflen;
    inptr = buf;

    q = &root;
    pat = prev = inptr;
    while (inptr < inend) {
        c = *inptr++;

        if (icase && (c >= 'A' && c <= 'Z'))
            c += 0x20;

        /* follow failure links until we find a state with a goto
         * transition on c (but never abandon a match in progress) */
        while (q != 0 && (m = g (q, c)) == 0 && matched == 0)
            q = q->fail;

        if (q == &root) {
            /* back at the root: return the match we've already got,
             * or else note the current character as a potential
             * match start */
            if (matched)
                return pat;

            pat = prev;
        }

        if (q == 0) {
            /* no transition even from the root: return any match
             * we've got, otherwise restart at the root */
            if (matched)
                return pat;

            q = &root;
            pat = inptr;
        } else if (m != 0) {
            q = m->state;

            /* remember the longest (greediest) match seen so far */
            if (q->final > matched) {
                if (matched_id)
                    *matched_id = q->id;

                matched = q->final;
            }
        }

        prev = inptr;
    }

    return matched ? pat : 0;
}

I have published the full source code for trie.cpp and trie.h on my website for your perusal.
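To give a feel for how this gets used, here's a minimal usage sketch. Note that only Search() above is real; the constructor flag for case-insensitivity and the Add() method for registering patterns are assumptions on my part, so check trie.h for the actual API.

#include <cstdio>
#include <cstring>

#include "trie.h"

int main (void)
{
    Trie trie (true);  /* assumed: 'true' enables case-insensitive matching */
    int id = -1;

    /* assumed API: Add (pattern, id) registers a pattern with an id
     * that Search() will hand back via its matched_id argument */
    trie.Add ("the", 0);
    trie.Add ("there", 1);
    trie.Add ("therefor", 2);

    const char *text = "I went therefore I was";
    const char *match = trie.Search (text, strlen (text), &id);

    if (match != NULL) {
        /* with the greedy Search() above, this reports pattern #2
         * ("therefor") rather than pattern #0 ("the") */
        printf ("matched pattern #%d at offset %d\n", id, (int) (match - text));
    }

    return 0;
}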

Saturday, December 13, 2008

Herding Code: Episode 28 w/ Miguel de Icaza

Been waiting to listen to this for a few days now. Figured others might also be interested: Herding Code: Episode 28.

Discussion of the history of Mono and some of the new features that have been added recently.

Wednesday, December 3, 2008

Moonlight 1.0 beta 1

As most of you have probably heard by now, the first Beta release of Moonlight 1.0 has been announced; you can install it from http://www.go-mono.com/moonlight/.

Miguel's got a great post explaining how the multimedia stack works in Moonlight in case that sort of thing interests you.

Monday, November 3, 2008

Re: Black and White

I've just finished reading Linus' blog post entitled Black and White and I have to agree.

It's not productive to be anti-something; it is much healthier and more productive to be for something.

I think Linus' article ties back into the How To Survive Poisonous People presentation I linked to and commented on back in June.

Very often, the poisonous people in a community are those who try to make everything black and white and typically take the anti approach to things.

We all do it from time to time (myself included, *cough*), but the truly poisonous people are the ones that can't ever let things go.

Friday, October 31, 2008

Mono @ PDC 2008

For those who weren't able to attend Microsoft's Professional Developers Conference this past week, Miguel gave a presentation (wmv) about Mono detailing some exciting new developments.

For example, Mono 2.2 (slated for the first week of December) will have a C# shell which Miguel and Marek have been working on and which Microsoft has announced will be part of C# version 5, slated for release sometime in 2012 (yes, that means Mono is ahead of Microsoft!).

Also presented was Mono in video games, such as those on the Wii and iPhone platforms (which are both supported targets of the Unity3D game development studio). Some game studios are apparently also using Mono on the PlayStation3, Xbox360, Windows, and Mac platforms. The main interest here was that Unity3D has been rewritten from Objective-C (on the Mac platform) to C#, which should eventually make it portable to Linux (the C# version already runs on Windows and Mac).

Another interesting development came about because of the interest in using Mono for video games (some game development studios are now writing games in fully managed code): it was important for Mono to get SIMD support, which has now been implemented and will be available in Mono 2.2. This makes Mono extremely fast (up to 10x faster) at 3D vector math and other areas that benefit from SIMD.

This is really exciting!

Optimizing Moonlight's InkPresenter

This past week I started out staring at endless amounts of javascript, trying to figure out what it was supposed to do so that I could work out where Moonlight was breaking and fix some bugs. As you can imagine, this is a slow and boring process where one's eyes go dry and it feels like you are getting nowhere fast.

As occasionally happens, I gave up and moved on to the next bug, hoping that it would be easier to fix and/or give me some insight into the previous one. As it turned out, I moved on to bug #409793, which was a performance bug on sites like Ink Journal and Ink Tattoo Studio.

What was happening on these sites was that as more points got added to the stroke, rendering would get slower and slower (causing the line to lag behind the mouse cursor) because Moonlight was invalidating the entire InkPresenter canvas on each frame, and so had to render the entire thing even though that was unnecessary.

To optimize this, I added a 'dirty' rectangle that kept track of the regions we actually needed to redraw. As points got added to the collection between frames, I added the bounds of the new point plus the region between the new point and the previous/next points (the 'next' point is obviously only needed if the newly added point was an insertion). The result for sites like InkJournal and InkTattooStudio is that we only invalidate the newly appended points at each frame render, vastly improving our performance, which now matches Microsoft's Silverlight performance for these sites afaict.
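For the curious, the idea looks roughly like this. This is just a sketch, not the actual Moonlight code; the Point/Rect types, the half-width padding, and AccumulateDirty() are stand-ins for illustration.

#include <algorithm>

struct Point { double x, y; };

struct Rect {
    double x, y, w, h;

    /* grow this rectangle so that it also covers 'other' */
    void Union (const Rect &other)
    {
        double x0 = std::min (x, other.x);
        double y0 = std::min (y, other.y);
        double x1 = std::max (x + w, other.x + other.w);
        double y1 = std::max (y + h, other.y + other.h);

        x = x0; y = y0; w = x1 - x0; h = y1 - y0;
    }
};

/* bounds of a single stroke point, padded by half the stroke width */
static Rect
PointBounds (const Point &p, double half_width)
{
    Rect r = { p.x - half_width, p.y - half_width, 2 * half_width, 2 * half_width };
    return r;
}

/* called when a point is appended to a stroke: accumulate only the region
 * between the new point and its neighbor, so the next frame redraws just
 * that area instead of the whole InkPresenter canvas (assumes 'dirty'
 * already covers whatever else needs redrawing) */
static void
AccumulateDirty (Rect &dirty, const Point &prev, const Point &added, double half_width)
{
    Rect r = PointBounds (added, half_width);
    r.Union (PointBounds (prev, half_width));
    dirty.Union (r);
}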

Wednesday, October 29, 2008

Re: It's Time for a FOSS Community Code of Conduct

Bruce Byfield's article, It's Time for a FOSS Community Code of Conduct, has sadly made me aware that this particular problem is more widespread than I thought.

To Aaron Seigo & the rest of the KDE team: just keep rocking, dudes. Eventually these donkeys will either give up or be committed to mental institutions (or better yet, they'll cross paths with someone just like them and get what's coming to them).

To the people out there attacking developers/projects:

Like I blogged a few years ago, rampant fanboyism is fine so long as you can keep it positive. You aren't doing the project you support any favors when you badmouth the other projects.

For example, GNOME devs aren't pleased when they see GNOME fanboys attack KDE all over forums or on news sites.

I'm sure the KDE devs feel the same way.

Thursday, October 23, 2008

QuakeLight: A Silverlight Port of Quake

Looks like Channel9 has just put up a video screencast of QuakeLight, the Silverlight port of Quake - probably the game I lost the most productivity to back in the late '90s.

I'd love to play with this under Moonlight to see how well we fare...

Update: Here's a handy link to an interview with the developer.

Sunday, October 5, 2008

Parallel-Installable Gtk-Docs

I recently released GMime 2.4.0 but missed the fact that the gtk-docs would install into the same location as the old 2.2 docs, namely ${prefix}/share/gtk-doc/html/gmime. The problem is clear: since GMime 2.4 changed API (rather than just extending the 2.0/2.2 API), GMime 2.4 really needs to be fully parallel installable.

Thanks to Gilles Dartiguelongue from the Gentoo community for discovering this and bringing it to my attention.

As I tried to figure out how to accomplish what I needed, I discovered that Gtk-Doc needed some fixes so I did a quick hack and submitted it upstream.

Unfortunately, my original fix was broken in that I didn't realize that a simple version extension to the install directory wouldn't be enough, because GNOME's devhelp program expects the .devhelp and .devhelp2 files to be named with the same version extension, or they will not be detected.

After that was pointed out to me, I looked deeper into the problem and discovered it wasn't such a terribly hard fix after all; a little shell scripting magic was all that was needed ;-)

Since Stefan Kost (the gtk-doc maintainer) and I were both interested in such a solution, I figured there must be other library maintainers who would be interested in this as well, and so I thought I would blog about it.

For those interested in packaging GMime 2.4.x for the various distros, I'll be releasing 2.4.2 in the next day or so. (I'm going to wait a little longer just in case anyone finds any other last-minute parallel-install bugs in the gmime-2.4.1.1 tarball I put together for testing; it's bad enough that I've already released a gmime-2.4.1 with nothing but parallel-install fixes and that 2.4.2 will be more of the same, so I'd rather not have to issue a 2.4.3.)

Friday, October 3, 2008

Setting Up a Multi-Tabbed GNOME-Terminal Development Environment

For many developers (such as myself), the first thing we tend to do when we log in to our desktops is launch gnome-terminal (or similar) with a number of tabs, and then we typically have to set up a bunch of environment variables in each tab.

Most of us have probably simplified this at least somewhat by creating a file that sets our environment variables for us when we do

. ~/project.env
or something to that effect.

The problem is that if we have multiple tabs that we've got to source our project.env files in, this gets tedious.

This morning, Rolf shared with me a way to spawn gnome-terminal and have it automatically cd into a list of specified working-directories... but I wanted to take it a step further and have gnome-terminal auto-magically source my project environment variables.

Here's how I solved this problem:

* Step 1: Create a ~/bin/devterms shell script:

#!/bin/bash -e

MONO_ENV="/bin/bash --rcfile ~/.devtermsrc -i"

gnome-terminal \
  --tab --working-directory=/cvs/moon --command="$MONO_ENV" \
  --tab --working-directory=/cvs/moon/test --command="$MONO_ENV" \
  --tab --working-directory=/cvs/mono --command="$MONO_ENV" \
  --tab --working-directory=/cvs/monodevelop --command="$MONO_ENV" \
  &

* Step 2: Create ~/.devtermsrc rc file for bash:

# Load ~/.bashrc to setup all of my standard environment variables and aliases
test -s ~/.bashrc && . ~/.bashrc || true

# Load Mono-specific development environment variables
test -s ~/mono.env && . ~/mono.env || true

* Step 3: Create a launcher (in my case, I made ~/Desktop/Development Environment.desktop):

[Desktop Entry]
Encoding=UTF-8
Version=1.0
Type=Application
Terminal=false
Name[en_US]=Development Environment
Exec=/home/fejj/bin/devterms
Name=Development Environment
Icon=gnome-terminal

Hopefully this will help other people make their own lives a little simpler.

Thursday, September 18, 2008

Light Bot

A friend of mine sent me a link to Light Bot, a game for programmers. I'm already addicted.

Thursday, August 28, 2008

Banshee "Now Playing" Animations

Aaron has been complaining lately that writing fancy graphics hacks for Banshee by hand using low-level Cairo calls is too "hard" and too tiresome to do.

Wimp.

For Hack Week, this week, I've been fixing up the GtkSharp Moonlight widget (aka GtkSilver) to work again so that The Great Bocky can write fancy graphics hacks using XAML instead.

One area where Aaron has mentioned he'd love to do fancier animations is the "Now Playing" view, which is currently limited to a fade-in and a fade-out of the current track info (album cover w/ a reflection), track title, artist, and album name.

It took him far too long to write the code that displays that view (+ time spent optimizing it to get better than 11 FPS) and if he ever gets around to implementing anything fancier, he'll be popping those Excedrin pills like candy.

Lucky for him, I mocked up a slightly improved version of his "Now Playing" view in XAML (no code required except to update the text, album art, and to invoke the animations). All animations and graphics are represented in just 70 lines of XAML.

The great thing about designing this view in XAML is that it is easy to change the layout or adjust the animations without having to write any new code. Users can customize their "Now Playing" screens in any way they want; all they have to do is keep the names of the controls and animations consistent with what the application expects. It's a lot like Glade XML files, only it allows the user to do fancy things like animations as well.

The demo I hacked up uses the following XAML:

<Canvas xmlns="http://schemas.microsoft.com/client/2007"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Background="Black" Width="700" Height="550">
  <Canvas x:Name="Display">
    <TextBlock Canvas.Left="10" Canvas.Top="200" x:Name="TrackInfo" Width="280"
               FontFamily="Sans" FontSize="14" TextAlignment="Right" Foreground="White">
      <Run x:Name="Title" FontSize="24">Wonderboy</Run><LineBreak/>
      <Run x:Name="By" FontSize="12" Foreground="Gray">by</Run>
      <Run x:Name="Artist">Tenacious D</Run><LineBreak/>
      <Run x:Name="From" FontSize="12" Foreground="Gray">from</Run>
      <Run x:Name="Album">Tenacious D</Run>
    </TextBlock>
    <Rectangle x:Name="AlbumArt" Canvas.Left="300" Canvas.Top="50" Width="300" Height="300">
      <Rectangle.Fill>
        <ImageBrush x:Name="AlbumArtBrush" Stretch="Fill" ImageSource="tenaciousd.jpg"/>
      </Rectangle.Fill>
    </Rectangle>
    <Rectangle x:Name="AlbumArtReflection" Canvas.Left="300" Canvas.Top="50" Width="300" Height="300"
               RenderTransformOrigin="0,1">
      <Rectangle.Fill>
        <ImageBrush x:Name="AlbumArtReflectionBrush" Stretch="Fill" ImageSource="tenaciousd.jpg"/>
      </Rectangle.Fill>
      <Rectangle.OpacityMask>
        <LinearGradientBrush StartPoint="0,1" EndPoint="0,0">
          <LinearGradientBrush.GradientStops>
            <GradientStop Color="#8f8f8f8f" Offset="0"/>
            <GradientStop Color="#00000000" Offset="0.75"/>
            <GradientStop Color="#00000000" Offset="1"/>
          </LinearGradientBrush.GradientStops>
        </LinearGradientBrush>
      </Rectangle.OpacityMask>
      <Rectangle.RenderTransform>
        <TransformGroup>
          <ScaleTransform ScaleX="1" ScaleY="-0.35"/>
          <SkewTransform x:Name="SkewTransform" AngleX="45" AngleY="0"/>
        </TransformGroup>
      </Rectangle.RenderTransform>
    </Rectangle>
    <Canvas.Resources>
      <Storyboard x:Name="ReflectionAnimation">
        <DoubleAnimationUsingKeyFrames BeginTime="00:00:00" Storyboard.TargetName="AlbumArtReflection"
         Storyboard.TargetProperty="(UIElement.RenderTransform).(TransformGroup.Children)[1].(SkewTransform.AngleX)">
          <SplineDoubleKeyFrame KeyTime="00:00:00" Value="-45" KeySpline="0.25,0.00 0.75,0.75"/>
          <SplineDoubleKeyFrame KeyTime="00:00:02" Value="45" KeySpline="0.25,0.75 0.75,1.0"/>
        </DoubleAnimationUsingKeyFrames>
      </Storyboard>
      <Storyboard x:Name="FadeIn">
        <DoubleAnimationUsingKeyFrames BeginTime="00:00:00" Storyboard.TargetName="Display"
         Storyboard.TargetProperty="Opacity">
          <SplineDoubleKeyFrame KeyTime="00:00:00" Value="0"/>
          <SplineDoubleKeyFrame KeyTime="00:00:01" Value="1"/>
        </DoubleAnimationUsingKeyFrames>
      </Storyboard>
      <Storyboard x:Name="FadeOut">
        <DoubleAnimationUsingKeyFrames BeginTime="00:00:00" Storyboard.TargetName="Display"
         Storyboard.TargetProperty="Opacity">
          <SplineDoubleKeyFrame KeyTime="00:00:00" Value="1"/>
          <SplineDoubleKeyFrame KeyTime="00:00:01" Value="0"/>
        </DoubleAnimationUsingKeyFrames>
      </Storyboard>
    </Canvas.Resources>
  </Canvas>
</Canvas>

To see this in action, you'll need to grab Moonlight out of SVN (I ended up using some Silverlight 2.0 TextBlock features like TextAlignment="Right" so that no code was needed to align the text like what Aaron has to calculate manually in the current code). Once you have that installed, you should be able to view my Banshee "Now Playing" mockup in the iframe below. Mousing over the iframe should cause the fade-in effect to start and mousing out of the iframe area should cause the fade-out to take effect.

Oh, XAML, what is the secret of your power? Won't you take me far away from the mucky-muck man?

Update: It has been pointed out that the above iframe could have been done in pure XAML using EventTriggers for MouseEnter/Leave events to trigger the animations; however, I designed the XAML to be used by Banshee, where Aaron will likely want to be able to control when the animations start/stop programmatically and not based on mouse hover events ;-)

Update: The XAML should now work with Silverlight (apparently it didn't like my LineStackingStrategy or LineHeight property on the TextBlock).

Saturday, August 16, 2008

GMime Devel List

The GNOME admins have just approved and created a mailing list for discussing GMime development (meant both for contributors to the GMime library as well as developers using GMime).

You can subscribe to the mailing-list at http://mail.gnome.org/mailman/listinfo/gmime-devel-list.

Thursday, August 14, 2008

Moonlight Performance Enhancements

Spent today working with Sebastien "Master Profiler" Pouliot on finding and resolving one of our longest outstanding performance bottlenecks in Moonlight for which the GlyphMap Utility is a perfect test case.

Months ago, I had rewritten the font caching a bit so that we avoided (in most cases) re-loading the same font file when it was shared between multiple textual XAML elements (Glyphs and TextBlock). At the time, font loading was at the top of the performance profile. While this fix did help a little as far as rendering speed went, it was barely noticeable visually. It wasn't a waste, however, because it greatly reduced memory overhead and later allowed for some better font scaling optimization tricks that I implemented.
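The basic idea is just a cache keyed by filename so that every element referencing the same font file shares one loaded copy. A rough sketch (not the actual Moonlight code; FontFile and LoadFontFile() here are simplified stand-ins):

#include <map>
#include <string>

/* stand-in for whatever the engine actually loads from disk */
struct FontFile {
    std::string path;

    FontFile (const std::string &p) : path (p) { }
};

static std::map<std::string, FontFile *> font_cache;

/* stand-in loader; the real thing would parse the font file */
static FontFile *
LoadFontFile (const std::string &path)
{
    return new FontFile (path);
}

/* every Glyphs/TextBlock element asks the cache rather than loading the
 * file itself, so a font file shared by many elements is loaded only once */
static FontFile *
GetFontFile (const std::string &path)
{
    std::map<std::string, FontFile *>::iterator it = font_cache.find (path);

    if (it != font_cache.end ())
        return it->second;

    FontFile *font = LoadFontFile (path);
    font_cache[path] = font;

    return font;
}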

Next up, I rewrote the Glyphs XAML element's ::ComputeBounds() routine so that it not only calculated the rendering extents of the Glyphs element but also cached the glyph string layout in a Cairo path, allowing later calls to ::Render() to simply blit the path rather than having to redo the layout.

Still, visual rendering performance seemed barely affected.

Then, today, Sebastien decided to take a look into the problem again and discovered that sorting UIElements by ZIndex was toward the top of the time-eaters list. The fix for this was to delay sorting the newly added UIElements until the engine went to render the next frame.

This bumped the sorting code right out of the time-eaters list but still no visual improvement :(

Sebastien reported back to me that we seemed to be idling between Glyphs draws, which immediately made me recall that our Downloaders asynchronously download the font files that each Glyphs element references, and that at each render tick we only popped a single request from our async queue.

I immediately went to work and fixed the async queue logic to pop as many requests as we could in 1/30th of a second (this value may need to be tweaked a bit, but it should typically be acceptable).
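In other words, instead of one pop per tick, the drain loop gets a time budget. A rough sketch of the idea (not the real Moonlight code; Request and ProcessRequest() are stand-ins):

#include <queue>
#include <cstddef>
#include <sys/time.h>

struct Request { int id; };

/* stand-in for whatever work a queued download request triggers */
static void
ProcessRequest (Request *req)
{
    delete req;
}

static double
NowSeconds (void)
{
    struct timeval tv;

    gettimeofday (&tv, NULL);

    return tv.tv_sec + tv.tv_usec / 1e6;
}

/* pop and process as many queued requests as fit in ~1/30th of a second */
static void
DrainAsyncQueue (std::queue<Request *> &queue)
{
    const double budget = 1.0 / 30.0;
    double start = NowSeconds ();

    while (!queue.empty () && (NowSeconds () - start) < budget) {
        Request *req = queue.front ();
        queue.pop ();
        ProcessRequest (req);
    }
}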

The GlyphMap table now renders instantly.

Wednesday, August 13, 2008

Moonlighting the Olympics

There's been a lot of interest in Moonlight lately, largely from people who hope to use it to watch the Olympics at http://nbcolympics.com. Unfortunately, Moonlight isn't quite ready to view this content :-(

The good news is that we are furiously hacking away on Moonlight 2.0 in the hopes of making it usable as quickly as we can.

The first roadblock is that the NBC Olympics site is using an old Silverlight.js initialization script and currently requires one of the following browser/OS combinations to work:

  • Internet Explorer 6, 7 for Windows (Vista, XP SP2 or greater and 2003)
  • Firefox 1.5, 2, 3 for Windows (Vista, XP SP2 or greater and 2003)
  • Firefox 1.5, 2, 3 for Mac (OS 10.4.8 or greater, Intel only)
  • Safari 2 & 3 for Mac (OS 10.4.8 or greater, Intel only)

To get past this roadblock for use with Moonlight, you first need to download and install the User Agent Switcher Add-On. Once you install that and restart your Firefox browser, you'll need to go to Tools/User Agent Switcher/Options/Options. This will bring up a configuration dialog allowing you to add custom User Agent strings. You'll want to click on the Add... button and enter the following information:

Description: Firefox 3.0 (Mac)
User Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9) Gecko/2008061004 Firefox/3.0
App Name: Firefox
App Version: 3.0 [en] (Intel Mac OS X 10.5; U)
Platform: Macintosh

I think the only required field is the User Agent; the other fields can probably be set to whatever you want (thanks to Larry Ewing for the User-Agent string and for explaining to me how to do this).

Now that you have a Firefox 3.0/Mac User-Agent string, you'll need to select the "Firefox 3.0 (Mac)" radio button in the Tools/User Agent Switcher Firefox menu. Once you've done that, you should be able to navigate to the NBC Olympics video pages (although you still won't be able to view the video content quite yet... we're still working on writing the code to make that work).

Friday, August 8, 2008

Time-off This Past Week

Took this week off from work to go visit family and did a bit of cycling in my quest to "get back into shape" (or some semblance of shape, anyway).

Started off doing an 8.5-mile hill climb to the top of the eastern hill at Blue Hills Reservation in Milton, Mass, which is about a 6% grade or so. Nothing awe-inspiring, but a bit of a workout for me. On Monday I continued with a 10-mile ride accompanied by relatives up in Rochester, NH - no major hills this time but quite a bit of ups and downs, so it was still a workout since I pushed fairly hard. Tuesday and Wednesday I did some 21-mile rides pushing hard, and by the end of Wednesday's ride (in the pouring rain, btw) my legs could push no farther, so I decided to take Thursday off. Traveled back home and cleaned as much of the sand/grit out of my bike as I could, de-greased and re-lubed my chain and derailleurs (looks like I picked up a little rust on my chain, but nothing major). Then today I went out to take on Blue Hills again, this time managing to push farther, so I ended up climbing the eastern hill twice, once from each side. Coming from the west, the incline averages a good 7 or 8% iirc, so not too shabby. That puts me at about 70 miles so far this week.

Tuesday, July 29, 2008

Tachikoma 3D Model?

Dear lazy web,

I'm hoping someone might have some handy links to a (preferably free) 3D CG model of a tachikoma that I'd be able to use in a s3kr3t side-project of mine (oops, I guess it's not so secret anymore, huh?). Since I'm planning on animating the tachikoma, it would preferably have hooks for this or be componentized or something (not sure how 3D CG models are "built" to allow animation, so... excuse my ignorance).

Failing that, a technical blueprint would be greatly helpful if I am forced to model my own.

Credit will be given to the modeler (or draftsman) in my project if it ever reaches the point where I release it to the public (which I'll do if/when it becomes more than a pipedream... currently just in the planning stages and the project is likely beyond my abilities, but we'll see).

Thursday, July 24, 2008

Moonlight 2.0 Hackathon

Starting this past Monday, we've begun a 2-week Moonlight 2.0 Hackathon to try and get as much of the 2.0 API implemented as we can (1.0 is basically done, but could still use some performance optimizations).

I've been reworking our C++ Collection implementation to support the Silverlight 2.0 collections of doubles and Points (Silverlight 1.0 only allowed collections of DependencyObjects), and I'll then move on to updating the C++-side code that used to use generic double and Point arrays to use my new DoubleCollection and PointCollection classes, respectively.

I've also taken the liberty of implementing some of the additional functionality that Silverlight 2.0 has added to the base Collection class (aka PresentationFrameworkCollection on the .NET side of things), like Contains() and IndexOf() (we used to get these two for free with my C++ List implementation, but I'm no longer using that as the base), as well as changing the way collection change-notification is done.
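For anyone wondering, the helpers in question are nothing exotic. A simplified sketch over a generic value collection (not the real Collection / PresentationFrameworkCollection code):

#include <vector>
#include <cstddef>

template <typename T>
class ValueCollection {
public:
    void Add (const T &value) { items.push_back (value); }

    /* return the index of 'value', or -1 if it is not in the collection */
    int IndexOf (const T &value) const
    {
        for (size_t i = 0; i < items.size (); i++) {
            if (items[i] == value)
                return (int) i;
        }

        return -1;
    }

    bool Contains (const T &value) const { return IndexOf (value) != -1; }

private:
    std::vector<T> items;
};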

Once that is finished, I'll be binding them in managed land.

Meanwhile, Chris Toshok and Rolf have been refactoring the way DependencyProperties get registered, since Silverlight 2.0 allows the programmer to create new DependencyObject derivatives and register new DependencyProperties on those objects.

They'll also be making the type-system per-AppDomain rather than global like it is right now and will be looking into allowing each AppDomain to have multiple Surfaces so that applications like LunarEclipse will be able to work.

Friday, July 18, 2008

PulseAudio: My Last Post On The Topic?

I wasn't going to post any more on PulseAudio now that Lennart has fixed the core issue that was breaking audio for me on my system (there was a deadlock condition in PulseAudio - my libesd and libgnome fixes merely made those libs handle the PA deadlock more gracefully).

But, since Lennart has now attacked me, I guess I should respond:

A few comments:

1. Not directly related to PulseAudio itself. Also, finding errors in code that is related to esd is not exactly the most difficult thing in the world.

Considering that the GNOME desktop and many other applications speak to pulseaudio via libesd, it is still relevant. Until all applications are ported to use PulseAudio directly, libesd is still a part of the "PulseAudio stack".

OK, Jeffrey found a real bug, but I wouldn't say this is really enough to make all the fuss about. Or is it?

Actually, I believe you mean three in the PulseAudio code itself (granted, the pavucontrol bug was already fixed and in a public release - shame on my distro for shipping 0.9.5 instead of 0.9.6, but I don't remember even flaming you about this one - I only reported it as a bug to try and help you make PA better). Yes, the deadlock one was fixed a while ago (assuming the deadlock you fixed is the same one I am experiencing, which it probably is) but it is not in any released version of PA. Nor is the "won't play short sound clips" bug fix.

Hence, not mine nor my distro's fault.

You also completely fail to grasp how frustrating that deadlock bug (which was really the core of my issues) was for anyone using GNOME on a system where that deadlock occurs. Because when that deadlock happens, the entire system locks up! You cannot launch firefox, you cannot start any new GNOME apps, etc. You can't even close gnome-terminal.

Now, I will grant you that part of the blame certainly lies in libesd (hence why I fixed it), but to say it is completely unrelated is bologna. My ESD fixes were not random fixes so I could point fingers, they were fixes so that my system didn't hang whenever PA did.

5. A valid bug, but not really in PulseAudio. Mostly caused because the ALSA API and PA API don't really match 100%.

This seems like a design failure - if the plan is to make applications that use ALSA directly "Just Work(tm)" through PA, then shouldn't PA have been designed to map better to ALSA?

I still consider this to be a PA "bug" even if the fix was in ALSA(?).

Many people (like Jeffrey) wonder why have software mixing at all if you have hardware mixing?

Actually, that's not at all what I was wondering. I had wondered why PA was needed at all because ALSA already did mixing for me.

Now, I will admit to being naive about the full list of goals that PA is trying to achieve (which are worthy goals), but I think they all take a backseat to having sound actually work reliably in a basic setup (apps playing audio through local speakers).

Jeffrey thinks that audio mixing is nothing for userspace. Which is basically what OSS4 tries to do: mixing in kernel space. However, the future of PCM audio is floating points. Mixing them in kernel space is problematic because (at least on Linux) FP in kernel space is a no-no. Also, the kernel people made clear more than once that maths/decoding/encoding like this should happen in userspace. Quite honestly, doing the mixing in kernel space is probably one of the primary reasons why I think that OSS4 is a bad idea. The fancier your mixing gets (i.e. including resampling, upmixing, downmixing, DRC, ...) the more difficulties you will have to move such a complex, time-intensive code into the kernel.

Fair enough.

I still don't like it (but I guess there's nothing you can do about it if the kernel folk are holding you back), but so long as it Just Works(tm), in the end I don't care.

Not every time your audio breaks it is alone PulseAudio's fault. For example, the original flame of Jeffrey's was about the low volume that he experienced when running PA. This is mostly due to the suckish way we initialize the default volumes of ALSA sound cards. Most distributions have simple scripts that initialize ALSA sound card volumes to fixed values like 75% of the available range, without understanding what the range or the controls actually mean. This is actually a very bad thing to do. Integrated USB speakers for example tend export the full amplification range via the mixer controls. 75% for them is incredibly loud. For other hardware (like apparently Jeffrey's) it is too low in volume. How to fix this has been discussed on the ALSA mailing list, but no final solution has been presented yet. Nonetheless, the fact that the volume was too low, is completely unrelated to PulseAudio.

This is actually partly also openSUSE's fault in 2 ways:

1. the volume control in the panel launches pavucontrol instead of gnome-volume-control. I need to use gnome-volume-control in order to adjust the master volume of the ALSA device, and the fact that the volume shown in pavucontrol was set to MAX led me to be confused.

2. the keyboard volume controls do not properly adjust the master volume (I think I may have wrongly blamed this on PA in one of my posts, but don't feel like re-reading my posts to confirm - if I did, apologies).

OTOH Ubuntu didn't exactly do a stellar job. They didn't do their homework. Adopting PA in a distribution is a fair amount of work, given that it interfaces with so many different things at so many different places. The integration with other systems is crucial. The information was all out there, communicated on the wiki, the mailing lists and on the PA IRC channel. But if you join and hang around on neither, then you won't get the memo.

If distros are getting this wrong, then maybe there needs to be better communication so that things like what happened to Ubuntu don't keep happening.

FWIW, I have personally read over the "The Perfect PulseAudio Setup" wiki page and afaict, openSUSE 11 followed the recommended setup but there were still problems (obviously).

(Although the problems I experienced were bugs in code rather than bugs in setup...mostly)

[2] In fact, Flash 9 can not be made fully working on PulseAudio. This is because the way Flash destructs it's driver backends is racy. Unfixably racy, from external code. Jeffrey complained about Flash instability in his second post. This is unfair to PulseAudio, because I cannot fix this. This is like complaining that X crashes when you use binary-only fglrx.

Sort of like it's unfair that Evolution users complain when Evolution doesn't work with some broken server. But users don't care that it's a broken server if other clients work with it, so it's still on the Evolution developer's shoulders to work around those bugs.

Sort of like how historically, hardware makers haven't been giving Linux kernel devs the specifications to make their hardware work, and so it is left up to F/LOSS developers to bear the burden of making it work anyway.

No one ever said life was fair and unfortunately, a LOT of users are going to expect Flash to work. If/when it doesn't, they will complain and in a manner of speaking, it is PulseAudio's fault because it did work pre-PA.

Wednesday, July 16, 2008

Gtk+ 3.0: The Things That Make You Go "Hmmm?"

Lots of discussion about Gtk+ 3.0 lately; unfortunately, there's no real discussion going on (heh).

There's the camp that is excited about API/ABI breakage and there's the camp that isn't so enthused.

I have to take the side of the "not so enthused", alongside Miguel de Icaza[1][2], Havoc Pennington and Morten Welinder: it's crazy insane to go breaking API/ABI for the sake of cleaning up the API without adding new functionality that could not be achieved by simply extending the current 2.x series.

There are definitely things that can be done to the current gtk code base without having to break the current API/ABI to achieve them, so in my humble opinion that's what should be done. Granted, eventually there will have to be a Gtk+ 3.0 which breaks API/ABI in order to add some new functionality that isn't possible to implement in 2.x, but that road can be crossed when we get there. No sense breaking it now if it isn't absolutely needed.

Even more worrying is the lack of promise to not break API/ABI again after 3.0. In fact, I've read on one of the mailing lists or somewhere that they plan on breaking API/ABI every 4-5 years after 3.0, ripping out any APIs they marked as deprecated in the meantime. This is crazy. One-Flew-Over-the-Cuckoo's-Nest crazy (or, if you prefer, I'm-Cuckoo-for-Cocoa-Puffs crazy).

For large applications (Gnumeric, Evolution, Mozilla, Eclipse, etc), porting from 1.2 to 2.0 was a huge PITA but we all did it because (besides being young and stupid) 2.0 made a long-term commitment to maintain API/ABI compatibility going forward in addition to the fact that it added a ton of much-needed functionality (accessibility, UTF-8, model/view, and really good text layout w/ support for RTL and CJK).

If 2.0 was just 1.2 + sealed structs and the removal of some deprecated APIs, no one would have bothered with 2.0.

To respond to a point on Tim Janik's blog:

3.0 will ABI-incompatibly remove all deprecated and private APIs. Of course, the above described deprecation scheme only scales well if deprecated APIs are really removed from the code base at some point.

One reason that ISV's are typically on-board with Microsoft is that Microsoft has gone to great lengths to maintain backward compatibility with their old APIs. Joel Spolsky talks about it a bit in How Microsoft Lost the API War (specifically, the section The Two Forces at Microsoft).

The Windows testing team is huge and one of their most important responsibilities is guaranteeing that everyone can safely upgrade their operating system, no matter what applications they have installed, and those applications will continue to run, even if those applications do bad things or use undocumented functions or rely on buggy behavior that happens to be buggy in Windows n but is no longer buggy in Windows n+1.

I'm not going to argue that Gtk/GNOME should go to these extensive lengths to make sure buggy apps continue to work, but there's no reason that G*_DISABLE_DEPRECATED can't simply mean "we're not going to bugfix these anymore, what you see is what you get". You don't have to ever remove them, especially since it likely doesn't require much maintenance to keep them around. The port to Gtk+ 3.0 would require some work to make sure that things like GtkCList use accessor methods for anything they poke at internally, but after that the job should be done and over with.

(ABI might be a bit hard to maintain, but you could maintain API - meaning a simple recompile would work)
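After all, this is basically how the deprecation guard already works: the deprecated declarations sit behind a macro in the headers, so applications that define it never see them, while everyone else keeps compiling against them unchanged. A toy sketch of the pattern (the api_old()/api_new() functions and the DEMO_DISABLE_DEPRECATED macro are made up for illustration, not real Gtk+ API):

#include <stdio.h>

#ifndef DEMO_DISABLE_DEPRECATED
/* kept around but frozen: "what you see is what you get" */
static void api_old (void) { printf ("old, unmaintained code path\n"); }
#endif

static void api_new (void) { printf ("new, maintained code path\n"); }

int main (void)
{
#ifndef DEMO_DISABLE_DEPRECATED
    api_old ();  /* still compiles for apps that haven't ported yet */
#endif
    api_new ();

    return 0;
}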

On the *giggle* *giggle* You should be using Mono/Gtk# *giggle* *giggle*-side, any application built against Gtk# 2.x will still work with Gtk# 3.x w/o any changes :-)

Oh, and Gtk# has DataBinding support (which, btw, does not require API/ABI breakage in Gtk+ nor sealed structs like some people seem to think based on this morning's #gnome-hackers discussion *cough* *cough*).

PulseAudio: I Told You So

To those who told me that my PulseAudio problems were my fault and/or my distro's fault, you were wrong[1][2][3][4][5]*. I told you so.

I fixed the first 2 bugs (libgnome and libesd) last week; they simply work around the 4th bug linked above. On that note, I released esound-0.2.39 with my fixes (along with a number of other patches that had been sitting in bugzilla collecting dust for god knows how long).

The 3rd bug is just a crash in pavucontrol, which is apparently fixed in a newer version (a fairly minor annoyance, since by the time I tried to use pavucontrol, pulseaudio was already deadlocked iirc).

The 4th bug (which was the most serious of the ones I've been experiencing) was a deadlock in the pulseaudio daemon (it is supposedly fixed now, but it is not in any public release yet). Just because you didn't experience it, doesn't mean the bug was my fault or my distro's fault ;-)

Geoff Norton, Rolf Kvinge, and I found the 5th bug today while trying to get Moonlight to work well with PulseAudio, after much head-scratching over why sound wasn't playing for certain short mp3s (it turns out to be reproducible with any mp3 player if you play a short enough audio stream, where the decoded stream is less than ~22 kilobytes iirc, but we didn't notice that until just after filing the bug).

So the good news is that the PulseAudio devs have fixed the more serious issues I've found so far, but I'm still annoyed by the attitude of the people pushing PulseAudio of "well, we are forcing people to use PulseAudio so that bugs get found by the users".

For normal applications, this is less annoying... but when it is something as low-level/fundamental as PulseAudio (or heaven forbid, the linux kernel), it should be QA'd thoroughly before dumping it on the users.

Anywho, hopefully openSUSE/Ubuntu/Fedora will push my patches and the PA devs' patches soonish, so that should really help minimize more unnecessary user suffering.

* there are other bugs that I filed as well, but I'm too lazy to go digging for them since the "My Bugs" link on the PulseAudio Trac doesn't actually work.

Sunday, July 13, 2008

Good News on the Evolution Front

Novell has just recently (end of last week) announced that Evolution will no longer require copyright assignment. This is awesome news that I've been hoping would happen for quite a while now.

Evolution's license is also changing to be dual-licensed under the LGPLv2 and LGPLv3.

Thanks to Michael Meeks and Srini for making this happen!

Thursday, July 10, 2008

PulseAudio Again...

Just had another exciting PulseAudio crash where my sound card got continually spammed until my ears bled and then it spammed the sound card some more.

Seriously though, my coworkers were a bit annoyed by my laptop blaring really loud and obnoxious noise for the past hour (made worse because my keyboard volume control keys do not actually work with pulseaudio, which I'm told is apparently a known issue with Lenovo T61 laptops).

However, this did provide me with a most excellent opportunity to test the patches I posted in my previous blog entry about my pain and suffering at the hands of PulseAudio.

Yet again, I was unable to launch any new apps - though instead of blocking in esd_open_sound(), they blocked in esd_send_auth(). So I went and patched that up to use non-blocking IO, built new packages, installed them, tested, and HALLELUJAH! they work!

Yes, ladies and gentlemen, I can now actually use my system even when PulseAudio takes a proverbial dump on my desktop.

I've submitted my patches upstream:

  • bug #542391: libesd should not block if the daemon dies/hangs/whatever
  • bug #542296: gnome_sound needs to handle pa hangs more gracefully

Unfortunately, these patches do not solve the root problem (pulseaudio suckage), but at least they allow me to get work done when pulseaudio craps out on me (which only happens every few hours if I'm lucky).

Oh, and if you are on 32bit x86 openSUSE 11 systems, you can grab some packages I made up at http://www.gnome.org/~fejj/pulseaudio/. I've also uploaded the raw patches in case anyone wants to push them in their own distro packages.

Sex With Dead People Deemed Illegal

Ok, so as I'm riding the MBTA in to work this morning reading my Metro newspaper, I come across the article "Court: Sex with corpses is illegal"

Apparently until yesterday's ruling, sex with dead people was technically legal in the state of Wisconsin.

The last sentence of the article was, in my humble opinion, the most hilarious sentence I've ever read in my life:

In yesterday's 5-2 decision, the high court said Wisconsin law makes sex acts with dead people illegal because they are unable to give consent.

Yes folks, it was a 5-2 ruling. So 2 people ruled in favor of allowing sex with dead people. W. T. F.

But to make it even more hilarious, the reason they decided it was not kosher was that the dead people could not give consent. L. O. L.

We live in a funny world...

Wednesday, July 9, 2008

More PulseAudio Problems

I reinstalled PulseAudio to try and resolve the plethora of problems I've been having with it (that, and openSUSE 11 doesn't actually ship the esound daemon anymore, so if I want full audio support I have no choice but to get the pulse-esound-compat package working, short of forking the distro and maintaining my own set of esound/gnome/etc packages, which is not my idea of fun).

Unfortunately, this has resulted in countless hours of frustration. After having read and followed the instructions at http://www.pulseaudio.org/wiki/PerfectSetup, it turns out that is exactly how my system was already configured. So no help there.

I don't know what the deal is, but I constantly have problems with PulseAudio.

For example, earlier today I was having an IM conversation with a friend in Pidgin and it locked up. I restarted Pidgin, but I wasn't getting audio events anymore. *shrug* Oh well.

At some point later I navigated my firefox window over to youtube to link a friend to some video. No surprise here: I got no sound, and in fact the flash video paused as soon as the sound was supposed to start (which is a few seconds into the video).

At this point I close firefox and decide to reload it, thinking maybe that will solve it. Nope, I was wrong. Instead, firefox hangs before any windows pop up. Great.

Since I got flamed last time I blogged about PulseAudio for issuing a `killall -9 pulseaudio` command (btw, pulseaudio appears to have a --kill command-line option), I figured I'd log out and log back in again instead... but this was not to be. Instead, my desktop locks up (presumably because of the logout.wav).

Ok... so I Ctrl+Alt+F1 to the console, login as root, `init 3; init 5`. Since my system is configured to auto-login, when X came back up it tried to log me in. Instead, I get a hang.

Good grief, I guess I have no option but to reboot so that's what I do.

In the interest of saving other people from this catastrophe, I grabbed libgnome out of svn and wrote up the following patch to prevent it from hanging whenever PulseAudio decides to misbehave (which it seems quite prone to do).

Index: libgnome/gnome-sound.c
===================================================================
--- libgnome/gnome-sound.c (revision 3750)
+++ libgnome/gnome-sound.c (working copy)
@@ -30,8 +30,12 @@
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <sys/types.h>
 #include <unistd.h>
+#include <fcntl.h>
+#include <errno.h>
 #include <time.h>
+#include <poll.h>
 
 #ifdef HAVE_ESD
 #include <esd.h>
@@ -413,6 +417,55 @@
 }
 #endif
 
+#ifdef HAVE_ESD
+static int
+send_all (int fd, const char *buf, size_t buflen)
+{
+ struct pollfd pfd[1];
+ size_t nwritten = 0;
+ int flags, rv;
+ ssize_t n;
+ 
+ if ((flags = fcntl (fd, F_GETFL)) == -1)
+  return -1;
+ 
+ fcntl (fd, F_SETFL, flags | O_NONBLOCK);
+ 
+ pfd[0].events = POLLOUT;
+ pfd[0].fd = fd;
+ 
+ do {
+  do {
+   pfd[0].revents = 0;
+   rv = poll (pfd, 1, 100);
+  } while (rv == -1 && errno == EINTR);
+  
+  if (pfd[0].revents & POLLOUT) {
+   /* socket is ready for writing */
+   do {
+    n = write (fd, buf + nwritten, buflen - nwritten);
+   } while (n == -1 && errno == EINTR);
+   
+   if (n > 0)
+    nwritten += n;
+  } else if (pfd[0].revents & (POLLERR | POLLHUP)) {
+   /* we /just/ lost the esd connection */
+   esd_close (fd);
+   fd = -1;
+   break;
+  } else if (rv == -1 && errno == EBADF) {
+   /* socket is bad */
+   fd = -1;
+   break;
+  }
+ } while (nwritten < buflen);
+ 
+ if (fd != -1 && flags != -1)
+  fcntl (fd, F_SETFL, flags);
+ 
+ return fd;
+}
+#endif
 
 /**
  * gnome_sound_sample_load:
@@ -469,21 +522,23 @@
     * file, or event type, for later identification */
    s->id = esd_sample_cache (gnome_sound_connection, s->format, s->rate,
         size, (char *)sample_name);
-   write (gnome_sound_connection, s->data, size);
-   confirm = esd_confirm_sample_cache (gnome_sound_connection);
+   
+   gnome_sound_connection = send_all (gnome_sound_connection, (const char *) s->data, size);
+   if (gnome_sound_connection != -1)
+     confirm = esd_confirm_sample_cache (gnome_sound_connection);
+   
    if (s->id <= 0 || confirm != s->id)
      {
        g_warning ("error caching sample <%d>!\n", s->id);
        s->id = 0;
      }
-   g_free (s->data);
-   s->data = NULL;
  }
     }
 
   sample_id = s->id;
 
-  g_free(s->data); g_free(s);
+  g_free(s->data);
+  g_free(s);
 
   return sample_id;
 #else
@@ -521,9 +576,12 @@
 
   sample = gnome_sound_sample_load (buf, filename);
 
-  esd_sample_play(gnome_sound_connection, sample);
-  fsync (gnome_sound_connection);
-  esd_sample_free(gnome_sound_connection, sample);
+  if (gnome_sound_connection != -1 && sample > 0)
+    {
+      esd_sample_play(gnome_sound_connection, sample);
+      fsync (gnome_sound_connection);
+      esd_sample_free(gnome_sound_connection, sample);
+    }
 #endif
 }

This patch should hopefully help prevent the typical gnome apps from hanging when trying to play audio, but I don't really have any surefire way of testing that it actually works.

Update 2008-07-10 10:42am I've now also written the following patch for libesd so that it doesn't hang for god knows how long when trying to open a connection to the sound server (in my case, pulse-esound-compat):

diff -up esound-0.2.38.orig/esdlib.c esound-0.2.38/esdlib.c
--- esound-0.2.38.orig/esdlib.c 2008-07-10 10:10:31.000000000 -0400
+++ esound-0.2.38/esdlib.c 2008-07-10 10:32:23.000000000 -0400
@@ -20,9 +20,11 @@
 #include <arpa/inet.h>
 #include <errno.h>
 #include <sys/wait.h>
+#include <poll.h>
 
 #include <sys/un.h>
 
+
 #ifndef SUN_LEN
 #define SUN_LEN(ptr) ((size_t) (((struct sockaddr_un *) 0)->sun_path)  \
                      + strlen ((ptr)->sun_path))
@@ -408,6 +410,43 @@ int esd_resume( int esd )
     return ok;
 }
 
+static int
+connect_timeout (int sockfd, const struct sockaddr *serv_addr, size_t addrlen, int timeout)
+{
+ struct pollfd pfd[1];
+ int flags, rv;
+ 
+ if ((flags = fcntl (sockfd, F_GETFL)) == -1)
+  return -1;
+ 
+ fcntl (sockfd, F_SETFL, flags | O_NONBLOCK);
+ 
+ if (connect (sockfd, serv_addr, addrlen) == 0) {
+  fcntl (sockfd, F_SETFL, flags);
+  return 0;
+ }
+ 
+ if (errno != EINPROGRESS)
+  return -1;
+ 
+ pfd[0].fd = sockfd;
+ pfd[0].events = POLLIN | POLLOUT;
+ 
+ do {
+  pfd[0].revents = 0;
+  rv = poll (pfd, 1, timeout);
+ } while (rv == -1 && errno == EINTR);
+ 
+ if (pfd[0].revents & (POLLIN | POLLOUT)) {
+  /* success, we are now connected */
+  fcntl (sockfd, F_SETFL, flags);
+  return 0;
+ }
+ 
+ /* took too long / error connecting, either way: epic fail */
+ return -1;
+}
+
 /**
  * esd_connect_tcpip: make a TCPIP connection to ESD
  * @host: specifies hostname and port to connect to as "hostname:port"
@@ -512,7 +551,7 @@ esd_connect_tcpip(const char *host)
            goto error_out;
          }
 
-         if ( connect( socket_out, res->ai_addr, res->ai_addrlen ) != -1 ) 
+         if ( connect_timeout ( socket_out, res->ai_addr, res->ai_addrlen, 1000 ) != -1 ) 
            break;
 
          close ( socket_out );
@@ -596,9 +635,9 @@ esd_connect_tcpip(const char *host)
     socket_addr.sin_family = AF_INET;
     socket_addr.sin_port = htons( port );
   
-    if ( connect( socket_out,
-    (struct sockaddr *) &socket_addr,
-    sizeof(struct sockaddr_in) ) < 0 )
+    if ( connect_timeout ( socket_out,
+      (struct sockaddr *) &socket_addr,
+      sizeof(struct sockaddr_in), 1000 ) < 0 )
  goto error_out;
 
     }
@@ -650,8 +689,7 @@ esd_connect_unix(void)
     socket_unix.sun_family = AF_UNIX;
     strncpy(socket_unix.sun_path, ESD_UNIX_SOCKET_NAME, sizeof(socket_unix.sun_path));
   
-    if ( connect( socket_out,
-    (struct sockaddr *) &socket_unix, SUN_LEN(&socket_unix) ) < 0 )
+    if ( connect_timeout ( socket_out, (struct sockaddr *) &socket_unix, SUN_LEN(&socket_unix), 100 ) < 0 )
  goto error_out;
   
     return socket_out;

Hopefully after these 2 patches get applied, my system at least won't become unusably hung when pulseaudio decides to crap out on me, but these patches don't solve the root of the problem :(

Monday, July 7, 2008

4th of July Fireworks from VMWare Office

I watched the 4th of July fireworks from the 11th Floor offices of VMWare in Cambridge this year with some friends.



dscn0265.jpg, originally uploaded by jstedfast.


dscn0343.jpg, originally uploaded by jstedfast.


dscn0349.jpg, originally uploaded by jstedfast.


dscn0400.jpg, originally uploaded by jstedfast.


dscn0401.jpg, originally uploaded by jstedfast.

Update: I should note that these photos were made possible thanks to Alan McGovern's 256 MB CF card he mailed me last week. So thanks, Alan! ;-)

Wednesday, June 25, 2008

Ohloh Stats

So, I was told this morning that I'm on Ohloh.net's top 15 "Most Experienced C Programmers" list.

Sure enough, there I am:


Updated: Turns out, after I had consolidated my accounts, I'm now #9 on the list.

PulseAudio: A Solution In Search of a Problem

I upgraded the Linux OS on my laptop yesterday so that I'd have a more up-to-date GNOME installation and could actually build Evolution from SVN and hack it a bit to solve some annoyances I've been having. However, that is for another blog post.

So... PulseAudio. W. T. F.

All day yesterday I'd been wondering why my sound volume was so low even though I had cranked it all the way up to 100%. Since I was mostly focused on getting work done yesterday, I didn't spend any time investigating until last night when I was forced to do so.

I don't remember the exact order of things, but I did watch an episode of Harsh Realm in Xine while also IM'ing some friends in Pidgin, but the sound was so low I could barely hear anything. So a friend on IRC suggested I kill pulseaudio and restart it as this supposedly would help fix the "wonkyness" (evidently I'm not the only one with PulseAudio problems).

This worked for a while, long enough to play the DVD iirc, but afterward I had no sound. So I thought "well, I guess this is just more PulseAudio 'wonkyness', so let me kill pulseaudio again". This time when I killed it, my sound card got flooded with every audio event that had happened over the hour or so I was running Xine. Yay. (that was a sarcastic yay in case you couldn't tell)

Audio worked again for a bit. At one point, a friend IM'd me a link, so I clicked it and no browser started up (which seemed a bit odd). Tried it again, still nothing. So I figured "ok, I'll just launch firefox from my gnome-terminal and see if that prints any errors". No errors, but it did just hang. Oh goodie. Ctrl-C'd and killed pulseaudio again. At this point my friend IM'd me again, which made Pidgin completely lock up:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26422 fejj      20   0 3054m 1.3g  27m D  198 33.2  18:55.68 pidgin

So the issues of memory usage are perfect for another rant blog post, but the point is that pidgin is completely locked. I can't even kill -9 it (I tried multiple times).

I had to Ctrl+Alt+F1, login as root, and `init 3`

At this point I was so frustrated that I uninstalled each and every package with pulseaudio in the name before going back to X.

Lo and behold, after that my audio works flawlessly (well, the Pidgin IM audio events make some crackling, which I don't recall happening before I reinstalled, but I can live with it - especially since other audio seems crisp).

So yea... the stuff that the Linux Hater says is all true. He even complained about audio under Linux a while back (I couldn't find which entry it was with a quick look over the entry names).

The point of this story is that PulseAudio appears to be a solution in search of a problem (quite common in the land of Linux afaict).

As I was bitching about this to a hacker friend of mine, he asked me "isn't PulseAudio for sound mixing or something? So multiple programs can all play audio at the same time?".

Nope, because ALSA seemed to do mixing just fine for me in the past.

A quick Google search for PulseAudio reveals that it is supposed to allow fancy stuff like redirecting sound to another host machine for playing.

...just as I suspected: a solution in search of a problem.

Seriously though, who the hell needs that feature on their desktop/laptop machines? No one, that's who. The only people who might need that are thin-client users - so why install it by default? Why not just have the sysadmins for those thin-client networks set this up?

PulseAudio is a clusterfuck for basic desktop audio needs afaict.

Sigh.

Update: Apparently I offended the PulseAudio developer with my "it's a solution in search of a problem" statement. For that, I apologize. A number of people have explained what exactly it is supposed to be solving. Unfortunately, none of the problems it is solving seem to be problems I need solved (maybe my sound cards in all my machines have hardware mixing support? I don't know. Or maybe dmix is plenty good enough for me).

However, I still get the feeling that PA is not quite ready for wide desktop adoption, because people on other distros are having some of the same problems I had, and that is not good.

I'd expect a similar rant from people if Evolution's current IMAP provider were replaced with my IMAP4 provider. As nice as my IMAP4 provider replacement is for my usage, there are no doubt problems it currently has that the old implementation doesn't, and those would break things for some people. (Hence why I have not pushed for it to replace the old IMAP provider until I'm sure it is ready.)

Also, this was tagged as a rant. I had every right to be frustrated because things weren't working like they have worked Just Fine(tm) for me for years. I've had to put up with far worse personal attacks on myself and software I've been involved with over the years than anything I said on this blog (and let's face it, this was not a personal attack aimed at the PA devs - it was just me venting my frustration with a piece of software), so I think some people took this post way too personally.

Friday, June 20, 2008

It's a Beautiful Evening...


A Rainbow Over the Church, originally uploaded by jstedfast.

Looked up just after a thunderstorm tonight and caught this rainbow - just had to take a picture of it ;-)

Saturday, June 14, 2008

Calculating the Nearest Power of 2

The typical implementation for finding the nearest power of 2 for a given value is as follows:

static uint32_t
nearest_pow (uint32_t num)
{
    uint32_t n = 1;

    while (n < num)
        n <<= 1;

    return n;
}

This implementation's performance, unfortunately, suffers as the value of num increases. Luckily there is another approach that takes a constant time no matter how large the value:

static uint32_t
nearest_pow (uint32_t num)
{
    /* subtracting 1 up front means exact powers of 2 are returned unchanged */
    uint32_t n = num > 0 ? num - 1 : 0;

    /* propagate the highest set bit into every lower bit position... */
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    /* ...then adding 1 yields the next power of 2 */
    n++;

    return n;
}

A simple performance test might be:

int main (int argc, char **argv)
{
    uint32_t i, n = 0;

    for (i = 0; i < INT_MAX / 10; i++)
        n += nearest_pow (i);

    return n > 0 ? 1 : 0;
}

The run-time difference between the two implementations on my AMD Athlon (/proc/cpuinfo reports AMD Athlon(TM) XP 3200+ @ 2200.141 MHz) is impressive. For performance testing, I compiled with gcc -O2 which I figure is the typical default for most packaged software on Linux distributions.

The brain-dead approach has the following results:

[fejj@serenity cvs]$ time ./pow

real 0m12.034s
user 0m11.809s
sys 0m0.032s

The bitwise implementation is insanely fast:

[fejj@serenity cvs]$ time ./pow2

real 0m1.361s
user 0m1.304s
sys 0m0.008s

Now... to be fair, if you are using small values for num, then it's possible that the brain-dead approach might be faster. Let's try the same main() for-loop again, but this time let's call nearest_pow() with a value of 1 each time. Since it is likely that the results will be far too fast to really compare, let's also bump up the number of iterations to UINT_MAX.
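
For reference, the modified driver would look something like this (just a sketch of what's described above):

int main (int argc, char **argv)
{
    uint32_t i, n = 0;

    /* same accumulation trick as before, but with a constant
     * argument and UINT_MAX iterations */
    for (i = 0; i < UINT_MAX; i++)
        n += nearest_pow (1);

    return n > 0 ? 1 : 0;
}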

[fejj@serenity cvs]$ time ./pow

real 0m0.003s
user 0m0.000s
sys 0m0.004s
[fejj@serenity cvs]$ time ./pow2

real 0m0.002s
user 0m0.000s
sys 0m0.000s

Unfortunately, both are still far too fast to really compare performance. Let's try bumping up the value of num to see if we can find the point at which the while-loop approach starts to fall behind the bitwise approach. To start, let's try passing the value of 2 as the num argument:

[fejj@serenity cvs]$ time ./pow

real 0m0.002s
user 0m0.000s
sys 0m0.004s
[fejj@serenity cvs]$ time ./pow2

real 0m0.002s
user 0m0.000s
sys 0m0.000s

It looks like the bitwise approach may be faster than the while-loop approach for the value of 2, but it's a bit hard to tell for sure with only UINT_MAX loops. We'd have to switch to using a 64bit i to know for sure and I'm not sure it's that important. Let's try 3 and see what we get:

[fejj@serenity cvs]$ time ./pow

real 0m6.053s
user 0m5.968s
sys 0m0.004s
[fejj@serenity cvs]$ time ./pow2

real 0m0.003s
user 0m0.000s
sys 0m0.004s

Well, hot diggity... I think we have ourselves a winner. This suggests that for all values of num larger than 2, the performance of the while-loop approach will be outmatched by the bitwise approach and that for values less-than-or-equal to 2, the performance is nearly identical.

Update: Thanks to the anonymous commenter that noticed that my original main() program was allowing the compiler to optimize out the call to nearest_pow() in the bitwise implementation. As suggested, I updated the for-loop to accumulate the output and then used it after the loop to avoid this problem. It only seemed to change the results for the bitwise implementation in the first test, however (before the change, it reported 0.002s). Still, on my machine it is approx. 10x faster for the first test case and seems to lose no performance even in the optimal conditions for the while-loop implementation.

Update2: I was just pointed to the Linux kernel's fls() implementation for x86. Here is a new implementation using inline assembler for x86:

static uint32_t
nearest_pow (uint32_t num)
{
    int bit;

    /* bsrl stores the index of the highest set bit in 'bit'; if num
     * is 0, bsrl sets ZF and we fall through to the movl, leaving
     * bit = -1 */
    __asm__("bsrl %1,%0\n\t"
            "jnz 1f\n\t"
            "movl $-1,%0\n"
            "1:" : "=r" (bit) : "rm" (num));

    return (1 << (bit + 1));
}

The results for the original INT_MAX / 10 iterations using i as the num argument yields the following results:

[fejj@serenity cvs]$ time ./pow3

real 0m1.335s
user 0m1.296s
sys 0m0.004s

The results seem negligibly faster than the C bitwise implementation and obviously less portable :(

Update3: A friend of mine, Stephane Delcroix, has just pointed me at a solution to this problem that he came up with the other day:

static uint32_t
nearest_pow (uint32_t num)
{
    uint32_t j, k;
    (j = num & 0xFFFF0000) || (j = num);
    (k = j & 0xFF00FF00) || (k = j);
    (j = k & 0xF0F0F0F0) || (j = k);
    (k = j & 0xCCCCCCCC) || (k = j);
    (j = k & 0xAAAAAAAA) || (j = k);
    return j << 1;
}

The results of this implementation are as follows:

[fejj@serenity cvs]$ time ./pow4

real 0m1.249s
user 0m1.204s
sys 0m0.004s

This is actually faster than both the bitwise and the assembler implementations above!

There are two things to be aware of, though:

  • When num is 0, the value of 0 is returned (which may not be desirable depending on what you are doing with it)
  • If num is a power of 2, then instead of returning num, this implementation will return the next higher power of 2
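
If either of those caveats matters for your use case, a thin wrapper along these lines (just a sketch, not part of Stephane's code) restores the behavior of the original while-loop version:

static uint32_t
nearest_pow_safe (uint32_t num)
{
    /* the while-loop version returns 1 for an input of 0 */
    if (num == 0)
        return 1;

    /* exact powers of 2 should be returned unchanged */
    if ((num & (num - 1)) == 0)
        return num;

    return nearest_pow (num);
}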

Friday, June 13, 2008

Linux Haters

Okay, I have to join in and say that I, too, have been enjoying Linux Hater's Blog these past few days.

He makes a lot of good points and I like his humor ;-)

Saturday, June 7, 2008

GMime 2.3.2

Just released GMime 2.3.2 last night which took some comments about the GMimeHeaderIter API into consideration (like allowing iters to be on the stack).

It also now has support for multipart/encrypted PGP/MIME parts that have been both signed and encrypted in a single pass (support for this exists for both encrypting and decrypting).

Many more .NET API improvements as well.

Check it out!

Updated: Just released 2.3.3 after a weekend of hacking.

Wednesday, June 4, 2008

Netscape Plugin Tips & Tricks

If your Firefox plugin is scriptable, you should probably be validating the NPVariant arguments that you are passed in your invoke() callback. Unfortunately, this gets tedious and error-prone if you allow arguments to your methods to be of more than a single type (e.g. you accept a date as either a string or integer) or if you have optional arguments.

As I was debugging a crash in Moonlight, I discovered that we weren't checking argument types properly in all cases (this particular bug was caused by the JavaScript passing a bad argument to one of our methods, which accepted either a string or null). To protect against this in a more robust way, I came up with the following helper functions:

typedef enum {
    MethodArgTypeNone   = (0),
    MethodArgTypeVoid   = (1 << NPVariantType_Void),
    MethodArgTypeNull   = (1 << NPVariantType_Null),
    MethodArgTypeBool   = (1 << NPVariantType_Bool),
    MethodArgTypeInt32  = (1 << NPVariantType_Int32),
    MethodArgTypeDouble = (1 << NPVariantType_Double),
    MethodArgTypeString = (1 << NPVariantType_String),
    MethodArgTypeObject = (1 << NPVariantType_Object),
    MethodArgTypeAny    = (0xff)
} MethodArgType;

static MethodArgType
decode_arg_ctype (char c)
{
    switch (c) {
    case 'v': return MethodArgTypeVoid;
    case 'n': return MethodArgTypeNull;
    case 'b': return MethodArgTypeBool;
    case 'i': return MethodArgTypeInt32;
    case 'd': return MethodArgTypeDouble;
    case 's': return MethodArgTypeString;
    case 'o': return MethodArgTypeObject;
    case '*': return MethodArgTypeAny;
    default:
        return MethodArgTypeNone;
    }
}

static MethodArgType
decode_arg_type (const char **in)
{
    MethodArgType type = MethodArgTypeNone;
    register const char *inptr = *in;
    
    if (*inptr == '(') {
        inptr++;
        while (*inptr && *inptr != ')') {
            type |= decode_arg_ctype (*inptr);
            inptr++;
        }
    } else {
        type = decode_arg_ctype (*inptr);
    }
    
    inptr++;
    *in = inptr;
    
    return type;
}

/**
 * check_arg_list:
 * @arglist: a string representing an arg-list token (see grammar below)
 * @argc: NPVariant argument count
 * @argv: NPVariant argument vector
 *
 * Checks that the NPVariant arguments satisfy the argument count and
 * types expected (provided via @arglist).
 *
 * The @arglist argument should follow this syntax:
 *
 * simple-arg-type ::= "v" / "n" / "b" / "i" / "d" / "s" / "o" / "*"
 *                     ; each char represents one of the following
 *                     ; NPVariant types: Void, Null, Bool, Int32,
 *                     ; Double, String, Object and wildcard
 *
 * arg-type        ::= simple-arg-type / "(" 1*(simple-arg-type) ")"
 *
 * optional-args   ::= "[" *(arg-type) "]"
 *
 * arg-list        ::= *(arg-type) (optional-args)
 *
 *
 * Returns: %true if @argv matches the arg-list criteria specified in
 * @arglist or %false otherwise.
 **/
static bool
check_arg_list (const char *arglist, uint32_t argc, const NPVariant *argv)
{
    const char *inptr = arglist;
    MethodArgType mask;
    uint32_t i = 0;
    
    /* check all of the required arguments */
    while (*inptr && *inptr != '[' && i < argc) {
        mask = decode_arg_type (&inptr);
        if (!(mask & (1 << argv[i].type))) {
            /* argv[i] does not match any of the expected types */
            return false;
        }
        
        i++;
    }
    
    if (*inptr && *inptr != '[') {
        /* there are required arguments left in the arg-list that we
           were never provided */
        return false;
    }
    
    /* now check all of the optional arguments (if any) */
    if (*inptr == '[')
        inptr++;
    while (*inptr && *inptr != ']' && i < argc) {
        mask = decode_arg_type (&inptr);
        if (!(mask & (1 << argv[i].type))) {
            // argv[i] does not match any of the expected types
            return false;
        }
        
        i++;
    }
    
    if (i < argc) {
        /* we were provided too many arguments */
        return false;
    }
    
    return true;
}

An example usage might be check_arg_list ("(so)i(nso)[bb]", argc, argv). In this example, the first argument is expected to be either a string or object. The second argument would be an int32. The third argument would be null, a string, or an object. The 4th and 5th arguments are both of type bool if they are present.
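
To give an idea of how this fits into a plugin, here's a rough sketch of a scriptable object's invoke() callback using the helper (the method identifier and the body are hypothetical placeholders):

static bool
plugin_object_invoke (NPObject *npobj, NPIdentifier name,
                      const NPVariant *argv, uint32_t argc,
                      NPVariant *result)
{
    /* 'method_foo_id' would be an NPIdentifier cached at class init */
    if (name == method_foo_id) {
        if (!check_arg_list ("(so)i(nso)[bb]", argc, argv))
            return false;

        /* argv[0..argc-1] now have the expected types; do the
         * actual work and fill in *result here */
        return true;
    }

    return false;
}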

Since NPVariant is part of the NPAPI shared by all browsers that support Netscape-style plugins, you might find my hack useful for other plugin projects as well.

Enjoy!

Sunday, June 1, 2008

How To Survive Poisonous People

This is an interesting Google Tech Talk about how to survive poisonous people in F/LOSS communities. I probably find it interesting at least in part because there are a couple of very loud poisonous people constantly attacking the GNOME and Mono projects (both of which I'm involved with), but it's a great talk to listen to anyway.

I personally love the "Hostility" slide at around 29 minutes into the talk:

Hostility
  • Insults the status quo
  • Angrily demands help
  • Attempts to blackmail
  • Attempts to deliberately rile
  • Makes accusations of conspiracy (paranoia)

The bold/italic bit is a big hint-hint to the handful of people (not that they aren't also guilty of some of the other points) I'm referring to as the poisonous people around GNOME and Mono ;)

Saturday, May 31, 2008

Interesting link of the day

What IronRuby Running Rails *REALLY* Means and Why Miguel de Icaza Deserves The Credit

Grats to Miguel ;-)

GMime 2.3.1

I've been continuing to make improvements to the API over the past few days and have just made a 2.3.1 release.

The main focus of this release has been a redesign of the header API which has been nagging at me for quite a while now. To fix it, I've done the following:

  • Renamed GMimeHeader to GMimeHeaderList, since GMimeHeader felt more like it should refer to a single header (name/value pair).
  • Introduced a new GMimeHeader, altho it is internal only at the moment. I had originally planned on making it public (hence the rename of the old GMimeHeader struct).
  • Introduced a GMimeHeaderIter which can be used to iterate forward and backward over the headers, allowing you to change header values (but not names). It also allows removal of headers which makes it possible to remove any header you want, whereas the old API would only let you remove a header by name (which was a problem if there was more than 1 by the same name).
  • Got rid of g_mime_header_foreach() because that API sucked and has been replaced by the far more useful GMimeHeaderIter API.

The GMimeHeaderIter API looks like this:

GMimeHeaderIter *g_mime_header_iter_copy (GMimeHeaderIter *iter);
void g_mime_header_iter_free (GMimeHeaderIter *iter);

bool g_mime_header_iter_equal (GMimeHeaderIter *iter1, GMimeHeaderIter *iter2);

bool g_mime_header_iter_is_valid (GMimeHeaderIter *iter);

bool g_mime_header_iter_first (GMimeHeaderIter *iter);
bool g_mime_header_iter_last (GMimeHeaderIter *iter);

bool g_mime_header_iter_next (GMimeHeaderIter *iter);
bool g_mime_header_iter_prev (GMimeHeaderIter *iter);

gint64 g_mime_header_iter_get_offset (GMimeHeaderIter *iter);
const char *g_mime_header_iter_get_name (GMimeHeaderIter *iter);
bool g_mime_header_iter_set_value (GMimeHeaderIter *iter, const char *value);
const char *g_mime_header_iter_get_value (GMimeHeaderIter *iter);

/* if the iter is valid, removes the current header and advances 
   to the next - returns TRUE on success or FALSE otherwise */
bool g_mime_header_iter_remove (GMimeHeaderIter *iter);

Currently, the way to get a GMimeHeaderIter is to call g_mime_header_list_get_iter() which returns a newly allocated iter. I'm not sure if this is the best API or not tho... some other thoughts I've had are:

GMimeHeaderIter *g_mime_header_iter_new (void);
bool g_mime_header_list_get_iter (GMimeHeaderList *list, GMimeHeaderIter *iter);

The second function would initialize iter and return TRUE, or, if list was empty, it could return FALSE.

Another option would be to just have:

GMimeHeaderIter *g_mime_header_iter_new (GMimeHeaderList *list);

Then, if list is empty you'd just get back an invalidated iter. If, later, headers were added to list, then perhaps the iter could auto-magically become valid again. This would more-or-less work the same as the current code - except that the way to instantiate a GMimeHeaderIter is different.

To be honest, now that I've written these 2 ideas down, I think I prefer the first idea.
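
To illustrate, walking the headers with the first idea would look something like this (a sketch assuming the stack-allocated iter variant; 'headers' is a hypothetical GMimeHeaderList):

GMimeHeaderIter iter;

if (g_mime_header_list_get_iter (headers, &iter)) {
    do {
        printf ("%s: %s\n",
                g_mime_header_iter_get_name (&iter),
                g_mime_header_iter_get_value (&iter));
    } while (g_mime_header_iter_next (&iter));
}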

So... iterators. Since you can have multiple iterators for the same GMimeHeaderList, it is important to invalidate them on at least two occasions:

  • when the GMimeHeaderList is destroyed
  • when the header that an iter points to is removed; either by a call to g_mime_header_list_remove() or via a call to g_mime_header_iter_remove() on another iter instance.

You'll be pleased to note that only iters that reference a header that got removed are invalidated. This fact seems to be an argument in favor of my first idea above, as it would allow re-use of iters once they get invalidated.

Thursday, May 29, 2008

GMime 2.3.0

GMime 2.3.0 has been released which marks the first development release of GMime since 2002. A new age has begun.

This new development release gets rid of off_t from all public interfaces, replacing it with gint64 instead so that there is no API/ABI breakage when building with Large File support. But more importantly (to me), it also fixes a number of the API inconsistencies/ugliness that I discovered while looking over the GMime-Sharp bindings.

The GMime-Sharp bindings still have a long way to go to be beautiful, but they are becoming more of a priority as I am more interested in using GMime-Sharp myself. If you are a user of the GMime-Sharp bindings, I'd love for you to try out GMime 2.3.0 and give me some feedback on how to improve the API even more. I, myself, will continue to scrutinize the API and make adjustments as I go - in the end, I hope to have the most beautiful MIME .NET API ever created.

At first, I was only going to fix GMime's public interfaces to use gint64 instead of off_t and save the rest of my changes for a "3.0" or at least a "2.6", but I decided that since I was breaking the ABI anyway, I might as well do some cleanup. I think developers worldwide will thank me after the initial pain of porting their software to the new API (for which I hope to write a script to ease the transition).

Largely these changes are renaming of functions, sometimes shuffling them to become methods on a base class, etc.

I got rid of GMimePartEncodingType and created a GMimeContentEncoding enum instead, complete with a g_mime_content_encoding_to_string() and g_mime_content_encoding_from_string(). I also created a new state object called GMimeEncoding (may rename to GMimeEncoder?) which simplifies the base64/QuotedPrintable/UU encoder/decoder APIs a little in that it handles initializing the state variables for you as you encode or decode to any of those encodings.

As I was updating code to use these new enums/structs, I was able to simplify GMimeFilterBasic to not need its own GMimeFilterBasicType struct - instead, you pass the encoding and a bool to encode vs decode to the new g_mime_filter_basic_new() constructor. This has helped simplify a lot of code inside GMime and I'm sure it will help simplify everyone else's code as well, because there is no longer a need to map GMIME_PART_ENCODING_BASE64 to GMIME_FILTER_BASIC_TYPE_BASE64_ENC or GMIME_FILTER_BASIC_TYPE_BASE64_DEC.
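
In other words, constructing an encoder or a decoder filter would now look roughly like this (a sketch; the exact enum value names here are assumed, by analogy with the old GMIME_PART_ENCODING_* names):

GMimeFilter *encoder, *decoder;

/* TRUE to encode, FALSE to decode */
encoder = g_mime_filter_basic_new (GMIME_CONTENT_ENCODING_BASE64, TRUE);
decoder = g_mime_filter_basic_new (GMIME_CONTENT_ENCODING_BASE64, FALSE);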

I've also fixed GMimeContentType and GMimeContentDisposition to notify their parent GMimeObject when they are changed so that the following C#ism becomes possible:

part.ContentType.SetParameter ("name", "fubar.txt");

Previously, you'd have to use the GMimeObject's Content-Type helper methods instead of using the Content-Type's methods directly, like so:

part.SetContentTypeParameter ("name", "fubar.txt");

There are quite a few other improvements like this that I've added to the GMime-Sharp API as well. Sure, they may be minor, but they make the API much more polished.

Wednesday, May 28, 2008

A Moment of Zen

And here it is, your Moment of Zen:

Those who assume the worst in other people, are by their very nature among the worst people.

Those who assume the best in other people, are by their very nature among the best people.

-- Jeffrey Stedfast

Saturday, May 17, 2008

Debian Language Benchmarks - SumFile

Since someone brought it up the other day on IRC, I figured I'd take a look at some of the Java vs C# benchmarks where Java was claimed to be faster than C#.

Scrolling through the list, there were 3 tests where Java was supposedly more than 2x faster than C# on Mono.

The SumFile test looked fairly simple and I figured the bottleneck there might be related to something silly being done with StandardInput (Miguel replaced the Int.Parse() routines with my more optimal implementations a while back, so I figured that was probably ok).

The first thing I tried to do was reproduce their findings, so I grabbed the Java and C# sources from the Debian site and compiled them.

Both programs read from StandardInput an integer value per line, so I wrote a simple program to generate some input for the test:

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
    unsigned int max, i;
    int sum = 0;

    max = atoi (argv[1]);

    for (i = 0; i <= max; i++) {
        printf ("%u\n", i);
        sum += i;
    }

    fprintf (stderr, "master value: %d\n", sum);

    return 0;
}

As you can see above, the program outputs the real sum of all the digits it is outputting to stderr so that we can compare the results of the C# and Java programs to make sure they are correct for 32bit signed integers.

Then I compiled the Java and C# versions using the same compile options mentioned on the Debian Language Shootout pages for each language:

javac -classpath '.' sumcol.java

and

gmcs sumcol.cs

I then ran each program using the following commands (using the same java and mono command-line options):

[fejj@moonlight benchmarks]$ time ./output 5000000 | java -server -Xbatch sumcol
master value: 1647668640
2147483647

real    0m4.606s
user    0m5.264s
sys     0m0.276s
[fejj@moonlight benchmarks]$ time ./output 5000000 | mono sumcol.exe
master value: 1647668640
1647668640

real    0m1.415s
user    0m2.136s
sys     0m0.228s

As you can see above, there's no way that Java is 2.3x faster than Mono... so I'm not sure where they are getting these results from. Not only that, but Java is getting the wrong answer!

My best guess is that Java does not allow integer wrapping, so I suppose that that is okay... but it's a bit odd.

For comparison's sake, here's what I get with the C implementation of sumfile:

[fejj@moonlight benchmarks]$ time ./output 5000000 | ./sumcol
master value: 1647668640
1647668640

real    0m1.043s
user    0m1.604s
sys     0m0.188s

Thinking that Java might be taking a performance hit from the bounds checking, I changed my original number-generating program to always print '1' on each line:

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
    unsigned int i, max;
    int sum = 0;

    max = atoi (argv[1]);

    for (i = 0; i < max; i++) {
        printf ("1\n");
        sum++;
    }

    fprintf (stderr, "master value: %d\n", sum);

    return 0;
}

Running the C, Mono, and Java versions again, I get the following results:

[fejj@moonlight benchmarks]$ time ./output 5000000 | ./sumcol
master value: 5000000
5000000

real    0m0.601s
user    0m0.828s
sys     0m0.032s
[fejj@moonlight benchmarks]$ time ./output 5000000 | mono sumcol.exe
master value: 5000000
5000000

real    0m0.774s
user    0m0.876s
sys     0m0.064s
[fejj@moonlight benchmarks]$ time ./output 5000000 | java -server -Xbatch sumcol
master value: 5000000
5000000

real    0m0.751s
user    0m0.824s
sys     0m0.096s

Here we can see that Java is slightly faster than Mono when comparing the 'real' and 'user' times.

I figured that in order to get a more accurate reading, I should re-run using a larger value, so...

[fejj@moonlight benchmarks]$ time ./output 21474836 | java sumcol
master value: 21474836
21474836

real    0m2.625s
user    0m3.236s
sys     0m0.304s
[fejj@moonlight benchmarks]$ time ./output 21474836 | mono sumcol.exe 
master value: 21474836
21474836

real    0m3.225s
user    0m3.712s
sys     0m0.272s

Again, Java is faster according to 'real' and 'user' times.

What have we learned?

The Java program appears to be slightly faster if and only if we do not overflow the signed integer, otherwise Mono takes the prize.

If we revert back to using the original program I wrote to generate input and re-run using our most recent 'max' value, we find:

[fejj@moonlight benchmarks]$ time ./output 21474836 | java sumcol
master value: 370655678
2147483647

real    0m19.157s
user    0m22.757s
sys     0m0.568s
[fejj@moonlight benchmarks]$ time ./output 21474836 | mono sumcol.exe
master value: 370655678
370655678

real    0m6.004s
user    0m9.513s
sys     0m0.940s

Here we see that Mono clearly outperforms Java, no matter how you slice it.

Conclusion?

You can't compare the performance of the Java and C# programs without first understanding the implications of the input provided to them.

Had the Debian Language Shootout used a different input, the performance tests might have instead shown Java getting crushed by Mono for this particular test. Basically, Java just got lucky that the inputs were in its favor.

Update: Oops, I forgot to give specifics of my Mono/Java versions and the specs of my laptop:

Mono: JIT compiler version 1.9 (/trunk/ r103228)

java -version gives the following output:

java version "1.6.0_06"
Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
Java HotSpot(TM) Server VM (build 10.0-b22, mixed mode)

The hardware used was my Lenovo T61 Core2Duo @ 2.4GHz

This was running on OpenSuSE 10.3 with the latest updates.

Update 2: I was pointed to http://shootout.alioth.debian.org/download/sumcol-input.txt to use as input for this SumFile test. Below I've pasted the results:

[fejj@moonlight benchmarks]$ time java -server -Xbatch sumcol < ~/sumcol-input.txt 
500

real    0m0.137s
user    0m0.052s
sys     0m0.028s
[fejj@moonlight benchmarks]$ time mono sumcol.exe < ~/sumcol-input.txt 
500

real    0m0.078s
user    0m0.060s
sys     0m0.012s

Sorry guys, but even with this input, Java is not 2x faster than Mono.

Update 3: Since the above comparisons were all done using the StreamTokenizer version of the Java implementation, I figured I should go back and get the run times of the original Java implementation.

Using my original implementation of output.c, each line having a value 1 larger than the previous line, we get:

[fejj@moonlight benchmarks]$ time ./output 5000000 | java -server -Xbatch sumcol
master value: 1647668640
1647668640

real    0m1.492s
user    0m2.272s
sys     0m0.168s

Interesting note: this Java implementation does not get the wrong answer...

The results for "1" on every line are as follows:

[fejj@moonlight benchmarks]$ time ./output 5000000 | java -server -Xbatch sumcol
master value: 5000000
5000000

real    0m0.924s
user    0m0.984s
sys     0m0.176s

Repeating the trial run that totally killed Java's performance with the Tokenizer implementation, we get:

[fejj@moonlight benchmarks]$ time ./output 21474836 | java -server -Xbatch sumcol
master value: 370655678
370655678

real    0m6.564s
user    0m9.893s
sys     0m0.688s

And finally, the results for the sumcol-input.txt, we get:

[fejj@moonlight benchmarks]$ time java -server -Xbatch sumcol < ~/sumcol-input.txt 
500

real    0m0.107s
user    0m0.056s
sys     0m0.032s

Conclusion:

The original Java implementation gets much better results than the StreamTokenizer implementation that was supposedly faster according to the Benchmarks table; however, it's still not 2x faster than the Mono implementation (in fact, this version of the Java implementation is roughly on par with the C# implementation).

Update 4: One of the comments to this blog post requested I try the Java6 SumCol #3 program which includes its own buffering/tokenizer implementation.

Since this program "cheats" a bit, I figured I'd pit it against my own implementation of equal craftiness ;-)

You can find my C# implementation here.

The results are as follows:

[fejj@moonlight benchmarks]$ time ./output 21474836 | mono sumcol2.exe
master value: 370655678
370655678

real    0m4.996s
user    0m4.904s
sys     0m0.676s
[fejj@moonlight benchmarks]$ time ./output 21474836 | java -server -Xbatch sumcol3
master value: 370655678
370655678

real    0m5.117s
user    0m5.748s
sys     0m1.088s

Am I a crafty bugger or what? ;-)

Running these implementations against sumcol-input.txt, I get the following results:

[fejj@moonlight benchmarks]$ time mono sumcol2.exe < ~/sumcol-input.txt 
500

real    0m0.076s
user    0m0.064s
sys     0m0.016s
[fejj@moonlight benchmarks]$ time java -server -Xbatch sumcol3 < ~/sumcol-input.txt 
500

real    0m0.108s
user    0m0.064s
sys     0m0.040s

Damn I'm good! ;-)

Update 5: It's been suggested that the -server and -Xbatch args might make Java slower, so I've redone the tests (this time using sumcol-input.txt repeated 100,000 times).

Here are the results:

[fejj@moonlight benchmarks]$ time java -server sumcol < sumcol-input100000.txt 
50000000

real    0m19.025s
user    0m18.657s
sys     0m0.412s
[fejj@moonlight benchmarks]$ time java sumcol < sumcol-input100000.txt 
50000000

real    0m20.643s
user    0m20.269s
sys     0m0.364s
[fejj@moonlight benchmarks]$ time mono sumcol.exe < sumcol-input100000.txt 
50000000

real    0m18.884s
user    0m18.245s
sys     0m0.604s

In the case of Java, it looks like using the -server flag performs a little bit better on this particular test than without.

Overall, the Java and Mono performance for this test are remarkably close.

In the end, I think I can feel pretty confident that Mono's performance on this test is acceptable and it's not worth wasting any more time on it for now.

Update 6: Okay, one last update to show that a fully managed C# implementation can outperform the fastest C implementation submitted to the Debian Language Benchmarks:

[fejj@moonlight benchmarks]$ time mono sumcol2.exe < sumcol-input100000.txt 
50000000

real    0m2.906s
user    0m2.572s
sys     0m0.308s
[fejj@moonlight benchmarks]$ time ./sumcol < sumcol-input100000.txt 
50000000

real    0m13.967s
user    0m13.277s
sys     0m0.668s

Yes, ladies and gentlemen, that is C# wiping the floor with C.

Saturday, May 10, 2008

Wouldn't it be nice if...

Over the past week, I've been spending some time hacking on Evolution again because of my frustration with the current IMAP backend. This got me to wondering... why hasn't anyone stepped up to the plate and rewritten Evolution's IMAP code yet?

I think the reason can be summed up with the following 2 problems:

1. IMAP is hard

2. Coding something complicated like a multithreaded multiplexed IMAP client library in C is harder.

That got me and Michael Hutchinson wondering... wouldn't it be nice if we could write Camel provider plugins in C# or in any other managed language that we might prefer?

I think Camel, like Evolution itself, should allow developers to implement plugins in C# as well. I really think this might help lessen the burden of implementing new mail protocol backends for Camel/Evolution.

On that note, I've created a new svn module called camel-imap4 which can be built against your installed evolution-data-server devel packages.

Currently, however, it'll probably only work with e-d-s >= 2.23.x because some things (and assumptions) in Camel have changed recently.

One problem I'm having is that the symbol camel_folder_info_new() does not exist in older versions of e-d-s; it was only added recently, and it makes use of g_slice_alloc0(). The problem is that providers used to allocate CamelFolderInfo structures themselves using g_new0(). Why does this pose a problem? There's no guarantee that I'm aware of that you can mix and match g_malloc/g_slice_free or g_slice_alloc/g_free.

This makes it difficult for me to implement a plugin that builds and works with my installed version of Evolution (2.12) and also work with Evolution svn (2.23). This is quite unfortunate :(
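
One possible workaround would be to pick the allocator at build time so that the structure always gets freed the same way it was allocated - something like this sketch (HAVE_CAMEL_FOLDER_INFO_NEW standing in for a hypothetical configure-time check):

CamelFolderInfo *fi;

#ifdef HAVE_CAMEL_FOLDER_INFO_NEW
    /* newer e-d-s: camel_folder_info_new() allocates with the slice
     * allocator, so e-d-s frees it the same way */
    fi = camel_folder_info_new ();
#else
    /* older e-d-s: providers allocate with g_new0() and e-d-s
     * frees with g_free() */
    fi = g_new0 (CamelFolderInfo, 1);
#endif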

While I'm at it, allow me to also propose some changes to the GChecksum API. Please, please, please make it so that we need not allocate/free a GChecksum variable each time we need to checksum something.

I propose the ability to do the following:

GChecksum checksum;

g_checksum_init (&checksum, G_CHECKSUM_MD5);
g_checksum_update (&checksum, data, len);
g_checksum_get_digest (&checksum, digest, &len);

Then I'd like to be able to either call g_checksum_init() on checksum again or maybe have another function to clear state, maybe g_checksum_clear() which would allow me to once again use the same checksum variable for calculating the md5 of some other chunk of data.

Camel generates md5sums for a lot of data, sometimes in loops. Having to alloc/free every iteration is inefficient and tedious.
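
Just to illustrate the proposal, the loop case would then look something like this (g_checksum_clear() being the hypothetical addition; the 'items' array is a placeholder):

GChecksum checksum;
guint8 digest[16];
gsize len;
guint i;

g_checksum_init (&checksum, G_CHECKSUM_MD5);

for (i = 0; i < n_items; i++) {
    len = sizeof (digest);

    g_checksum_update (&checksum, items[i].data, items[i].len);
    g_checksum_get_digest (&checksum, digest, &len);

    /* ... use digest ... */

    /* reset the state so the same variable can be reused */
    g_checksum_clear (&checksum);
}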

Update: It now builds and works with Evolution 2.12 (I haven't tested anything else), and the new and improved IMAP back-end for Evolution is now actually working. Whoohoo!

Code Snippet Licensing

All code posted to this blog is licensed under the MIT/X11 license unless otherwise stated in the post itself.