ZFS Source Tour

March 20, 2008 at 12:33 AM | categories: solaris, software engineering

http://www.opensolaris.org/os/community/zfs/source/


Firefox Bug on Long Tooltips

June 17, 2007 at 01:49 AM | categories: humor, software engineering

The true test of Geekdom is whether you can read a bug report on a Saturday evening and find great humor in it. Here is your chance:

Bug 45375 – SeaMonkey-only bug: Long tooltips should wrap instead of being cropped (multiline tooltips)

Look for such "celebrity" appearances as the "xkcd" guy and others. My favorite part is the bug reporter chiming in each year to wish the bug happy birthday, though there is an inconsistency when he changes the gender of the bug along the way. Maybe I should file a bug on the bug report?

Regular Expressions, Lisp, SQL, Parsing, Domain Specific Languages

June 10, 2007 at 03:34 PM | categories: code, software engineering, lisp, philosophy, programming, unix

I've been trying to code some more on Project Shelob (my web server) in my spare time. I'm to the point of needing a configuration file, so I can start up the server using different ports and directories for testing. Speaking of testing, I'm also to the point of needing automated test suites. I was refactoring some of the HTTP code, and when I got done, it was far more readable, and there was much rejoicing! Unfortunately, two days later I discovered I had introduced a subtle bug in keep-alive handling during a 404 event. Oops.

Anyway, I decided to use JSON as my configuration language. It is simple, it accommodates everything I need, and later I will be able to easily write an AJAX GUI front end to configure the whole thing. Should be slick, right? Not as easy as it might sound. Though I have written parsers by hand, I'd rather not. OK, so I'm using C++; surely someone has written an easy-to-use open source library where I can just stick in my rules and get out a nice data structure, right? Well, kind of. There is Boost Spirit, which would do everything that I want it to do, but it would also require me to translate the EBNF grammar of JSON into Boost's strange amalgamation of YACC and C++.

Okay, well and good, but surely there is something better? After some more searching, I ran across ANTLR, which seems to be the spiritual successor to LEX and YACC/Bison. It even has a nice Java GUI, and someone had kindly done the ANTLR rules for JSON. Check out the graphical goodness:

Still, the C++ backend wasn't fully supported, required installing libraries, and was complicated. Not 100% what I needed or wanted.

All of which got me thinking about domain specific languages. Most programmers don't consider it, but SQL and Regular Expressions are good examples of Domain Specific Languages (DSLs), as are lex and yacc/bison. Up till now, I've frowned on the whole idea of DSLs in general. It had always seemed like bad software engineering practice to invent a new language for each problem. After all, did we really want to learn an entirely new programming language with each assignment? Who is going to maintain the code? However, the fact is that you have to learn an entire API anyway, and the API really just layers over what you're trying to do in a language that wasn't quite expressive enough to do the job natively to begin with. Which of course leads me to Lisp, by way of Martin Fowler, who makes some good points here:
"One of the most obviously DSLy parts of the world is the Unix tradition of writing little languages. These are external DSL systems, that typically use Unix's built in tools to help with translation. While at university I played a little with lex and yacc - similar tools are a regular part of the Unix tool-chain. These tools make it easy to write parsers and generate code (often in C) for little languages. Awk is a good example of this kind of mini-language."
While I've been using SQL, regular expressions, awk, lex, and yacc for years, I'd never really classified them in my mind as DSLs. I've been well aware of the power of small specialized utilities aggregated together to perform a bigger task and why UNIX has been so successful at this, but I hadn't made the leap to apply this to my programming. Fowler continues:
"Lisp is probably the strongest example of expressing DSLs directly in the language itself. Symbolic processing is embedded into the name as well as practice of lispers. Doing this is helped by the facilities of lisp - minimalist syntax, closures, and macros present a heady cocktail of DSL tooling. Paul Graham writes a lot about this style of development. Smalltalk also has a strong tradition of this style of development."
I've heard "grey-beards" and academics talk about the power of Lisp for years, and though I did some trivial functional programming in college, I've dismissed the rants of the Lisp guys as nothing more than rants. Today though, the ideas are crystallizing in my head, and I'm excited to explore this more.
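
To make the point about regular expressions being a little language living inside a host language concrete, here is a minimal sketch in Python (not Shelob's actual C++ code). The pattern, the `parse_request_line` name, and the simplified request-line grammar are all assumptions made for illustration; real HTTP parsing has more cases than this.

```python
import re

# Hypothetical, simplified grammar for an HTTP request line:
#   METHOD SP TARGET SP "HTTP/" DIGIT "." DIGIT
# The regular expression is the "little language"; Python is only the host.
REQUEST_LINE = re.compile(r"^([A-Z]+) (\S+) HTTP/(\d)\.(\d)$")


def parse_request_line(line):
    """Return (method, target, (major, minor)), or None if the line does not match."""
    match = REQUEST_LINE.match(line)
    if match is None:
        return None
    method, target, major, minor = match.groups()
    return method, target, (int(major), int(minor))
```

The one-line pattern stands in for what would otherwise be a hand-written loop over characters, which is roughly the trade a DSL offers.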

Understanding Monkey Patching

May 24, 2007 at 12:53 AM | categories: python, ruby, software engineering, rails

I've noticed one of the problems I have writing this blog is that I prefer to have finished thoughts when I write up something, or at least to have a good understanding of a problem I am working on before committing it to 'paper'. Unfortunately, this doesn't lead to many updates. I'll try to break this habit a little. Recently I heard the term 'Monkey Patching' after one of the Seattle Patterns Group meetings, in relation to Ruby on Rails.

A Monkey-Patch (also called Monkey Patch, MonkeyPatch) is a way to extend or modify runtime code without altering the original source code for dynamic languages (e.g. Ruby and Python).
Today, reading a blog entry from Chad Fowler, the term came up again with a Python developer saying:
You can monkeypatch code in Python pretty easily, but we look down on it enough that we call it "monkeypatching". In Ruby they call it "opening a class" and think it's a cool feature. I will assert: we are right, they are wrong.
When I read that, I felt almost relieved, because I was thinking the same thing. I can see a limited use for it, but it seems like something you should only do in dire circumstances, and something that would be detrimental to good software engineering practices. I can't prove this, nor am I totally convinced, but Ruby, and Ruby on Rails in particular, seem to play a little bit fast and loose. I suppose this fits in with Agility, but there does seem to be a mental divide between Python and Ruby people (even though they are really quite close linguistically). To date, I'm much more in the Python camp, but I've been deploying Rails apps at work, and I'll be delving more into Ruby as I go forward. I have heard rumors that Zope does Monkey Patching, and this convinces me even more. Zope has almost single-handedly destroyed Python's reputation at my place of work. Thanks Zope!
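
For anyone who hasn't seen one, the definition above can be sketched in a few lines of Python. The `Greeter` class and the `shout` function are hypothetical names made up for this example; the point is only that the class is changed at runtime without touching its source.

```python
class Greeter:
    """Stands in for a class from a library whose source we cannot edit."""

    def greet(self):
        return "hello"


# An object created before the patch is applied.
early = Greeter()


def shout(self):
    # Call the saved original method, then modify its result.
    return self._original_greet().upper() + "!"


# The monkey patch: keep a reference to the original, then swap in the new method.
Greeter._original_greet = Greeter.greet
Greeter.greet = shout

late = Greeter()
```

Both `early.greet()` and `late.greet()` now return `"HELLO!"`, since method lookup goes through the class. That action at a distance is a large part of why the practice makes people nervous.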

Linux Block I/O Scheduler Interview

January 31, 2007 at 01:33 PM | categories: software engineering, linux

Kernel Trap has a great interview with the maintainer of the Linux block I/O layer. He discusses some of the limitations in the current I/O schedulers, and how they can be swapped out dynamically at runtime. I found the following particularly informative:
"Splice has a host of applications. It can completely replace the bad hack that is sendfile(), which is an extremely limited zero copy interface for sending a file over the network. The neat thing about using pipes as the buffers, is that you have a known interface to work with and a way to tie things together intuitively. A good and easy to understand example is a live TV setup, where you have a driver for your TV encoder (lets call that /dev/tvcapture) and a driver for your TV decoder (lets call that /dev/tvout). Say you want to watch live TV while storing the contents to a file for pausing or rewind purposes, you could describe that as easy as:"

$ splice-in /dev/tvcapture | splice-tee out.mpg | splice-out /dev/tvout
"The first step will open /dev/tvcapture and splice that file descriptor to STDOUT. The second will duplicate the page references from the STDIN pipe, splicing the first to the output file and splicing the second to STDOUT. Finally, the last step will splice STDIN to a file descriptor for /dev/tvout. The data never needs to be copied around, we simply move page references around inside the kernel. It's like building with Lego blocks :-)"
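
The first step of that pipeline (file descriptor into a pipe) can be sketched from user space in Python. This is not the `splice-in` tool from the interview, just an illustration under some assumptions: `os.splice` only exists on Linux with Python 3.10 or later, so the hypothetical `move_bytes` helper below falls back to an ordinary read/write copy (which is not zero-copy) anywhere else. A temporary file stands in for `/dev/tvcapture`.

```python
import os
import tempfile


def move_bytes(src_fd, dst_fd, count):
    """Move up to `count` bytes from src_fd to dst_fd and return the number moved.

    Uses splice() where available (at least one descriptor must be a pipe),
    otherwise falls back to a plain copy through user space.
    """
    if hasattr(os, "splice"):
        try:
            return os.splice(src_fd, dst_fd, count)
        except OSError:
            pass  # splice not usable here; fall back to the copying path
    data = os.read(src_fd, count)
    os.write(dst_fd, data)
    return len(data)


def demo():
    """Push the contents of a temporary file into a pipe and read them back."""
    with tempfile.TemporaryFile() as source:
        source.write(b"live tv frames")
        source.flush()
        os.lseek(source.fileno(), 0, os.SEEK_SET)  # rewind the descriptor
        read_end, write_end = os.pipe()
        try:
            moved = move_bytes(source.fileno(), write_end, 1024)
            return os.read(read_end, moved)
        finally:
            os.close(read_end)
            os.close(write_end)
```

Calling `demo()` returns the bytes that were written to the temporary file. On the splice path the payload is handed to the pipe inside the kernel rather than being copied through a Python buffer.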

Cyclomatic complexity of Django

January 12, 2007 at 04:39 PM | categories: python, software engineering

Gary Wilson has written up a post detailing the cyclomatic complexity of Django (the Python web dev framework).

Wikipedia defines cyclomatic complexity as:

Cyclomatic complexity is a software metric (measurement) in computational complexity theory. It was developed by Thomas McCabe and is used to measure the complexity of a program. It directly measures the number of linearly independent paths through a program's source code. The concept, although not the method, is somewhat similar to that of general text complexity measured by the Flesch-Kincaid Readability Test. Cyclomatic complexity is computed using a graph that describes the control flow of the program. The nodes of the graph correspond to the commands of a program. A directed edge connects two nodes if the second command might be executed immediately after the first command.
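
As a rough illustration of the metric (and not the tool used in Gary Wilson's post), here is a sketch in Python that uses a common shortcut: rather than building the control-flow graph, start at 1 and add one for every decision point found in the syntax tree. The `cyclomatic_complexity` name and the particular set of node types counted are assumptions of this sketch; real tools count more constructs.

```python
import ast

# Syntax nodes treated as decision points in this simplified sketch.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate the cyclomatic complexity of a piece of Python source."""
    complexity = 1  # a single straight-line path through the code
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra path per additional operand
            complexity += len(node.values) - 1
    return complexity


SAMPLE = """
def clamp_count(x):
    if x > 0 and x < 10:
        return 1
    for i in range(x):
        pass
    return 0
"""
```

For `SAMPLE` this gives 4: one for the base path, one for the `if`, one for the `and`, and one for the `for` loop.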

It isn't often that you see software engineering metrics applied to open source projects. I wonder why that is?
