Category Archives: code

Parentheses in Ruby

I’ve been doing a lot of “heart” and “mind” posts lately, and “code” is starting to feel neglected. If you’re not a programmer, feel free to give this one a miss.

I put off writing this post til the last minute because I want to talk about parentheses in Ruby, and I’d like to give you a definitive “do this” answer, but the reality is I just don’t know, so I’m gonna throw out some thoughts and let you guys tell me what you think.

Ruby Makes Parentheses Optional

Okay, that’s not news. But a lot of people come to Ruby from languages where parens are required, and they see that they’re optional in Ruby, and so they keep them in out of habit. And then they complain that Ruby is dumb for not having first-class functions.

What is often overlooked here is that optional parens are important. Matz chose to sacrifice first-class functions just so he could make parentheses optional. So when I started programming in Ruby, I made an effort to eschew them when possible. Now, “optional” means optional–it doesn’t mean banned and it doesn’t mean required. So there’s room for wiggle here.

Some Strong Styles Have Emerged

There are some very strong idioms that have emerged, however. Well, mainly just one: Never use empty parens. This is actually two rules: never put parens at the end of a method definition if the method takes no arguments, and never use parens when sending a message that takes no arguments. There’s a lot of really good reasons for doing this, and I won’t go into them here.

Generally Favoring Parens

I know some smart programmers who favor parens in ruby. Their general argument is simply “it makes the code more readable”.

My experience has been that this argument is indistinguishable from “it makes the code look like the other languages I’ve spent years learning to read”. This is not necessarily wrong. I think both styles have tradeoffs, and there’s a mental cost to be paid to learn to read a new style. Since this cost has to be paid up front, the tradeoff is really imbalanced in the short-term. But is this a false economy in the long run? I’m asking because I don’t know.

Generally Eschewing Parens

On the far other side of the spectrum lies “Seattle.rb style”. Josh Susser originally coined the term “Seatttle style” to mean “not using parens in method definitions”, but Ryan Davis and Seattle.rb have taken it and run with it to mean “never use parentheses unless the compiler requires it”.

I spent a couple weeks experimenting with this style. It definitely had an effect on my code. I never did get to where I like omitting parens from method definitions, and maybe I am subject to my own argument above about embracing the tradeoff. One thing I did find, however, was that trying to avoid parens everywhere else had a profoundly rewarding effect on how I felt about my code. The original argument from Matz about making parens optional was that parens are often just noise, and getting rid of this noise is important.

Parens and Readability

When I talked about this with the rest of the Ruby Rogues, I quickly found myself in a 3-against-1 battle (Katrina wasn’t available). Josh made the argument that omitting parens decreases readability. You have to read this entire line of code to know when the first method call is actually done:

puts array.delete hash.fetch :foo

And I have to agree. That line of code is horrible.

But… it’s not the lack of parens that make it horrible. That line of code is horrible all by itself. I don’t think this is really any better:

puts array.delete(hash.fetch(:foo))

(Another strong convention in Ruby is to never use parens with calls to puts; otherwise another layer of parens could be added, but they would be gratuitous.)

Now, some programmers will find that line of code more readable. My point is that this is a problem. The line looks a bit more comforting, but you still actually have to read the whole line to know what’s happening. This line is doing two manipulations on unrelated primitives followed by a side effect call to puts. This line desperately needs some intention-revealing variable, methods, and selectors.

I see code like this all the time in parenful code. It sort of gets a pass with the parens stuck on. We say “yeah, this could be refactored, but it’s readable enough for now.”

But scroll back up to the version without parens. That line is unforgiveable. It has to go.

How valuable is that? I’m asking; I feel like it’s a lot, but I had to spend a few weeks learning to read a whole new style before I got that value. So I don’t know the answer. I don’t if it’s worth it. What I do know is that by eschewing parens, I never ever write lines like the above. The code just won’t let me. It bugs me, and I refactor it, and then my code feels better to look at, and the whole program becomes much more readable. That’s immensely valuable to me. But I don’t know how to communicate that it’s worth the trouble–or even if it’s worth it at the end of the day.

Readability For Other Developers

Another good question is if one developer hasn’t learned this style, and another one has, should parens be the default to maximize readability? I grind my teeth whenever people start making “lowest common denominator” arguments because it’s really hard sometimes to see the line between “this is overcomplicated” and “let’s do something more stupid because it’s easier”.

So part of the experiment this week has been paying attention to how my coworkers react to my code. It’s hardly been a scientifically rigorous study, but given a sample size of one week and two code reviews, I have one data point. It comes from a coworker who strongly favors parentheses, but has been tolerant of my style. His exact words were:

“Your code is a pleasure to read.”

I am confident that he was not talking about my parenthesis usage, but rather about the clean, refactored, expressive code that resulted from me not liking the way some lines of code looked without parens stuck on.

It’s hardly a conclusive argument, but it’s one that encourages me to continue researching. I’m not ready to stand up and proclaim that parentheses are nothing but noise that covers up code smells, but I am giving serious thought to secretly believing it. 🙂

Weirich-Style Kung Fu

A hybrid style is emerging from these discussions, and it’s structurally based on “The Weirich Rule”, which is about blocks rather than parentheses, but the similarities are definitely there and I find the analogy appealing. In Ruby you can write blocks with braces or with do…end; the Weirich Rule states that if the block returns a value, using braces signals that intent to the reader. If the block exists for its side effects, using do…end signals this intent.

A Weirich-inspired rule for parentheses, then, look like this:

If you care about the return value, send the message using parens.

The people who have I have talked to–well, listened to–about this rule are passionate about it and convinced that it increases readability and communicates intent. I want to get behind this rule, I really do. it’s a clear, bright-line rule for when to use parentheses or not. It’s the sort of rule that might get included in a book about ruby, for example.

But…

But I still have a problem with this style: when I went full-on no-parens mode, I was forced by the ugliness of my code to refactor it to be simpler and cleaner. When I use this paren style, I can feel that pressure evaporate. I’ve tried this rule for a week, and and I can feel my code quality suffering. I’m not convinced it’s a good rule.

Maybe I just need more engineering discipline. On the other hand, however, I have found that coding styles that require more discipline tend to embarrass me when I don’t step up. I gravitate much more strongly to coding styles that encourage and support me in having more engineering discipline, because they result in me writing better code.

But that brings me back to square one, which is trying to convince everyone to try giving up parentheses all over the place, and I don’t know if I have enough tin foil hats.

So that’s where I’m at. Thoughts?

How to Build the Unbuildable pg Gem: Ubuntu 13.04, rvm, and openssl

A couple weeks ago I spent several hours trying to get the pg gem to build on Ubuntu 13.04. There’s a Stack Overflow answer, How to Install Gem PG on Ubuntu, that is wrong, but kept me busy for an hour or so. This post is mostly so I won’t forget to do this again next time. This solution is particularly applicable if you are getting an error message that says PGConnect is being called with the wrong number of arguments.

The Problem

Ubuntu ships with a version of ruby that is compiled against one set of openssl libraries. Rvm installs its own version of openssl. That’s all fine and good until you install the postgres server (which builds against the Ubuntu version of openssl) and then try to install the pg gem (which builds against the rvm version). In short, you end up with a version of the postgresql client that can’t connect to the postgresql server–in fact, it won’t even build. People will give you “helpful” advice like “remove your openssl libraries” (which makes it impossible to build the postgresql server) or “use rvmsudo to force rvm’s openssl libs” (which forces building an incompatible version of the gem).

The Solution

I don’t know when they added this, but I just wanna hug the rvm developers:

rvm autolibs enable

Run that, and from now on, rvm will be MUCH smarter about using system libs versus its own libs. That one command made the pg gem build without a hitch, and more importantly, it made it build against the system version of openssl, making it compatible with my postgresql server.

Thanks, rvm guys!

Teach Yourself a New Programming Language in 21 Minutes (Or 2-3 Years, It Depends)

You’re sitting at work, grinding out a bug in the legacy system, when your boss comes in and tells the team that you finally get the chance to rewrite the whole system–and even better, you get to do it in Clojure! (Or Scala or Erlang or Rust or Dart or some other Language You Only Know A Little About But Have Secretly Wanted To Learn For A While Now.)

Or maybe you’re happy with the language you’re using, but your VP of Software Architecture just spent $150,000 on a suite of Enterprise Tools which includes a module that will let your project scale infinitely into the cloud… all you have to do is learn Clojure. (Or Scala or Erlang or Rust or Dart or some other Language You’ve Only Heard a Few Mutterings About But Desperately Want To Avoid Learning.)

Either way, you’ve got a problem: you need to ramp up in this new language. And whether you want to become a super expert guru ninja rockstar in the language, or just learn enough of it to make it go away, you want to do it fast–and that means you want to avoid making the same mistakes I made learning Ruby and JavaScript over the years. Mistakes which I have learned to fix, mind you, and so without further ado I present:

Teach Yourself A New Programming Language In 21 Minutes (Or 2-3 Years, It Depends)

All you need to do to learn a new language is learn:

  • How the language encapsulates data
  • When the language invokes execution of code
  • Where do the semicolons and braces go

I’m not entirely kidding. This was my strategy for a decade and to this day if I need to get something bashed out quickly in a new language I’ll skim a language reference and let the compiler tell me when I make syntax errors.

In general, as long as you’re staying largely inside the world of what I call “BOLS” (Block-Oriented, Lexically-Scoped) Languages, such as C, C++, VB, Java, C#, PHP, Lua, Python, Ruby or perl, you can in fact learn enough pidgin to get by very very quickly with this method. Granted, in those last three languages it will be obvious to experienced programmers that you’re writing inelegant code. But you can get by, is what I’m saying.

If you’re the second kind of programmer I mentioned, you might be done. Just read this next section for a caveat and then you can hopefully stop there.

Don’t Stop There If You Can’t Stop There

As you’re learning the new programming language, ask yourself the two vital tradeoff questions:

  • Do I really want to learn this language bad enough to actually learn this language?
  • Can I afford the time and energy needed to fumble around being bad at this language?

See, this strategy is especially useful if you know you don’t plan to ever actually learn the language. I have written some pretty arcane bash scripts in my day, but to be honest I wrote my first bash script 10 years ago and in that time I’ve written less than 10,000 lines of bash scripting code. It’s just not worth learning to me, so I keep some files around with examples of the most obvious kinds of things I want to do, and when I need a new bash script–usually about twice a year–I have all the pieces I need right there. BAM. Ignorance is bliss, and laziness is, occasionally, brilliant time management.

But this strategy is especially awful if you know you don’t plan to ever actually learn the language, but you turn out to be wrong. It ends up that you find yourself using it on a regular basis, and hilariously, you don’t even notice this for years and years. This is true for me of elisp, the flavor of lisp used to program emacs. I’ve written elisp for years longer than I have bash, but maybe only twice as many lines of code. I find myself needing an elisp tweak on a weekly basis, and end up spending an hour researching how to do it. And two or three times a year I find a problem that I could solve elegantly in elisp, if only I knew how to express what I was thinking as lisp code. But I don’t solve the problem. I merely sigh, and learn to live without whatever cool new feature I was thinking of.

I wrote that paragraph in present tense because I still haven’t figured out that I really do need to actually learn that language. Shut up.

Sometimes work and politics can affect your decision as well. If you and your boss agree that the Next Big Thing will be written in Language Y, then you have the need and your manager has the afford.

(And sometimes these two forces are in conflict. I could write an entire blog post on the political machinations involved when you and your boss disagree on the do/don’t want or can/can’t afford questions. Skunkworking a cool language or shirking a lame one is a topic for another post, one I’ll probably never write, but basically it would be all about office politics. I’ve seen people get fired for a successful skunkworks project and others get promoted for sandbagging a project. People sure are complicated!)

Okay, NOW if you’re the second type of programmer, you can safely stop reading. If not, you’re pretty much out of luck for the “21 Minutes” part of learning a new language. But keep reading, because the 2-3 years bit isn’t until the very end. Most of the mistakes I’ve made learning a new language I have made in the first few days.

Truly Embracing A Language Takes Time, But You’re In A Hurry, So…

If you want to embrace a language, it’s going to take time. You’re going to have to internalize the language’s entire approach to solving problems. You’re going to have to learn its idiosyncracies and its warts, and you’re going to have to learn its power moves and elegant applications. That all takes time, but if you’ll permit me to point out my favorite pitfalls and some less-traveled paths around them, I can maybe show you a trick or two for leveraging your learning.

But first, good news/bad news. I’m writing this assuming you already know a programming language or three or seven. The good news is, the more languages you know, the faster you can recognize the basic syntax patterns and logic structures of a new language. But the bad news is: the more languages you know, the more they tend to blur into a common model of computation in your head. This can make you blind to the elegant weirdnesses of your new language. Resist the urge to judge weird things quickly; they often turn out to be the most powerful features of the language once you “go native”. If you see something you can’t stand, remind yourself that you haven’t seen everything there is to see, and give the new idiom a chance. Try it out and learn its tradeoffs. It’s okay to discard a bad idea in JavaScript once you understand why it’s a bad idea in JavaScript… but it’s never a good idea to discard a feature of a new language based on your instincts–because your instincts come from other languages, not this one. (Read up on The Blub Paradox if you haven’t heard of it before.)

TL;DR This Is Mostly About Your Blind Spots

I should have put that TL;DR up at the top, but if you don’t know by now that I’m that kind of jerk, you must be new. Welcome to my blog! Anyway, here’s the list. I’ve included illustrations about Ruby, Python and JavaScript, because those are the three languages where I stunted my own growth unnecessarily the longest.

  • As you start learning the core principles of the language, listen hard for hints and clues from its culture. Ruby has block syntax, but a rubyist often cares more about naming than blocks vs. Procs. Python has list comprehensions, but a pythonista often cares more about being able to quickly uncover all the working parts than to have a slick but magical-looking API. JavaScript has objects and polymorphism, but listen to a good JavaScript programmer and you’ll find them more interested in the functions themselves–and their prototypes.
  • Be ready to try totally new ways of thinking. Be ready to abandon bottom-up provability for Ruby’s top-down “programming by wishful thinking” approach. Be ready to trade off a more efficient algorithm in Python for one that is more readable and maintainable. Be alert to the pains you’ll feel trying to write object-oriented code in JavaScript–that’s JavaScript’s prototype system refusing to be hammered completely into an OOP-based inheritance model.
  • Learn where the minefields are. Ruby is a memory hog. Python’s primitives aren’t actually objects. All numbers in JavaScript are floating-point numbers–there are no integers.
  • Learn which minefields you can ignore or work around. Entire models of webservice design have been rethought and reinvented to compensate for Ruby’s apache-unfriendly execution model. Python isn’t THAT slow to begin with, but if you really need bare-metal speed it’s easy to write a C extension. Many JavaScript libraries provide “polyfills”–bits of code for old versions of JavaScript that implement features added to newer versions of the language so that you can write code against a stable JavaScript version and still have a prayer of it working in most browsers.
  • Most importantly, learn which minefields you CAN’T ignore. Treat them like minefields that you must commute through daily. Stop and pay close attention. Map them out carefully. For example, metaprogramming in Ruby is a very dangerous feature, but it’s not a defect; it can be used responsibly. Python’s whitespace enforcement makes it difficult to express certain ideas succinctly, but that whitespace enforcement produces a predictable rhythm to the trained pythonista’s eye, and it is most definitely a feature–don’t let your code fight it; learn to restructure your thinking. And while almost everybody considers JavaScript’s semicolon insertion rules to be eccentric bordering on insane, you MUST learn them if you want to avoid having your program suddenly stop working just because you deleted a comment or swapped the load order of two completely unrelated files.

You can make good inroads to these blind spots in a few days or weeks if you’re just aware of them. So here’s the hard part: Last of all, learn the idioms. From here on out it’s all about learning to think in the language. That’s going to take you a bit longer, and there’s nothing for it but to talk to other programmers, find some online references, maybe buy a cookbook… but mostly, it’s going to take writing code. Lots and lots of code. In my experience (both personal and observing coworkers and clients) this last part’s a doozy–plan on it taking a couple years or more.

Whaaat. You did say you really wanted to learn this language, didn’t you?

Private Accessors in Ruby

This is a post about writing private method accessors in Ruby, but it takes me a while to get around to saying that because there’s a ton of backstory, which is of course my favorite part. If you want to skip ahead to the code, I have provided a means of conveyance compatible with your hypermedia information consumption device. (Be warned: you’ll miss the part about the monkey.)

The Principle of Least Access

So there’s this great idea called the Principle of Least Access. It is found in computer science, information security, and other fields. It’s more frequently called the Principle of Least Privilege, but the names are interchangeable and when I learned it POLA was an acronym you could pronounce.

Yeah, that’s exactly the kind of motivation that I would choose to make a design decision over. There’s a reason POODR is my favorite book of all time, and “pronounceable acronym” is reason number two. You can guess what reason number one is. (Hint: It’s also number two).

Anyway, POLA can be summed up nicely as

A thing must be able to access only the information and resources that are necessary for its legitimate purpose.

In programming, this rule is often misquoted as simply “make everything private until something else needs it”. This isn’t a bad rule of thumb, but it’s not a great one, and at first blush it really puts private methods at odds with testing code.

But Testing Private Methods is Haaaaaard!

My relationship with POLA has been rocky. I used to think it was scripture. Then I thought it was idiotic. Then I came to ruby where everything is either public or can be accessed anyway. We even had a splendid bout of drama in the ruby community over whether or not you should test private methods. I fought bravely on the side of “yes you should”, even going so far as to write some code with a humorous name to make it easier to test those private methods.

The two tenets of my argument were that

  1. You should test things that can break. If you’re going to put really complicated logic in private methods in your code, then they can break and you should test them, and
  2. Testing private methods through the public interface adds impedance to the testing process, and results in low-value tests that are very hard to construct.

But you know what’s really funny? When I really got into it with a bunch of people, even the people furiously certain that their stance on testing private methods was correct, I found out something crying out loud funny:

Ruby programmers don’t write private methods.

Test ‘Em? Let’s Not Even Write ‘Em

For the most part, we just… don’t. We know that a determined user can get at our guts, so we make no pretense of security or safety in marking something protected or private. If we’re having a good design day, we extract the private functionality to a class. If we’re having a bad design day, we shrug and say “Eh” and just leave everything public. The important thing is, when we came in from C++ or Java, we had to check at the door this notion that our encapsulation would protect us, and for most of us that was the whole point of private methods to begin with.

I watched Sandi Metz talk about creating high-value tests a while ago, and she talks about private methods–and about testing them. Her take was that private methods are a good place to put code that is uncertain or subject to change on a whim; that way no external code would be affected. Since the code is so volatile, any tests that touch it will break often, so if the purpose of a test is to provide lasting reassurance, it doesn’t make sense to test them.

One of the reason I love Sandi so much is that she gets the concept that everything is a tradeoff. When she was on Ruby Rogues I asked her about the case where I sometimes will use tests to figure out complicated code. I’ll inch forward bit by bit, and lock each bit down with a short-term test. She considered this for a long moment, and then announced her new rule for testing private methods:

“Don’t test private methods unless you really need to. In which case, test them as much as you want.” –Sandi Metz

And THAT, dear readers, is how I found myself poring through POODR and pondering the fact that keeping everything public is a very, very bad idea.

Making Everything Public Is Killing Us

Every public method, variable and accessor on a class is a message that can be fired at that object from anywhere in the system. I know this will seem strange to some rubyists, but we really have to get away from this notion of making everything public–or, at least, making things public that other objects have no business knowing about. Let’s quote Sandi again:

Each [object in a hard-to-maintain system] exposes too much of itself and knows too much about its neighbors. This excess knowledge results in objects that are finely, explicitly, and disastrously tuned to do only the things that they do right now…

The roots of this new problem lie not in what each class does but in what it reveals. –Sandi Metz (emphasis hers)

The more I read the more I realize that each object has a sort of “surface area”, which is its public interface, where it can be interacted with. I’m used to referring to applications with too many dependencies as “monolithic”, literally meaning “one piece of stone”. This word makes me think of how all the dependencies in the system are interlocked and hard (as stone) to break apart. But Sandi has an even better metaphor: She draws this little picture of all these objects, and then she connects them with a kitten’s ball of yarn. Instead of talking about interlocking dependencies, she starts talking about the messages going back and forth, shooting all around the system. Viewing the system as an uncontrolled net of messages, Sandi pulls out an even better word: a woven mat.

So… okay. I’m convinced. It’s time to reduce the surface area of my classes. It’s time to start writing private and protected methods. More importantly, it’s time to start writing private accessors.

Okay, But There’s a Secret Reason I’ve Hated POLA All These Years

There’s a cost to the Principle of Least Access, however. It is simply this: if you start out making everything private, and then find out you need to make something public, there’s a refactoring cost. It’s not big, but it’s there, and it’s one of the kinds of hassle that I am very finely tuned to sniff out. I have literally spent 5 years learning how to design applications to not need private methods rather than writing them.

It’s not that you can’t do it. Here’s what it looks like in ruby:

class Monkey
  private
  attr_accessor :poo_count
  public

  def need_to_reload?
    poo_count < 1
  end

  def throw! victim
    self.poo_count -= 1
    victim.horrify
    onlookers.each {|onlooker| onlooker.amuse! }
    reload! if need_to_reload?
  end
end

Look at those first three lines inside Monkey. YUCK! Either we need to change ruby itself, or Sandi’s book needs to stop being so awesome.

(Admit it, I had you at “change ruby itself”, didn’t I.)

Adding Private Accessors to Ruby

As I write this blog post, I am dumbfounded that nobody else has done this already. What I’m about to show you is so obvious that I am convinced that it is proof that we as rubyists just don’t care much for private methods and accessors. Because this turns out to not be so hard at all:

module PrivateAttrAccessor
  def private_attr_accessor(*names)
    private
    attr_accessor *names
  end
end

class Monkey
  extend PrivateAttrAccessor
  private_attr_accessor :poo_count

  # ... rest of code unchanged
end

Also, if we’re willing to commit to using private and protected accessors, then we can move that call to extend PrivateAttrAccessor up into the Object class and now Monkey just looks like this:

class Monkey
  private_attr_accessor :poo_count

  # ... rest of code still hilarious
end

BAM. There you go. No, wait… HERE you go. That’s the full set of scoped accessors (protected_attr_reader, private_attr_writer, etc) ready to go. The file is 20 lines of documentation, 38 lines of code, and about 290 lines of test code. This feels like about the right ratio of code to test for code that is designed to be stabbed right into the very heart of ruby’s object system. If you are running ruby 2 (and if not, what is wrong with you?) you can just run the file and MiniTest will autorun and test the module. If you are running an out-of-date Ruby, such as 1.9.3, you can still run the file, just make sure you gem install minitest first.

I only had one real concern as I wrote that module: Inside the class, I am declaring everything in public scope, but inside the private accessor method, I tell ruby to use private scope. Would ruby stay stuck in private scope when it returned, or would the scope reset back to public?

Answer: It resets!

Sorry, sorry, that was pretty anticlimactic. I should have said something super profound about ruby’s access scoping and how it interacts with ruby’s lexical scope handling, but the short answer is I have no clue and the even better answer is I don’t need to have one. Rather than pick the ruby source code apart, I just wrote some unit tests specifically to ensure that the scope remains unchanged, and it does.

Give those scoped accessors a whirl and let me know what you think. If you just type ruby scoped_attr_accessor.rb minitest will execute its test suite. Or you can require the file in your project and it will quietly patch Object and you’re all set.

I haven’t touched the deep-downs of ruby with a monkeypatch in literally hours, so I’m unsure if this idea is awesome or just terrible. What do you think? Should I publish this as a gem or should I delete it, burn my laptop, and exile myself to Tibet? Update: I chose the non-Tibetan-exile option. Type gem install scoped_attr_accessor or get the source code here.

Start Small. Start Growable.

Languages Should Be Growable

One of the fun things in Computer Science is finding new and mind-blowing stuff that turns out to be 15 years or more old.

I just watched Guy Steele’s 1998 OOPSLA talk, Growing a Language. If you haven’t watched it, go watch the first ten minutes and you’ll be hooked for the rest of the talk. I don’t want to give anything away but there’s a huge bomb that he drops in the first ten minutes that not only kept me riveted for the rest of the talk, but then made me rewind it and watch it again. (Remember how you watched Sixth Sense, and the bomb gets dropped at the end and you had to watch it again? And then the director explained his use of the color red in the movie, and you had to go watch it a THIRD time to see that the bomb was constructed right there in front of your face the whole time? Yeah, Guy’s talk is like that.)

In his talk, Guy discusses whether a language should be large or small; a large language lets you say many things easily but requires that you and your listener both learn many words before you can say anything. A small language lets you both start talking immediately, but requires that you spend a lot of time creating new words before you can say anything interesting.

I won’t tell you whether Guy thinks you should create a large language or a small one–in fact, Guy won’t tell you either. But he does make it clear that a good small language must be growable, and a good large language must be both well-groomed and still growable. He even goes so far as to say that most good large languages are the ones that started small and grew over time, with good cultivation.

Bless his misguided heart, he then says that Java is a good language. I have to point out that this establishes his crazy-person street cred right there, but in a way, he’s right: Java DID start small enough to be easily understood. It grew slowly, and with careful curation. This was in 1998, and Guy then goes on to point out that Java is fundamentally broken and ungrowable unless they add some growth mechanisms to the language, such as generics and operator overloading. Most Java programmers today think of these as “having always been there” in the language, and they’re probably part of the reason Java is not only still around, but a dominant language in the industry today.

Applications Should Start Small… and be Growable

So I’m working on an new app right now, and I want to do some good OO design up front to ensure that the app looks and works well. But I’m stuck trying to figure out where to start. Funnily enough, I opened Sandi Metz’ book, POODR (Practical Object-Oriented Design In Ruby) for some guidance, and I found this astonishing guidance right there at the top of chapter two:

“What are your classes? How many should you have? Every decision seems both permanent and fraught with peril. Fear not. At this stage your first obligation is to take a deep breath and insist that it be simple. Your goal is to model your application, using classes, such that it does what it is supposed to do right now and is also easy to change later.”

Sounds familiar, doesn’t it?

Libraries Should Be Growable… or Well-Grown

I’m a fan of RSpec. If you’ll permit me stretching Guy’s language metaphor, it’s a big language for testing, with many words. I can test very complicated ideas without extending the language, and someone who has learned RSpec’s language can read my complicated ideas without learning new words.

MiniTest is a very tiny testing language. It has only a few words. As a programmer not used to constructing new words in my testing language, I initially found MiniTest to be insufferably repetitious. Each test used ten words or so, and nine of them were identical in all of my tests. When presented with this frustration, Ryan Davis shrugged with annoyance and snapped “so write an abstraction method!” It wasn’t until I watched Ryan write tests at MountainWest RubyConf this year that I realized that he does this all the time. This means that a) he was not kidding b) he was not being dismissive and c) that adding words to minitest’s language is in fact exactly how MiniTest expects to be used.

Interestingly, while I think RSpec’s large language is elegant and well-curated, many programmers feel that RSpec has grown in the wrong direction, or has at least become buried by overgrowth. Ryan felt that even Test::Unit had too much cruft in it, let alone RSpec, so rather than prune the language back, he started fresh, started small, and most importantly, started growable.

When Ryan spoke at MWRC, he created a new testing word that I felt did not make much sense. Even watching him define it I thought “Okay, I understand the abstraction, but that word is horrible. It doesn’t communicate what the word does at all.” That’s the drawback to small languages: naming things is hard, and small languages require you to name things from the start. Had I been pairing with him we’d have had a splendid argument about the name of the abstraction method he wrote. But that sort of fits into Guy’s logic as well: growth should be carefully curated. As you grow, you’ll create new words, and those words should be easy to learn and understand or you can’t communicate well.

Growth Should Be Curated

I’m gritting my teeth as I type this, but I have to own up to it: The growth of Java has been well curated. C# has also been well-groomed, even if I think the language designers have carefully and consistently solved all the wrong problems with the language.

I think PHP is probably the poster child for bad growth*. That’s in addition to its internal syntax inconsistencies; I’m just talking about the language’s internal methods. For a quick example, see how many different clumps of consistency you can find just in the string functions. For a longer example, read @eevee‘s rant, PHP: A Fractal of Bad Design. (TL;DR? Fine, just click on the rant but don’t read it–just scroll down to see HOW LONG it is.) PHP’s problem is twofold: they added things inconsistently, and they were unwilling to prune things back out of the core once they had been added. PHP has grown much faster than Java and C#, because the maintainers were willing to make mistakes rather than deliberate for years in committee, but like Java and C#, PHP hasn’t gone back and fixed mistakes once made.

In my opinion, Ruby sort of gets a B+ on growth curation. A lot of words are unnecessary synonyms (count and size and length, for example) while other words are occasionally synonyms that suddenly change meaning when you’re not expecting them to (Array#count and ActiveRecord::Base#count, for example). Some things in the language are pretty bizarre edge cases (The Kernel#test method, for example) and in the new Ruby 2 release one major feature (Refinements) was brought in under strident protest. But by and large the methods across the entire language are consistent with each other, and when they vary it is usually to be consistent with some external protocol that they are modeling. Ruby 2 was willing to break backward compatibility in order to fix mistakes and grow sufficiently. I also cut Ruby some slack because it’s growing extremely fast in comparison to other languages, and having the breaking changes in Ruby 1.9 for 3 years before committing to Ruby 2 let the community keep pace.

So Grow, But Grow Carefully

And that’s sort of my whole point here: Size and growth are key tenets at every level of abstraction: for a single application, a broadly applicable test suite, or an entire language, two rules I’m drilling into my head right now are

  1. Start Small. I can’t say “Always Start Small” because sometimes the problem you need to solve is big. But it is fair to say “Always Start As Small As Possible”.

  2. Start Growable. This one IS gospel for me. However big I choose to start, I think growability is essential for success.

  3. (Bonus Rule) Curate your growth, but don’t be so afraid of growing wrong that you become afraid to grow. Be willing to grow and then weed.

And if all else fails, start over with a new small thing and start growing again.


* Note: Every time I kick PHP’s tires I get hate mail from offended PHP programmers who assume I’ve never used the language. The fact that PHP programmers are the only group worse than rubyists when it comes to language fanboyism is a topic for another day, but for now, let me just say $$dispatcher->$$method. If you’ve never written your own framework in PHP, that little snippet of code means I am better at PHP than you. I have pushed out, much like an agonizing stool the day after an all-night taco binge, a little over a million lines of PHP code. It is the foul tongue of Mordor, but I have earned the right to dislike it purely on its lack of merit. If you like it, that’s fine. It’s your choice. I’m not saying PHP doesn’t have its good points or that I can’t write clean code in it. I’m just saying it’s not worth it to me.

Running SimpleCov when COVERAGE=true

I use this trick every time I start a new project, which is just often enough to go look it up but not often enough to commit it to memory, so here it is in blog form:

spec_helper.rb
if ENV['COVERAGE']
    require 'simplecov'
    SimpleCov.start 'rails'
end

Of course if you’re writing a non-rails app omit the ‘rails’ argument. If you’re not using RSpec, put it in test/minitest_helper.rb. (And, of course, if you’re using Test::Unit, stop using Test::Unit and switch to MiniTest!)

Now when you run your specs or tests as normal with ‘rspec spec’ or ‘rake test’ your regular tests will run normally. To run them and get a coverage graph, however, just run

COVERAGE=true rspec spec

or

COVERAGE=true rake test

And there you go. If memory serves me correctly, this trick is credit due to Jake Mallory, from a project we worked on together last year.

Software and Chicken Entrails

I had a collection of epiphanies today about Informational Software.

“Informational Software” is a term I use to describe software that helps you understand and make decisions about information. It is not a product and does not make your business money, but it can be used to help you understand your business and therefore, in theory, help you make money. For example, analytics software does not make you money, but you can use it to understand your traffic and hopefully to then minimize your costs and maximize your revenues. (Note that if you are selling an analytics package, this definition is still true, because in that case the software is a product, not an information tool*.)

Informational Software comes in two types: software that interprets trends and data and makes decisions for the user, and software that collects and reveals data so the user can make their own decisions. Expert systems are an example of the former, analytics packages are an example of the latter.

Let’s call this mess of data the chicken entrails. We want to read them and predict the future, right? Sure we do. That’s what chicken entrails are for.

Okay, enough definition. Here are the epiphanies:

First: If your users are untrained and the data is simple, your system can advise the user and/or make decisions for them. If they cannot read the entrails, or the entrails are too simple to bother the entrail readers, do not let/make them read the entrails.

Second: (Pay attention, this is the important bit) If your users are highly trained and the data is very complex, do not attempt to interpret the entrails for them. Just show them the entrails. Highly trained users who have asked you for information software do not want you to do their job, they want a tool to help them do their job better.

Third: My instinct is always write software that interprets entrails, regardless of the complexity and regardless of user knowledge. Learn to stop and figure out what it really needed.

Fourth: Interpreting complex data is really, really hard. On a small, knock-it-out project, it is almost certainly doomed to fail. Now reread the 2nd epiphany: On small, knock-em-out projects, it is completely unnecessary.

Right now I’m working on software that reveals entrails to some truly arcane masters of entrail reading. They have become masters because the information system currently available to them is literally designed to protect the data from their eyes. I have found two pieces of data, correlated them, and put them into a report, and they think I’m a super genius. Not because my software is smart, but because it is smart enough to be dumb.

I really like epiphanies where I suddenly realize that it is not necessary to be doomed to failure.

* And if you’re smart enough to reason “but what if you’re using your analytics tool to analyze the sales of your analytics tool”, I congratulate you on your cleverness. Can you also see the flaw in this reasoning?