What makes code readable?

I’ve been looking at some code I find pretty difficult to follow lately. Lots of reasons, not least of which is my inexperience. But I think there are other reasons, too.

In the process of remembering & unlearning Clipper syntax and replacing it with SQL, I ran across this demo of aliases on the W3Schools site.

Assume we have a table called “Persons” and another table called “Product_Orders”. We will give the table aliases of “p” and “po” respectively.

Now we want to list all the orders that “Ola Hansen” is responsible for.

We use the following SELECT statement:

SELECT po.OrderID, p.LastName, p.FirstName
FROM Persons AS p,
Product_Orders AS po
WHERE p.LastName='Hansen' AND p.FirstName='Ola'

The same SELECT statement without aliases:

SELECT Product_Orders.OrderID, Persons.LastName, Persons.FirstName
FROM Persons,
Product_Orders
WHERE Persons.LastName='Hansen' AND Persons.FirstName='Ola'

As you’ll see from the two SELECT statements above; aliases can make queries easier to both write and to read.

This surprised me. Do y’all find the second easier to read?

14 thoughts on “What makes code readable?”

  1. IndividualRich

    I disagree. I don’t see the alias making the query at all easier to read or write; I see it as a way to shorten the amount of stuff you need to type. Btw, you might consider that the lack of any join-like criteria is going to cause your query to return the wrong thing.
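A minimal sketch of the missing-join problem, using Python's built-in sqlite3 module. The `P_Id` linking column and the sample rows are assumptions made for illustration; the original demo never shows the table definitions.

```python
import sqlite3

# Hypothetical schema and data: P_Id as the linking column is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (P_Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    CREATE TABLE Product_Orders (OrderID INTEGER PRIMARY KEY, P_Id INTEGER);
    INSERT INTO Persons VALUES (1, 'Hansen', 'Ola'), (2, 'Svendson', 'Tove');
    INSERT INTO Product_Orders VALUES (101, 1), (102, 2), (103, 2);
""")

# The query as posted: with no join condition, every order is paired with Ola.
without_join = conn.execute("""
    SELECT po.OrderID, p.LastName, p.FirstName
    FROM Persons AS p, Product_Orders AS po
    WHERE p.LastName = 'Hansen' AND p.FirstName = 'Ola'
    ORDER BY po.OrderID
""").fetchall()

# With an explicit join condition, only Ola's own order comes back.
with_join = conn.execute("""
    SELECT po.OrderID, p.LastName, p.FirstName
    FROM Persons AS p
    INNER JOIN Product_Orders AS po ON po.P_Id = p.P_Id
    WHERE p.LastName = 'Hansen' AND p.FirstName = 'Ola'
    ORDER BY po.OrderID
""").fetchall()

print(len(without_join))  # 3
print(with_join)          # [(101, 'Hansen', 'Ola')]
```

In this made-up data set, two of the three rows returned by the original query are orders that belong to someone else.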

    1. Angela Post author

      Yeah, I’m with you, Rich. I find the first way more readable. Thanks for the comment.

  2. Jeff L.

    Certainly debatable.

    In the first example (using aliases), it’s darn obvious which tables “order id,” “last name,” and “first name” come from. It’s also not that relevant with respect to the ultimate “output” of this code, which is a table with column headers. So the aliases are a bit of “information hiding” (and still available to you; all you need to do is simply scan down or up a line or two).

    Ultimately, in this example, as much as I like descriptive names, the full table name on each column reference is simply clutter here.

    Part of it also depends on my domain knowledge–if people are already comfortable with what’s in the domain, then the scoping detail doesn’t need to be so immediate. I’m already comfortable enough with the domain at this point after 5-second-familiarization. “po” is also a fairly common abbreviation, so that helps too.

    Fully qualifying all the variables is kind of like variable names in code where people insist on qualifying them with the class name:
    class Employee {
    private var employeeFirstName:String;
    private var employeeLastName:String;
    }

    Another argument for the aliases:
    Ott came up with the “Really Meaningful Names” Agile in a Flash card; the last bullet is “Match name length to scope.” The scope here is tiny, all the information is still available, so I’m ok with it.

    But I’m not thrilled with it. The clutter is really not that bad in this example.

    Then again, I’ve also seen plenty of SQL where things weren’t so obvious, the statements were much larger, and it was confusing what was what and what came from where. I shouldn’t have to hunt all over the place (e.g. 40 lines of SQL) to remind myself that “p” means “person” and “po” means “product owner.” Oh wait, it was “purchase order,” wasn’t it? (made you look!)

    Seems like a judgment call to me. I’d err on the side of spelling things out and clarity, but once clutter gets out of hand, I’ll happily use an alias as long as it doesn’t detract much from the immediacy of understanding.

    The nice part is that with a real IDE, it’s easy to do a safe rename from one form to the other.

  3. Anthony Testi

    May I suggest:

    SELECT
      Product_Orders.OrderID,
      Persons.LastName,
      Persons.FirstName
    FROM
      Persons,
      Product_Orders
    WHERE
      Persons.LastName = 'Hansen'
      AND Persons.FirstName = 'Ola'

    This is normally how I format my SQL. I find one field, one WHERE clause, etc. per line of code the cleanest. But everyone has their own style.

  4. Anthony Testi

    BTW, I am neutral on the use of aliases. Typically I do not use them, but IMO they are fine to use.

    I find the ‘idea’ of each line having one ‘element’ of the SQL best as it allows for easy changes to SQL, reading etc (at least for me.) I use tools that auto-generate SQL and I then take a little time to hand format it.

  5. Angela Post author

    Thinking about this, I’m noticing that when I’m reading for meaning, anything that’s not recognizable (by me) as a word slows me down. I need to have a meaning pop into my mind for each thing in order to have my thoughts flow when coding.

    Names — like c, nm, mdbr, cor — get in my way unless I have them in my personal vocabulary. I can stretch my vocabulary, but in cases where I’m never going to run into something again, it’s maybe more important that it not be shortened into something I can’t readily interpret.

  6. Evan Cofsky

    1. Writing code as a description instead of a list of tasks. 2. Defining terms simply in code. 3. Choosing clear names for things when 1 and 2 fail.

    1. Angela Post author

      It’s like you’re waving flan-covered cheesecake under my nose, and all I can have is salad.

  7. Eric Weilnau

    I would write this as follows.

    SELECT [Product_Orders].[OrderID]
    , [Persons].[LastName]
    , [Persons].[FirstName]
    FROM [Persons]
    INNER JOIN [Product_Orders]
    ON …
    WHERE [Persons].[LastName] = 'Hansen'
    AND [Persons].[FirstName] = 'Ola';

    The main reason that I include the delimiters and do not use aliases is to make it easier to find tables and columns in scripts in source control using a simple text search. I also prefer to not use abbreviations in table and column names so that it is easier to remember what the names are.
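A rough sketch of the text-search point, written in Python. The two strings are the queries from the post; the search term and the variable names are picked here only for illustration.

```python
# The two queries from the post, stored as plain strings the way they
# would sit in a script under source control.
aliased = (
    "SELECT po.OrderID, p.LastName, p.FirstName "
    "FROM Persons AS p, Product_Orders AS po "
    "WHERE p.LastName='Hansen' AND p.FirstName='Ola'"
)
qualified = (
    "SELECT Product_Orders.OrderID, Persons.LastName, Persons.FirstName "
    "FROM Persons, Product_Orders "
    "WHERE Persons.LastName='Hansen' AND Persons.FirstName='Ola'"
)

# A simple substring search for a table-qualified column name.
needle = "Persons.LastName"
found_in_aliased = needle in aliased
found_in_qualified = needle in qualified

print(found_in_aliased)    # False
print(found_in_qualified)  # True
```

Searching for the bare table name still finds both queries, since each one names `Persons` in its FROM clause; it is the table-qualified column search that the alias defeats.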

  8. Tim Ottinger

    The second is more readable, because I don’t have to hold the alias in my head (or refer back to it) to read the code.

    Time to type is orthogonal to time to read. This is why we write less readable code and have to edit it. It is much easier to type “p” and “po” but the names are rather similar and neither is evocative.

    The problem is greater if you get to a long query or are reading many of them in a row. What if person is sometimes p and sometimes person and sometimes when self-joining it is ‘ps’ and ‘po’ (person:self, person:other), or sometimes purchase order is ‘p’.

    I use the heck out of aliases when there is a self-join in a query, but pretty much avoid them because I know they only save typing time, not reading time. I type fast, so reading is more precious to me.

    The more troubling effect, though, is that grep for ‘person’ or ‘product owner’ is less potent, and searching for ‘person.id’ is not going to be useful. Greppability counts for queries because they are entered in strings and don’t respond to rename refactorings.
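For the self-join case, where aliases are not optional, here is a small sketch using Python's sqlite3 module. The `ManagerId` column and the sample rows are hypothetical, and the aliases are spelled out as `employee` and `manager` rather than two-letter names, as one way of keeping them evocative.

```python
import sqlite3

# Hypothetical table: ManagerId pointing back at Persons is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (P_Id INTEGER PRIMARY KEY, FirstName TEXT, ManagerId INTEGER);
    INSERT INTO Persons VALUES (1, 'Ola', NULL), (2, 'Tove', 1), (3, 'Kari', 1);
""")

# The same table appears twice, so each occurrence needs its own alias.
rows = conn.execute("""
    SELECT employee.FirstName, manager.FirstName
    FROM Persons AS employee
    INNER JOIN Persons AS manager ON employee.ManagerId = manager.P_Id
    ORDER BY employee.P_Id
""").fetchall()

print(rows)  # [('Tove', 'Ola'), ('Kari', 'Ola')]
```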

  9. Dave

    I find the second easier to read. I’ve seen a lot of really ugly and lengthy SQL with way too many non-intuitive aliases. I try to shy away from aliases that don’t provide long term value and that don’t control noise.

    I understand that aliases can sometimes speed up development – no one wants to type ‘Product_Orders’ over and over again and it’s much quicker to type ‘po’. The problems I’ve seen come from taking this quick approach without realizing that over time this code will change. Extra SQL gets tacked on, other tables get joined in, and few change the old ‘po’ to clarify the growing SQL statement. Most of the time they follow the ‘convention’ they stumbled across and add in their changes, tossing in a few more 2 letter aliases. The short term value is soon lost as reading the SQL becomes extremely difficult.

    I do find value, however, in the ability to control noise carefully with an alias. In the above case I personally don’t find much noise in ‘Product_Orders’. The table name seems clear so I would personally leave it as is and opt for the second example. Sometimes though, you’ll stumble across some very unclear naming conventions. Let’s pretend that instead of Product_Orders the table was named XXX_YY_PRODUCT_ORDERS (replace XXX_YY with some other legacy naming convention that was lost in some documentation that never got read…). When digging a little deeper you learn that every table you’re working with is prefixed with XXX_YY. In that case, I’d alias away the XXX_YY, removing the noise and trying to make the SQL much more readable.

    Just my two cents and I hope it made sense ;)
    ++dave;
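One way the prefix case above might look, sketched with Python's sqlite3 module. `XXX_YY_PERSONS`, the `P_Id` column, and the sample rows are all assumptions for illustration; only `XXX_YY_PRODUCT_ORDERS` comes from the comment.

```python
import sqlite3

# Hypothetical legacy-prefixed tables; P_Id as the linking column is assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE XXX_YY_PERSONS (P_Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    CREATE TABLE XXX_YY_PRODUCT_ORDERS (OrderID INTEGER PRIMARY KEY, P_Id INTEGER);
    INSERT INTO XXX_YY_PERSONS VALUES (1, 'Hansen', 'Ola');
    INSERT INTO XXX_YY_PRODUCT_ORDERS VALUES (101, 1);
""")

# The aliases drop only the noisy prefix, so the names still read as words.
rows = conn.execute("""
    SELECT Product_Orders.OrderID, Persons.LastName, Persons.FirstName
    FROM XXX_YY_PERSONS AS Persons
    INNER JOIN XXX_YY_PRODUCT_ORDERS AS Product_Orders
        ON Product_Orders.P_Id = Persons.P_Id
    WHERE Persons.LastName = 'Hansen' AND Persons.FirstName = 'Ola'
""").fetchall()

print(rows)  # [(101, 'Hansen', 'Ola')]
```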

  10. Randy A MacDonald

    Not enough information to determine readability.
    Lacking more detail, string length is inversely proportional to readability, so I’d lean towards the p. po. of the first.

    At some level, p is no less meaningful than Product, and auto-complete can help prevent use of the wrong column.

    Mind you, p is not a real-world name (neither is Product for that matter) so some sort of definition sentence (p←DurableGoods.MainWareHouse.Products) might also be in order.

  11. rdm

    Readability has, I think, something to do with familiarity.

    Personally, because of SQL’s structure, I rely heavily on formatting (and am tremendously wasteful of whitespace). I doubt that many other people would be comfortable with SQL the way I prefer to format it, and this discomfort could easily be interpreted as a lack of readability.

    That said, I do use table aliases, because it’s easier to verify exact match on a short name than on a long name. When I am working with a dozen tables, and some of them have names which are minor variations on each other, this can be a significant time saver.

    So, here is how I would express the select statement:

    SELECT
    po.OrderID
    , p.LastName
    , p.FirstName
    FROM
    Persons as p
    , Product_Orders as po
    WHERE
    p.LastName='Hansen'
    AND p.FirstName='Ola'

    (I am going to hope, since there’s no preview here, that wrapping that select statement in the code tag will preserve leading indentation. Without that indentation, you would not be seeing how I work with SQL.)

  12. rdm

    oh well… if you really want to see what I wrote, you can view source and search for the timestamp on my message and then stick my select statement in an editor and remove the line break tags…
