How to Stay Classy With Ruby Variables

Ruby provides a number of options for non-instance-specific variables – class variables (of the form: @@var), constants (in all caps: VAR), and class instance variables (@var).  Which one to use depends on the use case, and to some degree on personal preference.  Let’s explore this a bit.

Constants are meant to be – well, constant.  This is not technically enforced in Ruby; if you redefine a constant during program execution, it will display a warning, but not actually raise an error.  However, the semantic idea of a constant is that it should be defined once and not touched again. Variables, on the other hand, are variable.  They record a particular state that is likely to be redefined at some future point in time.
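Here is a minimal sketch of that behavior. The constant name MAX_VISITORS is made up for illustration and is not part of the zoo code. The second assignment goes through, and Ruby only prints an "already initialized constant" warning to standard error.

```ruby
# MAX_VISITORS is a hypothetical constant used only for illustration.
MAX_VISITORS = 100

# Reassigning prints "warning: already initialized constant MAX_VISITORS"
# to standard error, but execution continues.
MAX_VISITORS = 150

puts MAX_VISITORS # prints 150
```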

Now let’s examine some use cases and see where things start to get tricky.  To make things fun, let’s go to the zoo! In this case, I have an Animal class, which will be at the top of the hierarchy and have animal classes that descend from it.  Here are my requirements:

1) Since most of my animals are quadrupeds, I decide it makes sense for the class’s instances to default to 4 legs, and any subclass (let’s say Octopus) with a different number of legs should change the default.

2) Both the Animal class and the individual animal-type classes should keep track of every animal of their type that exists, so I can check Animal.all or Octopus.all.

Let’s get to work.  Following Sandi Metz’s recommendation, we’re going to build an Animal parent class with a post-initialization hook.  Hence, the Animal class’s initialize method will append the new item to an animals array, and then call an after_initialize method which will be accessible to the child classes.  We’ll start with just 2 animal types, octopus and llama:


class Animal
  @@animals = []
  @@legs = 4

  def initialize(args={})
    @@animals << self
    after_initialize(args)
  end

  # Hook for subclasses to override
  def after_initialize(args)
  end

  def legs
    @@legs
  end

  def self.all
    @@animals
  end
end

class Octopus < Animal
  @@octopi = []
  @@legs = 8

  def after_initialize(args={})
    @@octopi << self
  end

  def self.all
    @@octopi
  end
end

class Llama < Animal
  @@llamas = []

  def after_initialize(args={})
    @@llamas << self
  end

  def self.all
    @@llamas
  end
end

Llama.new.legs # => 8

Hmmmmmm, not exactly what we wanted.  How did we end up with an 8-legged llama?

[Image: 8-legged llama]

Well, it turns out that Ruby class variables don’t play very nicely with inheritance.  Essentially, they are shared by any class in an inheritance chain.  So when we defined @@legs in Octopus, it changed @@legs in Animal and, by extension, @@legs in Llama.  (Technically, if Animal doesn’t define a class variable, Llama and Octopus won’t share that variable.  But going down that path is just begging for trouble, because you never know when someone down the road will add @@legs to Animal and open up a huge can of worms.)  I have heard this described as “leaky inheritance,” though I have yet to see it in writing.
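To see where the shared variable actually lives, here is a small sketch. Vehicle and Bicycle are hypothetical classes used only for illustration. Module#class_variables(false) lists only the class variables stored on that class itself, ignoring inherited ones.

```ruby
# Vehicle and Bicycle are hypothetical classes used only for illustration.
class Vehicle
  @@wheels = 4

  def self.wheels
    @@wheels
  end
end

class Bicycle < Vehicle
  # This does not create a new variable; it reassigns Vehicle's @@wheels.
  @@wheels = 2
end

Vehicle.wheels                 # => 2
Vehicle.class_variables(false) # => [:@@wheels]
Bicycle.class_variables(false) # => []
```

In other words, the subclass never gets a @@wheels of its own. There is only one variable, and it belongs to the parent.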

Class variables, it seems, are really best for situations where you want every member of an inheritance hierarchy to access the same variable.  That might be useful for configuration.  For example, let’s say each animal has a speak method which it defines, and it can speak verbosely or concisely (for a Dog, “WOOF!  WOOF WOOF WOOF!” vs “WOOF!”).  Perhaps we want to change one setting in Animal and have that apply to all animals.  In that case, we would do something like this (irrelevant code removed for now):


class Animal
  @@config = {}

  def self.config
    @@config
  end

  def speak
    if self.class.config[:verbose] == true
      verbose_speech
    else
      brief_speech
    end
  end

  def verbose_speech
    ''
  end

  def brief_speech
    ''
  end
end

class Dog < Animal
  def verbose_speech
    "WOOF! WOOF WOOF WOOF!"
  end

  def brief_speech
    "WOOF!"
  end
end

Dog.new.speak # => "WOOF!"
Animal.config[:verbose] = true
Dog.new.speak # => "WOOF! WOOF WOOF WOOF!"

So that works great.  But we need to do something about @@legs.  So here’s the next option, which works well.  Let’s change @@legs to a constant, LEGS:


# IRRELEVANT CODE FOR THIS EXAMPLE HAS BEEN REMOVED
class Animal
  LEGS = 4

  def legs
    self.class::LEGS
  end
end

class Octopus < Animal
  LEGS = 8
end

class Llama < Animal
end

Octopus.new.legs # => 8
Llama.new.legs # => 4

Note how we now access LEGS as self.class::LEGS.  This is critical.  Ruby resolves a bare constant name lexically, so if we accessed it as LEGS without adding self.class::, we would always be referencing the LEGS constant in the scope where the method was defined, i.e. Animal.  Instead, we tell the method to look up LEGS starting from the class of the current instance.
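A quick side-by-side sketch of the two lookups. Creature and Spider are hypothetical classes, and the method names lexical_legs and dynamic_legs are made up for this comparison.

```ruby
# Creature and Spider are hypothetical classes used only for illustration.
class Creature
  LEGS = 4

  # Bare constant: resolved lexically, so this is always Creature::LEGS.
  def lexical_legs
    LEGS
  end

  # Resolved through the receiver's class, so subclasses can override it.
  def dynamic_legs
    self.class::LEGS
  end
end

class Spider < Creature
  LEGS = 8
end

Spider.new.lexical_legs # => 4
Spider.new.dynamic_legs # => 8
```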

Alright, we’ve taken care of legs, but let’s consider another issue.  What about our tallying object?  Right now, Animal has an @@animals variable which contains all the animals in our zoo.  This presents 2 problems:

1) What if a later programmer decides to call the container @@animals in the Elephant class?  Suddenly we’ve entered a world of hurt.

2) On a more fundamental level – does it make sense for Octopus to have access to @@animals, even theoretically?  It should be blissfully unaware of the Lions and Tigers and Bears throughout the zoo, and just know about the 8-legged ocean critters.  How can we make this happen?

We can solve problem 1 by simply replacing the array @@animals with a constant, ANIMALS.  Hence, subclasses that define their own array of animals won’t generate a conflict.  However, despite seeing others advocate for it, I don’t like that solution either, for 3 reasons:

1) Now we have a different problem.  If the designer of the Elephant class neglects to define an ANIMALS constant but still adds to the ANIMALS array, the parent class’s array will be affected.  This may be difficult to debug.

2) It’s true that Ruby doesn’t complain about changing the contents of a constant array, because the object hasn’t been fundamentally redefined.  That doesn’t mean it’s the right thing to do.  Others will disagree (and in fact Rails apparently does this all the time), but I maintain that constants should be constant and predictable.

3) Constants are easily accessible from outside the class.  Now, I know that everything in Ruby, even private methods, is accessible, but there’s a semantic point here.  Where in the Ruby core classes do you find constants?  My first thought is in the Math module, which contains the constants PI and E.  In other words, constants are meant to be values which are definitional to the class and/or should never change once defined.  PI and E are not going anywhere.  Similarly, it makes sense to say that Llama::LEGS is 4 and Octopus::LEGS is 8, since those are attributes that should apply in all but the most exceptional cases.  (My apologies to Larry the 3-legged llama.)
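To make point 2 concrete, here is a small sketch, with hypothetical constant names, of how Ruby treats a constant that holds an array. Appending to it produces no warning at all. If you do want the contents locked down, freezing the object is one option (FrozenError is available in Ruby 2.5 and later).

```ruby
# ROSTER and FROZEN_ROSTER are hypothetical constants used only for illustration.
ROSTER = []
ROSTER << "Larry" # no warning: the constant still points at the same array
ROSTER.size       # => 1

FROZEN_ROSTER = ["Larry"].freeze

begin
  FROZEN_ROSTER << "Moe"
rescue FrozenError => e
  puts e.class # prints "FrozenError"
end
```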

The animals array, on the other hand, is not at all fundamental.  It’s a variable that is changed frequently, and should be associated with the class, but not easily accessible from outside, and not shared with subclasses.

So what’s the right answer?  Well, let’s remind ourselves for a moment that everything in Ruby is an object.  It turns out that even classes are objects – instances of the Class class.  (Sidebar: if you want to really warp your brain, enter Class.class into IRB.  Yep, Class is an instance of itself!  Mind blown.)  So if classes are instances, surely they have instance variables, right?  Yes, they do.  And we can use them to implement a safe working version of our animals array!
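Before applying this to the zoo, here is a bare-bones sketch of the idea. Habitat and Aquarium are hypothetical classes used only for illustration. The point to notice is that a subclass inherits the reader method but not the variable itself.

```ruby
# Habitat and Aquarium are hypothetical classes used only for illustration.
class Habitat
  @residents = [] # belongs to the Habitat class object itself

  class << self
    attr_reader :residents
  end
end

class Aquarium < Habitat
end

Class.class                 # => Class
Habitat.class               # => Class
Habitat.instance_variables  # => [:@residents]
Habitat.residents           # => []
Aquarium.instance_variables # => []
Aquarium.residents          # => nil (the reader is inherited, the variable is not)
```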


# IRRELEVANT CODE EXCISED
class Animal
  @animals = []

  # Getter for the class instance variable
  def self.animals
    @animals
  end

  def initialize(args={})
    Animal.animals << self
    after_initialize(args)
  end

  def after_initialize(args)
  end

  def self.all
    self.animals
  end
end

class Lion < Animal
  @animals = []

  def after_initialize(args)
    self.class.animals << self
  end
end

Lion.new
Animal.all # => [#<Lion:0x0000010187de60>]
Animal.all.object_id # => 2160486800
Lion.all # => [#<Lion:0x0000010187de60>]
Lion.all.object_id # => 2160342160

Note that this was a bit tricky.  We had to define a getter method for the animals array.  If we have a number of such variables, we would probably be best off using attr_accessor, but the call to attr_accessor has to be within the context of a class << self ... end (singleton class) block.

On the other hand, we’ve essentially established the animal-tracking system in the parent class, and we can take advantage of it in children by giving each its own @animals array as a class instance variable.
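One catch with this approach is that a child class that forgets its own @animals = [] will get nil back from animals. A possible way around that, sketched here as my own assumption rather than something the code above does, is Ruby’s inherited hook, which runs whenever a subclass is defined. Beast, Tiger and Cub are hypothetical names, and the registration loop in initialize stands in for the after_initialize approach.

```ruby
# Beast, Tiger and Cub are hypothetical classes used only for illustration.
class Beast
  @animals = []

  class << self
    attr_reader :animals
    alias_method :all, :animals

    # Ruby calls this hook whenever a subclass is defined.
    def inherited(subclass)
      super
      subclass.instance_variable_set(:@animals, [])
    end
  end

  def initialize
    # Register the new instance with its own class and every ancestor up to Beast.
    klass = self.class
    while klass <= Beast
      klass.animals << self
      klass = klass.superclass
    end
  end
end

class Tiger < Beast; end
class Cub < Tiger; end

Tiger.new
Cub.new
Beast.all.size # => 2
Tiger.all.size # => 2
Cub.all.size   # => 1
```

The trade-off is a bit more magic in the parent class in exchange for less boilerplate (and fewer ways to forget a line) in each child.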

Alright, dear readers.  The time has come to leave you with the final, comprehensive version of our zoo.  Just don’t feed the animals!


class Animal
  @animals = []
  LEGS = 4

  class << self
    attr_reader :animals
    alias :all :animals
  end

  def initialize(args={})
    Animal.animals << self
    after_initialize(args)
  end

  def after_initialize(args)
  end

  def legs
    self.class::LEGS
  end
end

class Octopus < Animal
  @animals = []
  LEGS = 8

  def after_initialize(args={})
    self.class.animals << self
  end
end

class Llama < Animal
  @animals = []

  def after_initialize(args={})
    self.class.animals << self
  end
end

Octopus.new.legs # => 8
Llama.new.legs # => 4
Animal.all # => [#<Octopus:0x000001010ef220>, #<Llama:0x000001010a6868>]
Octopus.all # => [#<Octopus:0x000001010ef220>]
Llama.all # => [#<Llama:0x000001010a6868>]

As always, comments and thoughts are most welcome.  Stay classy, Rubyists!

Struct: Ruby’s Quickie Class

Let’s say you have Player and BasketballTeam classes that are defined and used as follows:


class Player
  attr_accessor :name, :number

  def initialize(name, number)
    @name = name
    @number = number
  end
end

class BasketballTeam
  attr_accessor :player1, :player2, :player3, :player4, :player5

  def initialize(player1, player2, player3, player4, player5)
    @player1 = player1
    @player2 = player2
    @player3 = player3
    @player4 = player4
    @player5 = player5
  end

  def starting_lineup
    str = "Ladies and Gentlemen, here is the starting lineup!\n"
    5.times do |num|
      player = self.send("player#{num + 1}")
      str += "\n##{player.number}, #{player.name}!\n"
    end
    str
  end
end

team = BasketballTeam.new(Player.new("Magic Johnson", 15), Player.new("Michael Jordan", 9),
                          Player.new("Larry Bird", 7), Player.new("Charles Barkley", 14),
                          Player.new("Patrick Ewing", 6))
puts team.starting_lineup

In this case, since there are always exactly 5 players, I don’t want to pull out an array every time and write team.players[0], and instead I’ve chosen to use 5 similarly named instance variables, so I can do team.player1. This looks nice, but also isn’t ideal. If I want to access player n, this starts to get ugly: team.send("player#{n}").

Well, here’s the good news: as usual, Ruby has a better way for you to do it. Introducing: the Struct class! Structs fall somewhere between full-fledged Ruby classes and arrays/hashes, and are excellent for generating classes which are mostly variable storage containers with a particular number of items, with a small number of methods. Here is how we would refactor our code from before:


Player = Struct.new(:name, :number)

BasketballTeam = Struct.new(:player1, :player2, :player3, :player4, :player5) do
  def starting_lineup
    "Ladies and Gentlemen, here is the starting lineup!\n" +
      self.collect {|player| "\n##{player.number}, #{player.name}!\n"}.join
  end
end

team = BasketballTeam.new(Player.new("Magic Johnson", 15), Player.new("Michael Jordan", 9),
                          Player.new("Larry Bird", 7), Player.new("Charles Barkley", 14),
                          Player.new("Patrick Ewing", 6))
puts team.starting_lineup

Huh? Where did all the code go?

Struct.new is a really cool method that takes symbols as arguments and returns – no, it’s not an object, it’s a class!!! (Well, technically all Ruby classes are objects too, but we’re going to deliberately ignore that for now.) It takes each symbol, makes it a member of the new class (it behaves like an instance variable, though Struct stores it internally rather than as an actual @variable), gives it setter and getter methods, and adds it to the initialize method in the order specified. So it’s doing a lot of work for you, just for adding the symbol there. The optional block at the end (see how BasketballTeam is created with a block but Player isn’t?) specifies any methods you want to add to the struct. If you have a lot of these, Struct probably isn’t for you. But if it’s just one or two simple methods, then Struct may still be a good idea.
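A tiny sketch of what comes back from Struct.new. Point and its to_pair method are hypothetical, used only for illustration.

```ruby
# Point is a hypothetical struct used only for illustration.
Point = Struct.new(:x, :y) do
  def to_pair
    [x, y]
  end
end

Point.class               # => Class
Point.superclass          # => Struct

origin = Point.new(0, 0)
origin.x = 3              # setter generated by Struct
origin.to_pair            # => [3, 0]
Point.new(1).y            # => nil (missing arguments default to nil)
origin.instance_variables # => [] (members are not stored as @variables)
```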

An examination of Struct’s instance methods reveals its similarity to Array and Hash. Here are my favorites:

| Method | Description and Correlatives |
| --- | --- |
| #members | like Hash#keys, returns an array containing the member names |
| #values | like Hash#values, returns an array containing the member values |
| #length, #size | like Hash#size or Array#size, the number of members |
| #each | similar to Hash#each, goes through each member’s value |
| #[member] (e.g. team["player1"] or team[:player1]) | similar to Hash#[], access by member name |
| #[index] (e.g. team[0]) | similar to Array#[], access by index in #members |

NOTE: You can also write team[0] = Player.new("Magic Johnson", 15)

Of course, you are also able to get team.player1 because it attr_accessor’ed everything for you.
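Here are those methods in action on a throwaway struct. Pair is a hypothetical name used only for illustration.

```ruby
# Pair is a hypothetical struct used only for illustration.
Pair = Struct.new(:left, :right)
pair = Pair.new("a", "b")

pair.members  # => [:left, :right]
pair.values   # => ["a", "b"]
pair.size     # => 2
pair[:left]   # => "a"
pair["right"] # => "b"
pair[0]       # => "a"

pair[1] = "z" # same effect as pair.right = "z"
pair.right    # => "z"

pair.each { |value| puts value } # prints "a" then "z"
```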

Because Struct defines an #each method and includes Enumerable, you can use any of the Enumerable methods on its members. So you can cycle, check if team.any? {|player| player.name == "Michael Jordan"}, inject, or find the team.max_by(&:number), among others. You can also modify all contained values pretty easily: team.each{|player| player.number += 1} (in case you needed to bump up everyone’s number for some reason). And if the IOC is insisting you sort your players by jersey number, team.sort_by(&:number) gives you the players in order. Note that it returns a new Array and leaves the struct untouched, so to reorder the team itself, rebuild it: team = BasketballTeam.new(*team.sort_by(&:number)). Patrick Ewing, with jersey #6, is now team[0], a.k.a. team.player1.
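A short sketch of Enumerable on a struct, including the sort-and-rebuild step. Rider and Relay (and the riders themselves) are hypothetical, used only for illustration.

```ruby
# Rider and Relay are hypothetical structs used only for illustration.
Rider = Struct.new(:name, :number)
Relay = Struct.new(:lead, :middle, :anchor)

relay = Relay.new(Rider.new("Ann", 12), Rider.new("Bo", 4), Rider.new("Cy", 9))

relay.any? { |rider| rider.name == "Bo" } # => true
relay.max_by(&:number).name               # => "Ann"

sorted = relay.sort_by(&:number)          # a plain Array; relay is unchanged
sorted.class                              # => Array
relay.lead.name                           # => "Ann"

# Splat the sorted array into a new struct to get a reordered team.
reordered = Relay.new(*sorted)
reordered.lead.name                       # => "Bo"
reordered.anchor.name                     # => "Ann"
```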

One downside of Struct as opposed to Arrays is that you can’t push/pop/unshift/shift, because the size is fixed from the beginning.
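A quick sketch of what “fixed size” means in practice. Trio is a hypothetical struct used only for illustration. You can always convert to an array first, but the struct itself never grows.

```ruby
# Trio is a hypothetical struct used only for illustration.
Trio = Struct.new(:a, :b, :c)
trio = Trio.new(1, 2, 3)

trio.respond_to?(:push) # => false
trio.to_a.push(4)       # => [1, 2, 3, 4] (a new Array; trio still has 3 members)
trio.size               # => 3

begin
  trio[3] = 4           # there is no fourth slot
rescue IndexError => e
  puts e.class          # prints "IndexError"
end
```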

TL;DR A struct is somewhere between a regular object and a hash/array. It’s an awesome data structure when you

  • know exactly what it needs to hold
  • want to be able to access your data in a variety of useful ways
  • need to define just a small number of custom methods (or none at all)
  • and just don’t want to write much boilerplate code while doing it!

P.S. Check out this post from Steve Klabnik about how incorporating structs into your regular class definitions can make your debugging much easier due to Struct’s handy #to_s method.

P.P.S. Robert Klemme helpfully notes that, unlike hashes, struct["something"] will raise an error (a NameError) if the struct has no member named something. This can be helpful if you want to detect certain types of input problems.
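A minimal sketch of that difference. Badge is a hypothetical struct used only for illustration.

```ruby
# Badge is a hypothetical struct used only for illustration.
Badge = Struct.new(:name)
badge = Badge.new("Larry")

{ name: "Larry" }[:nickname] # => nil (a Hash quietly returns nil)

begin
  badge[:nickname]           # no such member
rescue NameError => e
  puts e.class               # prints "NameError"
end
```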

P.P.P.S. Here’s the output from the code above (using structs or regular classes), if you’re desperately interested:

Ladies and Gentlemen, here is the starting lineup!

#15, Magic Johnson!

#9, Michael Jordan!

#7, Larry Bird!

#14, Charles Barkley!

#6, Patrick Ewing!