Book: Metaprogramming Ruby 2

A History of Complexity

Instead of looking at the current implementation of attribute methods, let me go all the way back to the early days of the framework and to Rails 1.0.0, which was unleashed on an unsuspecting world in December 2005.

Rails 1: Simple Beginnings

In the very first version of Rails, the implementation of attribute methods was just a few lines of code:

 
    module ActiveRecord
      class Base
        def initialize(attributes = nil)
          @attributes = attributes_from_column_definition
          # ...
        end

        def attribute_names
          @attributes.keys.sort
        end

        alias_method :respond_to_without_attributes?, :respond_to?

        def respond_to?(method)
          @@dynamic_methods ||= attribute_names +
            attribute_names.collect { |attr| attr + "=" } +
            attribute_names.collect { |attr| attr + "?" }
          @@dynamic_methods.include?(method.to_s) ?
            true :
            respond_to_without_attributes?(method)
        end

        def method_missing(method_id, *arguments)
          method_name = method_id.id2name

          if method_name =~ read_method? && @attributes.include?($1)
            return read_attribute($1)
          elsif method_name =~ write_method?
            write_attribute($1, arguments[0])
          elsif method_name =~ query_method?
            return query_attribute($1)
          else
            super
          end
        end

        def read_method?() /^([a-zA-Z][-_\w]*)[^=?]*$/ end
        def write_method?() /^([a-zA-Z][-_\w]*)=.*$/ end
        def query_method?() /^([a-zA-Z][-_\w]*)\?$/ end

        def read_attribute(attr_name) # ...
        def write_attribute(attr_name, value) # ...
        def query_attribute(attr_name) # ...

Take a look at the initialize method: when you create an ActiveRecord::Base object, its @attributes instance variable is populated with the names of the attributes from the database. For example, if the relevant table in the database has a column named description, then @attributes will contain the string "description", among others.

Now skip down to method_missing, where those attribute names become the names of Ghost Methods. When you call a method such as description=, method_missing notices that the name matches the regular expression for write accessors, and it captures description as the attribute name. As a result, method_missing calls write_attribute("description", value), where value is the argument you passed, and write_attribute writes the value of the description in the database. A similar process happens for query accessors (that end in a question mark) and read accessors (that are just the same as attribute names). In the code above, the read branch also checks that the captured name is one of the keys in @attributes.

In Chapter 3, you also learned that it’s generally a good idea to redefine respond_to? (or respond_to_missing?) together with method_missing. For example, if I can call my_task.description, then I expect that my_task.respond_to?(:description) returns true. The ActiveRecord::Base#respond_to? method is an Around Alias of the original respond_to?, and it also checks whether a method name matches the rules for attribute readers, writers, or queries. The overridden respond_to? uses a Nil Guard to calculate those names only once, and store them in an @@dynamic_methods class variable.

I stopped short of showing you the code that accesses the database, such as read_attribute, write_attribute, and query_attribute. Apart from that, you’ve just looked at the entire implementation of attribute methods in Rails 1. By the time Rails 2 came out, however, this code had become more complex.
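To see the Rails 1 technique in isolation, here is a minimal sketch of Ghost Methods paired with respond_to_missing?. It is not Rails code: GhostRecord and its split_accessor helper are hypothetical names, and a plain Hash stands in for the database row.

```ruby
# Hypothetical stand-in for an Active Record object; not Rails source code.
class GhostRecord
  def initialize(attributes)
    @attributes = attributes # e.g. { "description" => "Clean the garage" }
  end

  # Handles readers (description), writers (description=), and queries
  # (description?) for any key in @attributes.
  def method_missing(name, *args)
    attribute, suffix = split_accessor(name)
    return super unless @attributes.key?(attribute)

    case suffix
    when "=" then @attributes[attribute] = args.first
    when "?" then !!@attributes[attribute]
    else @attributes[attribute]
    end
  end

  # Keeps respond_to? consistent with the Ghost Methods above.
  def respond_to_missing?(name, include_private = false)
    attribute, _suffix = split_accessor(name)
    @attributes.key?(attribute) || super
  end

  private

  # "description=" => ["description", "="], "description" => ["description", ""]
  def split_accessor(name)
    match = /\A(\w+?)([=?]?)\z/.match(name.to_s)
    match ? [match[1], match[2]] : [nil, nil]
  end
end

task = GhostRecord.new("description" => "Clean the garage")
task.description = "Wash the car"
task.description               # => "Wash the car"
task.description?              # => true
task.respond_to?(:description) # => true
task.respond_to?(:title)       # => false
```

Unlike the Rails 1 code, this sketch checks the attribute name in every branch and relies on respond_to_missing? rather than an Around Alias; both are simplifications chosen for the example.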

Rails 2: Focus on Performance

Do you remember the explanation of method_missing in Chapter 3? When you call a method that doesn’t exist, Ruby walks up the chain of ancestors looking for the method. If it reaches BasicObject without finding the method, then it starts back at the bottom and calls method_missing. This means that, in general, calling a Ghost Method is slower than calling a normal method, because Ruby has to walk up the entire chain of ancestors at least once.

In most concrete cases, this difference in performance between Ghost Methods and regular methods is negligible. In Rails, however, attribute methods are called very frequently. In Rails 1, each of those calls also had to walk up ActiveRecord::Base’s extremely long chain of ancestors. As a result, performance suffered.

The authors of Rails could solve this performance problem by replacing Ghost Methods with Dynamic Methods, that is, by using define_method to create read, write, and query accessors for all attributes, and getting rid of method_missing altogether. Interestingly, however, they went for a mixed solution, including both Ghost Methods and Dynamic Methods. Let’s look at the result.

Ghosts Incarnated

If you check the source code of Rails 2, you’ll see that the code for attribute methods moved from ActiveRecord::Base itself to a separate ActiveRecord::AttributeMethods module, which is then included by Base. The original method_missing has also become complicated, so we will discuss it in two separate parts. Here is the first part:

 
    module ActiveRecord
      module AttributeMethods
        def method_missing(method_id, *args, &block)
          method_name = method_id.to_s

          if self.class.private_method_defined?(method_name)
            raise NoMethodError.new("Attempt to call private method", method_name, args)
          end

          # If we haven't generated any methods yet, generate them, then
          # see if we've created the method we're looking for.
          if !self.class.generated_methods?
            self.class.define_attribute_methods
            if self.class.generated_methods.include?(method_name)
              return self.send(method_id, *args, &block)
            end
          end

          # ...
        end

        def read_attribute(attr_name) # ...
        def write_attribute(attr_name, value) # ...
        def query_attribute(attr_name) # ...

When you call a method such as Task#description= for the first time, the call is delivered to method_missing. Before it does its job, method_missing ensures that you’re not inadvertently bypassing encapsulation and calling a private method. Then it calls an intriguing-sounding define_attribute_methods method.

We’ll look at define_attribute_methods in a minute. For now, all you need to know is that it defines read, write, and query Dynamic Methods for all the columns in the database. The next time you call description= or any other accessor that maps to a database column, your call isn’t handled by method_missing. Instead, you call a real, non-ghost method.

When you entered method_missing, description= was a Ghost Method. Now description= is a regular flesh-and-blood method, and method_missing can call it with a Dynamic Dispatch and return the result. This process takes place only once for each class that inherits from ActiveRecord::Base. If you enter method_missing a second time for any reason, the class method generated_methods? returns true, and this code is skipped.
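The "generate on first miss" idea can be sketched in a few lines. The code below is an illustration under stated assumptions, not the Rails 2 implementation: LazyRecord is a hypothetical class, a hard-coded COLUMNS list stands in for the database schema, and define_method is used where Rails 2 evaluates strings.

```ruby
# Hypothetical class; COLUMNS stands in for the columns of a database table.
class LazyRecord
  COLUMNS = %w[description done].freeze

  def self.generated_methods
    @generated_methods ||= []
  end

  def self.generated_methods?
    !generated_methods.empty?
  end

  # Defines real readers and writers for every column, once per class.
  def self.define_attribute_methods
    return if generated_methods?
    COLUMNS.each do |column|
      define_method(column) { @attributes[column] }
      define_method("#{column}=") { |value| @attributes[column] = value }
      generated_methods << column << "#{column}="
    end
  end

  def initialize
    @attributes = {}
  end

  # The first miss generates the accessors, then re-dispatches the call.
  def method_missing(name, *args, &block)
    unless self.class.generated_methods?
      self.class.define_attribute_methods
      if self.class.generated_methods.include?(name.to_s)
        return send(name, *args, &block)
      end
    end
    super
  end

  def respond_to_missing?(name, include_private = false)
    self.class.define_attribute_methods # does nothing once generated
    self.class.method_defined?(name) || super
  end
end

task = LazyRecord.new
LazyRecord.method_defined?(:description=) # => false, still a Ghost Method
task.description = "Clean the garage"     # enters method_missing once
LazyRecord.method_defined?(:description=) # => true, now a real method
```

After the first call, every accessor for every column is an ordinary method, so later calls never reach method_missing.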

The following code shows how define_attribute_methods defines non-ghostly accessors.

 
    # Generates all the attribute related methods for columns in the database
    # accessors, mutators and query methods.
    def define_attribute_methods
      return if generated_methods?
      columns_hash.each do |name, column|
        unless instance_method_already_implemented?(name)
          if self.serialized_attributes[name]
            define_read_method_for_serialized_attribute(name)
          elsif create_time_zone_conversion_attribute?(name, column)
            define_read_method_for_time_zone_conversion(name)
          else
            define_read_method(name.to_sym, name, column)
          end
        end

        unless instance_method_already_implemented?("#{name}=")
          if create_time_zone_conversion_attribute?(name, column)
            define_write_method_for_time_zone_conversion(name)
          else
            define_write_method(name.to_sym)
          end
        end

        unless instance_method_already_implemented?("#{name}?")
          define_question_method(name)
        end
      end
    end

The instance_method_already_implemented? method is there to prevent involuntary Monkeypatches: if a method by the name of the attribute already exists, then this code skips to the next attribute. Apart from that, the previous code does little but delegate to other methods that do the real work, such as define_read_method or define_write_method.
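The guard against involuntary Monkeypatches is easy to try out on its own. In this sketch, Report and define_readers are hypothetical names, and the check uses Ruby's method_defined? and private_method_defined? rather than the Rails helper, whose exact rules are not shown here.

```ruby
# Hypothetical model with one hand-written method that shares a column's name.
class Report
  def title
    "custom title"
  end

  # Defines a reader for each column, skipping names that already exist.
  def self.define_readers(column_names)
    column_names.each do |name|
      next if method_defined?(name) || private_method_defined?(name)
      define_method(name) { @attributes[name] }
    end
  end

  def initialize(attributes)
    @attributes = attributes
  end
end

Report.define_readers(%w[title body])
report = Report.new("title" => "from the database", "body" => "text")
report.title # => "custom title" (the existing method was left alone)
report.body  # => "text"
```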

As an example, take a look at define_write_method. I’ve marked the most important lines with arrows:

    def define_write_method(attr_name)  # <--
      evaluate_attribute_method attr_name,  # <--
        "def #{attr_name}=(new_value);write_attribute('#{attr_name}', new_value);end",  # <--
        "#{attr_name}="  # <--
    end  # <--

    def evaluate_attribute_method(attr_name, method_definition, method_name=attr_name)  # <--
      unless method_name.to_s == primary_key.to_s
        generated_methods << method_name
      end

      begin
        class_eval(method_definition, __FILE__, __LINE__)  # <--
      rescue SyntaxError => err
        generated_methods.delete(attr_name)
        if logger
          logger.warn "Exception occurred during reader method compilation."
          logger.warn "Maybe #{attr_name} is not a valid Ruby identifier?"
          logger.warn err.message
        end
      end
    end

The define_write_method method builds a String of Code that is evaluated by class_eval. For example, if you call description=, then evaluate_attribute_method evaluates this String of Code:

 
    def description=(new_value);write_attribute('description', new_value);end

Thus the description= method is born. A similar process happens for description, description?, and the accessors for all the other database columns.
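If you want to watch a String of Code turn into a method, the following sketch reproduces the idea outside Rails. StringTask and its in-memory write_attribute and read_attribute are assumptions made for the example; only class_eval with a string, __FILE__, and __LINE__ mirrors the code above.

```ruby
# Hypothetical class; a Hash replaces the database-backed attribute store.
class StringTask
  def initialize
    @attributes = {}
  end

  def write_attribute(name, value)
    @attributes[name] = value
  end

  def read_attribute(name)
    @attributes[name]
  end

  # Builds a String of Code for a write accessor and evaluates it in the class.
  def self.define_write_method(attr_name)
    code = "def #{attr_name}=(new_value); write_attribute('#{attr_name}', new_value); end"
    class_eval(code, __FILE__, __LINE__)
  end
end

StringTask.define_write_method("description")
task = StringTask.new
task.description = "Clean the garage"
task.read_attribute("description") # => "Clean the garage"
```

Because the attribute name is pasted into Ruby source, a name that is not a valid identifier makes class_eval raise a SyntaxError, which is why the Rails 2 code wraps the call in a rescue clause.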

Here’s a recap of what we’ve covered so far. When you access an attribute for the first time, that attribute is a Ghost Method. ActiveRecord::Base#method_missing takes this chance to turn the Ghost Method into a real method. While it’s there, method_missing also dynamically defines read, write, and query accessors for all the other database columns. The next time you call that attribute or another database-backed attribute, you find a real accessor method waiting for you, and you don’t have to enter method_missing anymore.

However, this logic doesn’t apply to each and every attribute accessor, as you’ll discover by looking at the second half of method_missing.

Attributes That Stay Dynamic

As it turns out, there are cases where Active Record doesn’t want to define attribute accessors. For example, think of attributes that are not backed by a database column, such as calculated fields:

 
    my_query = "tasks.*, (description like '%garage%') as heavy_job"
    task = Task.find(:first, :select => my_query)
    task.heavy_job? # => true

Attributes like heavy_job can be different for each object, so there’s no point in generating Dynamic Methods to access them. The second half of method_missing deals with these attributes:

 
    module ActiveRecord
      module AttributeMethods
        def method_missing(method_id, *args, &block)
          # ...

          if self.class.primary_key.to_s == method_name
            id
          elsif md = self.class.match_attribute_method?(method_name)
            attribute_name, method_type = md.pre_match, md.to_s
            if @attributes.include?(attribute_name)
              __send__("attribute#{method_type}", attribute_name, *args, &block)
            else
              super
            end
          elsif @attributes.include?(method_name)
            read_attribute(method_name)
          else
            super
          end
        end

        private

        # Handle *? for method_missing.
        def attribute?(attribute_name)
          query_attribute(attribute_name)
        end

        # Handle *= for method_missing.
        def attribute=(attribute_name, value)
          write_attribute(attribute_name, value)
        end

Look at the code in method_missing above. If you’re accessing the object’s identifier, then it returns its value. If you’re calling an attribute accessor, then it calls the accessor with either a Dynamic Dispatch (for write or query accessors) or a direct call to read_attribute (for read accessors). Otherwise, method_missing sends the call up the chain of ancestors with super.
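The suffix-based dispatch is the least obvious part of that code, so here is a small sketch of it. DynamicRecord and the SUFFIX pattern are hypothetical; the sketch assumes that the matcher returns a MatchData whose pre_match is the attribute name and whose matched text is the suffix, as the Rails 2 code above suggests.

```ruby
# Hypothetical record whose attributes differ per object (no generated methods).
class DynamicRecord
  SUFFIX = /(=|\?)\z/ # assumed pattern for write and query accessors

  def initialize(attributes)
    @attributes = attributes
  end

  def method_missing(name, *args, &block)
    method_name = name.to_s
    if (md = SUFFIX.match(method_name)) && @attributes.include?(md.pre_match)
      # "heavy_job?" becomes a call to attribute?("heavy_job")
      __send__("attribute#{md}", md.pre_match, *args, &block)
    elsif @attributes.include?(method_name)
      @attributes[method_name]
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.include?(name.to_s.sub(SUFFIX, "")) || super
  end

  private

  # Handles *? for method_missing.
  def attribute?(attribute_name)
    !!@attributes[attribute_name]
  end

  # Handles *= for method_missing; reachable only through a Dynamic Dispatch.
  def attribute=(attribute_name, value)
    @attributes[attribute_name] = value
  end
end

task = DynamicRecord.new("heavy_job" => 1)
task.heavy_job?                            # => true
task.heavy_job                             # => 1
DynamicRecord.method_defined?(:heavy_job?) # => false, it stays a Ghost Method
```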

I don’t want to waste your time with unnecessary details, so I only showed you part of the code for attribute methods in Rails 2. What you’ve seen, however, shows that both the feature and its code became more complicated in the second major version of Rails. Let’s see how this trend continued in the following versions.

Rails 3 and 4: More Special Cases

In Rails 1, attribute methods were implemented with a few dozen lines of code. In Rails 2, they had their own file and hundreds of lines of code. In Rails 3, they spanned nine files of source code, not including tests.

As Rails applications became larger and more sophisticated, the authors of the framework kept uncovering small twists, performance optimizations, and corner cases related to attribute methods. The code and the number of metaprogramming tricks it used grew with the number of corner cases. I’ll show you only one of those corner cases, but even this single example is too long to fit in this chapter, so I will just show you a few snippets of code as quickly as I can. Brace yourself.

The example I picked is one of the most extreme optimizations in modern Rails. We’ve seen that Rails 2 improved performance by turning Ghost Methods into Dynamic Methods. Rails 4 goes one step further: when it defines an attribute accessor, it also turns it into an UnboundMethod and stores it in a method cache. If a second class has an attribute by the same name, and hence needs the same accessor, Rails 4 just retrieves the previously defined accessor from the cache and binds it to the second class. This way, if different attributes in separate classes happen to have the same name, then Rails defines only a single set of accessor methods and reuses those methods for all attributes. (I’m as surprised as you are that this optimization has a visible effect on performance, but in the case of Rails, it does.)

I’ll start with code from deep inside the attribute methods implementation:

 
    module ActiveRecord
      module AttributeMethods
        module Read
          extend ActiveSupport::Concern

          module ClassMethods
            if Module.methods_transplantable?
              def define_method_attribute(name)
                method = ReaderMethodCache[name]
                generated_attribute_methods.module_eval { define_method name, method }
              end
            else
              def define_method_attribute(name)
                # ...
              end
            end

This code defines a method named define_method_attribute. This method will ultimately become a class method of ActiveRecord::Base, thanks to the mechanism we discussed in Chapter 10. Here, however, comes a twist: define_method_attribute is defined differently depending on the result of the Module.methods_transplantable? method.

Module.methods_transplantable? comes from the Active Support library, and it answers one very specific question: can I bind an UnboundMethod to an object of a different class? As I mentioned earlier, you can only do that from Ruby 2.0 onward, so this code defines define_method_attribute in two different ways depending on whether you’re running Rails on Ruby 1.9 or 2.x.

Assume that you’re running Ruby 2.0 or later. In this case, define_method_attribute retrieves an UnboundMethod from a cache of methods, and it binds the method to the current module with define_method. The cache of methods is stored in a constant named ReaderMethodCache.

(The call to generated_attribute_methods might look confusing. It returns a Clean Room that serializes method definitions happening in different threads.)
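Here is a minimal sketch of what "transplanting" a method looks like on Ruby 2.0 or later. SharedReaders, Ticket, and Project are hypothetical names, and the sketch defines the method directly on each class rather than on a separate module of generated methods.

```ruby
# Hypothetical module that holds the single shared definition of the reader.
SharedReaders = Module.new do
  def description
    @attributes["description"]
  end
end

# Extract the reader as an UnboundMethod; it belongs to no object yet.
shared_reader = SharedReaders.instance_method(:description)

class Ticket
  def initialize(attributes)
    @attributes = attributes
  end
end

class Project
  def initialize(attributes)
    @attributes = attributes
  end
end

# Reuse the very same method body in two unrelated classes.
[Ticket, Project].each do |klass|
  klass.send(:define_method, :description, shared_reader)
end

Ticket.new("description" => "Fix the login page").description # => "Fix the login page"
Project.new("description" => "Relaunch").description          # => "Relaunch"
```

The method has to come from a module for this to work: an UnboundMethod taken from a class can only be bound to instances of that class or its subclasses.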

Let’s go see how ReaderMethodCache is initialized. The long comment gives an idea of how tricky it must have been to write this code:

 
    module ActiveRecord
      module AttributeMethods
        module Read
          ReaderMethodCache = Class.new(AttributeMethodCache) {
            private
            # We want to generate the methods via module_eval rather than
            # define_method, because define_method is slower on dispatch.
            # Evaluating many similar methods may use more memory as the instruction
            # sequences are duplicated and cached (in MRI). define_method may
            # be slower on dispatch, but if you're careful about the closure
            # created, then define_method will consume much less memory.
            #
            # But sometimes the database might return columns with
            # characters that are not allowed in normal method names (like
            # 'my_column(omg)'. So to work around this we first define with
            # the __temp__ identifier, and then use alias method to rename
            # it to what we want.
            #
            # We are also defining a constant to hold the frozen string of
            # the attribute name. Using a constant means that we do not have
            # to allocate an object on each call to the attribute method.
            # Making it frozen means that it doesn't get duped when used to
            # key the @attributes_cache in read_attribute.
            def method_body(method_name, const_name)
              <<-EOMETHOD
              def #{method_name}
                name = ::ActiveRecord::AttributeMethods::AttrNames::ATTR_#{const_name}
                read_attribute(name) { |n| missing_attribute(n, caller) }
              end
              EOMETHOD
            end
          }.new

ReaderMethodCache is an instance of an anonymous class, a subclass of AttributeMethodCache. This class defines a single method that returns a String of Code. (If you’re perplexed by the call to Class.new, take a look back at the earlier discussion of it. If you don’t understand the EOMETHOD lines, read up on Ruby’s “here documents.”)

Let’s leave ReaderMethodCache for a moment and move to the definition of its superclass AttributeMethodCache:

 
    module ActiveRecord
      module AttributeMethods
        AttrNames = Module.new {
          def self.set_name_cache(name, value)
            const_name = "ATTR_#{name}"
            unless const_defined? const_name
              const_set const_name, value.dup.freeze
            end
          end
        }

        class AttributeMethodCache
          def initialize
            @module = Module.new
            @method_cache = ThreadSafe::Cache.new
          end

          def [](name)
            @method_cache.compute_if_absent(name) do
              safe_name = name.unpack('h*').first
              temp_method = "__temp__#{safe_name}"
              ActiveRecord::AttributeMethods::AttrNames.set_name_cache safe_name, name
              @module.module_eval method_body(temp_method, safe_name),
                                  __FILE__, __LINE__
              @module.instance_method temp_method
            end
          end

          private
          def method_body; raise NotImplementedError; end
        end

First, look at AttrNames: it’s a module with one single method, set_name_cache. Given a name and a value, set_name_cache defines a conventionally named constant with that value. For example, if you pass it the string "description", then it defines a constant named ATTR_description. AttrNames is somewhat similar to a Clean Room; it only exists to store constants that represent the names of attributes.

Now move down to AttributeMethodCache. Its [] method takes the name of an attribute, and it returns an accessor to that attribute as an UnboundMethod. It also takes care of at least one important special case: attribute accessors are Ruby methods, but not all attribute names are valid Ruby method names. (You can read one counterexample in the comment to ReaderMethodCache#method_body above.) This code solves that problem by encoding the attribute name as a hexadecimal sequence and creating a conventional safe method name from it.
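The name mangling is easy to check in isolation. In this sketch, safe_method_name is a hypothetical helper; only the unpack('h*') call and the __temp__ prefix come from the code above.

```ruby
# Hypothetical helper that mirrors the name mangling in AttributeMethodCache#[].
def safe_method_name(column_name)
  # 'h*' turns every byte into two hexadecimal digits (low nibble first),
  # so the result contains only the characters 0-9 and a-f.
  "__temp__#{column_name.unpack('h*').first}"
end

column = "my_column(omg)" # cannot appear after "def" as a method name
temp_name = safe_method_name(column)

# The mangled name is a legal identifier, so a String of Code can use it.
holder = Module.new
holder.module_eval("def #{temp_name}; :ok; end")
holder.instance_methods.include?(temp_name.to_sym) # => true

# The encoding is reversible, so distinct columns get distinct names.
[temp_name.sub("__temp__", "")].pack("h*") == column # => true
```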

Once it has a safe name for the accessor, AttributeMethodCache#[] calls method_body to get a String of Code that defines the accessor’s body, and it defines the accessor inside a Clean Room named simply @module. (We discussed additional arguments to module_eval, such as __FILE__ and __LINE__, earlier.) Finally, AttributeMethodCache#[] gets the newly created accessor method from the Clean Room and returns it as an UnboundMethod.

On subsequent calls with the same name, AttributeMethodCache#[] won’t need to define the method again. @method_cache.compute_if_absent runs its block only when the name is not yet in the cache; otherwise, it returns the UnboundMethod that it stored the first time. This policy shaves some time off in cases where the same accessor is defined on multiple classes.

To close the loop, look back at the code of ReaderMethodCache. By overriding method_body and returning the String of Code for a read accessor, ReaderMethodCache turns the generic AttributeMethodCache into a cache for read accessors. As you might expect, there is also a WriterMethodCache class that takes care of write accessors.
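Putting the pieces together, here is a compact sketch of the whole cache. It is a simplification under stated assumptions, not the Rails 4 code: AccessorCache and SketchReaderCache are hypothetical names, a Hash guarded by a Mutex replaces ThreadSafe::Cache, and the generated reader looks up a plain @attributes Hash instead of calling read_attribute.

```ruby
# Hypothetical generic cache: maps an attribute name to an UnboundMethod.
class AccessorCache
  def initialize
    @module = Module.new # Clean Room where the accessors are defined
    @methods = {}
    @lock = Mutex.new
  end

  def [](name)
    @lock.synchronize do
      @methods[name] ||= begin
        temp_name = "__temp__#{name.unpack('h*').first}"
        @module.module_eval(method_body(temp_name, name), __FILE__, __LINE__)
        @module.instance_method(temp_name)
      end
    end
  end

  private

  # Subclasses return the String of Code for one kind of accessor.
  def method_body(_method_name, _attr_name)
    raise NotImplementedError
  end
end

# Hypothetical subclass that specializes the cache for read accessors.
class SketchReaderCache < AccessorCache
  private

  def method_body(method_name, attr_name)
    "def #{method_name}; @attributes[#{attr_name.inspect}]; end"
  end
end

readers = SketchReaderCache.new

class Order
  def initialize(attributes)
    @attributes = attributes
  end
end

class Invoice
  def initialize(attributes)
    @attributes = attributes
  end
end

# Both classes share one cached UnboundMethod under the real column name.
[Order, Invoice].each do |klass|
  klass.send(:define_method, "my_column(omg)", readers["my_column(omg)"])
end

Order.new("my_column(omg)" => 1).public_send("my_column(omg)")   # => 1
Invoice.new("my_column(omg)" => 2).public_send("my_column(omg)") # => 2
```

Because define_method accepts any name, the accessor can be installed under the real column name even though that name could never be written after def.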

Is your head spinning a little after this long explanation? Mine is. This example shows how deep and complex attribute methods have become, how many special cases they have covered, and how much they’ve changed since their simple beginnings. Now we can draw some general conclusions.
