A History of Complexity

Instead of looking at the current implementation of attribute methods, let me go all the way back to 2005, the year that Rails 1.0.0 was unleashed on an unsuspecting world.

Rails 1: Simple Beginnings

In the very first version of Rails, the implementation of attribute methods was just a few lines of code:


	module ActiveRecord
	  class Base
	    def initialize(attributes = nil)
	      @attributes = attributes_from_column_definition
	      # ...
	    end

	    def attribute_names
	      @attributes.keys.sort
	    end

	    alias_method :respond_to_without_attributes?, :respond_to?

	    def respond_to?(method)
	      @@dynamic_methods ||= attribute_names +
	        attribute_names.collect { |attr| attr + "=" } +
	        attribute_names.collect { |attr| attr + "?" }
	      @@dynamic_methods.include?(method.to_s) ?
	        true :
	        respond_to_without_attributes?(method)
	    end

	    def method_missing(method_id, *arguments)
	      method_name = method_id.id2name

	      if method_name =~ read_method? && @attributes.include?($1)
	        return read_attribute($1)
	      elsif method_name =~ write_method?
	        write_attribute($1, arguments[0])
	      elsif method_name =~ query_method?
	        return query_attribute($1)
	      else
	        super
	      end
	    end

	    def read_method?() /^([a-zA-Z][-_\w]*)[^=?]*$/ end
	    def write_method?() /^([a-zA-Z][-_\w]*)=.*$/ end
	    def query_method?() /^([a-zA-Z][-_\w]*)\?$/ end

	    def read_attribute(attr_name) # ...
	    def write_attribute(attr_name, value) #...
	    def query_attribute(attr_name) # ...

Take a look at the initialize method: when you create an ActiveRecord::Base object, its @attributes instance variable is populated with the names of the attributes from the database. For example, if the relevant table in the database has a column named description, then @attributes will contain the string "description", among others.

Now skip down to method_missing, where those attribute names become the names of Ghost Methods. When you call a method such as description=, method_missing notices two things: first, description is the name of an attribute; and second, the name of description= matches the regular expression for write accessors. As a result, method_missing calls write_attribute with "description" and the argument of the call, which sets the new value of the description. A similar process happens for query accessors (that end in a question mark) and read accessors (that are just the same as attribute names).

In Chapter 3, you also learned that it’s generally a good idea to redefine respond_to? (or respond_to_missing?) together with method_missing. For example, if I can call my_task.description, then I expect that my_task.respond_to?(:description) returns true. The ActiveRecord::Base#respond_to? method is an Around Alias of the original respond_to?, and it also checks whether a method name matches the rules for attribute readers, writers, or queries. The overridden respond_to? uses a Nil Guard to calculate those names only once, and store them in an @@dynamic_methods class variable.
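
To make the mechanism concrete, here is a minimal sketch of attribute Ghost Methods outside of Rails. The GhostRecord class and its hash of attributes are hypothetical stand-ins, not Rails 1 code, and the sketch uses respond_to_missing? rather than an Around Alias:

```ruby
# A simplified stand-in for a Rails 1 style record; not the Rails source.
class GhostRecord
  def initialize(attributes)
    @attributes = attributes # e.g. { "description" => "Clean the garage" }
  end

  def method_missing(method_id, *arguments)
    name = method_id.to_s
    if name =~ /\A(\w+)=\z/ && @attributes.key?($1)
      @attributes[$1] = arguments[0]   # write accessor
    elsif name =~ /\A(\w+)\?\z/ && @attributes.key?($1)
      !!@attributes[$1]                # query accessor
    elsif @attributes.key?(name)
      @attributes[name]                # read accessor
    else
      super
    end
  end

  # Keep respond_to? consistent with the ghost methods above.
  def respond_to_missing?(method_id, include_private = false)
    @attributes.key?(method_id.to_s.sub(/[=?]\z/, "")) || super
  end
end

task = GhostRecord.new("description" => "Clean the garage")
task.description = "Wash the car"
puts task.description
puts task.respond_to?(:description=)
```

None of the three accessors exists as a real method here; every call goes through method_missing.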

I stopped short of showing you the code that accesses the database, such as read_attribute, write_attribute, and query_attribute. Apart from that, you’ve just looked at the entire implementation of attribute methods in Rails 1. By the time Rails 2 came out, however, this code had become more complex.

Rails 2: Focus on Performance

Do you remember the explanation of method_missing in Chapter 3? When you call a method that doesn’t exist, Ruby walks up the chain of ancestors looking for the method. If it reaches BasicObject without finding the method, then it starts back at the bottom and calls method_missing. This means that, in general, calling a Ghost Method is slower than calling a normal method, because Ruby has to walk up the entire chain of ancestors at least once.

In most concrete cases, this difference in performance between Ghost Methods and regular methods is negligible. In Rails, however, attribute methods are called very frequently. In Rails 1, each of those calls also had to walk up ActiveRecord::Base’s extremely long chain of ancestors. As a result, performance suffered.

The authors of Rails could have solved this performance problem by replacing Ghost Methods with Dynamic Methods: using define_method to create read, write, and query accessors for all attributes, and getting rid of method_missing altogether. Interestingly, however, they went for a mixed solution, including both Ghost Methods and Dynamic Methods. Let’s look at the result.

Ghosts Incarnated

If you check the source code of Rails 2, you’ll see that the code for attribute methods moved from ActiveRecord::Base itself to a separate ActiveRecord::AttributeMethods module, which is then included by Base. The original method_missing has also become complicated, so we will discuss it in two separate parts. Here is the first part:


	module ActiveRecord
	  module AttributeMethods
	    def method_missing(method_id, *args, &block)
	      method_name = method_id.to_s

	      if self.class.private_method_defined?(method_name)
	        raise NoMethodError.new("Attempt to call private method", method_name, args)
	      end

	      # If we haven't generated any methods yet, generate them, then
	      # see if we've created the method we're looking for.
	      if !self.class.generated_methods?
	        self.class.define_attribute_methods
	        if self.class.generated_methods.include?(method_name)
	          return self.send(method_id, *args, &block)
	        end
	      end

	      # ...
	    end

	    def read_attribute(attr_name) # ...
	    def write_attribute(attr_name, value) # ...
	    def query_attribute(attr_name) # ...

When you call a method such as Task#description= for the first time, the call is delivered to method_missing. Before it does its job, method_missing ensures that you’re not inadvertently bypassing encapsulation and calling a private method. Then it calls an intriguing-sounding define_attribute_methods method.

We’ll look at define_attribute_methods in a minute. For now, all you need to know is that it defines read, write, and query Dynamic Methods for all the columns in the database. The next time you call description= or any other accessor that maps to a database column, your call isn’t handled by method_missing. Instead, you call a real, non-ghost method.

When you entered method_missing, description= was a Ghost Method. Now description= is a regular flesh-and-blood method, and method_missing can call it with a Dynamic Dispatch and return the result. This process takes place only once for each class that inherits from ActiveRecord::Base. If you enter method_missing a second time for any reason, the class method generated_methods? returns true, and this code is skipped.
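
The following sketch shows the same idea in isolation: the first ghost call generates real accessors, and later calls never reach method_missing. LazyRecord, its COLUMNS constant, and the call counter are assumptions made for this illustration; the real Rails 2 code reads the columns from the database.

```ruby
# Hypothetical class illustrating lazy accessor generation; not the Rails source.
class LazyRecord
  COLUMNS = %w[description done].freeze # assumed column names

  class << self
    attr_reader :method_missing_calls

    def generated_methods?
      @generated == true
    end

    def define_attribute_methods
      return if generated_methods?
      COLUMNS.each do |name|
        define_method(name)       { @attributes[name] }
        define_method("#{name}=") { |value| @attributes[name] = value }
        define_method("#{name}?") { !!@attributes[name] }
      end
      @generated = true
    end

    # Only here so the example can show how often method_missing runs.
    def count_method_missing!
      @method_missing_calls = (@method_missing_calls || 0) + 1
    end
  end

  def initialize
    @attributes = {}
  end

  def method_missing(method_id, *args, &block)
    self.class.count_method_missing!
    unless self.class.generated_methods?
      self.class.define_attribute_methods
      # The ghost is now a real method, so call it with a Dynamic Dispatch.
      return send(method_id, *args, &block) if self.class.method_defined?(method_id)
    end
    super
  end

  def respond_to_missing?(method_id, include_private = false)
    COLUMNS.include?(method_id.to_s.sub(/[=?]\z/, "")) || super
  end
end

task = LazyRecord.new
task.description = "Clean the garage" # handled by method_missing, once
task.description                      # now a real method
puts LazyRecord.method_missing_calls  # => 1
```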

The following code shows how define_attribute_methods defines non-ghostly accessors.


	# Generates all the attribute related methods for columns in the database
	# accessors, mutators and query methods.
	def define_attribute_methods
	  return if generated_methods?
	  columns_hash.each do |name, column|
	    unless instance_method_already_implemented?(name)
	      if self.serialized_attributes[name]
	        define_read_method_for_serialized_attribute(name)
	      elsif create_time_zone_conversion_attribute?(name, column)
	        define_read_method_for_time_zone_conversion(name)
	      else
	        define_read_method(name.to_sym, name, column)
	      end
	    end

	    unless instance_method_already_implemented?("#{name}=")
	      if create_time_zone_conversion_attribute?(name, column)
	        define_write_method_for_time_zone_conversion(name)
	      else
	        define_write_method(name.to_sym)
	      end
	    end

	    unless instance_method_already_implemented?("#{name}?")
	      define_question_method(name)
	    end
	  end
	end

The instance_method_already_implemented? method is there to prevent involuntary Monkeypatches: if a method by the name of the attribute already exists, then this code skips to the next attribute. Apart from that, the previous code does little but delegate to other methods that do the real work, such as define_read_method or define_write_method.
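
As a rough illustration of that guard, the sketch below refuses to generate a reader when a method of the same name already exists. GuardedRecord and define_reader are hypothetical names, and the check is much simpler than the one in Rails:

```ruby
class GuardedRecord
  # A hand-written reader that generated code must not overwrite.
  def description
    "custom reader"
  end

  # Defines a reader unless a method with that name already exists.
  def self.define_reader(name)
    return false if method_defined?(name) || private_method_defined?(name)
    define_method(name) { "generated reader for #{name}" }
    true
  end
end

GuardedRecord.define_reader(:description) # skipped, the custom reader wins
GuardedRecord.define_reader(:title)       # defined

record = GuardedRecord.new
puts record.description
puts record.title
```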

As an example, take a look at define_write_method. I’ve marked the most important lines with arrows:


	def define_write_method(attr_name)                                                # <--
	  evaluate_attribute_method attr_name,                                            # <--
	    "def #{attr_name}=(new_value);write_attribute('#{attr_name}', new_value);end", # <--
	    "#{attr_name}="                                                               # <--
	end                                                                               # <--

	def evaluate_attribute_method(attr_name, method_definition, method_name=attr_name) # <--
	  unless method_name.to_s == primary_key.to_s
	    generated_methods << method_name
	  end

	  begin
	    class_eval(method_definition, __FILE__, __LINE__)                             # <--
	  rescue SyntaxError => err
	    generated_methods.delete(attr_name)
	    if logger
	      logger.warn "Exception occurred during reader method compilation."
	      logger.warn "Maybe #{attr_name} is not a valid Ruby identifier?"
	      logger.warn err.message
	    end
	  end
	end

The define_write_method method builds a String of Code that is evaluated by class_eval. For example, if you call description=, then evaluate_attribute_method evaluates this String of Code:

	def description=(new_value);write_attribute('description', new_value);end

Thus the description= method is born. A similar process happens for description, description?, and the accessors for all the other database columns.
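
Here is a self-contained sketch of the same String of Code technique. StringOfCodeRecord and its hash-backed read_attribute and write_attribute are assumptions made for the example; only the shape of the generated string follows the Rails 2 code above.

```ruby
class StringOfCodeRecord
  def initialize
    @attributes = {}
  end

  def read_attribute(attr_name)
    @attributes[attr_name]
  end

  def write_attribute(attr_name, value)
    @attributes[attr_name] = value
  end

  # Builds the source of a writer as a string and evaluates it in the class.
  def self.define_write_method(attr_name)
    code = "def #{attr_name}=(new_value); write_attribute('#{attr_name}', new_value); end"
    class_eval(code, __FILE__, __LINE__)
    true
  rescue SyntaxError
    false # the name is not a valid Ruby identifier
  end
end

StringOfCodeRecord.define_write_method("description") # defines description=
StringOfCodeRecord.define_write_method("1st")         # rejected with a SyntaxError

task = StringOfCodeRecord.new
task.description = "Clean the garage"
puts task.read_attribute("description")
```

The rescue clause mirrors the reason Rails wraps class_eval in a begin block: a column name is data, and nothing guarantees that it is also a legal method name.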

Here’s a recap of what we’ve covered so far. When you access an attribute for the first time, that attribute is a Ghost Method. ActiveRecord::Base#method_missing takes this chance to turn the Ghost Method into a real method. While it’s there, method_missing also dynamically defines read, write, and query accessors for all the other database columns. The next time you call that attribute or another database-backed attribute, you find a real accessor method waiting for you, and you don’t have to enter method_missing anymore.

However, this logic doesn’t apply to each and every attribute accessor, as you’ll discover by looking at the second half of method_missing.

Attributes That Stay Dynamic

As it turns out, there are cases where Active Record doesn’t want to define attribute accessors. For example, think of attributes that are not backed by a database column, such as calculated fields:


	my_query = "tasks.*, (description like '%garage%') as heavy_job"
	task = Task.find(:first, :select => my_query)
	task.heavy_job? # => true

Attributes like heavy_job can be different for each object, so there’s no point in generating Dynamic Methods to access them. The second half of method_missing deals with these attributes:


	module ActiveRecord
	  module AttributeMethods
	    def method_missing(method_id, *args, &block)
	      # ...

	      if self.class.primary_key.to_s == method_name
	        id
	      elsif md = self.class.match_attribute_method?(method_name)
	        attribute_name, method_type = md.pre_match, md.to_s
	        if @attributes.include?(attribute_name)
	          __send__("attribute#{method_type}", attribute_name, *args, &block)
	        else
	          super
	        end
	      elsif @attributes.include?(method_name)
	        read_attribute(method_name)
	      else
	        super
	      end
	    end

	    private
	      # Handle *? for method_missing.
	      def attribute?(attribute_name)
	        query_attribute(attribute_name)
	      end

	      # Handle *= for method_missing.
	      def attribute=(attribute_name, value)
	        write_attribute(attribute_name, value)
	      end

Look at the code in method_missing above. If you’re accessing the object’s identifier, then it returns its value. If you’re calling an attribute accessor, then it calls the accessor with either a Dynamic Dispatch (for write or query accessors) or a direct call to read_attribute (for read accessors). Otherwise, method_missing sends the call up the chain of ancestors with super.
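
The suffix-matching trick is easier to follow in a stripped-down form. In this sketch, DynamicRecord and the SUFFIX pattern are assumptions; the real match_attribute_method? is more elaborate. The point is how md.pre_match and md.to_s split a name such as heavy_job? into an attribute name and a method type:

```ruby
class DynamicRecord
  SUFFIX = /(=|\?)\z/ # assumed pattern, simpler than the one in Rails

  def initialize(attributes)
    @attributes = attributes
  end

  def method_missing(method_id, *args, &block)
    method_name = method_id.to_s
    if (md = SUFFIX.match(method_name))
      # "heavy_job?" splits into "heavy_job" and "?".
      attribute_name, method_type = md.pre_match, md.to_s
      if @attributes.include?(attribute_name)
        # Builds "attribute=" or "attribute?" and dispatches to it.
        __send__("attribute#{method_type}", attribute_name, *args, &block)
      else
        super
      end
    elsif @attributes.include?(method_name)
      @attributes[method_name]
    else
      super
    end
  end

  def respond_to_missing?(method_id, include_private = false)
    @attributes.include?(method_id.to_s.sub(SUFFIX, "")) || super
  end

  private

  def attribute?(attribute_name)
    !!@attributes[attribute_name]
  end

  def attribute=(attribute_name, value)
    @attributes[attribute_name] = value
  end
end

task = DynamicRecord.new("heavy_job" => 1)
puts task.heavy_job?
task.heavy_job = nil
puts task.heavy_job?
```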

I don’t want to waste your time with unnecessary details, so I only showed you part of the code for attribute methods in Rails 2. What you’ve seen, however, shows that both the feature and its code became more complicated in the second major version of Rails. Let’s see how this trend continued in the following versions.

Rails 3 and 4: More Special Cases

In Rails 1, attribute methods were implemented with a few dozen lines of code. In Rails 2, they had their own file and hundreds of lines of code. In Rails 3, they spanned nine files of source code, not including tests.

As Rails applications became larger and more sophisticated, the authors of the framework kept uncovering small twists, performance optimizations, and corner cases related to attribute methods. The code and the number of metaprogramming tricks it used grew with the number of corner cases. I’ll show you only one of those corner cases, but even this single example is too long to fit in this chapter, so I will just show you a few snippets of code as quickly as I can. Brace yourself.

The example I picked is one of the most extreme optimizations in modern Rails. We’ve seen that Rails 2 improved performance by turning Ghost Methods into Dynamic Methods. Rails 4 goes one step further: when it defines an attribute accessor, it also turns it into an UnboundMethod and stores it in a method cache. If a second class has an attribute by the same name, and hence needs the same accessor, Rails 4 just retrieves the previously defined accessor from the cache and binds it to the second class. This way, if different attributes in separate classes happen to have the same name, then Rails defines only a single set of accessor methods and reuses those methods for all attributes. (I’m as surprised as you are that this optimization has a visible effect on performance, but in the case of Rails, it does.)

I’ll start with code from deep inside the attribute methods implementation:


	module ActiveRecord
	  module AttributeMethods
	    module Read
	      extend ActiveSupport::Concern

	      module ClassMethods
	        if Module.methods_transplantable?
	          def define_method_attribute(name)
	            method = ReaderMethodCache[name]
	            generated_attribute_methods.module_eval { define_method name, method }
	          end
	        else
	          def define_method_attribute(name)
	            # ...
	          end
	        end

This code defines a method named define_method_attribute. This method will ultimately become a class method of ActiveRecord::Base, thanks to the mechanism we discussed in Chapter 10. Here, however, comes a twist: define_method_attribute is defined differently depending on the result of the Module.methods_transplantable? method.

Module.methods_transplantable? comes from the Active Support library, and it answers one very specific question: can I bind an UnboundMethod to an object of a different class? I mentioned earlier that you can only do that from Ruby 2.0 onward, so this code defines define_method_attribute in two different ways depending on whether you’re running Rails on Ruby 1.9 or 2.x.

Assume that you’re running Ruby 2.0 or later. In this case, define_method_attribute retrieves an UnboundMethod from a cache of methods, and it binds the method to the current module with define_method. The cache of methods is stored in a constant named ReaderMethodCache.
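
A small experiment shows what "transplanting" a method looks like on Ruby 2.0 or later. The holder module and the TaskRow and ProjectRow classes are made up for this sketch; the only Rails-like detail is that one method body, taken from a module as an UnboundMethod, ends up as an accessor in unrelated classes:

```ruby
# A module that holds one generated reader under a temporary name.
holder = Module.new do
  def __temp__description
    @attributes["description"]
  end
end

# Take the method out of the module as an UnboundMethod.
reader = holder.instance_method(:__temp__description)

class TaskRow
  def initialize(attributes)
    @attributes = attributes
  end
end

class ProjectRow
  def initialize(attributes)
    @attributes = attributes
  end
end

# Define the same method body in two unrelated classes under a new name.
[TaskRow, ProjectRow].each do |klass|
  klass.send(:define_method, :description, reader)
end

puts TaskRow.new("description" => "Clean the garage").description
puts ProjectRow.new("description" => "Spring cleaning").description
```

Neither class includes the holder module, yet both get a working description reader backed by the same method body.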

(The call to generated_attribute_methods might look confusing: it returns a Clean Room that serializes method definitions happening in different threads.)

Let’s go see how ReaderMethodCache is initialized. The long comment gives an idea of how tricky it must have been to write this code:


	module ActiveRecord
	  module AttributeMethods
	    module Read
	      ReaderMethodCache = Class.new(AttributeMethodCache) {
	        private
	        # We want to generate the methods via module_eval rather than
	        # define_method, because define_method is slower on dispatch.
	        # Evaluating many similar methods may use more memory as the instruction
	        # sequences are duplicated and cached (in MRI). define_method may
	        # be slower on dispatch, but if you're careful about the closure
	        # created, then define_method will consume much less memory.
	        #
	        # But sometimes the database might return columns with
	        # characters that are not allowed in normal method names (like
	        # 'my_column(omg)'. So to work around this we first define with
	        # the __temp__ identifier, and then use alias method to rename
	        # it to what we want.
	        #
	        # We are also defining a constant to hold the frozen string of
	        # the attribute name. Using a constant means that we do not have
	        # to allocate an object on each call to the attribute method.
	        # Making it frozen means that it doesn't get duped when used to
	        # key the @attributes_cache in read_attribute.
	        def method_body(method_name, const_name)
	          <<-EOMETHOD
	          def #{method_name}
	            name = ::ActiveRecord::AttributeMethods::AttrNames::ATTR_#{const_name}
	            read_attribute(name) { |n| missing_attribute(n, caller) }
	          end
	          EOMETHOD
	        end
	      }.new

ReaderMethodCache is an instance of an anonymous class, a subclass of AttributeMethodCache. This class defines a single method that returns a String of Code. (If you’re perplexed by the call to Class.new, look back at the earlier discussion of it. If you don’t understand the EOMETHOD lines, read up on “here documents.”)

Let’s leave ReaderMethodCache for a moment and move to the definition of its superclass AttributeMethodCache:


	module ActiveRecord
	  module AttributeMethods
	    AttrNames = Module.new {
	      def self.set_name_cache(name, value)
	        const_name = "ATTR_#{name}"
	        unless const_defined? const_name
	          const_set const_name, value.dup.freeze
	        end
	      end
	    }

	    class AttributeMethodCache
	      def initialize
	        @module = Module.new
	        @method_cache = ThreadSafe::Cache.new
	      end
	      def [](name)
	        @method_cache.compute_if_absent(name) do
	          safe_name = name.unpack('h*').first
	          temp_method = "__temp__#{safe_name}"
	          ActiveRecord::AttributeMethods::AttrNames.set_name_cache safe_name, name
	          @module.module_eval method_body(temp_method, safe_name),
	            __FILE__, __LINE__
	          @module.instance_method temp_method
	        end
	      end

	      private
	      def method_body; raise NotImplementedError; end
	    end

First, look at AttrNames: it’s a module with one single method, set_name_cache. Given a name and a value, set_name_cache defines a conventionally named constant with that value. For example, if you pass it the string "description", then it defines a constant named ATTR_description. AttrNames is somewhat similar to a Clean Room; it only exists to store constants that represent the names of attributes.

Now move down to AttributeMethodCache. Its [] method takes the name of an attribute, and it returns an accessor to that attribute as an UnboundMethod. It also takes care of at least one important special case: attribute accessors are Ruby methods, but not all attribute names are valid Ruby method names. (You can read one counterexample in the comment to ReaderMethodCache#method_body above.) This code solves that problem by encoding the attribute name as a hexadecimal sequence and creating a conventional safe method name from it.
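
To see why the hexadecimal encoding helps, you can try it on the awkward column name from the comment above. In this sketch, AttrNameStore is a hypothetical stand-in for AttrNames; the unpack call and the naming scheme are the ones shown in the Rails code:

```ruby
name = "my_column(omg)" # not a valid Ruby method name

# Every byte becomes two hexadecimal digits, so the result contains
# only characters that are legal in method and constant names.
safe_name = name.unpack("h*").first
temp_method = "__temp__#{safe_name}"

# Store the original name in a frozen constant, once.
AttrNameStore = Module.new
const_name = "ATTR_#{safe_name}"
unless AttrNameStore.const_defined?(const_name)
  AttrNameStore.const_set(const_name, name.dup.freeze)
end

puts temp_method
puts AttrNameStore.const_get(const_name)
```

The encoding is reversible with pack("h*"), although the cache never needs to decode a name: it only needs a string that is safe to place after def.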

Once it has a safe name for the accessor, AttributeMethodCache#[] calls method_body to get a String of Code that defines the accessor’s body, and it defines the accessor inside a Clean Room named simply @module. (We discussed additional arguments to module_eval, such as __FILE__ and __LINE__, earlier.) Finally, AttributeMethodCache#[] gets the newly created accessor method from the Clean Room and returns it as an UnboundMethod.

On subsequent calls, AttributeMethodCache#[] won’t need to define the method anymore. @method_cache.compute_if_absent runs its block only when the name is not yet in the cache; otherwise it returns the UnboundMethod it stored the first time. This policy shaves some time off in cases where the same accessor is defined on multiple classes.

To close the loop, look back at the code of ReaderMethodCache. By overriding method_body and returning the String of Code for a read accessor, ReaderMethodCache turns the generic AttributeMethodCache into a cache for read accessors. As you might expect, there is also a WriterMethodCache class that takes care of write accessors.
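
Putting the pieces together, here is a compact sketch of the whole arrangement: a generic cache, a subclass that supplies the String of Code, and a class that receives the cached accessor. AccessorCache, ReaderCache, and CachedTask are hypothetical names, a Hash guarded by a Mutex stands in for ThreadSafe::Cache, and the generated reader looks up a plain hash instead of calling read_attribute:

```ruby
class AccessorCache
  def initialize
    @module = Module.new # Clean Room for the generated methods
    @method_cache = {}
    @lock = Mutex.new    # stand-in for ThreadSafe::Cache
  end

  # Returns the accessor for name as an UnboundMethod, defining it only once.
  def [](name)
    @lock.synchronize do
      @method_cache[name] ||= begin
        safe_name = name.unpack("h*").first
        temp_method = "__temp__#{safe_name}"
        @module.module_eval(method_body(temp_method, name), __FILE__, __LINE__)
        @module.instance_method(temp_method)
      end
    end
  end

  private

  # Subclasses return the source code of the accessor.
  def method_body(_method_name, _attr_name)
    raise NotImplementedError
  end
end

ReaderCache = Class.new(AccessorCache) {
  private

  def method_body(method_name, attr_name)
    <<-EOMETHOD
      def #{method_name}
        @attributes[#{attr_name.inspect}]
      end
    EOMETHOD
  end
}.new

class CachedTask
  def initialize(attributes)
    @attributes = attributes
  end
end

# Attach the cached reader under the real, possibly awkward, column name.
CachedTask.send(:define_method, "my_column(omg)", ReaderCache["my_column(omg)"])
puts CachedTask.new("my_column(omg)" => 42).public_send("my_column(omg)")
```

A writer cache would differ only in the string returned by method_body, which is why the generic class leaves that one method to its subclasses.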

Is your head spinning a little after this long explanation? Mine is. This example shows how deep and complex attribute methods have become, how many special cases they have covered, and how much they’ve changed since their simple beginnings. Now we can draw some general conclusions.
