Book: Metaprogramming Ruby 2

Kernel#eval

Where you learn that, when it comes right down to it, code is just text.

You already learned about instance_eval and class_eval. Now you can get acquainted with the third member of the *eval family: a Kernel Method that’s simply named eval. Kernel#eval is the most straightforward of the three *eval methods. Instead of a block, it takes a string that contains Ruby code, or a String of Code for short. Kernel#eval executes the code in the string and returns the result:

 
array = [10, 20]
element = 30
eval("array << element") # => [10, 20, 30]

Executing a literal string of Ruby code is a pretty pointless exercise, but the power of eval becomes apparent when you compute your Strings of Code on the fly. Here’s an example.

The REST Client Example

REST Client (installed with gem install rest-client) is a simple HTTP client library. It includes an interpreter where you can issue regular Ruby commands together with HTTP methods such as get:

=> 
restclient http://www.twitter.com
> html_first_chars = get("/")[0..14]
=> "<!DOCTYPE html>"

If you look in the gem’s source, you will see that get and the three other basic HTTP methods are defined on the Resource class:

 
module RestClient
  class Resource
    def get(additional_headers={}, &block) # ...
    def post(payload, additional_headers={}, &block) # ...
    def put(payload, additional_headers={}, &block) # ...
    def delete(additional_headers={}, &block) # ...

To make get and its siblings available in the interpreter, REST Client defines four top-level methods that delegate to the methods of a Resource at a specific URL. For example, here is how the top-level get delegates to a Resource (returned by the r method):

 
def get(path, *args, &b)
  r[path].get(*args, &b)
end

You might expect to find this definition of get in the source code, together with similar definitions for put, post, and delete. However, here comes a twist. Instead of defining the four methods separately, REST Client defines all of them in one shot by creating and evaluating four Strings of Code in a loop:

 
POSSIBLE_VERBS = ['get', 'put', 'post', 'delete']

POSSIBLE_VERBS.each do |m|
  eval <<-end_eval
    def #{m}(path, *args, &b)
      r[path].#{m}(*args, &b)
    end
  end_eval
end

The code above uses an exotic syntax known as a here document, or heredoc for short. What you’re seeing after the eval is just a regular Ruby string, although it’s not delimited by the usual quotes. Instead, it starts with a <<- sequence followed by an arbitrary termination sequence—in this case, end_eval. The string ends on the first line that contains only the termination sequence, so this particular string spans the lines from the def to the first end included. The code uses regular string substitution to generate and eval four Strings of Code, one each for the definitions of get, put, post, and delete.
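The same loop-plus-heredoc technique can be sketched outside REST Client. The Shouter class and its method names below are hypothetical, chosen only for illustration. Note the escaped \#{name}: it survives the string substitution untouched and is interpolated later, when the generated method actually runs.

```ruby
# Hypothetical example class; not part of REST Client.
class Shouter
  WORDS = ['hello', 'goodbye']

  WORDS.each do |word|
    # Each iteration builds one String of Code and evaluates it.
    # #{word} is substituted now; \#{name} is left for the generated method.
    eval <<-end_eval
      def #{word}(name)
        "#{word.upcase}, \#{name}!"
      end
    end_eval
  end
end

Shouter.new.hello("Bill")   # => "HELLO, Bill!"
```

Because the eval runs inside the class body, the generated methods become ordinary instance methods of Shouter.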

Most Strings of Code feature some kind of string substitution, as in the example above. For an alternate way to use eval, you can evaluate arbitrary Strings of Code from an external source, effectively building your own simple Ruby interpreter.

If you want to use Kernel#eval to its fullest potential, you should also learn about the Binding class.

Binding Objects

A Binding is a whole scope packaged as an object. The idea is that you can create a Binding to capture the local scope and carry it around. Later, you can execute code in that scope by using the Binding object in conjunction with eval.

You can create a Binding with the Kernel#binding method:

 
class MyClass
  def my_method
    @x = 1
    binding
  end
end

b = MyClass.new.my_method

You can think of Binding objects as “purer” forms of closures than blocks because these objects contain a scope but don’t contain code. You can evaluate code in the captured scope by passing the Binding as an additional argument to eval:

 
eval "@x", b # => 1
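A Binding captures local variables as well as instance variables. The sketch below assumes a hypothetical helper named capture_scope; it shows that code evaluated through the Binding can both read and change a local that would otherwise be out of reach.

```ruby
# Hypothetical helper: returns a Binding that captures its local scope.
def capture_scope
  count = 41
  binding
end

scope = capture_scope

eval("count += 1", scope)          # runs inside the captured scope
eval("count", scope)               # => 42
scope.local_variable_get(:count)   # => 42
```

Binding#local_variable_get reads the same variable without going through a String of Code.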

Ruby also provides a predefined constant named TOPLEVEL_BINDING, which is just a Binding of the top-level scope. You can use it to access the top-level scope from anywhere in your program:

 
class AnotherClass
  def my_method
    eval "self", TOPLEVEL_BINDING
  end
end

AnotherClass.new.my_method # => main

One gem that makes good use of bindings is Pry, which you met earlier in the book. Pry defines an Object#pry method that opens an interactive session inside the object’s scope, similar to what irb does with nested sessions. You can use this method as a debugger of sorts: instead of setting a breakpoint, you add a line to your code that calls pry on the current binding, as shown in the following code.

 
# code...
require "pry"; binding.pry
# more code...

The call to binding.pry opens a Ruby interpreter in the current binding, right inside the running process. From there, you can read and change your variables at will. When you want to exit the interpreter, just type exit to continue running the program. Thanks to this feature, Pry is a great alternative to traditional debuggers.

Pry is not the only command-line interpreter that uses bindings. Let’s also look at irb, the default Ruby command-line utility.

The irb Example

At its core, irb is just a simple program that parses the standard input or a file and passes each line to eval. (This type of program is sometimes called a Code Processor.) Here’s the line that calls eval, deep within irb’s source code, in a file named workspace.rb:

 
eval(statements, @binding, file, line)

The statements argument is just a line of Ruby code. But what about those three additional arguments to eval? Let’s go through them.

The first optional argument to eval is a Binding, and irb can change this argument to evaluate code in different contexts. This happens, for example, when you open a nested irb session on a specific object, by typing irb followed by the name of an object in an existing irb session. As a result, your next commands will be evaluated in the context of the object, similar to what instance_eval does.
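The role of the Binding argument can be sketched with a tiny, hypothetical Code Processor. This is not irb's actual code: the process_statements helper is made up, and using instance_eval { binding } to obtain a Binding whose self is a given object is just one way to imitate a nested session.

```ruby
# Hypothetical mini Code Processor: evaluates each statement in one Binding.
def process_statements(statements, scope)
  statements.map { |statement| eval(statement, scope) }
end

# A plain session: the Binding remembers locals between statements.
session = binding
process_statements(["x = 20", "x * 2 + 2"], session).last     # => 42

# A "nested session": a Binding whose self is a specific object.
string_scope = "hello".instance_eval { binding }
process_statements(["n = length", "n + 1"], string_scope).last # => 6
```

In the second call, length is resolved against the string "hello" because that string is self inside the captured scope.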

What about file and line, the remaining two optional arguments to eval? These arguments are used to tweak the stack trace in case of exceptions. You can see how they work by writing a Ruby program that raises an exception:

 
# this file raises an Exception on the second line
x = 1 / 0

You can process this program with irb by typing irb exception.rb at the prompt. If you do that, you’ll get an exception on line 2 of exception.rb:

<= 
ZeroDivisionError: divided by 0
  from exception.rb:2:in `/'

When irb calls eval, it calls it with the filename and line number it’s currently processing. That’s why you get the right information in the exception’s stack trace. Just for fun, you can hack irb’s source and remove the last two arguments from the call to eval (remember to undo the change afterward):

 
eval(statements, @binding) ​# , file, line)

Run irb exception.rb now, and the exception reports the file and line where eval is called:

<= 
ZeroDivisionError: divided by 0
  from /Users/nusco/.rvm/rubies/ruby-2.0.0/lib/ruby/2.0.0/irb/workspace.rb:54:in `/'

This kind of hacking of the stack trace is especially useful when you write Code Processors, but consider using it everywhere you evaluate a String of Code so you can get a better stack trace in case of an exception.
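The effect of the file and line arguments is easy to observe in isolation. In the sketch below, the helper name, the file name template.rb, and the starting line 10 are all made up for illustration; the point is that the backtrace reports the position you passed to eval, offset by the line inside the string.

```ruby
# Hypothetical helper: evaluates code and returns the first backtrace line.
def first_backtrace_line(code, file, line)
  eval(code, binding, file, line)
  nil
rescue => e
  e.backtrace.first
end

code = "x = 1\nraise 'boom'"   # the error is on the second line of the string

# The string's first line counts as line 10, so the raise is on line 11.
first_backtrace_line(code, "template.rb", 10)   # starts with "template.rb:11"
```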

Strings of Code vs. Blocks

Earlier, you learned that eval is a special case in the *eval family: it evaluates a String of Code instead of a block, like both class_eval and instance_eval do. However, this is not the whole truth. Although it’s true that eval always requires a string, instance_eval and class_eval can take either a String of Code or a block.

This shouldn’t come as a big surprise. After all, code in a string is not that different from code in a block. Strings of Code can even access local variables like blocks do:

 
array = ['a', 'b', 'c']
x = 'd'
array.instance_eval "self[1] = x"

array # => ["a", "d", "c"]
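To see the two forms side by side, here is a small sketch that performs the same operations once with a String of Code and once with a block. The Sample class and the variable names are hypothetical.

```ruby
letters_a = ['a', 'b', 'c']
letters_b = ['a', 'b', 'c']
x = 'd'

letters_a.instance_eval "self[1] = x"     # String of Code
letters_b.instance_eval { self[1] = x }   # block

letters_a == letters_b   # => true

# class_eval accepts both forms as well.
class Sample; end
Sample.class_eval "def from_string; :string; end"
Sample.class_eval { def from_block; :block; end }

Sample.new.from_string   # => :string
Sample.new.from_block    # => :block
```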

Because a block and a String of Code are so similar, in many cases you have the option of using either one. Which one should you choose? The short answer is that you should probably avoid Strings of Code whenever you have an alternative. Let’s see why.

The Trouble with eval()

Strings of Code are powerful, no doubt about that. But with great power comes great responsibility—and danger.

To start with, Strings of Code don’t always play well with your editor’s syntax coloring and autocompletion. Even when they do get along with everyone, Strings of Code tend to be difficult to read and modify. Also, Ruby won’t report a syntax error in a String of Code until that string is evaluated, potentially resulting in brittle programs that fail unexpectedly at runtime.
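The late syntax check is easy to demonstrate. In this sketch the broken string is made up for illustration; Ruby loads the program without complaint and raises SyntaxError only when eval runs. Note that SyntaxError is not a StandardError, so a bare rescue would not catch it.

```ruby
# This string is not valid Ruby, but nothing complains until eval runs.
broken_code = "[1, 2, 3].map { |n| n * 2"   # missing closing brace

message =
  begin
    eval(broken_code)
    "no error"
  rescue SyntaxError => e
    "failed at runtime: #{e.class}"
  end

message   # => "failed at runtime: SyntaxError"
```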

Thankfully, these annoyances are minor compared to the biggest issue with eval: security. This particular problem calls for a more detailed explanation.

Code Injection

Assume that, like most people, you have trouble remembering what each of the umpteen methods of Array does. As a speedy way to refresh your memory, you can write an eval-based utility that allows you to call a method on a sample array and view the result (call it the array explorer):

 
def explore_array(method)
  code = "['a', 'b', 'c'].#{method}"
  puts "Evaluating: #{code}"
  eval code
end

loop { p explore_array(gets()) }

The infinite loop on the last line collects strings from the standard input and feeds these strings to explore_array. In turn, explore_array turns the strings into method calls on a sample array. For example, if you feed the string "reverse()" to explore_array, the method will evaluate the string "['a', 'b', 'c'].reverse()". It’s time to try out this utility:

=> 
find_index("b")
<= 
Evaluating: ['a', 'b', 'c'].find_index("b")
1
=> 
map! {|e| e.next }
<= 
Evaluating: ['a', 'b', 'c'].map! {|e| e.next }
["b", "c", "d"]

Now imagine that, being a sharing kind of person, you decide to make this program widely available on the web. You hack together a quick web page, and—presto!—you have a site where people can call array methods and see the results. (To sound like a proper startup, you might call this site “Arry” or maybe “MeThood.”) Your wonderful site takes the Internet by storm, until a sneaky user feeds it a string like this:

=> 
object_id; Dir.glob("*")
<= 
['a', 'b', 'c'].object_id; Dir.glob("*") => [your own private information here]

The input is an inconsequential call to the array, followed by a command that lists all the files in your program’s directory. Oh, the horror! Your malicious user can now execute arbitrary code on your computer—code that does something terrible like wipe your hard disk clean or post your love letters to your entire address book. This kind of exploit is called a code injection attack.
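The mechanics of the attack can be reproduced with a harmless payload. This sketch uses a stripped-down explore_array without the logging and the input loop; the injected statement just returns a string, standing in for whatever an attacker might really run.

```ruby
def explore_array(method)
  eval "['a', 'b', 'c'].#{method}"
end

explore_array("reverse")   # => ["c", "b", "a"]

# Everything after the semicolon is a separate statement, and eval
# returns the value of the last statement it runs.
explore_array("size; 'injected code ran'")   # => "injected code ran"
```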

Defending Yourself from Code Injection

How can you protect your code from code injection? You might parse all Strings of Code to identify operations that are potentially dangerous. This approach may prove ineffective, though, because there are so many possible ways to write malicious code. Trying to outsmart a determined hacker can be dangerous to both your computer and your ego.

When it comes to code injection, some strings are safer than others. Only strings that derive from an external source can contain malicious code, so you might simply limit your use of eval to those strings that you wrote yourself. This is the case in the REST Client example. In more complicated cases, however, it can be surprisingly difficult to track which strings come from where.

With all these challenges, some programmers advocate banning eval altogether. Programmers tend to be paranoid about anything that might possibly go wrong, so this eval ban turns out to be a pretty popular choice. (Actually, we’re not paranoid. It’s the government putting something in the tap water that makes us feel that way.)

If you do away with eval, you’ll have to look for alternative techniques on a case-by-case basis. For an example, look back at the eval in the REST Client example. You could replace it with a Dynamic Method and Dynamic Dispatch:

 
POSSIBLE_VERBS.each do |m|
  define_method m do |path, *args, &b|
    r[path].send(m, *args, &b)
  end
end
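Here is a self-contained sketch of the same replacement that you can run on its own. FakeResource and FakeShell are hypothetical stand-ins, not REST Client's real classes. Because define_method is a method of Module, the sketch puts the loop inside a class body.

```ruby
# Hypothetical stand-in for a REST resource.
class FakeResource
  def initialize(url)
    @url = url
  end

  # Indexing by path yields a sub-resource.
  def [](path)
    FakeResource.new(@url + path)
  end

  def get(*args)
    "GET #{@url}"
  end

  def delete(*args)
    "DELETE #{@url}"
  end
end

# Hypothetical stand-in for the interpreter that exposes the verbs.
class FakeShell
  VERBS = ['get', 'delete']

  def r
    @r ||= FakeResource.new("http://example.com")
  end

  VERBS.each do |m|
    # Dynamic Method (define_method) plus Dynamic Dispatch (send).
    define_method(m) do |path, *args, &b|
      r[path].send(m, *args, &b)
    end
  end
end

FakeShell.new.get("/users")   # => "GET http://example.com/users"
```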

In the same way, you could rewrite the Array Explorer utility from the Code Injection section by using Dynamic Dispatch in place of eval:

 
def explore_array(method, *arguments)
  ['a', 'b', 'c'].send(method, *arguments)
end
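A send-based explorer still lets the caller name any method, including ones such as instance_eval that evaluate a String of Code themselves. One possible hardening, which is an assumption of this sketch rather than something taken from the text above, is to check the method name against an allowlist before dispatching. The ALLOWED_METHODS list is hypothetical.

```ruby
# Hypothetical allowlist; pick the methods you actually want to expose.
ALLOWED_METHODS = %w[first last reverse size find_index include?].freeze

def explore_array(method, *arguments)
  unless ALLOWED_METHODS.include?(method.to_s)
    raise ArgumentError, "method not allowed: #{method}"
  end
  # public_send refuses private methods, unlike send.
  ['a', 'b', 'c'].public_send(method, *arguments)
end

explore_array("find_index", "b")   # => 1
```

A call such as explore_array("instance_eval", "1") now raises ArgumentError instead of evaluating its argument.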

Still, there are times when you might just miss eval. For example, this latest, safer version of the Array Explorer wouldn’t allow your web user to call a method that takes a block. If you want to describe a Ruby block in a web interface, you need to allow the user to insert arbitrary Strings of Code.

It’s not easy to hit the sweet spot between too much eval and no eval at all. If you don’t want to abstain from eval completely, Ruby does provide some features that make it somewhat safer. Let’s take a look at them.

Tainted Objects and Safe Levels

Ruby automatically marks potentially unsafe objects—in particular, objects that come from external sources—as tainted. Tainted objects include strings that your program reads from web forms, files, the command line, or even a system variable. Every time you create a new string by manipulating tainted strings, the result is itself tainted. Here’s an example program that checks whether an object is tainted by calling its tainted? method:

 
# read user input
user_input = "User input: #{gets()}"
puts user_input.tainted?
=> 
x = 1
<= 
true

If you had to check every string for taintedness, then you wouldn’t be in a much better position than if you had simply tracked unsafe strings on your own. But Ruby also provides the notion of safe levels, which complement tainted objects nicely. When you set a safe level (which you can do by assigning a value to the $SAFE global variable), you disallow certain potentially dangerous operations.

You can choose from four safe levels, from the default 0 (“hippie commune,” where you can hug trees and format hard disks) to 3 (“military dictatorship,” where every object you create is tainted by default). A safe level of 2, for example, disallows most file-related operations. Any safe level greater than 0 also causes Ruby to flat-out refuse to evaluate tainted strings:

 
$SAFE = 1
user_input = "User input: #{gets()}"
eval user_input
=> 
x = 1
<= 
SecurityError: Insecure operation - eval

Ruby 2.0 and earlier also had a safe level of 4 that didn’t even allow you to exit the program freely. For various reasons, this extreme safe level turned out to be not as secure as people assumed it would be, so it has been removed in Ruby 2.1.

To fine-tune safety, you can explicitly remove the taintedness on Strings of Code before you evaluate them (you can do that by calling Object#untaint) and then rely on safe levels to disallow dangerous operations such as disk access.

By using safe levels carefully, you can write a controlled environment for eval. Such an environment is called a Sandbox. Let’s take a look at a Sandbox taken from a real-life library.

The ERB Example

The ERB standard library is the default Ruby template system. This library is a Code Processor that you can use to embed Ruby into any file, such as this template containing a snippet of HTML:

<p><strong>Wake up!</strong> It's a nice sunny <%= Time.new.strftime("%A") %>.</p>

The special <%= ... %> tag contains embedded Ruby code. When you pass this template through ERB, the code is evaluated:

 
require 'erb'
erb = ERB.new(File.read('template.rhtml'))
erb.run
<= 
<p><strong>Wake up!</strong> It's a nice sunny Friday.</p>

Somewhere in ERB’s source, there must be a method that takes a snippet of Ruby code extracted from the template and passes it to eval. Sure enough, here it is:

 
class ERB
  def result(b=new_toplevel)
    if @safe_level
      proc {
        $SAFE = @safe_level
        eval(@src, b, (@filename || '(erb)'), 0)
      }.call
    else
      eval(@src, b, (@filename || '(erb)'), 0)
    end
  end
  #...

new_toplevel is a method that returns a copy of TOPLEVEL_BINDING. The @src instance variable carries the content of a code tag, and the @safe_level instance variable contains the safe level required by the user. If no safe level is set, the content of the tag is simply evaluated. Otherwise, ERB builds a quick Sandbox: it makes sure that the global safe level is exactly what the user asked for and also uses a Proc as a Clean Room to execute the code in a separate scope. (Note that the new value of $SAFE applies only inside the Proc. Contrary to what happens with other global variables, the Ruby interpreter takes care to reset $SAFE to its former value after the call.)
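The Binding parameter of result is what lets a template see your own variables. A minimal sketch, assuming a hypothetical render_greeting helper and an inline template instead of a file:

```ruby
require 'erb'

# Hypothetical helper: renders a template in the scope of its own locals.
def render_greeting(day)
  template = ERB.new("<p>It's a nice sunny <%= day %>.</p>")
  template.result(binding)   # the Binding gives the template access to day
end

render_greeting("Friday")   # => "<p>It's a nice sunny Friday.</p>"
```

Without the explicit Binding, the template code would be evaluated in a top-level scope where day is not defined.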

“Now,” Bill says, finally wrapping up his long explanation, “you know about eval and how dangerous it can be. But eval is great to get code up and running quickly. That’s why you can use this method as a first step to solve your original problem: writing the attribute generator for the boss.”
