Ruby

From DesigningPatterns

Background

Ruby is an interpreted, multi-platform scripting language. It is considered to be a "very high level language", and the closest comparison (and competitor) probably is Python. It is used heavily in the web space (particularly via Ruby on Rails) but also is used for system administration. Ruby was influenced heavily by LISP.

Characteristics

General

  • Everything is an expression in Ruby and returns a value (even branching and looping constructs); there are no statements.
  • The case of a variable name is significant (e.g., the first letter of a constant always is capitalized, whereas a global variable always begins with $).
  • Ruby has very weak scoping. In particular, the scope of an entity can be within a method, within a class, or global; there is no notion of scoping within an if statement or while loop (as in C(++) and Java). On the other hand, Ruby does scope variables within blocks.
  • Ruby is fully garbage collected.
  • Ruby has lots of predefined global variables, very similar to Perl.
  • Ruby supports a BEGIN keyword to execute code before starting at the first line of the file and an END keyword to execute code after the interpreter has finished with the file. The semantics of these are a bit weird, so they probably should not be used.
  • There are no pointers.
  • Objects almost always are manipulated through references, and arguments are passed as references (strictly speaking, the reference itself is passed by value, so a method can mutate the caller's object but cannot rebind the caller's variable). Variable assignment actually changes a reference to point to a new object. The one exception is immediate values, but this does not create different semantics since the immediate value classes are not mutable.
  • Parallel assignment is a trip. Just don't do it!
  • The Ruby equivalent of a do {} while loop is on its way to being deprecated and so should not be used.
  • require and load execute within a top-level context, whatever the context in which they were invoked (this differs from #include statements in C and C++ which execute inline with the lexical context).
  • All instances (except for immediate values) have class (or class-like) entities associated with them called eigenclasses (or virtual classes or singleton classes). Singleton methods are methods of this eigenclass. These eigenclasses have super classes.
  • Including a module introduces a new link in a class's inheritance hierarchy between the class and its current "parent" (which either is its superclass or a previously included module). This link is apparent in the constant and method resolution algorithms, since a method in an included module will be found before a method in a parent class. In addition, including a module makes an instance respond to is_a? for that module.
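  • The require function searches only the load path ($LOAD_PATH). In Ruby 1.8 the load path includes the working directory, but never the directory of the file that is doing the requiring (this contrasts with the behavior of the C pre-processor). Ruby 1.9.2 removed the working directory from the load path; Ruby 1.9 also added require_relative for loading a file relative to the requiring file.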
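
A few of the points above (expressions returning values, block-only scoping, and reference semantics) can be sketched as follows. This is a minimal illustration rather than anything from the original notes; the names inside_if, inside_block, parity, and append_bang are made up for the example.

```ruby
# An if body does not open a new scope, so a variable first assigned
# inside it is still visible afterwards.
if true
  inside_if = 1
end
visible_after_if = (defined?(inside_if)) ? true : false

# A block does open a scope: a variable first assigned inside the block
# is gone once the block finishes.
[1].each { |n| inside_block = n }
visible_after_block = (defined?(inside_block)) ? true : false

# Branching constructs are expressions and return a value.
parity = if 4.even? then "even" else "odd" end

# Hypothetical helper: the parameter is a reference to the caller's object.
def append_bang(text)
  text << "!"        # mutates the object the caller also refers to
  text = "rebound"   # rebinds only the local parameter
  text
end

original = String.new("ruby")
returned = append_bang(original)
```

After this runs, original has been changed through the shared reference, while returned holds the separate object that the parameter was rebound to.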

Type Properties

  • There are no type declarations or static type checking. Unlike C++, the concept of type is not based on the class of an instance but rather the methods that it supports (which may vary from instance to instance due to singleton methods). The idea of inheritance in order to share type is absent; inheritance shares methods. An object's type (the functionality that it supports) can be altered at run-time.
  • Ruby's integers have infinite capacity. This is implemented by transparently promoting Fixnum instances to Bignum instances.
  • Indexing into an integer by bit number is allowed, with '0' denoting the least significant bit.
  • Floating point literals in Ruby must have digits before and after the decimal point (i.e., 0.1 rather than .1 or 1.0 rather than 1.). I think that this is because 1. also could be a broken attempt to call a method on the Fixnum instance '1'.
  • Ruby string literals are mutable, which is quite different than C, C++, and Java. This leads to the Ruby interpreter creating a new String instance every time a string literal is evaluated, which means that String literals should not be used unnecessarily (inside loops, for instance, when the literal could be hoisted above the loop).
  • In Ruby 1.8, character literals were interpreted as Fixnums (?A == 65). In Ruby 1.9, they are interpreted as one character Strings (?A == "A") and so no longer have any real purpose in the language.
  • String handling was rewritten in Ruby 1.9 in order to provide full support for internationalization with the String class. In particular, Strings no longer always are byte sequences but may contain multi-byte characters. In addition, each String has an Encoding and support is provided for transcoding. Indexing into a multi-byte String is not an O(1) operation.
  • A single array can contain different types of elements.
  • The Array class defines vector, stack, and set operations.
  • The default hash() method returns an object's object_id, and the default hash equality comparison returns true if and only if two objects are the same instance. If a new kind of equality is defined (by overriding the eql? method), then a new hash() must be written so that equal keys always will have the same hash code.
  • If an object is used as a key to a hash table and then modified in such a way as to change the hash code, then the hash table will be corrupted until the next call to rehash, since the key's bucket no longer will correspond to its hash code. Ruby makes a special exception for the String class, however, and creates a private copy of a String used as a key.
  • Symbol instances are immutable, unique values and so very useful for hash keys and enumerated values.
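
The hash-key rules above can be sketched with a small value class. Point is a hypothetical class invented for this example; the rest uses only core Hash, String, and Integer behavior.

```ruby
# Hypothetical value class: eql? and hash are overridden together so that
# equal points land in the same hash bucket.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  def hash
    [x, y].hash
  end
end

lookup = { Point.new(1, 2) => "found" }
by_value = lookup[Point.new(1, 2)]   # a different instance, equal by value

# A mutable key whose hash code changes after insertion: lookups by that
# key are unreliable until rehash recomputes the buckets.
key = [1, 2]
table = { key => "value" }
key << 3
table.rehash
after_rehash = table[key]

# String keys are the special case: the hash stores a frozen private copy.
name = String.new("ruby")
names = { name => 1 }
name << "!"
still_keyed_by_original_text = names.key?("ruby")

# Integers can be indexed by bit number (bit 0 is least significant)
# and grow past the machine word size transparently.
low_bits = [5[0], 5[1], 5[2]]
big = 2**100
```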

Object Orientation

  • Every data item in Ruby is an object, including literals.
  • Ruby does not provide any direct access to class attributes (members).
  • Classes are themselves objects in Ruby. Thus, adding a class method is the same as adding a singleton method to the particular instance of the Class class.
  • Ruby does not ensure that base classes have been constructed properly (unlike C++); sub-classes explicitly must call super in their initialize methods in order to initialize the base class.
  • Even though every piece of data is an instance of an object, Ruby optimizes small, frequently used pieces of data (like standard integers) via immediate values.
  • Global methods (methods defined outside of any class) are treated as private instance methods of the Object class. Thus, since private methods never have an explicit receiver, since all code is executed with an object, and since all objects descend from Object, global methods can be used without qualification anywhere.
  • Instance variables cannot be accessed directly outside of a class. One instance of a class cannot directly touch the instance variables of another instance either; access has to go through methods (accessors, or reflection such as instance_variable_get).
  • Singleton methods are methods added only to an instance. Class methods actually are implemented as singleton methods for the class' Class instance.
  • Classes can be re-opened for the addition of instance methods, variables, and aliases.
  • A given instance can be "frozen" with the freeze method, so that its contents cannot be changed.
  • Constructors (the initialize() method) are inherited, which is very different than in Java or C++ in which constructors are not inherited.
  • A parent class' constructor is not automatically called by a child class' constructor; the child class must call the parent class' constructor explicitly (via super).
  • There are no destructors.
    • It is possible to define a finalize method (similar to Java) that is called by the garbage collector (ObjectSpace.define_finalizer).
  • All instance methods can be polymorphically overridden in Ruby, which is similar to Java but different than C++ (where polymorphic methods must be declared virtual).
  • private methods can be polymorphically overridden, which is a definite flaw in Ruby; this means that child classes inadvertently can interfere with parent implementations (by accidentally declaring a method with the same name as a private parent method that never was meant to be overridden).
  • Class methods can be polymorphically overridden in Ruby, which is different than in C++, which does not allow this (static member functions cannot be virtual).
  • Instance variables are not inherited but instead an instance variable used by a parent class will come into existence when the child class calls into the parent. Note, however, that if a child class uses an instance variable with the same name, then it will reference the instance variable of the parent. Thus, in Ruby, since access specifiers cannot be applied to variables, it is very important to ensure that child classes do not step on instance variables of parent classes.
  • Not allowing access specifiers to be applied to data is a definite flaw in Ruby; this means that child classes inadvertently can interfere with parent implementations (C++ and Java protect against this by allowing classes to declare member data private).
  • Constant access is not virtual, but instead is associated with the lexical scope of the reference point.
  • Class instance variables are instance variables of the class' Class instance.
  • dup and clone perform shallow copies of all instance data. Ruby allows a copy constructor (initialize_copy) to be defined; this is invoked after the shallow copy of data.
    • clone copies the object's internal state (such as whether it is frozen) in addition to its content; dup just copies the content.
    • dup and clone throw exceptions when called on Ruby Immediate Values. This seems like a language bug, but the policy is discussed here.
  • Unlike C++, Ruby does not force initialization of member data.
  • The instance methods of modules can be included in classes (yielding instance methods), instances (yielding singleton methods, via Object#extend), or Class instances (yielding class methods).
  • Mixing in modules provides a form of multiple inheritance.
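
Several of the object-model points above (explicit super, inherited initialize, modules in the ancestor chain, singleton methods, and dup versus clone) can be seen in one short sketch. Greeter, Base, Child, and Inheriting are hypothetical names chosen for the example.

```ruby
module Greeter
  def greet
    "hello from #{name}"
  end
end

class Base
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Child < Base
  include Greeter

  def initialize(name, extra)
    super(name)      # the parent initialize is not called automatically
    @extra = extra
  end
end

# No initialize of its own, so Base#initialize is inherited.
class Inheriting < Base
end

child = Child.new("child", 1)
chain = Child.ancestors.first(3)       # the module sits between Child and Base
is_greeter = child.is_a?(Greeter)

# A singleton method exists only on this one instance.
def child.shout
  greet.upcase
end
other = Child.new("other", 2)
other_can_shout = other.respond_to?(:shout)

inherited_name = Inheriting.new("x").name

# clone keeps the frozen state, dup copies only the content.
frozen = "abc".freeze
clone_frozen = frozen.clone.frozen?
dup_frozen = frozen.dup.frozen?
```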

Idioms

  • The classical Ruby iterator is an internal iterator. Ruby 1.9, however, introduces external iterators in the form of Enumerators. In general, an Enumerator simply is a proxy to an Enumerable object that only offers the Enumerable interface (1.8 offered these as well). In 1.9, these objects also can be used as external iterators (they contain state). Weirdly enough, Ruby chose to have the end of iteration marked by an exception (StopIteration), rather than using some flag. The external iterators only provide forward iteration.
  • Ruby "blocks" are blocks of code passed into methods. Ruby "bodies" are the contents of a module, class, or control expression.
  • Ruby may have an approximation of RAII by passing blocks (and closing resources after executing blocks)
    • This doesn't handle dealing with multiple resources at once (i.e., two files)
  • External iterators behave in an undefined way in the presence of concurrent modifications.
  • There is full support for exceptions; the exception handling is quite similar to that of C++ or Java. Some notable twists:
    • All exceptions must descend from Exception.
    • For whatever reason, Ruby uses raise and rescue instead of throw and catch.
    • There is an ensure clause for certain kinds of blocks; the code in such clauses always is executed when control leaves the block (due to any reason). Some of the code that would have been put into a C++ destructor probably should go here (closing files, unlocking mutexes, etc.).
    • C++ programs essentially will abort if an exception is thrown in the middle of exception handling; Ruby handles this by replacing the old exception with the new exception.
    • It is possible to suppress an exception by calling return in an ensure block.
  • Ruby is very influenced by and offers a great deal of support for functional programming.
    • One example of this influence is the preference for iterators over loops.
    • Through Proc instances, Ruby allows easy composition and partial application of functions. C++, by contrast, does this through the functional templates and function objects (bind1st, etc.).
  • The abbreviated assignment operators cannot be redefined; they always are expanded by the interpreter into a sequence of assignment and standard operators. This often makes abbreviated assignment operators less efficient, because they force the creation of new object instances (i.e., x += 1 is expanded into x = x + 1, and x + 1 creates a new instance).
  • The Singleton instance is constructed on-demand, although you could force early creation (for thread-safety or because the initialization is expensive) by just calling instance right after the class definition.
  • Classes and modules can be used for namespacing purposes.
  • Classes and modules can be nested within each other.
  • Modules generally either are created for namespacing or for mixin, although the module_function method allows a module to do both.
  • Ruby and C++ have diametrically opposed notions of "constants".
    • Ruby constants (denoted by their first character being a capital letter) are constant in that they always point to the same thing; the thing that they point to can be changed, however. A great example of this is the fact that every class is referenced by its name, which is a constant. Thus, if I create a Tony class, Tony is a constant that always points to class Tony; I can, however, use Tony in expressions that change the class Tony (calling a class method that changes class instance variables, for instance). Ruby also can prevent the contents of a variable from being modified with freeze, although this does not seem to be that common.
    • C++ constants generally prevent the contents of a variable from being modified, but are allowed to point to a different object. So, const Tony* pTony can be changed to point to a different Tony instance but, using pTony I cannot modify what is being pointed to (at least, I only can call const methods through pTony). C++ does provide syntax that prevents changing what pTony points to (Tony* const pTony for a constant pointer to a variable instance or const Tony* const pTony for a constant pointer to a constant instance), but this is quite rare in practice (for one thing, the syntax is pretty messy).
    • One disadvantage of the Ruby constant semantics is that getters (such as those created by attr_reader) provide non-constant access to member variables. Such getters return references to the member variables, and the contents of the member variable can be modified through that reference (if the variable's class is mutable). The only way around this would be to dup the member variable in the getter, which would be inefficient. By contrast, in C++ one simply can return a const pointer or reference.
  • The Struct class is Ruby's answer to C++'s std::pair class (and, of course, n-tuple classes).
    • Ruby's Struct works with an arbitrary number of members, but of course is not type-safe.
    • A new n-tuple template class needs to be created for each increment to n in C++.
  • Ruby has some support for threaded programming.
    • In Ruby 1.8, threads are implemented completely within the Ruby interpreter (N:1, thus cannot take advantage of multiple CPUs, since there only is one kernel thread executing).
    • In Ruby 1.9, Ruby threads are mapped to kernel threads (1:1) and so theoretically can take advantage of multiple CPUs, except that the Ruby interpreter only allows a single thread to execute at any given time (basically, the interpreter has a giant lock around it).
    • Given the above, Ruby threads are useful for doing I/O in the background but not for compute-bound tasks; in addition, a Ruby process never can use more than 1 CPU.
    • JRuby might have better threading support.
    • The Mutex class provides RAII semantics via the synchronize method.
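
Two of the idioms above lend themselves to a short sketch: external iteration ending in StopIteration, and the block-plus-ensure approximation of RAII. Resource is a hypothetical stand-in for a file or lock, and its open class method only mimics the shape of File.open with a block.

```ruby
# External iteration: calling each without a block returns an Enumerator,
# and running off the end raises StopIteration.
enum = [10, 20].each
first = enum.next
second = enum.next
finished = begin
  enum.next
  false
rescue StopIteration
  true
end

# Hypothetical resource that is released when the block exits,
# whether normally or through an exception.
class Resource
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  def self.open
    resource = new
    begin
      yield resource
    ensure
      resource.close   # always runs when control leaves the block
    end
  end
end

seen = nil
begin
  Resource.open do |resource|
    seen = resource
    raise "boom"
  end
rescue RuntimeError
  # the exception still propagates to here; the resource is already closed
end
closed_after_error = seen.closed
```

Mutex#synchronize follows the same pattern: the lock is taken before the block and released in an ensure-style cleanup afterwards.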

Methods

  • Method invocations (that is, a method and the object on which it is to be invoked) are represented by instances of Method; the C++ version of this is the bound pointer to method. Methods (without the receiver object) are represented by instances of UnboundMethod; the C++ version of this is the pointer to member method.
  • Due to the dynamic nature of Ruby, much of the language is implemented through Ruby method calls (i.e., iterator loops, access specifiers, etc.)
  • Many of Ruby's operators are implemented as methods; the + operator on integers, for instance, actually is a Fixnum method. Ruby grants tremendous freedom in overloading operators (i.e., operator [] can have multiple arguments).
  • Default method arguments are supported. Unlike C++, however, default arguments are evaluated at invocation time (not compile-time) and can be expressions (which can reference member variables). The Ruby 1.8 interpreter only evaluates default arguments if they are necessary; since default arguments are evaluated at invocation time, a frequently used default argument that is expensive to calculate could be a tricky performance issue.
  • Function overloading is not supported.
  • Parentheses are not required for method invocation in most cases but are in some corner cases. These probably always should be used for clarity (except when operators are involved, of course).
  • The alias keyword allows a method to be referred to by a different name.
  • Methods that end with = generally are setters, and the interpreter allows a space between the method name and the = during invocation (so that it looks as if an instance variable is being set).
  • Methods that end with ? are predicates by convention.
  • Methods that end with ! are "dangerous". Most commonly, "dangerous" just means that they modify the object on which they're called. This seems like a Ruby approximation to const correctness.
  • Methods actually can be undefined with the undef keyword, even inherited methods.
  • Variable length argument lists are supported; the method definition asks that additional arguments be coalesced into an array.
  • Named arguments are supported via hashes; at the cost of the creation and use of a hash, this allows arguments to a method to be specified in any order (and, naturally, for some arguments to be skipped).
  • Method resolution actually is cached, since the algorithm is quite involved. This contrasts with C++ or Java, which do the lookup at compile-time (although, in both cases, this lookup could result in a polymorphic call which actually only will be resolved at run-time; even in this scenario, however, there is not really a run-time "resolution" algorithm that must be run).
  • super inside a singleton method will call the instance method of the same name. Unlike super in Java, super in Ruby is not tied to the inheritance hierarchy but rather calls the method one level higher in the method resolution algorithm.
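
The notes on default arguments, setters, variable-length argument lists, and Method/UnboundMethod can be sketched together. Counter and its methods are hypothetical names used only for illustration.

```ruby
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  # The default is an expression evaluated at each invocation and may
  # reference instance state.
  def bump(step = @count + 1)
    @count += step
  end

  # A setter: a method whose name ends with =.
  def count=(value)
    @count = Integer(value)
  end

  # Extra arguments are coalesced into the steps array.
  def add_all(*steps)
    steps.each { |step| bump(step) }
    @count
  end
end

counter = Counter.new
counter.bump              # default step is 0 + 1
counter.bump              # default step is now 1 + 1
after_defaults = counter.count

counter.count = "10"      # the space before = is allowed at the call site
after_varargs = counter.add_all(1, 2)

# A Method carries its receiver; an UnboundMethod must be bound first.
bound = counter.method(:bump)
after_bound_call = bound.call(7)

unbound = Counter.instance_method(:bump)
rebound_result = unbound.bind(Counter.new).call(5)
```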

Blocks

  • Code blocks can be associated with any method invocation; methods that use blocks are called "iterators" (even if the method does not actually iterate over anything). The blocks can be anonymous or named (in which case, they are the last argument).
  • Argument passing to a block has similar semantics to parallel assignment.
  • Blocks are represented by instances of Proc. Proc instances can be "procs" or "lambdas". Procs are more block-like, in that the number of arguments passed to a proc when it is invoked does not have to be the same as the number of arguments specified by the proc. In addition, return, break, and next behave in procs as they do in blocks. In fact, calling return or break in a proc is dangerous, because a LocalJumpError will be thrown if control already has left the method in which the proc initially was defined. By contrast, a lambda has similar semantics to a method. In particular, the number of arguments passed to a lambda does have to match the number of arguments expected by the lambda. Also return, break, and next all cause the lambda to return.
  • Proc instances are closures, in that they retain the variable bindings (the mapping between a variable name and its storage in which it stores a reference to its value) that they referenced when they were defined. Note that they retain references to variable bindings, and not to what the variables point to. Thus, if a variable is changed to point to a new object, the Proc instance also will reference the new object.
    • Each invocation of a method creates a new slate of local variable bindings.
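
The proc/lambda differences and the closure behavior described above can be sketched as follows. All method and variable names here are invented for the example.

```ruby
loose  = proc   { |a, b| [a, b] }
strict = lambda { |a, b| [a, b] }

# A proc tolerates a missing argument; a lambda checks arity like a method.
loose_result = loose.call(1)
strict_error = begin
  strict.call(1)
  nil
rescue ArgumentError => error
  error.class
end

# return inside a lambda returns from the lambda only.
def lambda_return
  inner = lambda { return :from_lambda }
  inner.call
  :after_lambda
end

# return inside a proc returns from the enclosing method.
def proc_return
  inner = proc { return :from_proc }
  inner.call
  :after_proc
end

# Once the defining method has exited, return has nowhere to go.
def make_returning_proc
  proc { return :too_late }
end
orphan_error = begin
  make_returning_proc.call
  nil
rescue LocalJumpError => error
  error.class
end

# Closures capture the variable binding, not the value it held.
counter = 0
increment = lambda { counter += 1 }
increment.call
counter = 10        # rebinding is seen by the closure
increment.call
```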

Reflection

  • eval() executes in a temporary scope; it can access variables in the containing scope but any variables defined within the eval() code cease to exist after eval() returns.
  • Named local, global, instance, and class variables may be tested for existence, obtained, and set.
  • Named instance and class methods may be tested for existence, obtained, deleted, and invoked.
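
As a rough sketch of the reflection facilities listed above, the following tests for, reads, sets, invokes, and removes members by name. Widget is a hypothetical class; the eval behavior shown is that of Ruby 1.9 and later.

```ruby
class Widget
  def initialize
    @size = 1
  end

  def grow
    @size += 1
  end
end

widget = Widget.new

# Instance variables by name: test, set, get.
has_size = widget.instance_variable_defined?(:@size)
widget.instance_variable_set(:@size, 5)
size_now = widget.instance_variable_get(:@size)

# Methods by name: test, obtain, invoke, delete.
can_grow    = widget.respond_to?(:grow)
grow_method = widget.method(:grow)
after_call  = grow_method.call
after_send  = widget.send(:grow)
Widget.class_eval { remove_method :grow }
can_grow_after_removal = widget.respond_to?(:grow)

# eval sees existing locals, but locals it defines do not survive it.
base = 41
eval_result = eval("temp = base + 1; temp")
temp_survives = (defined?(temp)) ? true : false
```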

Metaprogramming

  • A common technique is to redefine the method_missing() instance method in order to be able to handle arbitrary method calls.
  • A similar technique is to define a const_missing() class method in order to be able to handle arbitrary class constant references.
  • Several hooks are defined (inherited, when a class is extended; method_added, when an instance method is added; method_missing, when an undefined method is called) that can be redefined to change Ruby's behavior when handling certain events.
  • Creating a proxy object is trivial in Ruby; just delegate all calls from the proxy to the actual object in method_missing. Note that such proxy classes should extend BasicObject in Ruby 1.9, which provides only the absolutely required methods that an instance needs to support; before Ruby 1.9, it was necessary to undefine the methods that the proxy inherited from Object, so that such calls would be forwarded to the actual object.
  • Alias chaining allows behavior transparently to be added to a method, without altering the method. One alias chaining technique is to alias the method to a different name and then to define a new method with the same name (overwriting the old method) that adds behavior and calls the alias to the old method (C++ interestingly enough can do this too for a function defined in a library; one assigns the address of the function to a function pointer and defines the new version to call the function pointer to the old). Another technique is to define a singleton method with the same name as the instance method that calls the original method via super after adding behavior.
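
The proxy and alias chaining techniques above can be sketched briefly. RecordingProxy, Account, and balance_without_audit are hypothetical names; the proxy assumes Ruby 1.9 or later for BasicObject.

```ruby
# A proxy that records each call and forwards it to the real object.
# BasicObject defines almost nothing, so nearly every call reaches
# method_missing.
class RecordingProxy < BasicObject
  def initialize(target)
    @target = target
    @calls = []
  end

  def proxied_calls
    @calls
  end

  def method_missing(name, *args, &block)
    @calls << name
    @target.__send__(name, *args, &block)
  end
end

proxy = RecordingProxy.new([3, 1, 2])
sorted = proxy.sort
size = proxy.size
calls = proxy.proxied_calls

# Alias chaining: keep the old method under a new name, then redefine
# the original name to add behavior and call through.
class Account
  def balance
    100
  end
end

class Account               # re-open the class
  alias_method :balance_without_audit, :balance

  def balance
    @audited = true
    balance_without_audit
  end

  def audited?
    @audited == true
  end
end

account = Account.new
audited_before = account.audited?
balance = account.balance
audited_after = account.audited?
```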

Extensions

  • Ruby has built-in documentation (via ri).
  • Ruby has a debugger (ruby -rdebug).
  • Ruby has an interactive interpreter (irb), which can be invoked from one's code with the ruby-breakpoint gem.
  • Ruby has a fully integrated unit testing framework (Test::Unit).
  • There is a set of tools built over the unit testing framework, ZenTest.
  • Ruby has a profiler (ruby -rprofile).
  • Ruby has a timer (Benchmark) that shows user, system, and wall clock timings.

Uses

  • System administration scripts (seems to be a viable replacement for Perl).

Customization

gem update --system
gem install rails
gem install relative
gem install erbextensions
gem install configtoolkit
gem install hoe
gem install rake
gem install rdoc
gem install mysql
gem install mysql-2.7
gem install sys-host
gem install sys-uptime
gem install sys-cpu
gem install sys-filesystem
gem install sys-admin
gem install sys-uname
gem install sys-proctable
gem install tlsmail
  • Apply this patch, this patch, and this patch to rdoc.
  • Set the @template variable to nil (rather than to html) in rdoctask.rb in the rdoc gem.
  • Install the hanna rdoc template gem from here. Symbolic link lib/rdoc/generator/html to lib/hanna in the hanna gem installation.
  • Build and install the designingpatterns_hacks gem.

Ruby Gems

To paraphrase www.rubygems.org, a gem is a packaged Ruby application or library. Gems are managed on your computer using the gem tool. This tool allows you to view/download gems from the RubyForge repository which is open source.

To generate rdocs for all locally installed gems, and serve them as web docs, run as root:

gem rdoc --all
gem server &

To view web documentation for all locally installed gems, point your browser to port 8808 on your local machine (e.g.):

durian.home:8808

To update RubyGems, run

gem update --system

To update a specific gem or all gems, run

gem update gem-name
gem update --all

To locate a specific gem, run

gem which gem-name

hoe does work on Windows systems but requires the HOME environment variable to be set (System/Advanced/Environment Variables).

Tools

  • Ruby comes with an interactive interpreter, 'irb' (similar to, but less powerful than, the Common LISP REPL).
  • There are online Ruby interpreters, such as this
  • Ruby has __END__, just like Perl.

Links

Research Points

  • What does Object::new (or BasicObject::new in Ruby 1.9) do?
  • The chicken + eggs problem implicit in all instances descending from Object but it being impossible to define a class (including Object?) without defining a Class instance.