Learn Ruby and more: Structure of Ruby Programs


This article describes the basic structure of Ruby programs. It starts with the lexical structure, covering tokens and the characters that comprise them. Next, it covers the syntactic structure of a Ruby program, explaining how expressions, control structures, methods, classes, and so on are written as a series of tokens. It closes with the rules for file structure and a short account of how programs are executed.

1. Lexical Structure

The Ruby interpreter parses a program as a sequence of tokens. Tokens include comments, literals, punctuation, identifiers, and keywords. This section introduces these types of tokens and also includes important information about the characters that comprise the tokens and the whitespace that separates the tokens.

1.1. Comments

Comments in Ruby begin with a # character and continue to the end of the line. If a # character appears within a string or regular expression literal, then it is simply part of the string or regular expression and does not introduce a comment:

# This entire line is a comment
x = "#This is a string" # And this is a comment
y = /#This is a regular expression/ # Here's another comment

1.2. Literals

Literals are values that appear directly in Ruby source code. They include numbers, strings of text, and regular expressions. For example:

1          # An integer literal
1.0        # A floating-point literal
'one'      # A string literal
"two"      # Another string literal
/three/    # A regular expression literal
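As a small sketch of how these literals behave, the following asks each value for its class. The variable names are illustrative only, and the Integer result assumes Ruby 2.4 or later (older versions report Fixnum instead).

```ruby
# Each literal below evaluates directly to an object of a built-in class.
# The variable names are hypothetical and exist only for this sketch.
int_literal    = 1
float_literal  = 1.0
single_quoted  = 'one'
double_quoted  = "two"
regexp_literal = /three/

literal_classes = [int_literal, float_literal, single_quoted,
                   double_quoted, regexp_literal].map { |value| value.class }
puts literal_classes.inspect
```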

1.3. Punctuation

Ruby uses punctuation characters for a number of purposes. Most Ruby operators are written using punctuation characters, such as + for addition, * for multiplication, and || for the Boolean OR operation. Punctuation characters also serve to delimit string, regular expression, array, and hash literals, and to group and separate expressions, method arguments, and array indexes.
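The roles listed above can be sketched in a few lines. The variable names are made up for illustration; the operators and delimiters are standard Ruby.

```ruby
sum     = 2 + 3               # + is the addition operator
product = 2 * 3               # * is the multiplication operator
either  = false || true       # || is the Boolean OR operator

list    = [1, 2, 3]           # square brackets delimit an array literal
table   = { "a" => 1 }        # curly braces delimit a hash literal
grouped = (1 + 2) * 3         # parentheses group an expression
second  = list[1]             # square brackets also hold an array index
in_range = sum.between?(1, 10) # a comma separates method arguments
```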

1.4. Identifiers

An identifier is simply a name. Ruby uses identifiers to name variables, methods, classes, and so forth. Ruby identifiers consist of letters, numbers, and underscore characters, but they may not begin with a number. Identifiers may not include whitespace or nonprinting characters, and they may not include punctuation characters except as described here.

Identifiers that begin with a capital letter A–Z are constants, and the Ruby interpreter will issue a warning (but not an error) if you alter the value of such an identifier. Class and module names must begin with initial capital letters. The following are identifiers:

i
x2
old_value
_internal    # Identifiers may begin with underscores
PI           # Constant

By convention, multiword identifiers that are not constants are written with underscores like_this, whereas multiword constants are written LikeThis or LIKE_THIS.
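One way to see how Ruby classifies a name is the defined? keyword, which returns a short description of its operand. This is only a sketch: the local variable names are taken from the list above, and Math::PI and String are existing constants from Ruby's core library rather than anything defined in this article.

```ruby
# Local variable names: lowercase, with words joined by underscores.
old_value = 10
_internal = old_value + 1     # a leading underscore is allowed
x2        = _internal * 2     # digits are fine after the first character

# Names that begin with a capital letter are constants. Class names
# such as String are constants too.
kinds = [defined?(old_value), defined?(Math::PI), defined?(String)]
puts kinds.inspect
```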

Punctuation in identifiers
Punctuation characters may appear at the start and end of Ruby identifiers. They have the following meanings:

$    Global variables are prefixed with a dollar sign.
@    Instance variables are prefixed with a single at sign, and class variables are prefixed with two at signs.
?    As a helpful convention, methods that return Boolean values often have names that end with a question mark.
!    Method names may end with an exclamation point to indicate that they should be used cautiously.
=    Methods whose names end with an equals sign can be invoked by placing the method name, without the equals sign, on the left side of an assignment operator.

Here are some example identifiers that contain leading or trailing punctuation characters:

$files       # A global variable
@data        # An instance variable
@@counter    # A class variable
empty?       # A Boolean-valued method or predicate
sort!        # An in-place alternative to the regular sort method
timeout=     # A method invoked by assignment
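To show these prefixes and suffixes working together, here is a minimal sketch. TaskList, its methods, and the global $task_lists_created are hypothetical names invented for this example; they are not part of Ruby or of any library.

```ruby
# TaskList is a hypothetical class written only to illustrate the naming rules.
$task_lists_created = 0        # global variable, visible everywhere

class TaskList
  @@counter = 0                # class variable, shared by all instances

  attr_reader :timeout

  def initialize(*items)
    @data = items              # instance variable, one copy per object
    @timeout = nil
    @@counter += 1
    $task_lists_created += 1
  end

  def self.counter
    @@counter
  end

  def empty?                   # predicate: returns true or false
    @data.empty?
  end

  def sort!                    # modifies the receiver in place
    @data.sort!
    self
  end

  def timeout=(seconds)        # setter, invoked with assignment syntax
    @timeout = Integer(seconds)
  end

  def to_a
    @data.dup
  end
end

list = TaskList.new("write", "edit", "deploy")
list.sort!
list.timeout = "30"            # calls the timeout= method
```

Note that the assignment on the last line is an ordinary method call: the string "30" is passed to timeout=, which converts it to an integer before storing it.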

1.5. Keywords

The following keywords have special meaning in Ruby and are treated specially by the Ruby parser:

__LINE__        case        ensure    not       then
__ENCODING__    class       false     or        true
__FILE__        def         for       redo      undef
BEGIN           defined?    if        rescue    unless
END             do          in        retry     until
alias           else        module    return    when
and             elsif       next      self      while
begin           end         nil       super     yield
break
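A few of these keywords evaluate to values rather than introducing a statement. The sketch below shows some of them; no_such_variable is a hypothetical name that is deliberately never assigned.

```ruby
# __FILE__ and __LINE__ stand for the current file name and line number.
location = "#{__FILE__}:#{__LINE__}"

# defined? is a keyword, not a method: it does not evaluate its operand
# and returns nil (rather than raising) when the name is unknown.
missing = defined?(no_such_variable)
present = defined?(location)

# nil, true, and false are keywords that evaluate to objects.
keyword_values = [nil.class, true.class, false.class]
puts location
```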

2. Syntactic Structure

The basic unit of syntax in Ruby is the expression. The Ruby interpreter evaluates expressions, producing values. The simplest expressions are primary expressions, which represent values directly. Operators are used to perform computations on values, and compound expressions are built by combining simpler subexpressions with operators:

1            # A primary expression
x            # Another primary expression
x = 1        # An assignment expression
x = x + 1    # An expression with two operators

Expressions can be combined with Ruby’s keywords to create statements, such as the if statement for conditionally executing code and the while statement for repeatedly executing code:

if x < 10 then    # If this expression is true
  x = x + 1       # Then execute this statement
end               # Marks the end of the conditional
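The while statement mentioned above follows the same pattern. The sketch below also shows that an if statement produces a value of its own, namely the value of the branch that ran. The variable names are illustrative.

```ruby
x = 0
while x < 10       # Repeat as long as this expression is true
  x = x + 1
end

# An if statement is itself an expression, so its value can be assigned.
size = if x < 5 then "small" else "large" end
puts size
```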

2.1. Block Structure in Ruby

Ruby programs have a block structure. Module, class, and method definitions, and most of Ruby’s statements, include blocks of nested code. These blocks are delimited by keywords or punctuation and, by convention, are indented two spaces relative to the delimiters. There are two kinds of blocks in Ruby programs. One kind is formally called a “block.” These blocks are the chunks of code associated with or passed to iterator methods:

3.times { print "Ruby! " }

In this code, the curly braces and the code inside them are the block associated with the iterator method invocation 3.times. Formal blocks of this kind may be delimited with curly braces, or they may be delimited with the keywords do and end:

1.upto(10) do |x|
  print x
end

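The do and end delimiters are usually used when the block is written on more than one line.

The other kind of block is the body of a method, class, or module definition, or of a statement such as if or while. These bodies are delimited by keywords such as def, class, if, and end rather than by curly braces.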
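As a sketch of the two delimiter styles, the following passes the same block to map twice, once with curly braces and once with do and end. It also defines a hypothetical method, repeat_twice, whose body (between def and end) is a nested chunk of code that hands values to whatever block it is called with.

```ruby
# The same block written with each delimiter style.
squares_braces = [1, 2, 3].map { |n| n * n }

squares_do_end = [1, 2, 3].map do |n|
  n * n
end

# repeat_twice is a hypothetical method used only for illustration.
# The yield keyword runs the block associated with the method call.
def repeat_twice
  yield 1
  yield 2
end

seen = []
repeat_twice { |i| seen << i }
```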

3. File Structure

There are only a few rules about how a file of Ruby code must be structured. These rules are related to the deployment of Ruby programs and are not directly relevant to the language itself.

First, if a Ruby program contains a “shebang” comment, to tell the (Unix-like) operating system how to execute it, that comment must appear on the first line.

Second, if a Ruby program contains a “coding” comment (as described in §2.4.1), that comment must appear on the first line or on the second line if the first line is a shebang.

Third, if a file contains a line that consists of the single token __END__ with no whitespace before or after, then the Ruby interpreter stops processing the file at that point. The remainder of the file may contain arbitrary data that the program can read using the IO stream object DATA. (See Chapter 10 and §9.7 for more about this global constant.)

Ruby programs are not required to fit in a single file. Many programs load additional Ruby code from external libraries, for example. Programs use require to load code from another file. require searches for specified modules of code against a search path, and prevents any given module from being loaded more than once.

The following code illustrates each of these points of Ruby file structure. The text in the right-hand column is annotation and is not part of the file:

#!/usr/bin/ruby -w         shebang comment
# -*- coding: utf-8 -*-    coding comment
require 'socket'           load networking library
...                        program code goes here
__END__                    mark end of code
...                        program data goes here
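The load-once behavior of require can be observed through its return value, as in this sketch. It assumes the standard socket library is installed, which is the case for ordinary Ruby builds. The first call normally returns true, but that depends on whether something else has already loaded the library, so only the second result is certain.

```ruby
first_load  = require 'socket'   # normally true: the library was just loaded
second_load = require 'socket'   # false: already loaded, so nothing happens

# $LOAD_PATH is the array of directories that require searches.
search_path_is_array = $LOAD_PATH.is_a?(Array)

# After loading, the classes the library defines are available.
socket_available = defined?(TCPSocket)
puts second_load
```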

4. Program Execution

Ruby is a scripting language. This means that Ruby programs are simply lists, or scripts, of statements to be executed. By default, these statements are executed sequentially, in the order they appear. Ruby’s control structures alter this default execution order and allow statements to be executed conditionally or repeatedly, for example.

Programmers who are used to traditional static compiled languages like C or Java may find this slightly confusing. There is no special main method in Ruby from which execution begins. The Ruby interpreter is given a script of statements to execute, and it begins executing at the first line and continues to the last line.
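A short sketch makes the top-to-bottom order visible. The names trace and greet are hypothetical. Each top-level statement runs when the interpreter reaches it, and a def is itself one of those statements.

```ruby
trace = []
trace << "start"            # executed first

def greet(name)             # defines greet at the moment this line is reached
  "Hello, #{name}!"
end

trace << greet("Ruby")      # works because the def above has already run
trace << "finish"           # executed last
puts trace.inspect
```

Moving the call to greet above the def would raise a NoMethodError, since the method would not exist yet when the call is executed.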

The Ruby interpreter is invoked from the command line and given a script to execute. Very simple one-line scripts are sometimes written directly on the command line. More commonly, however, the name of the file containing the script is specified. The Ruby interpreter reads the file and executes the script.

Friday, July 25, 2014