
Ruby String

In Ruby, a String object stores and manipulates a sequence of zero or more bytes, usually interpreted as characters in a particular encoding.

Ruby string literals are most commonly written with single quotes (') or double quotes ("). The difference is that double-quoted strings support expression interpolation and many more escape sequences.

Single-Quoted Strings

The simplest form of a string is a single-quoted string, which is text enclosed in single quotes:

'This is a string in a Ruby program'

If you need to use a single quote within a single-quoted string, escape it with a backslash (\) so that the Ruby interpreter does not treat it as the end of the string:

'Won\'t you read O\'Reilly\'s book?'

The backslash can also escape another backslash, so that the second backslash itself is not interpreted as an escape character.
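
As a minimal sketch of the two escapes described above (the variable names are illustrative only), the following shows that \' and \\ are the only sequences a single-quoted string interprets, while something like \n is kept as two literal characters:

```ruby
# Inside single quotes only \' and \\ are special.
quoted  = 'Won\'t you read O\'Reilly\'s book?'
path    = 'C:\\temp'      # \\ becomes a single backslash
literal = 'line\nbreak'   # \n stays as a backslash followed by n

puts quoted
puts path
puts literal
puts literal.length       # counts the backslash and the n separately
```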

Here are some features related to strings in Ruby.

Double-Quoted Strings

In double-quoted strings, #{expression} (a hash sign followed by curly braces) is replaced with the value of the expression. This is called interpolation:

Embedding variables in a string:

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

name1 = "Joe"
name2 = "Mary"
puts "Hello #{name1}, where is #{name2}?"

The output of the above example is:

Hello Joe, where is Mary?

Performing mathematical operations within a string:

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

x, y, z = 12, 36, 72
puts "The value of x is #{x}"
puts "The value of x + y is #{x + y}"
puts "The average of x, y, and z is #{(x + y + z)/3}"

The output of the above example is:

The value of x is 12
The value of x + y is 48
The average of x, y, and z is 40
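
To make the contrast with single quotes concrete, here is a small sketch (names are illustrative) showing that interpolation happens only in double-quoted strings, and that integer division inside #{} truncates just as it does elsewhere in Ruby:

```ruby
name = "Joe"

double = "Hello #{name}"   # double quotes: the expression is evaluated
single = 'Hello #{name}'   # single quotes: the text is kept as written

puts double
puts single

# Any expression can appear inside #{}, including method calls.
shout = "Hello #{name.upcase}!"
puts shout

# Integer division truncates; use a Float operand to keep the fraction.
int_avg   = "#{(1 + 2) / 2}"
float_avg = "#{(1 + 2) / 2.0}"
puts int_avg
puts float_avg
```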

Ruby also supports string literals introduced by %q and %Q, where %q follows single-quote rules and %Q follows double-quote rules.

The character following q or Q is the opening delimiter. It can be any non-alphanumeric single-byte character, such as [, {, (, <, or !. The string is read until the matching closing delimiter is found: ], }, ), or > for the bracket characters, or the same character again for any other delimiter.

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

desc1 = %Q{Ruby strings can use '' and ""}
desc2 = %q|Ruby strings can use '' and ""|

puts desc1
puts desc2

The output of the above example is:

Ruby strings can use '' and ""
Ruby strings can use '' and ""
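
The example above prints the same text for both forms because it contains no interpolation. The following sketch (variable names are illustrative) shows where %q and %Q actually differ, and that balanced bracket delimiters may be nested inside the literal:

```ruby
lang = "Ruby"

raw      = %q(#{lang} uses 'single' and "double" quotes)   # no interpolation
expanded = %Q(#{lang} uses 'single' and "double" quotes)   # interpolation applied
nested   = %q(balanced (nested) parentheses are fine)

puts raw
puts expanded
puts nested
```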

Escape Characters

The table below lists the backslash escape sequences for special and non-printable characters.

Note: Escape sequences are interpreted inside double-quoted strings. Inside single-quoted strings only \' and \\ are recognized; every other backslash sequence is output as written.

| Backslash Symbol | Hexadecimal Character | Description |
|---|---|---|
| \a | 0x07 | Bell |
| \b | 0x08 | Backspace |
| \cx | | Control-x |
| \C-x | | Control-x |
| \e | 0x1b | Escape |
| \f | 0x0c | Form feed |
| \M-\C-x | | Meta-Control-x |
| \n | 0x0a | Newline |
| \nnn | | Octal notation, where n ranges from 0-7 |
| \r | 0x0d | Carriage return |
| \s | 0x20 | Space |
| \t | 0x09 | Tab |
| \v | 0x0b | Vertical tab |
| \x | | Character x |
| \xnn | | Hexadecimal notation, where n ranges from 0-9, a-f, or A-F |
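
A short sketch of a few entries from the table (variable names are illustrative): the same characters typed between double and single quotes produce strings of different lengths, and the hexadecimal and octal notations yield ordinary characters.

```ruby
double = "a\tb\n"            # tab and newline are interpreted: 4 characters
single = 'a\tb\n'            # backslashes are kept: 6 characters

puts double.length
puts single.length

hex_and_octal = "\x41\101"   # 0x41 and octal 101 are both the letter A
bell_code     = "\a".ord     # numeric code of the bell character
space         = "\s"

puts hex_and_octal
puts bell_code
puts space == " "
```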

Character Encoding

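In Ruby 1.8, the default character set is ASCII, in which each character is represented by a single byte. In UTF-8 and other modern character sets, a character may be represented by one to four bytes.

In Ruby 1.8, you can change the character set at the beginning of the program using $KCODE, as shown below. Since Ruby 1.9, $KCODE no longer has any effect: every string carries its own encoding, the source encoding is declared with a magic comment such as # -*- coding: UTF-8 -*-, and since Ruby 2.0 source files default to UTF-8.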

$KCODE = 'u'

Here are the possible values for $KCODE.

| Encoding | Description |
|---|---|
| a | ASCII (same as none). This is the default. |
| e | EUC. |
| n | None (same as ASCII). |
| u | UTF-8. |
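
For Ruby 1.9 and later, where each string carries its own encoding, a rough sketch of inspecting and converting encodings might look like the following. The sample text and variable names are assumptions for illustration; \u escapes are used so that the result does not depend on the source file's encoding.

```ruby
text = "r\u00E9sum\u00E9"     # "résumé", written with \u escapes

puts text.encoding            # a literal containing \u escapes is UTF-8
puts text.length              # number of characters
puts text.bytesize            # number of bytes (each é takes two)

# Convert to US-ASCII, replacing characters that have no ASCII equivalent.
ascii = text.encode("US-ASCII", undef: :replace, replace: "?")
puts ascii

# Reinterpret a copy of the same bytes as raw binary data.
binary = text.dup.force_encoding(Encoding::BINARY)
puts binary.length            # now counts bytes, not characters
```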

Built-in String Methods

We need an instance of a String object to call String methods. Here is how to create an instance of a String object:

String.new(str = "")

This returns a new String object containing a copy of str. Using the new object, we can call any available instance method. For example:

Example

#!/usr/bin/ruby

my_str = String.new("THIS IS TEST")
foo = my_str.downcase   # downcase returns a new string; my_str is unchanged

puts foo

This will produce the following result:

this is test
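
To illustrate that String.new produces an independent copy, here is a small sketch (variable names are illustrative): changing the copy leaves the original untouched, and non-destructive methods such as downcase likewise leave their receiver unchanged.

```ruby
original = "THIS IS TEST"
copy     = String.new(original)

copy << "!"                      # modifies only the copy

puts original
puts copy
puts copy.equal?(original)       # false: two distinct objects

lower = original.downcase        # returns a new string
puts lower
puts original                    # still uppercase
```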

Below are some public string methods (assuming str is a String object):

Serial Number Method & Description
1 str % arg <br>Formats arg using str as the format specification. If the format contains more than one substitution, arg must be an array. For more information on format specifications, see Kernel#sprintf.
2 str * integer <br>Returns a new string containing integer copies of str. In other words, str is repeated integer times.
3 str + other_str <br>Returns a new string containing other_str concatenated to str.
4 str << obj <br>Appends obj to str, modifying str. If obj is an Integer, it is treated as a character code and converted to a character before being appended (Ruby 1.8 accepted only a Fixnum in the range 0..255). Compare with concat.
5 str <=> other_str <br>Compares str with other_str, returning -1 (less than), 0 (equal), or 1 (greater than). The comparison is case-sensitive.
6 str == obj <br>Checks the equality of str and obj. Returns false if obj is not a string, true if str <=> obj returns 0.
7 str =~ obj <br>Matches str against the regular expression pattern obj. Returns the starting position of the match, or nil if there is no match.
8 str[position] <br>str[start, length] <br>str[start..end] <br>str[start...end] <br>Extracts a substring by index. Note that in Ruby 1.8 str[position] returns the character code, not the character; since Ruby 1.9 it returns a one-character string.
9 str.capitalize <br>Converts the first letter of the string to uppercase and the rest to lowercase.
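10 str.capitalize! <br>Same as capitalize, but modifies str in place; returns nil if no changes are made.
11 str.casecmp(other_str) <br>Case-insensitive version of <=>; returns -1, 0, or 1.
12 str.center(width, padstr=' ') <br>Returns a new string of length width with str centered and padded with padstr; returns str if width is not greater than str.length.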
13 str.chomp <br>Removes the record separator ($/), usually \n, from the end of the string. Does nothing if no record separator is present.
14 str.chomp! <br>Same as chomp, but modifies str in place; returns nil if no changes are made.
15 str.chop <br>Removes the last character in str.
16 str.chop! <br>Same as chop, but modifies str in place; returns nil if str is empty.
17 str.concat(other_str) <br>Appends other_str to str, modifying str.
18 str.count(other_str, ...) <br>Counts the characters of str that belong to one or more character sets. If multiple character sets are given, it counts the intersection of these sets.
19 str.crypt(other_str) <br>Applies a one-way cryptographic hash to str. The parameter is a two-character salt string, with each character in the range a-z, A-Z, 0-9, ., or /.
20 str.delete(other_str, ...) <br>Returns a copy of str with all characters in the intersection of its arguments deleted.
21 str.delete!(other_str, ...) <br>Same as delete, but modifies str in place; returns nil if no changes are made.
22 str.downcase <br>Returns a copy of str with all uppercase letters replaced with lowercase letters.
23 str.downcase! <br>Same as downcase, but modifies str in place; returns nil if no changes are made.
24 str.dump <br>Returns a version of str with all non-printable characters replaced by \nnn notation, and all special characters escaped.
25 str.each(separator=$/) { |substr| block } <br>Splits str using the argument as the record separator (default is $/), and passes each substring to the supplied block. Ruby 1.8 only; removed in Ruby 1.9 in favor of each_line.
26 str.each_byte { |byte| block } <br>Passes each byte in str to the block as an integer.
27 str.each_line(separator=$/) { |substr| block } <br>Splits str using the argument as the record separator (default is $/), and passes each substring to the supplied block.
28 str.empty? <br>Returns true if str is empty (i.e., has a length of 0).
29 str.eql?(other) <br>Returns true if the two strings have the same length and content.
30 str.gsub(pattern, replacement) [or] <br>str.gsub(pattern) { |match| block } <br>Returns a copy of str with all occurrences of pattern replaced by replacement or the value of the block. The pattern is typically a Regexp; if it is a String, no regular expression metacharacters are interpreted (i.e., /\d/ will match a digit, but '\d' will match a backslash followed by a 'd').
31 str[fixnum] [or] str[fixnum,fixnum] [or] str[range] [or] str[regexp] [or] str[regexp, fixnum] [or] str[other_str] <br>Element reference. With one fixnum, returns the character at that index (the character code in Ruby 1.8). With two fixnums, returns the substring starting at the offset (first fixnum) with the given length (second fixnum). With a range, returns the substring within that range. With a regexp, returns the matching part of the string; with a regexp and a fixnum, returns the capture group numbered fixnum from the match. With other_str, returns other_str if it occurs in str. A negative index counts from the end of the string, with -1 being the last character. Returns nil if nothing matches or the index is out of range.
32 str[fixnum] = fixnum (Ruby 1.8 only) [or] str[fixnum] = new_str [or] str[fixnum, fixnum] = new_str [or] str[range] = new_str [or]<br>str[regexp] = new_str [or] str[regexp, fixnum] = new_str [or] str[other_str] = new_str <br>Element assignment. Replaces the part of str selected by the index expression (see str[fixnum], etc.) with new_str, modifying str.
33 str.gsub!(pattern, replacement) [or] str.gsub!(pattern) { |match| block } <br>Performs the substitutions of String#gsub in place, returning str, or nil if no substitutions were performed.
34 str.hash <br>Returns a hash based on the string's length and content.
35 str.hex <br>Treats leading characters from str as a string of hexadecimal digits (with an optional sign and an optional 0x) and returns the corresponding number. Returns zero if the conversion fails.
36 str.include? other_str [or] str.include? fixnum <br>Returns true if str contains the given string or character. The fixnum (character code) form is accepted only in Ruby 1.8.
37 str.index(substring [, offset]) [or]<br>str.index(fixnum [, offset]) [or]<br>str.index(regexp [, offset]) <br>Returns the index of the first occurrence of the given substring, character (fixnum), or pattern (regexp) in str. Returns nil if not found. If the second argument is given, it specifies the position in the string to begin the search.
38 str.insert(index, other_str) <br>Inserts other_str before the character at the given index, modifying str. A negative index counts from the end of the string and inserts after the given character. The intent is that other_str starts at the given index after the insertion.
39 str.inspect <br>Returns a printable version of str, with special characters escaped.
40 str.intern [or] str.to_sym <br>Returns the Symbol corresponding to str, creating the symbol if it did not previously exist.
41 str.length <br>Returns the length of str in characters. size is an alias.
42 str.ljust(integer, padstr=' ') <br>Returns a new string of length integer with str left justified and padded with padstr; returns str if integer is less than str.length.
43 str.lstrip <br>Returns a copy of str with leading whitespace removed.
44 str.lstrip! <br>Removes leading whitespace from str, returning nil if no change was made.
45 str.match(pattern) <br>Converts pattern to a Regexp (if it isn't already one) and then invokes its match method on str.
46 str.oct <br>Treats leading characters of str as a string of octal digits (with an optional sign) and returns the corresponding number. Returns 0 if the conversion fails.
47 str.replace(other_str) <br>Replaces the contents of str with the corresponding values in other_str.
48 str.reverse <br>Returns a new string with the characters from str in reverse order.
49 str.reverse! <br>Reverses str in place and returns it.
50 str.rindex(substring [, fixnum]) [or]<br>str.rindex(fixnum [, fixnum]) [or]<br>str.rindex(regexp [, fixnum]) <br>Returns the index of the last occurrence of the given substring, character (fixnum), or pattern (regexp) in str. Returns nil if not found. If the second argument is given, it specifies the position in the string to end the search. Characters beyond this point are not considered.
51 str.rjust(integer, padstr=' ') <br>Returns a new string of length integer with str right justified and padded with padstr; returns str if integer is less than str.length.
52 str.rstrip <br>Returns a copy of str with trailing whitespace removed.
53 str.rstrip! <br>Removes trailing whitespace from str, returning nil if no change was made.
54 str.scan(pattern) [or]<br>str.scan(pattern) { |match, ...| block } <br>Both forms iterate through str, matching the pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block. If the pattern contains no groups, each individual result consists of the matched string, $&. If the pattern contains groups, each individual result is an array of groups.
55 str.slice(fixnum) [or] str.slice(fixnum, fixnum) [or]<br>str.slice(range) [or] str.slice(regexp) [or]<br>str.slice(regexp, fixnum) [or] str.slice(other_str)<br>See str[fixnum], etc.<br>str.slice!(fixnum) [or] str.slice!(fixnum, fixnum) [or]<br>str.slice!(range) [or] str.slice!(regexp) [or]<br>str.slice!(other_str) <br>Deletes the specified portion from str, and returns the portion deleted. In Ruby 1.8, the forms with a fixnum parameter raise an IndexError if the value is out of range, the forms with a range parameter raise a RangeError, and the forms with a Regexp or String silently ignore a failed match. In current Ruby versions, slice! returns nil when nothing matches.
56 str.split(pattern=$;, [limit]) <br>Divides str into substrings based on a delimiter, returning an array of these substrings. If pattern is a String, it is used as the delimiter. If pattern is a single space, str is split on whitespace, removing any leading and consecutive whitespace characters. If pattern is a Regexp, str is split where the pattern matches. When pattern matches a zero-length string, str is split into individual characters. If the pattern parameter is omitted, the value of $; is used. If $; is nil (default), str is split on whitespace as if ' ' (a single space) were specified as the delimiter. If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that many fields will be returned (if limit is 1, the entire string is returned as the only entry in an array). If limit is a negative number, there is no limit to the number of fields returned, and trailing null fields are not suppressed.
57 str.squeeze([other_str]*) <br>Builds a set of characters from the other_str parameter(s) using the procedure described for String#count. Returns a new string where runs of the same character that occur in this set are replaced by a single character. If no arguments are given, all runs of identical characters are replaced by a single character.
58 str.squeeze!([other_str]*) <br>Equivalent to squeeze, but modifies str in place and returns nil if no changes were made.
59 str.strip <br>Returns a copy of str with leading and trailing whitespace removed.
60 str.strip! <br>Removes leading and trailing whitespace from str, returning nil if no change was made.
61 str.sub(pattern, replacement) [or]<br>str.sub(pattern) { |match| block } <br>Returns a copy of str with the first occurrence of pattern replaced with either replacement or the value of the block. The pattern is typically a Regexp; if given as a String, no regular expression metacharacters are interpreted.
62 str.sub!(pattern, replacement) [or]<br>str.sub!(pattern) { |match| block } <br>Performs the substitutions of String#sub in place and returns str, or nil if no substitutions were performed.
63 str.succ [or] str.next <br>Returns the successor to str.
64 str.succ! [or] str.next! <br>Equivalent to String#succ, but modifies str in place.
65 str.sum(n=16) <br>Returns a basic n-bit checksum of the characters in str, where n is the optional Fixnum parameter, defaulting to 16. The result is simply the sum of the binary value of each character in str, modulo 2**n - 1. This is not a particularly good checksum.
66 str.swapcase <br>Returns a copy of str with uppercase letters converted to lowercase and vice versa.
67 str.swapcase! <br>Equivalent to String#swapcase, but modifies str in place and returns nil if no changes were made.
68 str.to_f <br>Returns the result of interpreting leading characters in str as a floating point number. Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0.0 is returned. This method never raises an exception.
69 str.to_i(base=10) <br>Returns the result of interpreting leading characters in str as an integer in the given base (2 to 36). Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0 is returned. This method never raises an exception for a valid base.
70 str.to_s [or] str.to_str <br>Returns the receiver.
71 str.tr(from_str, to_str) <br>Returns a copy of str with the characters in from_str replaced by the corresponding characters in to_str. If to_str is shorter than from_str, it is padded with its last character. Both strings may use the c1-c2 notation to denote ranges of characters. If from_str starts with a ^, it denotes all characters except those listed.
72 str.tr!(from_str, to_str) <br>Equivalent to String#tr, but modifies str in place and returns nil if no changes were made.
73 str.tr_s(from_str, to_str) <br>Processes str as for String#tr, then removes duplicate characters in regions that were affected by the translation.
74 str.tr_s!(from_str, to_str) <br>Equivalent to String#tr_s, but modifies str and returns it, or returns nil if no changes were made.
75 str.unpack(format) <br>Decodes str (which may contain binary data) according to the format string, returning an array of the extracted values. The format string consists of a series of single-character directives. Each directive may be followed by a number indicating the repetition count. An asterisk (*) will use all remaining elements. The directives sSiIlL may each be followed by an underscore (_) to use the underlying platform's native size for the specified type, otherwise a platform-independent consistent size is used. Spaces in the format string are ignored.
76 str.upcase <br>Returns a copy of str with all lowercase letters replaced with their uppercase counterparts. Before Ruby 2.4 the operation is locale-insensitive and affects only the characters a to z; Ruby 2.4 and later apply Unicode case mapping by default.
77 str.upcase! <br>Changes the contents of str to uppercase, returning nil if no changes were made.
78 str.upto(other_str) { |s| block } <br>Iterates through successive values, starting at str and ending at other_str (inclusive), passing each value to the block. The String#succ method is used to generate each value.
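
The following sketch exercises a handful of the methods from the table on a sample string (the string and variable names are chosen purely for illustration). It highlights two patterns that are easy to get wrong: non-bang methods return a new object, while bang methods modify the receiver and return nil when nothing changed.

```ruby
str = "hello world"

# Non-bang methods return a new object and leave str untouched.
capitalized = str.capitalize                  # "Hello world"
centered    = str.center(15, "*")             # "**hello world**"
first_o     = str.index("o")                  # 4
last_o      = str.rindex("o")                 # 7
one_sub     = str.sub("o", "0")               # "hell0 world"
all_subs    = str.gsub(/o/) { |m| m.upcase }  # "hellO wOrld"
pairs       = str.scan(/o./)                  # ["o ", "or"]
translated  = str.tr("lo", "01")              # "he001 w1r0d"

# split drops trailing empty fields unless limit is negative.
trimmed = "a,b,,".split(",")                  # ["a", "b"]
full    = "a,b,,".split(",", -1)              # ["a", "b", "", ""]

# =~ returns the match position, or nil when there is no match.
hit  = str =~ /wor/                           # 6
miss = str =~ /xyz/                           # nil

# Bang methods modify the receiver and return nil if nothing changed.
s = String.new("hello")
first_call  = s.upcase!                       # "HELLO"
second_call = s.upcase!                       # nil

p capitalized, centered, first_o, last_o, one_sub, all_subs
p pairs, translated, trimmed, full, hit, miss, first_call, second_call
```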

String unpack Directives

The table below lists the unpack directives for the String#unpack method.

| Directive | Returns | Description |
|---|---|---|
| A | String | Removes trailing nulls and spaces. |
| a | String | String. |
| B | String | Extracts bits from each character (highest to lowest). |
| b | String | Extracts bits from each character (lowest to highest). |
| C | Integer | Extracts a character as an unsigned integer. |
| c | Integer | Extracts a character as a signed integer. |
| D, d | Float | Treats sizeof(double) characters as a native double. |
| E | Float | Treats sizeof(double) characters as a double in little-endian byte order. |
| e | Float | Treats sizeof(float) characters as a float in little-endian byte order. |
| F, f | Float | Treats sizeof(float) characters as a native float. |
| G | Float | Treats sizeof(double) characters as a double in network byte order. |
| g | Float | Treats sizeof(float) characters as a float in network byte order. |
| H | String | Extracts hexadecimal digits from each character (highest to lowest). |
| h | String | Extracts hexadecimal digits from each character (lowest to highest). |
| I | Integer | Treats sizeof(int) (modified by _) consecutive characters as an unsigned native integer. |
| i | Integer | Treats sizeof(int) (modified by _) consecutive characters as a signed native integer. |
| L | Integer | Treats four (modified by _) consecutive characters as an unsigned native long integer. |
| l | Integer | Treats four (modified by _) consecutive characters as a signed native long integer. |
| M | String | Quoted printable. |
| m | String | Base64 encoded. |
| N | Integer | Treats four characters as an unsigned long in network byte order. |
| n | Integer | Treats two characters as an unsigned short in network byte order. |
| P | String | Treats sizeof(char *) characters as a pointer, returning len characters from the referenced location. |
| p | String | Treats sizeof(char *) characters as a pointer to a null-terminated string. |
| Q | Integer | Treats eight characters as an unsigned quad word (64-bit). |
| q | Integer | Treats eight characters as a signed quad word (64-bit). |
| S | Integer | Treats two (modified by _) consecutive characters as an unsigned short in native byte order. |
| s | Integer | Treats two (modified by _) consecutive characters as a signed short in native byte order. |
| U | Integer | UTF-8 character as an unsigned integer. |
| u | String | UU encoded. |
| V | Integer | Treats four characters as an unsigned long in little-endian byte order. |
| v | Integer | Treats two characters as an unsigned short in little-endian byte order. |
| w | Integer | BER compressed integer. |
| X | | Skips back one character. |
| x | | Skips forward one character. |
| Z | String | With *, removes trailing nulls up to the first null. |
| @ | | Skips to the offset given by the length parameter. |

Examples

Try the following examples to unpack various data.

"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]
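
As a complement to the unpack examples, the sketch below pairs String#unpack with its counterpart Array#pack, which is not covered in this tutorial but accepts the same directives. The values and variable names are illustrative, and network byte order is used so that the results do not depend on the platform.

```ruby
# Pack one unsigned byte (C) and one 16-bit unsigned short in
# network (big-endian) byte order (n) into a binary string.
packed = [1, 258].pack("Cn")
byte_values = packed.bytes          # 258 is 0x0102, so the bytes are 1, 1, 2
restored    = packed.unpack("Cn")   # back to the original numbers

# C* reads every remaining byte as an unsigned integer.
codes = "Ruby".unpack("C*")

# a2a2 reads two 2-byte strings.
halves = "Ruby".unpack("a2a2")

# Base64 round trip using the m directive.
encoded = ["hello"].pack("m")
decoded = encoded.unpack("m")

p byte_values, restored, codes, halves, decoded
```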