Lua Strings
A string is a sequence of characters, which may include letters, digits, punctuation, and any other symbols.
In Lua, strings can be represented using three methods:
- Characters enclosed between single quotes.
- Characters enclosed between double quotes.
- Characters enclosed between [[ and ]].
Examples of these three methods are as follows:
Example
string1 = "Lua"
print("\"String 1 is\"", string1)
string2 = 'tutorialpro.org'
print("String 2 is", string2)
string3 = [["Lua Tutorial"]]
print("String 3 is", string3)
The output of the above code is:
"String 1 is" Lua
String 2 is tutorialpro.org
String 3 is "Lua Tutorial"
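The quoted forms and the [[ ]] form differ in how they treat backslashes and line breaks. The following is a minimal sketch of that difference; the variable names are illustrative only.

```lua
-- Quoted strings interpret escape sequences; [[ ]] strings do not.
local quoted = "a\tb"     -- 3 characters: a, tab, b
local raw = [[a\tb]]      -- 4 characters: a, backslash, t, b
print(string.len(quoted), string.len(raw))   --> 3   4

-- A [[ ]] string may span several lines. A line break that
-- immediately follows the opening [[ is not part of the string.
local block = [[
first line
second line]]
print(block)
```

The [[ ]] form is convenient for text that contains many quotes or backslashes, because nothing inside it needs escaping.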
Escape sequences represent characters that cannot be typed or displayed directly, such as backspace and carriage return. For example, to include a double quote inside a double-quoted string, write \".
The following table lists common escape sequences and their meanings:
Escape Character | Meaning | ASCII Value (Decimal) |
---|---|---|
\a | Bell (BEL) | 007 |
\b | Backspace (BS), moves the current position to the previous column | 008 |
\f | Form feed (FF), moves the current position to the start of the next page | 012 |
\n | New line (LF), moves the current position to the start of the next line | 010 |
\r | Carriage return (CR), moves the current position to the start of the current line | 013 |
\t | Horizontal tab (HT), jumps to the next tab stop | 009 |
\v | Vertical tab (VT) | 011 |
\\ | Represents a backslash character '\' | 092 |
\' | Represents a single quote (apostrophe) character | 039 |
\" | Represents a double quote character | 034 |
\0 | Null character (NULL) | 000 |
\ddd | The character whose code is the decimal number ddd (1 to 3 digits) | Decimal |
\xhh | The character whose code is the hexadecimal number hh (exactly two digits; Lua 5.2 and later) | Hexadecimal |
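A short sketch of several escape sequences in use, assuming Lua 5.2 or later (the \x form is not available in Lua 5.1). Note that Lua reads the digits of \ddd as a decimal number. The variable names are illustrative.

```lua
-- Decimal and hexadecimal escapes that both produce "A" (code 65).
local from_decimal = "\65"
local from_hex = "\x41"
print(from_decimal, from_hex)    --> A   A

-- Escaped backslashes and quotes inside a double-quoted string.
local path = "C:\\temp\\\"notes\".txt"
print(path)                      --> C:\temp\"notes".txt
print(string.len(path))          --> 19
```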
String Operations
Lua provides various methods to support string operations:
Number | Method & Purpose |
---|---|
1 | string.upper(argument): Converts the string to uppercase. |
2 | string.lower(argument): Converts the string to lowercase. |
3 | string.gsub(mainString, findString, replaceString, num): Replaces occurrences in the string. mainString is the string to operate on, findString is the substring to find, replaceString is the substring to replace with, and num is the number of replacements (optional, defaults to all). Example: string.gsub("aaaa", "a", "z", 3) outputs zzza 3 |
4 | string.find(str, substr, [init, [plain]]): Searches for substr within str. Returns the start and end indices if found, otherwise nil. init specifies the start position (default is 1). plain determines whether substr is treated as plain text (true) or as a Lua pattern (false, the default). Example: string.find("Hello Lua user", "Lua", 1) outputs 7 9 |
5 | string.reverse(arg): Reverses the string. Example: string.reverse("Lua") outputs auL |
6 | string.format(...): Returns a formatted string similar to printf. Example: string.format("the value is:%d", 4) outputs the value is:4 |
7 | string.char(arg) and string.byte(arg[, int]): char converts integers to characters and concatenates, byte converts characters to integer values (optional specific character, default is the first). Example: string.char(97, 98, 99, 100) outputs abcd and string.byte("ABCD", 4) outputs 68 |
8 | string.len(arg): Calculates the length of the string. Example: string.len("abc") outputs 3 |
9 | string.rep(string, n): Returns n copies of the string. Example: string.rep("abcd", 2) outputs abcdabcd |
10 | ..: Concatenates two strings. Example: print("www.tutorialpro." .. "org") outputs www.tutorialpro.org |
11 | string.gmatch(str, pattern): Returns an iterator function that, when called, returns the next substring in str that matches pattern. Example: for word in string.gmatch("Hello Lua user", "%a+") do print(word) end outputs Hello Lua user |
12 | string.match(str, pattern, init): Finds the first match in str. init specifies the start position (default is 1). Returns captures or the entire match string if successful, otherwise nil. Example: string.match("I have 2 questions for you.", "%d+ %a+") outputs 2 questions |
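Several of the functions in the table return more than one value or change behavior depending on optional arguments. The following sketch shows the second return value of gsub, the effect of the plain argument of find, and collecting the results of gmatch; the variable names are illustrative.

```lua
-- gsub returns the new string and the number of substitutions made.
local replaced, count = string.gsub("aaaa", "a", "z", 3)
print(replaced, count)                       --> zzza   3

-- Without plain, "." is a pattern item that matches any character,
-- so the first match is at position 1.
print(string.find("a.b", "."))               --> 1   1

-- With plain = true, "." is searched for literally.
print(string.find("a.b", ".", 1, true))      --> 2   2

-- gmatch yields one match per loop iteration.
local words = {}
for word in string.gmatch("Hello Lua user", "%a+") do
  words[#words + 1] = word
end
print(table.concat(words, ","))              --> Hello,Lua,user
```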
String Slicing
String slicing is done with the string.sub() function, whose prototype is:
string.sub(s, i [, j])
Parameters:
- s: The string to slice.
- i: The starting position.
- j: The ending position (optional). Defaults to -1, which refers to the last character.
Example
-- String
local sourcestr = "prefix--tutorialprogoogletaobao--suffix"
print("\nOriginal string", string.format("%q", sourcestr))
-- Slice part, 4th to 15th
local first_sub = string.sub(sourcestr, 4, 15)
print("\nFirst slice", string.format("%q", first_sub))
-- Take prefix, 1st to 8th
local second_sub = string.sub(sourcestr, 1, 8)
print("\nSecond slice", string.format("%q", second_sub))
-- Slice last 10 characters
local third_sub = string.sub(sourcestr, -10)
print("\nThird slice", string.format("%q", third_sub))
-- Index out of bounds, outputs original string
local fourth_sub = string.sub(sourcestr, -100)
print("\nFourth slice", string.format("%q", fourth_sub))
The output of the above code is:
Original string "prefix--tutorialprogoogletaobao--suffix"
First slice "fix--tutoria"
Second slice "prefix--"
Third slice "ao--suffix"
Fourth slice "prefix--tutorialprogoogletaobao--suffix"
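Negative positions can be used for both i and j, and positions outside the string are adjusted rather than treated as errors. A minimal sketch with an illustrative string:

```lua
local s = "prefix--body--suffix"

-- Negative positions count from the end: -1 is the last character.
print(string.sub(s, -6))         --> suffix
print(string.sub(s, 9, -9))      --> body

-- An end position past the end of the string is clamped.
print(string.sub(s, 15, 100))    --> suffix

-- If the start lies after the end, the result is the empty string.
print(string.sub(s, 10, 5) == "")   --> true
```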
String Case Conversion
The following example demonstrates how to convert string case:
Example
string1 = "Lua";
print(string.upper(string1))
print(string.lower(string1))
The output of the above code is:
LUA
lua
String Search and Reverse
The following example demonstrates how to search and reverse strings:
Example
-- Use a name other than "string" so the string library is not replaced
str = "Lua Tutorial"
-- Search for substring
print(string.find(str, "Tutorial"))
reversedString = string.reverse(str)
print("New string is", reversedString)
The output of the above code is:
5 12
New string is lairotuT auL
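The init argument of string.find makes it possible to continue a search after the previous match. The loop below is one possible sketch for collecting the start position of every occurrence; the sample text and variable names are illustrative.

```lua
local text = "Lua is small. Lua is fast. Lua is embeddable."
local positions = {}
local start = 1

while true do
  -- plain = true: search for the literal text "Lua"
  local first, last = string.find(text, "Lua", start, true)
  if not first then break end
  positions[#positions + 1] = first
  start = last + 1   -- resume just after the current match
end

print(table.concat(positions, " "))   --> 1 15 28
```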
String Formatting
Lua provides the string.format() function to generate formatted strings. The first argument is the format, followed by the data corresponding to each placeholder in the format.
Format strings may include the following escape codes:
- %c - Accepts a number and converts it to the corresponding character in the ASCII table.
- %d, %i - Accepts a number and converts it to a signed integer format.
- %o - Accepts a number and converts it to an octal number format.
- %u - Accepts a number and converts it to an unsigned integer format.
- %x - Accepts a number and converts it to a hexadecimal number format using lowercase letters.
- %X - Accepts a number and converts it to a hexadecimal number format using uppercase letters.
- %e - Accepts a number and converts it to scientific notation format using a lowercase 'e'.
- %E - Accepts a number and converts it to scientific notation format using an uppercase 'E'.
- %f - Accepts a number and converts it to a floating-point number format.
- %g (%G) - Accepts a number and converts it to the shorter format between %e (%E, corresponding to %G) and %f.
- %q - Accepts a string and converts it to a format safe for Lua compiler input.
- %s - Accepts a string and formats it according to the given parameters.
To further refine the format, parameters can be added after the % sign in the following order:
- (1) Sign: A '+' sign indicates that the following numeric escape code will display a positive sign for positive numbers. By default, only negative numbers show a sign.
- (2) Padding: A '0' indicates that padding will be used when a string width is specified. The default padding character is a space.
- (3) Alignment: By default, strings are right-aligned when a width is specified. Adding a '-' sign changes this to left-alignment.
- (4) Width: The width of the string.
- (5) Precision/String Truncation: A decimal number 'n' following the width, if followed by 'f' (floating-point escape code, such as %6.3f), specifies that the floating-point number should retain only 'n' decimal places. If followed by 's' (string escape code, such as %5.3s), it specifies that the string should display only the first 'n' characters.
Example
string1 = "Lua"
string2 = "Tutorial"
number1 = 10
number2 = 20
-- Basic string formatting
print(string.format("Basic formatting %s %s", string1, string2))
-- Date formatting
date = 2; month = 1; year = 2014
print(string.format("Date formatting: %02d/%02d/%04d", date, month, year))
-- Scientific notation
print(string.format("%e", 1234.5678))
-- Hexadecimal
print(string.format("%x", 255))
-- Fixed-point number
print(string.format("%06.2f", 1234.5678))
-- String truncation
print(string.format("%.5s", "Hello Lua"))
The output of the above code is:
Basic formatting Lua Tutorial
Date formatting: 02/01/2014
1.234568e+03
ff
1234.57
Hello
A precision can also be applied to the result of an expression:
Example
-- Decimal formatting
print(string.format("%.4f", 1/3))
The output of the above code is:
0.3333
Other examples:
Example
string.format("%c", 83) -- Outputs S
string.format("%+d", 17.0) -- Outputs +17
string.format("%05d", 17) -- Outputs 00017
string.format("%o", 17) -- Outputs 21
string.format("%u", 3) -- Outputs 3 (Lua 5.3 and later raise an error for a non-integer value such as 3.14)
string.format("%x", 13) -- Outputs d
string.format("%X", 13) -- Outputs D
string.format("%e", 1000) -- Outputs 1.000000e+03
string.format("%E", 1000) -- Outputs 1.000000E+03
string.format("%6.3f", 13) -- Outputs 13.000
string.format("%q", "One\nTwo") -- Outputs "One\
-- Two"
string.format("%s", "monkey") -- Outputs monkey
string.format("%10s", "monkey") -- Outputs "    monkey" (right-aligned, padded to width 10)
string.format("%5.3s", "monkey") -- Outputs "  mon" (first 3 characters, padded to width 5)
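The width, alignment, and precision options described above can be combined to line up columns of text. The following is a small sketch; the rows table and its contents are made up for illustration.

```lua
local rows = {
  { "apple", 3, 0.5 },
  { "kiwi", 12, 1.25 },
}

local lines = {}
for _, row in ipairs(rows) do
  -- %-8s  : string, left-aligned in a field of width 8
  -- %3d   : integer, right-aligned in a field of width 3
  -- %6.2f : float, width 6 with 2 decimal places
  lines[#lines + 1] = string.format("%-8s|%3d|%6.2f", row[1], row[2], row[3])
end

print(lines[1])   --> apple   |  3|  0.50
print(lines[2])   --> kiwi    | 12|  1.25
```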
Character and Integer Conversion
The following example demonstrates the conversion between characters and integers:
Example
-- Character conversion
-- Convert the first character
print(string.byte("Lua"))
-- Convert the third character
print(string.byte("Lua",3))
-- Convert the last character
print(string.byte("Lua",-1))
-- Convert the second character
print(string.byte("Lua",2))
-- Convert the second to last character
print(string.byte("Lua",-2))
-- Integer ASCII code to character
print(string.char(97))
The above code execution results are:
76
97
97
117
117
a
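string.byte also accepts an end position, in which case it returns one code per character, and string.char accepts several codes at once. A minimal round-trip sketch, with illustrative variable names:

```lua
local word = "Lua"

-- With a start and an end position, string.byte returns one value
-- for each character in that range.
local b1, b2, b3 = string.byte(word, 1, -1)
print(b1, b2, b3)              --> 76   117   97

-- string.char joins the characters for all the codes it receives.
local rebuilt = string.char(b1, b2, b3)
print(rebuilt)                 --> Lua
```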
Other Common Functions
The following example demonstrates other string operations, such as calculating string length, string concatenation, string duplication, etc.:
Example
string1 = "www."
string2 = "tutorialpro"
string3 = ".org"
-- Concatenate strings using ..
print("Concatenated string",string1..string2..string3)
-- String length
print("String length ",string.len(string2))
-- Duplicate string 2 times
repeatedString = string.rep(string2,2)
print(repeatedString)
The above code execution results are:
Concatenated string www.tutorialpro.org
String length 11
tutorialprotutorialpro
Pattern Matching
Lua patterns are written as ordinary strings. They are used by the pattern-matching functions string.find, string.gmatch, string.gsub, and string.match.
You can also use character classes in pattern strings.
A character class is a pattern item that can match any character within a specific set. For example, the character class %d matches any digit. Thus, you can use the pattern string %d%d/%d%d/%d%d%d%d to search for dates in the dd/mm/yyyy format:
Example
s = "Deadline is 30/05/1999, firm"
date = "%d%d/%d%d/%d%d%d%d"
print(string.sub(s, string.find(s, date))) --> 30/05/1999
The following list shows the character classes supported by Lua:
- Single character (except ^$()%.[]*+-?): Matches the character itself.
- . (dot): Matches any character.
- %a: Matches any letter.
- %c: Matches any control character (e.g., \n).
- %d: Matches any digit.
- %l: Matches any lowercase letter.
- %p: Matches any punctuation character.
- %s: Matches any whitespace character.
- %u: Matches any uppercase letter.
- %w: Matches any letter or digit.
- %x: Matches any hexadecimal digit.
- %z: Matches the character with code 0 (deprecated since Lua 5.2, where a pattern may contain '\0' directly).
- %x (where x is a non-letter, non-digit character): Matches the character x. Primarily used to escape the special characters (^$()%.[]*+-?) in patterns, e.g., %% matches %.
- [multiple character classes]: Matches any character within the [] set. For example, [%w_] matches any letter, digit, or underscore (_).
- [^multiple character classes]: Matches any character not included in the [] set. For example, [^%s] matches any non-whitespace character.
When the letter of a character class is written in uppercase, it represents the complement of the class. For example, %S matches any non-whitespace character, and %A matches any non-letter character:
> print(string.gsub("hello, up-down!", "%A", "."))
hello..up.down. 4
The number 4 is not part of the string result; it is the second result returned by gsub, representing the number of substitutions made.
In pattern matching, some special characters have special meanings. The special characters in Lua are:
( ) . % + - * ? [ ^ $
'%' is used as an escape character for special characters, so '%.' matches a dot and '%%' matches the character '%'. The escape character '%' can also be placed before any other non-alphanumeric character.
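Forgetting to escape a special character is a common source of surprising results. The sketch below contrasts an unescaped dot with an escaped one; the variable names are illustrative.

```lua
-- Unescaped: "." matches any character, so every character is replaced.
local all_replaced, all_count = string.gsub("3.14", ".", "x")
print(all_replaced, all_count)    --> xxxx   4

-- Escaped: "%." matches only a literal dot.
local dot_replaced, dot_count = string.gsub("3.14", "%.", ",")
print(dot_replaced, dot_count)    --> 3,14   1

-- "%%" matches a literal percent sign.
local percent_pos = string.find("100% done", "%%")
print(percent_pos)                --> 4
```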
Pattern items can be:
- A single character class, which matches any single character in that class;
- A single character class followed by '*', which matches zero or more characters of that class. This item always matches the longest possible sequence;
- A single character class followed by '+', which matches one or more characters of that class. This item always matches the longest possible sequence;
- A single character class followed by '-', which matches zero or more characters of that class. Unlike '*', this item always matches the shortest possible sequence;
- A single character class followed by '?', which matches zero or one character of that class. It always matches one if possible;
- %n, where n can be from 1 to 9; this item matches a substring equal to the nth captured substring (described later);
- %bxy, where x and y are two distinct characters; this item matches strings that start with x, end with y, and are balanced. This means that if you read the string from left to right, counting +1 for each x and -1 for each y, the ending y is the first one at which the count reaches 0. For example, the item %b() matches balanced parentheses;
- %f[set], a frontier pattern; this item matches an empty string at any position where the next character belongs to set and the previous character does not. The set is written as described earlier. The beginning and the end of the string are handled as if they were the character '\0'.
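The pattern items above can be compared side by side in a short sketch, assuming Lua 5.2 or later for %f. The sample strings and variable names are illustrative.

```lua
local s = "{a}{b}"

-- '*' takes the longest possible match, '-' the shortest.
local greedy = string.match(s, "{.*}")
local lazy = string.match(s, "{.-}")
print(greedy, lazy)                 --> {a}{b}   {a}

-- '?' makes the preceding character class optional.
local spelling = string.match("colour", "colou?r")
print(spelling)                     --> colour

-- %b() matches a balanced pair of parentheses, including nested ones.
local balanced = string.match("f(a(b)c) d", "%b()")
print(balanced)                     --> (a(b)c)

-- %f[%l] matches the position where a run of lowercase letters begins.
local first, last = string.find("THE (quick) fox", "%f[%l]%l+")
print(first, last)                  --> 6   10
```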
Pattern:
A pattern is a sequence of pattern items. Prepending '^' to the pattern anchors the match at the start of the string. Appending '$' to the pattern anchors the match at the end of the string. If '^' and '$' appear elsewhere, they have no special meaning and represent themselves.
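Anchoring both ends is a simple way to check that an entire string has a given shape. In the sketch below, is_integer is a hypothetical helper written for this illustration, not a library function.

```lua
-- Hypothetical helper: true when text is an optional minus sign
-- followed only by digits, from the first character to the last.
local function is_integer(text)
  return string.find(text, "^%-?%d+$") ~= nil
end

print(is_integer("42"), is_integer("-7"), is_integer("4x2"))   --> true   true   false

-- '^' alone restricts the match to the start of the string;
-- here it removes leading whitespace only.
local trimmed = string.gsub("   padded text", "^%s+", "")
print(trimmed)                                                 --> padded text
```

Without the anchors, the pattern %-?%d+ would also succeed on "4x2", because it only needs to find digits somewhere in the string.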
Captures:
A pattern can enclose a sub-pattern in parentheses; these sub-patterns are called captures. When a match is successful, the substrings of the string that matched the captures are saved for future use. Captures are numbered by their left parentheses. For example, for the pattern "(a*(.)%w(%s*))", the part of the string that matches "a*(.)%w(%s*)" is saved as the first capture (number 1); the character matched by "." is the second capture, and the part matched by "%s*" is the third capture.
As a special case, an empty capture () captures the current string position (a number). For example, if the pattern "()aa()" is applied to the string "flaaap", it will produce two captures: 3 and 5.
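Captures are returned by string.match as separate values. The sketch below reuses the date string from the earlier example, then shows position captures and a back-reference with %1; the variable names are illustrative.

```lua
-- Each parenthesized group becomes one return value.
local s = "Deadline is 30/05/1999, firm"
local day, month, year = string.match(s, "(%d%d)/(%d%d)/(%d%d%d%d)")
print(day, month, year)           --> 30   05   1999

-- Empty captures return positions (numbers), not substrings.
local first_pos, after_pos = string.match("flaaap", "()aa()")
print(first_pos, after_pos)       --> 3   5

-- %1 matches the same text as capture 1, so the closing quote
-- must be the same character as the opening quote.
local quote, body = string.match([[say "it's" now]], "([\"'])(.-)%1")
print(quote, body)                --> "   it's
```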