Python3 Strings
Strings are one of the most commonly used data types in Python. We can create a string by enclosing text in single quotes (') or double quotes (").
Creating a string is simple: just assign a value to a variable. For example:
var1 = 'Hello World!'
var2 = "tutorialpro"
Accessing Values in Strings in Python
Python has no separate character type; a single character is simply a string of length 1.
To access substrings, use square brackets [] to slice the string. The syntax for slicing is as follows:
variable[start_index:end_index]
Index values start at 0 at the beginning of the string. Negative indexes count from the end, so -1 refers to the last character. A slice includes start_index but excludes end_index.
Here is an example:
Example (Python 3.0+)
#!/usr/bin/python3
var1 = 'Hello World!'
var2 = "tutorialpro"
print("var1[0]: ", var1[0])
print("var2[1:5]: ", var2[1:5])
Output of the above example:
var1[0]: H
var2[1:5]: utor
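The indexing rules above can be sketched a little further. This is a minimal illustration, not part of the original example; the variable names are chosen here only for demonstration.

```python
# Slicing sketch: indexes start at 0, negative indexes count from the end.
s = "tutorialpro"

first = s[0]          # first character
last = s[-1]          # last character
middle = s[1:5]       # characters at indexes 1, 2, 3, 4 (5 is excluded)
head = s[:8]          # omitted start defaults to 0
tail = s[-3:]         # omitted end defaults to len(s)
every_other = s[::2]  # a third value sets the step
backwards = s[::-1]   # a step of -1 walks the string in reverse

print(first, last, middle, head, tail, every_other, backwards)
```

Note that a slice never raises an error for out-of-range bounds (s[5:100] simply stops at the end), whereas indexing a single position past the end, such as s[100], raises IndexError.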
Updating Strings in Python
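Strings are immutable, so an existing string cannot be changed in place. Instead, you can slice part of the string and concatenate it with other strings to build a new one. Here is an example: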
Example (Python 3.0+)
#!/usr/bin/python3
var1 = 'Hello World!'
print("Updated String : ", var1[:6] + 'tutorialpro!')
Output of the above example:
Updated String : Hello tutorialpro!
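Because strings are immutable, "updating" always means building a new string. The following sketch shows what happens when you try to assign to an index, and the slice-and-concatenate alternative; the names mutated and jello are illustrative only.

```python
var1 = 'Hello World!'

# Assigning to a position in a string is not allowed and raises TypeError.
try:
    var1[0] = 'J'
    mutated = True
except TypeError:
    mutated = False

# Build new strings from slices of the old one instead.
updated = var1[:6] + 'tutorialpro!'
jello = 'J' + var1[1:]

print(mutated)   # False
print(updated)   # Hello tutorialpro!
print(jello)     # Jello World!
print(var1)      # Hello World! (the original is unchanged)
```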
Escape Characters in Python
When special characters are needed in strings, Python uses a backslash \ as an escape character. Here is a table:
Escape Character | Description | Example |
---|---|---|
\ (at end of line) | Line continuation | >>> print("line1 \<br>... line2 \<br>... line3")<br>line1 line2 line3<br>>>> |
\\ | Backslash | >>> print("\\")<br>\ |
\' | Single quote | >>> print('\'')<br>' |
\" | Double quote | >>> print("\"")<br>" |
\a | Bell | >>> print("\a") (executes with a sound) |
\b | Backspace | >>> print("Hello \b World!")<br>Hello World! |
\000 | Null | >>> print("\000")<br>>>> |
\n | Newline | >>> print("\n")<br>>>> |
\v | Vertical tab | >>> print("Hello \v World!")<br>Hello <br> World!<br>>>> |
\t | Horizontal tab | >>> print("Hello \t World!")<br>Hello World!<br>>>> |
\r | Carriage return, moves the content after \r to the beginning of the string, replacing the beginning characters until the content after \r is fully replaced. | >>> print("Hello\rWorld!")<br>World!<br>>>> print('google tutorialpro taobao\r123456')<br>123456 tutorialpro taobao |
\f | Form feed | >>> print("Hello \f World!")<br>Hello <br> World!<br>>>> |
\yyy | Octal value, where each y is an octal digit (0-7), e.g., \012 represents a newline. | >>> print("\110\145\154\154\157\40\127\157\162\154\144\41")<br>Hello World! |
\xyy | Hexadecimal value, where \x is followed by two hexadecimal digits, e.g., \x0a represents a newline | >>> print("\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21")<br>Hello World! |
\other | Unrecognized escape sequences are left in the string unchanged, backslash included | |
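A quick way to check what an escape sequence actually produces is to compare strings and look at their lengths or repr(). The sketch below uses only escapes from the table; the variable names are illustrative.

```python
# Each escape sequence stands for a single character in the resulting string.
backslash = "\\"
tabbed = "a\tb"
print(len(backslash))  # 1
print(len(tabbed))     # 3

# Octal and hexadecimal escapes are just other spellings of the same characters.
newline_is_same = "\n" == "\012" == "\x0a"
hello_octal = "\110\145\154\154\157"
hello_hex = "\x48\x65\x6c\x6c\x6f"
print(newline_is_same, hello_octal, hello_hex)

# repr() shows the escapes instead of interpreting them, which helps debugging.
print(repr("line1\nline2"))
```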
String Operators in Python
The following table uses variable a with the value "Hello" and variable b with the value "Python":
Operator | Description | Example |
---|---|---|
+ | String concatenation | a + b outputs: HelloPython |
* | Repeat output string | a*2 result: HelloHello |
[] | Get character by index | a[1] result: e |
[ : ] | Slice a part of the string, follows left-closed, right-open principle, str[0:2] does not include the 3rd character. | a[1:4] result: ell |
in | Membership operator - Returns True if the string contains the given character | 'H' in a result: True |
not in | Membership operator - Returns True if the string does not contain the given character | 'M' not in a result: True |
r/R | Raw string - All characters in the string are used as-is, without escaping special or non-printable characters. Raw strings are identical to regular strings except they have an 'r' (can be uppercase or lowercase) before the first quote. | print( r'\n' )<br>print( R'\n' ) |
% | Format string | See the next section. |
Example (Python 3.0+)
#!/usr/bin/python3
a = "Hello"
b = "Python"
print("a + b result: ", a + b)
print("a * 2 result: ", a * 2)
print("a[1] result: ", a[1])
print("a[1:4] result: ", a[1:4])
if "H" in a:
    print("H is in variable a")
else:
    print("H is not in variable a")
if "M" not in a:
    print("M is not in variable a")
else:
    print("M is in variable a")
print (r'\n')
print (R'\n')
The above example outputs:
a + b result: HelloPython
a * 2 result: HelloHello
a[1] result: e
a[1:4] result: ell
H is in variable a
M is not in variable a
\n
\n
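Raw strings are most useful when the text itself contains many backslashes. The sketch below compares a raw string with its escaped equivalent; the Windows-style path is a made-up example, not a real file.

```python
escaped = '\n'   # one character: a newline
raw = r'\n'      # two characters: a backslash and the letter n
print(len(escaped), len(raw))  # 1 2

# A hypothetical path: without the r prefix, \n and \t would become
# a newline and a tab.
windows_path = r'C:\new\table.txt'
same_path = 'C:\\new\\table.txt'
print(windows_path == same_path)  # True
```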
Python String Formatting
Python supports formatted string output. Although this may involve very complex expressions, the basic usage is to insert a value into a string with a string format specifier %s.
In Python, the % formatting operator uses a syntax modeled on the sprintf function in C.
Example (Python 3.0+)
#!/usr/bin/python3
print ("My name is %s and I'm %d years old!" % ('Xiao Ming', 10))
The above example outputs:
My name is Xiao Ming and I'm 10 years old!
Python string formatting symbols:
Symbol | Description |
---|---|
%c | Format a single character (accepts a one-character string or an integer code point) |
%s | Format string |
%d | Format integer |
%u | Format integer (obsolete in Python, identical to %d) |
%o | Format octal number |
%x | Format hexadecimal number (lowercase) |
%X | Format hexadecimal number (uppercase) |
%f | Format floating-point number, can specify precision after the decimal point |
%e | Format floating-point number in scientific notation |
%E | Same as %e, but with an uppercase E |
%g | Uses %f or %e, depending on the magnitude of the value |
%G | Uses %f or %E, depending on the magnitude of the value |
%p | Pointer address in C's printf; not supported by Python's % operator |
Formatting operator helper directives:
Symbol | Function |
---|---|
* | Define width or decimal precision |
- | Left alignment |
+ | Display plus sign (+) before positive numbers |
<sp> | Display space before positive numbers |
# | Display '0o' before octal numbers, '0x' or '0X' before hexadecimal numbers (depending on 'x' or 'X') |
0 | Fill numbers with '0' instead of default spaces |
% | '%%' outputs a single '%' |
(var) | Map variables (dictionary arguments) |
m.n. | m is the minimum total width, n is the number of digits after the decimal point (if applicable) |
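The helper directives in the table can be tried one at a time. The following is a small sketch; the variable names simply mirror the table rows and are not part of any API.

```python
# One example per helper directive.
width = "%5d|" % 42            # minimum width 5, right-aligned
left = "%-5d|" % 42            # '-' left-aligns within the width
plus = "%+d" % 42              # '+' shows the sign of positive numbers
space = "% d" % 42             # ' ' leaves a space where the sign would go
prefixed = "%#o %#x %#X" % (8, 255, 255)  # '#' adds 0o / 0x / 0X prefixes
zero = "%05d" % 42             # '0' pads with zeros instead of spaces
percent = "%d%%" % 50          # '%%' produces a literal percent sign
mapped = "%(name)s is %(age)d" % {"name": "Xiao Ming", "age": 10}
m_n = "%8.3f|" % 3.14159       # width 8, 3 digits after the decimal point
star = "%*d|" % (6, 42)        # '*' takes the width from the argument tuple

for result in (width, left, plus, space, prefixed, zero,
               percent, mapped, m_n, star):
    print(repr(result))
```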
Starting from Python 2.6, a new method for string formatting, str.format(), was added, enhancing string formatting capabilities.
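As a rough illustration of str.format(), the sketch below rewrites the earlier % example and shows a few common placeholder forms. The values are sample data only.

```python
# Positional placeholders are filled in order.
auto = "My name is {} and I'm {} years old!".format('Xiao Ming', 10)

# Numbered placeholders can be reused or reordered.
numbered = "{0} {1} {0}".format('a', 'b')

# Named placeholders take keyword arguments.
named = "{name}: {url}".format(name='tutorialpro', url='www.tutorialpro.org')

# A format specification after ':' controls width, precision, and so on.
aligned = "{:>8.2f}|".format(3.14159)
grouped = "{:,}".format(1234567)

print(auto)
print(numbered)
print(named)
print(aligned)
print(grouped)
```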
Python Triple Quotes
Python triple quotes allow a string to span multiple lines, including newline characters, tabs, and other special characters. Here is an example:
Example (Python 3.0+)
#!/usr/bin/python3
para_str = """This is an example of a multi-line string
Multi-line strings can use tab characters
TAB ( \t ).
They can also use newline characters [ \n ].
"""
print (para_str)
The above example outputs:
This is an example of a multi-line string
Multi-line strings can use tab characters
TAB ( 	 ).
They can also use newline characters [ 
 ].
Triple quotes free programmers from juggling quotes and escape sequences, keeping a block of text in what is known as WYSIWYG (What You See Is What You Get) format.
A typical use case is when you need a piece of HTML or SQL, where string concatenation and special string escapes can become very cumbersome.
errHTML = '''
<HTML><HEAD><TITLE>
Friends CGI Demo</TITLE></HEAD>
<BODY><H3>ERROR</H3>
<B>%s</B><P>
<FORM></FORM>
</BODY></HTML>
'''
cursor.execute('''
CREATE TABLE users (
login VARCHAR(8),
uid INTEGER,
prid INTEGER)
''')
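The two fragments above are not runnable on their own, since cursor is never defined. The sketch below fills in the gaps under stated assumptions: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for whatever database the original had in mind, and the error message and inserted row are invented sample data.

```python
import sqlite3

# The %s placeholder in the triple-quoted template is filled with the % operator.
err_html = '''
<HTML><HEAD><TITLE>
Friends CGI Demo</TITLE></HEAD>
<BODY><H3>ERROR</H3>
<B>%s</B><P>
<FORM></FORM>
</BODY></HTML>
'''
page = err_html % "Invalid login"   # sample message, chosen for illustration
print(page)

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE users (
    login VARCHAR(8),
    uid INTEGER,
    prid INTEGER)
''')
cursor.execute("INSERT INTO users VALUES (?, ?, ?)", ("xiaoming", 1, 10))
rows = cursor.execute("SELECT login, uid, prid FROM users").fetchall()
print(rows)
conn.close()
```

The multi-line SQL statement is passed to execute() exactly as written, with no concatenation or line-continuation characters.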
---
## f-string
f-strings, formally called formatted string literals, were added in Python 3.6 as a new syntax for formatting strings.
Previously, we would typically use the percent sign (%):
## Example
>>> name = 'tutorialpro'
>>> 'Hello %s' % name
'Hello tutorialpro'
An **f-string** is a string literal prefixed with `f`, in which expressions are enclosed in curly braces {}. Python replaces each variable or expression with its computed value, as shown below:
## Example
>>> name = 'tutorialpro'
>>> f'Hello {name}'   # Replace variable
'Hello tutorialpro'
>>> f'{1+2}'   # Use expression
'3'
>>> w = {'name': 'tutorialpro', 'url': 'www.tutorialpro.org'}
>>> f'{w["name"]}: {w["url"]}'
'tutorialpro: www.tutorialpro.org'
This method is clearly simpler, eliminating the need to decide between %s and %d. Starting with Python 3.8, you can add the `=` symbol after an expression to output both the expression text and its result:
## Example
>>> x = 1
>>> print(f'{x+1}')   # Python 3.6
2
>>> x = 1
>>> print(f'{x+1=}')   # Python 3.8
x+1=2
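f-strings also accept a format specification after a colon, in the same style as str.format(). The sketch below is illustrative only; the variable names and values are sample data.

```python
name = 'tutorialpro'
price = 3.14159

quoted = f'{name!r}'        # !r applies repr() to the value
rounded = f'{price:.2f}'    # two digits after the decimal point
padded = f'{name:>15}|'     # right-align in a field of width 15
hexed = f'{255:#x}'         # hexadecimal with the 0x prefix
braces = f'{{literal}}'     # doubled braces produce literal braces

print(quoted, rounded, padded, hexed, braces)
```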
Unicode String
In Python 2, ordinary strings are sequences of 8-bit bytes, while Unicode strings are a separate type that allows for a much wider character set. A Unicode literal is written by prefixing the string with u.
In Python 3, all strings are Unicode strings.
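One practical consequence is that the length of a string (in characters) can differ from the length of its encoded form (in bytes). The following sketch uses "café" as sample text and UTF-8 as the encoding; both are choices made here for illustration.

```python
text = "caf\u00e9"                  # 'café', written with an escape for clarity
encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(len(text))      # 4 characters
print(len(encoded))   # 5 bytes, because 'é' needs two bytes in UTF-8
print(decoded == text)

# With the default errors='strict', unencodable characters raise an exception.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

replaced = text.encode("ascii", errors="replace")  # substitutes '?'
ignored = text.encode("ascii", errors="ignore")    # drops the character
print(strict_failed, replaced, ignored)
```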
Built-in String Functions in Python
Common built-in functions for strings in Python are as follows:
No. | Method and Description |
---|---|
1 | capitalize() Converts the first character of the string to uppercase and the remaining characters to lowercase |
2 | center(width, fillchar) Returns a centered string of specified width, with fillchar as the fill character, defaulting to a space. |
3 | count(str, beg=0, end=len(string)) Returns the number of occurrences of str within the string, or within a specified range if beg and end are provided. |
4 | bytes.decode(encoding="utf-8", errors="strict") In Python 3, str has no decode() method. Instead, call decode() on a bytes object (for example, one produced by str.encode()) to convert it back to a string. |
5 | encode(encoding='UTF-8', errors='strict') Encodes the string to bytes using the specified encoding. By default, an encoding failure raises UnicodeError (a subclass of ValueError) unless errors is set to 'ignore' or 'replace'. |
6 | endswith(suffix, beg=0, end=len(string)) Checks if the string ends with suffix, or within a specified range if beg and end are provided, returning True if so, otherwise False. |
7 | expandtabs(tabsize=8) <br>Converts tab characters in the string to spaces, with the default number of spaces per tab being 8. |
8 | find(str, beg=0, end=len(string)) <br>Checks if str is contained in the string, optionally within the specified range beg and end. Returns the starting index if found, otherwise returns -1. |
9 | index(str, beg=0, end=len(string)) <br>Similar to find(), but raises an exception if str is not found in the string. |
10 | isalnum() <br>Returns True if the string has at least one character and all characters are alphabetic or numeric, otherwise returns False. |
11 | isalpha() <br>Returns True if the string has at least one character and all characters are alphabetic in the Unicode sense (which includes Chinese characters), otherwise returns False. |
12 | isdigit() <br>Returns True if the string has at least one character and all characters are digits, otherwise returns False. |
13 | islower() <br>Returns True if the string contains at least one case-sensitive character and all such characters are lowercase, otherwise returns False. |
14 | isnumeric() <br>Returns True if the string has at least one character and all characters are numeric, otherwise returns False. |
15 | isspace() <br>Returns True if the string has at least one character and all characters are whitespace, otherwise returns False. |
16 | istitle() <br>Returns True if the string is titlecased (as in title()), otherwise returns False. |
17 | isupper() <br>Returns True if the string contains at least one case-sensitive character and all such characters are uppercase, otherwise returns False. |
18 | join(seq) <br>Concatenates the strings in seq, using the string on which join() is called as the separator. Every element of seq must already be a string. |
19 | len(string) <br>Returns the length of the string. |
20 | ljust(width[, fillchar]) <br>Returns the original string left-justified, padded to width length with fillchar (default is a space). |
21 | lower() <br>Converts all uppercase characters in the string to lowercase. |
22 | lstrip() <br>Removes leading spaces or specified characters from the string. |
23 | maketrans() <br>Creates a translation table for character mapping, used with translate(). |
24 | max(str) <br>Returns the character in the string str with the highest Unicode code point. |
25 | min(str) <br>Returns the character in the string str with the lowest Unicode code point. |
26 | replace(old, new [, max]) <br>Replaces old with new in the string, with an optional limit max on the number of replacements. |
27 | rfind(str, beg=0,end=len(string)) <br>Similar to find(), but searches from the right. |
28 | rindex( str, beg=0, end=len(string)) <br>Similar to index(), but searches from the right. |
29 | rjust(width[, fillchar]) <br>Returns the original string right-justified, padded to width length with fillchar (default is a space). |
30 | rstrip() <br>Removes trailing spaces or specified characters from the string. |
31 | split(str=None, num=-1) <br>Splits the string using str as a delimiter (runs of whitespace by default). If num is given, at most num splits are performed, producing at most num+1 substrings. |
32 | splitlines([keepends]) <br>Splits the string at line breaks, returning a list of lines. If keepends is True, line breaks are included. |
33 | startswith(substr, beg=0,end=len(string)) <br>Checks if the string starts with the specified substring substr, optionally within the specified range beg and end. |
34 | strip([chars]) <br>Performs both lstrip() and rstrip() on the string. |
35 | swapcase() <br>Converts uppercase letters to lowercase and lowercase letters to uppercase in the string. |
36 | title() <br>Returns a "titlecased" version of the string where words start with an uppercase letter and the remaining letters are lowercase (see istitle()). |
37 | translate(table) <br>Translates the characters of the string according to table, a mapping usually built with maketrans(). Characters mapped to None are deleted. |
38 | upper() <br>Converts all lowercase letters in the string to uppercase. |
39 | zfill(width) <br>Returns a string of length width, right-justified with the original string, padded with zeros at the beginning. |
40 | isdecimal() <br>Checks if the string contains only decimal characters; returns True if so, otherwise returns False. |
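A few of the methods in the table are easier to understand side by side. The sketch below is only a sampler with made-up sample strings; it does not cover every row.

```python
text = "  Hello, tutorialpro!  "
stripped = text.strip()                # remove leading and trailing whitespace

# find() returns -1 when nothing is found; index() raises ValueError instead.
found = stripped.find("pro")
missing = stripped.find("xyz")
try:
    stripped.index("xyz")
    index_raised = False
except ValueError:
    index_raised = True

# split() with a limit of 2 performs at most 2 splits; join() reverses a split.
parts = "a,b,c,d".split(",", 2)
date = "-".join(["2024", "01", "31"])

# maketrans() builds the table used by translate(); the third argument
# lists characters to delete.
table = str.maketrans("lo", "01", "!")
translated = "Hello World!".translate(table)

# Padding and case helpers.
padded = "42".zfill(5)
centered = "hello".center(11, "*")
titled = "hello world".title()
capitalized = "hELLO".capitalize()

# The three numeric checks differ for characters such as superscripts and fractions.
checks = {ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
          for ch in ("5", "\u00b2", "\u00bd")}   # '5', superscript two, one half

print(stripped, found, missing, index_raised)
print(parts, date, translated)
print(padded, centered, titled, capitalized)
print(checks)
```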