R XML File
XML stands for Extensible Markup Language (XML), which is designed for data transmission and storage.
If you are not familiar with XML, you can refer to: XML Tutorial
To read and write XML files in R, you need to install an extension package. You can install it by entering the following command in the R console:
install.packages("XML", repos = "https://mirrors.ustc.edu.cn/CRAN/")
To check if the installation was successful:
> any(grepl("XML", installed.packages()))
[1] TRUE
Create a sites.xml
file, with the XML file and test script in the same directory, the code is as follows:
Example
<sites>
<site>
<id>1</id>
<name>Google</name>
<url>www.google.com</url>
<likes>111</likes>
</site>
<site>
<id>2</id>
<name>tutorialpro</name>
<url>www.tutorialpro.org</url>
<likes>222</likes>
</site>
<site>
<id>3</id>
<name>Taobao</name>
<url>www.taobao.com</url>
<likes>333</likes>
</site>
</sites>
Next, we can use the XML package to load the XML file data:
Example
# Load the XML package
library("XML")
# Set the file name
result <- xmlParse(file = "sites.xml")
# Output the result
print(result)
To count the amount of XML data:
Example
# Load the XML package
library("XML")
# Set the file name
result <- xmlParse(file = "sites.xml")
# Extract the root node
rootnode <- xmlRoot(result)
# Count the data size
rootsize <- xmlSize(rootnode)
# Output the result
print(rootsize)
Executing the above code outputs:
[1] 3
To view node data, use [ ]
for a specific row, and [[ ]]
for a specified row and column:
Example
# Load the XML package
library("XML")
# Set the file name
result <- xmlParse(file = "sites.xml")
# Extract the root node
rootnode <- xmlRoot(result)
# View the data of the 2nd node
print(rootnode[2])
# View the 1st data of the 2nd node
print(rootnode[[2]][[1]])
# View the 3rd data of the 2nd node
print(rootnode[[2]][[3]])
Executing the above code outputs:
$site
<site>
<id>2</id>
<name>tutorialpro</name>
<url>www.tutorialpro.org</url>
<likes>222</likes>
</site>
attr(,"class")
[1] "XMLInternalNodeList" "XMLNodeList"
<id>2</id>
<url>www.tutorialpro.org</url>
Convert XML to Data List
The above outputs are in XML format. We can use the xmlToList()
function to convert the file data to a list format, making it easier to read:
Example
# Load the XML package
library("XML")
# Set the file name
result <- xmlParse(file = "sites.xml")
# Convert to list
xml_data <- xmlToList(result)
print(xml_data)
print("============================")
# Output the data of the 1st row, 2nd column
print(xml_data[[1]][[2]])
Executing the above code outputs:
$site
$site$id
[1] "1"
$site$name
[1] "Google"
$site$url
[1] "www.google.com"
$site$likes
[1] "111"
$site
$site$id
[1] "2"
$site$name
[1] "tutorialpro"
$site$url [1] "www.tutorialpro.org"
$site$likes [1] "222"
$site $site$id [1] "3"
$site$name [1] "Taobao"
$site$url [1] "www.taobao.com"
$site$likes [1] "333"
[1] "============================" [1] "Google"