This is a short note about reading tables from common office document formats into R.
xls and xlsx
After running into problems with large worksheets, I switched to the readxl package.
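library(readxl)
# List the names of all sheets in the workbook
sheet_names <- excel_sheets("my_file.xls")
# Read cells A3:C7 of the first sheet; the first row of the range (row 3) supplies the column names
d <- read_excel("my_file.xls", sheet = 1, range = "A3:C7", col_names = TRUE)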
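When a workbook has several sheets, the sheet names returned by excel_sheets() can drive a loop that reads every sheet. The following is a minimal sketch, not a definitive recipe. To keep it runnable it uses the example workbook that ships with readxl (located via readxl_example()); that file and the variable names path and sheets are assumptions for illustration and stand in for your own file.

```r
library(readxl)

# Assumption: use the example workbook bundled with readxl;
# replace with the path to your own file, e.g. "my_file.xls"
path <- readxl_example("datasets.xlsx")

sheet_names <- excel_sheets(path)

# Read every sheet into its own data frame
sheets <- lapply(sheet_names, function(s) read_excel(path, sheet = s))

# Name the list elements after the sheets so they can be accessed by name
names(sheets) <- sheet_names
```

A single sheet is then available as sheets[[1]] or by its name.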
docx
Recently I had to save a .doc file in the .docx format to be able to extract a table from it. Extracting tables from a .docx file works like this:
library(docxtractr)
# Load the document
docx <- read_docx("my_file.docx")
# Extract every table; the result is a list of data frames, one per table
tables <- docx_extract_all_tbls(docx)
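If only one table is needed, docxtractr can also count the tables in a document and extract a single one by its position. This is a small sketch under the assumption that the example document bundled with docxtractr (examples/data.docx) is available; the variable names doc, n_tables and first_table are illustrative, and the file stands in for your own document.

```r
library(docxtractr)

# Assumption: use the example document bundled with docxtractr;
# replace with the path to your own file, e.g. "my_file.docx"
doc <- read_docx(system.file("examples/data.docx", package = "docxtractr"))

# Number of tables found in the document
n_tables <- docx_tbl_count(doc)

# Extract only the first table, treating its first row as the header
first_table <- docx_extract_tbl(doc, tbl_number = 1, header = TRUE)
```

Calling docx_describe_tbls(doc) prints a short summary of each table, which helps when deciding which tbl_number to extract.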