This is a short note about reading tables from common office document formats into R.

xls and xlsx

After running into problems with large worksheets, I switched to the readxl package.

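library(readxl)
# List the worksheet names in the workbook
sheet_names <- excel_sheets("my_file.xls")
# Read cells A3:C7 of the first sheet; the first row of the range becomes the column names
d <- read_excel("my_file.xls", sheet = 1, range = "A3:C7", col_names = TRUE)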
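
When a workbook has several sheets, the two calls above can be combined to read all of them at once. The following is only a sketch: it assumes the example workbook that ships with readxl (located via readxl_example()) so that it runs without a file of your own, and the names path, all_sheets and head_cells are illustrative.

```r
library(readxl)

# Assumption: use a workbook bundled with readxl so the sketch runs anywhere;
# replace this with your own path, e.g. "my_file.xls".
path <- readxl_example("datasets.xlsx")

# Names of all worksheets in the workbook
sheet_names <- excel_sheets(path)

# Read every sheet into its own data frame and name the list after the sheets
all_sheets <- lapply(sheet_names, function(s) read_excel(path, sheet = s))
names(all_sheets) <- sheet_names

# Restrict a read to a rectangular cell range on the first sheet;
# the first row of the range is used as the column names
head_cells <- read_excel(path, sheet = sheet_names[1], range = "A1:C4")
```

Individual sheets can then be picked out by name, for example all_sheets[[sheet_names[1]]].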

docx

Recently I had to save a .doc file in the .docx format before I could extract a table from it. Extracting tables from a .docx file works like this:

library(docxtractr)
docx <- read_docx("my_file.docx")
# Returns a list with one data frame per table found in the document
tables <- docx_extract_all_tbls(docx)
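
If only one table is needed, it may help to check first how many tables the document contains and then extract a single one. This is a minimal sketch, assuming the example document that ships with docxtractr; the names path, doc, n_tables and first_tbl are illustrative.

```r
library(docxtractr)

# Assumption: use the example document bundled with docxtractr;
# replace this with your own path, e.g. "my_file.docx".
path <- system.file("examples/data.docx", package = "docxtractr")
doc <- read_docx(path)

# How many tables does the document contain?
n_tables <- docx_tbl_count(doc)

# Print a short description of each table (dimensions, likely header row)
docx_describe_tbls(doc)

# Extract the first table only, treating its first row as the header
first_tbl <- docx_extract_tbl(doc, tbl_number = 1, header = TRUE)
```

Setting header = FALSE keeps the first row as data, which is useful when the table has no header row.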