FromXLSXOperator

XLSX

Convert an XLSX/XLS file into a Parquet, CSV, JSON, or JSON Lines file


Last Updated: Sep. 6, 2021

Access Instructions

Install the XLSX provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
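As a minimal sketch of the two steps above: the import path `xlsx_provider.operators.from_xlsx_operator` is an assumption and should be checked against the installed provider package, and the task name and file paths are hypothetical. Only the keyword names come from the parameter list on this page.

```python
# Keyword arguments for the operator; every value here is a hypothetical example.
OPERATOR_KWARGS = {
    "task_id": "xlsx_to_parquet",             # hypothetical task name
    "source": "/data/input/report.xlsx",      # hypothetical input path
    "target": "/data/output/report.parquet",  # hypothetical output path
    "worksheet": 0,                           # first worksheet (zero-based)
    "file_format": "parquet",
}


def build_convert_task(dag=None):
    """Instantiate the operator, optionally attaching it to a DAG."""
    # Assumed import path; verify it against the installed provider package.
    from xlsx_provider.operators.from_xlsx_operator import FromXLSXOperator

    return FromXLSXOperator(dag=dag, **OPERATOR_KWARGS)
```

The import is kept inside the function so the module can still be loaded where the provider is not installed; in a real DAG file it would normally sit at the top of the file.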

Parameters

source (str): Source filename (XLSX or XLS, templated)
target (str): Target filename (templated)
worksheet (str or int): Worksheet title or number (zero-based, templated)
skip_rows (int): Number of input lines to skip (default: 0, templated)
limit (int): Row limit (default: None, templated)
drop_columns (list of str): List of columns to be dropped
add_columns (list of str or dictionary of string key/value pairs): Columns to be added (dict or list column=value)
types (str or dictionary of string key/value pairs): Force Parquet column types (dict or list column='str', 'd', 'datetime64[ns]')
column_names (list of str): Force column names (list)
file_format (str): Output file format (parquet, csv, json, jsonl)
csv_delimiter (str): CSV delimiter (default: ',')
csv_header (str): Convert CSV output header case ('lower', 'upper', 'skip')
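The optional parameters can be combined for CSV output. The sketch below only assembles keyword arguments; all values are hypothetical, and `pairs_to_dict` is an illustrative helper (not part of the provider) that shows how the list form `column=value` of `add_columns` carries the same information as the dict form.

```python
# Hypothetical values throughout; only the parameter names come from the list above.
csv_kwargs = {
    "task_id": "xlsx_to_csv",
    "source": "/data/input/report.xlsx",
    "target": "/data/output/report.csv",
    "worksheet": "Sheet1",              # a title, or a zero-based number such as 0
    "skip_rows": 2,                     # skip the first two input lines
    "limit": 1000,                      # stop after 1000 rows
    "drop_columns": ["notes"],
    "add_columns": {"region": "emea"},  # dict form
    "file_format": "csv",
    "csv_delimiter": ";",
    "csv_header": "lower",
}


def pairs_to_dict(pairs):
    """Turn ["column=value", ...] into {"column": "value", ...}."""
    # Split on the first "=" only, so values may themselves contain "=".
    return dict(item.split("=", 1) for item in pairs)


# The list form of add_columns expresses the same mapping as the dict form.
assert pairs_to_dict(["region=emea"]) == csv_kwargs["add_columns"]
```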

Documentation

Read an XLSX or XLS file and convert it into a Parquet, CSV, JSON, or JSON Lines (one line per record) file.
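To illustrate what "one line per record" means for JSON Lines, here is a standard-library sketch with hypothetical records. It does not use the operator; it only contrasts a single JSON array with the JSON Lines layout, and whether the operator's `json` output takes exactly the array form is an assumption.

```python
import json

# Hypothetical records standing in for worksheet rows.
records = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
]

# JSON: all records serialized as one document (here, a single array).
json_text = json.dumps(records)

# JSON Lines: each record serialized on its own line.
jsonl_text = "\n".join(json.dumps(record) for record in records)

print(json_text)
print(jsonl_text)
```

JSON Lines output can be read back one record at a time by parsing each line separately, which is convenient for large files.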
