Energy Engineer (UnB) │ Data Scientist and Analytics (USP)
The programming languages used were PYTHON and VBA EXCEL.
Seeking to simplify and make clear the flow of activities to obtain the final product, this notebook has been divided into 4 parts.
PART 1 - INTRODUCTION
- Containing a short summary of how the project was developed, the basics and bibliography.
PART 2 - EXCEL VBA
- Introducing the
VBAformulas developed in excel to be called inpython.
PART 3 - PYTHON
- Presenting the
pythoncode used for this project.
PART 4 - CONCLUSION
- Final considerations of the project.
- https://www.automateexcel.com/vba-code-examples/
- https://www.rondebruin.nl/index.htm
- https://www.xlwings.org/
- https://pandas.pydata.org/docs/
- http://timgolden.me.uk/pywin32-docs/contents.html
- https://github.com/wesm/pydata-book
- https://github.com/fzumstein/python-for-excel
-
At first, the files are only available in excel
".XLS"format, under the name"vendas-combustíveis-m3.xls". -
Within this initial file, there are two pivot tables that are the target. These are:
-
Pivot Table 1 ) "Vendas, pelas distribuidoras, dos derivados combustíveis de petróleo por Unidade da Federação e produto - 2000-2020 (m3)"
-
Pivot Table 2 ) "Vendas, pelas distribuidoras, de óleo diesel por tipo e Unidade da Federação - 2013-2020 (m3)"
-
-
This data, presented by the pivot tables, does not have its data source easily accessible in another spreadsheet. Also, the data is not available through the Excel shortcut: PivotTableTools>Analyze>Change Data Source. This shows the need to extract them using Excel's own
VBAprogramming language. The advantage of extracting them this way is not only the reduced time for processes that could be long, but the possibility of applying them viapython, through thexlwings library.-
The worksheet, once opened, has by default only one sheet, called "plan1".
-
The macros created in VBA are available in the folder
"\ANP-PROJECT\Codigos_VBA". -
To extract this data, 4 macros were created in VBA. These are presented and described in PART 2 EXCEL
-
-
Once all the VBA - MACROS have been created, they can be called by
pythonand applied there viaxlwings library. -
After applying the Macros on python, the end products of the extraction are two files in
"CSV-UTF8": -
These files were managed via the
Pandas Libraryfrom python. Having the descriptive in PART 3 PYTHON -
Finally, the final product of this project is two files in
"CSV-UTF8"available in the folder"\ANP-PROJECT\Planilhas Finais", according to the following table:
| Column | Type |
|---|---|
| year_month | date |
| uf | string |
| product | string |
| unit | string |
| volume | double |
| created_at | timestamp |