11================================================================================
2- pyexcel-xlsxr - Let you focus on data, instead of file formats
2+ pyexcel-xlsxr - Let you focus on data, instead of xlsx format
33================================================================================
44
55.. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png
@@ -14,8 +14,34 @@ pyexcel-xlsxr - Let you focus on data, instead of file formats
1414.. image:: https://img.shields.io/gitter/room/gitterHQ/gitter.svg
1515   :target: https://gitter.im/pyexcel/Lobby
1616
17- .. image:: https://readthedocs.org/projects/pyexcel-xlsxr/badge/?version=latest
18-    :target: http://pyexcel-xlsxr.readthedocs.org/en/latest/
17+
18+ **pyexcel-xlsxr** is a specialized xlsx reader built on lxml. It reads files partially, meaning
19+ it won't load all content into memory.
20+
21+
22+ Known constraints
23+ ==================
24+
25+ Fonts, colors and charts are not supported.
26+
27+ Installation
28+ ================================================================================
29+
30+
31+ You can install pyexcel-xlsxr via pip:
32+
33+ .. code-block:: bash
34+
35+     $ pip install pyexcel-xlsxr
36+
37+
38+ or clone it and install it:
39+
40+ .. code-block:: bash
41+
42+     $ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
43+     $ cd pyexcel-xlsxr
44+     $ python setup.py install
1945
2046 Support the project
2147================================================================================
@@ -32,36 +58,214 @@ With your financial support, I will be able to invest
3258a little bit more time in coding, documentation and writing interesting posts.
3359
3460
35-
36- Introduction
61+ Usage
3762================================================================================
38- **pyexcel-xlsxr** does Read xlsx file using partial xml.
3963
64+ As a standalone library
65+ --------------------------------------------------------------------------------
4066
67+ .. testcode::
68+    :hide:
4169
42- Installation
43- ================================================================================
70+     >>> import os
71+     >>> import sys
72+     >>> if sys.version_info[0] < 3:
73+     ...     from StringIO import StringIO
74+     ... else:
75+     ...     from io import BytesIO as StringIO
76+     >>> PY2 = sys.version_info[0] == 2
77+     >>> if PY2 and sys.version_info[1] < 7:
78+     ...     from ordereddict import OrderedDict
79+     ... else:
80+     ...     from collections import OrderedDict
4481
45- You can install pyexcel-xlsxr via pip:
4682
47- .. code-block:: bash
83+ .. testcode::
84+    :hide:
4885
49-     $ pip install pyexcel-xlsxr
86+     >>> from pyexcel_xlsxw import save_data
87+     >>> data = OrderedDict()  # from collections import OrderedDict
88+     >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
89+     >>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
90+     >>> save_data("your_file.xlsx", data)
5091
5192
52- or clone it and install it:
93+ Read from an xlsx file
94+ ********************************************************************************
5395
54- .. code-block:: bash
96+ Here's the sample code:
5597
56-     $ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
57-     $ cd pyexcel-xlsxr
58-     $ python setup.py install
98+ .. code-block:: python
99+
100+     >>> from pyexcel_xlsxr import get_data
101+     >>> data = get_data("your_file.xlsx")
102+     >>> import json
103+     >>> print(json.dumps(data))
104+     {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [["row 1", "row 2", "row 3"]]}
105+
106+
107+
108+ .. testcode::
109+    :hide:
110+
111+     >>> data = OrderedDict()
112+     >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
113+     >>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
114+     >>> io = StringIO()
115+     >>> save_data(io, data)
116+     >>> unused = io.seek(0)
117+     >>> # do something with the io
118+     >>> # In reality, you might give it to your http response
119+     >>> # object for downloading
120+
121+
122+
123+
124+ Read an xlsx file from memory
125+ ********************************************************************************
126+
127+ Continuing from the previous example:
128+
129+ .. code-block:: python
130+
131+     >>> # This is just an illustration
132+     >>> # In reality, you might deal with xlsx file upload
133+     >>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
134+     >>> data = get_data(io)
135+     >>> print(json.dumps(data))
136+     {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [[7, 8, 9], [10, 11, 12]]}
59137
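The setup for the in-memory example rewinds the stream with ``io.seek(0)`` before it is read. The following is a minimal standard-library sketch of why that step matters. It uses placeholder bytes instead of a real xlsx payload, and the variable names are illustrative assumptions, not part of pyexcel-xlsxr:

```python
from io import BytesIO

# Illustrative payload only; a real stream would hold xlsx bytes.
stream = BytesIO()
stream.write(b"pretend xlsx bytes")

# After writing, the position sits at the end, so a read returns nothing.
at_end = stream.read()

# Rewinding moves the position back to the start so a reader sees all bytes.
stream.seek(0)
after_rewind = stream.read()

print(at_end)        # b''
print(after_rewind)  # b'pretend xlsx bytes'
```

Any reader handed a freshly written stream would presumably hit the same empty read without the rewind.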
60138
139+ Pagination feature
140+ ********************************************************************************
61141
62- Development guide
142+
143+
144+ Let's assume the following file is a huge xlsx file:
145+
146+ .. code-block:: python
147+
148+     >>> huge_data = [
149+     ...     [1, 21, 31],
150+     ...     [2, 22, 32],
151+     ...     [3, 23, 33],
152+     ...     [4, 24, 34],
153+     ...     [5, 25, 35],
154+     ...     [6, 26, 36]
155+     ... ]
156+     >>> sheetx = {
157+     ...     "huge": huge_data
158+     ... }
159+     >>> save_data("huge_file.xlsx", sheetx)
160+
161+ And let's pretend to read partial data:
162+
163+ .. code-block:: python
164+
165+     >>> partial_data = get_data("huge_file.xlsx", start_row=2, row_limit=3)
166+     >>> print(json.dumps(partial_data))
167+     {"huge": [[3, 23, 33], [4, 24, 34], [5, 25, 35]]}
168+
169+ You can do the same for columns:
170+
171+ .. code-block:: python
172+
173+     >>> partial_data = get_data("huge_file.xlsx", start_column=1, column_limit=2)
174+     >>> print(json.dumps(partial_data))
175+     {"huge": [[21, 31], [22, 32], [23, 33], [24, 34], [25, 35], [26, 36]]}
176+
177+ Obviously, you can do both at the same time:
178+
179+ .. code-block:: python
180+
181+     >>> partial_data = get_data("huge_file.xlsx",
182+     ...     start_row=2, row_limit=3,
183+     ...     start_column=1, column_limit=2)
184+     >>> print(json.dumps(partial_data))
185+     {"huge": [[23, 33], [24, 34], [25, 35]]}
186+
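To make the windowing in the pagination examples concrete, here is a pure-Python sketch that reproduces the same selections on an in-memory list. The ``window`` helper and its convention that a negative limit means "no limit" are assumptions made for illustration. This is not how pyexcel-xlsxr is implemented, and unlike the library it operates on data that is already fully loaded:

```python
def window(rows, start_row=0, row_limit=-1, start_column=0, column_limit=-1):
    """Select a rectangular window from a list of rows.

    Hypothetical helper for illustration only; a negative limit
    is treated here as "no limit".
    """
    row_end = None if row_limit < 0 else start_row + row_limit
    col_end = None if column_limit < 0 else start_column + column_limit
    return [row[start_column:col_end] for row in rows[start_row:row_end]]


huge_data = [
    [1, 21, 31],
    [2, 22, 32],
    [3, 23, 33],
    [4, 24, 34],
    [5, 25, 35],
    [6, 26, 36],
]

# Skip two rows, then keep three rows.
rows_only = window(huge_data, start_row=2, row_limit=3)

# Same rows, but skip the first column and keep two columns.
both = window(huge_data, start_row=2, row_limit=3,
              start_column=1, column_limit=2)

print(rows_only)  # [[3, 23, 33], [4, 24, 34], [5, 25, 35]]
print(both)       # [[23, 33], [24, 34], [25, 35]]
```

Both indexes are zero-based: ``start_row=2`` skips the first two rows, which matches the outputs shown in the examples.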
187+ .. testcode::
188+    :hide:
189+
190+     >>> os.unlink("huge_file.xlsx")
191+
192+
193+ As a pyexcel plugin
194+ --------------------------------------------------------------------------------
195+
196+ Since pyexcel version 0.2.2, an explicit import is no longer needed. Instead,
197+ this library is auto-loaded. So if you want to read data in xlsx format,
198+ installing it is enough.
199+
200+
201+ Reading from an xlsx file
202+ ********************************************************************************
203+
204+ Here is the sample code:
205+
206+ .. code-block:: python
207+
208+     >>> import pyexcel as pe
209+     >>> sheet = pe.get_book(file_name="your_file.xlsx")
210+     >>> sheet
211+     Sheet 1:
212+     +---+---+---+
213+     | 1 | 2 | 3 |
214+     +---+---+---+
215+     | 4 | 5 | 6 |
216+     +---+---+---+
217+     Sheet 2:
218+     +-------+-------+-------+
219+     | row 1 | row 2 | row 3 |
220+     +-------+-------+-------+
221+
222+
223+
224+ .. testcode::
225+    :hide:
226+
227+     >>> sheet.save_as("another_file.xlsx")
228+
229+
230+
231+ Reading from an IO instance
232+ ********************************************************************************
233+
234+ You need to wrap the binary content with a stream to get xlsx working:
235+
236+ .. code-block:: python
237+
238+     >>> # This is just an illustration
239+     >>> # In reality, you might deal with xlsx file upload
240+     >>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
241+     >>> xlsxfile = "another_file.xlsx"
242+     >>> with open(xlsxfile, "rb") as f:
243+     ...     content = f.read()
244+     ...     r = pe.get_book(file_type="xlsx", file_content=content)
245+     ...     print(r)
246+     ...
247+     Sheet 1:
248+     +---+---+---+
249+     | 1 | 2 | 3 |
250+     +---+---+---+
251+     | 4 | 5 | 6 |
252+     +---+---+---+
253+     Sheet 2:
254+     +-------+-------+-------+
255+     | row 1 | row 2 | row 3 |
256+     +-------+-------+-------+
257+
258+
259+
260+
261+ License
63262================================================================================
64263
264+ New BSD License
265+
266+ Developer guide
267+ ==================
268+
65269Development steps for code changes
66270
67271#. git clone https://github.com/pyexcel/pyexcel-xlsxr.git
@@ -133,8 +337,9 @@ Acceptance criteria
133337#. Agree on NEW BSD License for your contribution
134338
135339
340+ .. testcode::
341+    :hide:
136342
137- License
138- ================================================================================
139-
140- New BSD License
343+     >>> import os
344+     >>> os.unlink("your_file.xlsx")
345+     >>> os.unlink("another_file.xlsx")