Commit 7e9054c

📚 mobanize the reader, use common template
1 parent 862d038 commit 7e9054c

2 files changed: +276 -25 lines changed

.moban.d/README.rst

Lines changed: 50 additions & 4 deletions
@@ -1,8 +1,54 @@
1-
{% extends "BASIC-README.rst.jj2" %}
1+
{% extends "README.rst.jj2" %}
22

3-
{%block constraint%}
3+
{%block documentation_link%}
44
{%endblock%}
55

6-
{%block features %}
7-
**{{name}}** does {{description}}.
6+
{%block description %}
7+
**{{name}}** is a specialized xlsx reader using lxml. It does partial reading, meaning
8+
it won't load all content into memory.
9+
810
{%endblock%}
11+
12+
{% block write_to_file %}
13+
14+
.. testcode::
15+
:hide:
16+
17+
>>> from pyexcel_xlsxw import save_data
18+
>>> data = OrderedDict() # from collections import OrderedDict
19+
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
20+
>>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
21+
>>> save_data("your_file.{{file_type}}", data)
22+
23+
{% endblock %}
24+
25+
26+
{% block write_to_memory %}
27+
28+
.. testcode::
29+
:hide:
30+
31+
>>> data = OrderedDict()
32+
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
33+
>>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
34+
>>> io = StringIO()
35+
>>> save_data(io, data)
36+
>>> unused = io.seek(0)
37+
>>> # do something with the io
38+
>>> # In reality, you might give it to your http response
39+
>>> # object for downloading
40+
41+
42+
{%endblock%}
43+
44+
{% block pyexcel_write_to_file%}
45+
46+
.. testcode::
47+
:hide:
48+
49+
>>> sheet.save_as("another_file.{{file_type}}")
50+
51+
{% endblock %}
52+
53+
{% block pyexcel_write_to_memory%}
54+
{% endblock %}

README.rst

Lines changed: 226 additions & 21 deletions
@@ -1,5 +1,5 @@
11
================================================================================
2-
pyexcel-xlsxr - Let you focus on data, instead of file formats
2+
pyexcel-xlsxr - Let you focus on data, instead of xlsx format
33
================================================================================
44

55
.. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png
@@ -14,8 +14,34 @@ pyexcel-xlsxr - Let you focus on data, instead of file formats
1414
.. image:: https://img.shields.io/gitter/room/gitterHQ/gitter.svg
1515
:target: https://gitter.im/pyexcel/Lobby
1616

17-
.. image:: https://readthedocs.org/projects/pyexcel-xlsxr/badge/?version=latest
18-
:target: http://pyexcel-xlsxr.readthedocs.org/en/latest/
17+
18+
**pyexcel-xlsxr** is a specialized xlsx reader using lxml. It does partial reading, meaning
19+
it won't load all content into memory.
20+
21+
22+
Known constraints
23+
==================
24+
25+
Fonts, colors and charts are not supported.
26+
27+
Installation
28+
================================================================================
29+
30+
31+
You can install pyexcel-xlsxr via pip:
32+
33+
.. code-block:: bash
34+
35+
$ pip install pyexcel-xlsxr
36+
37+
38+
or clone it and install it:
39+
40+
.. code-block:: bash
41+
42+
$ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
43+
$ cd pyexcel-xlsxr
44+
$ python setup.py install
1945
2046
Support the project
2147
================================================================================
@@ -32,36 +58,214 @@ With your financial support, I will be able to invest
3258
a little bit more time in coding, documentation and writing interesting posts.
3359

3460

35-
36-
Introduction
61+
Usage
3762
================================================================================
38-
**pyexcel-xlsxr** does Read xlsx file using partial xml.
3963

64+
As a standalone library
65+
--------------------------------------------------------------------------------
4066

67+
.. testcode::
68+
:hide:
4169

42-
Installation
43-
================================================================================
70+
>>> import os
71+
>>> import sys
72+
>>> if sys.version_info[0] < 3:
73+
... from StringIO import StringIO
74+
... else:
75+
... from io import BytesIO as StringIO
76+
>>> PY2 = sys.version_info[0] == 2
77+
>>> if PY2 and sys.version_info[1] < 7:
78+
... from ordereddict import OrderedDict
79+
... else:
80+
... from collections import OrderedDict
4481

45-
You can install pyexcel-xlsxr via pip:
4682

47-
.. code-block:: bash
83+
.. testcode::
84+
:hide:
4885

49-
$ pip install pyexcel-xlsxr
86+
>>> from pyexcel_xlsxw import save_data
87+
>>> data = OrderedDict() # from collections import OrderedDict
88+
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
89+
>>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
90+
>>> save_data("your_file.xlsx", data)
5091

5192

52-
or clone it and install it:
93+
Read from an xlsx file
94+
********************************************************************************
5395

54-
.. code-block:: bash
96+
Here's the sample code:
5597

56-
$ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
57-
$ cd pyexcel-xlsxr
58-
$ python setup.py install
98+
.. code-block:: python
99+
100+
>>> from pyexcel_xlsxr import get_data
101+
>>> data = get_data("your_file.xlsx")
102+
>>> import json
103+
>>> print(json.dumps(data))
104+
{"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [["row 1", "row 2", "row 3"]]}
105+
106+
107+
108+
.. testcode::
109+
:hide:
110+
111+
>>> data = OrderedDict()
112+
>>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
113+
>>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
114+
>>> io = StringIO()
115+
>>> save_data(io, data)
116+
>>> unused = io.seek(0)
117+
>>> # do something with the io
118+
>>> # In reality, you might give it to your http response
119+
>>> # object for downloading
120+
121+
122+
123+
124+
Read from an xlsx from memory
125+
********************************************************************************
126+
127+
Continue from previous example:
128+
129+
.. code-block:: python
130+
131+
>>> # This is just an illustration
132+
>>> # In reality, you might deal with xlsx file upload
133+
>>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
134+
>>> data = get_data(io)
135+
>>> print(json.dumps(data))
136+
{"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [[7, 8, 9], [10, 11, 12]]}
59137
60138
139+
Pagination feature
140+
********************************************************************************
61141

62-
Development guide
142+
143+
144+
Let's assume the following file is a huge xlsx file:
145+
146+
.. code-block:: python
147+
148+
>>> huge_data = [
149+
... [1, 21, 31],
150+
... [2, 22, 32],
151+
... [3, 23, 33],
152+
... [4, 24, 34],
153+
... [5, 25, 35],
154+
... [6, 26, 36]
155+
... ]
156+
>>> sheetx = {
157+
... "huge": huge_data
158+
... }
159+
>>> save_data("huge_file.xlsx", sheetx)
160+
161+
And let's pretend to read partial data:
162+
163+
.. code-block:: python
164+
165+
>>> partial_data = get_data("huge_file.xlsx", start_row=2, row_limit=3)
166+
>>> print(json.dumps(partial_data))
167+
{"huge": [[3, 23, 33], [4, 24, 34], [5, 25, 35]]}
168+
169+
And you could as well do the same for columns:
170+
171+
.. code-block:: python
172+
173+
>>> partial_data = get_data("huge_file.xlsx", start_column=1, column_limit=2)
174+
>>> print(json.dumps(partial_data))
175+
{"huge": [[21, 31], [22, 32], [23, 33], [24, 34], [25, 35], [26, 36]]}
176+
177+
Obviously, you could do both at the same time:
178+
179+
.. code-block:: python
180+
181+
>>> partial_data = get_data("huge_file.xlsx",
182+
... start_row=2, row_limit=3,
183+
... start_column=1, column_limit=2)
184+
>>> print(json.dumps(partial_data))
185+
{"huge": [[23, 33], [24, 34], [25, 35]]}
186+
187+
.. testcode::
188+
:hide:
189+
190+
>>> os.unlink("huge_file.xlsx")
191+
192+
193+
As a pyexcel plugin
194+
--------------------------------------------------------------------------------
195+
196+
An explicit import is no longer needed since pyexcel version 0.2.2. Instead,
197+
this library is auto-loaded. So if you want to read data in xlsx format,
198+
installing it is enough.
199+
200+
201+
Reading from an xlsx file
202+
********************************************************************************
203+
204+
Here is the sample code:
205+
206+
.. code-block:: python
207+
208+
>>> import pyexcel as pe
209+
>>> sheet = pe.get_book(file_name="your_file.xlsx")
210+
>>> sheet
211+
Sheet 1:
212+
+---+---+---+
213+
| 1 | 2 | 3 |
214+
+---+---+---+
215+
| 4 | 5 | 6 |
216+
+---+---+---+
217+
Sheet 2:
218+
+-------+-------+-------+
219+
| row 1 | row 2 | row 3 |
220+
+-------+-------+-------+
221+
222+
223+
224+
.. testcode::
225+
:hide:
226+
227+
>>> sheet.save_as("another_file.xlsx")
228+
229+
230+
231+
Reading from an IO instance
232+
********************************************************************************
233+
234+
You need to wrap the binary content with a stream to get xlsx working:
235+
236+
.. code-block:: python
237+
238+
>>> # This is just an illustration
239+
>>> # In reality, you might deal with xlsx file upload
240+
>>> # where you will read from requests.FILES['YOUR_XLSX_FILE']
241+
>>> xlsxfile = "another_file.xlsx"
242+
>>> with open(xlsxfile, "rb") as f:
243+
... content = f.read()
244+
... r = pe.get_book(file_type="xlsx", file_content=content)
245+
... print(r)
246+
...
247+
Sheet 1:
248+
+---+---+---+
249+
| 1 | 2 | 3 |
250+
+---+---+---+
251+
| 4 | 5 | 6 |
252+
+---+---+---+
253+
Sheet 2:
254+
+-------+-------+-------+
255+
| row 1 | row 2 | row 3 |
256+
+-------+-------+-------+
257+
258+
259+
260+
261+
License
63262
================================================================================
64263

264+
New BSD License
265+
266+
Developer guide
267+
==================
268+
65269
Development steps for code changes
66270

67271
#. git clone https://github.com/pyexcel/pyexcel-xlsxr.git
@@ -133,8 +337,9 @@ Acceptance criteria
133337
#. Agree on NEW BSD License for your contribution
134338

135339

340+
.. testcode::
341+
:hide:
136342

137-
License
138-
================================================================================
139-
140-
New BSD License
343+
>>> import os
344+
>>> os.unlink("your_file.xlsx")
345+
>>> os.unlink("another_file.xlsx")
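
Note on the pagination examples in the README.rst diff above: the `start_row`, `row_limit`, `start_column` and `column_limit` arguments to `get_data` produce results that look like plain list slicing. The sketch below is only an illustration of that windowing, written against in-memory lists. The `paginate` helper and its `None` defaults are hypothetical and are not part of pyexcel-xlsxr, and the library's actual implementation (which reads the xlsx partially rather than slicing a loaded list) is not shown in this commit.

```python
def paginate(rows, start_row=0, row_limit=None,
             start_column=0, column_limit=None):
    """Return a window of `rows` (hypothetical helper, not the library API).

    A limit of None means "no limit" in this sketch; that default is an
    assumption made for illustration only.
    """
    row_end = None if row_limit is None else start_row + row_limit
    col_end = None if column_limit is None else start_column + column_limit
    # Slice the rows first, then slice each remaining row by column.
    return [row[start_column:col_end] for row in rows[start_row:row_end]]


# Same sample data as the "huge" sheet in the README examples.
huge_data = [
    [1, 21, 31],
    [2, 22, 32],
    [3, 23, 33],
    [4, 24, 34],
    [5, 25, 35],
    [6, 26, 36],
]

rows_only = paginate(huge_data, start_row=2, row_limit=3)
columns_only = paginate(huge_data, start_column=1, column_limit=2)
both = paginate(huge_data, start_row=2, row_limit=3,
                start_column=1, column_limit=2)

print(rows_only)
print(columns_only)
print(both)
```

With the sample data, the three windows correspond to the three outputs printed in the README's pagination section for the "huge" sheet.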
