📚 mobanize the reader, use common template

chfw · chfw · commit 7e9054cca8c5 · 2017-11-24T22:21:06.000Z
diff --git a/.moban.d/README.rst b/.moban.d/README.rst
@@ -1,8 +1,54 @@
-{% extends "BASIC-README.rst.jj2" %}
+{% extends "README.rst.jj2" %}
 
-{%block constraint%}
+{%block documentation_link%}
 {%endblock%}
 
-{%block features %}
-**{{name}}** does {{description}}.
+{%block description %}
+**{{name}}** is a specialized xlsx reader using lxml. It does partial reading, meaning
+it won't load all content into memory.
+
 {%endblock%}
+
+{% block write_to_file %}
+
+.. testcode::
+   :hide:
+
+    >>> from pyexcel_xlsxw import save_data
+    >>> data = OrderedDict() # from collections import OrderedDict
+    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
+    >>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
+    >>> save_data("your_file.{{file_type}}", data)
+
+{% endblock %}
+
+
+{% block write_to_memory %}
+
+.. testcode::
+   :hide:
+
+    >>> data = OrderedDict()
+    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
+    >>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
+    >>> io = StringIO()
+    >>> save_data(io, data)
+    >>> unused = io.seek(0)
+    >>> # do something with the io
+    >>> # In reality, you might give it to your http response
+    >>> # object for downloading
+
+
+{%endblock%}
+
+{% block pyexcel_write_to_file%}
+
+.. testcode::
+   :hide:
+
+    >>> sheet.save_as("another_file.{{file_type}}")
+
+{% endblock %}
+
+{% block pyexcel_write_to_memory%}
+{% endblock %}
diff --git a/README.rst b/README.rst
@@ -1,5 +1,5 @@
 ================================================================================
-pyexcel-xlsxr - Let you focus on data, instead of file formats
+pyexcel-xlsxr - Let you focus on data, instead of xlsx format
 ================================================================================
 
 .. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png
@@ -14,8 +14,34 @@ pyexcel-xlsxr - Let you focus on data, instead of file formats
 .. image:: https://img.shields.io/gitter/room/gitterHQ/gitter.svg
    :target: https://gitter.im/pyexcel/Lobby
 
-.. image:: https://readthedocs.org/projects/pyexcel-xlsxr/badge/?version=latest
-   :target: http://pyexcel-xlsxr.readthedocs.org/en/latest/
+
+**pyexcel-xlsxr** is a specialized xlsx reader using lxml. It does partial reading, meaning
+it won't load all content into memory.
+
+
+Known constraints
+==================
+
+Fonts, colors and charts are not supported.
+
+Installation
+================================================================================
+
+
+You can install pyexcel-xlsxr via pip:
+
+.. code-block:: bash
+
+    $ pip install pyexcel-xlsxr
+
+
+or clone it and install it:
+
+.. code-block:: bash
+
+    $ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
+    $ cd pyexcel-xlsxr
+    $ python setup.py install
 
 Support the project
 ================================================================================
@@ -32,36 +58,214 @@ With your financial support, I will be able to invest
 a little bit more time in coding, documentation and writing interesting posts.
 
 
-
-Introduction
+Usage
 ================================================================================
-**pyexcel-xlsxr** does Read xlsx file using partial xml.
 
+As a standalone library
+--------------------------------------------------------------------------------
 
+.. testcode::
+   :hide:
 
-Installation
-================================================================================
+    >>> import os
+    >>> import sys
+    >>> if sys.version_info[0] < 3:
+    ...     from StringIO import StringIO
+    ... else:
+    ...     from io import BytesIO as StringIO
+    >>> PY2 = sys.version_info[0] == 2
+    >>> if PY2 and sys.version_info[1] < 7:
+    ...     from ordereddict import OrderedDict
+    ... else:
+    ...     from collections import OrderedDict
 
-You can install pyexcel-xlsxr via pip:
 
-.. code-block:: bash
+.. testcode::
+   :hide:
 
-    $ pip install pyexcel-xlsxr
+    >>> from pyexcel_xlsxw import save_data
+    >>> data = OrderedDict() # from collections import OrderedDict
+    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
+    >>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
+    >>> save_data("your_file.xlsx", data)
 
 
-or clone it and install it:
+Read from an xlsx file
+********************************************************************************
 
-.. code-block:: bash
+Here's the sample code:
 
-    $ git clone https://github.com/pyexcel/pyexcel-xlsxr.git
-    $ cd pyexcel-xlsxr
-    $ python setup.py install
+.. code-block:: python
+
+    >>> from pyexcel_xlsxr import get_data
+    >>> data = get_data("your_file.xlsx")
+    >>> import json
+    >>> print(json.dumps(data))
+    {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [["row 1", "row 2", "row 3"]]}
+
+
+
+.. testcode::
+   :hide:
+
+    >>> data = OrderedDict()
+    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
+    >>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
+    >>> io = StringIO()
+    >>> save_data(io, data)
+    >>> unused = io.seek(0)
+    >>> # do something with the io
+    >>> # In reality, you might give it to your http response
+    >>> # object for downloading
+
+
+
+
+Read an xlsx from memory
+********************************************************************************
+
+Continuing from the previous example:
+
+.. code-block:: python
+
+    >>> # This is just an illustration
+    >>> # In reality, you might deal with xlsx file upload
+    >>> # where you will read from request.FILES['YOUR_XLSX_FILE']
+    >>> data = get_data(io)
+    >>> print(json.dumps(data))
+    {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [[7, 8, 9], [10, 11, 12]]}
 
 
+Pagination feature
+********************************************************************************
 
-Development guide
+
+
+Let's assume the following file is a huge xlsx file:
+
+.. code-block:: python
+
+   >>> huge_data = [
+   ...     [1, 21, 31],
+   ...     [2, 22, 32],
+   ...     [3, 23, 33],
+   ...     [4, 24, 34],
+   ...     [5, 25, 35],
+   ...     [6, 26, 36]
+   ... ]
+   >>> sheetx = {
+   ...     "huge": huge_data
+   ... }
+   >>> save_data("huge_file.xlsx", sheetx)
+
+And let's pretend to read partial data:
+
+.. code-block:: python
+
+   >>> partial_data = get_data("huge_file.xlsx", start_row=2, row_limit=3)
+   >>> print(json.dumps(partial_data))
+   {"huge": [[3, 23, 33], [4, 24, 34], [5, 25, 35]]}
+
+You could do the same for columns:
+
+.. code-block:: python
+
+   >>> partial_data = get_data("huge_file.xlsx", start_column=1, column_limit=2)
+   >>> print(json.dumps(partial_data))
+   {"huge": [[21, 31], [22, 32], [23, 33], [24, 34], [25, 35], [26, 36]]}
+
+Obviously, you could do both at the same time:
+
+.. code-block:: python
+
+   >>> partial_data = get_data("huge_file.xlsx",
+   ...     start_row=2, row_limit=3,
+   ...     start_column=1, column_limit=2)
+   >>> print(json.dumps(partial_data))
+   {"huge": [[23, 33], [24, 34], [25, 35]]}
+
+.. testcode::
+   :hide:
+
+   >>> os.unlink("huge_file.xlsx")
+
+
+As a pyexcel plugin
+--------------------------------------------------------------------------------
+
+Explicit import is no longer needed since pyexcel version 0.2.2. Instead,
+this library is auto-loaded. So if you want to read data in xlsx format,
+installing it is enough.
+
+
+Reading from an xlsx file
+********************************************************************************
+
+Here is the sample code:
+
+.. code-block:: python
+
+    >>> import pyexcel as pe
+    >>> sheet = pe.get_book(file_name="your_file.xlsx")
+    >>> sheet
+    Sheet 1:
+    +---+---+---+
+    | 1 | 2 | 3 |
+    +---+---+---+
+    | 4 | 5 | 6 |
+    +---+---+---+
+    Sheet 2:
+    +-------+-------+-------+
+    | row 1 | row 2 | row 3 |
+    +-------+-------+-------+
+
+
+
+.. testcode::
+   :hide:
+
+    >>> sheet.save_as("another_file.xlsx")
+
+
+
+Reading from an IO instance
+********************************************************************************
+
+You have to wrap the binary content with a stream to get xlsx working:
+
+.. code-block:: python
+
+    >>> # This is just an illustration
+    >>> # In reality, you might deal with xlsx file upload
+    >>> # where you will read from request.FILES['YOUR_XLSX_FILE']
+    >>> xlsxfile = "another_file.xlsx"
+    >>> with open(xlsxfile, "rb") as f:
+    ...     content = f.read()
+    ...     r = pe.get_book(file_type="xlsx", file_content=content)
+    ...     print(r)
+    ...
+    Sheet 1:
+    +---+---+---+
+    | 1 | 2 | 3 |
+    +---+---+---+
+    | 4 | 5 | 6 |
+    +---+---+---+
+    Sheet 2:
+    +-------+-------+-------+
+    | row 1 | row 2 | row 3 |
+    +-------+-------+-------+
+
+
+
+
+License
 ================================================================================
 
+New BSD License
+
+Developer guide
+==================
+
 Development steps for code changes
 
 #. git clone https://github.com/pyexcel/pyexcel-xlsxr.git
@@ -133,8 +337,9 @@ Acceptance criteria
 #. Agree on NEW BSD License for your contribution
 
 
+.. testcode::
+   :hide:
 
-License
-================================================================================
-
-New BSD License
+   >>> import os
+   >>> os.unlink("your_file.xlsx")
+   >>> os.unlink("another_file.xlsx")