Showing posts with label python. Show all posts

Sunday, June 15, 2014

Accurate Eye Center Location through Invariant Isocentric Patterns



Below is a Python implementation of the paper "Accurate Eye Center Location through Invariant Isocentric Patterns". Most of the comments in the code are copied directly from the paper.

Download: https://gist.github.com/dimitrs/d667ccb433d5f2baa8df




Tuesday, April 01, 2014

Mixing OpenCV and SciKit-image


I saw a Mathematica post that described how to detect and flatten a label on a jar. My goal here is to do something similar in Python. I am familiar with OpenCV-Python, which is what I have always used for my computer vision projects, but it occurred to me that there is no reason why I should use only OpenCV-Python. I could use both OpenCV-Python and SciKit-image at the same time. After all, they are both based on NumPy. For this project I start with OpenCV-Python and then switch to SciKit-image for the last step.

jar.png


#Import both skimage and cv
from skimage import transform as tf
from skimage import io
import cv2

import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt

# Could use either skimage or cv to read the image
# img = io.imread('jar.png')   
img = cv2.imread('jar.png')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray_image,127,255,cv2.THRESH_BINARY)
edges = cv2.Canny(thresh ,100, 200)

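# Find largest contour (should be the label)
# RETR_EXTERNAL (0) keeps only outer contours; CHAIN_APPROX_NONE (1) keeps every point
contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)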
areas = [cv2.contourArea(c) for c in contours]
max_index = np.argmax(areas)
cnt=contours[max_index]

# Create a mask of the label
mask = np.zeros(img.shape,np.uint8)
cv2.drawContours(mask, [cnt],0,255,-1)

Mask of the label



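# Find the 4 borders
scale = 1
delta = 0
ddepth = cv2.CV_8U
borderType = cv2.BORDER_DEFAULT
# With an 8-bit output, negative gradients saturate to 0, so flipping the
# sign of scale selects the opposite edge of the mask.
left = cv2.Sobel(mask, ddepth, 1, 0, ksize=1, scale=1, delta=0, borderType=borderType)
right = cv2.Sobel(mask, ddepth, 1, 0, ksize=1, scale=-1, delta=0, borderType=borderType)
top = cv2.Sobel(mask, ddepth, 0, 1, ksize=1, scale=1, delta=0, borderType=borderType)
bottom = cv2.Sobel(mask, ddepth, 0, 1, ksize=1, scale=-1, delta=0, borderType=borderType)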
Left & Right borders after Sobel

# Remove noise from borders
kernel = np.ones((2,2),np.uint8)
left_border = cv2.erode(left,kernel,iterations = 1)
right_border = cv2.erode(right,kernel,iterations = 1)
top_border = cv2.erode(top,kernel,iterations = 1)
bottom_border = cv2.erode(bottom,kernel,iterations = 1)
Left & Right borders after calling erode


# Find coefficients c1,c2, ... ,c7,c8 by minimizing the error function.
# Points on the left border should be mapped to (0,anything).
# Points on the right border should be mapped to (108,anything)
# Points on the top border should be mapped to (anything,0)
# Points on the bottom border should be mapped to (anything,70)
# Equations 1 and 2: 
#    c1 + c2*x + c3*y + c4*x*y, c5 + c6*y + c7*x + c8*x^2

sumOfSquares_y = '+'.join(["(c[0]+c[1]*%s+c[2]*%s+c[3]*%s*%s)**2" %
    (x,y,x,y) for y,x,z in np.transpose(np.nonzero(left_border)) ])
sumOfSquares_y += " + "
sumOfSquares_y += \
    '+'.join(["(-108+c[0]+c[1]*%s+c[2]*%s+c[3]*%s*%s)**2" % \
    (x,y,x,y) for y,x,z in np.transpose(np.nonzero(right_border)) ])
res_y = optimize.minimize(lambda c: eval(sumOfSquares_y),(0,0,0,0),method='SLSQP')

sumOfSquares_x = \
    '+'.join(["(-70+c[0]+c[1]*%s+c[2]*%s+c[3]*%s*%s)**2" % \
    (y,x,x,x) for y,x,z in np.transpose(np.nonzero(bottom_border))])
sumOfSquares_x += " + "
sumOfSquares_x += \
    '+'.join( [ "(c[0]+c[1]*%s+c[2]*%s+c[3]*%s*%s)**2" % \
    (y,x,x,x) for y,x,z in np.transpose(np.nonzero(top_border)) ] )
res_x = optimize.minimize(lambda c: eval(sumOfSquares_x),(0,0,0,0), method='SLSQP')
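
Because both error functions are linear in the coefficients, the same kind of fit can also be posed as an ordinary linear least-squares problem, which avoids building and evaluating a long expression string. The sketch below is not the code used for the images in this post. It assumes NumPy's np.linalg.lstsq, two hypothetical helper names (fit_bilinear and apply_bilinear), and synthetic, perfectly vertical borders in place of the eroded Sobel output.

```python
import numpy as np

def fit_bilinear(points_a, target_a, points_b, target_b):
    """Least-squares fit of c0 + c1*x + c2*y + c3*x*y.

    Points in points_a should map to target_a and points in points_b
    to target_b. Both inputs are sequences of (x, y) pairs.
    """
    pts = np.vstack([points_a, points_b]).astype(float)
    x, y = pts[:, 0], pts[:, 1]
    # One row per border pixel, one column per coefficient.
    design = np.column_stack([np.ones_like(x), x, y, x * y])
    targets = np.concatenate([
        np.full(len(points_a), float(target_a)),
        np.full(len(points_b), float(target_b)),
    ])
    coeffs, _, _, _ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs

def apply_bilinear(coeffs, x, y):
    """Evaluate the fitted mapping at pixel (x, y)."""
    return coeffs[0] + coeffs[1] * x + coeffs[2] * y + coeffs[3] * x * y

# Synthetic stand-in for the left and right borders (assumed data).
left = [(10, y) for y in range(20)]
right = [(118, y) for y in range(20)]
c = fit_bilinear(left, 0, right, 108)
```

With real border pixels the fit will not be exact, and the residuals give a rough idea of how much distortion to expect.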


# Map the image using equations 1 and 2 (coefficients c1...c8 in res_x and res_y)
def map_x(res, cord):
    m = res[0]+res[1]*cord[1]+res[2]*cord[0]+res[3]*cord[1]*cord[0]
    return m
def map_y(res, cord):
    m = res[0]+res[1]*cord[0]+res[2]*cord[1]+res[3]*cord[1]*cord[1]
    return m

flattened = np.zeros(img.shape, img.dtype) 
for y,x,z in np.transpose(np.nonzero(mask)):
    new_y = map_y(res_x.x,[y,x]) 
    new_x = map_x(res_y.x,[y,x])
    # Array indices must be integers, so truncate the mapped coordinates
    flattened[int(new_y)][int(new_x)] = img[y][x]
# Crop the image 
flattened = flattened[0:70, 0:105]
Flattened Image

There is a fair amount of distortion in the flattened image. Alternatively, use PiecewiseAffineTransform from SciKit-image:
# Use skimage to transform the image
leftmost = tuple(cnt[cnt[:,:,0].argmin()][0])
rightmost = tuple(cnt[cnt[:,:,0].argmax()][0])
topmost = tuple(cnt[cnt[:,:,1].argmin()][0])
bottommost = tuple(cnt[cnt[:,:,1].argmax()][0])

dst = list()
src = list()
for y,x,z in np.transpose(np.nonzero(top_border)):
    dst.append([x,y])
    src.append([x,topmost[1]])
for y,x,z in np.transpose(np.nonzero(bottom_border)):
    dst.append([x,y])
    src.append([x,bottommost[1]])
for y,x,z in np.transpose(np.nonzero(left_border)):
    dst.append([x,y])
    src.append([leftmost[0],y])
for y,x,z in np.transpose(np.nonzero(right_border)):
    dst.append([x,y])
    src.append([rightmost[0],y])
src = np.array(src)
dst = np.array(dst)

tform3 = tf.PiecewiseAffineTransform()
tform3.estimate(src, dst)
warped = tf.warp(img, tform3, order=2)
warped = warped[85:170, 31:138]

Flattened label using skimage

Friday, March 09, 2012

Use WEKA in your Python code

Weka is a collection of machine learning algorithms that can either be applied directly to a dataset or called from your own Java code. There is an article called “Use WEKA in your Java code” which, as its title suggests, explains how to use WEKA from your Java code. This is not a surprising thing to do, since Weka is implemented in Java. As the title of this post suggests, I will describe how to use WEKA from your Python code instead.

If you have built an entire software system in Python, you might be reluctant to look at libraries in other languages. After all, there are a huge number of excellent Python libraries, and many good machine-learning libraries written in Python or in C and C++ with Python bindings. However, as far as I am concerned, it would be a pity not to make use of Weka just because it is written in Java. It is one of the best-known machine-learning libraries around, with an extensive number of implemented algorithms. What’s more, there are very few data stream mining libraries around, and MOA, which is related to Weka and also written in Java, is the best I have seen.

I use JPype (http://jpype.sourceforge.net/) to access Weka class libraries. Once you have it installed, download the latest Weka and MOA versions and copy moa.jar, sizeofag.jar and weka.jar into your working directory. Below you can see the full Python listing of the test application. The code initializes the JVM, imports some Weka packages and classes, reads a data set, splits it into a training set and a test set, trains a J48 tree classifier and then tests it. If you are familiar with Weka, this will all be very easy.

In a separate post, I will explore how easy it is to use MOA in the same way.

# Initialize the specified JVM
from jpype import *

options = [
"-Xmx4G",
# java.class.path can be set only once, so list both jars in it
# (use ";" instead of ":" as the separator on Windows).
"-Djava.class.path=./moa.jar:./weka.jar",
"-javaagent:sizeofag.jar",
]
startJVM(getDefaultJVMPath(), *options)

# Import java/weka packages and classes
Trees = JPackage("weka.classifiers.trees")
Filter = JClass("weka.filters.Filter")
Attribute = JPackage("weka.filters.unsupervised.attribute")
Instance = JPackage("weka.filters.unsupervised.instance")
RemovePercentage = JClass("weka.filters.unsupervised.instance.RemovePercentage")
Remove = JClass("weka.filters.unsupervised.attribute.Remove")
Classifier = JClass("weka.classifiers.Classifier")
NaiveBayes = JClass("weka.classifiers.bayes.NaiveBayes")
Evaluation = JClass("weka.classifiers.Evaluation")
FilteredClassifier = JClass("weka.classifiers.meta.FilteredClassifier")
Instances = JClass("weka.core.Instances")
BufferedReader = JClass("java.io.BufferedReader")
FileReader = JClass("java.io.FileReader")
Random = JClass("java.util.Random")


#Reading from an ARFF file
reader = BufferedReader(FileReader("./iris.arff"))
data = Instances(reader)
reader.close()
data.setClassIndex(data.numAttributes() - 1) # setting class attribute

# Standardizes all numeric attributes in the given dataset to have zero mean and unit variance, apart from the class attribute.
standardizeFilter = Attribute.Standardize()
standardizeFilter.setInputFormat(data)
data = Filter.useFilter(data, standardizeFilter)

# Randomly shuffles the order of instances passed through it.
randomizeFilter = Instance.Randomize()
randomizeFilter.setInputFormat(data)
data = Filter.useFilter(data, randomizeFilter)

# Creating train set
removeFilter = RemovePercentage()
removeFilter.setInputFormat(data)
removeFilter.setPercentage(30.0)
removeFilter.setInvertSelection(False)
trainData = Filter.useFilter(data, removeFilter)

# Creating test set
removeFilter.setInputFormat(data)
removeFilter.setPercentage(30.0)
removeFilter.setInvertSelection(True)
testData = Filter.useFilter(data, removeFilter)

# Create classifier
j48 = Trees.J48()
j48.setUnpruned(True) # using an unpruned J48
j48.buildClassifier(trainData)

print("Number Training Data", trainData.numInstances(), data.numInstances())
print("Number Test Data", testData.numInstances())

# Test classifier
for i in range(testData.numInstances()):
    pred = j48.classifyInstance(testData.instance(i))
    print("ID:", testData.instance(i).value(0), end=" ")
    print("actual:", testData.classAttribute().value(int(testData.instance(i).classValue())), end=" ")
    print("predicted:", testData.classAttribute().value(int(pred)))

shutdownJVM()
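
The two RemovePercentage passes are what produce the 70/30 split. Assuming the filter drops the first N percent of the instances, and keeps only that leading N percent when the selection is inverted, the behaviour can be sketched in pure Python. The function name is hypothetical and plain lists stand in for Weka's Instances; this illustrates the idea and is not Weka's implementation.

```python
import math

def remove_percentage(rows, percentage, invert=False):
    """Drop the first `percentage` percent of rows, or keep only that
    leading slice when invert is True."""
    # Round half up to get the cut-off index.
    cut = int(math.floor(len(rows) * percentage / 100.0 + 0.5))
    return rows[:cut] if invert else rows[cut:]

data = list(range(150))   # stand-in for the 150 iris instances
train = remove_percentage(data, 30.0, invert=False)
test = remove_percentage(data, 30.0, invert=True)
print(len(train), len(test))
```

Because the two calls use the same cut-off, the training and test sets never overlap, which is why the data is shuffled with Randomize before the split.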

Wednesday, January 04, 2012

How-to embed HTML in a ReStructuredText PDF document


Download: rst2pdf_html.tar.gz 

I use rst2pdf to create PDF documents from reStructuredText and, on occasion, I have needed to embed HTML in those documents. Unfortunately, stock rst2pdf does not support HTML embedding. With the version of rst2pdf provided here, however, you can embed HTML as I have done in the example below. The email addresses are rendered as HTML using the “raw:: html” directive. This will only work with this version of rst2pdf, and you will need to have xhtml2pdf installed.


reStructuredText for the above example:

.. list-table::
   :widths: 50 50 
   :header-rows: 1

   * - **Name**   
     - **e-mail**   
   * - Pedro
     - .. raw:: html
       
         <p><a href="mailto:pedro@address.es">pedro@mailaddress.es</a></p>
   * - Dimitri
     - .. raw:: html
       
         <p><a href="mailto:dimitr@mail.es">dimitr@mailaddress.es</a></p>
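
When the rows come from a program rather than being typed by hand, the same table can be generated with a few lines of Python. The sketch below only builds the reStructuredText. The function name and the sample row are made up for illustration, and rendering still needs the modified rst2pdf and xhtml2pdf described above.

```python
from html import escape

def email_table(rows):
    """Return a list-table whose second column is a raw HTML mailto link."""
    lines = [
        ".. list-table::",
        "   :widths: 50 50",
        "   :header-rows: 1",
        "",
        "   * - **Name**",
        "     - **e-mail**",
    ]
    for name, address in rows:
        safe = escape(address, quote=True)
        lines += [
            "   * - %s" % name,
            "     - .. raw:: html",
            "",
            # The directive body is indented relative to ".. raw:: html".
            '         <p><a href="mailto:%s">%s</a></p>' % (safe, safe),
        ]
    return "\n".join(lines) + "\n"

table = email_table([("Pedro", "pedro@example.org")])
print(table)
```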

Friday, September 30, 2011

Web applications in Python without Javascript, CSS, HTML


Download: rctk_zk.tar.gz

If you would like to know how to write rich web applications in pure Python without any knowledge of Javascript, CSS, HTML or Ajax, then continue reading this post.


Take a look at the screen-shot. It is the demo application of a framework called RCTK – which makes the development of complex web applications as easy as desktop applications. As the RCTK documentation puts it, the API has been inspired by Tkinter. So if you are comfortable with Tkinter (or for example wxPython), you don't have to be a web programmer to create impressive web applications. Even though applications like the one in the screen-shot are programmed in pure Python, RCTK still has to render Javascript widgets in the browser, transparently to the programmer of course. Currently, two different front-end tool-kits are supported for this: Jquery (see demo) and Qooxdoo (see demo).

If you have just looked at the demos, you will be wondering why my screen-shot looks different. Not even the creators of RCTK will fully recognise it. On the other hand, if you have ever used the ZK Java framework then you will probably be familiar with the “classic” blue colour theme and “Grid” widget shown in the screen-shot. I have extracted the Javascript part of ZK and added it to RCTK as an optional third front-end tool-kit. I will contribute the source code to the RCTK project as soon as I can. However, not all ZK components are available. For the moment, it is limited to the current RCTK API. Hopefully, we will expand the API to add other widgets, layouts and features such as border layout, html editor, drag-and-drop and other ZK components. Take a look at the ZK demo to see what components are available.

Below you can see the source code for the grid in the screen-shot. Simple!

class Demo(object):
    title = "List"
    description = "Demonstrates the List control"

    def build(self, tk, parent):
        parent.setLayout(GridLayout(rows=4, columns=2))
        parent.append(StaticText(tk, "Single"))
        parent.append(StaticText(tk, "Multiple"))
        grid = Grid(tk,
        [
            Column('Inv. No', width=55, sorttype=Column.INT),
            Column('Date', width=90, sorttype=Column.DATE),
            Column('Amount', width=80, sorttype=Column.FLOAT),
            Column('Tax', width=80, align=Column.RIGHT, sorttype=Column.FLOAT),
            Column('Total', width=80, align=Column.RIGHT, sorttype=Column.FLOAT),
            Column('Notes', width=150, sortable=False)
        ])

        parent.append(grid,colspan=2)

        for i in range(0, 20):
            grid.add((str(1000+i), "2010-01-%d" % (i+1), str(i), str(i*10.19), str(i*1.19), "Hello world, %d" % i))




Wednesday, July 27, 2011

Python Bindings for Sally (a machine learning tool)

Download:   sally-0.6.1-with-bindings.tar.gz

One of the tools I have used recently in my machine-learning projects is Sally. As Sally’s web page describes it: “There are many applications for Sally, for example, in the areas of natural language processing, bioinformatics, information retrieval and computer security”. You can look at the example page to see more details. It is written in C, which makes it fast, but as is usually the case, using a tool like this directly from Python would make life easier. It would make for faster prototyping and system development, and since it is a tool that I think I will be using repeatedly in the future, I gave the library Python bindings. In this post I would like to outline the technique I use to create a Python module from a C library. I use Swig for the bindings, and you will have to be, to some extent, familiar with Swig to follow the rest of this post.

As input, SWIG takes a file containing ANSI C/C++ declarations, a special "interface file" (usually given an .i suffix). At its simplest, an interface file looks something like this (see below), where a module called "example" will be created with all C/C++ functions and variables in example.h available from Python.
%module example

%{
#include "example.h"
%}

%include "example.h"
 
Unfortunately, interface files are not usually that simple. There are limitations to what Swig will parse correctly. For example, complex declarations such as function pointers and arrays are problematic.

In the case of Sally, libconfig is used for its configuration management and one would need to include libconfig in the interface file. Take a look at the interface file below. Libconfig's config_lookup_string function is problematic: Swig cannot deal with the char** without extra work from us. I created a function called config_lookup_string_2 that wraps config_lookup_string and, with the help of the cstring.i library, this becomes usable from Python. Unfortunately, this is quite typical. It often becomes a time-consuming process to check every function and structure you want to provide bindings for, and to look for ways of making problematic functions and structures work correctly from Python.
%module pysally

%{
#include <libconfig.h>
%}

%include <cstring.i>
%cstring_output_allocate(char **out1, free(*$1));

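%{
#include <stdlib.h>
#include <string.h>

void config_lookup_string_2(
    const config_t *config, const char *path, char **out1)
{
    const char *value = NULL;
    *out1 = (char *) malloc(1024);
    (*out1)[0] = 0;
    /* libconfig returns a pointer to a string it owns, so copy it into
       the buffer that SWIG frees after the call. */
    if (config_lookup_string(config, path, &value) && value) {
        strncpy(*out1, value, 1023);
        (*out1)[1023] = 0;
    }
}

%}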
%include <libconfig.h>

The above interface file can quickly grow into a bit of a nightmare, in terms of development time and complexity, as you add functions that Swig can’t deal with transparently. I take an alternative route: I create a facade over the API I want to use from Python. The facade consists of one or more C++ classes, and it is the facade for which I provide bindings. The facade can be as complex as Swig's parser allows without my having to add complex Swig directives to the interface file.

Getting back to Sally. Essentially the library does three things:
1) Read a config file.
2) Read text from a file or files and process them.
3) Write features to an output file.

As far as Sally's configuration processing is concerned, one could provide some getter and setter member functions. It’s not strictly necessary to access the configuration from Python, because the facade takes care of the configuration details in the load_config and init methods. All I have to do from Python is pass the name of the configuration file Sally is to use. Below is my initial attempt at creating a facade and its interface file. You can pass the input and output paths (in the constructor) and the configuration path (in load_config). As you can see, I make use of std::string because Swig deals with it semi-transparently once %include "std_string.i" is added to the interface file.

swig.i
%module pysally

%{
#include "pysally.h"
%}

%include "std_string.i"
%include "pysally.h"

pysally.h
class Sally
{
public:

    Sally(int verbose, std::string in, std::string out) :
        verbose_(verbose), entries_(0), input_(in), output_(out) {}

    ~Sally();

    /// Load the configuration of Sally
    void load_config(const std::string& config_file);

    /// Init the Sally tool
    void init();

    /// Main processing routine of Sally.
    /// This function processes chunks of strings.
    void process();

    /// Get/Set configuration
    std::string getConfigAttribute(std::string name);

    void setConfigAttribute(std::string name, std::string value);

    // etc
    // ...
    // ...

private:
    config_t cfg_;
    int verbose_;
    long entries_;
    std::string input_;
    std::string output_;
};

From Python you would use it like this:
from pysally import Sally

verbose = 0
input_path = "/tmp/input"    # "in" is a reserved word in Python
output_path = "/tmp/output"
config = "/tmp/sally.cfg"

s = Sally(verbose, input_path, output_path)
s.load_config(config)
s.init()
s.process()

As a result, I can now use Sally from Python, which is nice, but it doesn’t really provide anything that I can’t already do with the C executable Sally provides. The Sally library allows you to configure its output format, such as plain text, LibSVM or Matlab. Even though it’s not too difficult to add C code for other formats, it is even easier to do from Python. I provide two additional C++ classes, Reader and Writer (see the code below). The reader and writer facades use the underlying Sally library to read and write files using the format specified in the configuration file, just as the original Sally binary does. But by extending these classes in Python, one can override the default behaviour: read and write in other formats, read and write to a database instead, or even write Sally's output directly to another machine-learning module or read its input directly from a web-scraping Python module instead of a file.

Below you can see the final interface file, the C++ Reader/Writer classes that provide the default implementation, and the Python Reader/Writer extension classes. The interface file is still very simple. The only new additions are the directors option on the module and the director feature for Reader and Writer. Directors allow C++ classes to be extended in Python, and from C++ these extensions look exactly like native C++ classes. Neither the C++ code nor the Python code needs to know where a particular method is implemented.

swig.i
%module(directors="1") pysally
%{
#include "pysally.h"
%}

%feature("director") Reader;        
%feature("director") Writer;       

%include "std_string.i"
%include "pysally.h"

pysally.h
class Writer
{
public:

    Writer(std::string out);

    virtual ~Writer();
   
    virtual void init();   
   
    virtual const std::string getName();

    virtual int write(const output_list& output, int len);
   
private:   
    config_t& cfg_;   
    std::string output_;
    bool hasout_;
};

class Reader
{
public:

    Reader(std::string in);

    virtual ~Reader();
   
    virtual void init();   
   
    virtual const std::string getName();
   
    virtual long getNrEntries();

    virtual int read(string_list& strs, int len);

private:   
    config_t& cfg_;   
    std::string input_; 
    long entries_;   
};

run.py
from pysally import Reader, Writer, Sally

class MyReader(Reader):
    def __init__(self, input):
        super(MyReader, self).__init__(input)

    def read(self, strings, len):       
        return super(MyReader, self).read(strings, len)

    def init(self):
        super(MyReader, self).init()

    def getNrEntries(self):
        return super(MyReader, self).getNrEntries()


class MyWriter(Writer):
    def __init__(self, output):
        super(MyWriter, self).__init__(output)

    def init(self):        
        pass   

    def write(self, fvec, len):
        for j in range(len):
            print("l:", fvec.getFeaturesLabel(j), end=" ")
            for i in range(fvec.getListLength(j)):
                print(fvec.getDimension(j, i), fvec.getValue(j, i), end=" ")
                print(fvec.getValue(j, i))
            print(fvec.getFeaturesSource(j))
            print()
        return 1

input = "/home/edimchr/reuters.zip"
output = "/home/edimchr/tmp/pyreuters.libsvm"
verbose = 0
r = MyReader(input)
w = MyWriter(output)
#r = Reader(input)
#w = Writer(output)

s = Sally(verbose, r, w)
s.load_config("./example.cfg")
s.init()
s.process()

From Python, then, you can extend the Reader and/or Writer classes defined in C++. MyReader and MyWriter are passed to the Sally facade via its constructor, and from then on the underlying C++ code uses the derived Python implementations. MyReader simply defers to its base class, Reader, and MyWriter prints the output information Sally generated.
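
To make the control flow concrete without building the extension, here is a pure-Python stand-in for the same arrangement. None of these classes are the real pysally bindings: Reader, Writer and Pipeline are simplified, hypothetical substitutes (the feature extraction is reduced to a word count), and only the override pattern mirrors what the directors make possible.

```python
import csv
import io

class Reader(object):
    """Stand-in for the C++ Reader: returns (label, text) pairs."""
    def __init__(self, entries):
        self.entries = list(entries)

    def read(self):
        return self.entries

class Writer(object):
    """Stand-in for the C++ Writer: default output is plain text."""
    def __init__(self, out):
        self.out = out

    def write(self, label, features):
        self.out.write("%s %s\n" % (label, sorted(features.items())))

class CsvWriter(Writer):
    """Override write() to emit CSV instead of the default format."""
    def write(self, label, features):
        row = csv.writer(self.out, lineterminator="\n")
        for word, count in sorted(features.items()):
            row.writerow([label, word, count])

class Pipeline(object):
    """Stand-in for the Sally facade: it only calls read() and write()."""
    def __init__(self, reader, writer):
        self.reader, self.writer = reader, writer

    def process(self):
        for label, text in self.reader.read():
            counts = {}
            for word in text.split():
                counts[word] = counts.get(word, 0) + 1
            self.writer.write(label, counts)

buf = io.StringIO()
Pipeline(Reader([(1, "spam spam eggs")]), CsvWriter(buf)).process()
print(buf.getvalue())
```

The facade never checks which writer it was given, so swapping CsvWriter for a database or network writer needs no change to the processing code.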

You may have noticed that the Reader class defines the member function:

    virtual int read(string_list& strs, int len);

And Writer defines the member function:

    virtual int write(const output_list& output, int len);

What are string_list and output_list? Sally defines a couple of structures that it uses to store the text read (string_t) and the output features calculated (fvec_t). These two structures are especially problematic for Swig. As a result, I created a facade over each one, called string_list and output_list respectively.
class string_list
{   
private:
    string_t* str_;
   
public:   
    string_list(string_t* str) :
        str_(str) {}
  
    /// Length for element i
    void setStringLength(int i, int len) 
      { str_[i].len = len ; }

    /// String data for element i
    void setStringData(int i, char* data) 
      { str_[i].str = strdup(data); } 
   
    /// Optional label of string
    void setLabel(int i, float label) 
      { str_[i].label = label; }
       
    /// Optional description of source
    void setSource(int i, char* src) 
      { str_[i].src = strdup(src); } 
   
    string_t* getString() const { return str_; }
};

class output_list
{   
private:
    fvec_t** vec_;
   
public:   
    output_list(fvec_t** vec) :
        vec_(vec) {}
  
    /// Length for element i
    unsigned long getListLength(int i) const 
      { return vec_[i]->len; }

    /// Nr of features for element i
    unsigned long getTotalFeatures(int i) const 
      { return vec_[i]->total; }
   
    /// Label for element i
    float getFeaturesLabel(int i) const 
      { return vec_[i]->label; }
   
    /// List of dimensions j
    unsigned long getDimension(int i, int j) 
      { return vec_[i]->dim[j]; }
   
    /// List of values for element i
    float getValue(int i, int j) 
      { return vec_[i]->val[j]; }   
   
    char* getFeaturesSource(int i) const 
      { return vec_[i]->src; } 
   
    fvec_t** getFvec() const { return vec_; }
};
With a simple C++ facade over the API, Swig can parse the interface file without difficulty. In general, the facade can freely use std::string, std::vector, and std::map, which Swig supports through its library interface files.


Building the Python module

Sally is a C library and is built with Autotools.

1) You need additional Autoconf macros to enable SWIG and Python support. I added ac_pkg_swig.m4, ax_pkg_swig.m4 and ax_python_devel.m4 to the m4 subdirectory.
 
    sally-0.6.1/
        m4/
            ac_pkg_swig.m4
            ax_pkg_swig.m4
            ax_python_devel.m4
        pysally/
            Makefile.am
            swig.i
        src/
            Makefile.am
        Makefile.am
        configure.in
 
2) Add pysally to sally-0.6.1/Makefile.am
    ……
    SUBDIRS = src doc tests contrib pysally
    ……
    ……
 
3) Add the following to sally-0.6.1/configure.in
    AC_PROG_CXX
    AC_DISABLE_STATIC
    AC_PROG_LIBTOOL
    AX_PYTHON_DEVEL(>= '2.3')
    AM_PATH_PYTHON
    AC_PROG_SWIG(1.3.21)
    SWIG_ENABLE_CXX
    SWIG_PYTHON
 
4) Add pysally/Makefile to AC_CONFIG_FILES in sally-0.6.1/configure.in

    AC_CONFIG_FILES([
       Makefile \
       src/Makefile \
       src/input/Makefile \
       src/output/Makefile \
       src/fvec/Makefile \
       doc/Makefile \
       tests/Makefile \
       contrib/Makefile \
       pysally/Makefile \
    ])

5) Create the pysally subdirectory and add the files
    Makefile.am
    swig.i       <-- interface file
    globals.h 
    pysally.h    <-- wrapper facades
    pysally.cpp
    run.py       <-- example code to use the module

6) To build:
    cd sally-0.6.1
    ./autogen.sh
    ./configure --prefix=/home/yourhome/sally_install/ --enable-libarchive
    make
    make install

Tuesday, June 28, 2011

Restructured text table reports

Download: rst2pdf-read-only.tar.gz

I use rst2pdf to create PDF documents from reStructuredText (ReST). It is relatively easy to generate table reports from a database, like the ones shown below. In this example, the table does not fit into one page and spans two.
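
As a rough illustration of the database-to-table step, the sketch below queries an in-memory SQLite database and emits a list-table. The table name, the columns and the helper function are invented for the example; only the list-table syntax matches what rst2pdf consumes.

```python
import sqlite3

def rows_to_list_table(title, headers, rows, widths):
    """Render a header row plus data rows as a reStructuredText list-table."""
    lines = [
        ".. list-table:: %s" % title,
        "   :widths: %s" % " ".join(str(w) for w in widths),
        "   :header-rows: 1",
        "",
    ]
    for row in [list(headers)] + [list(r) for r in rows]:
        for i, cell in enumerate(row):
            # The first cell opens a new table row, the rest continue it.
            prefix = "   * - " if i == 0 else "     - "
            lines.append("%s%s" % (prefix, cell))
    return "\n".join(lines) + "\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (description TEXT, weight INTEGER, height INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                 [("A", 1, 2), ("B", 1, 2), ("C", 1, 2)])
rows = conn.execute(
    "SELECT description, weight, height FROM items ORDER BY description").fetchall()
conn.close()

rst = rows_to_list_table("Title", ["Description", "Weight", "Height"], rows, [15, 10, 30])
print(rst)
```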




Although it is easy to execute statistical queries in SQL, such as summing a column of a database table, it is not possible to add summary rows to each page when a table spans multiple pages. The shortcoming with ReST and rst2pdf is that you do not know how many rows will fit on a page, because that depends on the page size (A4 etc.), the font size and the table content. The modified version of rst2pdf I provide in this post lets you attach additional rows (with column sums) to the bottom of the table automatically, on each page. The example shows how to insert the column sum for the current page, for the previous pages and for all pages into the summary rows. The ":sum:" role is a nonstandard interpreted text role, which means it will only work with this version of rst2pdf.

The syntax to generate the table is this:

.. list-table:: Title
   :widths: 15 10 30
   :header-rows: 1

   * - Description
     - Weight
     - Height
   * - A
     - 1
     - 2 
   * - B
     - 1
     - 2
   * - C
     - 1
     - 2
   * - D
     - 1
     - 2
   * - E
     - 1
     - 2
   * - F
     - 1
     - 2 
   * - G
     - 1
     - 2
   * - H
     - 1
     - 2
   * - I
     - 1
     - 2 
   * - J
     - 1
     - 2
   * - K
     - 1
     - 2
   * - L
     - 1
     - 2
   * - M
     - 1
     - 2
   * - N
     - 1
     - 2
   * - O
     - 1
     - 2
   * - P
     - 1
     - 2
   * - Q
     - 1
     - 2
   * - R
     - 1
     - 2
   * - Previous Total 
     - :sum:`previous_pages_col`
     - :sum:`previous_pages_col`
   * - Current Page Total 
     - :sum:`current_pages_col`
     - :sum:`current_pages_col`
   * - Total 
     - :sum:`total_pages_col`
     - :sum:`total_pages_col`


The command to create the table:

python createpdf.py --repeat-table-rows table.txt -o table.pdf
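
The three quantities behind the :sum: role are simple running totals once the page breaks are known. The sketch below is not taken from the modified rst2pdf. It assumes a fixed, hypothetical number of rows per page purely to show how the previous-pages, current-page and overall totals relate.

```python
def page_sums(values, rows_per_page):
    """Yield (previous_pages, current_page, total) for each page."""
    total = sum(values)
    previous = 0
    for start in range(0, len(values), rows_per_page):
        current = sum(values[start:start + rows_per_page])
        yield previous, current, total
        previous += current

weights = [1] * 18   # the Weight column of rows A to R above
summary = list(page_sums(weights, 10))
print(summary)
```

In rst2pdf itself the number of rows per page is only known at layout time, which is why the sums have to be filled in by the renderer rather than computed in the source document.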