Guppy-PE

A Python Programming Environment


Guppy A fish swimming in Python
Heapy Heap Analysis Toolset
GSL Guppy Specification Language
Documentation
Download
Credits
Contact

This is the home page for Guppy-PE, a programming environment providing object and heap memory sizing, profiling and analysis. It includes a prototypical specification language that can be used to formally specify aspects of Python programs and generate tests and documentation from a common source.

Guppy

Guppy is an umbrella package combining Heapy and GSL with support utilities such as the Glue module that keeps things together.

The name guppy was chosen because I found it in a backward dictionary as a word ending with py, thought it was cute enough, and expected it to be unlikely to conflict with some other package name. It was meant to be a general name, since all kinds of packages should fit under this top-level name.

The name guppy-pe came about because there was already another project named guppy on SourceForge when I was about to register guppy. The other guppy was not in Python, so I added -pe, which stands for Programming Environment. The Python package is just guppy.

Heapy

The aim of Heapy is to support debugging and optimization of memory-related issues in Python programs.

Such issues can make a program use too much memory, which slows down the program itself and possibly an entire server, or they may prevent it from running at all on a memory-limited device such as a mobile phone.

The primary motivation for Heapy is that there has been a lack of support for the programmer to get information about the memory usage in Python programs. Heapy is an attempt to improve this situation. A project with a similar intent is PySizer.

The problem situation has a number of aspects, which I think can be characterised, for example, as follows.

As Heapy has evolved, with considerations like these in mind, it currently provides the following features.

Data gathering

  • Finds reachable and/or unreachable objects in the object heap, and collects them into special C-implemented 'nodesets'. Can get data about the objects such as their sizes and how they refer to each other.
  • Uses a C library that can get data about non-standard types from extension modules, given a function table.
  • Optionally uses multiple Python interpreters in the same process, so one can monitor the other transparently.
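
As a rough illustration of the data-gathering idea, the standard library alone can walk the objects reachable from a starting object and add up their sizes. This is only a sketch and not Heapy's implementation, which collects objects into C-implemented nodesets. The function name `reachable_from` is made up for this example; `gc.get_referents` and `sys.getsizeof` are standard CPython calls.

```python
import gc
import sys

def reachable_from(root):
    """Return {id: object} for everything reachable from root, root included."""
    seen = {id(root): root}
    stack = [root]
    while stack:
        obj = stack.pop()
        # gc.get_referents lists the objects that obj refers to directly.
        for ref in gc.get_referents(obj):
            if id(ref) not in seen:
                seen[id(ref)] = ref
                stack.append(ref)
    return seen

inner = ["leaf"]
root = {"key": inner}
found = reachable_from(root)

# sys.getsizeof gives the shallow size of each object, so summing over the
# reachable set approximates the memory held through root.
total = sum(sys.getsizeof(o) for o in found.values())
print(len(found), "objects,", total, "bytes (shallow sizes summed)")
```

The exact counts and sizes depend on the Python version and platform, so no output is shown here.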

Data processing

  • Algebraic set operations, for example the set difference can be used to extract the objects allocated after a reference point in time.
  • Various classifications of object sets, and different classifiers can be combined.
  • Shortest paths to a set of objects from other objects, which can be used to find out why the objects are retained in memory.
  • Calculation of the 'dominated' set from a set of root objects which yields the set of objects that would be deallocated if the root objects were deallocated.
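
The 'dominated' set can be illustrated on a small explicit graph. The sketch below assumes the heap is given as a plain adjacency dict, and the names `reachable` and `dominated` are invented for the example; it is not how Heapy computes the result internally. An object counts as dominated when it can no longer be reached from the heap root once the chosen objects are removed, which is expressed here as a set difference between two reachability sets.

```python
def reachable(graph, start, blocked=frozenset()):
    """Nodes reachable from start without entering any node in blocked."""
    if start in blocked:
        return set()
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def dominated(graph, heap_root, objs):
    """Nodes that become unreachable from heap_root once objs are removed."""
    objs = frozenset(objs)
    return reachable(graph, heap_root) - reachable(graph, heap_root, objs)

# Hypothetical reference graph; "root" stands for the interpreter root.
graph = {
    "root": ["a", "b"],
    "a": ["c", "shared"],
    "b": ["shared"],
    "c": ["d"],
}

print(sorted(dominated(graph, "root", {"a"})))  # ['a', 'c', 'd']
```

Note that "shared" is not in the result: it is still held by "b", so removing "a" alone would not free it.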

Presentation

  • Tables where each row represents a classification of data.
  • Lists of shortest paths where the edges show the relationships found between the underlying C objects.
  • Reference pattern, presenting a spanning tree of the graph with sets of objects treated as a unit.
  • Limits the number of rows when presentation objects are shown, without depending on an external pager.
  • An interactive graphical browser program can show a time sequence of classified heap data sets as a graph together with a table detailing the data at a specific time or the difference between two points in time.
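
The table presentation can be approximated in a few lines of plain Python. The sketch below groups objects by a classifier, sorts the groups by total size, keeps a running cumulative size, and limits the number of printed rows. The names `partition_rows` and `show` are hypothetical and the layout only loosely follows Heapy's tables; classifying by type name and sizing with `sys.getsizeof` are assumptions made for the example.

```python
import sys
from collections import defaultdict

def partition_rows(objs, classify=lambda o: type(o).__name__, sizeof=sys.getsizeof):
    """Group objs by kind; return (kind, count, size, cumulative) rows, largest first."""
    count = defaultdict(int)
    size = defaultdict(int)
    for o in objs:
        kind = classify(o)
        count[kind] += 1
        size[kind] += sizeof(o)
    rows = []
    cumulative = 0
    # Largest total size first; ties are broken by kind name for stable output.
    for kind in sorted(size, key=lambda k: (-size[k], k)):
        cumulative += size[kind]
        rows.append((kind, count[kind], size[kind], cumulative))
    return rows

def show(rows, limit=10):
    """Print at most limit rows, then a note about how many rows remain."""
    print(f"{'Index':>5} {'Count':>6} {'Size':>8} {'Cumulative':>10}  Kind")
    for i, (kind, n, sz, cum) in enumerate(rows[:limit]):
        print(f"{i:>5} {n:>6} {sz:>8} {cum:>10}  {kind}")
    if len(rows) > limit:
        print(f"<{len(rows) - limit} more rows.>")

rows = partition_rows([1, [], {}, "a", "b"])
show(rows, limit=2)
```

Passing a different `classify` function gives a different partition of the same set, which is the idea behind combining classifiers.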

Portability aspects

  • Can be used with an unmodified CPython, back to version 2.3 as far as I know. Does not depend on any external Unix-specific or other utilities.
  • Requires Tk if the graphical browser is to be used.
  • Cannot be used with Jython or other Python implementations that are not CPython.

System aspects

  • A general 'glue' model provides a session context that imports modules and creates objects automatically when accessed. The glue model is not Heapy specific but is used throughout Guppy and could be used by other packages as well.
  • The glue model makes it practical to have everything in Guppy allocated dynamically in a session context, so there is no need for any global module-level variables. The modules themselves are stored as usual in sys.modules but they are not modified.
  • To be honest, there is one exception that comes to mind, but it is really exceptional.
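
The idea of a session context that imports modules and creates objects on first access can be sketched as follows. This is not the API of Guppy's Glue module: the class `Session` and its `module_names` and `factories` parameters are invented for illustration, and the sketch only assumes standard `__getattr__` and `importlib` behaviour.

```python
import importlib

class Session:
    """Hypothetical session context that resolves names lazily on attribute access."""

    def __init__(self, module_names, factories):
        self._module_names = set(module_names)  # names resolved by importing
        self._factories = dict(factories)       # names resolved by calling a factory

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails,
        # that is, the first time a name is accessed.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._module_names:
            value = importlib.import_module(name)
        elif name in self._factories:
            value = self._factories[name](self)
        else:
            raise AttributeError(name)
        # Cache the result on the session itself, not in a module-level global.
        setattr(self, name, value)
        return value

session = Session(
    module_names=["json"],
    factories={"encoder": lambda s: s.json.JSONEncoder(sort_keys=True)},
)

print(session.encoder.encode({"b": 1, "a": 2}))  # {"a": 2, "b": 1}
```

Accessing `session.encoder` triggers the import of `json` through the same mechanism, so nothing is imported or created until it is needed, and two sessions do not share state.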

Heapy has been used during development of itself and of the other parts of Guppy. It has been used to tell how much memory the parts of compound objects use, to see what could be worthwhile to optimize. It was used to find a memory leak in the Heapy profile browser, and to find out the cause, which as far as I can tell was due to a bug in a library routine which I have reported.

Example

The following example shows

  1. How to create the session context: h=hpy()
  2. How to show the reachable objects in the heap: h.heap()
  3. How to create and show a set of objects: h.iso(1,[],{})
  4. How to show the shortest paths from the root to x: h.iso(x).sp
>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25773  53  1612820  49   1612820  49 str
     1  11699  24   483960  15   2096780  64 tuple
     2    174   0   241584   7   2338364  72 dict of module
     3   3478   7   222592   7   2560956  78 types.CodeType
     4   3296   7   184576   6   2745532  84 function
     5    401   1   175112   5   2920644  89 dict of class
     6    108   0    81888   3   3002532  92 dict (no owner)
     7    114   0    79632   2   3082164  94 dict of type
     8    117   0    51336   2   3133500  96 type
     9    667   1    24012   1   3157512  97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  33      136  77       136  77 dict (no owner)
     1      1  33       28  16       164  93 list
     2      1  33       12   7       176 100 int
>>> x=[]
>>> h.iso(x).sp
 0: h.Root.i0_modules['__main__'].__dict__['x']
>>> 
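
The shortest path shown by h.iso(x).sp is computed by Heapy itself. To illustrate the underlying idea only, the sketch below runs a breadth-first search over a small object graph and builds a path expression from the edge labels. The helpers `edges` and `shortest_path` are made up for this example, they understand only dicts, lists, tuples and objects with a `__dict__`, and they do not start from Heapy's Root object.

```python
from collections import deque

def edges(obj):
    """Yield (label, referent) pairs for the containers this sketch understands."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield f"[{k!r}]", v
    elif isinstance(obj, (list, tuple)):
        for i, v in enumerate(obj):
            yield f"[{i}]", v
    elif hasattr(obj, "__dict__"):
        for k, v in vars(obj).items():
            yield f".{k}", v

def shortest_path(root, target, root_name="root"):
    """Breadth-first search from root; return a path expression or None."""
    queue = deque([(root, root_name)])
    seen = {id(root)}
    while queue:
        obj, path = queue.popleft()
        if obj is target:
            return path
        for label, ref in edges(obj):
            if id(ref) not in seen:
                seen.add(id(ref))
                queue.append((ref, path + label))
    return None

x = []
namespace = {"config": {"cache": [1, x]}, "other": 3}
print(shortest_path(namespace, x))  # root['config']['cache'][1]
```

Because the search is breadth-first, the first path that reaches the target is one with the fewest edges, which is what makes it useful for explaining why an object is still retained.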

GSL

The Guppy Specification Language is an evolving specification language. I started experimenting with this language because I felt the need to have a way to specify documentation and tests from the same source. GSL can describe aspects of a system, especially its API, in a way that can be automatically converted to tests as well as to documents. The documents generated have a formal structure for describing the formal aspects of the specification, complemented with descriptive text from the same source documents. A language that is similar in intent is the Assertion Definition Language.

Specifications written in GSL can be used for:

GSL has been used to generate the documentation for this Guppy distribution. Some part of the specification has been checked against the implementation using the generated tests, which did reveal some discrepancies that were subsequently corrected.
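
The general idea of deriving both documents and tests from one source can be shown with a toy example. The sketch below is not GSL and does not use GSL syntax: the `spec` dictionary, the `clamp` function and the helpers `render_doc` and `run_tests` are all invented here, purely to show how a single description can feed a document generator and a test runner.

```python
# One common source: a description of the API plus example calls.
spec = {
    "name": "clamp",
    "summary": "Limit a value to the closed interval [lo, hi].",
    "args": [("x", "number to limit"), ("lo", "lower bound"), ("hi", "upper bound")],
    "examples": [((5, 0, 10), 5), ((-3, 0, 10), 0), ((42, 0, 10), 10)],
}

def clamp(x, lo, hi):
    """The implementation being checked against the specification."""
    return max(lo, min(x, hi))

def render_doc(spec):
    """Generate a plain-text document from the specification."""
    signature = f"{spec['name']}({', '.join(arg for arg, _ in spec['args'])})"
    lines = [signature, "", spec["summary"], ""]
    lines += [f"  {arg}: {desc}" for arg, desc in spec["args"]]
    return "\n".join(lines)

def run_tests(spec, func):
    """Run the examples against func; return (args, expected, actual) for failures."""
    failures = []
    for args, expected in spec["examples"]:
        actual = func(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures

print(render_doc(spec))
print("failures:", run_tests(spec, clamp))
```

If the implementation and the specification disagree, `run_tests` reports the discrepancy, which is the same kind of check described above for the Guppy distribution.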

The documents generated by GSL use a formal syntax to describe parameter modes. This document contains examples of such parameter descriptions and explains what they mean.

Documentation

Some documentation is included with the source code distribution and can also be browsed here via the following links.

Document example Explains the meaning of some aspects of the documents.
Guppy Specification of guppy, the top level module.
Profile Browser How to use the graphical heap profile browser.
Screenshot Example showing the graphical heap profile browser in action.
GSL The Guppy Specification Language.
heapyc Specification of the heapyc extension module. Note that this is an internal interface and may be subject to change.
sets Specification of the interface to the setsc extension module which contains bitsets and nodesets.

The following documentation is not included with the source code.

heapy-thesis.pdf The master's thesis, "Heapy: A Memory Profiler and Debugger for Python", which presents background, design, implementation, rationale and some use cases for Heapy (version 0.1).
Metadata and Abstract Published at Linköping University Electronic Press.
heapy-presentation.pdf Slides from the presentation.

Download

The latest version is in the svn trunk directory. As of this writing, I have tested it successfully in 64-bit as well as 32-bit mode with Ubuntu 7.10 on an AMD64, with Python 2.3, 2.4, 2.5 and 2.6. No major changes other than compatibility fixes have been made since the Guppy 0.1.6 release listed below.

In the near future, I plan to add some interactive help and some more examples to the documentation, perhaps a tutorial. Look out for this in the svn HEAD if you want the latest. I may make a new release (0.1.7) in perhaps a month or so.

To check out the latest (HEAD) revision, you can do:

svn co https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppy guppy

To check out the revision tested as of this writing (2008-04-07), you can do:

svn co -r18 https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppy guppy

Older source code in tar gzip format.

guppy-0.1.6.tar.gz Updated 2006-10-16. Doesn't work in 64-bit mode. -- Quick bug fix version, correcting the name of the Root object in the hpy instance. This is the kind of bug the automatic test generation should have caught, since it was specified with another name in the documentation, but I still need some time to get that to work... so I am just uploading this quick fix now.
guppy-0.1.5.tar.gz Updated 2006-10-12. Fixed bugs wrt remote monitoring and HTML rendering. New features include the shorthand sp for shpaths and representing the source of the shortest paths in terms of a Root object in the hpy instance. See changelog.
guppy-0.1.4.tar.gz Updated 2006-10-11. Most changes are to make it work with Python 2.5; other changes include improved error reporting in Glue.py and some test improvements.
guppy-0.1.3.tar.gz Updated 2006-03-02. Updates to Monitor so multiple lines work. It also got a command to interrupt the remote process. Cleanups and bugfixes, especially to do with Python 2.4 (used to crash with array objects). A bunch of other fixes, see changelog.
guppy-0.1.2.tar.gz Pointer comparison bugs and test portability problems were fixed. See the included changelog.
guppy-0.1.1.tar.gz The C source code for the extension modules was changed to be ANSI compatible and I also changed some help text that had become outdated.
guppy-0.1.tar.gz Original version. Extension modules could not be compiled using strict ANSI C compilers.

Credits

Contact

The author, Sverker Nilsson, may be contacted at:
svenil@users.sourceforge.net
I have registered a mailing list for discussions, questions, announcements etc. The list information, subscription form and archives are available at:
http://lists.sourceforge.net/mailman/listinfo/guppy-pe-list
Please let me know of problems, either by mailing me directly, or via the mailing list mentioned above or the SourceForge bug tracking system:
http://sourceforge.net/tracker/?group_id=105577&atid=641821
The Sourceforge project summary page is:
http://sourceforge.net/projects/guppy-pe

Generated by GSL-HTML 0.1.5 on Mon Apr 7 19:21:02 2008