One of the unexpected benefits of writing this blog is the chance to get to know incredible and generous people who share an interest in my writing and analysis. Every now and then, I receive email from generous readers who share knowledge or proprietary data that I would not otherwise get. One of these generous gentlemen shared his passion for Python and its possible application to automating trading decisions.
As you might already be aware, the market analysis in this blog is purely data driven and uses logical rules to decide whether to go Long, Short, or stay Neutral. I personally use R, an open source programming language, to process the data and generate the market timing output. R is a high-level programming language, so it is relatively easy to learn. Its strength lies in the time series and statistical packages developed by many experts in the scientific community. In addition, like Matlab, it handles matrices very well. Because of its high-level nature, R code reads almost like a plain description of the computation. For research purposes, this is a good property because we can test ideas quickly. It is also a full-scale programming language, so you can implement nearly anything in it. Once an idea is found to be solid, we can then move on to implement it on a platform that is built for speed.
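The kind of rule-based decision described above can be sketched in a few lines. The example below is written in Python, the language the rest of this post discusses, rather than R. The moving-average rule, the window lengths, the neutral band, and the function names are all illustrative assumptions; they are not the rules this blog actually uses.

```python
from statistics import mean


def moving_average(prices, window):
    """Return the mean of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough prices for the requested window")
    return mean(prices[-window:])


def timing_signal(prices, fast=3, slow=5, band=0.01):
    """Classify the market as 'Long', 'Short', or 'Neutral'.

    Hypothetical rule: compare a fast moving average with a slow one and
    stay Neutral when the two are within `band` (a fraction) of each other.
    """
    fast_ma = moving_average(prices, fast)
    slow_ma = moving_average(prices, slow)
    spread = (fast_ma - slow_ma) / slow_ma
    if spread > band:
        return "Long"
    if spread < -band:
        return "Short"
    return "Neutral"


# Made-up price series, used only to exercise the three branches.
rising = [100, 101, 103, 106, 110, 115]
falling = [115, 110, 106, 103, 101, 100]
flat = [100, 100, 100, 100, 100, 100]

print(timing_signal(rising), timing_signal(falling), timing_signal(flat))
```

The point of the sketch is only that the decision logic stays readable: each rule is an explicit comparison, so an idea can be tried and discarded quickly before it is ported to a faster platform.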
One of my readers suggested that I look into Python. His emails indeed broadened my perspective on Python. Because of the wealth of information they contain, I asked him whether it would be okay to share them on the blog, and he was happy to do so. Below is his email. It is good for beginners who want a broad overview of Python and how to get started.
I was happy to see that you claim to be an R (rather than S) programmer. Do you use other open source tools? Python? PyLab? Enthought?
You may suspect that I’m a Python advocate. I’ve been using Python since the mid ’90s for engineering applications … that I’d previously implemented in FORTRAN and C … After using Python for 17+/- years, I still can’t say enough good about it, especially as a glue language.
If you’ll indulge me, I’ll point out something I’m aware of, but, will never have the time to pursue. There was a nascent effort to provide for a financial module within Pylab (http://matplotlib.sourceforge.net/examples/pylab_examples/finance_work2.html). Pylab is a Python replacement and extension of Matlab based on the Matplotlib library. The main author of these packages, a neuroscientist, was hired away by the financial industry. Since then, the financial module development has languished (he gives a good overview of his work in the first five or ten minutes of this speech http://videolectures.net/mloss08_hunter_mat/ — he mentions R a little further along, and a financial example from about 22 to 27 minutes). IEEE’s ‘Computing in Science & Engineering’ devoted an entire issue to Python a few years ago; it’s a good intro for an engineer. Here’s an interesting book that touches on Matplotlib use (http://www.amazon.com/Beginning-Python-Visualization-Transformation-Professionals/dp/1430218436/ref=pd_rhf_dp_cpp_tab0_p_t_4) (it’s likely you can find a pdf of this floating around the web to preview if you want). You’re probably aware of Ta-Lib (http://ta-lib.org/hdr_doc.html), which may be useful.
I would very much like to see an open source financial charting and analysis tool that also gives access to the unrestricted power of a real programming language (a standard, general purpose, cross platform language — Python). Please pardon me if these comments are not of immediate interest to you, but, I’m trying to further this cause wherever it seems possible there may be an interest.
Regardless of your enthusiasm for Python, I’m happy to have found you. Your clarity of thought shines from your pages. Thanks for your efforts.
I know you’re interested in statistics, maybe time series; Python has support for both, but I don’t have experience with it. It’s pretty easy, though, to find people using Python to call R in order to get the power of Python and still have R’s statistics. Here’s a fellow doing it (http://www2.warwick.ac.uk/fac/sci/moac/students/peter_cock/python/lin_reg/) using RPy (http://rpy.sourceforge.net/). There is also something called R/SPlus-Python (http://www.omegahat.org/RSPython/). This gives equivalent statements in R, Matlab, Pylab, some others (http://mathesaurus.sourceforge.net/math-synonyms.pdf).
I’ve been culling info about Python for a long time and have a big garbage dump of urls to pull from, some of which I’m going to throw at you. My hope is that some of it will appeal to you. It could overwhelm. Take what helps, forget the rest.
Glad you appreciate John Hunter. He mentioned Fernando Perez (http://fperez.org/talks/index.html). Pylab is built on Perez’s shell, IPython (http://ipython.org/). It’s very powerful in its own right. Here’s part of the ieee-cise issue I mentioned before (http://users.cse.ucdavis.edu/~cmg/Group/readings/pythonissue_1of4.pdf); all four parts are available. Hunter is in part 4 of 4, Perez 1 of 4. The Dubois introduction and Oliphant’s article are worthwhile. Another issue of ieee-cise, this year, was devoted to Python (vol. 13, no. 2). Hunter and Perez have a paper in it; all I can find on the web is some sort of preliminary version (sage.math.washington.edu/tmp/stein-cise-comments-may22.pdf).
OK, now that you know that Perez is the author of IPython and Oliphant of numpy (which are, with matplotlib, the foundation of pylab) consider Wes McKinney, a doctoral candidate in Statistical Science at Duke, and author of Pandas (http://pandas.sourceforge.net/), a statistics and time series module built on these tools. Here’s a podcast of Perez, Oliphant and McKinney (http://www.enthought.com/~ascopatz/inscight/inscight_13_2011_05_18.mp3) discussing the data structures to support financial computing. Here are some presentations by McKinney (http://python.mirocommunity.org/video/1531/pycon-2010-python-in-quantitat and http://conference.scipy.org/scipy2011/slides/mckinney_time_series.pdf). PyTables might also be of interest to you (http://www.pytables.org/moin). (I haven’t really explored these last items.)
A lot of Google is written in Python. Both Guido van Rossum, Python’s creator, and Alex Martelli are at Google. Google has a lot of their speeches, and a lot of other Python material posted. But, try this (http://neopythonic.blogspot.com/2009/11/python-in-scientific-world.html).
Why Python? It gets work done. It allows the immediacy of an interpreter for development with the ability to call compiled code at time critical points. It’s cross platform. You can program in an object oriented way, or not, completely your choice. There are tons of very respectable modules available to do almost anything you can imagine. The US national labs use it as a glue language, and, have open sourced a lot of their work. All the venerable linear algebra packages have been wrapped and may be called. The numpy module (written by Oliphant, now president of Enthought) allows for very fast numerical computing (http://www.youtube.com/user/EnthoughtMedia#p/u/1/vWkb7VahaXQ). … There is numpy/scipy support from creditable places, e.g., Caltech’s hosting of SciPy200x for many years (http://conference.scipy.org/proceedings/). Lots of significant projects are using it as a glue language, e.g., Sage (http://www.sagemath.org/). It’s becoming (has become) the common denominator of numerical computing.
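As a small illustration of the array-at-a-time style that numpy encourages, here is a minimal sketch that computes simple returns and a rolling mean without an explicit Python loop. The price values and the helper names (`simple_returns`, `rolling_mean`) are made up for the example; only `np.asarray`, `np.ones` and `np.convolve` are actual numpy functions.

```python
import numpy as np


def simple_returns(prices):
    """Period-over-period returns: p[t] / p[t-1] - 1, computed on whole arrays."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0


def rolling_mean(values, window):
    """Rolling mean via convolution with a flat kernel of length `window`."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(values, kernel, mode="valid")


# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]

returns = simple_returns(prices)
smoothed = rolling_mean(prices, window=3)

print(returns)
print(smoothed)
```

The slicing and the convolution both run inside numpy's compiled routines, which is the sense in which the interpreter stays interactive while the numerical work happens in compiled code.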
Python is said to be executable pseudo code. Once you experience the compactness of Python code, and, the compactness of thought it engenders, C++ and Java seem ponderous, clunky, exasperating. Software Carpentry might help you compare languages (http://software-carpentry.org/4_0/python/intro/; you may want to look around here, version 3 is the last I’ve really looked at, http://software-carpentry.org/3_0/). It might be helpful to consider that Bruce Eckel, ANSI/ISO C++ committee member, and author of the ‘Thinking in C++’ and ‘Thinking in Java’ books, has embraced Python for his own use (http://mindview.net/Books/Python/ThinkingInPython.html; the first link there is a powerpoint that might interest you).
Here is a good set of Scientific Python notes (http://scipy-lectures.github.com/_downloads/PythonScientific-simple.pdf) that I found referenced in this list of resources (http://clouds.eos.ubc.ca/~phil/numeric/docs/_build/html/python.html). Perez has assembled a starter kit (http://fperez.org/py4science/starter_kit.html). Here’s an intro book (http://www.ibiblio.org/swaroopch/byteofpython/files/120/byteofpython_120.pdf), a tutorial (http://www.nmt.edu/tcc/help/pubs/lang/pytut/pytut.pdf), a reference (http://www.nmt.edu/tcc/help/pubs/python25/python25.pdf), and the first video in a Google in-house training series (http://www.youtube.com/watch?v=tKTZoB2Vjuk with supporting materials http://code.google.com/edu/languages/google-python-class/index.html). Note most work is still done in Python 2.x (not yet 3.x).
I think it’s worthwhile to mention this. With an eye open for things that might help grandchildren, I stumbled on a fellow’s youtube channel. He’s an MSEE from MIT, a Harvard MBA, and ran a hedge fund for six years. He posted some short videos to help his niece with high school math, they went viral, next thing he knows a friend is calling from a conference to say Bill Gates is on stage saying how happy he was to find the videos for his daughters. Now Gates and Google are backing him. He has 2500+/- short videos posted that he claims cover K-12 math and science (although, he covers linear algebra and differential equations). He has a computer science offering in Python. The few videos I watched were good, and, I picked up a new perspective or two about Python from them (http://www.youtube.com/user/khanacademy?blend=1&ob=4).
What I’ve tried to illustrate is that Python, often using the particular tools IPython, numpy and matplotlib (often as they are conveniently integrated by pylab), is the de facto standard for numerical computing, analysis and display. What I claim is that financial computing, analysis and display should join this club, and use these standard tools, which I think are the wave of the future.
Hope this helps. (And, that it somehow helps to inspire someone to finish work on the matplotlib.finance module. I want to use that module!) Thanks again for your interest.
PS: It seems that you can now access IPython in the cloud (http://www.pythonanywhere.com/try-ipython/). This might be an easy way to get some experience without having to install anything (but, it’s noticeably slow). Any Python should work at the prompt. The last time I encountered the company that provides this service (Resolver, http://www.resolversystems.com/), they were building a Python integrated spreadsheet application based on open source tools. I think they were somehow connected to an Enthought financial users group (maybe in London?). There was some mention of them on Guru’s site a couple of years ago.