Thoughts on Programming Languages for Scientific Computing

I have a friend who works in chemistry / protein folding and thus does a lot of computational work in that field. Coming from an application/web programming background, I was utterly shocked when she mentioned that FORTRAN is widely used in computational simulations. The great programmer Nietzsche once declared that “FORTRAN is dead”, and I held that belief too, thinking that FORTRAN was only used in the days before C and C++ existed, when programmers didn’t have anything else. I didn’t know a single person who used FORTRAN.

But I was wrong. It turns out that FORTRAN is still actively used in scientific computing. Nor is it an “outdated language”: FORTRAN has gone through quite a few versions, which have added modern programming language features over the years.

The biggest draw of FORTRAN is speed. It’s one of the lowest-level of the high-level programming languages (a group that includes C, C++, Pascal, Java, etc.). The language, intended for scientific computing, was designed to be fast. In comparison, languages like C and C++ were designed as general-purpose languages.

But I wasn’t totally sold on FORTRAN. I don’t like the syntax and would rather code in C++ any day. I didn’t believe that C++ was all that much slower than FORTRAN, so I decided to do some research into the matter. The following comes from an email I wrote:

  1. FORTRAN is widely cited to be “20% to a factor of ten” faster than C++. I didn’t find anything for C, but I think the speed of C is about the same as C++.
  2. However, the data in #1 was published in 1997. Since then, C and C++ compilers have been heavily optimized to produce speeds comparable to FORTRAN. Some more recent benchmark results are here:

    (These two benchmarks show FORTRAN to be about the same as C/C++
    depending on how the code is written and compiled)
    http://dan.corlan.net/bench.html (maybe around the year 2000)
    http://dan.corlan.net/amd64_dual_core_benchmarks.html (newer version)

    (This benchmark shows FORTRAN to be much slower than C/C++. Java is
    even faster!)
    http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all

    (This benchmark shows C++ being slightly faster than FORTRAN)
    http://pauloherrera.blogspot.com/2006/12/introduction-ive-read-lot-of-things.html

  3. In addition, there is a library for C++, called Blitz++, which allows C++ programs using the library to approach the speed of FORTRAN. Benchmarks are here:
    http://www.oonumerics.org/blitz/benchmarks/
    (Note that the percentages, e.g. 95.7%, are the speed relative to the
    comparable FORTRAN program, so 100% is the same speed.)

    Also note that the Blitz++ benchmarks were run around the year 2000. I expect that if the same benchmarks were run today with modern compilers, the results would be even better.

  4. Therefore, I conclude that C/C++ is comparable in speed to FORTRAN and, depending on what you are doing and how you write the code, can be faster than FORTRAN. So it comes down to which programming language you prefer. I would probably always pick C++ over FORTRAN since I am more familiar with C++ and I think C++ code is easier to write and read.
  5. Surprisingly, Python *can* be really fast. See:
    http://scipy.org/PerformancePython
    Of course, it will never approach the speed of C++ or FORTRAN, but Python code is very easy to write, so for medium sized projects you can write the code much faster.
  6. Surprisingly, Java *can* be faster than C/C++. This is largely attributed to optimizations that Java can do at run time and that C/C++ compilers can’t. See:

    http://www.idiom.com/~zilla/Computer/javaCbenchmark.html

    http://www.kano.net/javabench/ (see links at bottom of the page for more java vs c++ benchmarks)
    http://www.kano.net/javabench/data (good graphs)

    In my opinion, Java code is easier to write than C++ code. You might also get the advantage of portability, since Java code can run on pretty much any computer unmodified.

  7. Unsurprisingly, Matlab is slow. Even when Matlab code is optimized, it is at least 2 times as slow as comparable C/C++ code. Unoptimized, it can be 30 or more times slower! Of course, the good thing about Matlab is that you can write programs that perform math much more quickly than you can with C/C++, Java, Python, or FORTRAN. So Matlab might be good for testing an idea or implementing something on a small scale.

Overall conclusion: C++ is about as fast as FORTRAN, or only a little bit slower. As C++ compilers become more optimized, the difference decreases. Java seems pretty fast, so I would probably do more research on it. If it is *really* comparable to C/C++ in speed, I would probably use Java. For medium scale applications, I would use Python. For small scale applications, I would use Python or Matlab.

Project Update and Computer Irony

Eugene asked me today via email how things were going. I figured I should post the response here so that other people can know what’s up too:

It’s going alright. Low efficiency though, since I’m constantly distracted by stuff. Also, since I want to do it the “right way”, sometimes it’s a lot slower (as opposed to building it in Rails at the loss of some extensibility and compatibility, since I’m trying to make it so that other people can run it on their own server). If it were just an app running on one server with no one else using it, then Rails would probably be the best way.

I’m hoping to get a really basic version released (nothing really special) next week so other people can play around with it too. At the same time, I’m going to try to set up the commercial side of it (but not release it to the public yet) so that I can try focusing on security and speed.

But in all, it’s more difficult than I imagined. Most of the time is spent debating the best way to do it and looking at how other scripts do it.

There are also some setbacks... like these past days, Windows suddenly began corrupting my files after each reboot or hibernation for no good reason. Actually, I think it’s this program called RollBack Rx, which is supposed to be data recovery software that runs in the background (yeah, what irony). So I went to uninstall the program, but when I uninstalled it, the program took with it a chunk of core Windows files, which rendered Windows useless.

So I used a Feisty live CD to mount the C drive over Samba and used my other computer to copy over files from the Windows CD. Unfortunately, that didn’t help at all.

So I went to reinstall Windows. But I discovered that my DVD/CD drive suddenly couldn’t read CD-Rs (I’m working with Toshiba to get it exchanged, although they want me to send my whole laptop in). My two Windows CDs were both CD-Rs (legit versions, might I add).

Finally, I decided to install Feisty since it was on a factory-pressed CD (I got it from the Ship-It service) and my CD/DVD drive could read that. Since I don’t use tablet functionality a lot when I’m at home, I figured I could live with Ubuntu for a while until I get my DVD/CD drive fixed.

It’s surprisingly good. It actually set up my digitizer pen out of the box (although right click doesn’t work). Wireless also works out of the box. Video and audio are good (although I had to tweak video a bit). It even tells me that the battery on my Logitech wireless mouse is only at 14%!

Well, in truth, it’s not like I didn’t know that Feisty was good. Three of Avery’s computers run Ubuntu, so I had quite some experience with it. But compared to the Ubuntu I tried out two years ago, Feisty does very good hardware detection. So I’ll probably be using Ubuntu for a while until I can get Windows back on (mainly for MS OneNote).

parasiteLaTeX, Wordpress LaTeX plugin

Wordpress.com added a new feature today allowing for \LaTeX output in blog posts. Interestingly enough, they use a script that takes \LaTeX input and returns an image of the compiled output. The URL follows this format:

http://l.wordpress.com/latex.php?latex=[yourmathcodehere] &bg=ffffff&fg=000000&s=0

So I’ve decided to write a simple plugin that leeches off of this url to generate LaTeX output for your blog.
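
The plugin itself is PHP, but the URL construction can be sketched in a few lines of Python. This is only an illustration of the format quoted above, not the plugin's code: the function name and its defaults are made up for the example, and reading bg, fg and s as background colour, foreground colour and size is a guess from the parameter names.

```python
from urllib.parse import urlencode

# Endpoint and parameter names come from the URL format quoted in the post.
LATEX_ENDPOINT = "http://l.wordpress.com/latex.php"


def latex_image_url(latex, bg="ffffff", fg="000000", size=0):
    """Return the image URL for a snippet of LaTeX math code (hypothetical helper)."""
    # urlencode percent-escapes backslashes, carets, spaces and the like,
    # so the math code survives the trip through a query string.
    query = urlencode({"latex": latex, "bg": bg, "fg": fg, "s": size})
    return LATEX_ENDPOINT + "?" + query


print(latex_image_url(r"Z_1 = n_Q V"))
```

Escaping matters here because LaTeX is full of characters (backslashes, braces, carets) that are not safe to paste into a URL as they are.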

Installation:

  1. Download the plugin: parasiteLaTeX-plugin_1.0.zip [zip, 2KB] (the plugin was pulled at Matt’s request).
  2. Install the plugin by unzipping the file, parasiteLaTeX-plugin.php, into your wp-content/plugins directory.
  3. Activate the plugin in the Administrative interface. That’s it!

To use LaTeX in your blog posts:
Surround your math code with: [ tex] and [/ tex] (removing the spaces between [ and tex).
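
For the curious, the tag handling can be sketched as a regular-expression substitution. The Python below is an assumption about how such a filter might work, not the plugin's actual PHP; render_tex_tags is a hypothetical name, and the query parameters are simply copied from the URL format above.

```python
import html
import re
from urllib.parse import quote

# Non-greedy match so that several formulas in one post are handled separately.
TEX_TAG = re.compile(r"\[tex\](.*?)\[/tex\]", re.DOTALL)


def render_tex_tags(post_text):
    """Replace every [tex]...[/tex] span with an <img> tag (hypothetical helper)."""
    def to_img(match):
        code = match.group(1).strip()
        url = ("http://l.wordpress.com/latex.php?latex=" + quote(code, safe="")
               + "&bg=ffffff&fg=000000&s=0")
        # Escape both attributes so quotes or ampersands cannot break the markup.
        return '<img src="%s" alt="%s" />' % (html.escape(url), html.escape(code))
    return TEX_TAG.sub(to_img, post_text)


print(render_tex_tags("Energy: [tex]E = mc^2[/tex]"))
```

Keeping the original math code in the alt attribute means the formula is still readable if the image fails to load.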

Enjoy! Here's a demo:

Z_1 = n_Q V = (M \tau / 2 \pi \hbar^2)^{3/2} V

Frustrations of Rails and Wikis!

Argggg! I just spent the last 3 hours looking around for a good wiki system to use for Avery’s new web site. I was daring and wanted to try Instiki, a beautiful Ruby-based wiki. I tried to set up fcgid with Apache 2, but that didn’t work out very well. So I decided to ditch Apache and go with lighttpd instead. But lighttpd became a pain, and since it didn’t integrate well with Subversion, I went back to Apache and set up a proxy to seamlessly pass the URLs to Ruby’s WEBrick server (the default one). That worked well, except that Instiki insisted on having a directory after the base path of the URL (e.g. test.com/someword/show/HomePage when I wanted test.com/show/HomePage). So I decided not to use it.

So I went back to looking around for good PHP wikis and was dismayed at how crappy all of the current ones are. MediaWiki is too bloated. DokuWiki can’t use MySQL, so it doesn’t scale up for larger sites. WikkaWiki is good, but development on it isn’t very active (edit: the project is active, but just has slow release cycles), and in my previous experience, it takes some work to skin. Other wikis were too bloated, visually unappealing, or forced linking with WikiWords. I was thinking to myself: “Gah! Is it really *that* difficult to make a PHP wiki with these features: [insert what I was thinking here]?!” I really wanted to start coding one, but decided not to since I still had lots of work to do.

Erg, it’s really frustrating though. Now I don’t really have a good solution except for maybe falling back on a Wordpress CMS, but that goes against my idea of a wiki.

Simple Asides Fixed for WP 2.0.4

Spent some time today fixing my Simple Asides plugin for the latest version of Wordpress, 2.0.4. It seems there were some changes in the way Wordpress calls my plugin’s functions, so I had to move a few variables around. The code is still pretty ugly and inelegant though.

Download/upgrade the plugin on the Simple Asides page. Sorry it took so long to fix.

Simple Asides Broken by Wordpress 2.0.4

See title. Yeah, I just upgraded, and there seem to be problems with my plugin again. I will fix this soon.

Page Watch

Avery House is currently playing a game of assassin, and we have our stats posted on the team ranking page. Since this page gets updated to reflect kills/assassinations, one strategy to keep tabs on people is to save copies of the stats page over time and see who scored a kill and who was eliminated at the same time.

So I wrote this quick script in Python called pageWatch.py. Essentially, it polls a page every so often, and if the page has changed, it generates a quick HTML diff.
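
The idea can be sketched with nothing but the standard library. This is a minimal reconstruction of the polling-and-diffing loop described above, not pageWatch.py itself; the function names, the five minute interval and the output file naming are all assumptions.

```python
import difflib
import time
import urllib.request


def fetch(url):
    """Download a page and return its text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def html_diff(old, new):
    """Return a standalone HTML page highlighting the lines that changed."""
    return difflib.HtmlDiff().make_file(
        old.splitlines(), new.splitlines(), "before", "after")


def watch(url, interval=300):
    """Poll url every `interval` seconds and save a diff whenever it changes."""
    previous = fetch(url)
    while True:
        time.sleep(interval)
        current = fetch(url)
        if current != previous:
            # One timestamped file per change, so the history can be replayed.
            name = time.strftime("diff-%Y%m%d-%H%M%S.html")
            with open(name, "w", encoding="utf-8") as out:
                out.write(html_diff(previous, current))
            previous = current
```

Calling watch() with the address of the stats page would then leave one diff file per update in the current directory; stop it with Control-C.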

Now, how I find the time to write these things, I do not know. *Returns to math problem set*

Open Source Alternative to 3tunes

(UPDATE (05/17/06): Pandora changed their site today so that it no longer displays the song information in the title of the browser window. Therefore, this script does not work anymore. However, you could still retrieve the access* files from the temporary directory and use something like MusicBrainz to identify the files. Perhaps I will try to find a way to pull the information from the Flash script.)

Yesterday on Digg, a curious program called 3tunes appeared which, to quote the site, is a “time-shifting application for the website Pandora.com. It grabs the music information from the titlebar of Mozilla Firefox and then saves the music with the correct track information, in the format of %artist% - %title%.mp3”.

This was pretty cool stuff! However, the program didn’t work on my computer. I did some digging around (no pun intended) and found out how the program operates. It copies the mp3 files that Pandora buffers in the temp directory of the hard drive and renames them to the song artist and title that appear in the browser window title bar.

The problem with 3tunes is that it has the temporary path hard coded into the program (something like: C:\Documents and Settings\Username\Local Settings\Temp). I had changed my temp directory to another path, so 3tunes fails.
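
Asking the operating system for the temp directory avoids the hard-coded path. The sketch below shows one way to do that in Python; it is not the code of 3tunes or pandoraRenamer. The helper names are hypothetical, the access* pattern is taken from the file names mentioned in the update note above, and treating the newest matching file as the song currently playing is an assumption.

```python
import glob
import os
import re
import shutil
import tempfile


def newest_buffer_file(pattern="access*", temp_dir=None):
    """Return the most recently modified file matching pattern, or None."""
    # tempfile.gettempdir() honours the TMPDIR/TEMP/TMP environment variables,
    # so a relocated temp directory is found without hard-coding a path.
    temp_dir = temp_dir or tempfile.gettempdir()
    candidates = glob.glob(os.path.join(temp_dir, pattern))
    return max(candidates, key=os.path.getmtime) if candidates else None


def track_filename(artist, title):
    """Build '<artist> - <title>.mp3', replacing characters Windows forbids."""
    name = "%s - %s" % (artist.strip(), title.strip())
    return re.sub(r'[\\/:*?"<>|]', "_", name) + ".mp3"


def save_track(artist, title, dest_dir="."):
    """Copy the newest buffered file into dest_dir under its track name."""
    source = newest_buffer_file()
    if source is None:
        return None
    target = os.path.join(dest_dir, track_filename(artist, title))
    shutil.copyfile(source, target)
    return target
```

Sanitising the file name matters because artist and song names regularly contain slashes, colons or question marks, which Windows refuses in file names.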

I liked the concept of 3tunes though, so having some time today, I decided to write an open source alternative to it in Python. The script doesn’t have a GUI; it just sits in the background and saves the song that Pandora is playing to the current directory. The requirements are:

How to use:
Open up a browser window and make sure Pandora is running in that window. Also, do not open any more tabs in that window. Then just drop the pandoraRenamer.py script in a directory and double click to run the script (or in the command line: python pandoraRenamer.py). To exit the script, hit Control-C. The 128Kbps mp3 files will automatically be saved in the directory that pandoraRenamer.py is in.

Oh, and here’s the script: pandoraRenamer 0.1.0 (py - 4KB)

It’s released under the GPL so you can mess around with it! Digg it if you find it helpful?

Obsessed over Textmate

You may call me crazy (you can really! Go ahead!), but for the last few days, I’ve been really obsessed over this Mac OS X based text editor called Textmate. I commented to a friend that this application almost makes me want to switch to the Mac OS platform. This text editor is almost perfect, and I spent too much time trying to find a Windows alternative (I failed). However, I did come across some Windows applications that come close and could probably provide the functionality of Textmate, but not all of the eye candy (the eye candy is impossible on Windows since it is Mac OS X based). And I’ve been seriously considering modifying an existing open source application to mimic Textmate. Instead of working on my physics problem set today or studying for my Chem 14 midterm, I somehow ended up outlining all the steps necessary to modify an open source text editor to function like Textmate. In fact, it should only take about a week of development time or so…

The whole point of this post was to make myself stop working on this and to work on more important things like a really difficult midterm and a difficult problem set all due within two days…

Code Igniter

I’m always on the lookout for good PHP development frameworks. In the past, I’ve tested a lot of the more popular ones but hated their rigid file structures, restrictive function calls, and file bloat. A lot of frameworks weren’t elegant and seemed to cause more development issues than they were trying to solve in the first place. A couple of weeks ago, I thought CakePHP was pretty good. However, today, I stumbled upon Code Igniter, which is almost exactly my definition of a perfect PHP framework. The documentation is top-notch, and the script and I agree on many development philosophies (e.g. the use of PHP as the templating system instead of other libraries). I can’t wait to finish my midterms and mess around with this framework.