In the July 2008 issue of IEEE Computer, there is a short article called “In Praise of Scripting: Real Programming Pragmatism”, by Ronald P. Loui, a professor at Washington University (WUSTL). The article deals with the question of which language is the appropriate first language to teach new CS (Computer Science) students, and argues that a “scripting” language like Python or Ruby might be far better than Java (no doubt about that, I think).
The interesting material in the article is the background on WHY he thinks this is the case. He points to the immense popularity and rise of scripting across much of computing. It is clear to him (and I mostly agree) that over the past ten years, languages like Perl, PHP, Awk, Ruby, JavaScript, and Python have eclipsed Java and C++ as the most interesting and important programming languages for many practical tasks. This is especially true for web applications, where Java has a presence but no one would dream of using something as clunky and impractical as C.
What can this teach us about simulation and the creation of models of computer system hardware? Maybe a fair bit…
The reason that scripting languages have gotten so popular is primarily that they let you get the job done faster. In web land, this is due to a few important factors listed or hinted at in the article:
- The overall code tends to be shorter and more to the point
- A tendency to make any value into a string when needed (excellent debugging aid)
- Dynamic typing, variables that just appear when needed
- Automatic memory management
- Ease of handling strings
- Most tasks are pretty short
- Performance of computation does not matter very much when processors are as fast as they are today and a single disk or network access operation is likely to dominate programming time
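As a rough illustration of several of these points at once, here is a minimal Python sketch. The task (counting requests per path in a made-up log format) and the name `summarize` are my own assumptions, not something taken from the article; the point is only how little ceremony is needed.

```python
from collections import Counter


def summarize(lines):
    """Count requests per path in a hypothetical 'METHOD PATH' log format."""
    # No type declarations: 'hits' simply appears when first assigned.
    hits = Counter()
    for line in lines:
        # Easy string handling: split on whitespace and pick the second field.
        fields = line.split()
        if len(fields) >= 2:
            hits[fields[1]] += 1
    # Any value turns into a string when needed, which also helps debugging.
    return ", ".join(f"{path}={count}" for path, count in sorted(hits.items()))


print(summarize(["GET /a", "GET /b", "POST /a", ""]))
```

There is no memory management, no header file, and no build step here, and the whole task fits in a dozen lines.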
The article continues with a discussion arguing that we need to focus on programming language pragmatics rather than syntax or semantics. How practical is the language for the everyday tasks of a programmer? And it seems that simple Darwinian evolution has propelled scripting-style languages to the top of the heap here. Purity, elegance, and deep semantic properties are obviously less important than getting the job done in the shortest time possible.
What can this teach us modeling folks?
- A key takeaway is the focus on short code. Code has to be short and focused to be quick and easy to write. You should not need to consult several header files and lots of documentation to understand and formulate your code, which is an ill that keeps hitting C and C++ programs.
- Manual memory management has no place in a productive programming environment.
- Basic types have to be appropriate to the task at hand.
- Syntax has to be concise and powerful.
- Dynamic typing is much faster to code with than static typing and variable declarations.
- An interactive “try it out” environment is preferable to long compile/link/test cycles.
Obviously, some of these do not translate well into hardware modeling. Hardware entities tend to have very well-defined types by nature. If I have a 16-bit register containing a 13-bit counter and three status bits, I cannot very well let that counter be a dynamic variable that can take on any numeric value, or the atom “foo”. Its semantics have to be restricted to how the hardware would behave… it is not just a “value” to be manipulated abstractly. What you need instead of fully dynamic typing, then, is a set of types well suited to the task at hand.
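To make the register example concrete, here is a minimal Python sketch of what a type suited to the task could look like. The class name `CounterRegister` and the bit layout (counter in bits 0 to 12, status in bits 13 to 15) are assumptions made for illustration; a real device specification would dictate both.

```python
class CounterRegister:
    """Hypothetical 16-bit register: 13-bit counter plus three status bits.

    Assumed layout: counter in bits 0-12, status flags in bits 13-15.
    """

    COUNTER_BITS = 13
    STATUS_BITS = 3
    COUNTER_MASK = (1 << COUNTER_BITS) - 1  # 0x1FFF
    STATUS_MASK = (1 << STATUS_BITS) - 1    # 0x7

    def __init__(self):
        self.raw = 0  # the full 16-bit register value

    @property
    def counter(self):
        return self.raw & self.COUNTER_MASK

    @counter.setter
    def counter(self, value):
        if not isinstance(value, int):
            raise TypeError("counter must be an integer")
        # Keep the status bits, and wrap the counter to 13 bits as hardware would.
        status_part = self.raw & (self.STATUS_MASK << self.COUNTER_BITS)
        self.raw = status_part | (value & self.COUNTER_MASK)

    @property
    def status(self):
        return (self.raw >> self.COUNTER_BITS) & self.STATUS_MASK

    @status.setter
    def status(self, value):
        if not isinstance(value, int):
            raise TypeError("status must be an integer")
        # Keep the counter bits, and truncate the status to three bits.
        counter_part = self.raw & self.COUNTER_MASK
        self.raw = counter_part | ((value & self.STATUS_MASK) << self.COUNTER_BITS)
```

With this in place, incrementing a counter that holds 8191 wraps it to 0 without disturbing the status bits, and assigning "foo" to the counter raises a `TypeError` instead of silently turning the register into something the hardware could never hold.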
It is also nice if variables can just pop into existence when needed, with as little declaration as possible. Another C-family productivity killer is the annoying need to declare a function before calling it. That made sense for the single-pass compilers of the 1970s, but today the compiler can simply take the entire program into account and find the function. It is not that hard, especially not for the quite small and isolated context that a virtual platform device model is (few device models are more than a few thousand lines of code in my experience).
The interactive nature of scripting is also interesting. The ability to write new code directly into a running system without an explicit compile step usually rests on a virtual-machine approach and does have some cost in terms of performance. And unlike web servers, virtual platforms need all the raw CPU performance they can get! However, it makes excellent sense to use an interactive environment to prototype and test things, and then move the resulting code to a stricter compile stage.
Right now, we really do not have such a tool at hand. Virtutech DML is in my opinion the most promising step along this road, but it is not really “Perl for modeling” at this time. SystemC has too many C++ roots to really behave well in this respect… you get all the drawbacks of explicit memory management, rampant header files, and statically declared types. It might be good to compile, but it sure is an absolute pain to write.
And if we want virtual platforms to really fly, it is not so much “all about the models”, but really “all about model programming”, since there is a huge volume of models out there that need to be written, and reducing the time needed to write them is a primary concern. The total work invested in modeling any particular device is what we need to focus on, really.
A very nice example of how easy programming and modeling should be with the right tools is found on Joe Armstrong’s Erlang blog, where he does an “ftp server” from scratch…
http://armstrongonsoftware.blogspot.com/2006/09/why-i-often-implement-things-from.html
“Performance of computation does not matter very much when processors are as fast as they are today and a single disk or network access operation is likely to dominate programming time”.
It does. It’s just that scripting languages rely on lots of nice cool features, all of them written in C or C++.
Performance doesn’t matter for the kind of programs generally developed in scripting languages, but it does for other applications, like the web browser you’re probably using, which is written in…, and not in…
I think that is a good point… the performance effect of certain scripting languages when used intensely IS significant, especially when the application does not do a lot of disk or network access. Both the Virtutech simulation system, Simics, which I know intimately, and the compilers I used to develop before that have this property. When I apply heavy-duty scripting to Simics, performance quickly drops drastically… with all the time spent jumping into Python from C and back.
Simics is such a program. All C and assembly, and a little C++, in the core parts where cycles and instructions do count. And we can peg the CPU load of a processor at 100% for hours on end.
Even so, the point I am trying to make is that scripting languages have a key lesson for programming environment designers: creating abstractions close to the user’s needs makes for better programs and shorter programs and something that can often be compiled efficiently. I also think that it makes sense to prototype with a scripting approach, and then move to a faster implementation.
I know that scripting languages are never very fast; the happy-go-lucky attitude was from the original article quoted.
“Creating abstractions close to the user’s needs makes for better programs and shorter programs and something that can often be compiled efficiently”.
C++ is all about creating abstractions, and they can be created very close to the user’s needs (in more efficient ways than other languages), although it can’t be compiled efficiently. Hopefully this will be solved with the introduction of modules, which is planned for TR2: http://herbsutter.wordpress.com/2007/02/07/iso-c0x-complete-public-review-draft-in-october-2007 (look for the word “modules”).
The world is as it has always been: scripting for some things, and C/C++ (and the like) for others. It’s just that the applications for scripting have grown so much, thanks to the Web and the power of computers nowadays, that everybody thinks we should all stop using C/C++. But remember: while everybody talks about Perl, PHP, Awk, Ruby, JavaScript, and Python, C and C++ are hidden there making everything happen (like your browser, and yours, and yours…).
With the Boost libraries (www.boost.org) writing C++ code is no longer hard. It used to be when there was no Boost.
I think teaching scripting languages for CS is good for algorithmic purposes, but you need to teach them C and assembler at least, because sooner or later they will ask “what makes everything happen?”, and they MUST know WHAT and HOW.
I think scripting languages are a very powerful tool (in my mind it makes a lot of sense to have them), but they will never replace C and C++, especially at the rate C++ is evolving.