Posts tagged "programming-languages":

28 Jul 2014

Why C++?

Recently, on the otb-users mailing list somebody asked why C++ was chosen to implement OTB.

The main reasons why I chose C++ for OTB were the following (by order of relevance):

Most of the existing libraries on which OTB was going to rely on were written in C++. Not only ITK, which is at the core of OTB, but also OSSIM and gdal.
C++ can be as efficient as C and has higher level abstraction mechanisms allowing for cleaner code and architecture (templates, classes, name spaces, etc.).
C++ was the language I was most familiar with.

C++ was first introduced in the late 70's and was first called "C with classes" since C++ was designed to provide Simula's facilities for program organization together with C's efficiency and flexibility for systems programming. The class concept (with derived classes and virtual functions) was borrowed from the Simula programming language.

C++ supports different programming styles:

procedural and imperative styles due to its closeness to C,
object orientation with classes, virtual functions and inheritance,
generic programming through templates,
functional programming through function objects.

Let's have a look at these different items.

Procedural and imperative styles: the C style

C++ has the same tools as C for imperative programming. So there are the classical control flow mechanisms such as for and while loops, if statements, etc.

C++ supports procedural programming via functions, which are like C functions but with more strict type checking of the arguments. Unlike C, function arguments can be passed by reference, which is like passing a pointer to the object, but using the same syntax as if the object itself was passed. This allows to pass large objects without copying them as with pointers, but the programmer can choose to declare the reference as constant so that the object can't be modified inside the function.

Object orientation: C with classes

C++ has classes which are composite types (that is, they can contain other objects) like C structs, but they can also have associated methods, or member functions. At first, object orientation in C++ can be seen very as similar to Java's or Python's in terms of what a class can contain and how it is used. The main difference is that C++ uses value semantics. In clear, when a variable of a given class is created in C++, it is a value, like an int or a double. In Java or Python, this is not the case. In these languages, what you get is a reference to an object, not the object itself, and therefore, you can't manage memory as you would want to and a garbage collector is needed in order to free memory when the object is not used anymore.

In C++ you can choose to have an object (the default behavior) and in this case, like an int or a double, it will be allocated in the stack and be deallocated automatically when it goes out of scope (typically at the end of the block, when a closing curly brace is found). So there is no need for garbage collector, since no garbage is generated. Java programmers will appreciate this joke …

On the other hand, you can choose to use pointers to the objects, and then the objects are allocated in the free store and the programmer is responsible for freeing the memory.

C++ offers multiple inheritance (unlike Java) and virtual functions (like Java).

C++ also offers operator overloading. This means that arithmetic operators like + for addition or * for multiplication can be redefined for each class. This is useful for instance for a matrix class. Other operators like [], -> can also be overloaded.

Generic programming: templates and the STL

Generic programming can be used in C++ thanks to templates. Class templates are classes defined in terms of generic types. That is, a class can contain an object of type T which is not defined. All code, like member functions, etc. can be written in terms of this generic type.

When a programmer wants to use this class template, she has to say which is the concrete type that will be used instead of T. This is very useful when algorithms are exactly the same for different data types like integers, floating point numbers, etc. In languages which don't support generic programming, the same algorithm has to be rewritten for every different type, or the programmer has to chose the best type (for some definition of best) for which to write the algorithm.

There are also function templates, which are functions defined in terms of generic types.

The mechanism behind templates generates, at compile time, the code for the concrete types which will be as efficient as hand-written code for these types.

C++ comes with the Standard Template Library, the STL, which provides generic containers and generic algorithms. Generic containers are containers like vectors, lists, etc. whose elements are generic (like ints, doubles, strings or any other type or class). Generic algorithms are algorithms which operate on generic types (like ints, doubles, strings or any other type or class). These algorithms operate on ranges inside a container, so they are the same for a vector of ints or for a list of doubles. These ranges are defined by iterators, which are like pointers or coordinates inside the container.

In this way, if you have N types of containers and M different algorithms, you don't have to write NxM versions of the code, but just N+M. So less code to write and maintain. I am a huge fan of templates.

Functional programming

Functional programming is not just programming using functions, but rather using functions as first class citizens. That means being able to pass functions as arguments to other functions and return them as results.

In C, one can achieve this by using pointers to functions, but this is risky and ugly, in terms of function signature.

In C++ we can define functions as types, and therefore they can be used as arguments to and return values from other functions. The way of creating a function type (in C++ we call them function objects or functors) is to create a class and use operator overloading. In the same way as + can be overloaded in a matrix class, the () operator can be overloaded and therefore if myfunc is an object of the class F where the operator has been overloaded, we can use myfunc() like a normal function call.

And therefore, now we can define functions which take or return objects of class F and implement functional programming.

The STL uses extensively function objects in order to tune the behavior of algorithms. For example, the STL find_if algorithm returns the position of first element in a container for which a particular condition is true. It takes as one of its arguments the function which will be used to test if the elements of the container verify a given condition.

Conclusion

As far as I know, C++ offers the best trade-off between efficiency and abstraction level. Or as Bjarne Stroustrup said in a recent interview:

The essence of C++ is that it provides a direct map to hardware and offers mechanisms for very general zero-overhead abstraction.

Since C++ can be as efficient as C, I don't see any reason to use C for a new programming project. Learning C is still useful if one needs to modify existing C code. However, if one wants to use an existing C library in a new project, C++ seems a better choice.

The recent C++11 and C++14 versions of the language (which is by the way an ISO standard) have introduced many new features which make the use of C++ easier and this is giving a sort of renaissance to the language.

Nowadays, added to its classical uses in systems programming and scientific applications, C++ is starting to be used in mobile applications.

19 Aug 2012

Emacs Lisp as a general purpose language?

I live in Emacs

As you may already know if you read this blog (yes, you, the only one who reads this blog …) I live inside Emacs. I have been an Emacs user for more than 12 years now and I do nearly everything with it:

org-mode for organization and document writing (thanks to the export functionalities),
gnus for e-mail (I used to use VM before),
ERC and identica-mode for instant messaging and micro-blogging (things that I don't do much anymore),
this very post is written in Emacs and automatically uploaded to my blog using the amazing org2blog package,
dired as a file manager,
and of course all my programming tasks.

This has been a progressive shift from just using Emacs as a text editor to using it nearly as an operating system. An operating system with a very decent editor, despite of what some may claim!. It is not a matter of totalitarism or aesthetics, but a matter of efficiency: the keyboard is the most efficient user input method for me, so being able to use the same keyboard shortcuts for programming, writing e-mails, etc. is very useful.

Looking for a Lisp

In a previous post I wrote about my thoughts on choosing a programming language for a particular project at work. My main criteria for the choice were:

availability of a REPL;
easy OTB access;
integrated concurrent programming constructs.

I finally chose Clojure, a language that had been on my radar for nearly a year by then. I had already started to learn the language and made some tests to access OTB Java bindings and I was satisfied with it. Since sometimes life happens, and in between there is real work too, I haven't really had the time to produce lots of code with it, and by now, I have only used it for things I would had done in Python before. That is, some glue code to access OTB, a PostGIS data base, and things like that. I also feel some kind of psychological barrier to commit to Clojure as a main language (more on this below). Added to that, has also been the recent evolution of the OTB application framework, which is very easy to use from the command line. In the last days of my vacation time, the thoughts about the choice of Clojure came back to me. I think that everything was triggered by a conversation with Tisham Dhar about his recent use of Scheme. It happened during IGARSS 2012 in Munich, and despite (or maybe thanks to?) the Bavarian beer I was analyzing pros and cons again. Actually, before starting to look at Clojure, when I was re-training myself about production systems, I had thought about GNU Guile (a particular implementation of Scheme) as an extension language for OTB applications, but finally gave up, since I thought that going the other way around was better (having the rule-based system around the image processing bits). When I thought that I was again settled with my choice of Clojure, I stumbled upon the announcement of elnode's latest version. Elnode is a web server written in Emacs Lisp. I then spent some hours reading Nic Ferrier's blog (the author of Elnode) and I saw that he advocates for the use of Emacs Lisp as a general purpose programming language. Elnode itself is a demonstration of this, of course. Therefore, I am forced to consider Emacs Lisp as an alternative.

But what are the real needs?

Availability of a REPL: well, Emacs not only has a REPL (M-x ielm), but any Emacs lisp code in any buffer can be evaluated on the fly (C-j on elisp buffers or C-x C-e in any other buffer). Actually, you can see Emacs, not as an editor, but as some sort of Smalltalk-like environment for Lisp which gets to be enriched and modified on the fly. So this need is more than fulfilled.
Easy access to OTB: well, there are no bindings for Emacs lisp, but given the fact that the OTB applications are available on the command line, they can be easily wrapped with system calls which can even be asynchronous. The use of the querying facilities of the OTB application engine, should make easy this wrapping. So the solution is not so straightforward as using the Java bindings on Clojure, but seems feasible without a huge effort.
Concurrency: this one hurts, since Emacs lisp does not have threads, so I don't think it is possible to do concurrency at the Lisp level. However, since much of the heavy processing is done at the OTB level, OTB multi-threading capabilities allow to cope with this.

For those who think that Emacs Lisp is only useful for text manipulation, a look at the Common Lisp extensions and to the object oriented CLOS-compatible library EIEIO may be useful.

So why not Emacs lisp then?

The main problem I see for the use of Emacs Lisp is that I have been unable to find examples of relatively large programs using it for other things than extending Emacs. There are of course huge tools like org-mode and Gnus, but they have been designed to run interactively inside Emacs. There are other more technical limitations of Emacs Lisp which can make the development of large applications difficult, as for instance the lack of namespaces, and other little things like that. However, there are very cool things that could be done to make OTB development easier using Emacs. For instance, one thing which can be tedious when testing OTB applications on the command line is setting the appropriate parameters or even remembering the list of the available ones. It would nice to have an Emacs mode able to query the application registry and auto-complete or suggest the available options. This could consist of a set of Emacs Lisp functions using the comint mode to interact with the shell and the OTB application framework. This would also allow to parse the documentation strings for the different applications and present them to the user in a different buffer. Once we have that, it should be possible to build a set of macros to set-up a pipeline of applications and generate an Emacs Lisp source file to be run later on (within another instance of Emacs or as an asynchronous process). This would build a development environment for application users as opposed to a development environment for application developers. And since Lisp code is data, the generates Emacs Lisp code, could be easily read again and used as a staring point for other programs, but better than that, a graphical representation of the pipeline could be generated (think of Tikz or GraphViz). Yes I know, talk is cheap, show me the code, etc. I agree.

Conclusion

Clojure has a thriving community and new libraries appear every week to address very difficult problems (core.logic, mimir, etc.). Added to that, all the Java libraries out there are at your fingertips. And this is very difficult to beat. I hear the criticisms about Clojure: it's just a hype, it's mainly promoted by a company, etc. Yes, I would like to be really cool and use Scheme or Haskell, but pragmatism counts. And, anyway, the programming language you are thinking of is not good enough either.