Werner's own blurbs

Gpg4win and the feds

16 July 2013 8:29 PM (gnupg | gpg4win | trust)

The current issue 16/2013 of the German c't magazine runs a bunch of articles on PRISM et al. So far, so expected. On page 118 the article “Tarnkappen” mentions GnuPG and claims that only a self-compiled version is trustworthy:

Wenn man eine Verschlüsselungssoftware aussuchen kann, sollte man die bevorzugen, deren Quelltext offengelegt ist. Ein Beispiel dafür ist GnuPG. Es nützt aber nichts, wenn man ein fertig kompiliertes Paket wie Gpg4win installiert, das im Auftrag des BSI entwickelt wurde — einer Bundesbehörde. Um wirklich das zu nutzen, was geprüft wurde, muss man die Quelltexte schon selbst kompilieren. Wir haben das mit TrueCrypt versucht.

[If you can choose your encryption software, you should prefer one whose source code has been published. An example of this is GnuPG. However, it is of no use to install a ready-compiled package like Gpg4win, which was developed on behalf of the BSI, a federal agency. To really use what has been audited, you have to compile the source code yourself. We tried this with TrueCrypt.]

Let me comment on this.

First, Gpg4win has indeed been developed on behalf of the BSI. Actually, the BSI has commissioned quite a lot of free software over the last decade and helped to offer solutions to make communication safer and to provide replacements for proprietary PIM suites (e.g. by supporting the development of KDE's Kontact). In fact, they migrated most of their workplaces from Windows to Debian. To help with the migration, several projects to port existing applications from Unix to Windows were carried out by external companies. Gpg4win is one of these projects. My company g10code joined up with Intevation and KDAB for this project, and our bid was accepted in 2006. The actual development happened in the open and could be followed by anyone in the Gpg4win repository. Compare that to the original SELinux code, which was developed in secret at the NSA, published in 2000, and merged into upstream Linux only three years later.

One of our goals was to entirely avoid proprietary tools for development by cross-building the system. This required that we put a lot of work into making the dozens of included software projects cross-buildable. To make this verifiable, the documentation clearly explains how to use a Debian system to build a Gpg4win installer from scratch. Of course, not everything worked as expected. In particular, the included KDE-based Kleopatra key manager took a long time to become ready for cross-building, and we achieved this only recently. To keep the build times at bay we also use some pre-compiled packages of standard free software libraries, but these are by now in the minority.

The c't article may be read as if the BSI produces the binary version. This is definitely not the case. Almost all releases downloadable at gpg4win.org have been built on one of the machines located in my office. The included KDE and Kleopatra packages have been pre-compiled by KDAB in Berlin or by Intevation in their offices. Granted, I can't vouch for the KDE code, but I can't do that for the Pango code either, which we currently use as a pre-compiled binary. And can I be sure that the Debian system which I use for development has really been built from the published sources? I can only assume that there is no backdoor in any of the software used to bootstrap the installer building.

Second, and to continue the last argument: is it actually possible to check the source code? The number of source lines in Gpg4win is immense: more than 3 million lines of code are built, and this does not include the pre-compiled packages, like Pango, Cairo, and the huge Qt and KDE libraries. How can malicious code be detected in that amount of text? It is all too easy to slip malicious code in (for example into the 280,000 lines of shell code).

For many years during GnuPG development I checked each line of the diff files between releases so as to have a chance of noticing strange code. Eventually I gave up on this because it is no longer possible and, worse, the OS and the toolchain would also need to be checked if one wants to substantially increase the trust in the software. It is just not doable anymore. We need to trust our developers to do the Good Thing(tm). Thus we develop in public. This gives a somewhat increased probability that malicious code can be kept out.

Last, I have to ask why the authors suggest compiling the software yourself, only to then run it on a closed-source, non-verifiable OS, delivered by a company which has a secret spying partnership with the NSA.

The article goes on to describe the problems they experienced compiling TrueCrypt for Windows. This requires the use of Visual Studio 2008, another assembler, and even an extra 20-year-old C compiler. All of them are proprietary and would thus be able to insert all kinds of spying code into the resulting executable. For 64-bit Windows the authors finally suggest it is better to use pre-compiled TrueCrypt drivers.

Isn't it like protecting the gate to your town with barbed wire and expensive locks but hiring the Daltons to guard the fence?

The tragedy of GNU copyright assignments

27 November 2012 3:00 PM (gnu | devel | libgcrypt)

or do we GNU hackers really have the freedom we demand from others?

Free software is about sharing code and thus helping others. Why are GNU hackers supposed to help others with their code but required to reject help from others? Is this what free software is about? I doubt it.

These questions arise due to a GNU policy of requiring copyright assignments for some GNU software. There are no clear rules about which packages need this legal paper exchange, but most of the early GNU software does (Emacs, gcc, libc, coreutils, Guile).

The official position of the FSF on the requirement of copyright assignments is explained in a short article by Eben Moglen. It is commonly known that there are two main reasons why one should consider assigning the copyright to a single trustworthy entity.

The first one is about legal security: whether it is possible to go after and stop copyright violations. Well, that is the theory. My experience with GnuPG and Libgcrypt seems to show that the FSF does not care too much about it. For example, at least two companies in Germany sold crypto mail gateways with OpenPGP support provided by GnuPG; they neither released the source code nor told the customers about their rights. The FSF didn't act upon my requests to stop them violating its (assigned) copyright on GnuPG. This is in contrast to the FTF as run by the FSFE and the gpl-violations group. However, with the FSF holding the copyright, those organizations have no way to go after such copyright infringements.

The second reason for copyright assignments is to be prepared for future re-licensing. This is actually the most compelling one. With distributed copyright it would be a lot of work, and often simply impossible, to change the license of a piece of software, even if all contributors agreed. Exceptions like the recent VLC re-licensing are quite rare. A valid question is whether there is a need to change a license at all. With regard to the GPL, I see two cases. The first case is fortunately a minor problem because only a few projects are affected by it: upgrading from GPL version 2 to version 3. Usually this is easy, because most GPL projects use the “or any later version” option. Those sticking to GPLv2-only, like Linux or OpenVAS, are really troubled and would either need to seek the agreement of all contributors or rewrite the GPLv2-only parts of the code.
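
For reference, the whole upgrade option hinges on one phrase of the standard license notice: “This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.” Drop the final phrase and you are locked into version 2.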

The second case is about relaxing the conditions of the license, for example from GPL to LGPL. This might be justified for better general interoperability or to help free software projects with GPL-incompatible licenses. In the GNU project such a change happens quite seldom. Technically it would be easy to do thanks to the copyright assignment policy. However, within the GNU project it is more than hard to convince the FSF decision maker(s) that such a change benefits the GNU project. My impression is that to the FSF it is far more important to protect the GNU project than to exercise the actual freedom of helping others. Something akin to: “Let us build a high fence so we are free from proprietary software on our pasture. Why care about the lone hackers outside who don't want to seek shelter behind our fence? After all, only we are the good ones.”

Case in point: the GNU towers once declared Libgcrypt to be the standard GNU crypto library. As a core GNU project it was clear that we needed to collect copyright assignments. We even started with a special copyright assignment to declare Libgcrypt an independent project from GnuPG (which is its origin). So now we have a lot of cipher algorithms where we are sure of the code's provenance, i.e. everything has been assigned or disclaimed using a lot of snail-mailed paper. The drawback of this policy is that we had to implement everything ourselves, even though a lot of highly optimized LGPLed cipher code was already available. We were simply not allowed to use it.

Some years later Nettle was put together as a collection of freely available algorithms and some new glue code. Its author did not care about copyright assignments and was thus able to use better optimized code. Nevertheless, Nettle was declared a GNU project, and GNUTLS (GNU's answer to Heathrow^W OpenSSL) eventually switched to Nettle due to a 10% better performance figure in some areas. The funny thing is that GNUTLS itself requires copyright assignments.

Now, why should Libgcrypt require assignments if the GNU project does not care about such dependencies? A reason might be that Libgcrypt provides a fallback in case there is ever a legal problem with Nettle. I consider this a purely theoretical point because basically both do the same thing, and if the suits want to go after Nettle, I see no reason why they should not also go after Libgcrypt. Copyright is no longer the sharpest weapon they can use; patents and trademarks are more dangerous than SCOing.

Back in April, I concluded that something needed to be done, at least for Libgcrypt. What I then did was pretty straightforward: Libgcrypt now accepts contributions after having received a simple mail (right, e-mail, no transatlantic snail mail) with a Developer's Certificate of Origin, as known from other projects. Voilà, patches with new features and performance improvements started to come in. Obviously, hackers who were afraid of the paperwork, or of assigning their work to an organization grown obstinate with age, gained interest in this GNU software and started to help out with their experience and time. I really like this outcome; let's hope it lasts.
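
For Libgcrypt this works like the sign-off procedure known from the Linux kernel: the contributor certifies the Developer's Certificate of Origin by adding to each patch a line of the form

Signed-off-by: Random J Developer <random@developer.example.org>

(the name being the placeholder example from the kernel documentation).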

Sure, other GNU projects with an assignment policy are no longer able to freely copy and paste code from Libgcrypt. But then, they can't do that with Nettle code either. For the sake of the greater free software community I consider this a minor disadvantage compared to Libgcrypt vegging out as a too-closed GNU project.

A textbook on physics from 1902

30 August 2012 4:56 PM (paper)

While browsing my shelves I found several old books where the paper is slowly dissolving because it is not acid-free. The one I would miss most is a 3-volume work on mechanical engineering from 1919. It is quite large and will take a lot of time to archive digitally.

To see whether it can be done in a reasonable time, I did a test with a textbook on physics, which apparently was used by my grandfather more than a hundred years ago. It is a thin book, thus I used a standard flatbed scanner. That would not work with more voluminous books, but here it worked reasonably well. The paper was on the verge of turning unreadable anyway, so even a bad scan was better than no scan at all.

If you are interested in a German textbook on physics for schools from 1902, you may read or copy it from
http://grundzuege-der-physik.eifzilla.de.

Beta release numbering with GIT

17 August 2012 9:31 AM (devel | git)

Given that building software from source code repositories can turn out to be a difficult task for non-developers, the distribution of tarball snapshots is often a sensible thing to do.

The question is how to identify such a snapshot release. My take on that has always been to use the planned version number and suffix it with a string identifying the version as a beta release. With a centralized revision control system this is quite easy: you can use a suffix like "-svnNNNN", with NNNN being the revision number. The nice thing is that you still have a monotonically increasing revision number; that is, the user can easily see that "foo-3.4-svn500" is newer than "foo-3.4-svn480".

With a decentralized revision control system this changes. There is no global revision number anymore, and any procedure to make one up would inevitably reintroduce some centralism. However, with GIT there is a way to mitigate that. It makes use of the fact that we should have release planning anyway and that releases are tagged in the repository. The "git describe" command may then be used to come up with a revision number:

git describe --match 'foo-[0-9].*[0-9]'

The output might be "foo-3.4-81-g1234567". We are only interested in the third field, which is the number of commits since the commit tagged "foo-3.4". We use this as a beta revision number.

Now, to pour that into code we need some M4 magic for autoconf. We need to construct the version number at an early autoconf stage, so that it is a static string as far as autoconf is concerned. Autoconf is based on M4, which is not that vintage German camera but a macro language developed by Brian Kernighan and Dennis Ritchie. As with all macro languages it is not easy to use, but it is nevertheless very powerful. We don't use M4 directly but through the wrapper macros autoconf provides.

At the top of configure.ac you need to define a new M4 macro. If our next release is to be 3.4, we write:

m4_define([my_version], [3.4])

and that is the only place you need to change for a new version number. Everything else works automagically using the next few lines of code:

dnl Get the abbreviated commit id of the current HEAD.
m4_define([git_revision], m4_esyscmd([git branch -v 2>/dev/null \
           | awk '/^\* / {printf "%s",$3}']))
dnl The first 4 hex digits of that commit id as a decimal number.
m4_define([git_revision_dec],
          m4_esyscmd_s([echo $((0x$(echo ]git_revision[|head -c 4)))]))

This puts the GIT commit id into the M4 macro git_revision, and an abbreviated, decimal variant of it into git_revision_dec. These two macros are not actually required for our scheme, but they are often useful for other purposes. For example, git_revision_dec can be used as the build number on Windows platforms.
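
As an aside, here is a minimal sketch of how these values could be made available to the C code. AC_DEFINE_UNQUOTED is standard autoconf, but the names BUILD_REVISION and BUILD_REVISION_DEC are made up for this example; the calls would go somewhere after AC_INIT:

AC_DEFINE_UNQUOTED(BUILD_REVISION, "git_revision",
                   [GIT commit id used to build this package])
AC_DEFINE_UNQUOTED(BUILD_REVISION_DEC, git_revision_dec,
                   [Commit id as a decimal number, e.g. for a Windows build number])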

dnl Compute the beta suffix: "-betaN" if we are N commits past the
dnl latest release tag, or the empty string if we are exactly on it.
m4_define([git_betastring],
           m4_esyscmd_s([git describe --match 'foo-[0-9].*[0-9]' --long|\
                        awk -F- '$3!=0{print"-beta"$3}']))
m4_define([my_full_version],[my_version[]git_betastring])

This is the actual code to extract the revision suffix and build the full version string. It is pretty straightforward: "git describe" is called and the number of commits since the matching tag is extracted. If that number is not zero, the beta suffix is appended to the version number. The final step is to change the AC_INIT call to something like this:

AC_INIT([foo],[my_full_version],[foo-bugs@example.org])

Now, after running autoconf you will get a suitable version number. Run the usual "configure" and "make distcheck" to create snapshot tarballs named, for example, "foo-3.4-beta81.tar.bz2".

If you are finally ready for a release and have made your last commit, you merely need to add a tag

git tag -s foo-3.4

and run "autoconf", "configure" and "make distcheck". Et voilà, your "foo-3.4.tar.bz2" is ready. If that works well, you may now push the tag and then update my_version to 3.5 to start a new development cycle.

Identity theft by clang

10 August 2012 5:31 PM (devel | gcc)

For some time now I have been receiving bug reports against Libgcrypt and GnuPG claiming that there is a bug in the code (surprise). Given that both projects have been built on a wide range of platforms using many different C compilers, I was amazed that clang was still able to find yet more flaws in the code.

On closer inspection it turned out that clang pretends to be gcc! For example, clang 3.1 claims to be gcc 4.2.1:

if (!LangOpts.MicrosoftMode) {
  // Currently claim to be compatible with GCC 4.2.1-5621, but only if we're
  // not compiling for MSVC compatibility
  Builder.defineMacro("__GNUC_MINOR__", "2");
  Builder.defineMacro("__GNUC_PATCHLEVEL__", "1");
  Builder.defineMacro("__GNUC__", "4");
  Builder.defineMacro("__GXX_ABI_VERSION", "1002");
}

Well, I would not complain too much about it if clang were really compatible with that gcc version. But it is not even compatible with gcc versions that are more than 10 years old. One example is that clang does not grok the gcc feature of defining extern inline functions. Certain inline asm code does not work either.
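
To illustrate what breaks, here is a minimal, made-up sketch of the GNU89 idiom (not code from our tree):

/* add1.h -- the GNU89 "extern inline" idiom.  gcc in its
   traditional mode uses this definition for inlining only and
   never emits an out-of-line symbol; the real definition lives
   in some .c file.  A compiler that defines __GNUC__ but follows
   C99 inline semantics instead emits an external definition from
   every file including this header, which ends in duplicate
   symbols at link time.  */
extern inline int
add_one (int x)
{
  return x + 1;
}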

Clang is praised a lot for being able to compile all kinds of stuff with better performance than gcc. However, it often does this only by claiming to be gcc and hoping that things work out.

I consider this default behaviour of clang an impolite act against the free software community. Do they really want us to change existing code to

#if defined(__GNUC__) && !defined(__clang__)

? I have been hacking C for more than a quarter of a century, but I can't remember compilers stealing another's identity.

Clang folks, please stop defining __GNUC__ by default.

The perils of contract-driven free software

26 June 2012 10:15 AM (devel | en)

Over the years I have been able to do a lot of hacking while working on contracts to extend GnuPG. This contract work allowed me to continue my regular work on GnuPG on my own time; a kind of collateral benefit. If the ordered extensions are mainly interesting to the client, there is no big problem: the customer gets what he wants, the public may use the new features if they are useful to them, and I can make a living.

A problem occurs if extensions are ordered which are directly useful to the public or have even been requested for a long time. In such cases I often had to delay work on these features, either to wait for a customer or to settle the development contract. That is because most clients don't like to pay for stuff which has already been implemented. If the negotiations take long (sometimes more than a year), it is annoying for the public to have to wait so long for new features, even though they sometimes amount to only a few days of work. The problem is that a particular feature is part of a longer list of features a client has asked for. Striking out work item after work item during the negotiations, because they have been implemented in the meantime, might be good for the client but not for me. It may even make the client believe that he only needs to talk for a few more weeks to cut down the price even more.

Thus my conclusion is that development contracts done in the standard IT business manner don't work well in the free software world. They don't foster development but destroy the very benefit we have with our “community”-driven style of development: creativity and the avoidance of features introduced merely to justify a higher price.

So, how can professional free software development be done without falling into this pit? I have no clear answer. As a prerequisite, clients need to understand that they benefit from all development, whether done by a contractor or independently by someone else. Ideally they should donate to a global project fund to support general development. Individual contractors may still be hired for quality assurance tasks and to oversee the development of features required by the client. It will be hard to work against fixed deadlines; but how often have such deadlines been kept anyway?

The STEED Self-Signing Nonthority

14 June 2012 6:26 PM (gnupg | steed | en)

Recently GnuPG got its feet into the CA business. With 2.1beta3 the default installed certificates include this one:

      S/N: 01
   Issuer: /CN=The STEED Self-Signing Nonthority
  Subject: /CN=The STEED Self-Signing Nonthority
 validity: 2011-11-11 00:00:00 through 2106-02-06 00:00:00
 key type: 1024 bit RSA
key usage: certSign crlSign

Huh, what is that? Only 1024 bit: isn't that insecure? Expiring in 2106: why that? Well, it gets even worse: we also distribute the corresponding private key in the source tarball at tests/68A638998DFABAC510EA645CE34F9686B2EDF7EA.key.

Uh, that’s crazy. Not really: it is just another arbitrary certificate, and GnuPG does not trust it unless the user confirms that it is a trustworthy root certificate. In fact, any S/MIME mail may contain self-signed certificates; they are actually quite common. So, what is special about this certificate?

A closer look shows that it features an uncommon signed attribute: 1.3.6.1.4.1.11591.2.2.2, also known as wellKnownPrivateKey (this OID is below the GNU arc). GPGSM (GnuPG's S/MIME tool) recognizes this attribute, skips the check for trusted root certificates, and returns a not-trusted error when operating in the standard validation models (shell and chain). Only in the steed validation model does it not return an error; this is because that model does not care about the certificate chain but bases its validation result solely on the fingerprint of the certificate. In the steed model the whole certificate rummage is only used to convince existing software that it sees a real certificate. This allows existing software to be used to store and transport such certificates. In GnuPG, this special root certificate makes it easier to handle certificates for the STEED system; without it, GnuPG would need to make sure that all certificates used by the STEED system carry a special attribute identifying them as STEED certificates. The solution with a special root is a bit cleaner. It also pokes some fun at the existing public PKIs.
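
By the way, if you want to look at the certificate yourself on a 2.1 installation, selecting it by its fingerprint should work; for example:

gpgsm --dump-cert 68A638998DFABAC510EA645CE34F9686B2EDF7EA

(gpgsm --list-keys gives a shorter listing.)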

In case you wonder whether there is something special about the private key of this nonthority, check for yourself: build Libgcrypt and run tests/prime --42.

Yet another attempt to keep a blog

14 June 2011 2:00 AM (en)

I have always preferred static web pages, but it might now be worth checking out whether GIT and Guile can change my mind. Let's see whether Andy's tekuti makes the difference. I plan to post short blurbs once in a while. For extra fun, and as an incentive to get away from Legacy IP, this server has only an AAAA record in the DNS.