There is a scene that repeats itself in various James Bond films, usually towards the end. The archvillain has captured Bond. However, instead of just killing him, he has to do it in some very creative, slow manner, using some sort of Rube Goldberg contraption. Meanwhile, he tells Bond gleefully what his plans are for world domination or whatever it is.
And so you see, Mr. Bond, once I have become supreme ruler of the entire planet, nobody will ever kick sand in my face at the beach again!
Or something like that...
But surely you know this clichéd plot device, Dear Reader. The villain has to keep Bond alive a bit longer because he needs somebody to tell his plans to. And that character weakness gives Bond the extra time to turn the tables and thwart the villain's dastardly plans, saving the world with a few seconds to spare.
So what does this have to do with open source software, you might be wondering.
Well, really, nothing at all! It's just that sometimes it is easier to illustrate a concept with a negative example. You see, open source software is not a world with secret, devious plans hatched by criminal masterminds. Everything happens in the open! All the changes to the code are committed to the version repository (typically Git nowadays) and it is all visible, at least to anybody who cares to look! Discussions also tend to be public. Well, some discussion could be in private email, but generally, the use of a publicly visible mailing list or forum is preferred. Nowadays, I suppose that mailing lists are a bit retro, but that is hardly the point. I'm just saying that both the ongoing development and discussion thereof tend to be open and public.
Thus, for example, I can look at the various discussions in which a project that uses JavaCC heavily -- in this case JavaParser -- addressed the question of the state of JavaCC and the desirability of switching to something else. Consider this discussion from late summer of 2018, entitled Consider migrating away from JavaCC. This is issue #1813 in the JavaParser project and, as I write these lines, some 4 years later, the issue is still "open".
The conversation begins with Federico Tomassetti advocating their migrating to a "better maintained" version of JavaCC. This is a "fork" created by one Philip Helger called ParserGeneratorCC. For completeness' sake, I think I should point out that Tomassetti's initial post in the thread contains a very basic misstatement of fact. He writes:
Some metrics: JavaCC had no commits in the last 10 months. That is roughly the age of ParserGeneratorCC which has 900 commits. For this reason it seems to me a project that could offer better support and perhaps better features in the future.
He claims that ParserGeneratorCC has 900 commits to the code repository at that point in time. However, it seems that Freddy does not understand that when you create a fork on GitHub, it inherits the entire history of the project that was forked. Thus, most of those 900 commits were commits made to the original JavaCC project. Out of curiosity, I went back and counted the commits. In the initial 8 months after creating his "fork", Helger did 136 commits. Granted, that doesn't sound so bad, but it is a lot less than 900. However, more importantly, of those 136 commits, at least 90% of them are just fluff. So, Tomassetti made a very basic mistake in his "metrics". Well, we all make mistakes. However, it does not seem like this is a one-time lapse. It does not seem that Tomassetti has the wherewithal to evaluate the state of an open source project. That, or he is careless about the hard facts of a situation to an extent that is somewhat worrisome.
Now, let me make clear one point: I was not involved in this conversation and I did not know any of these people at that point in time. In fact, at that point I myself was nearing the end of a decade's absence from coding. (I had some interaction with these people later. Unfortunately.)
What happened then is that one of the JavaParser collaborators, Danny Van Bruggen, screen name Matozoid, responds quite negatively to Signore Tomassetti's proposal. He replies:
I think there is a new version of JavaCC being worked on, and that is our upgrade path. I'd rather have a cool new version than a pretty but outdated fork. Moving to ANTLR would be preferred though.
Danny's rejection of the Helger "fork" as an "upgrade path" for the JavaParser project was actually well founded. Indeed, the fork had (and has) precious little to offer. When I made the acquaintance of Helger over a year after this, naturally I looked at his "fork", trying to figure out what he had done in the 2 years (2 years at that point, more like 4 years as I write these lines) since he had forked. I came to the conclusion that his "fork" surely did not embody more than a few days' work. My conversation with Helger became surreal quite quickly. He told me (without a trace of embarrassment) that the purpose of his fork was not to actually ever implement any new features. Or, in other words, not only did he have no roadmap for further development, but very consciously did not!
Regardless, Van Bruggen's claim that there is a new version of JavaCC being worked on, and that is our upgrade path is rather unsettling. Any examination of the commit record of the legacy JavaCC project would have led to the same utterly dismal conclusion. To quote the good doctor McCoy in Star Trek: He's dead, Jim. Somehow, Danny boy thinks that the JavaCC "team" or "community" is working on a "cool new version" but there is no sign of this to be seen in the commit record or in any publicly visible conversations on their website. I say it is rather "unsettling" because it makes one wonder about this person's connection with objective reality.
There was a bit more back and forth in the conversation in the following days. Four days after Tomassetti opened the issue, on 2 September, Danny writes quite firmly:
It's not an option: the upgrade path for JavaCC is JavaCC, not that fork. I expect major new features from JavaCC, and moving to a fork could make that harder. What would it even add? It's nice that the code is refactored, but when the JavaCC is out they can start all over since they failed to work with the project owners. Yes, I know they are impossibly secretive, but the fact remains that the code bases are split. And the fork guy is really nice, but I really don't know why he's wasting his time on it.
(All the emphasis in the above-quoted text is mine, by the way.)
I think it's worth breaking this down, but it's hard to know where to begin... Let's see... "They are impossibly secretive". Apparently, Danny Boy had somehow convinced himself that the "project owners" were working on a "cool new version" but there is no trace of this activity because they are "impossibly secretive". So this actually does get us back to the scenario from the James Bond film. Imagine, if you will, the ostensible JavaCC "project owner" Sreeni, in the role of the Bond master villain:
Well, we would tell you our dastardly, top secret plans of what we are going to do with JavaCC, Mr. Van Bruggen, but unfortunately, then we would have to kill you.
You see, it's a funny thing to be kind of a "fly on the wall" in other people's conversations like this. I was thinking about how I would react to somebody saying something like this that shows such a disconnection with reality. "Yeah, man, those guys really are working on a cool new version but it's top secret!"
I mean, really, how is one to respond to this? I'd have to admit that diplomacy has never been my strong suit. It would depend on my mood but I can certainly imagine responding: Dude, why are you talking this shit to me? WTF is wrong with you? Tomassetti's response in the following message is more measured than that. He writes:
Out of curiosity, how come you expect new features from JavaCC? In my impression there is no development going on and no commits for almost one year now
Well, I suppose that's code. It really means: "WTF are you talking about???!!!"
On that same day, 2 September 2018, Danny replies:
Well, I honestly can't find it anymore,...
And then later, in the same message, he concedes:
Indeed there's not much going on there anymore. ...
That, however, is still a half-truth. On the face of it, that would mean that there was something going on previously but not anymore. Yeah, right. In fact, there was never anything going on there! Somehow the conversation proceeds and the very last message is Danny Boy again. Two weeks later, he writes:
I just noticed that JavaCC's development work is done on branches, not on master. You hardly ever see activity there (although there is now a new release).
This is another strange statement. Work done on branches is still visible and aside from that, any significant new feature initially developed in a branch would later be merged to master anyway, no? Well, in short, Danny boy notices there is now a new release, and also notices that said new release does not correspond to any observable development activity. So the cold, unpleasant truth is right there in front of his eyes. These people have a long history of putting out nothingburger releases. That is how they got from version 2.x (when the JavaCC code was open-sourced by Sun back in 2003) to a version 7.x now -- without adding a single meaningful new feature or fixing any of the longstanding bugs. Rather than connecting the dots on this, Danny resorts to a sort of magical thinking. Okay, they put out new releases and obviously, that means that they did something. (Otherwise...) So, they did something but he cannot point to what they did. They are secretive or they do it in branches.... (Or, as per the Beatles song, maybe they do it in the road...)
Now, to be maximally fair, Danny Van Bruggen is not the only person who seems to be emotionally committed to the notion that the legacy JavaCC is being actively developed. Elsewhere, I wrote about some of my interaction with Professor Theodore Norvell, who maintained the JavaCC FAQ for at least a decade. I was looking through that FAQ quite recently and something occurred to me. I did a quick search for the term "new feature" in that document.
The term "new feature" occurs twice. The first occurrence of the term is in response to 1.7, Are there other implementations. The (rather cursory) answer is:
FreeCC is a derivitive of JavaCC 4. It should translate most JavaCC 4 and JJTree 4 grammars. FreeCC adds some new features. FreeCC is at code.google.com/p/freecc.
FreeCC is my work. It is what I called my work on the JavaCC codebase before deciding on the JavaCC 21 renaming in early 2020. FreeCC adds some new features.
The other occurrence of the term "new feature" is in FAQ #3.20, Why are line and column numbers not recorded?.
In version 2.1 a new feature was introduced. You now have the option that the line and column numbers will not be recorded in the Token objects.
Some new feature, eh?
Yep, this wonderful new feature, introduced in version 2.1 apparently (they are now on version 7.x), allows you to throw away location info. If you set KEEP_LINE_COLUMN to false (at least the default is true!), it will not generate the 4 integer fields beginLine, beginColumn, endLine, and endColumn in the generated Token.java file. This allows you to reduce the memory footprint of a Token object by 16 bytes.
So, yes, if somebody was really hard-pressed to save on memory usage, they could use this. But here is something about this "new feature". Surely getting rid of all 4 fields so that you have no location info whatsoever is borderline insane. And that is why you never find anybody using this in JavaCC grammars in the wild. However, look, tokens typically do not span more than one line, so beginLine and endLine are almost always the same anyway. In most troubleshooting, there is not much need for endColumn. Once you know where the token begins, knowing that it ends n chars later is not really extra information, especially when you have the image field. So, you could save 8 bytes (not 16, granted) by simply not generating the endLine and endColumn fields. In fact, if an error message simply told you the line number, that would be enough information in the vast majority of cases. If you got rid of three of the four fields, you would save 12 bytes of memory per Token object. But you would still know what line number an error occurred on.
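For anybody who wants to check the arithmetic, here is a minimal sketch in Java. The class name TokenFootprint and the method bytesSaved are made up purely for illustration; they are not part of JavaCC or of any generated Token.java. The sketch only counts raw field bytes (a Java int is 4 bytes). The real saving on a given JVM could well differ, since object headers and alignment padding come into play.

```java
// Hypothetical illustration only: this class is NOT part of JavaCC.
// It just does the raw field arithmetic discussed in the text.
public class TokenFootprint {

    // A Java int occupies 4 bytes.
    static final int INT_BYTES = Integer.BYTES;

    // The four int location fields that the text says end up in the generated Token.java.
    static final String[] LOCATION_FIELDS = {"beginLine", "beginColumn", "endLine", "endColumn"};

    // Raw bytes saved per Token object when the given number of location fields is dropped.
    // Assumption: JVM object alignment and padding are ignored.
    static int bytesSaved(int fieldsDropped) {
        if (fieldsDropped < 0 || fieldsDropped > LOCATION_FIELDS.length) {
            throw new IllegalArgumentException(
                "fieldsDropped must be between 0 and " + LOCATION_FIELDS.length);
        }
        return fieldsDropped * INT_BYTES;
    }

    public static void main(String[] args) {
        // Print the saving for every possible number of dropped fields.
        for (int dropped = 0; dropped <= LOCATION_FIELDS.length; dropped++) {
            System.out.println("drop " + dropped + " field(s): "
                + bytesSaved(dropped) + " bytes saved per token");
        }
    }
}
```

Dropping two fields gives 8 bytes, three gives 12, and all four gives the 16 bytes mentioned above. The existing option only offers the all-or-nothing end of that range.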
Under what circumstances would any pragmatic person throw away all four fields?
There would be some argument, in some memory-restricted scenarios for getting rid of two or three of the four fields but never all four. However, this "new feature" only allows you to remove all four.
So, frankly, what can be said about this particular new feature? It is about as useful as a nun's fill-in-the-blank, no? Professor Norvell maintained the JavaCC FAQ for a period of at least a decade and this is the only "new feature" that he ever needed to document apparently. Well, he also mentioned that my work (called "FreeCC" at the time) also had some new features, but he did not feel there was any onus on him to tell anybody what they were.
Now, getting back to Van Bruggen, a.k.a. Matozoid, a.k.a. Danny Boy, I did not interact with him until May of 2020, nearly 2 years after the above-linked conversation in which Van Bruggen seems to think, as of mid-2018, that despite all objective evidence, JavaCC is an actively developed project. I had come across another JavaParser-related discussion in which Van Bruggen expressed interest in having more fault-tolerant capabilities in JavaParser, except that unfortunately legacy JavaCC has zero concept of fault-tolerance or error recovery! So I wrote a comment there on 9 May 2020, in a discussion that was initiated nearly 3 years earlier, on 11 July 2017.
Hi Danny,
I ran across this 3-year-old conversation and thought to say some things... first of all, I don't think you can get very far on this problem with legacy JavaCC. For one thing, Legacy JavaCC has no concept of rewinding back to a previous point in the parse, say, so your range of strategies for dealing with error recovery are pretty darned limited. (Slim and none, as they say...)
I am trying to resuscitate that old JavaCC dinosaur as an actively developed project, JavaCC 21, and have been working on providing some machinery to deal with these sorts of problems. Here is something I wrote about this topic, just a couple of weeks ago: https://parsers.org/2020/04/23/fault-tolerant-parsing-progress/
And then there is a new feature that I added as a result of my fault-tolerant/error recovery sort of work:
https://parsers.org/2020/05/03/new-experimental-feature-attempt-recover/
I would be quite happy if you dropped me a note at revusky@javaccNOSPAM.com since I am mostly an email guy.
I just noticed that the above note has been labelled by somebody in the JavaParser project as disruptive content. I guess I'll just let readers formulate their own judgment on that. It is true that this caused the initiation of a conversation that got rather acrimonious in which, in particular, one Roger Howell (a.k.a. MysterAitch) made a point, on each iteration, of trying to talk down to me, and I finally got openly annoyed at him; that much is true. Shortly after this conversation, Van Bruggen showed up in the legacy JavaCC community, filing an issue entitled Hostile fork of JavaCC in which he completely mischaracterizes my interaction with the JavaParser community. He writes:
Hi! For the last few days we, the people from JavaParser, have been bullied by the creator of a hostile fork of JavaCC, who has also claimed http://parsers.org . This is a friendly warning to be very cautious when dealing with this person.
No mention of what the "bullying" consists of... (Indeed, how could I "bully" these people?) But okay, he was warning them about me. "This guy has picked up the JavaCC codebase and is working on it! Watch out!" (How awful of me!) Of course, this was nearly half a year after I had resumed my JavaCC work (previously called FreeCC) and it is beyond belief that they would not have already got wind of this. (If they hadn't, it would just be a further sign of how disengaged these people are from actual development in this application space.)
Now, in terms of my own reaction to this, I automatically assumed that Van Bruggen was being utterly disingenuous. Surely, as a long-term user of JavaCC, he knew perfectly well how stone-cold dead the project was development-wise. Now I look back on that and think, based on that 2018 conversation (that I only came across later), that maybe Danny Boy really believed that legacy JavaCC really was being actively developed. Granted, we are now up to mid-2020, two whole years after Danny had been challenged to point to any ongoing work on JavaCC and could not do so. So, is it possible that he still maintained the belief that JavaCC was a legitimate, ongoing software development project? (I honestly don't know. It makes my head spin frankly...) Now, if one does finally realize at long last that the legacy JavaCC is nothing but a longstanding fraud, why would anybody think I was violating their rights? What rights? The sacred right to perpetrate a fraud? To sit on a well known project for a period of nearly 2 decades and never do any meaningful work, all the while putting out utterly fraudulent releases, taking the version number from 2.x to 7.x? Somehow, their fraudulent representation that they had an actively developed project was perfectly reasonable behavior. And then my legitimate efforts to resurrect this into a genuine active development effort make me the story's villain!
So, one wonders: in the two years that had elapsed after that 2018 conversation, had he finally figured out that the legacy JavaCC project was dead and basically a fraud? (Or had he?) Or is it possible that he believed simultaneously that the legacy JavaCC project was dead and that it was actively developed? It occurs to me that maybe Van Bruggen has mastered the mental contortions that Orwell describes in his classic novel "1984" as Doublethink. In roughly the same period of time, I wrote an email to Theodore Norvell and asked him pointedly: "Theo, did you ever notice in your ten years plus of maintaining the JavaCC FAQ that you never had to document a single new feature?" (Of course, I never received an answer to that.)
It was about half a year later that I finally wrote my Three Cheers for Norbert Sudhaus piece, which was admittedly a tad unfair -- I mean to Norbert Sudhaus, not to these bizarre individuals that I refer to as the "Norbert Sudhaus Fan Club". After all, Norbert Sudhaus's childish prank of running the victory lap in the 1972 Olympic marathon was almost certainly not inspired by malice of any sort. The behavior of the various "Norbert Sudhaus Fan Club" members, such as Van Bruggen, Tomassetti, Norvell, Howell, Helger, Francis André (and various others) is, on the other hand, just dripping with a kind of malice and resentment. (You are free to disagree, Dear Reader, but you will never convince me otherwise!)
Now, I'm aware that a lot of people view this kind of thing, the nothingburger-ism and the Norbert Sudhaus Fan Club, as just some annoyance that one has to put up with, kind of like mosquitos on a camping trip. (Yeah, I hear ya. These people are like that. What is one to do?) Finally, I write an article like this because, on consideration, I don't think that people should be allowed to get away with some of this behavior -- in particular, falsely representing that some dead project really is alive and being actively developed, when it so obviously is not. It really should be damaging to these people's reputation. Dishonesty should not be tolerated and I refuse to be browbeaten into accepting that I just have to put up with this kind of chronic dishonesty that pervades a space.
Granted, nobody has to personally like me, but is it so unreasonable to expect a baseline of honesty from people?
Well, this article is based on the (maybe pathologically stubborn) stance that it is not unreasonable...
So, that said, now it's time to get back to my coding work, which I actually find far more enjoyable than writing an article like this one.