Nested Lookahead Redux

(The TLDR: Nested lookahead was always broken in legacy JavaCC. This is finally fully addressed (as of 11/26/2022) in JavaCC 21. However, since this fix has a lot of potential to break existing code, for now, the fix is only in effect if you put LEGACY_GLITCHY_LOOKAHEAD=false at the top of your grammar(s). Meanwhile, you should …


I shall tell you my plans but then I shall have to kill you, Mr. ~~Bond~~ Van Bruggen

Legacy JavaCC 21 Posts / By revusky

There is a scene that repeats itself in various James Bond films, usually towards the end. The archvillain has captured Bond. However, instead of just killing him, he has to do it in some very creative, slow manner, using some sort of Rube Goldberg contraption. Meanwhile, he tells Bond gleefully what his plans are for …


Some Niggling Whitespace Issues

Legacy JavaCC 21 Posts / By revusky

A very annoying detail in computing (annoying in particular because the whole thing seems so utterly pointless) is this whole issue of line endings in text files. In Microsoft Windows, the operating system that dominates the desktop, a line ending is denoted by "carriage return" followed by "line feed", or CR-LF for short. (Namely the …

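As a rough illustration of the line-ending issue, here is a minimal sketch of normalizing CR-LF to a bare line feed before further processing. The class name `LineEndings` and the method `normalize` are hypothetical names chosen for this example; they are not part of JavaCC 21, and this is not necessarily how the tool itself handles the problem.

```java
class LineEndings {
    // Hypothetical helper: replace every CR-LF pair with a single LF.
    // Text that already uses bare LF is returned unchanged.
    static String normalize(String text) {
        return text.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String windowsText = "first\r\nsecond\r\n";
        String normalized = normalize(windowsText);
        // After normalization, no carriage returns remain in this input.
        System.out.println(normalized.indexOf('\r') == -1);
    }
}
```

Normalizing once, up front, means later code only ever has to deal with one line-ending convention.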

Token Chaining, JavaCC’s Dark Underbelly?

Legacy JavaCC 21 Posts / By revusky

♫ I can still hear you saying ♫
♫ You would never break the chain ♫
♫ Never break the chain… ♫
Fleetwood Mac

What we frequently see with JavaCC are cases where the original authors were actually quite perceptive: they anticipated certain problems and put in some provisions to deal with them, but unfortunately, …


Tokens, Then and Now

Legacy JavaCC 21 Posts / By revusky

I recently completed a major refactoring of how the generated Token API works in JavaCC 21. Though this constitutes a quite revolutionary set of changes under the hood, it may not actually be all that noticeable to many (or most) users. I think there are basically two groups of users (not mutually exclusive) who would …


JavaCC 21 now has assertions!

Legacy JavaCC 21 Posts, Tips and Tricks / By revusky

(Note that this article is largely superseded by the one here.)

JavaCC 21 now has assertions. There are actually two kinds of assertions:

- The assertion condition is expressed in Java code.
- The assertion condition is a lookahead expansion.

The first kind of condition looks like this:

    ASSERT {someCondition()}

Optionally, the assertion can have a message, …


Announcement: JavaCC 21 includes parsers for the latest Java and Python versions!

Legacy JavaCC 21 Posts / By revusky

As the title says: as of this writing, the included grammars for Java and Python both support the latest versions of their respective languages, JDK 17 and Python 3.10.

Contextual Keywords

It turns out that, from a language support perspective, the only new stable feature in JDK 17 is the concept of sealed classes. This …


Context-Sensitive Tokenization, Next Installment, Activating and De-activating Tokens

1 Comment / Legacy JavaCC 21 Posts / By revusky

Sometimes, when you complete a major code cleanup, features that were previously pie in the sky become low-hanging fruit to pluck. The new feature that I describe here, the ability to activate and deactivate tokens, is such a case. It resulted from my rewriting of the lexical code generation that I describe here. In an …


Major Milestone: The Lexical Code Generation is completely rewritten

1 Comment / Legacy JavaCC 21 Posts / By revusky

Dear Readers… This blog has been dark for about two months now, but not because I was inactive in the project. Quite the opposite, actually. What happened over the last couple of months is that I went into full-blown obsessive mode and managed to rewrite the remaining part of JavaCC that had been resisting my …


The Dreaded “Code too large” Problem is a Thing of the Past

Legacy JavaCC 21 Posts / By revusky

"It’s too big! It doesn’t fit!" The above does not refer to any particular pornographic feature film, but rather, to a longstanding problem in JavaCC: if you write a very big, complex lexical grammar, the generated XXXTokenManager would fail to compile, with the compiler reporting the error: "Code too large". Well, this has now been …
