Build process improvements

Hi. The JavaCC 21 build should be strengthened by checking that the newly generated javacc parser (the new jar) can parse and build the current version of the javacc project, as a regression check. I assume each new version is built with the previous version.

When you want to improve the parser (even for a small thing like a final in a template), you cannot see the results on javacc itself, as it is generated with the old version. It would be nice to get in the IDE, in a source folder, the newly generated javacc. And once you’re happy with it, rotate it to the current version.

Well, it seems you are talking about rebootstrapping the build with the newer javacc.jar.

It’s easy to test whether you can rebootstrap. From the top-level directory:

ant jar
cp javacc.jar bin
ant clean jar test

In fact, I typically check whether this works before a commit or before putting up a new jarfile on the server. I consider the ability to drop in the newest jarfile and rebuild to be perhaps the best overall functional test available, so it would be foolish not to use it! (It is true, however, that I never formalized this as part of the build/test or anything. You’re right about that.)

As for updating the jarfile in the bin directory in git, I do that with some frequency. I just checked right now and if you run:

java -jar bin/javacc.jar

at the moment, you see that that jarfile was built on 6 May, nearly 3 weeks ago. Sometimes I would have rebootstrapped more recently than that, but it’s probably pretty rare that the bootstrap jarfile is more than a month old.

But you know, the funny thing about all this is that other comparable projects are very conservative about rebootstrapping. For example, ANTLR 4.x is not built using ANTLR 4, but using ANTLR 3. (I honestly don’t know why. For me, it would make more sense if ANTLR 4.8 is built with an ANTLR 4.7 jarfile, say, but… well… that’s their project…)

If you look at the legacy JavaCC project, their bootstrap jarfile is here:

If you run:

jar tfv bootstrap/javacc.jar

from the top-level directory of the legacy JavaCC project, you see that this jarfile was built on 16 January 2008. That already provides an indication of how utterly stagnant the project is. JavaCC 21 typically won’t build with a jarfile that is more than a couple of months old, because I implement new features, like negative LOOKAHEAD, and then, when I am confident that this works, I use the new feature internally, so you need a fairly recent version to bootstrap the current build. Of course, another way of putting this is that I regularly burn my boats, as it were!

There has been no need to rebootstrap the jarfile in the legacy project for 12 years (and it probably was not even necessary then). What does this tell us about the real rate of progress there?

So you do not automate rebuilding the JavaCC parser by itself. What will happen if you publish a buggy jar? How do you revert back (through a safe & tested procedure)?

In JTB, I first try to build the new parser, then use it to rebuild itself, then to build the test parsers, and then I execute the test parsers.
I feel more comfortable following this order because the test parsers used to provide less coverage than the JTB parser. Once I extend the test parsers I will probably reverse the order.

What about the coverage of the test parsers in JavaCC 21 against the parser itself? Regarding options combinations and productions combinations?

To tell the truth, I’m not 100% sure what you are asking me now. I don’t quite know what you mean by “rebuilding the JavaCC parser by itself”.

Generally speaking, I tend to have a very aggressive stance towards working with the codebase, so I’m always trying to move forward rather than backward. If I have introduced a bug, I try to figure out where it is and fix it, of course. And just keep moving forward relentlessly…

That said, on a couple of occasions in the last few months, I think I did re-bootstrap very aggressively and I had introduced a bug and couldn’t figure out what was going on. So I just ended up checking out the code from a few days before and going from there! And that was a setback because I then had to reconstruct what I had been doing in the previous days. These things happen. One step back, but then pretty soon, two steps forward!

I worked a bit more on my nothingburger essay recently and this section:

Maybe this explains my own philosophy towards software development. I think it’s the right philosophy. Works for me at least. You can see the results… compared to producing not a single new feature in 17 years, I think… well… enough said…

Another concern about the project structure / build files.
I’m on Eclipse.
I expect to have, as I do with legacy javacc projects, the most widely used project structure, i.e. the maven structure, with src/main/javacc, src/main/java, src/main/resources, src/test/javacc, src/test/java and target/generated-sources all as source folders, and to have the Eclipse incremental build system working smoothly with the ant builds or the maven builds, that is, the same compiler levels and reasonable definitions of errors / warnings (-> goal is to have 0 errors and 0 warnings!).
Here we have test examples that are not source folders, an inconsistent package hierarchy, a strange “build” directory that is not a source folder…
I’d like to use the ant build file only for special targets (clean, tests, jar), not for common javacc/java compiling, and rely on the IDE for the latter. Of course, for the moment no plugin handles javacc 21, so parsing must be done through the ant build file.
So now I have errors on examples in Eclipse but not in the console. Sigh.
I also have errors on potential NullPointerExceptions… Some discussions will be needed about the compiler settings, and the use of the final modifier for performance… Pff.

I’m a bit puzzled about what you say about the Eclipse incremental build. I think that the Eclipse incremental build works once you have the parser classes generated. For example, if you just check out the project from GitHub and run:

ant test

and then in Eclipse import the project from the file system, for example, what happens? I think it works pretty okay. What needs fixing?

But anyway, in principle, I’m not against somebody (you, for example) coming in and reorganizing the project somewhat along more standard lines. The build system (if you can call it that) has not changed very much since I forked the project and called it FreeCC back in 2008. The thing is that I have something that works for me and it just has never been a big priority to do anything with it. (I always have better things to do.)

I’m not against somebody coming in and cleaning it up and modernizing it. (Migrating to maven or gradle or something, for example.) But I would say this:

If somebody is going to come in and cause some upheaval, moving directories around and all that and changing how things work, in my view:

That person (you in this instance) is tacitly saying that he intends to do some real work on this – i.e. to learn the code, and take ownership of some problems and do some work.

I’m not going to accept that anybody thinks that his sole role in this project is going to be mucking around with build scripts. The fact is that I have this ant build file that is a bit over 100 lines long and it may not be great but it is very simple. Anything that replaces it should not be very much more complicated, in my view. So, nobody can define their role in this project as Mr. Maven or Mr. Gradle…

In any case, it simply makes no sense to come in and redo the build system if you are not then going to do any work on the code! Basically, people have to understand that this here is NOT a nothingburger project!

If you have not noticed, the Eclipse JavaCC plugin automatically triggers the Eclipse incremental build system when a grammar file is saved: it runs the jtb/jjtree/javacc generation(s), grabs the console output, updates the markers, then the Eclipse javac compiler compiles the generated classes. The same happens when you import / rebuild (all) / unit test the project through the Eclipse commands: it runs all the grammar generations for you. So the plugin is useful for more than just editing code.

So what I ask for is to have the ant build and Eclipse build system aligned and work in a coherent manner. I have this for the current Eclipse JavaCC plugin / JavaCC / JTB, so I ask to have it for the future Eclipse JavaCC 21 plugin / JavaCC 21 / JTB 21.

By the way, an interesting mechanism (for JavaCC / JJTree, I had it in JTB) is to write generated classes only if they have changed, saving full javac recompilation. When you have 200-300 productions in your grammar, it makes a great difference, when you fix only one.

Actually, I have not noticed this because I cannot use your Eclipse plugin. Not your plugin specifically: I cannot use any existing JavaCC plugin, neither yours nor Clément Fournier’s IntelliJ plugin, for example, since none of them supports any of the newer constructs like INJECT, which I make extensive use of.

But anyway, I was curious how much time can be saved by any of this and, to be honest, it doesn’t look like very much. Granted, a faster turnaround is better, yes, but:

If I run ‘ant clean compile’ that takes 6 seconds on my normal work machine. I wanted to know how much of that is spent regenerating all the tree nodes. (A bit over 150 files.) I mean, how long this method here takes:

The answer is 240 ms. That’s on a 2014 vintage iMac. (I don’t think many serious developers are working on a machine that is much slower than that…) Oh, and if I just regenerate the parser files without compiling, with ‘ant clean parser-gen’, that takes about 4 seconds. So not regenerating all of the node files just saves a quarter of a second out of about 4 seconds.
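For anyone who wants to repeat this kind of measurement on their own grammar, here is a minimal sketch of timing one step and expressing it as a share of the whole build. The class and method names (StepTimer, timeMillis, percentOf) are hypothetical and not part of JavaCC 21; the step to be timed is passed in as a plain Runnable.

```java
final class StepTimer {

    // Runs the given step once and returns the elapsed wall-clock time in milliseconds.
    // A single run is only a rough figure (no JIT warm-up, no averaging).
    static long timeMillis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Share of the total taken by one part, as a percentage.
    static double percentOf(long partMillis, long totalMillis) {
        return 100.0 * partMillis / totalMillis;
    }

    public static void main(String[] args) {
        // Using the figures quoted above: 240 ms of node generation
        // out of roughly 4000 ms of parser generation.
        System.out.println(percentOf(240, 4000) + "% of the generation time");
    }
}
```

With the numbers quoted above, the node files account for about 6% of the generation time.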

I think you’d find something quite similar with recompilation. Recompiling all of the little Node classes simply doesn’t add very much time.

Basically, a full rebuild of JavaCC 21 itself takes about 6 seconds on my work machine and it is hard for me to imagine that you could reduce it by more than a single second by messing with this. The difference between 6 seconds and 5 seconds would not be very noticeable and also, the time is surely quite a bit shorter when you are doing it from within Eclipse, since Eclipse is doing everything in the same JVM and can have a lot of information cached.

It’s also possible that there is some bigger difference in the code you are working on, but you would have to do a similar benchmark to what I do above. Again, it would be better to have somewhat faster turnaround, but it doesn’t look like that much can be achieved. In my view, if you can’t get anything close to a binary order of magnitude (double the speed) it is hard for me to get very excited about this.

1/ Your javacc21 project is small. 2400 lines for javacc, 2900 for java.

My projects are bigger: Uniface: 8500 & 3700; PL/SQL: 4000 & 2500 & 1000

2/ Compare things fairly: between the Ctrl-S, when you save the grammar file, and the end of all the classes compilation.

I do not mind typing the () for a production :slight_smile: . I do mind having to move the mouse to another view and type ant clean compile, or click on a launch menu, or right-click on an ant target, when I could get the same result by doing nothing :frowning:

Well, that’s a combined 5300 lines. I don’t know how many people have a much bigger grammar than that. You do, apparently, but I think 5000 lines is quite big. But, in terms of this conversation, the important issue is surely the percentage improvement that can be achieved by not rebuilding all the files every time. It doesn’t look like you can get much improvement this way. Some improvement, yes, but not anything that would make much difference to the user’s overall experience. I don’t think so.

As for typing ‘ant clean compile’, that was just how I benchmarked the difference. I don’t have to type that. In Eclipse, I can just do Ctrl-B for rebuild and it invokes the ant task to rebuild the parser and then uses its own internal compiler to rebuild the rest. The thing is that I didn’t know offhand how to measure the time, so the time comparison I gave you was for running the ant task from the command line. I mean, even with the admittedly imperfect setup now, you don’t have to rebuild from the command line!

Oh yes, you can save a lot of time, like 90%. I did it for JTB, as I said.
For each generated class, add in a comment the generated source checksum (on the first line for example); once you have regenerated the file in memory (in a StringBuilder), compute the new checksum and compare it with the one in the old file; if they match, don’t rewrite the file.
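A minimal sketch of that idea, under stated assumptions: this is not JTB's actual code, the class name ChecksumWriter and the marker text are hypothetical, and CRC32 is used only because it ships with the JDK (any checksum would do). The generated source is assumed to be fully built in memory as a String before anything touches the disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class ChecksumWriter {

    // Hypothetical first-line comment that carries the checksum of the generated body.
    static final String MARKER = "// generated-checksum: ";

    // Checksum of the generated source text (CRC32 is an arbitrary choice here).
    static String checksum(String body) {
        CRC32 crc = new CRC32();
        crc.update(body.getBytes(StandardCharsets.UTF_8));
        return Long.toHexString(crc.getValue());
    }

    // Writes the file only if its recorded checksum differs from the new one.
    // Returns true if the file was (re)written, false if it was left untouched.
    static boolean writeIfChanged(Path target, String body) throws IOException {
        String header = MARKER + checksum(body);
        if (Files.exists(target)) {
            try (BufferedReader in = Files.newBufferedReader(target, StandardCharsets.UTF_8)) {
                if (header.equals(in.readLine())) {
                    return false; // same content: keep the old file and its timestamp
                }
            }
        }
        Files.write(target, (header + "\n" + body).getBytes(StandardCharsets.UTF_8));
        return true;
    }
}
```

Because an unchanged file keeps its old timestamp, a compiler that works incrementally has no reason to recompile it. Note that this sketch trusts the checksum recorded on the first line, so a hand-edited generated file would not be detected as changed.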

And don’t tell me that you cannot do this easily with FreeMarker generation.

OK, timing is not my real issue.
What I want is for the ant compile task to give the same results as the Eclipse compile phase, in terms of compiler settings and resulting markers; that is, I don’t want to see markers in Eclipse that I do not see in the ant console output. I almost never use the latter. I use the “refresh workspace / project” option in the launch configurations to keep the Eclipse build state up to date (i.e. rebuilt) after an ant task.

Well, this is a real open source project. (A real one, not a nothingburger!) So, if you have an itch, you can scratch it. Any PR that is well motivated, I would probably merge without much pushback at all.

Understand that when I say that I don’t have any particular problem with the current setup, I’m not saying that it’s perfect and I don’t want it to be touched. I’m just saying that for myself there is no huge problem that needs to be resolved with this. I just look at this again (trying to look at it with “newbie eyes”) and what I see is something fairly close to being a vanilla Eclipse java project, isn’t it? The only special characteristic it has is basically here:

That is because, yes, you do have to generate the various files with that. (I did not write that file by hand. It was generated by some wizard in Eclipse, but it was not so easy to figure out how to do that! That was a few months ago, I think.) I’m not the kind of person who spends huge amounts of time mucking around with this sort of thing so, on the one hand, I tend to think that, yes, the whole thing can probably be improved significantly. So, if you had a PR, I would probably just merge it without arguing too much.

The main concern I would have with this sort of issue is just keeping things very simple. The ant build file is a bit over 100 lines and these various Eclipse configuration files, like .project and so forth, are very simple, very close to just being a vanilla Java project. So, the bottom line is that I am open to any improvements of this, as long as it doesn’t turn into something outrageously complex. My internal wetware has only so many CPU cycles to deal with complexity, and I want to conserve them for dealing with complexity in the actual code, not some overly complicated build configuration. I hope you understand my point…