Documenting the GlassFish v3 Schema
Once work had started on GlassFish v3, the decision was made to drop schema validation for the domain.xml configuration file. In previous versions, you could be assured that your XML was valid because we shipped a DTD to enforce it. In v3 that is no longer the case, because v3 is very different in some fundamental ways: you can add an arbitrary container to GlassFish and configure it via domain.xml. This dynamic document structure makes validation difficult at best, so validation was dropped altogether. The problem remains, however, of how to determine what the document should look like.
First things first: you should avoid editing that XML file by hand. You should be able to do everything you need via either the admin GUI console or the asadmin CLI tool. That said, neither tool necessarily tells you what values can go where. The question has come up dozens of times, even among the development team. Various teams have updated and migrated their respective sections of the document, leaving some confusion about the new schema. (My own work on the grizzly-config updates is probably the biggest offender here.) There are, to my knowledge, two different attempts to fix this.
Tim Quinn developed the first publicly available tool. You can find that here. I had my own going on locally as well. I borrowed some ideas from Tim's approach and came up with this. My version differs from Tim's in that it runs in-container as an asadmin command. This has a few advantages, the most important being that I think it's easier to use. The output I generate is also a bit more accessible than Tim's, but then Tim's was really a rough first cut. I'm not sure he's done much with it since. I, on the other hand, still get questions about the grizzly-config schema changes, so this is near and dear to my heart.
At the moment, it only generates a javadoc-like HTML output. I plan on adding a DTD and XSD option but there are some disconnects between the internal Java API used to configure GlassFish and what a user sees in domain.xml that make this not so trivial. You can find those details here. The HTML generated reflects the currently valid structure of the document and not the content. This structure changes as you add/remove modules such as JRuby support and the like. The files are generated in the config directory (<glassfish>/domains/domain1/index.html for most people).
The color scheme is a work in progress. The main blue is a bit jarring, I think, but I just haven't had much time to play with such things lately. You can find sample output here. In the detail frame you'll see that some elements have properties defined. This lists all the documented properties on that element. In a perfect world, that list would be exhaustive, but that's not always possible. For example, some configuration elements such as JDBC connection pools configure third-party libraries, and capturing all possible properties there is impossible. Nevertheless, for all internal GlassFish items this list should give you much of what you need. If you find a missing property, don't hesitate to file an issue with the GlassFish tracker and we'll try to update our documentation.
If you find an issue with the doc tool itself, please file an issue and I'll do my best to correct it. This is an unofficial contribution to GlassFish, though, so bugging the GlassFish lists isn't likely to help much. I hope this tool helps you as you evaluate v3. We really are very excited about this one.
Re: Deprecation in the JDK
Everyone hates the deprecation policies of the JDK team, but it's certainly an understandable policy. Millions and millions of lines of code exist out in the wild, and there's no telling what calamity would befall us if methods were actually removed from the JRE libs. On the other hand, people still use those deprecated methods despite their failings and the presence of better options. I was reading Joseph Darcy's blog entry on this policy and ran across an interesting comment.
The idea is pretty simple: make compiling fail against the deprecated methods without actually removing them from code. This preserves binary compatibility while preventing any new code from compiling against these methods. This would actually include new releases of applications written before the method was deprecated as these new versions would no longer compile while the previous releases would still run.
This seems to be the best of both worlds. Building on the idea from that comment, I'd offer this enhancement: introduce an @Obsolete annotation. This would supplant the @Deprecated annotation on these methods. Any usage of a method or class (or field?) with this annotation would produce an error at compile time. Reflective access to these items would also be disallowed just to drive the point home. Of course, the annotation would need to include a way to describe the alternatives to use in place of these old methods, providing the same information the javadoc comment does today. Given this simple addition, I think we'd finally start to make some progress on modernizing our code.
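To make the proposal a little more concrete, here is a minimal sketch of what such an annotation might look like. Everything in it is hypothetical: no @Obsolete annotation exists in the JDK, the `replacement` element and the `Legacy` class are invented for illustration, and the `describe` helper only stands in for the check a compiler or reflection layer would have to perform.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class ObsoleteSketch {
    // Hypothetical annotation: nothing like this ships with the JDK.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
    public @interface Obsolete {
        // Names the alternative, much as the javadoc comment does today.
        String replacement();
    }

    // Invented example class with one obsolete method and its successor.
    public static class Legacy {
        @Obsolete(replacement = "newWay()")
        public String oldWay() {
            return "old";
        }

        public String newWay() {
            return "new";
        }
    }

    // Stand-in for the check a compiler or reflection layer would perform.
    public static String describe(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            Obsolete marker = method.getAnnotation(Obsolete.class);
            if (marker == null) {
                return methodName + " is fine";
            }
            return methodName + " is obsolete; use " + marker.replacement();
        } catch (NoSuchMethodException e) {
            return methodName + " not found";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Legacy.class, "oldWay"));
        System.out.println(describe(Legacy.class, "newWay"));
    }
}
```

In the actual proposal the error would come from the compiler rather than from a runtime lookup; the runtime retention here just makes the idea easy to demonstrate.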
Your project needs someone like me
I have a mild compulsion to clean up code. When I see mixed indentation (usually tabs vs. spaces), I start getting twitchy. When I see star imports, places where old loops can be replaced with the "new" foreach loops (thereby reducing line noise in many cases), or any of a number of different "improvements" to the code, I start itching to clean up. But there's (at least) one problem: not everyone wants their code cleaned up.
For many, such janitorial efforts hide any actual functionality changes and they would prefer such changes be made in a separate commit/task. While there's some merit to that complaint, it's also often true that time for mere janitorial work will rarely/never get allocated. So without taking advantage of opportunities, it will never get done. Unless you work on a project that values such changes, these changes will either be made alongside functional fixes or not at all.
To resolve this tension, there are things both camps can do. If your project is open source, you could implement something like The Linux Kernel Janitor Project. This project is used as one possible entry point for beginners to begin to learn the system and make some changes without necessarily having to know enough to effect major changes. This kind of project can work in a corporate setting as well, though most companies would prefer to get some "value" from their employees as quickly as possible.
If a janitorial project is not an option, then try to grant some leeway for intermittent clean up. If code isn't maintained, it can become so crufty over time that making actual bug fixes or functional enhancements can be difficult. On the flip side, those like me that like to get code into shape need to show some restraint and courtesy.
Sometimes it's just pride that rebels against having someone else muck with "my" code that leads people to object to such changes. But in many cases, drastic and pervasive reformats or "tweaks" to code can lead to impossible-to-decipher merge conflicts for people making significant changes in those files. It can be tempting (and gratifying) to do a deep clean on a file or whole sets of files, but you have to consider the impact of those changes. You might have to settle for one or two things to clean up this time and catch the rest later. At the end of the day, it's about making a working product, and if your cleanups cause delays for others as they sort out what exactly you did, then you're not exactly helping out.
Of course, prevention is the best medicine, so each project should choose a set of standards and conventions. These could include a set of libraries that everyone agrees to use. This can lead to a set of standard/best practices in the project rather than everyone choosing their own favorite XML library for their subsystem, for example. It should certainly include code style conventions. For me, I choose the Sun conventions. There's nothing magical about them. I like them and encourage their use. But whatever works for your team, just pick something and stick to it. This will help reduce the need in the future for large-scale janitorial work.
The criteria for what is the "right" way to do things can change over time. For example, the foreach loop is newer than most Java code, I'd wager. In older code you're still likely to see many while loops or old-style for loops that could be migrated to the new style. Your team's style/practice guidelines should be periodically reviewed and updated to account for the changing landscape in both language features and third-party libraries. Once any changes are agreed upon, you need to prepare yourself for those changes to start trickling in. These can be done in an organized sweep and clean up of the code or piecemeal by individuals as they find them.
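As an illustration of the kind of migration described above, here is a small sketch with the same logic written both ways. The class name, method names, and sample data are made up for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LoopCleanup {
    // Old style: an explicit Iterator in a for loop.
    public static int totalLengthOld(List<String> names) {
        int total = 0;
        for (Iterator<String> it = names.iterator(); it.hasNext();) {
            String name = it.next();
            total += name.length();
        }
        return total;
    }

    // Same logic with the foreach loop: less line noise, same behavior.
    public static int totalLengthNew(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tabs", "spaces", "imports");
        System.out.println(totalLengthOld(names)); // prints 17
        System.out.println(totalLengthNew(names)); // prints 17
    }
}
```

A change like this is purely mechanical, which is exactly why it is easy to review on its own and hard to review when buried in a functional commit.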
Either way, you're going to want people like me in there to help.
Some Post JavaOne Fun
It's Friday night. Finally back at the hotel without a meeting or party or a session to go to. What else is there to do but port benchmarks of debatable value to my favorite new non-Java language: Fan. Earlier, I was directed to this blog post detailing some performance problems with Groovy. Yes, I know the blog is old. That's not the point I want to make here. This week at JavaOne, there was a presentation doing some more performance comparisons between languages on the JVM. This one caught my eye because it's the first one I've seen in the wild that included Fan. So this got me thinking about that year-old post and the ray tracing exercise. How would Fan hold up? I decided to find out. Because there's no better way to spend a Friday night after a long conference week, right?
The Fan code isn't idiomatic (I'm not that bored tonight). It's just a quick and dirty port from the Java source to Fan. For reference, I reran the Java version and then the Fan version. This test is running on OS X and Java 1.6u13. Without further hand waving, here are the results:
time java -cp . ray 8 512
real 0m14.210s
user 0m12.443s
sys 0m1.313s

time fan tracer::RayTest 8 512
real 0m17.700s
user 0m15.832s
sys 0m0.672s
As you can see, the performance is really quite good. I'll probably play with the source over the next few days and see if I can improve it. The Fan code is pretty rough, so there's probably a fair bit that could be done to speed it up. I'll attach the source in case anyone else is interested. I have to say, though, that's not too shabby at all.

The Kindle is Here!
I finally broke down and bought a Kindle. I've been eyeballing them since the first one came out and have been daydreaming about them since the Kindle 2 pics first leaked. After reading countless previews and reviews and raves and rants, I decided it was time. Sure, it's expensive. Yeah, it's "only a single function device." blahblahblah. The fact is I love to read, I live in a NYC apartment, and I already have an entire wall devoted to bookshelves crammed full of my books. There's just not that much space here to keep buying more and more books. And my library hardly ever has what I want on the shelf. When they have it at all, there's a waiting list. So this makes a lot of sense for me in a number of ways. I've had it for about an hour now, so I don't have any deep dive experience with it as such, but as far as first impressions go, it's a big win. And the first thing I did after browsing through the user's guide? I bought Brandon Sanderson's latest book "The Hero of the Ages." Now if only I didn't have to work.
String Concatenation Revisited
I had intended to do some follow up numbers to my previous post but I got a bit sidetracked by work and the like. My simple tests all work with one String that's created then thrown away. This test helped me resolve the question I had when I started down that road but stops short of a more general answer. Then I saw this pingback which led me here. There's some nice analysis and insights to consider. So given the shortcomings of my little benchmark and the comments there, I wanted to expand my test a bit and see what things look like when the loop doesn't throw away the data. The test is simple enough again:
import java.util.ArrayList;
import java.util.List;

public class ConcatenationTest {
    // Builds one throwaway String per iteration with the + operator.
    private static long concat(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = "Loop " + x + " of " + count + " iterations.";
        }
        return System.currentTimeMillis() - start;
    }

    // Builds one throwaway String per iteration with a StringBuilder.
    private static long append(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = new StringBuilder("Loop ")
                .append(x)
                .append(" of ")
                .append(count)
                .append(" iterations.")
                .toString();
        }
        return System.currentTimeMillis() - start;
    }

    // Grows a single String across all iterations with +=.
    private static long concatAcrossLoops(int count) {
        long start = System.currentTimeMillis();
        String s = "";
        for (int x = 0; x < count; x++) {
            s += "Loop " + x + " of " + count + " iterations.";
        }
        long time = System.currentTimeMillis() - start;
        System.out.println("concatAcrossLoops time = " + time);
        return time;
    }

    // Grows a single StringBuilder across all iterations.
    private static long appendAcrossLoops(int count) {
        long start = System.currentTimeMillis();
        StringBuilder s = new StringBuilder();
        for (int x = 0; x < count; x++) {
            s.append("\nLoop ")
                .append(x)
                .append(" of ")
                .append(count)
                .append(" iterations.");
        }
        long time = System.currentTimeMillis() - start;
        System.out.println("appendAcrossLoops time = " + time);
        return time;
    }

    public static void main(String[] args) {
        int count = 10000;
        List<Long> concats = new ArrayList<Long>();
        List<Long> appends = new ArrayList<Long>();
        List<Long> concatsAcross = new ArrayList<Long>();
        List<Long> appendsAcross = new ArrayList<Long>();
        // Ten rounds, so the effect of JIT warm-up shows in the early rows.
        for (int x = 0; x < 10; x++) {
            concats.add(concat(count));
            appends.add(append(count));
            concatsAcross.add(concatAcrossLoops(count));
            appendsAcross.add(appendAcrossLoops(count));
        }
        String header = "concats appends concats across loops appends across loops";
        String format = "%7d %9d %22d %22d\n";
        System.out.println(header);
        for (int x = 0; x < 10; x++) {
            System.out.printf(format, concats.get(x), appends.get(x),
                concatsAcross.get(x), appendsAcross.get(x));
        }
    }
}
And then the results:
| concats | appends | concats across loops | appends across loops |
|---|---|---|---|
| 48 | 14 | 18990 | 1276 |
| 27 | 11 | 14581 | 1442 |
| 4 | 4 | 13206 | 1253 |
| 3 | 3 | 13478 | 1438 |
| 4 | 4 | 12651 | 1444 |
| 4 | 3 | 12485 | 1403 |
| 4 | 3 | 12608 | 1318 |
| 4 | 3 | 13152 | 1312 |
| 3 | 4 | 12535 | 1390 |
| 4 | 3 | 12444 | 1329 |
Notice that after the first two loops the numbers for all runs drop. As the JIT compiler kicks in, we get some optimization, but as you can see, concatenation across loop iterations is far more expensive. In this case, StringBuilder is still the clear winner.
Update
There was a typo in the original test. I was calling toString() in the appendAcrossLoops test, which was entirely unnecessary. (I forgot to remove that call when adapting from the earlier iteration.) The listing above no longer includes that call. The new results are below. I included them here rather than just replacing the table above because the comparison shows just how expensive that toString() is.
| concats | appends | concats across loops | appends across loops |
|---|---|---|---|
| 42 | 15 | 16562 | 4 |
| 5 | 8 | 12564 | 5 |
| 4 | 3 | 11601 | 2 |
| 4 | 2 | 11141 | 2 |
| 4 | 3 | 11025 | 3 |
| 3 | 3 | 11260 | 3 |
| 3 | 3 | 11062 | 3 |
| 3 | 3 | 11738 | 2 |
| 4 | 2 | 11078 | 2 |
| 4 | 2 | 11130 | 3 |
Isn't it time for GCJ to die?
What's the motivation for gcj these days? Originally, everyone wanted a GPLd JVM, so gcj kinda made sense. At least in spirit. It's never been a functional equivalent for an actual JVM, though. I've seen nothing but problems with it for years in IRC channels. Its partial implementation of the spec has led to endless confusion for uncounted newbies coming to the Java channel for help. It doesn't help that the ideologues at Debian, et al., continue to package gcj as if it were Java. Well, we have a GPLd JVM now. Everything about it is open source (or just about done...).
GCJ, as I see it, serves no more useful purpose than allowing those in charge of it to hold on to some ideal (or maybe pride). I know this is inflammatory for a good number of people, but why persist? Is it the native compilation you like? The slow, misbegotten catastrophe that it is? It's slower than running Java bytecode and seems to eliminate several key features of Java (like dynamic classloading). Even before Sun GPLd their (our? I'm a Sun guy after all...) JVM, gcj adoption was minuscule at best. So, what's the point? Can't we move on from gcj? Or at a minimum, stop packaging it as the default JVM when it's not actually a Java implementation? That'd work for me. I'm just tired of seeing newbies getting tripped up by some distro's ideological navel gazing.
Changing the Current Directory
One of the more common requests I see online from beginners (and from a not-so-beginner just now) is how to change the current directory. This one is really simple, so here's a quick snippet and the output.
System.out.println(new File(".").getAbsolutePath());
System.setProperty("user.dir", System.getProperty("java.io.tmpdir"));
System.out.println(new File(".").getAbsolutePath());
And the output:
/Users/jlee/.
/tmp/.
See? Simple.
String Concatenation Options
There's a new inspection in IDEA 8 (might just be in the EAP at this point) that will convert string concatenation to a variety of different approaches. One of these options is to use String.format(). I started applying this option to some code I'm working on because it's certainly more readable than some of the concatenation stuff I'd been doing. But I started thinking that I should probably profile this before I get too crazy with it, just to make sure I'm not hamstringing myself. So I wrote a simple test to see what the fastest option was, and I was a little surprised by the results.
First, let's see the code.
import java.text.MessageFormat;

public class test {
    // Plain concatenation with the + operator.
    private static long concat(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = "Loop " + x + " of " + count + " iterations.";
        }
        return System.currentTimeMillis() - start;
    }

    // String.format() with %s placeholders.
    private static long format(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = String.format("Loop %s of %s iterations.", x, count);
        }
        return System.currentTimeMillis() - start;
    }

    // MessageFormat.format() with indexed placeholders.
    private static long format2(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = MessageFormat.format("Loop {0} of {1} iterations.", x, count);
        }
        return System.currentTimeMillis() - start;
    }

    // Explicit StringBuilder chain.
    private static long append(int count) {
        long start = System.currentTimeMillis();
        for (int x = 0; x < count; x++) {
            String s = new StringBuilder("Loop ")
                .append(x)
                .append(" of ")
                .append(count)
                .append(" iterations.")
                .toString();
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        int count = 1000000;
        // Ten rounds of each approach; every timing is in milliseconds.
        for (int x = 0; x < 10; x++) {
            System.out.println("concat = " + concat(count));
            System.out.println("String.format = " + format(count));
            System.out.println("MessageFormat.format = " + format2(count));
            System.out.println("append = " + append(count));
            System.out.println();
        }
    }
}
This admittedly naive "benchmark" runs through four options and prints out the basic timing results. I've compiled the results below in a table:
| concat | String.format | MessageFormat.format | append |
|---|---|---|---|
| 408 | 3164 | 9099 | 376 |
| 338 | 2876 | 8559 | 340 |
| 300 | 3013 | 8655 | 398 |
| 342 | 2938 | 8511 | 311 |
| 308 | 2911 | 8570 | 310 |
| 306 | 2924 | 8726 | 320 |
| 316 | 3019 | 9006 | 414 |
| 306 | 2994 | 8673 | 331 |
| 346 | 3022 | 9588 | 311 |
| 312 | 2988 | 8590 | 313 |
As you can see, both format methods take considerably more time than the other two. I was a little surprised to see this, though the magnitude of the difference was more surprising than the difference itself. So neither of those is an option you'll want to consider in areas that get called often. The one that was really surprising for me was the relative similarity between concatenating strings and appending using StringBuilder. While StringBuilder was generally faster than string concats, the difference was rather minimal, and in some runs it was actually slower. What this says to me is that the generally accepted "wisdom" that String concatenation is slower than using StringBuilder is clearly wrong.
On the other hand, I'm not really a performance expert, so I might be doing something stupid here or missing something fundamental. The test seems rather straightforward, though. What do you think?
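One possible explanation for the near-identical concat and append columns, offered as a sketch rather than a verified diagnosis of this benchmark: javac releases of this era translate a single + expression into a StringBuilder chain, so the two loops end up running very similar bytecode. The class and method names below are illustrative and not part of the original test.

```java
public class ConcatEquivalence {
    // What the source says: one expression built with the + operator.
    public static String viaConcat(int x, int count) {
        return "Loop " + x + " of " + count + " iterations.";
    }

    // Roughly what older javac versions emit for the expression above.
    public static String viaBuilder(int x, int count) {
        return new StringBuilder()
            .append("Loop ")
            .append(x)
            .append(" of ")
            .append(count)
            .append(" iterations.")
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(viaConcat(3, 10));
        System.out.println(viaBuilder(3, 10));
        System.out.println(viaConcat(3, 10).equals(viaBuilder(3, 10)));
    }
}
```

Running javap -c on a compiled class is one way to check what a particular compiler actually generated. The picture changes when a string is grown with += across loop iterations, because each pass has to copy everything accumulated so far.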
Dealing with NullPointerExceptions
One of the most common problems that beginners run into (and some "experts," though they can handle it, I should hope) is the dreaded NullPointerException. One of the most frustrating things about the NPE is that it doesn't say what was null. All you get is a line number in the stack trace, so if you're doing a lot on that line, it's not always clear. There are various ideas about how to clean that up in Java 7. There are other solutions here and there as well, so we'll see if any of that makes it into Java 7 or not. But we don't have Java 7 yet, so none of that really helps. So here's my methodology/suggestions for dealing with them now.
First, let's list why NPEs happen. That will help you find many NPEs just by looking at the code. The most common (and obvious to the seasoned developer) is that you didn't initialize a variable. Now, local variables must be initialized so the compiler helps you out a little there. However, instance and class fields do not so you'll need to be careful there. Also, you can silence the compiler complaints about uninitialized local variables by setting them to null. Obviously, if you don't reassign these references before trying to use them, you'll get an NPE.
Another source is method return values. If you can't see the code being called, there's no real guarantee that you'll get a non-null reference back out of it. To be safe, all these values should be checked for nulls before using them. Of course, your own methods might have bugs in them such that a null return might not be obvious. Or it might be intentional. In any case, you need to be mindful of the risks of using return values.
Dealing with them is simple enough, though apparently not that obvious to a beginner. As I mentioned earlier, the exact line is given in the stack trace in the error logs or on the console. Given the discussion above, you can often spot the offending reference just by looking at that line of code. But if you do a lot of things on one line of code like I usually do, it can sometimes be less than obvious. The solution is simple enough here, too: break the line down into simpler bits. If you have chained method calls, for example, create local variables for each step and print out the results:
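The two sources described so far, uninitialized fields and null return values, can be sketched in a few lines. The class, field, and method names are invented for the example; the only library behavior relied on is that Map.get() returns null for a missing key.

```java
import java.util.HashMap;
import java.util.Map;

public class NullSources {
    // Instance fields default to null; the compiler does not force you to initialize them.
    private String name;

    // Guard the field before dereferencing it.
    public int nameLength() {
        return name == null ? 0 : name.length();
    }

    // Map.get() returns null when the key is absent, so check before using the value.
    public static String lookup(Map<String, String> settings, String key) {
        String value = settings.get(key);
        if (value == null) {
            return "<missing " + key + ">";
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<String, String>();
        settings.put("port", " 8080 ");
        System.out.println(lookup(settings, "port"));
        System.out.println(lookup(settings, "host"));
        System.out.println(new NullSources().nameLength());
    }
}
```

Without the null checks, both name.length() and value.trim() would throw a NullPointerException in this example.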
foo.bar().bob(getDoug()).dude(getCar());
This becomes:
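// Print each intermediate value; the one that prints "null" is the culprit.
System.out.println("foo = " + foo);
Bar bar = foo.bar();
System.out.println("bar = " + bar);
Doug doug = getDoug();
System.out.println("doug = " + doug);
Bob bob = bar.bob(doug);
System.out.println("bob = " + bob);
// The last call in the original chain.
bob.dude(getCar());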
We can essentially rule out the result of getCar(), as a null there would result in an NPE on another line once it gets passed into dude(). But this should be enough to highlight the exact value that is null. Once you correct that, you can recombine all that back into one line if you like. There you go. NPE found and fixed.
It's nothing fancy or complex. Just a little legwork to get you over the hump. This sort of thing becomes second nature to seasoned programmers but isn't always the most obvious to beginners. I hope it helps some of you on your way.