When a language is only as good as its tools

Ruby programs can be written in Unicode, which means you can name a lambda with a real λ. Once, while working on a computer science class assignment to build a simple regular expression interpreter, I had a similar flash of playfulness. Since Java source code is Unicode as well, I decided to use a real (uppercase) lambda, Λ, as the method name for my lambda transitions, along with a few other such tweaks.
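To give a flavor of what that looks like, here is a minimal sketch. It is not the original assignment code: the class name `LambdaDemo`, the `LAMBDA_EDGES` table, and the tiny three-state automaton are all made up for illustration. The only point is that `Λ` is a legal Java method name, here computing the set of states reachable through lambda (empty-string) transitions.

```java
import java.util.HashSet;
import java.util.Set;

public class LambdaDemo {
    // Hypothetical automaton fragment: each pair {from, to} is a lambda
    // (empty-string) transition. Here: 0 -> 1 and 1 -> 2.
    static final int[][] LAMBDA_EDGES = { {0, 1}, {1, 2} };

    // Λ is U+039B, an uppercase letter, so it is a valid Java identifier.
    // Returns every state reachable from `state` using only lambda transitions.
    static Set<Integer> Λ(int state) {
        Set<Integer> closure = new HashSet<>();
        closure.add(state);
        boolean grew = true;
        while (grew) {
            grew = false;
            for (int[] edge : LAMBDA_EDGES) {
                // Set.add returns true only if the target state was new.
                if (closure.contains(edge[0]) && closure.add(edge[1])) {
                    grew = true;
                }
            }
        }
        return closure;
    }

    public static void main(String[] args) {
        System.out.println(Λ(0));
        System.out.println(Character.isJavaIdentifierStart('Λ'));
    }
}
```

One assumption to keep in mind: the compiler has to read the file with the encoding it was saved in. If the platform default is not UTF-8, passing `-encoding UTF-8` to `javac` should take care of it.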

Naturally, the code compiled and ran perfectly, and it was smooth sailing until the time came to generate the Javadoc. On the first pass, the tool failed before it produced any HTML output at all: the non-ASCII method names led to a slew of garbled errors and character encoding clashes. After tweaking the most likely command-line settings, I was eventually able to get it to generate the HTML files, but I found no clear way to get valid markup or to set a UTF-8 content encoding properly, so I was left with ugly links like ?? everywhere. It felt like I was back in the '90s, even though I was using the most recent version of Java (at the time), generics and all.

What is the point of having a language that can be written in Unicode if its surrounding tools are oblivious to anything but ASCII? I can only hope the situation will improve.