Sun Java Wireless Toolkit 2.5 for Linux

April 13th, 2007

The Sun WTK is finally available for Linux again: you may now download WTK version 2.5.1. The last WTK version available for Linux was 2.2; after that the WTK was Windows-only, and now, by popular demand, the Linux version is back.

The WTK is the primary tool for JavaME (MIDP) development, i.e. for writing midlets to run on mobile phones.

I’ve run the half-marathon

March 29th, 2007

Last Sunday I participated in Warsaw’s half-marathon. I ran it to the end, 21km, in just under 2 hours and 5 minutes; that is an average speed of ~10.1km/h. I was lucky with the weather: it was warm and sunny.

All in all, it was a great experience, and I’m really happy I participated. I would recommend running a half-marathon to anyone, at least once. And it’s not that hard: I had only 2 weeks of training (running in the park) before the race. For two days after the half-marathon, my leg muscles hurt a bit.

My wife waited for me at the finish line. We were expecting I’d take 2h30min, and she was surprised by my performance :)
I got a medal (everybody who finished got one), and there will also be a personal diploma mentioning my time.

And a tip for prospective long-distance runners: you may run as slowly as you want, as long as you keep running (no stopping or walking). If you get tired, or you can’t breathe well enough, slow down the pace as much as you need, but keep running: not stopping is the most efficient way to gain time.

Menstral and Javia Calculator become open source

March 17th, 2007

I have just made available the source code for my projects
Menstral and Javia Calculator under the liberal MIT License.

To get the source code you need a subversion client.
The svn repositories are here:
Menstral Source Code and
Javia Calculator Source Code.

Enjoy,
Mihai

Going open source?

March 2nd, 2007

I’m thinking about open-sourcing my two JavaME (mobile phone) projects, Menstral and Javia-Calculator.

My reluctance to open-source them until now was caused mainly by a hope to transform these projects into a commercial offering at some point. So my recent inclination to open-source them also means that I’m not planning to make any money from these projects anymore.

The main reason for open-sourcing is to allow the programs to evolve (bug-fixes & enhancements) even as I have less time (and inclination) to do everything myself. Of course I’m aware that just open-sourcing a project doesn’t automatically create a community around it, and this is the open-source challenge.

I have to choose a code repository (SVN). The choices I consider are: SourceForge, Google Projects, and self-hosted. Self-hosted is more work, but also allows for the greatest flexibility.

Considering the license: MIT or GPL? I personally like the MIT license more, as I consider it less restrictive (and much shorter) than GPL. On the other hand, GPL protects against the possibility of competitor closed-source projects freely taking advantage of the code.

Running the half-marathon

March 2nd, 2007

I’m going to participate in Warsaw’s Half-Marathon, which takes place on March 25, three weeks from now. This is the first time I’m taking part in an official half-marathon. The running distance is about 21km, and the time limit is 3 hours (runners who don’t complete the course in this time have to drop out). My main target is just to run it to the end, but as a secondary goal I’d like to do it in under 2h30.

Today I began training, which consists of running in a park not too far from where I live. So I have about 3 weeks for the training; I’ll post a table with the distance I run during this time.

Initially I wanted to run a full marathon (42km), but this is an ambitious goal for a sedentary developer like me. Compared to a marathon, a half-marathon is much easier (more than twice as easy).

Here is the starting list where I appear under the name ‘preda’, with T-shirt number 465. It looks like it will be crowded, with about 1000 runners.

Training:

Day   Distance (km)   Time (min)   Speed (km/h)
  1        5              -             -
  2        9             60            9
  5       12.5           83            9
  7       12.5           78            9.6
 10       16            103            9.3
 12       12.5           78            9.6
 14       16             97            9.9

Practice your Handwriting

February 14th, 2007

Let’s consider this hypothetical scenario:

A core-IT company is recruiting highly-skilled IT professionals. Every candidate must fill in a number of standard forms, for example:

  • non-disclosure agreement (NDA)
  • code sample agreement
  • self-identification form
  • employment application
  • bank account information
  • itemized expenses sheet

Some of these forms are accompanied by explanatory documents, which explain how and why to fill in the forms, etc. (these accompanying documents are just for the information of the candidate; they don’t have to be filled in or returned).

And now, let’s propose two realistic ways of implementation:

Scenario A

The recruiter sends an email with an attached zip file containing a mix-and-match of a dozen Word, Excel and PDF documents. The candidate (an IT professional) tries at first to fill in these documents electronically, on the computer (in order to print them already filled in), but quickly realizes that they were not designed to be filled in on the computer. They are made for ‘pencil & paper’ filling, and trying to fill them in on the computer is a pain, not worth the effort (imagine editing text in a JPEG image, to get an idea).

So that’s what the IT professional does: prints them all, and proceeds with the totally unusual activity (for an expert touch-typist) of handwriting, taking care to respect the ironic indications that appear on the documents, like this:

Name of Candidate (please print): ________

To translate, in this context please print means: please use all-capitals while hand-writing — no, it doesn’t make reference to using a printer device.

Having finished the labour-intensive activity of handwriting (potentially going through a couple of iterations), the candidate signs the documents and admires the finished work: what an interesting font I have…

Note: of course these paper documents have a penchant for redundancy: the candidate has to fill his name, address, etc, repeatedly on each document.

The candidate takes the bunch of papers with him, and presents them to the recruiter, who enters the paper-written data into the company’s recruiting information management system.

Scenario B

The recruiter provides the candidate with a ‘candidate ID’, which the candidate uses to log in on the restricted-access recruiting web site.

On the web site he finds a (single, comprehensive) web form that he fills in (in the browser, not on paper, btw). When he submits the web form, his data is registered in the recruiting information system, and a number of nice PDF documents are dynamically generated, all personalized and already filled with his data (name, address, etc).

The only thing missing on these generated PDF documents is the candidate’s hand-written signature, so he prints and signs them (talking about digital signatures in our situation would be too large a leap…)

Later the candidate presents to the recruiter the signed documents. The recruiter only has to deposit the papers in the archive, because the data was already entered in the information system (it happened when the candidate submitted the web form).

Well, that’s it. Which scenario would you bet is used by our hypothetical high-IT company, for recruiting high-IT professionals?

Logo Generator

February 7th, 2007

Today I had some fun with Python and the result is a cute icon generator (inspired by 9-block patches and identicon).

The images are dynamically generated so you’ll get a fresh set at each reload:
http://blog.javia.org/icons/medium

They are also available in large http://blog.javia.org/icons/large and small http://blog.javia.org/icons/small sizes, enjoy :)

Carnival of the Mobilists at Mobbu

January 29th, 2007

Mobbu is hosting this week’s Carnival of the Mobilists, check it out.

URL Design

January 26th, 2007

The URLs that are used to access a web application are important: they are part of the interface to the application. Here I put together some random principles which should help towards a quality URL design.

  1. Keep URLs short. Short is easy to remember, to edit in the address bar, to write down, etc. The most often used URLs should be the shortest.
  2. Don’t use file extensions (such as .html, .gif, .php, etc.). The extension is an implementation detail that doesn’t need to be presented to the user.
  3. Normalize URLs. Every entity should have one canonical URL. Have other URLs that point to the same entity use HTTP redirects to the canonical URL.
  4. Normalize the domain name to the form without the leading “www.”, because it becomes shorter. The leading www. doesn’t add information for the user.
  5. A path which can be continued with more segments should behave like a directory, i.e. should be normalized to have an ending slash.
    If http://x.org/user/john is a valid URL, then http://x.org/user should redirect to http://x.org/user/ (note the appended slash).
  6. If a path is valid, then any prefix path should be valid too.
    If http://x.org/admin/user/john is valid, then these should be valid too: http://x.org/admin/ and http://x.org/admin/user/ .
  7. Have the prefix URL allow exploration (browsing) to the possible suffixes that may follow it.
    In the case of http://x.org/country/brazil : http://x.org/country/ may provide a list of possible country names, with links.
    In the case of http://x.org/user/john : http://x.org/user/ may provide an input field for entering a user name if the number of users is too large to enumerate them all in a list.
  8. Avoid ‘query’ parameters in URLs. Instead of http://x.org/?p=101 use http://x.org/p/101 . An exception is the form-action URLs, which get the parameters appended due to the form submission.
  9. Avoid session-IDs or other long binary (opaque) identifiers in URLs. If you really absolutely need an opaque identifier in the URL, then make the information as short as possible (number of bytes) and encode it in the most compact way (e.g. base64).
  10. Try to have homogeneous levels, where all the continuations (suffix segments) of a URL are of the same kind.
    Example: If http://x.org/user/ is continued with ‘john’, ‘andy’, ‘mary’: it’s good, all the continuations are user names (nouns).
    But if http://x.org/user/ is continued with ‘john’, ‘edit’, ‘remove’, ‘102’, ‘new’: bad, because ‘edit’ and ‘remove’ are actions (verbs), ‘john’ is a noun, ‘102’ (a noun too) is an id and not a name, ‘new’ is an adjective but designates an action (add), etc. This level is not homogeneous (it’s mixed-up).
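
Principles 3 to 5 above (one canonical URL, no leading “www.”, trailing slash on directory-like paths) could be sketched in Python roughly as follows. This is only a minimal illustration: the DIRECTORY_PATHS set and the function names are assumptions made up for the example, and a real application would derive the directory-like paths from its own routing table.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of paths that can be continued with more segments.
# The empty string stands for the site root.
DIRECTORY_PATHS = {"", "/user", "/admin", "/admin/user", "/country"}


def canonical_url(url):
    """Return the canonical form of url (principles 3, 4 and 5)."""
    scheme, host, path, query, _fragment = urlsplit(url)
    host = host.lower()
    if host.startswith("www."):
        host = host[len("www."):]        # principle 4: drop the leading www.
    stripped = path.rstrip("/")
    if stripped in DIRECTORY_PATHS:
        path = stripped + "/"            # principle 5: directories end with a slash
    return urlunsplit((scheme, host, path, query, ""))


def redirect_target(url):
    """Return the URL to redirect to, or None if url is already canonical."""
    canonical = canonical_url(url)
    return None if canonical == url else canonical
```

With this sketch, redirect_target("http://www.x.org/user") gives "http://x.org/user/", while "http://x.org/user/john" is already canonical and yields None.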

It’s great if a web application has such a simple (and nicely structured) URL space that the user, after some use of the application, acquires a mental map of the URL space. From that moment on he won’t have the feeling of being lost in a huge URL spaghetti anymore. He achieves orientation: by simply looking at the URL he knows where he is and where he can go. He can even extrapolate from the observed URL structure and explore new URLs.

Google OpenID

January 24th, 2007

OpenID has two good ideas:

  • The user identifies himself using a URL
  • The same ID (URL) is used to log in to multiple web sites

Google, on the other hand:

  • Has a large number of users, and a large internet footprint
  • Uses a single ID (the Gmail account name) to log in to multiple Google services (Gmail, Analytics, AdWords, Sitemaps, Blogger, etc)

Google should realize that allowing Google users to use their same (Google) ID on other (non Google-affiliated, independent or rival) sites, while maintaining the single Google logon, is something that the users want. OpenID offers an API for doing exactly that. In this situation, I see three possibilities:

  • Google embraces OpenID: Google starts offering an (automatic) OpenID URL for each existing Google account, thus allowing the Google accounts to be used on any OpenID-enabled web site
  • Google develops its own alternative single-logon API, and fights OpenID. The Google IDs won’t be URLs in this case, which is unfortunate
  • Google does nothing and misses the single-logon opportunity

Midlet Signing

January 24th, 2007

Disclaimer: this article is in draft state. I plan to come back and improve it after checking the facts, so take it with a grain of salt.

The fact that a midlet is signed by the author (vendor) certifies two things:

  1. that the author really authored that midlet
  2. that the midlet is in original state (i.e. it hasn’t been modified by a third party)

The purpose of signing a midlet is two-fold:

  1. to gain access to security-sensitive operations on the phone (such as sending an SMS or opening a TCP connection)
  2. to make sure that a (malicious) third party can’t present a different midlet as the original one

Digital signing makes use of two concepts: public-key cryptography (a.k.a. asymmetric cryptography) and digital certificates.

Signing works like this: a secure hash (also called message digest) of the midlet is computed, and this hash is encrypted with the private key of the signer. The signature is checked like this: the encrypted hash is decrypted using the public key of the signer, and is checked for equality with a computed hash of the midlet. If the two (the signed hash, and the actual computed hash) are equal it means that the midlet signature was verified successfully.

The idea is that only the owner of the private key can sign the midlet (so that the signature is verified with the corresponding public key), and that any modification to the midlet after the signing will cause the signature verification to fail.

As you see, all that is needed to check the signature is a public key. The result of the signature verification is a yes/no to whether the midlet was signed by the owner of the private key corresponding to the public key used.
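
The sign/verify procedure described above can be illustrated with a small Python sketch. It uses “textbook” RSA with a toy key built from two known Mersenne primes and no padding, so it is an assumption-laden illustration of the idea only, not how a real phone or signing tool does it and not secure in any way. The function names are made up for the example.

```python
import hashlib

# Toy RSA key (assumption for illustration): far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1              # two known Mersenne primes
N = P * Q                                # public modulus
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (needs Python 3.8+)


def digest(data):
    """Secure hash of the data, reduced so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data):
    """Sign: 'encrypt' the hash of the data with the private key."""
    return pow(digest(data), D, N)


def verify(data, signature):
    """Verify: 'decrypt' the signature with the public key, compare hashes."""
    return pow(signature, E, N) == digest(data)
```

Calling verify(midlet_bytes, sign(midlet_bytes)) returns True, while verifying the same signature against modified bytes fails, which is the tamper-detection property described above.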

But in order to be meaningful, the result should be whether the midlet was signed by a real-world entity (the midlet author/vendor, a person or company). So the public key must be associated to the identity of some author/vendor. This association between a public-key and a real-world identity is achieved using digital certificates.

A digital certificate claims that some public key belongs to some entity (a person or a company). A digital certificate contains the pair: public key, name of the entity (who owns the key). A certificate is also signed (using the same signing procedure we’ve seen above) in order to prevent the situation that a fake certificate is built by a malicious entity which claims a different identity for its key.

The midlet is signed and the midlet signature is verified against a certificate (which contains the public key needed to check the signature, and the name of that key’s owner). This certificate is also signed, and the certificate signature is verified against another certificate, which contains the public key for checking the first certificate’s signature and the name of this key’s owner. This second certificate (which signs the first certificate) is in turn signed with another certificate. And so on, we have a chain of signatures.

This chain of signatures ends with a certificate which is not signed by any other certificate (a detail is that this original certificate is self-signed, meaning that it’s signed with its own private key and verified with its own public key, but this has the same effect as not being signed at all). The only way to be sure that this self-signed certificate is not fake is to have it in a list of trusted root certificates.

For example, your browser has a list of trusted root certificates, which contains certificates from well-known and trusted companies such as Thawte and Verisign. Any chain of certificate signatures is followed until it ends with some such trusted root certificate.

But if the final (self-signed) certificate is not found in the list of trusted certificates, then it can’t be trusted (its authenticity can’t be verified), so the whole chain of signatures can’t be verified.
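
Following the chain up to a trusted root can be sketched as a simple loop. In this hypothetical Python model a certificate is reduced to two names (subject and issuer), and the cryptographic check of each signature is replaced by a placeholder function; all names here are illustrative assumptions, not a real certificate API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str   # name of the owner of the public key
    issuer: str    # name of the entity that signed this certificate


def signature_ok(cert, issuer_cert):
    """Placeholder (assumption): a real check verifies cert's signature
    with the public key contained in issuer_cert."""
    return True


def chain_is_trusted(cert, certs_by_subject, trusted_roots):
    """Follow the chain of issuers until a self-signed certificate is reached."""
    seen = set()
    while cert.subject not in seen:          # guard against circular chains
        seen.add(cert.subject)
        if cert.issuer == cert.subject:      # self-signed: the chain ends here
            return cert in trusted_roots
        issuer_cert = certs_by_subject.get(cert.issuer)
        if issuer_cert is None or not signature_ok(cert, issuer_cert):
            return False
        cert = issuer_cert
    return False
```

The chain is accepted only if it ends at a self-signed certificate that is present in the trusted list; an unknown issuer or an untrusted root makes the whole chain fail.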

Now let’s go back from this general signing stuff to our midlet signing. Any mobile phone comes with a list of trusted root certificates (used for verifying midlet signatures). This list contains root certificates selected by the manufacturer (e.g. Nokia) and usually also contains the root certificate of the network operator (e.g. Orange) in the situation when the mobile phone is distributed by an operator.

Let’s consider a practical example: I am a midlet author. I write a midlet, and I want to sign it, what do I have to do?
First, I need to buy a certificate from some certification authority (i.e. seller of digital certificates), such as Thawte, Verisign, GoDaddy. Because I want to use this certificate for midlet-signing, I have to be careful to select a code-signing certificate (there are also other kinds of certificates, most notably SSL certificates used to certify the identity of web sites, which can’t be used for midlet signing). So I choose to buy a code-signing certificate from Thawte for $200/year (note, the price is per year, recurring, as with a domain name).

To buy the certificate, I first generate a pair of keys, private key and public key. I keep the private key secret, and I send the public key to Thawte for inclusion in the certificate. Thawte checks my identity (using perhaps some photo ID that I fax them, and a phone call from them), and afterwards sends me back a certificate containing my public key and my identity (my name). Most importantly, this certificate is signed with Thawte’s root certificate (this is what I paid the certificate’s price of $200/year for).

Using this certificate I just bought, I sign my midlet, and I start distributing the signed midlet. Now some user tries to install my midlet on his phone. The phone automatically checks the midlet signature, i.e. it follows the signature chain, starting with the midlet, then my certificate, and ending at Thawte’s root certificate. If the phone has Thawte’s certificate in its list of trusted certificates, it’s all dandy and the signed midlet is installed. But if it happens that the particular phone model doesn’t contain Thawte’s certificate in its list, the signature verification fails, the midlet isn’t installed, and the user can’t use it at all. This situation means that I paid the certification money to Thawte for nothing.

Now you may think that perhaps this example is unrealistic: surely I can buy a certificate which is recognized by all phones? Well, no. Different phone manufacturers (and even different models from the same manufacturer) have different sets of trusted root certificates. For some phones Verisign works and Thawte doesn’t, for other phones Thawte works and Verisign doesn’t, for others neither Thawte nor Verisign works, and so on. The practical solution: buy as many certificates from different certificate vendors as you can afford (all for the same public key), and use them all when signing your midlet. When your midlet is installed on a phone, if at least one of those certificates works, the midlet signature can be verified and all is good. The downside of this method: it costs a lot of money, and it generates a lot of clutter and redundant work.

What’s more: while some phone models allow the user to edit the trusted certificate list (perhaps in order for the user to add a root certificate he finds missing), many phone models do not allow it. (I can’t understand why the manufacturer would prevent the user from adding new trusted certificates; I consider this a crippleware tactic. For example, Nokia S40 2nd edition allowed user access to the trusted certificate list, while the more recent S40 3rd edition doesn’t allow it anymore.)

So, let’s turn back to my case study (and publicity plug). I developed two freeware midlets (Menstral and Javia Calculator), that I distribute free of charge. Let’s say that, for the security of my users (in order to protect them from a malicious distributor who tries to impersonate my midlets), I want to sign my midlets, so as to certify that I really am the author, and that the midlet hasn’t been tampered with. So what do I have to do? Pay $200/year for the certificate, and have my midlets fail to install on some of the phones because my certificate isn’t recognized… this is why there are so very few signed midlets.

You’d think it can’t get worse than that? Why, it can: Motorola, Nokia, Siemens, Sony-Ericsson and Sun saw there is a problem and they promptly found the ‘solution’: create a new certificate, have every phone support only this new certificate, and create a signing program which will bring them more money (and have the midlet authors pay).

This new initiative is called Java Verified. The new certificate, which is the only one supported on newer Nokia phones (see Nokia 5300 at “Java verified root certificate”) is called UTI Root (Unified Testing Initiative).

And how does this Java Verified, Unified Testing Initiative work? Simple: you pay! No, really: you send your midlet to one of the Java Verified Test Providers (like CapGemini), who runs a series of tests on your midlet. For example, they check that your application provides a Help command and an Exit command, that it doesn’t hang, that it doesn’t do anything malicious (though they don’t have access to the source code), etc. They do this by running the application a few times on a real device. If they find some problems they send you back a problem report, and (after attempting to fix them) you have to pay again for one more try at the Java Verified certification. Pay, and repeat.

How much does it cost? With CapGemini, it costs 240 euro for the first submission, and 210 euro for subsequent submissions (after failing certification on the previous attempts). This price is per midlet and per targeted device. That is, if you want your signed midlet to run on multiple phone models, you have to pay once for each phone model (see a list of devices supported by Java Verified, you have to pay for each one).
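
To get a feeling for how these prices add up, here is a small Python calculation based on the CapGemini figures quoted above. It assumes (my simplification) that every targeted device is billed independently and fails the same number of times before passing; the function name is made up for the example.

```python
FIRST_SUBMISSION_EUR = 240   # price of the first submission, per midlet and device
RESUBMISSION_EUR = 210       # price of each submission after a failed attempt


def java_verified_cost(devices, failed_attempts_per_device=0):
    """Total cost in euro for one midlet, under the assumptions stated above."""
    per_device = FIRST_SUBMISSION_EUR + failed_attempts_per_device * RESUBMISSION_EUR
    return devices * per_device
```

For one device passing on the first try this gives 240 euro; for 30 targeted devices with one failed attempt each it gives 30 × (240 + 210) = 13500 euro.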

So Java Verified considers that a midlet author should specify the targeted device for the midlet, the targeted device being something like Nokia 5300 or Nokia 6131. So much for the portability of JavaME applications, which says that a JavaME midlet should be able to run on a wide range of devices, from different manufacturers, with different screen sizes, etc. (And Sun, who should be the main pusher of java portability, is a member of Java Verified.)

For example, my Menstral midlet is supposed to run on any MIDP device with a color screen at least 128×128 pixels in size. I guess such a ‘targeted device’ counts as many tens of Java Verified devices.

So again, that’s why there are so very few Java Verified midlets…

Midlet-Vendor

January 21st, 2007

A mandatory midlet attribute, which must appear in both the jad and the jar, is Midlet-Vendor; it designates, as the name suggests, the vendor (or the author) of the midlet. One problem with this attribute is that the structure of its value is not specified in the MIDP standard: it may contain any string (including whitespace). In this way, while the Midlet-Vendor attribute may still be useful for presentation to the human user, it is of limited use for automatic processing.

Let’s consider some examples that illustrate the point:

Midlet-Vendor: Mihai Preda
Midlet-Vendor: mihai-preda

Midlet-Vendor: www.javia.org
Midlet-Vendor: myself

Midlet-Vendor: FooSoft
Midlet-Vendor: Foo Soft Inc., USA

In the first examples we see different ways of writing essentially the same vendor name, by varying case or whitespace. While it looks the same to a human reader, the vendor string is certainly different from a computer’s perspective. This makes it difficult to automatically associate the different midlets with the same one vendor. There is also a second problem with a vendor name which contains the name of the author: of course there are multiple persons with the same name (unless the author has some particularly rare name), so it may happen that two distinct authors (with the same name) end up being identified as the same vendor.

Let’s look toward a solution: we want the Midlet-Vendor to have a normal form, which allows two vendor strings to be reliably compared in order to detect whether they designate the same entity or not; and we want to eliminate the possibility that two different entities end up using the same vendor string.

The solution that I propose is to have a URL (URI) as the content of Midlet-Vendor. In order to make it explicit that it’s a URL, the scheme (e.g. http://) should be present.

Midlet-Vendor: http://javia.org/

The URL should point to a page which contains more information about the vendor; it could point to a company’s home page, to the blog of the author, etc.

One objection to this solution could be that the Midlet-Vendor is also intended for presentation to the human user (of the mobile phone), and the company URL is not as clear to a human as the company’s name. While I don’t necessarily agree with this argument, a solution would be to have the Midlet-Vendor contain a free string (the company name, intended for user presentation) and a URL, intended for equality-comparison between different Midlet-Vendors. Examples of good Midlet-Vendor strings:

Midlet-Vendor: Mihai Preda (http://javia.org/)
Midlet-Vendor: Foo Soft Inc. (http://www.foo-soft.com/)
Midlet-Vendor: http://google.com/

The important point is this: the Midlet-Vendor should always contain a URL (optionally enclosed in parentheses), and only the URL should be used for equality-comparison between different Midlet-Vendor strings.

But according to the MIDP standard, the full Midlet-Vendor string must be compared for vendor-equality, so in practice I would suggest using only a URL as the content of the Midlet-Vendor.
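
The proposed URL-based comparison could look roughly like this in Python. It is a sketch of the proposal only (the MIDP standard itself compares the full string); the regular expression, the function names and the normalization by lowercasing and dropping the trailing slash are my simplifying assumptions.

```python
import re

# A URL with an explicit http(s) scheme, ending at whitespace or a parenthesis.
_URL = re.compile(r"https?://[^\s()]+")


def vendor_key(vendor):
    """Extract the URL used for comparison, or None if there is none."""
    match = _URL.search(vendor)
    if match is None:
        return None
    # Simplification: ignore case and a trailing slash when comparing.
    return match.group(0).rstrip("/").lower()


def same_vendor(a, b):
    """Two vendor strings designate the same vendor if their URLs match."""
    key_a, key_b = vendor_key(a), vendor_key(b)
    return key_a is not None and key_a == key_b
```

With this sketch, “Mihai Preda (http://javia.org/)” and “http://javia.org/” compare as the same vendor, while “Mihai Preda” and “mihai-preda” (no URL) cannot be matched at all.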

Dreamhost

January 18th, 2007


I’ve been using Dreamhost for one year now, and I thought I’d write a review, to see how the referral system works (disclaimer: I get referral fees for everybody who signs up following a link from this page).

The lowest plan costs about $10/month, and comes with about 180 GB of disk space and 1800 GB transfer/month; this is rather huge. Looking at the status page for my account (which I share with two friends), I see that together our disk usage is at 1% of the space available, and our traffic usage never got as high as 1% (it always stays at 0%). So for practical purposes, the disk and transfer seem unlimited.

You can create up to 75 Linux user accounts, with shell (bash, ssh) access. You can host any number of domains, with any number of subdomains and mailboxes; all this free.

Dreamhost offers one-click installs of WordPress, phpBB, Joomla!, MediaWiki, activeCollab and many others.

I’ve installed and I’m using Django (Python web framework) with FastCGI, without any problem. Dreamhost also supports Ruby on Rails, although I don’t use Rails myself. For databases, DH supports MySQL (with many storage engines: MyISAM, InnoDB, BerkeleyDB, etc; InnoDB supports transactions) and of course SQLite (PostgreSQL is not offered yet, but may be added in the future if there is demand for it). You may have an unlimited number of MySQL databases. As web server, Apache versions 1.3 and 2.0 are available, with full .htaccess configuration (including mod_rewrite). PHP (versions 4.4.2 and 5.1.2) is of course available.

Dreamhost also offers Subversion (svn) that you can use either through ssh, or access with the browser. I keep my development svn repository at DH, thus taking advantage of their data security and backup (if my laptop gets stolen, or if my hard disk explodes, I won’t lose my source code). DH also offers WebDAV (mod_dav in Apache).

It is possible to compile and install at Dreamhost the applications or libraries that you need (if they are not already offered by DH); this comes in handy when developing a web application. For example, I’ve installed on my account: Python 2.5, Django (svn version), GDB, SCons, GeoIP, ImageMagick, readline, ares, and the most recent version of SQLite.

You also have low-level access to the DNS records (so you may add any DNS records on your domains: A, CNAME, MX, NS, PTR, TXT, etc). And the greatest thing: even though wildcard DNS is not officially supported, the DH support team was kind and, at my request, has just activated wildcard DNS on one of my domains (thanks DH, this is great!).

And Dreamhost offers a free domain name registration (com, org, net — which typically cost $9/year) with any account.

What’s more: on my Linux laptop, using FUSE and sshfs, I remotely mount my DH account directory as a mount point on my local filesystem. This way I can work with and edit all the files from DH just as if they were local on my laptop, which is very useful.

About the DH drawbacks: the DH servers do not seem as responsive as they could be. And an account doesn’t come with a dedicated IP included (you may add IPs at additional cost). Because of this, it’s not possible to do SSL (https://) on the hosted domains ‘out-of-the-box’ — you need to add a dedicated IP, which costs $4/month, for that.

In conclusion, Dreamhost offers a huge amount of liberty; it is as close to a VPS (virtual private server) as it gets, while keeping the cost in the low shared-hosting range.

If you’ve read this far, don’t forget to use the promo-code WILDCARD when you sign up: it will get you the one year plan (which normally costs $119.40) down to $29, with one domain registration included (isn’t that incredible?).

Update

January 14th, 2007

Since the 1st of January, 2007, Romania (and Bulgaria) is a member of the European Union. That’s great, congratulations to Romania and Bulgaria for making it. (I’m Romanian myself, and yes this EU accession does feel good).

The OpenMoko phone (FIC Neo) is now scheduled for release in February 2007. I can’t wait to get one. Its release becomes even more interesting in light of Apple’s recent iPhone announcement, as the two phones, despite many similarities, are at the two extremes of freedom: OpenMoko comes with full open source code, while the iPhone doesn’t allow installation of third-party applications and comes with a 2-year contract.

The MIDP 3.0 (JSR 271) early draft is available for download. This draft is the best way to find out now what MIDP 3.0 will bring; it should be an interesting read for anybody involved with JavaME. The most important addition seems to be the concept of libraries (which can be shared between midlets).

For the first time, I am a participant in the Carnival of the Mobilists (edition #58), with my article JAD is Bad (which argues that the JAD concept introduces un-needed complexity).

I’ve started working with Django, which is a Python web framework. A nice thing about Django is that the source code is relatively small and readable, so it’s easy to check the code out in order to see how it works or to fix things. (Reading code is also a good way to learn a programming language.) I’ve already submitted a patch regarding the append-slash redirection.

I’ve written a small Python library called Python Mobile User Agent, which analyzes the User-Agent HTTP header in order to detect whether it belongs to a mobile device (and attempts to extract the mobile’s vendor/model from the User-Agent). This is useful for server-side content adaptation, where a web application generates a different variant of a page for a mobile browser vs. a desktop browser. The library is based on Craig Manley’s perl/php MobileUserAgent library. I’ve put my library on Google code hosting (under the MIT license) in order to get a feeling of how Google code hosting works.
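
To give an idea of what such User-Agent analysis involves, here is a deliberately naive Python sketch. It is not the code or the API of the library mentioned above; the hint list, the vendor list and the function names are assumptions chosen for illustration, and real-world detection needs many more rules.

```python
import re

# Hypothetical substrings that suggest a mobile browser.
MOBILE_HINTS = ("midp", "cldc", "symbian", "windows ce", "up.browser")

# Hypothetical pattern: vendor name at the start, followed by the model.
VENDOR_MODEL = re.compile(
    r"^(Nokia|SonyEricsson|Motorola|Samsung)[\s\-/]?([A-Za-z0-9]+)")


def vendor_and_model(user_agent):
    """Return (vendor, model) if the User-Agent starts with a known vendor."""
    match = VENDOR_MODEL.match(user_agent)
    return (match.group(1), match.group(2)) if match else None


def is_mobile(user_agent):
    """Guess whether the User-Agent header belongs to a mobile device."""
    lowered = user_agent.lower()
    return (any(hint in lowered for hint in MOBILE_HINTS)
            or vendor_and_model(user_agent) is not None)
```

A web application could call is_mobile() on the incoming User-Agent header and pick the mobile or the desktop variant of a page accordingly.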

Menstral’s new feature: SMS transfer

December 7th, 2006

Menstral is a menstrual calendar (also called a cycle or ovulation calendar) application I wrote one year ago.

Menstral 1.9.0, released today, adds an interesting new feature that I’ve called SMS Transfer: it allows you to send your calendar data, through SMS, from one Menstral instance to another Menstral instance running on a different phone.

This is useful for users who change/upgrade their mobiles, as they expect to be able to continue using the applications from their old mobile on the new one (this expectation includes the continuity of the persistent RMS data).

I wonder what other mobile applications out there are using SMS Transfer. I suspect this technique will become more popular as developers realize that they have to cater for their users’ habit of periodically switching mobiles.

And one more prediction from my foreseeing bag:
The young(ish) mobile users will grow out of the pictures/ringtones through MMS thing (which never really took off anyway), as soon as they discover they can now send menstrual calendars through SMS.. wow!

(On the same funny line, a surprising observation is that a significant part (perhaps more than a half) of Menstral’s users are Men!)

Trackback

December 6th, 2006

I’d like to thank the blogs that linked to my article “Assembly Java”, a post written with a certain amount of irony, as I can’t consider its so-called hints and tips good advice myself. This was the very first publicly read article from my blog and it marked, so to say, my debut in the blogging society.

First C. Enrique Ortiz mentioned my post on his Mobility Weblog. David Beers, on the Software Everywhere blog, wrote about Optimization vs object orientation followed by a concise conclusion, pointing to the right middle ground between high-level design and low-level optimizations. He makes the point very well, [..] don’t start thinking of Java as if it were assembly language or you’ll miss out on the advantages it still holds over C for many kinds of mobile development. Edoardo Schepis questions whether is still there any Java ME Programmer Thinking in Object and warns against optimization extremists; his post is an entry on this week’s Carnival of the Mobilists. And most recently, Thomas Landspurg on the TomSoft blog writes about J2me mobile development practices.

Javia Calculator early access starts today

December 4th, 2006

My newest application, Javia Calculator,
is available for early-access download; go try it out while it’s hot.

Javia Calculator is (you guessed it!) a calculator for mobile phones (requires CLDC-1.1).

Well, one more calculator… but one that won’t get in your way when you’re trying to get the result.

It features trigonometric functions, logarithms, user-defined functions and constants, history navigation, implicit save, and an efficient input method.

JAD is BAD

December 2nd, 2006

JAD is an acronym for Java Application Descriptor. JAR comes from Java Archive.
In the MIDP world, the JAD is a small text file (with the extension .jad) which contains a few lines of information about a Midlet application.

An example JAD file:

MIDlet-1: Menstral, /M.gif, M
MIDlet-Name: Menstral
MIDlet-Vendor: Mihai Preda
MIDlet-Version: 1.8.11
MicroEdition-Configuration: CLDC-1.0
MicroEdition-Profile: MIDP-1.0
MIDlet-Jar-URL: http://menstral.net/Menstral.jar
MIDlet-Jar-Size: 31105

Noteworthy inside the JAD are the MIDlet-Jar-URL line, which indicates the location of the JAR, and the MicroEdition-Configuration and MicroEdition-Profile lines, which represent the required versions of the CLDC and MIDP that the device must support in order to be able to run the application.

One way to install applications is: the phone downloads the JAD, checks whether the required versions of CLDC and MIDP are satisfied, and if the answer is positive it proceeds to download the JAR file containing the application.
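The JAD-first flow can be sketched in a few lines of plain Java. This is only an illustration, not how any real phone implements it: the class and method names (JadCheck, parseJad, satisfies, shouldDownloadJar) are hypothetical, and it assumes each requirement is a single "NAME-major.minor" token such as CLDC-1.0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JadCheck {
    /** Parses the "Key: value" lines of a JAD into a map. */
    static Map<String, String> parseJad(String jadText) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String line : jadText.split("\\r?\\n")) {
            int colon = line.indexOf(':');   // first colon only, so URLs in values survive
            if (colon <= 0) continue;        // skip blank or malformed lines
            attrs.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return attrs;
    }

    /** True if e.g. supported "CLDC-1.1" covers required "CLDC-1.0". */
    static boolean satisfies(String supported, String required) {
        int sDash = supported.indexOf('-'), rDash = required.indexOf('-');
        if (sDash < 0 || rDash < 0) return false;
        if (!supported.substring(0, sDash).equals(required.substring(0, rDash))) return false;
        return versionNumber(supported.substring(sDash + 1))
            >= versionNumber(required.substring(rDash + 1));
    }

    /** Turns "major.minor" into major * 100 + minor for easy comparison (assumes that exact shape). */
    private static int versionNumber(String v) {
        int dot = v.indexOf('.');
        return Integer.parseInt(v.substring(0, dot)) * 100
             + Integer.parseInt(v.substring(dot + 1));
    }

    /** Decides, from the JAD alone, whether fetching the JAR is worthwhile. */
    static boolean shouldDownloadJar(Map<String, String> jad,
                                     String deviceConfig, String deviceProfile) {
        String config = jad.get("MicroEdition-Configuration");
        String profile = jad.get("MicroEdition-Profile");
        return config != null && profile != null
            && satisfies(deviceConfig, config)
            && satisfies(deviceProfile, profile);
    }
}
```

Fed the example JAD above, a CLDC-1.1 / MIDP-2.0 device would go on to fetch the URL found under MIDlet-Jar-URL.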

Another way is to directly download the JAR. This works because every JAR contains inside a special file, the Manifest, which contains mostly the same information as the JAD. That is, the phone can do without the JAD because it can get the same information from the Manifest found in the JAR.

So, if the JAD is duplicated inside the JAR (in the Manifest), and the whole download/installation can proceed very well without the JAD, what is the reason for the JAD’s existence?

Well, the idea is that the JAD is small, but the JAR is not so small. If your phone can’t run the application (because it doesn’t satisfy the required conditions specified in the JAD), it’s better to just download the JAD to find it out (and skip the bigger JAR download in this case).

This is the only gain of using the JAD/JAR pair compared to using only the JAR: you don’t have to download the (big) JAR in the situation when your phone can’t run the application.

But here comes the surprising affirmation: the JAD is actually a bad thing, and the fact that this whole JAD concept got into the MIDP standard is a flaw, and we, as midlet developers, should favor the “JAR-only” distribution, thus driving the JAD to practical obsolescence even if it is in the standard.

Here is why the JAD is BAD:

The JAD is not needed

1. The situation when the JAD saves bandwidth by avoiding the JAR download is infrequent.

Most users download midlets that run on their device much more frequently than they download midlets that don’t run on their device. This is because at the point when they decide to download a midlet, they have strong reasons to suppose that it will run on their phone. One factor is that the web server which offers the download can detect the phone’s capabilities (e.g. by using the User-Agent info sent by the phone) and can offer for download only the midlets which can run on the particular device. Another factor is that the midlets’ web page contains textual information about their requirements, such as “This midlet runs on any MIDP 1.0 mobile phone”, or “this midlet runs only on Nokia, MIDP 2.0 phones” etc, that the users can evaluate before initiating the download.

2. The JAD benefit, infrequent as it might be, is small.

Even in the (rare) situation when the JAD saves the user from downloading the JAR, it only saves a few tens of Kilobytes of bandwidth, which is not likely to make a big difference to the user. It’s not like the JAD-avoided JAR download is a must-have lifesaver feature that users simply can’t do without — quite the opposite.

3. The JAD’s benefit can be harnessed without using the JAD.

Here it gets interesting: the JAD concept isn’t needed in order to get the bandwidth saving mentioned before. The same bandwidth saving can be obtained without the JAD.
This is how:
- have the archiver which creates the JAR archive always position the Manifest file at the beginning of the archive.
- have the mobile’s downloader incrementally extract the Manifest while the JAR is downloading. As soon as the Manifest has been extracted, but before the whole JAR was downloaded, inspect the Manifest to see if the device can run the application. If it can’t, stop the download without transferring the rest of the JAR, thus saving bandwidth.
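
The two steps above can be simulated on a desktop JVM. The sketch below uses the Java SE java.util.jar classes (which are not part of MIDP, so this only illustrates the idea, not device code): JarOutputStream writes the Manifest as the first entry, and JarInputStream picks the Manifest up as soon as it is constructed, provided the Manifest sits at the start of the archive. The names EarlyManifest, CountingStream, Probe, buildJar and readManifestOnly are hypothetical, and the random payload merely stands in for the application’s classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class EarlyManifest {
    /** Counts the bytes pulled from the underlying "network" stream. */
    static class CountingStream extends FilterInputStream {
        long count;
        CountingStream(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            int b = super.read();
            if (b >= 0) count++;
            return b;
        }
        @Override public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) count += n;
            return n;
        }
    }

    /** What we learned, and how many bytes it cost to learn it. */
    static class Probe {
        String requiredConfig;  // value of MicroEdition-Configuration, or null
        long bytesRead;         // bytes consumed before the decision could be made
    }

    /** Builds a JAR whose first entry is the Manifest, followed by a bulky payload. */
    static byte[] buildJar(String requiredConfig, int payloadSize) {
        try {
            Manifest manifest = new Manifest();
            Attributes main = manifest.getMainAttributes();
            main.put(Attributes.Name.MANIFEST_VERSION, "1.0"); // without it nothing gets written
            main.putValue("MicroEdition-Configuration", requiredConfig);
            byte[] payload = new byte[payloadSize];
            new Random(42).nextBytes(payload);                 // incompressible filler
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
                jar.putNextEntry(new JarEntry("App.class"));
                jar.write(payload);
                jar.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads only as far as the Manifest, then stops without touching the payload. */
    static Probe readManifestOnly(byte[] jarBytes) {
        CountingStream net = new CountingStream(new ByteArrayInputStream(jarBytes));
        try (JarInputStream jar = new JarInputStream(net)) {
            Probe p = new Probe();
            Manifest m = jar.getManifest();
            if (m != null) {
                p.requiredConfig = m.getMainAttributes().getValue("MicroEdition-Configuration");
            }
            p.bytesRead = net.count;
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a payload of a few tens of kilobytes, bytesRead should come out as a small fraction of the archive size, which is the saving the JAD was supposed to provide.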

The MIDP JSR, instead of the over-designed JAD contraption, could have simply used two non-normative guidelines: the java archiver (jar) should place the Manifest at the beginning of the archive, and the mobile’s software should inspect the Manifest as soon as it becomes available during the download. But I guess the standard-makers were too busy to consider the simple solution.

The JAD comes with a cost

So, we’ve seen it could have been just as good without the JAD, but one might argue: on the other hand, it’s not like having this JAD thing hurts either. Well, it does hurt:

1. The users

There are two main ways to install midlets: over the air (downloading them directly on the device), and by copying the JAR from your desktop to the phone’s filesystem using a data connection (infrared, bluetooth, usb, etc). The second way of installing can’t use the JAD, but must use the JAR. So the midlet download sites can’t offer just the JAD to their users, they must offer the JAR too (in order to support the filesystem install).

The user who wants to download a midlet is presented with the choice: here is the application you want, Foo.jad and Foo.jar. The user stops and thinks: which one should I choose? or maybe I have to use both? or perhaps I’ll try one and then the other, and see which one works? and what’s that jad thing anyway? etc.

I think of all the users who installed only the JAD on their phone’s filesystem, and afterwards wondered why it didn’t work. I myself was a midlet user before being a midlet developer, and I kept installing both the JAD and the JAR on the phone’s filesystem, because I didn’t understand their roles (of course, on the phone’s filesystem the JAD was useless and ignored). It took me a few midlet installs and Google searches until I realized that the JAR is enough and that I don’t need to bother with the JAD.

The general principle is: don’t make the users choose/think, and they won’t get confused.

2. The developers

The JAD adds significant complexity, even if this is not obvious at first sight. For example, think of the conflict resolution rules for the situation when the same key has different values in the JAD and in the Manifest. Did you consider the signed midlets?

Or think about which keys should be placed only in the JAD, which ones should be placed only in the Manifest, and which ones should be placed in both the JAD and the Manifest. And the complexity applies also to the device developers who have to implement the standard.
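
To give a feeling for the kind of rule involved, here is a sketch under one stated assumption: that, as in MIDP 2.0, for an untrusted suite a JAD value overrides the Manifest value, while for a trusted (signed) suite any mismatch between the two makes the installation fail. The class and method names are hypothetical, and a real implementation has more cases to handle than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AttributeMerge {
    /**
     * Combines JAD and Manifest attributes into the set the application sees.
     * Assumed rule: untrusted suites let the JAD win; trusted suites reject mismatches.
     */
    static Map<String, String> merge(Map<String, String> jad,
                                     Map<String, String> manifest,
                                     boolean trusted) {
        Map<String, String> result = new LinkedHashMap<>(manifest);
        for (Map.Entry<String, String> e : jad.entrySet()) {
            String inManifest = manifest.get(e.getKey());
            if (trusted && inManifest != null && !inManifest.equals(e.getValue())) {
                // A signed suite must not have its Manifest contradicted by the JAD.
                throw new IllegalStateException("Attribute mismatch for " + e.getKey());
            }
            result.put(e.getKey(), e.getValue());  // JAD-only keys are added, duplicates overridden
        }
        return result;
    }
}
```

Even this simplified version needs a trust flag, a mismatch check and an override order, none of which would exist with a JAR-only distribution.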

Conclusion

I suggest that midlet distributors offer only the JAR (and drop the JAD), as this will reduce the confusion of the users.

The guidelines concerning the placement of the Manifest at the beginning of the JAR, and the early inspection of the Manifest during download should also gradually come into use.

Mobile Java

November 28th, 2006

In the previous post I described what I call “Assembly Java”. The name itself, Assembly Java, is paradoxical (oxymoronic) through the opposition between a high-level programming language (Java) and a low-level programming style (’assembly language’).

The previous post was intended, I guess, as a (subtle) critique of Java used as a low-level programming language.

In the constrained mobile (MIDP) environment, the Java language becomes deprived (through the low-level programming style) of its advanced features, such as classes, virtual dispatch, polymorphism, packages, reflection, and, to some degree, garbage collection. Java turns into a C-like low-level language, while still missing some of the C features that would have been useful in the restricted environment, such as: data structs, stack-allocated instances (’auto’ variables, object instances that aren’t allocated on the heap), arrays of structs (with the structs stored inside the array, not with pointers stored in the array), inlined functions, preprocessor (used for conditional compilation and macro expansion), enums (useful for the ’switch-dispatch’ pattern, among other things), first-class functions (’function pointers’), etc.

But don’t take my critique too harshly: I’m an enthusiastic mobile Java developer myself, and I would recommend Java as the first choice (by far) for mobile application development.

There is also the time aspect: due to the quick evolution of the hardware, the mobile environment is becoming less and less constrained, and thus mobile Java programming is gradually getting closer to the standard Java programming, employing less low-level optimizations and more high-level design. Also the mobile VM (java virtual machine) should be expected to become smarter in time and, why not, perhaps the java compiler (javac) might employ some compile-time optimizations (such as method inlining, and a bunch of expression and loop optimizations) as well.

In the meanwhile, a smart obfuscator such as Proguard can help with the size-reduction of the compiled classes (through automatic class and method renaming, package renaming and elimination, class-merging, method inlining), thus allowing you to keep your design/code simple, clear and clean, while still taking advantage of these optimizations. (note: class-merging is still experimental in proguard, and method inlining is planned for the future).

For the developer, the first concern should be to (quickly) implement the application and get it to work correctly, and only afterwards does the performance (and optimizations) come into discussion. This is simply because the performance of an unfinished or failed project is of little importance.

Getting the application done is no easy task. You should take advantage of clean design, simple code, clear and small interfaces, hidden (private) implementations, because all these help you to develop your application, to modify and adapt it, and to reuse your code.

You should strive to achieve beautiful code, kind of an art-work. A measure for this is to imagine that your code is open-source, and all your friends look at it and discuss it. If you’d still be proud of your code were it public, it’s a good sign.

The so-called guidelines from my previous post are rather bad advice, and were intended for your entertainment (note that #0-#10 is in fact 11 items, not 10 as stated in the post). By negating (complementing?) them you might get some sound design advice.

Let’s try the negation trick:

#0. Don’t merge classes.

You should not merge un-related functionality together in a single class. You should keep the class design clear and neat because it makes the code easier to understand, to modify, and to reuse between projects. As classes do incur some space overhead, you should avoid the extreme of class inflation, the situation where you have a large number of very small classes (classes with just one or two methods inside). Perhaps a reasonable number of classes might be somewhere between 4 and 20 classes, depending on project size and design complexity.

Proguard the obfuscator offers an experimental merge-class feature. This is another reason to keep your design/code clean, with nice classes, and you can always get the merged-class benefit at the obfuscation step if you want it, with no negative impact on your class design.

#1. Don’t merge methods.

The space overhead of a function declaration is so small that you shouldn’t worry about it. Do split big methods if it increases the clarity of the code. Strive for having each method do just one thing. Try to have methods with small and clear signatures (small number of parameters) and clear semantics (’what does this method do?’).

#2. Switch or polymorphic dispatch?

Using the switch() instead of the polymorphic dispatch is generally considered bad OOP practice. You may nevertheless want to do it sometimes to avoid having a large number of small classes. Ponder well when choosing between the two.

#3. Instantiation vs. reuse

While object instantiation economy is not bad, don’t sacrifice too much of your design quality for the sake of it, because at the same time object instantiation / garbage collection is not prohibitively expensive when done with measure. The place where you should keep an eye on object instantiation is inside tight loops (e.g. inside a for loop, it may be a bad idea to repeatedly instantiate / garbage collect a temporary object; you may want to instantiate it just once outside the loop, and reuse it during the loop).
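
A minimal sketch of the tight-loop case, using StringBuffer since that is what CLDC offers (the class and method names are made up for the illustration). Both methods compute the same result; the second allocates its temporary once and resets it on each pass.

```java
class LoopReuse {
    /** Allocates a fresh temporary buffer on every iteration. */
    static int totalLengthNaive(String[] names) {
        int total = 0;
        for (int i = 0; i < names.length; i++) {
            StringBuffer sb = new StringBuffer();   // new garbage on each pass
            sb.append("Hello, ").append(names[i]);
            total += sb.length();
        }
        return total;
    }

    /** Allocates the temporary once, outside the loop, and resets it each time. */
    static int totalLengthReusing(String[] names) {
        StringBuffer sb = new StringBuffer();
        int total = 0;
        for (int i = 0; i < names.length; i++) {
            sb.setLength(0);                        // empty the buffer, keep its storage
            sb.append("Hello, ").append(names[i]);
            total += sb.length();
        }
        return total;
    }
}
```

Outside of loops like this one, the naive form is usually the clearer choice and the cost of the extra allocation is hard to notice.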

I’ll stop here, saving you the remaining ‘guidelines’.

The 10 principles of Assembly Java

November 27th, 2006

Assembly Java is a particularly economical and efficient style of usage of the Java language. These are the principles of Assembly Java:

#0: Define as few classes as possible. Ideally, use a single class.

How: Pack as much as possible in each class. Merge conceptually-unrelated classes together, when possible.
Why: Every class incurs a space-overhead (of about 200 bytes) in the final jar. Having everything in one class also improves the class-locality (as all the methods/fields you use are likely to be from the same class).
Limit: The minimum number of classes you may use is given by the number of classes that you have to extend, as you have to define a new class for each inheritance.

#1: Define as few methods as possible.

How: pack as much of sequential logic as you can in a single big method. Don’t split methods solely for clarity’s sake; merge them.
Why: Performance gain by avoiding method call overhead. Space gain by avoiding the overhead of the method declaration.
Limit: You need a separate method for each method you override, in particular for each abstract method you implement.

#2: Avoid polymorphism; use switch() instead.

How: Rather than declaring a base interface or abstract class, and extending it in order to polymorphically customize behavior,
use a single concrete class with switch statements for all variable behavior. Declare integer constants identifying the behavior cases.
Why: Size gain by reducing the number of classes (#0 above) (abstract classes and interfaces are as bad as any other class). Performance gain by avoiding the virtual dispatch.

Bad Assembly Java:


abstract class Animal {
  abstract String saySomething();
}

class Dog extends Animal {
  String saySomething() { return "how how"; }
}

class Cat extends Animal {
  String saySomething() { return "meow"; }
}

Good Assembly Java:


class Animal {
  static final int DOG=1, CAT=2;
  String saySomething(int kind) {
    switch (kind) {
      case DOG: return "how how";
      case CAT: return "meow";
      default: return null;
    }
  }
}

#3: Avoid object instantiation (object creation).

How:
Reuse the same object time and again. All classes should have an init() or reset() or reinit() method (instead of a constructor) which would allow the re-use of the old object.
Don’t return objects from methods (because this likely requires the creation of the object to be returned), instead have the method take an out object parameter, which is passed in by the caller (who can reuse it) and is filled by the method with the result value.
Why: Performance gain by avoiding the object allocation overhead, and by reducing the garbage-collection overhead. Memory footprint reduction by decreasing the number of allocations.
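
For illustration only, this is roughly what the init() and out-parameter pattern looks like in code. Vec and VecMath are hypothetical names chosen for the sketch.

```java
class Vec {
    int x, y;

    /** Plays the role of a constructor, but can be called again on an old object. */
    void init(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class VecMath {
    /** Writes a + b into 'out' instead of returning a newly allocated Vec. */
    static void add(Vec a, Vec b, Vec out) {
        out.x = a.x + b.x;
        out.y = a.y + b.y;
    }
}
```

The caller allocates one result Vec up front and passes it to every add() call, so no allocation happens per operation; the price is a signature that no longer reads like a function returning a value.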

#4: Don’t use packages. Put everything in the root package.

Why:
The package names incur some space overhead in the compiled classes.
Due to class-merging (#0), the package concept is anyway already irrelevant.
Having everything in a single package allows global access to package-visible methods and fields, thus enabling technique #5a below.

#5: Avoid method calls.

Why: Performance gain by avoiding the method call overhead. Also see #1, Define as few methods as possible.

#5a: Avoid accessor methods (getters, setters). Access directly the data members.

How: having everything in a single package (#4) allows access to all the members except the private ones. I.e., you don’t have to make a data member public
just because you want to access it, package-visibility is enough.

#5b: Inline small methods.

Why: avoids call overhead and reduces the number of methods.
How: use a preprocessor and preprocessor macros for the inlining. As the Java language doesn’t have a default preprocessor, you have the liberty to use any preprocessor you like. For example you may use CPP (’C Pre Processor’) from GCC, but there are plenty of other choices.
Note: don’t trust the compiler or the virtual machine to inline your small methods (#7).

#6: Don’t use String. Use char array instead.

How: have a large char array, and use a pair of indices (begin, end) inside the array to delimit your string.
Why:
Reduces the number of object creations and method calls. A char array (with a pair of indices) allows you to extract substrings without object creation,
and to iterate over the characters without the overhead of a method call for each character (String.charAt()).
Also enables a method to return a string without requiring object instantiation (by passing the char array as an out parameter to the method).
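
A small sketch of the (begin, end) technique, with hypothetical names: it splits a buffer at a comma and parses both halves as numbers without creating a single String. It assumes the slices contain only the digits 0-9.

```java
class CharSlice {
    /** Index of the first occurrence of c in buf[begin, end), or end if absent. */
    static int indexOf(char[] buf, int begin, int end, char c) {
        for (int i = begin; i < end; i++) {
            if (buf[i] == c) return i;
        }
        return end;
    }

    /** Parses a non-negative decimal integer from buf[begin, end). */
    static int parseInt(char[] buf, int begin, int end) {
        int value = 0;
        for (int i = begin; i < end; i++) {
            value = value * 10 + (buf[i] - '0');   // direct array access, no charAt() call
        }
        return value;
    }
}
```

The "substrings" here are just index pairs into the same array, which is exactly why no allocation is needed, and also why the code is harder to read than its String-based equivalent.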

#7: Don’t trust the compiler or VM with the optimizations.

Why: the only optimizations you can rely on are the ones that you explicitly code yourself. Conservatively consider the compiler and the VM as dumb as possible.
Bad:


doSomething(2*n, 2*n+1);

Good:


int doubleN = n << 1;
doSomething(doubleN, doubleN + 1);

#8: Avoid array of objects, use multiple elementary arrays instead.

This is best described by an example. Let’s say you need an array of points, where a point is a pair of two integer coordinates (x, y).
Instead of the first-shot solution:


static final int MAX_POINTS = 100;
class Point {
  int x, y;
}
Point myPoints[] = new Point[MAX_POINTS];
{
  for (int i = 0; i < MAX_POINTS; ++i) {
    myPoints[i] = new Point();
  }
}

Use the much-improved solution:


int Xs[] = new int[MAX_POINTS];
int Ys[] = new int[MAX_POINTS];

Why: you save one class, and a lot of memory footprint by avoiding the many object instantiations.

#9: Use the all-static singleton.

Every time you have a class that will have a single instance, you should take the opportunity to use the singleton pattern.
The preferred way of implementing the singleton is to declare all the members (methods and data fields) static. This avoids the need to allocate an instance of the singleton class (saves memory), and allows for the class-merging of the all-static singleton with any other (non-singleton) class, thus saving one more class definition (#0).
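
As an illustration, an all-static singleton might look like the sketch below (Settings and its members are hypothetical). There is no getInstance() and no instance at all; the class itself is the single object.

```java
class Settings {
    private Settings() {}          // never instantiated

    static int volume = 5;         // state lives in static fields

    /** Raises the volume by one step, up to a maximum of 10. */
    static void louder() {
        if (volume < 10) volume++;
    }
}
```

Callers simply write Settings.louder() or read Settings.volume from anywhere in the package.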

#10: Avoid exceptions.

Why: The try/catch block incurs a space overhead (in the compiled class). There is also a performance penalty when an exception is thrown/caught.
How: Don’t throw. Don’t catch. Don’t define new exception classes (#0), use the standard Error or RuntimeException instead (these are unchecked exceptions, so you don’t have to catch them) if you really have to throw at times.
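
A sketch of what this looks like in practice, using a hypothetical fixed-size IntStack: the expected failure (stack full) is reported through a return value, and the "really have to throw" case (popping from an empty stack) uses a plain RuntimeException rather than a new exception class.

```java
class IntStack {
    private final int[] data = new int[16];
    private int size;

    /** Returns false instead of throwing when the stack is full. */
    boolean push(int value) {
        if (size == data.length) return false;
        data[size++] = value;
        return true;
    }

    /** Popping an empty stack is a programming error: unchecked, no custom class. */
    int pop() {
        if (size == 0) throw new RuntimeException("pop on empty stack");
        return data[--size];
    }
}
```

Note that nothing forces the caller to look at the boolean returned by push(), which is one of the reasons this style is easy to get wrong.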

Conclusion

The name Assembly Java makes reference to Assembly Language, which too is renowned for its outstanding performance and resource efficiency.

So these are the 10 principles of Assembly Java (#0 - #10 above), use them well.

Update 2006-11-30: there is a follow-up to this article, called ‘Mobile Java’, that you might want to read as well.