Chapter 19

Extending Your Programs with Native Methods




The Java Virtual Machine provides sophisticated capabilities for creating user interfaces, accessing the Internet and the World Wide Web, and general-purpose programming. Unfortunately, the Java Virtual Machine has some (deliberate) limitations. First and foremost, the JVM supports only functionality that applies to many platforms. That means that unaided Java applications cannot take full advantage of their host platforms. The Java development team also introduced some artificial limitations in the interest of security. For the most part, these restrictions are not onerous. In fact, users can have a relatively high degree of confidence that Java applications and applets are not malicious. Without such assurances, the World Wide Web community would surely have revolted at the notion of self-downloading, self-executing code in a Web page. Just imagine the prospect of a cross-platform virus that attaches itself to Web pages! In some cases, however, these benevolent restrictions prevent applications from making the most of Java.

Consider these situations:

Native methods can solve all of these problems. A Java class can declare a native method to indicate that the actual code for the method is provided in another language. (At present, native methods must be implemented in C. Support for other languages is in the works.) That code is compiled to the native machine code of your particular platform, hence the name "native method." Before it can be used by the Java Virtual Machine, the compiled code must be linked into a dynamically loadable library suitable for use on the target platform. Finally, the DLL or .so file must be installed on the target computer.

There are two parts to every native method: the Java declaration and the "native" implementation. As you do with all other methods, you declare a native method inside a Java class. Native methods can be final, static, or synchronized. They can throw exceptions and be inherited (unless they are final, of course). Because the implementation is compiled for the target platform, it can take full advantage of the capabilities of that platform. Native methods provide a powerful means of extending the Java Virtual Machine.

You have probably guessed by now that native methods are not cross-platform by nature. Your native methods will be as portable as the C code in which you write them and the APIs they call. Because the most common reason for creating native methods is to use a platform-specific API, native methods tend not to be very portable.

Because the dynamically loadable library must be installed on the target machine prior to execution of the Java classes that need it, you will have to face all the issues of software distribution and configuration management. On the other hand, native methods can only be called from Java applications, which have to be distributed and installed themselves. So, the added difficulty of distributing multiple native-method libraries should not pose too great a hurdle.

When Not to Use Native Methods
Here are some questions to ask yourself before you decide to use native methods:
  • Are you writing a full-blown application instead of an applet? Some Web browsers will prevent applets from calling native methods not distributed in the base classes. In some cases, the browser allows the user to enable native method calls from applets. Bear in mind that many users will be reluctant to deliberately short-circuit security features.
  • Can you manage the platform-specific code? Because the native methods are compiled for each platform, you will encounter significant configuration management hassles. More than likely, you will also have to cope with multiple versions of the native methods, one for each platform. This is particularly true when you are dealing with operating system features. (I know, C is supposed to be a portable language, but isn't that why you are using Java?)
  • Is it acceptable to mix object-oriented and procedural code? Keep in mind that the native methods will be written in an object-oriented style of C. You will be working with regular C structures instead of classes. These structures do not provide full information hiding. (No protected or private specifiers.) Ultimately, you will be working in C, with no encapsulation, no inheritance, and no polymorphism. It will require great discipline to maintain good object semantics in C. Most native methods are self-contained, so these should not be serious limitations, but you should be aware of the difficulties ahead.
If you answered "yes" to all of these questions, then feel free to proceed. If any of the questions trouble you, then you should probably think twice before implementing that class with native methods. The Java packages contain hundreds of classes, and the method you are looking for just might be there already, buried within an obscure class.

An Overview of Native Methods

All functions in Java are object methods. Even the declaration part of your native methods must be contained within a class. Therefore, the first step to creating native methods is to define one or more class interfaces. If you are dealing with a large number of functions, it will be useful for you to partition them into logical, coherent groups. These should form the nucleus of each class. There are basically two approaches for defining a class interface that will wrap the native methods. Suppose you are dealing with native methods to provide file access. Assume that your target platform's standard runtime library supports multithreading. If it does not, you will have to serialize access to all RTL functions.

One approach we could take is to create a File class that contains a mixture of native and Java methods that implement an abstraction of a disk file. (This is the approach used most often by the Java development team.) As an alternative design, we could create two classes: a File class that presents an abstract view of files, and a FileImp class that contains all of the native methods relevant to files. The File class would then contain a reference to a single global instance of FileImp. See Figure 19.1.

Figure 19.1: Encapsulating a native interface.

Note
For concrete details on how the Java development team designed this solution, look in the JDK source code. The source code is available from JavaSoft at http://java.sun.com/products/JDK/1.0.2. Check out the file src/java/io/File.java. The File class is an excellent example of integrating native methods with Java methods.

Ultimately, both techniques result in a File class that presents a clean abstraction of a file. The first approach (the single-class approach) has a significant drawback in terms of maintainability, however. Whenever the File class changes, the dynamically loadable library must be rebuilt. In fact, you would need to regenerate the C header and C stub files and recompile the library for every interface change in File.

With the second approach (the split interface), FileImp is insulated from changes to File. Likewise, changes to FileImp do not affect File. The methods of FileImp do not need to be a direct mapping from the target API. You can give them abstract names and easily define an interface, which will rarely (if ever) need to change. For example, the POSIX API defines two sets of file-manipulation functions. Roughly, the sets correspond to those which use handles returned by open, and those which use a structure pointer returned by fopen. The Win32 API defines a group of file-manipulation functions that use a Handle returned by CreateFile. Your FileImp class would define a single method called Open as illustrated by Listing 19.1. Now any client of File can call Open without regard to the underlying system-level API.


Listing 19.1. Definition of the native Open method.
class FileImp {
    // ... other members ...
    public native boolean Open(String Pathname);
    // ... other members ...
}
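The split-interface design can be sketched in a few lines of Java. This is only an illustration under stated assumptions: SimpleFile stands in for the chapter's File class, FakeFileImp is a hypothetical pure-Java subclass used so that the sketch runs without a native library, and passing the implementation through a constructor (rather than holding one global instance) is a choice made here for testability.

```java
// FileImp as in Listing 19.1: the native method has no body in Java.
class FileImp {
    public native boolean Open(String Pathname);
}

// Hypothetical stand-in: overrides the native method in pure Java so the
// sketch can run without a dynamically loadable library installed.
class FakeFileImp extends FileImp {
    @Override
    public boolean Open(String Pathname) {
        // Pretend that any non-empty path can be opened.
        return Pathname != null && Pathname.length() > 0;
    }
}

// Stand-in for the chapter's File class: clients see only this abstraction.
class SimpleFile {
    private final FileImp imp;   // the only link to platform-specific code
    private boolean open;

    SimpleFile(FileImp imp) {
        this.imp = imp;
    }

    boolean open(String pathname) {
        open = imp.Open(pathname);
        return open;
    }

    boolean isOpen() {
        return open;
    }
}
```

With this shape, SimpleFile can gain or lose convenience methods without touching FileImp, so the C header and stub files generated from FileImp would not need to be regenerated.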

After you have designed your class interfaces, you will need to create the Java classes. These will be identical to any other Java class, except for the use of the keyword native. The native keyword signals to javac that the method body will be provided by a library loaded at runtime. Therefore, you do not provide a method body for a native method. The method declaration looks the same as if you were declaring an abstract method. This is no coincidence. In both cases, you are telling javac that you want to defer the definition of this method until later. With an abstract method, you are saying that "later" means a subclass. With a native method, "later" means runtime loadable C code.
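The parallel between abstract and native declarations can be observed with reflection. The following is a minimal sketch: the class names Shape, Clock, and DeferredBodies are invented for illustration, and it relies on the java.lang.reflect package, which arrived in a later JDK release than the 1.0.2 release described in this chapter.

```java
import java.lang.reflect.Modifier;

abstract class Shape {
    abstract double area();      // body deferred to a subclass
}

class Clock {
    native long ticks();         // body deferred to a native library
}

class DeferredBodies {
    // True if the named no-argument method of c is declared native.
    static boolean isNative(Class<?> c, String name) {
        return Modifier.isNative(modifiersOf(c, name));
    }

    // True if the named no-argument method of c is declared abstract.
    static boolean isAbstract(Class<?> c, String name) {
        return Modifier.isAbstract(modifiersOf(c, name));
    }

    private static int modifiersOf(Class<?> c, String name) {
        try {
            return c.getDeclaredMethod(name).getModifiers();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }
}
```

Note that Clock compiles and loads even though no library supplies ticks. The missing definition only matters when the method is actually called.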

When and Where Can I Use "native"?
You can put "native" on almost any method declaration. How does javac interpret an abstract, native method? Not very well. You can provide a method body for an abstract native method in C, but it will never be called. A static native method behaves just as you would expect. Its semantics are the same as they would be in a static method defined in Java. A static native method is always passed a NULL for its this argument.
Native methods that are overridden in a subclass work normally. In fact, you can mix native and Java methods arbitrarily when subclassing.

After creating and compiling your Java classes, you will use javah to create two C files: a header file and a stub file. (See Chapter 14, "Using javah" for usage information for javah.) The header file contains a structure definition and function declarations (prototypes) for the native methods. The stub file contains some "glue" code, which you should never need to change. Because both files created by javah are generated automatically, you should never change either of them. javah maps the instance variables and methods of the class directly into the C files. So, if you ever change the Java class, you must regenerate the C files. If you do not, strange and unpredictable behavior will result. For example, javah created the header in Listing 19.3 from the class in Listing 19.2.


Listing 19.2. This is the original PortfolioEntry class.
class PortfolioEntry {
    String  TickerSymbol;
    float   LastQuote;
    float   BoughtAtPrice;
    float   LastDividends;
    float   LastEPS;
 }


Listing 19.3. The structure definition javah created from PortfolioEntry.
typedef struct ClassPortfolioEntry {
     struct Hjava_lang_String *TickerSymbol;
     float LastQuote;
     float BoughtAtPrice;
     float LastDividends;
     float LastEPS;
 } ClassPortfolioEntry;

Now suppose that you add a new integer member, NumberOfShares, to the Java class, right after the TickerSymbol. If you do not recreate the header file, your C code will be using the old structure layout. Therefore, your C code would access every member from LastQuote onward at the wrong offset. It would treat the storage for NumberOfShares (an integer) as if it were LastQuote (a float)! This mismatch can cause some of the most irreproducible and hard-to-find bugs. A similar problem can arise with the methods. In that case, you are even more likely to have problems, because any change in a method's signature means that you have to regenerate the stub and header files.
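The effect of a stale layout can be imitated in pure Java. The sketch below is only an analogy, not how the JVM stores objects: it uses java.nio.ByteBuffer (from a later JDK release) as a stand-in for raw object storage, assumes a 4-byte slot for the TickerSymbol handle, and the class name LayoutMismatch and its offsets are hypothetical.

```java
import java.nio.ByteBuffer;

class LayoutMismatch {
    // Assumed new layout: [0] TickerSymbol handle, [4] NumberOfShares, [8] LastQuote
    static final int NEW_SHARES_OFFSET = 4;
    static final int NEW_LAST_QUOTE_OFFSET = 8;
    // Assumed old layout: [0] TickerSymbol handle, [4] LastQuote
    static final int OLD_LAST_QUOTE_OFFSET = 4;

    // Writes a record using the new layout, then reads LastQuote the way
    // code compiled against the old header would.
    static float staleQuote(int shares, float lastQuote) {
        ByteBuffer record = ByteBuffer.allocate(12);
        record.putInt(NEW_SHARES_OFFSET, shares);
        record.putFloat(NEW_LAST_QUOTE_OFFSET, lastQuote);
        // The old offset lands on the integer's bytes, not on the float.
        return record.getFloat(OLD_LAST_QUOTE_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println("Wrote 42.5, read back " + staleQuote(100, 42.5f));
    }
}
```

The value read back is the bit pattern of the share count reinterpreted as a float, which is why such bugs tend to show up as nonsensical numbers rather than as crashes.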

One good way to avoid this problem is to make sure that you rarely need to change your classes after you write them. Good design will help there, but changes are inevitable. Creating a makefile is the best way to ensure that you always use the most up-to-date header and stub files. Some version of "make" is available for every platform supported by the JDK. Later in this chapter, you will find sample makefiles for Java classes with native methods.

javah creates two files for you, but it leaves the most important file up to you. Along with the header and stub files, you will need to write an implementation file. The implementation file must contain the definitions of the functions that the header file declares. All of the really interesting things happen in the implementation file. Figure 19.2 depicts the relationships among the four files: the Java source, the C header, the C stubs, and the C implementation files.

Figure 19.2: The relationship between the Java class file and the three C files.

At this point, a simple example will illustrate the interaction between the Java class and the three C files.

Who Am I? A Java Class to Identify the User

To demonstrate the entire process, from the Java class all the way to the implementation file, we will develop a class that can return the current user's name as a string. This example will use Windows NT native methods.

Note
There is already a mechanism in Java to do this. (See the documentation for the java.lang.System class.) Normally, you should not create a native method to accomplish something already available in the base packages. Because this is a simple demonstration, we will conveniently disregard the System.getProperty method.
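For comparison, the pure-Java route mentioned in the note fits in one call. This is a minimal sketch; the class name WhoAmI is made up for illustration, and "user.name" is the standard system property that holds the user's account name.

```java
class WhoAmI {
    // Returns the login name as reported by the Java runtime itself.
    static String username() {
        return System.getProperty("user.name");
    }

    public static void main(String[] args) {
        System.out.println("Your username is: " + username());
    }
}
```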

The Class Definition

What do we really want to know? For now, we just want to know the user's login name. Various platforms have different API calls to get this information. Good design dictates that we should have a method name that is descriptive of what we want to know, not how we learn it. So, let's just call this method Username. It will return a Java string.

Remember that there are no global functions in Java. Even though we really only need one function, we have to put it in a class. At first, this might seem like a lot of trouble. After all, who wants to define a complete class just to get one method? But think about the future. Will this class always have just one method? Odds are that, like everything else, the requirements for this class will evolve over time. Perhaps there are other details about users that would be of interest to your application (or even a future application). The extra few minutes spent here in creating a class give you a framework in which to build additional functionality later.

Listing 19.4 shows the Java source for the UserInformation class.


Listing 19.4. The Java class definition of UserInformation.
 1:  class UserInformation {
 2:     static {
 3:        System.loadLibrary("UserInformation");
 4:     }
 5:     public native String Username();
 6:  }

Here is a line-by-line description of the code:

Line 1: This line defines the class UserInformation. There is nothing fancy about this class; it inherits directly from java.lang.Object. Although this class does not show it, it is usually a good idea to make native methods (and sometimes entire classes) "final." Otherwise, anyone can override your native methods in a subclass and possibly subvert their behavior.
Line 2: This line defines a static block of code. Static blocks are executed when the class itself is loaded. You can think of a static block as a constructor for the class as a whole.
Line 3: When this class is loaded, direct the runtime system to load the dynamically loadable library UserInformation. Exactly how the library is located and loaded varies by target platform. See the appropriate "Configuring Your Environment" section for your platform later in this chapter.
Line 5: This line defines the method Username as a native method that returns a Java String object.

Now, compile this class using javac. (For details on javac usage, see Chapter 9, "javac: The Java Compiler.")

> javac UserInformation.java

This will produce the usual class file UserInformation.class. The next step is to create the header and stub files using javah. (For details on javah usage, see Chapter 14.)

> javah UserInformation
> javah -stubs UserInformation

This will produce two files, UserInformation.h and UserInformation.c. These are the header and stub files. You can disregard UserInformation.c (until it is time to compile, of course). Listing 19.5 contains the header file UserInformation.h as created by javah. Your output file may vary in some details, depending on your platform and version of the JDK.


Listing 19.5. The header file UserInformation.h created from the UserInformation.class class file.
 0   /* DO NOT EDIT THIS FILE - it is machine generated */
 1   #include <native.h>
 2   /* Header for class UserInformation */
 3
 4   #ifndef _Included_UserInformation
 5   #define _Included_UserInformation
 6
 7   typedef struct ClassUserInformation {
 8       char PAD;   /* ANSI C requires structures to have at least one member */
 9   } ClassUserInformation;
 10  HandleTo(UserInformation);
 11
 12  #ifdef __cplusplus
 13  extern "C" {
 14  #endif
 15  struct Hjava_lang_String;
 16  extern struct Hjava_lang_String
      *UserInformation_Username(struct HUserInformation *);
 17  #ifdef __cplusplus
 18  }
 19  #endif
 20  #endif

Notice the structure definition on lines 7 through 9. Because we did not define any members of the Java class, javah created this dummy structure. This will keep C compilers happy, but you should not attempt to use PAD to store information. The Java specification makes no guarantees about the contents of PAD. For our purposes, the only line of real interest is line 16, where the declaration for Username appears. This is the function that our implementation file must define. Notice that instead of taking a pointer to a ClassUserInformation structure, the C function gets a pointer to an HUserInformation structure. See the sidebar "Where Did HUserInformation Come From?" for details on this structure.

Where Did HUserInformation Come From?
The structure HUserInformation, which is passed to each of the C functions, is declared as a result of the HandleTo() macro, which is defined in \java\include\oobj.h as this:
#define HandleTo(T) typedef struct H##T { Class##T *obj; \
struct methodtable *methods;} H##T
So, HandleTo(UserInformation); expands to
typedef struct HUserInformation {
  ClassUserInformation *obj;
  struct methodtable *methods;
} HUserInformation;
This structure provides the bookkeeping that will allow C functions to behave like class methods. To access an instance variable from your native method, follow the obj pointer to the instance variable structure declared by the header file. For example, if the Java class UserInformation had a member variable UserID, its native methods could reference that member as hUserInformation->obj->UserID.
The methods pointer also allows your native method to get information about, and even invoke, the other methods of your class.
This structure is intended to be opaque to native methods. Because it may change in the future, the include file interpreter.h provides macros to hide the details of HUserInformation. In your native methods, you would use unhand(hUserInformation) to get access to the ClassUserInformation pointer. By using the macros, you insulate your code from future changes to Java internals. See "Using Java Objects from Native Methods" later in this chapter for details.

Finally, the time has come to write the implementation file. By convention, the implementation filename ends in "Imp". For UserInformation, our implementation file is named UserInformationImp.c. Listing 19.6 shows the implementation file for UserInformation, followed by a line-by-line description of the code. (We will not explore the target platform's API functions. After all, this is a book about Java, not C.)


Listing 19.6. The implementation file UserInformationImp.c.
 1:  /* UserInformationImp.c                                    */
 2:  /* Implementation file for Java class UserInformation      */
 3:  #include <StubPreamble.h>
 4:  #include "UserInformation.h"
 5:  #include <winnetwk.h>
 6:
 7:  struct Hjava_lang_String *UserInformation_Username(
 8:     struct HUserInformation *hUserInformation
 9:  )
10:  {
11:    char szUserName[128];
12:    DWORD cchBuffer = sizeof(szUserName);
13:    if(NO_ERROR == WNetGetUser(NULL,szUserName,&cchBuffer))
14:       return makeJavaString(szUserName, strlen(szUserName));
15:    else {
16:       printf("UserInformation_Username: GetLastError = %x\n",
17:          GetLastError());
19:       return makeJavaString("", 0);
20:    }
21: }

Here is a line-by-line description of the implementation file:

Line 3: Every native method implementation file must include StubPreamble.h. Indirectly, it provides all of the macros, definitions, and declarations that enable C code to interoperate with the Java Virtual Machine.
Line 4: Include the header file created by javah. This brings in the structure definitions ClassUserInformation and HUserInformation.
Line 5: Include the Win32 header file that declares WNetGetUser, the function we will use to find the user's name.
Lines 7-9: This function signature must be identical to the corresponding function declaration in UserInformation.h.
Lines 11-12: Declare two local variables: a character array that will hold the C string of the user's name, and a double word that will indicate the buffer's size.
Line 13: Retrieve the user's login name by calling WNetGetUser and checking the return value for failure. If WNetGetUser fails because the buffer is too small, cchBuffer will contain the actual buffer length needed.
Line 14: If the call to WNetGetUser succeeded, construct a Java string object from the C string. The function makeJavaString is declared in javaString.h, along with several other functions that C code can use to manipulate Java string objects. Not all Java classes have such convenient C interfaces. In general, you will use the classes' inherent capabilities by calling the Java code from your C code.
Lines 15-19: If the call to WNetGetUser failed, print an error message on stdout and construct an empty Java string to return. Returning an empty string is more conscientious than returning a NULL. Java has garbage collection, so you do not need to worry about the eventual destruction of these strings.

Now we just need to compile and link the two C files. For complete details on building the code, see the following sections called "Building Native Methods for Solaris" and "Building Native Methods for Windows 95/NT," whichever is appropriate for your target platform. For now, the sample makefile in Listing 19.7 will make things more convenient. (It will also make sure that your header and stub files stay current with respect to your .java file.) Remember to put tabs, not spaces, at the beginnings of the action lines.


Listing 19.7. Sample makefile suitable for use on Windows 95/NT.
 cc         = cl
 LFLAGS     = -MD -LD
 CLASSPATH  = .;$(JAVA_HOME)\lib\classes.zip
 
 CLASS      = UserInformation
 
 all: $(CLASS).dll $(CLASS).class main.class
 
 main.class: main.java
     javac main.java
 
 $(CLASS).class: $(CLASS).java
     javac $(CLASS).java
 
 $(CLASS).dll: $(CLASS)Imp.c $(CLASS).class
     javah -classpath $(CLASSPATH) $(CLASS)
     javah -stubs -classpath $(CLASSPATH) $(CLASS)
     $(cc) $(CLASS)Imp.c $(CLASS).c -Fe$(CLASS).dll $(LFLAGS) mpr.lib javai.lib

Once you compile your native methods into a dynamically loadable library, you can create a simple class to exercise this native method from an application. When you are testing an application that uses native methods, the command-line Java interpreter "java" can be a tremendous help. In this case, the class main simply creates an instance of UserInformation and prints the results of Username(). Figure 19.2 shows the sequence of events that makes this happen.

>java main
Your username is: mtnygard

Now that you have gone through the entire process one step at a time, you can delve into the nuts and bolts. The next sections will refer back to this example from time to time as they explore the details of the interaction between native methods and pure Java code.

The Nuts and Bolts of Native Methods

Because native language methods are so closely tied to a particular platform, the difference in procedures is more pronounced than with the JDK itself. For example, the compilers for different platforms have wildly different command-line arguments. Because of the possible variations and permutations, this section cannot present an exhaustive reference on the native compilers. Therefore, you should always consider the appropriate compiler reference as the ultimate authority. The procedures in these sections worked at the time this was written, using JDK 1.0.2 on the following platforms: SPARC Solaris 2.5, x86 Solaris 2.5, Microsoft Windows 95, Microsoft Windows NT 3.51, and Microsoft Windows NT 4.0 Beta. The Solaris platforms use the usual "cc" compiler. The Windows platforms have been tested using Visual C++ 2.0 and 4.0.

Configuring Your Environment for Solaris

If you have not already done so, you should modify your PATH variable to include the /java/bin directory. In addition, setting your JAVA_HOME and CLASSPATH variables properly will help javah run smoothly.

When your class's static block calls System.loadLibrary, the Java runtime will search for the library in the current directory. If you intend to install your dynamic libraries in any other directory, you will need to set your library search path to include that directory. For example, if your libraries are stored in java_lib in your home directory, use the following command in Bourne and Korn shells:

$ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/java_lib
$ export LD_LIBRARY_PATH

In C shell, use this:

% setenv LD_LIBRARY_PATH "$LD_LIBRARY_PATH:$HOME/java_lib"

Building Native Methods for Solaris

Use the following command to compile the code and link the dynamic library:

$ cc -G UserInformation.c UserInformationImp.c \
> -o libUserInformation.so

You will probably need to use the -I flag to tell the compiler where to find the Java header files:

$ cc -G -I${JAVA_HOME}/include -I${JAVA_HOME}/include/solaris \
> UserInformation.c UserInformationImp.c \
> -o libUserInformation.so

The linker will create libUserInformation.so in your current directory. For details about the linker options, refer to the man pages for cc and ld.

To execute System.loadLibrary("libname"), the Solaris Java Virtual Machine will search for a library named liblibname.so.

Configuring Your Environment for Windows 95/NT

You should either make these changes to your C:\AUTOEXEC.BAT file or to a batch file you will run every session.

Setting your JAVA_HOME and CLASSPATH variables properly will help javah run smoothly. In addition, if you have not already done so, you should modify your PATH variable to include the %JAVA_HOME%\bin directory.

When your class's static block calls System.loadLibrary, the Java runtime will search for the library in the PATH variable, as well as the directory in which the executable lives. If you intend to install your dynamic libraries in any directory other than these, you will need to set your path to include that directory.

To make the compile process a little easier, you might want to set the INCLUDE and LIB variables to point to the Java files:

set INCLUDE=%JAVA_HOME%\include;%INCLUDE%
set LIB=%JAVA_HOME%\lib;%LIB%

Building Native Methods for Windows 95/NT

Use the following command to compile and link the dynamic library:

C:\> cl UserInformation.c UserInformationImp.c
-FeUserInformation.dll -MD -LD [other_libs] javai.lib

Note
You must use Visual C++ 2.0 or higher to compile the dynamically linked libraries. In particular, Visual C++ 1.52 and earlier versions produce 16-bit code, which will not work with Java.

The linker will create UserInformation.dll in the current directory. For details on the linker options, refer to the Visual C++ online manual.

To execute System.loadLibrary("libname"), the Windows 95/NT Java Virtual Machine will search for a library named libname.dll.
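The name mapping described for the two platforms can be queried from Java itself. The sketch below assumes a JDK newer than the 1.0.2 release covered here, because System.mapLibraryName was added in a later release; the class name LibraryNames is made up for illustration.

```java
class LibraryNames {
    // Returns the platform-specific file name that the runtime would look
    // for when System.loadLibrary(libname) is called.
    static String fileNameFor(String libname) {
        return System.mapLibraryName(libname);
    }

    public static void main(String[] args) {
        // The result depends on the platform running this code.
        System.out.println(fileNameFor("UserInformation"));
    }
}
```

The exact prefix and suffix vary by platform, so the result is best treated as a diagnostic aid rather than something to hard-code.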

Troubleshooting Native Methods

Here are some common exceptions you might see when you run your program.

java.lang.UnsatisfiedLinkError no hello in LD_LIBRARY_PATH
   at java.lang.Throwable.<init>(Throwable.java)
   at java.lang.Error.<init>(Error.java)
   at java.lang.LinkageError.<init>(LinkageError.java)
   at java.lang.UnsatisfiedLinkError.<init>(UnsatisfiedLinkError.java)
   at java.lang.Runtime.loadLibrary(Runtime.java)
   at java.lang.System.loadLibrary(System.java)
   at UserInformation.<clinit>(UserInformation.java:5)
   at
java.lang.UnsatisfiedLinkError: Username
   at main.main(main.java:6)

This exception appears on Solaris systems. It means that you have a library path set, but the particular library is not in it. You need to modify your library path to include the directory where your library lives.

java.lang.NullPointerException
   at java.lang.Runtime.loadLibrary(Runtime.java)
   at java.lang.System.loadLibrary(System.java)
   at UserInformation.<clinit>(UserInformation.java:5)
   at
java.lang.UnsatisfiedLinkError: Username
   at main.main(main.java:6)

This exception also appears on Solaris systems. It indicates that you do not have a library path set at all, and that the runtime cannot find your library without it. You should either set your library path or move your library to the current directory.

Unable to load dll 'UserInformation.dll' (errcode = 485)
Exception in thread "main" java.lang.UnsatisfiedLinkError:
   no UserInformation in shared library path
   at java.lang.Runtime.loadLibrary(Runtime.java:268)
   at java.lang.System.loadLibrary(System.java:266)
   at UserInformation.<clinit>(UserInformation.java:3)
   at
java.lang.UnsatisfiedLinkError: Username
   at main.main(main.java:6)

This is essentially just the Windows version of the same problem. Again, the solution is to copy the library into a directory that is in the PATH, or to modify the PATH to include the library's directory.

java.lang.UnsatisfiedLinkError: Username
        at main.main(main.java:6)

If you get this exception by itself, without a larger walkback above it, then your library is missing the function being called by this native method.

As you can see, when you are using native methods, most of the problems show up when the runtime attempts to load the library. Unfortunately, there is no simple solution. This is basically an installation and configuration management problem. Worse yet, there is no way for one class to catch exceptions that occur when the JVM is loading another class.
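A class can at least guard its own call to System.loadLibrary. The following is a minimal sketch, assuming the application can degrade gracefully when the library is absent; LibraryLoader and tryLoad are hypothetical names, not part of the JDK.

```java
class LibraryLoader {
    // Attempts to load the named library. Returns false, instead of letting
    // UnsatisfiedLinkError propagate, when the library cannot be found.
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load native library '" + name
                    + "': " + e.getMessage());
            return false;
        }
    }
}
```

A class with native methods could call tryLoad from its static block and record the result, so that callers can check availability before invoking a native method. This does not remove the installation problem; it only turns a load-time failure into a condition the program can report.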

The Method and the Function

There are two parts to every native method: the Java declaration and the native language definition. The Java Virtual Machine provides enough capabilities that the native language component can do virtually everything that a typical Java method can. This section examines each side in turn. The next few sections will examine how the native language component works with the JVM to make all of this work.

Starting with the Java declaration, consider the method signature from the UserInformation example earlier in the chapter.

public native String Username();

It looks more or less the same as any other method. The native keyword means that the method body is not needed. (In fact, it is not permitted.) The native keyword tells the compiler that the definition (the method body) is provided in a different language.

In this example, no arguments were needed. However, you can pass arguments to native methods and get return values from them. These arguments and return values can include objects. See "Arguments and Return Values" for specifics.

We used javah to create the header file for the native side. Take a look at the function signature created by javah.

struct Hjava_lang_String *UserInformation_Username(struct HUserInformation *);

By examining this piece by piece, you can see how the Java to C linkage works. First, the return type is declared as struct Hjava_lang_String *. In the native language, all objects are accessed through a handle, a special structure that allows access to an object's instance variables and the class's methods (the object's vtable). You can translate Hjava_lang_String directly to "java.lang.String". Handles are always passed as pointers for efficiency.

As you can see, the function name itself is composed of the class name, an underscore, and the method name. If you wanted to add a new native method to find out the user's disk quota, it would be named something like UserInformation_DiskQuota. Because javah can generate all of the function declarations, there is rarely a need to create these names by hand, but because the names are not "mangled" (the way early C++ to C translators did), the resulting code is very readable.

The final item in this signature is the argument struct HUserInformation *. It is an automatic parameter, which will be explained in the next section.

Arguments and Return Values

Pure Java methods can pass arguments to native methods just as they would to any other method call. Native methods can return any value that a Java method would. How does this work when two such vastly different languages are involved? It works because the Java Virtual Machine exposes some of its internals to the native language. At the same time, the header files that support native methods use features of the C preprocessor and compiler to shield you, the developer, from the guts of the Java implementation. Essentially, when you are developing a native method, you are hooking right into the most fundamental aspects of Java's implementation. As a result, you must be careful to follow the rules, or you risk writing fragile code and upsetting the JVM itself.

Simple data types map directly from Java to C. Some difficulties arise because arrays are first class objects in Java, whereas they are aggregate data types in C. Arrays appear in C code as Java objects. Table 19.1 shows the data type mapping from Java to C.

Table 19.1. Mapping some Java data types to C data types.

Java type    C Argument                  C Structure
Primitive Types
boolean      long                        long
char         long                        long
short        long                        long
ushort       long                        long
int          long                        long
float        float                       float
double       float                       float
Complex Types
Object       struct Hjava_lang_Object    struct Hjava_lang_Object
boolean[]    long                        long
char[]       unicode                     struct HArrayOfChar
short[]      long                        struct HArrayOfLong
ushort[]     unsigned short              unsigned short
int[]        long                        struct HArrayOfLong
float[]      float                       struct HArrayOfFloat
double[]     float                       struct HArrayOfFloat
Object[]     HArrayOfClass               struct Hjava_lang_Object

The most important argument for any native method is the first parameter passed to all native methods. It is referred to as an automatic parameter. Look back at the example earlier in this chapter. The function UserInformation_Username was passed a struct HUserInformation * as the first (and only) argument. This pointer serves the same function as the implicit this pointer in C++. That is, the first parameter passed to a native method points to the object instance for which that method is being called. You can use this pointer to access the instance variables of the class. With it, you can even call other methods of the object. This automatic parameter takes the form of a handle. (See the sidebar "What are Handles, Anyway?".)

The confluence of handles, multithreading, and garbage collection leads to some interesting consequences for your C code. First of all, your C code must be fully re-entrant. Because a Java application can have an arbitrary number of threads, any one of those threads can call your native method, even if another thread is already executing in that method, unless you synchronize the method. See "Multithreading and Native Methods." Also, remember that the garbage collector runs in an idle thread, but another thread may request that the garbage collector run immediately.

How does this affect your native methods? The garbage collector may relocate objects in memory! As long as you maintain a reference to your arguments, particularly your this pointer, the object will not be relocated. However, if you try to store the obj pointer from the object handle (for example, hUserInformation->obj) in a global variable, two very bad things will happen.

First, the garbage collector may relocate or even destroy the object. This is because each object has a reference count, that is, a count of the outstanding handles to that object. When the object's reference count hits zero, it is put into a pool that can be garbage collected at any time. When your method returns, the handle passed to it goes out of scope and the object's reference count is decremented. Although you still have a copy of the handle, the object may get destroyed without warning. For instance, another thread may cause the reference count to hit zero, even though your variable still has a pointer. Then your variable points to deallocated memory.

The second problem with this scenario comes from the multithreading itself. Any function that uses global data is suspect in a multithreaded program. It is very difficult to prove that a function using global data will behave correctly when faced with re-entrant calls. Here again, the most likely consequence is an invalid pointer. The JVM is well protected, but it can still be crashed. Never believe anyone who tells you that it is impossible to get a core dump or GPF from Java! Anytime native methods are involved, the JVM is only as robust as those native methods.
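The danger of shared data is easy to see even outside the JVM. The following sketch uses plain C with hypothetical function names (nothing here comes from the Java runtime). It contrasts a function that keeps its result in a static buffer with one that keeps all of its state in storage supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* NOT re-entrant: every caller, on every thread, shares one static buffer. */
const char *format_user_unsafe(long user_id)
{
    static char shared[32];
    snprintf(shared, sizeof shared, "user-%ld", user_id);
    return shared;
}

/* Re-entrant: the only state is the buffer that the caller passes in. */
const char *format_user(long user_id, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "user-%ld", user_id);
    return out;
}
```

Two calls to format_user_unsafe return the same address, so the second result silently replaces the first. With several threads the replacement can happen in the middle of a read, which is why a native method should prefer the second style.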

In Table 19.1, all of the Java array types appear as handles in the object instances. Java treats arrays as first-class types and implements them as objects. When passing an array to a native method, the JVM translates it into a C-like array. However, the array instance variables are still (Java) arrays when accessed from C. Therefore, the C code must treat it as an object.

Returning primitive and complex data types works exactly the same way as primitive and complex arguments. Sometimes, as in the UserInformation example, this may involve constructing new Java objects in C code. The next section explains how to accomplish this feat.

Using Java Objects from Native Methods

In native methods, all objects appear as handles. Handles are at the heart of the Java object model, but you will be shielded from directly manipulating handles. A group of C macros and functions permit you to deal with handles largely as an abstract type. So what can you do with a handle?

Accessing Instance Variables from Native Methods

For starters, you can access all of an object's instance variables (all those that are visible to your code, at least). In order for your native method to access the data members of your object, you dereference the handle with the macro unhand(). The unhand() macro is defined in the header file interpreter.h, which is included by StubPreamble.h. Listing 19.8 shows an example of accessing an object's instance variables from a native method.


Listing 19.8. Using unhand() to access instance variables.

typedef struct ClassUserInformation {
   long iUserID;
} ClassUserInformation;
HandleTo(UserInformation);
...
void UserInformation_SetUserID(struct HUserInformation *hUI,
    long newValue)
{
   unhand(hUI)->iUserID = newValue;
}

What Are Handles, Anyway?
In the Java implementation, a handle is used to maintain references to objects. Handles allow the use of native languages to truly be "two-way." That is, the native methods can access instance variables and call other methods, just as pure Java methods can. Handles are at the very heart of the Java object model. In code, a handle is a small structure with two pointers, which together form an object instance. One points to the instance variables, the other to the method table (the vtable in C++ parlance). Handles for your classes are created automatically when you create the header file. (See the sidebar "Where Did HUserInformation Come From?")
By using pointers to handles to pass objects around, Java attains great efficiency. Because object manipulations work through handles, there is a single item of interest to the garbage collector, and a natural mechanism for marking objects in use or discarded.

The unhand() macro returns a pointer to an ordinary C structure. (For handles to objects of your class, it returns a pointer to the structure defined in your header file.) In all cases, the members of the C structure have the same names as the members of the Java class. The types may be different, however. Check Table 19.1 for the type mapping. You can read and write to the members of the structure. In fact, you can even pass pointers to them as arguments to other functions. They behave in all ways as members of ordinary C structures, because that is exactly what they are. When you use unhand() on a handle, you are actually looking at the fundamental Java implementation of an object. The structures you see using the handle are the real structures; there is no behind-the-scenes marshalling before the call.
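To make the idea concrete without the JDK headers, here is a self-contained mock. Everything in it is an assumption for illustration: mock_unhand and MockMethodTable are invented names, and the real handle layout belongs to the runtime. Only the two-pointer shape and the obj member come from the description in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Mock instance data, shaped like the struct in Listing 19.8. */
typedef struct ClassUserInformation {
    long iUserID;
} ClassUserInformation;

/* Mock method table; the real one is owned by the Java runtime. */
typedef struct MockMethodTable {
    const char *class_name;
} MockMethodTable;

/* Mock handle: one pointer to the data, one to the method table. */
typedef struct HUserInformation {
    ClassUserInformation *obj;
    MockMethodTable      *methods;
} HUserInformation;

/* Stand-in for unhand(): yields a pointer to the plain C structure. */
#define mock_unhand(h) ((h)->obj)

void set_user_id(HUserInformation *hUI, long new_value)
{
    assert(hUI != NULL && hUI->obj != NULL);
    mock_unhand(hUI)->iUserID = new_value;
}

long get_user_id(const HUserInformation *hUI)
{
    return mock_unhand(hUI)->iUserID;
}
```

Because the macro yields an ordinary structure pointer, taking the address of a member (for example, &mock_unhand(h)->iUserID) and passing it to another function works just as it would for any other C structure.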

Accessing Class Variables from Native Methods

Class variables (static variables) do not appear in the instance variable structure generated by javah. This is because they are not stored with each instance of a class, but rather with the class structure itself. The Java runtime provides a function, getclassvariable, that you can use to get a pointer to a class variable.

long *getclassvariable(struct ClassClass *cb, char *fname);

Notice that this function, like many other Java runtime functions, requires a pointer to the class structure. Fortunately, every object instance carries this pointer around. The obj_classblock macro from interpreter.h takes an object handle as an argument and returns a pointer to the class structure.

Why does getclassvariable return a pointer to a long? It returns a pointer because the native language code should be able to modify the class variables, as well as see them. The pointer is declared as long *, but on the platforms Java supports, all pointers to data are the same size. So, it is safe to cast the returned pointer to a pointer to whatever type the class variable really is.

Be warned that unless your native methods are synchronized, modifying class variables can be problematic. Always remember that, even though this method is in C, it really is being called from a multithreaded environment. There may be other threads executing Java methods or native methods. They may be doing any number of things, including accessing the class variable at exactly the same time as the native method. Any time global data is involved, whether it is class-wide or truly global, synchronization is an issue. See "Multithreading and Native Methods" for more details.

Wouldn't it be nice if your native language code could also call Java code? Yes, it really would. Fortunately, we can do exactly that. It all starts with the object handle. (Where else?) You have to use a slightly more complicated mechanism than you might think at first. Calling another C function from a function pointer in a structure is something familiar to all of us, something like:

hUserInformation->Username();                 // DO NOT DO THIS!

Although this would have the net effect of transferring execution to the other function, it is not sufficient. Here's why: When calling a C function, a stack frame is created with the return address and some arguments. (Implementations vary somewhat; this is necessarily a very general description.) In Java, there is much more context provided for a function call. In addition to the processor stack, the JVM maintains a great deal of stack information on its own, internal stack. (Presumably, combining this stack with the processor stack, both in hardware, is one of the benefits of a "Java chip.")

In order to make the method call, we really have to ask the JVM to do the call for us. Here are the three functions declared in interpreter.h that will make these calls:

HObject *execute_java_constructor(ExecEnv *, char *classname,
   ClassClass *cb, char *signature, ...);
long execute_java_dynamic_method(ExecEnv *, HObject *obj,
   char *method_name, char *signature, ...);
long execute_java_static_method(ExecEnv *, ClassClass *cb,
   char *method_name, char *signature, ...);

Each function serves a specific purpose and is described separately in the following sections.

Calling a Java Constructor

Native language code can construct an instance of any Java class that is currently loaded. Here again, the native code is hooking directly into the implementation of Java. Therefore, the new object will behave exactly as it would if you had constructed it in Java code. The general form of a native language constructor is this:

hNewObject = (HClassName *)execute_java_constructor(NULL,
    "ClassName", NULL, "ConstructorSignature", ...);

The first parameter here is the ExecEnv or exception environment that applies to the constructor call. You can safely pass a NULL here, and the constructor will operate with the default exception environment. If you are using sophisticated exception handling in your native code, please see the section "Native Methods and Exceptions."

The interesting pieces of this call are the placeholders: HClassName, "ClassName", and "ConstructorSignature". First of all, in order for the native method to use the object that gets constructed, it must know the object's class. Usually, this comes from including a header file for that class. Only a few classes have headers distributed with the JDK: ClassLoader, String, Thread, and ThreadGroup. However, by using javah, you can create header files for whatever classes you need. (See Chapter 14 for details.) If you do not need to access the object's methods or instance variables, you can cast the return value from execute_java_constructor to an HObject pointer. HObject is the handle to the base class Object.

Loading Java Classes from Native Language Code
If you need to load a class directly from native code, you can call the function:
int LoadFile(char *filename, char *directory, char *SourceHint);
It is safe to pass NULL for SourceHint. This function will cause the ClassLoader to find the file and load it into memory. Once the class is loaded, its static block is executed. Be aware that the static block can throw an exception.
The ClassLoader also provides native code for the DoImport function.
int DoImport(char *name, char *SourceHint);
This will behave exactly as if the Java code had contained an "import name;" line.

The next item of interest in this call is the class name. This is a text string that specifies what class is to be instantiated. This class must already be loaded, or reside on the class path. When you pass a class name, the Java runtime must perform a lookup on the class itself. However, if you have a pointer to the class itself, you can leave the class name argument NULL and pass the ClassClass pointer in the third argument. For example, if you wanted to clone an existing object of any class, you might use code like this:

struct HObject *Cloner_Clone(struct HCloner *hCloner,
                             struct HObject *hObj)
{
   HObject *hNewInst;
   /* Use the class structure of hObj itself, so no name lookup is needed. */
   hNewInst = execute_java_constructor(NULL, NULL,
                 obj_classblock(hObj), "(A)", hObj);
   return hNewInst;
}

This native method can clone any object that has a copy constructor. The macro obj_classblock is another of the handle convenience macros. For any object, it will return a pointer to the class structure for that object. Many of the built-in functions of the Java runtime require a pointer to the class structure "ClassClass".

The constructor signature is used to select which constructor to invoke. It is a character string that specifies the number and type of arguments to follow. Table 19.2 shows the possible characters and their meanings.

Table 19.2. Signature characters and their meanings.

Character    Meaning
A            Any (object)
[            Array (object)
B            Byte
C            Char
L            Class
;            End class
E            Enumeration
F            Float
D            Double
(            Function argument list start
)            Function argument list end
I            Int
J            Long
S            Short
V            Void
Z            Boolean

By concatenating characters from this set, you specify what arguments are to follow. Two characters have special significance: the parentheses indicate the beginning and end of the argument list. For a constructor, parentheses should enclose the entire argument list. For other methods, the signature will also indicate the return type of the method. In the preceding example, the signature is "(A)", which means the constructor must simply take one argument, an object instance. This implies that our constructor could receive objects of other classes as arguments. If we knew the class name, say Foo, for example, we could write the signature as "(LFoo;)", which means the constructor must take an instance of class Foo, or of one of its subclasses, as an argument. Here is a revised version of the Clone method, which takes into account the class name of the argument passed in:

struct HObject *Cloner_Clone(struct HCloner *hCloner,
                             struct HObject *hObj)
{
   HObject *hNewInst;
   char     signature[80];

   /* Build "(LClassName;)" from the class of the object being cloned. */
   sprintf(signature, "(L%s;)", classname(obj_classblock(hObj)));
   hNewInst = execute_java_constructor(NULL, NULL,
                 obj_classblock(hObj), signature, hObj);
   return hNewInst;
}

Notice that, although "name" is a simple member of struct ClassClass, we use the classname macro (from oobj.h) instead of a direct access. In general, you should never access a structure member directly. Instead, use one of the macros provided in oobj.h or interpreter.h (both included by StubPreamble.h). The macros shield you from future changes to the structures. What if there is no macro to access a particular member? Then you probably are not meant to use that member in the first place!
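One caution about the revised Clone method: sprintf into a fixed 80-byte buffer will overflow if the class name is long enough. A bounded variant might look like the following sketch. The helper name build_object_signature is hypothetical, and the code uses only the standard C library, so you can try it outside the JVM.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "(L<class_name>;)" into out.
 * Returns 0 on success, -1 if the signature would not fit.
 * build_object_signature is a hypothetical helper, not a runtime function.
 */
int build_object_signature(const char *class_name, char *out, size_t out_size)
{
    assert(class_name != NULL && out != NULL);

    int n = snprintf(out, out_size, "(L%s;)", class_name);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A native method that receives -1 from such a helper can give up before calling execute_java_constructor, rather than passing it a truncated signature.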

After the constructor signature, you can place a variable length argument list. These are the arguments that will be passed to your constructor. You should be careful to make sure that the type, number, and order of your arguments match the signature. Otherwise, your constructor may be called with garbage for arguments.
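Because the compiler cannot check a variable-length argument list against a string, it can help to count the arguments that a signature describes and compare that with what you pass. The function below is a minimal sketch with a hypothetical name. It understands only the structure described above (parentheses, the [ prefix, and L...; class names) and does not validate individual type characters against Table 19.2.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the arguments described between '(' and ')' in a signature.
 * Returns -1 if the string is malformed.
 */
int signature_arg_count(const char *sig)
{
    if (sig == NULL || *sig != '(')
        return -1;

    int count = 0;
    const char *p = sig + 1;
    while (*p != '\0' && *p != ')') {
        while (*p == '[')               /* array prefix belongs to next type */
            ++p;
        if (*p == 'L') {                /* class name runs up to ';' */
            while (*p != '\0' && *p != ';')
                ++p;
            if (*p != ';')
                return -1;              /* unterminated class name */
        } else if (*p == '\0' || *p == ')') {
            return -1;                  /* '[' with no element type */
        }
        ++p;                            /* step past the type character or ';' */
        ++count;
    }
    return (*p == ')') ? count : -1;
}
```

For "(A)" and "(LFoo;)" the count is one, which matches the single hObj argument passed in both versions of the Clone method.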

If there is an error while calling the constructor, execute_java_constructor will return NULL.

Calling a Java Method

Now for the really good part. You can call any Java method from native code. It all works through the handle to the object instance. If the method to be called is static, see the next section. For dynamic methods (including final and native methods), use the execute_java_dynamic_method function:

long execute_java_dynamic_method(ExecEnv *, HObject *obj,
    char *method_name, char *signature, ...);

The first parameter is the same as the first argument to execute_java_constructor-an exception environment. Again, it is safe to pass a NULL for this parameter. If you require more sophisticated exception handling for the native code, see "Native Methods and Exceptions."

The second parameter is the object instance itself. This can be any valid object handle, whether it was passed from Java code as an argument or constructed in the native method itself.

The next parameter is the name of the method to be invoked. If this method does not exist in the object passed, execute_java_dynamic_method will throw an exception.

Finally, the fourth argument is the signature of the instance method. Again, the signature indicates the type and number of arguments to be passed to the instance method being called. This must be the same number and type of remaining arguments in the call to execute_java_dynamic_method.

The signature for a method differs slightly from the signature for a constructor. A method signature needs to indicate the expected return type. The signature can show this with an extra character after the closing parenthesis. For example, consider a method that takes arguments of two classes and a float and returns a byte:

public byte FunkyMethod(Object fc, SecondClass sc,
                        float difference);

This method would have the signature "(ALSecondClass;F)B". The call to execute_java_dynamic_method would then have three arguments after the signature: a generic object, an instance of SecondClass, and a float.
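Since the return type is simply whatever follows the closing parenthesis, native code can pick it out with one standard library call. This is a small sketch with a hypothetical function name, not something the runtime provides.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the return-type part of a signature, that is, the
 * text after the last ')'. The result is an empty string when no return
 * type is given (as in a constructor signature) and NULL when the
 * signature has no ')' at all.
 */
const char *signature_return_type(const char *sig)
{
    assert(sig != NULL);

    const char *close = strrchr(sig, ')');
    return (close != NULL) ? close + 1 : NULL;
}
```

For "(ALSecondClass;F)B" the result is the string "B", which tells you how to interpret the long returned by execute_java_dynamic_method.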

The return value from execute_java_dynamic_method depends on the method being invoked. It is declared as long, however, so you will need to cast it to the appropriate type. (Because long is wide enough to hold any primitive type or a handle, it is safe to cast it to whatever the method really returns.) Be careful, though. Because you are calling the method by name through a runtime function rather than directly, the compiler cannot warn you when the method definition changes.

Calling a Static Java Method

Calling a class method from native code is similar to calling a dynamic method. Use the execute_java_static_method function:

long execute_java_static_method(ExecEnv *, ClassClass *cb,
   char *method_name, char *signature, ...);

This is entirely analogous to the execute_java_dynamic_method function, with one crucial difference. Instead of taking an object instance as a parameter, execute_java_static_method requires a class structure. This class structure can come from an object instance, using the obj_classblock macro, or it can be retrieved using the FindClass function:

ClassClass* FindClass(ExecEnv *ee, char *classname, bool_t resolve);

FindClass will return a pointer to the class structure for any class name you pass in. This function may cause several things to happen. First, the named class may be loaded, which will in turn cause the named class's static block to execute. Exceptions may be thrown by FindClass itself, if it cannot find the class, or by the static block of the named class, when it is loaded.

Like the other runtime functions that can throw exceptions, FindClass can take an exception environment as an argument. As usual, it is safe to pass a NULL here, in which case FindClass will use the current exception environment.

Multithreading and Native Methods

Even though the native methods are written in C, they still execute within the context of the Java environment, including multithreading. Native methods will sometimes have sections of code that modify global (class or application) data, which modify important state variables, or which must call non-re-entrant functions. (Some platform-specific APIs fall into this category.) These sections are called critical sections, because it is critical that no more than one thread executes the section of code at a time. Critical sections are not unique to native methods: the same issues exist when dealing with multithreading in pure Java code.

In order to ensure that the native methods maintain the application in a consistent state, these native methods will have to be synchronized with other threads. The simplest way to accomplish this is to declare the native methods as synchronized in the Java code. Sometimes, however, this will be insufficient, either for performance reasons (for example, a method that does a long computation, but only changes the condition variable infrequently), or because a multithreaded application needs to use an existing object that does not have synchronized methods (most do not).

In these cases, your native code can perform the synchronization directly. In Java code, you could put the critical section inside a synchronized block. The native method analogue to a synchronized block directly uses a monitor.

Monitors prevent two threads from simultaneously executing a section of code. Monitors were described in C.A.R. Hoare's seminal paper "Monitors: An Operating System Structuring Concept" (Communications of the ACM, Vol. 17, No. 10, October 1974). Typically, each condition variable has a monitor associated with it. The monitor acts as a lock on that data. Unless a thread holds the monitor, it cannot modify or inspect that data. Obviously, a thread should hold the monitor as briefly as possible.

Note
In Java, critical sections are usually methods. You can synchronize blocks of code smaller than methods. However, this leads to complex code, with many failure modes. Deadlock prevention at the method level is fairly straightforward, but it becomes rapidly more difficult when dealing with many small blocks of synchronized code.
Even when dealing with native methods, it is best to use synchronization at the method level.

Monitors provide the Java runtime system support for thread synchronization. Every object instance has a unique monitor attached to it. In this case, the entire object is considered a condition variable. Through a trio of functions, native methods can also use monitors.

void monitorWait(unsigned int, int);
void monitorNotify(unsigned int);
void monitorNotifyAll(unsigned int);

These functions are analogous to the wait(), notify(), and notifyAll() methods in Java:

monitorWait: This function blocks the executing thread until the monitor is notified. If you encounter a deadlock, the first place to start is to check each occurrence of monitorWait().
monitorNotify: This function awakens no more than one waiting thread. It signals that the monitor is now available. It must only be called from the thread that holds the monitor. If an unhandled exception occurs while a thread holds the monitor, any threads waiting on the monitor will be blocked indefinitely.
monitorNotifyAll: This function awakens all threads waiting on the monitor. It signals that the monitor is now available.

Tip
Monitors are re-entrant, which means that a thread that already holds a monitor will not deadlock if it acquires that monitor again (for example, by calling another synchronized method on the same object).
Monitor operations are atomic. They are not subject to race conditions (although the code that calls them is).

The Java implementation of monitors has a nice feature that helps to prevent deadlock. Consider this scenario: a synchronized native method calls monitorWait, to wait for another synchronized method (native or not) to notify the monitor. Because the first method is synchronized, it already holds the monitor when monitorWait is called. How is the second method ever supposed to execute? The implementation of monitorWait makes this possible by releasing the monitor on entry. This allows the second method to execute and notify the monitor. Once the monitor is notified, the first thread awakens. On exit from monitorWait, the first thread acquires the monitor again.

All three of the monitor functions are declared as receiving an unsigned integer. To get the monitor for an object, use the obj_monitor macro on the object's handle.

Here is an example of a synchronized string, which is an instance of the producer/consumer problem. (Most synchronization problems can be reduced to one of two classic examples, producer/consumer and the colorfully named "dining philosophers problem." The dining philosophers problem is discussed in Chapter 20, "Working with Threads.") In a producer/consumer problem, one thread produces data items while another consumes them. Both operations must be synchronized so that the consumer gets each item exactly once. Listing 19.9 contains the Java definition of the synchronized string class, SyncString. Listing 19.10 shows the native method implementations of the SyncString.Set and SyncString.Get methods. Note once again that this could easily have been accomplished without the use of native methods. You will find that native methods are only useful in a very small set of circumstances.


Listing 19.9. Java class for SyncString.
class SyncString {
    String str;
    int bAvail;

    public native synchronized void Set(String s);
    public native synchronized String Get();

    static {
       try {
           System.loadLibrary("SyncString");
       } catch (UnsatisfiedLinkError e) {
           System.err.println("Cannot find library SyncString.");
           System.exit(-1);
       }
    }
}


Listing 19.10. Native language implementation for SyncString.
#include <StubPreamble.h>
#include "SyncString.h"


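/*
 * monitorWait takes a timeout as its second argument. TIMEOUT_INFINITY is
 * assumed here to be the runtime's "wait forever" value; check the monitor
 * header that ships with your JDK.
 */
void SyncString_Set(struct HSyncString *this,
                    struct Hjava_lang_String *str)
{
    /* Wait until the consumer has taken the previous string. */
    while(unhand(this)->bAvail == TRUE)
        monitorWait(obj_monitor(this), TIMEOUT_INFINITY);

    unhand(this)->bAvail = TRUE;
    unhand(this)->str = str;

    monitorNotify(obj_monitor(this));
}

struct Hjava_lang_String *SyncString_Get(struct HSyncString *this)
{
    Hjava_lang_String *hstrTemp;

    /* Wait until the producer has stored a string. */
    while(unhand(this)->bAvail == FALSE)
        monitorWait(obj_monitor(this), TIMEOUT_INFINITY);

    unhand(this)->bAvail = FALSE;
    hstrTemp = unhand(this)->str;

    monitorNotify(obj_monitor(this));

    return hstrTemp;
}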
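
If you want to experiment with the wait-in-a-loop pattern of Listing 19.10 outside the JVM, POSIX threads offer a close analogue. This is a sketch under assumptions, not the Java runtime's implementation. It swaps the object monitor for a pthread mutex plus condition variable, and every name in it is hypothetical. Like monitorWait, pthread_cond_wait releases the lock while it blocks and reacquires it before returning.

```c
#include <assert.h>
#include <pthread.h>

/* One-slot mailbox, playing the role of SyncString's str and bAvail. */
typedef struct Slot {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int             available;
    int             value;
} Slot;

void slot_set(Slot *s, int value)
{
    pthread_mutex_lock(&s->lock);
    while (s->available)                    /* wait for the consumer */
        pthread_cond_wait(&s->changed, &s->lock);
    s->value = value;
    s->available = 1;
    pthread_cond_broadcast(&s->changed);    /* like monitorNotifyAll */
    pthread_mutex_unlock(&s->lock);
}

int slot_get(Slot *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->available)                   /* wait for the producer */
        pthread_cond_wait(&s->changed, &s->lock);
    int value = s->value;
    s->available = 0;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
    return value;
}

static void *produce_one_to_five(void *arg)
{
    for (int i = 1; i <= 5; ++i)
        slot_set((Slot *)arg, i);
    return NULL;
}

/* One producer thread feeds the calling thread; returns the sum consumed. */
int slot_demo_sum(void)
{
    Slot s;
    pthread_t producer;
    int sum = 0;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.changed, NULL);
    s.available = 0;
    s.value = 0;

    if (pthread_create(&producer, NULL, produce_one_to_five, &s) != 0)
        return -1;
    for (int i = 0; i < 5; ++i)
        sum += slot_get(&s);
    pthread_join(producer, NULL);

    pthread_cond_destroy(&s.changed);
    pthread_mutex_destroy(&s.lock);
    return sum;
}
```

The while loops matter for the same reason they do in the native methods: a thread that wakes up must check the condition again before it proceeds, because being notified does not guarantee that the slot is in the state it was waiting for.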

Note
There are other functions that allow native code to construct new monitors not associated with an object. Use of these functions is strongly discouraged, due to the increasing difficulty of deadlock prevention in the face of many monitors.

Native Methods and Exceptions

Native methods can throw exceptions just like a regular Java method. As with any other method that throws an exception, the native method in the Java class must declare the exceptions that it can throw. This looks exactly like the throws clause for any other method.

Your native code throws the actual exception using the SignalError function:

void SignalError(ExecEnv *ee, char *Exception, char *SourceHint);

The first argument is the exception environment to use for the exception. Generally, you will pass a NULL here to indicate the current environment.

The second argument is the fully specified exception name. This name includes the package in which the exception is defined. It should also be specified as a path, replacing the usual period separators with forward slashes (that is, UNIX-style path separators).

The third argument is a source code hint. You can pass a string that describes the error or provides some useful details. This string will appear on the walkback if the exception is not handled. It is safe to pass a NULL here.
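
The slash-separated form of the exception name is easy to get wrong by hand. The following sketch converts a dotted class name into the path form described above. The helper exception_path_name is hypothetical and uses only the standard C library.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy a dotted class name into out, replacing each '.' with '/'.
 * Returns 0 on success, -1 if out is too small.
 * exception_path_name is a hypothetical helper, not a runtime function.
 */
int exception_path_name(const char *dotted, char *out, size_t out_size)
{
    assert(dotted != NULL && out != NULL);

    size_t len = strlen(dotted);
    if (len + 1 > out_size)
        return -1;
    for (size_t i = 0; i <= len; ++i)       /* <= copies the terminator too */
        out[i] = (dotted[i] == '.') ? '/' : dotted[i];
    return 0;
}
```

The converted string is what you would pass as the second argument to SignalError, with NULL for the first argument and an optional description for the third.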

Using Java Strings from Native Methods

Because there is an "impedance mismatch" between Java strings and C strings, there are several convenience functions that allow C code to use the Java strings. Java strings offer several advantages over C strings, including reference counting, automatic memory management, and Unicode support. On the other hand, no C runtime library function or API function will expect a Java string. The utility functions provide a bridge between Java and C strings.

These functions are defined in javaString.h, which will be included automatically if you include StubPreamble.h.

Summary

Native methods provide a powerful means of extending the capabilities of the Java runtime environment. Although native methods are inherently platform-specific, they are the way to reach capabilities of the host platform that the Java Virtual Machine does not expose. By tapping directly into the implementation of the Java object model, the native language functions can access instance and class variables, construct objects, and call static and dynamic methods. Native methods can be synchronized like ordinary Java methods. They can also throw exceptions. To bridge the gap between Java strings and C strings, the Java runtime environment provides a number of utility functions.

The steps to constructing native methods are as follows:

  1. Define the class interface.
  2. Write the Java classes.
  3. Compile the Java classes.
  4. Use javah to create the header and stub files.
  5. Write the implementation file.
  6. Compile the C files into a dynamically loadable library.
  7. Install the dynamically loadable library on the target platform.