2004-05-19 The Java Specialists' Newsletter [Issue 088] - Resetting ObjectOutputStream
Author: Dr. Heinz M. Kabutz
If you are reading this, and
have not subscribed, please consider doing it now by
going to our
subscribe page. You can subscribe either via
email or RSS.
Welcome to the 88th edition
of The Java(tm) Specialists'
Newsletter. Our readership has increased to 100
countries, with the recent addition of Botswana. A
special welcome to my neighbouring country :-)
Please remember to forward these newsletters to
friends and colleagues who would be interested in
joining. The bigger the readership, the more
pressure I will be under to write new, good,
newsletters :-)
For all the Chinese readers
of this newsletter, please read our
Mandarin translation of the Design Patterns Brochure
and tell us what you think of it. [For the English
version, please click here] Please let me know if you
would be able to translate some of our newsletters
into Mandarin.
There comes a time in any
company, when it becomes important to know what its
"vision" and "mission" are. A vision and a mission
are supposed to help staff be more focused and
provide better, more consistent service to customers.
After some thought, we came up with the following
Vision for Maximum Solutions
- The Java(tm) Specialists (not set in concrete
yet):
Maximum Solutions
develops and provides the best training in the world
for professional Java programmers.
There is a saying that goes
round: "Those who cannot do, teach." The impression
is that trainers are usually teaching others because
they themselves are not good enough to be in the
real world. The saying is perhaps a bit unfair, so I
am determined to change this perception. A part of
our Mission is that in order to stay "the best" at
training, we make sure that all our trainers are
active in developing real
software.
That said, our collection of
courses is growing:
- Java Course on Java 2
Standard Edition
- Java Course on Java 2
Enterprise Edition
- Design Patterns in Java
- Design Patterns in
Delphi
- Java Data Objects
- UML and Object
Orientation (By Thanassis Tsintsifas)
- Webservices (By Thilo
Frotscher)
- Java Performance Tuning
(By Jack Shirazi)
Please
let me know via email if you would like more
information about the courses that we offer. I don't
mind getting lots of emails, so please don't
hesitate to email me :-) [the SPAMmers have no
qualms in sending me lots of exciting offers]
Resetting ObjectOutputStream
A class with many mysteries
is
java.io.ObjectOutputStream. For
instance, when and why should you reset the stream?
Let's look at an example.
First we have class Person, which is the class that
we want to send over the network:
public class Person implements java.io.Serializable {
  private final String firstName;
  private final String surname;
  private int age;

  public Person(String firstName, String surname, int age) {
    this.firstName = firstName;
    this.surname = surname;
    this.age = age;
  }

  public String toString() {
    return firstName + " " + surname + ", " + age;
  }

  public void setAge(int age) {
    this.age = age;
  }
}
Next we have the code that
Receives lots of Person objects and code that Sends
them:
import java.net.*;
import java.io.*;

public class Receiver {
  public static void main(String[] args) throws Exception {
    ServerSocket ss = new ServerSocket(7000);
    Socket socket = ss.accept();
    ObjectInputStream ois = new ObjectInputStream(
        socket.getInputStream());
    int count = 0;
    while (true) {
      Person p = (Person) ois.readObject();
      if (count++ % 1000 == 0) {
        System.out.println(p);
      }
    }
  }
}

import java.net.Socket;
import java.io.*;

public class Sender {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
        s.getOutputStream());
    Person p = new Person("Heinz", "Kabutz", 0);
    for (int age = 0; age < 1500 * 1000; age++) {
      p.setAge(age);
      oos.writeObject(p);
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
The output was:
java Receiver:
*snip*
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
java Sender:
That took 19548ms
When we run this, we will see
lots of Person objects on the Receiver side, but all
the age values will be 0, even though we changed the
age on the Sender side. Why is this?
When you construct an
ObjectOutputStream and an ObjectInputStream, they
each contain a cache of objects that have already
been sent across this stream. The cache relies on
object identity, rather than on equals() and
hashCode(). It is more similar to a
java.util.IdentityHashMap than to a normal
java.util.HashMap. So, if you resend the same
object, only a pointer to the object is sent across
the network. This is very clever, and saves network
bandwidth. However, the
ObjectOutputStream cannot detect whether your object
was changed internally, so the Receiver
just sees the same object over and over again. You
will notice that this was quite fast. We sent
1'500'000 objects in 19548ms (on my machine). Well,
strictly speaking, we only sent one object and
1'499'999 pointers to that object.
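The effect can be reproduced without a network. The following is a minimal sketch that writes the same object twice into a ByteArrayOutputStream and reads it back; the class name BackReferenceDemo and the simplified nested Person are made up for this illustration and are not part of the original code.

```java
import java.io.*;

public class BackReferenceDemo {
  // Simplified stand-in for the newsletter's Person class (hypothetical).
  public static class Person implements Serializable {
    private final String name;
    private int age;
    public Person(String name, int age) { this.name = name; this.age = age; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return age; }
  }

  /** Writes one Person twice, changing its age in between, then reads both back. */
  public static Person[] roundTrip() throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      Person p = new Person("Heinz", 0);
      oos.writeObject(p);  // the full object is serialized
      p.setAge(42);
      oos.writeObject(p);  // already in the cache: only a back-reference is written
    }
    try (ObjectInputStream ois = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()))) {
      Person first = (Person) ois.readObject();
      Person second = (Person) ois.readObject();
      return new Person[] {first, second};
    }
  }

  public static void main(String[] args) throws Exception {
    Person[] people = roundTrip();
    System.out.println("same instance: " + (people[0] == people[1]));
    System.out.println("age seen by reader: " + people[1].getAge());
  }
}
```

The reader gets the very same instance back for both reads, and it still carries the age that was current at the time of the first write.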
There seemed to be some
problem with sending the same Person object many
times, especially if the contents of that Person
changed. Due to the optimisation in
ObjectOutputStream, only the pointer to the Person
would be sent each time. So, what would happen if we
simply sent a new Person each time? Let's try it
out...
import java.net.Socket;
import java.io.*;

public class Sender2 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
        s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
This seems to run fine for a
while, until we all of a sudden see an
OutOfMemoryError on both the Receiver and the
Sender2. Someone once challenged me regarding the
pathetic speed of Java. They claimed that Java was
so slow that the Garbage Collector could not even
keep up with objects that were being read over the
network. It sounded strange to me that Java should
run out of memory, so after some questioning, we
traced the problem to the object cache growing in
the Receiver and never being cleared. Since the
Person objects are always distinct, they are put
into the cache on both sides of the stream. The
Receiver's side cannot clear entries from the table,
since it does not know which entries the Sender
might send again. The cache then keeps on growing
until the JVM runs out of memory.
Resetting ObjectOutputStream
One hack^H^H^H^Hsolution to
the OutOfMemory problem is to reset the cache on
both sides every time that you send an object.
Let's try out what that does to our performance:
import java.net.Socket;
import java.io.*;

public class Sender3 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
        s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
      oos.reset();
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
When I ran that, it worked
without causing any OutOfMemory Errors, so I should
be happy. But am I happy? I am old, after having to
wait for 314242ms for it to complete, i.e. 16 times
longer than with Sender. Sender was fast, but
incorrect. Sender2 ran out of memory. Sender3 was
correct, but slow. Is there no better way?
The problem with reset() is
that it clears the cache of ALL objects, even
constants such as the Strings "Heinz" and "Kabutz".
So, we end up sending these constants over the
network time and time again!
Unfortunately the reset() is an all-or-nothing
approach, so the entire cache will be lost. But
perhaps, if we don't clear it all the time, we can
get the advantage of speed and correctness? Let's
try that out:
import java.net.Socket;
import java.io.*;

public class Sender4 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
        s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
      if (age % 1000 == 0) oos.reset();
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
Because I don't reset the
cache on every call, Sender4 avoids sending the
Strings "Heinz" and "Kabutz" over the network
1'500'000 times, and it completes in just 66015ms.
In fact, it only has to send these Strings 1'500
times. If we reset the ObjectOutputStream too
frequently, we will increase the network bandwidth,
and if we do not reset it often enough, we will
increase the burden on our Garbage Collector. Like
all things in Java Performance Tuning, you have to
set it to the correct number, not too big and not
too small.
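To get a feel for the bandwidth side of this trade-off without a network, one can count the bytes that the stream produces for different reset intervals. The sketch below is only an illustration: the class ResetIntervalDemo, the method bytesWritten and the chosen counts and intervals are assumptions made for this example, and it measures bytes only, not memory use or elapsed time.

```java
import java.io.*;

public class ResetIntervalDemo {
  // Stand-in for the newsletter's Person class (hypothetical copy).
  public static class Person implements Serializable {
    private final String firstName;
    private final String surname;
    private final int age;
    public Person(String firstName, String surname, int age) {
      this.firstName = firstName;
      this.surname = surname;
      this.age = age;
    }
  }

  /**
   * Serializes count distinct Person objects into memory, calling reset()
   * after every interval writes, and returns the number of bytes produced.
   */
  public static int bytesWritten(int count, int interval) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      for (int age = 0; age < count; age++) {
        oos.writeObject(new Person("Heinz", "Kabutz", age));
        if ((age + 1) % interval == 0) {
          oos.reset();  // forgets everything written so far, Strings included
        }
      }
    }
    return bytes.size();
  }

  public static void main(String[] args) throws IOException {
    int count = 10000;
    for (int interval : new int[] {1, 100, 10000}) {
      System.out.println("reset every " + interval + " writes: "
          + bytesWritten(count, interval) + " bytes");
    }
  }
}
```

Resetting after every write produces the most bytes, because the two Strings and the description of the Person class have to be written again each time. The longer the interval, the fewer bytes are written, at the price of a larger cache on both sides.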
What About RMI?
I seem to recall that at some
point, RMI used the ObjectOutputStream mechanism to
convert the parameters of your functions into a
byte[]. The interesting part was that it would make
the ObjectOutputStream, write the objects, and then
close the ObjectOutputStream. This is akin to
resetting the stream each time that you write to it.
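The pattern described above, a fresh ObjectOutputStream for every call, can be sketched as follows. This is not RMI's actual code, only an illustration of the idea; the names FreshStreamDemo, toBytes and fromBytes are invented for the example.

```java
import java.io.*;

public class FreshStreamDemo {
  // Simplified stand-in for the newsletter's Person class (hypothetical).
  public static class Person implements Serializable {
    private final String name;
    private int age;
    public Person(String name, int age) { this.name = name; this.age = age; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return age; }
  }

  /** Serializes one object with a brand-new stream, so no cache is shared between calls. */
  public static byte[] toBytes(Object o) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      oos.writeObject(o);
    }
    return bytes.toByteArray();
  }

  /** Reads back a single object that was written by toBytes(). */
  public static Object fromBytes(byte[] data)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream ois = new ObjectInputStream(
        new ByteArrayInputStream(data))) {
      return ois.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Person p = new Person("Heinz", 0);
    byte[] first = toBytes(p);
    p.setAge(42);
    byte[] second = toBytes(p);  // the full object is written again
    System.out.println(((Person) fromBytes(first)).getAge());
    System.out.println(((Person) fromBytes(second)).getAge());
    System.out.println("same size: " + (first.length == second.length));
  }
}
```

Each call pays the full cost of writing the object and its class description, but changes to the object are always seen by the reader and no cache can grow without bound.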
Depending on how you would
want to transfer your data between two machines, and
depending on how many times there will be identical
objects sent across the network, it may pay you to
use ObjectOutputStreams directly, and be careful to
reset the stream before you run out of memory.
In the last newsletter, I
suggested that you could use the
sun.* classes in your code. I did
not emphasize strongly enough that you should be
very careful about using sun.* classes in your code,
since it would make your Java code non-portable
between JVM vendors. This is a newsletter for
Java Specialists, so I
will sometimes leave out such obvious details :-)
However, several readers mentioned that you could
achieve the same with a SecurityManager, which I had
forgotten about. I guess if you were not able to use
the SecurityManager, you could generate a stack
trace and find out who called you. However,
generating a stack trace would be rather inefficient
(another obvious fact that is hardly worth
mentioning ;-)
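The stack trace approach mentioned above could look roughly like the sketch below. The class CallerDemo and its method names are made up for this illustration. It assumes that the JVM fills in the complete stack trace, which the specification allows a virtual machine to omit in some circumstances, so treat the result as a hint rather than a guarantee.

```java
public class CallerDemo {
  /** Returns the name of the method that called callerMethodName(). */
  public static String callerMethodName() {
    StackTraceElement[] trace = new Throwable().getStackTrace();
    // trace[0] is callerMethodName() itself, trace[1] is whoever called it.
    return trace.length > 1 ? trace[1].getMethodName() : "<unknown>";
  }

  // Hypothetical method that wants to know its own name via the helper.
  public static String someBusinessMethod() {
    return callerMethodName();
  }

  public static void main(String[] args) {
    System.out.println("called from: " + someBusinessMethod());
  }
}
```

Creating a Throwable just to inspect the stack is expensive, which is the inefficiency referred to above.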
I want to personally thank
you for taking the time to read my newsletters. They
are a wonderful hobby for me and I thoroughly enjoy
publishing them as a free resource to other Java
Specialists. Please remember to forward them to
friends, mention them on mailing lists, tell
colleagues, etc. so that others can also enjoy them
:-)
Lastly, I am collecting
quotes of what my happy readers think of my
newsletter. If you have some nice words that would
make others subscribe to The Java(tm) Specialists'
Newsletter, would you please send them to me?
Kind regards
Heinz