Java Io Classes Are Impossible To Understand

Java's designers managed to use almost every combination of Input, Output, Stream, Reader, Writer, Buffered, and Print to name classes. And even for very simple tasks, you have to combine two or three of them.


The java.io library uses the DecoratorPattern, which already enables one to combine features like buffering. It's not hard, and it's not impossible to understand; otherwise, how would millions of programmers use it all the time? If you understand the DecoratorPattern, method overloading, and return types - and you lay out the menagerie of Java IO classes - then you should also be able to understand how they work.
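A minimal sketch of the layering described above, using an in-memory byte source so that it runs without any files. The class and method names (DecoratorLayers, firstLine) are made up for illustration; only the java.io classes are real.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class DecoratorLayers {
    // Wraps a raw byte stream in a decoder, then in a buffering layer.
    static String firstLine(InputStream bytes) throws IOException {
        Reader decoded = new InputStreamReader(bytes, StandardCharsets.UTF_8); // bytes -> chars
        try (BufferedReader lines = new BufferedReader(decoded)) {             // adds buffering and readLine()
            return lines.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream source =
            new ByteArrayInputStream("hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(source)); // prints "hello"
    }
}
```

Each layer accepts the type below it, so swapping the ByteArrayInputStream for a FileInputStream or a socket stream leaves the rest unchanged.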

But, some would argue that OoLacksMathArgument partly because the programmer predetermines possible data structures using methods which are not supported on any solid mathematical foundation. They select classes which are on file in a library, on the basis of ideas which are contrived, not derived from axioms, postulates, and definitions. Thus an ad hoc query, for example, must specify objects relative to the structure the programmer imposes on the data, but that structure may be somewhat arbitrary. -- NonTopAnonymousDonor

However, trying to use Java to beat up on ObjectOrientation is a bit of a StrawMan.

There are plenty of OO styles not represented in the Java libraries. For example, features like "buffered" could be considered generally orthogonal and selected in a buffet-restaurant kind of way. Decorator tends to assume that you can add functionality by nesting layers. An alternative goes something like this:

  // Feature Smorgasbord
  iofm = new IOfeatureManager();
  iofm.featureX = on;
  iofm.featureY = off;
  iofm.featureZ = foobar;
  ...etc...  // note there will be defaults we don't have to set
  ios = new IOstream(iofm);
  ...use ios...

The above 'featuremanager' idea offers less functionality than the Decorator pattern. For instance, suppose I come up with a new feature Foo. With the Decorator pattern, I simply write a FooReader. Then I use this to wrap other readers or have other readers wrap it. Easy. How would I do this with featuremanager? What happens if someone else comes along with a new feature Bar and wants to combine it with Foo and various other features that exist? Java has lots of broken/bad APIs, but sometimes the solution they've provided is the most suitable given the requirements. -- AdewaleOshineye
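For illustration only, here is roughly what such a FooReader could look like, with upper-casing standing in for the hypothetical "Foo" feature. UpperCaseReader is an invented name; FilterReader is the standard base class for reader decorators. The sketch converts one char at a time, so it ignores surrogate pairs.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical "Foo" feature: upper-cases every character that passes through.
class UpperCaseReader extends FilterReader {
    UpperCaseReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == -1 ? -1 : Character.toUpperCase((char) c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = off; i < off + n; i++) { // body is skipped when n is -1 (end of stream)
            buf[i] = Character.toUpperCase(buf[i]);
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // The new reader wraps an existing reader and is itself wrapped by another.
        try (BufferedReader r =
                 new BufferedReader(new UpperCaseReader(new StringReader("foo bar\n")))) {
            System.out.println(r.readLine()); // prints "FOO BAR"
        }
    }
}
```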

How does the feature-manager prevent decorator?

Feature manager doesn't _prevent_ Decorator. However, Decorator means you don't need feature-manager.
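One way to read the point above is that a feature manager can be a thin layer over the decorators rather than a replacement for them. A sketch under that assumption follows; ReaderFeatures and its fields are invented for illustration and are not part of any Java library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical feature manager: plain fields with defaults, applied as decorator layers.
class ReaderFeatures {
    boolean buffered = true;     // default: on
    boolean lineNumbers = false; // default: off

    Reader open(Reader source) {
        Reader r = source;
        if (lineNumbers) {
            r = new LineNumberReader(r); // LineNumberReader is itself a BufferedReader
        } else if (buffered) {
            r = new BufferedReader(r);
        }
        return r;
    }

    public static void main(String[] args) throws IOException {
        ReaderFeatures features = new ReaderFeatures();
        features.lineNumbers = true; // everything else keeps its default
        try (LineNumberReader r =
                 (LineNumberReader) features.open(new StringReader("a\nb\n"))) {
            r.readLine();
            r.readLine();
            System.out.println(r.getLineNumber()); // prints 2
        }
    }
}
```

A new feature still has to be written as a decorator first; the manager only chooses which ones to stack.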

It appears that the problem with Decorator is that it does not offer "typical" defaults, but rather starts with a blank slate (or a very small slate). Perhaps there is a general trade-off between having good automatic defaults and being extendable feature-wise?

If you code "extendable feature-wise" then you'll end up with a large number of possible combinations. It's trivial to combine these features, so there's no real need for the vendor to try to predict what any particular programmer wants as the default. The standard libraries can't do everything for you. That's why organizations end up building their own libraries of reusable code: they can look at previous projects and identify the combinations they actually used. This is a good thing.

Perhaps the audience of Java is a large IT shop with internal standards groups. The problem is that some use Java where something like PHP may be more appropriate because of the lower barrier to entry. It is sort of a phallic symbol kind of thing where people think that what is good for General Motors is good for Joe's Bar And Grill also. See also "Ease of Use Vs. Ease of Implementation Change" below.


Antidote: the programmer simply learns to select the best possible combination of basic commands in view of the problem at hand.

When it takes 5 days to figure out how to print "Hello World", something is F'd.

Here we list a few idioms for opening files and file-like things in java. This is not meant as an argument pro or con java io, but rather as a handy reference for those who would rather approach the io javadocs with an example in hand. (Honest, I've had to poll the local java mailing list to figure these things out.) -- WardCunningham

An Applet reads the text/plain output of a cgi script.

    URL script = new URL(getDocumentBase(), "script.cgi");
    input = new BufferedReader(new InputStreamReader((InputStream) script.getContent()));

An Applet gets an image packaged with the class.

    Image icon = getImage(getCodeBase(), "icon.gif");

It's obvious that Java's IO classes aren't impossible to understand since many people understand them well enough to use them effectively. I highly recommend Sun's tutorial (http://java.sun.com/docs/books/tutorial/essential/io/index.html). -- EricHodges

Part of the problem with the IO classes is that they offer a lot of flexibility but the language doesn't seem to offer an easy way to circumvent all that flexibility and just do the one simple thing you want to do, like just read in all the contents of a medium-sized text file. My Java toolbox contains a simple little FileReader class that wraps all that so I only had to figure it out once. Luckily, OO makes that reuse a little easier. -- francis

Exactly. Once you've whipped up your own FileReader with readLines and read methods (like Python) you can use it just like any other Reader and benefit from being able to reuse any code that expects a Reader. I find it hard to conceive of another approach that offers all these benefits with this level of simplicity. -- AdewaleOshineye
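A sketch of what such a toolbox class might look like. The name TextReader and both helper methods are invented here; it extends BufferedReader so that it remains usable anywhere a Reader is expected.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical convenience wrapper: still a Reader, plus read-everything helpers.
class TextReader extends BufferedReader {
    TextReader(Reader in) {
        super(in);
    }

    // Reads all remaining characters into one String.
    String readAll() throws IOException {
        StringBuilder text = new StringBuilder();
        char[] chunk = new char[4096];
        for (int n = read(chunk, 0, chunk.length); n != -1; n = read(chunk, 0, chunk.length)) {
            text.append(chunk, 0, n);
        }
        return text.toString();
    }

    // Reads all remaining lines, without their line terminators.
    List<String> readLines() throws IOException {
        List<String> lines = new ArrayList<>();
        for (String line = readLine(); line != null; line = readLine()) {
            lines.add(line);
        }
        return lines;
    }
}
```

For a file, one would construct it as new TextReader(new InputStreamReader(new FileInputStream(path), "UTF-8")), choosing the encoding once, in one place.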


A good interface design is one that is simple when you need it to be simple and only exposes its guts when you need them. Java IO is turned inside out, showing all its guts.

Sun did not provide a set of convenience methods that cover the most often used combinations of file/stream/io.

One could write them. Then they would have their own library of convenience methods/utilities.

If the majority of users have to write their own to get a simple interface that is simply there in most other languages, then I would consider that interface a design smell. It is almost like having to write your own add operation.

Or one could just download commons-io from the Apache project which already does this.

I would suggest Sun mention this in their documentation rather than give the user a headache when they just want to echo a simple something to a file.


Java's I/O support is no more complex than that offered by C++, except you have to deal with reading ASCII as Unicode sometimes. Other than that a lot of the stuff I used to have to write is provided by Sun. And my I/O code isn't peppered with "#ifdef"s for multiple OSs.

The real question is, how did you become convinced that I/O libraries don't need all these options? -- EricHodges

Here's a simple exercise for you, Eric. You get a file name and read 4 bytes from the offset 0xA0 in the file as an int (big endian); those will represent another offset in the file. Go to this latter offset, where you'll read a text stream as follows: the first 4 bytes are the stream length, next comes the encoding (one of: ASCII, UTF-8, UTF-16) followed by '\n', and then the text in the corresponding encoding. Give that text to a Parser that expects a Reader (line oriented).

So let's see how convenient that great java.io design is. -- CostinCozianu

Does this have anything to do with my question? -- EricHodges

It has to do with the trolling mode that some of top's foes (including you) have engaged in, perhaps unwittingly. I don't think any position is worth defending merely because top attacks it. java.io may be relatively comprehensible but it is not a monument of good IO design, basically the observation that there are too many classes and too many methods is absolutely right; what is even funnier is that you have so many and so little power. Now let's see some code, please. -- ??

What will my code add to the discussion? -- EricHodges [Nothing - don't fall for the troll-bait. -- tms]

The fact that you can't do it easily will tell you how little power you have in those heavy java.io classes. -- Costin?

		String fileName = "file.name";
		RandomAccessFile aRandomAccessFile =
			new RandomAccessFile(fileName, "r");
		aRandomAccessFile.seek(0xA0);
		int offset = aRandomAccessFile.readInt();
		aRandomAccessFile.seek(offset);
		int streamLength = aRandomAccessFile.readInt();
		String costinsEncodingName = aRandomAccessFile.readLine();
		String charsetName =
			translateCostinsEncodingNameToJavaCharsetName(costinsEncodingName);
		// Reopen the file as a byte stream and skip to the start of the text.
		// Skip on the bytes, before decoding: Reader.skip() counts characters, not bytes.
		FileInputStream aFileInputStream = new FileInputStream(fileName);
		aFileInputStream.skip(
			offset + 4 + costinsEncodingName.length() + "\n".length());
		InputStreamReader anInputStreamReader =
			new InputStreamReader(aFileInputStream, charsetName);
		parser.parse(anInputStreamReader, streamLength);

Perhaps I misunderstand the requirements, but I found it relatively easy to write this code. I don't understand how this relates to the presence or absence of options and the ease of understanding issues being discussed on this page. -- EricHodges

OK, now that you have written slightly better code, I can only comment that if, ten years ago, I had opened a file, moved the cursor to the position I needed, and then reopened the file in the next statement only to move a second cursor to the same position, I'd have been fired on the spot. How in the hell can you justify such a thing other than by the bad design of the Java libraries? So after exploring an unsafe solution, you came back with a safe but pretty ugly one. If you were programming this in any other language, you'd never have reopened the same file. -- Costin

You sound like you're angry at me. Is that your intention? I'd do the same thing in C. How would you write this in C or C++? -- EricHodges

Let me have a try; it seems to be straightforward:

            FileInputStream fis = new FileInputStream(filename);
            // --------- Goto 0xA0
            long offset = (long)0xA0;
            long skipped = 0;
            while (skipped<offset) skipped += fis.skip(offset - skipped);
            // --------- Read offset
            offset = new DataInputStream(fis).readInt();
            skipped += 4;
            assert offset >= skipped;
            // --------- goto offset
            while (skipped<offset) skipped += fis.skip(offset - skipped);
            // --------- read length
            int length = new DataInputStream(fis).readInt();
            assert length > 0;
            // --------- read encoding, a little dumb, but safe
            StringBuffer encBuffer = new StringBuffer();
            for (int b = fis.read(); b!='\n' && b!=-1; b=fis.read()) encBuffer.append((char)b);
            String encoding = encBuffer.toString();
            // --------- create reader
            // It will be better to add a filter that will terminate after reading length bytes here.
            BufferedReader reader = new BufferedReader(new InputStreamReader(fis, translateToJavaEncodingName(encoding)));
            // --------- parse
            parser.parse(length, reader);

I have tested it briefly with a file, seems to work as expected. -- OliverChung

I guess this is what you were asking for

            import java.io.IOException;
            import java.io.InputStream;
            import java.io.Reader;

            public class PartialFileReader extends Reader {
                        private InputStream backedInput;
                        private boolean firstRead = true;
                        private int length;
                        private String encoding;
                        public PartialFileReader(InputStream input){
                                    this.backedInput = input;
                        }
                        public void close() throws IOException {
                                    backedInput.close();
                        }
                        public int read(char[] cbuf, int off, int len) throws IOException {
                                    if(firstRead){
                                                //-- Go to offset A0, using backedInput.skip(16*10).
                                                //-- Read a new offset using skip|mark|reset to go to new address.
                                                //-- read stream size, assign it to length. read encoding.
                                                firstRead = false;
                                    }
                                    // read from input stream and decode it using String(byte[], encoding)
                                    // decrement length using number of bytes read.
                                    // fill the decoded chars into cbuf and return how many were stored.
                                    return -1; // placeholder (end of stream) until the steps above are filled in
                        }
            }

Then you can use it like

            InputStream fileIn = new FileInputStream(fileName); // get an InputStream from the file (fileName is a placeholder)
            Reader reader = new PartialFileReader(fileIn);
            yourParser.use(reader);

Now your parser would simply call read() (or readLine(), via a BufferedReader) and get all the text, not knowing that offset resolution is going on in there. Suppose in the future you support another kind of file format. Your parser doesn't have to know that (it simply thinks it reads from a plain text file), but you can implement another reader to handle that file format.

What I tried to suggest was that, unlike in any other language that I know of, an object (file descriptor, etc.) that represents a file open for random access (RandomAccessFile) is not type compatible with InputStream or with OutputStream, so we have to write adapters to wrap RandomAccessFile. In my opinion this makes for an unnecessary complication when accessing random files. It's true that most file access is probably either write only (append only) or read only; however, there's no logical reason to treat files with random access differently.
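To make the adapter complaint concrete, here is a sketch of the kind of wrapper one ends up writing, applied to the exercise above so that the file is opened only once. RandomAccessInputStream and openEmbeddedText are invented names. The sketch assumes the stream length is a byte count and that the encoding name stored in the file is one that Java's charset lookup accepts (such as UTF-8).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;

// Hypothetical adapter: exposes at most `limit` bytes of a RandomAccessFile,
// starting at its current position, as an InputStream.
class RandomAccessInputStream extends InputStream {
    private final RandomAccessFile file;
    private long remaining;

    RandomAccessInputStream(RandomAccessFile file, long limit) {
        this.file = file;
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) return -1;
        int b = file.read();
        if (b != -1) remaining--;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (remaining <= 0) return -1;
        int n = file.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) remaining -= n;
        return n;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    // Opens the file once and returns a line-oriented reader over the embedded text.
    static BufferedReader openEmbeddedText(String fileName) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(fileName, "r");
        try {
            raf.seek(0xA0);
            int offset = raf.readInt();          // readInt() is big-endian
            raf.seek(offset);
            int length = raf.readInt();          // assumed to be a byte count
            String charsetName = raf.readLine(); // consumes the trailing '\n'
            return new BufferedReader(new InputStreamReader(
                new RandomAccessInputStream(raf, length), charsetName));
        } catch (IOException | RuntimeException e) {
            raf.close();                         // do not leak the file on failure
            throw e;
        }
    }
}
```

The result can be handed directly to a parser that expects a Reader; closing it closes the underlying file.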

At the other extreme, I think, is DotNet, which has a single abstract base class Stream that is used for everything (read, write, seek). I don't know if this is the right solution, but it certainly makes a lot of things simpler. The DotNet Stream class also has the advantage that it supports true asynchronous I/O, whereas in Java poor developers have a whole new conceptual load to suffer with JavaNewIo, which, I believe, is the single most cumbersome part of the base Java platform.


Does that mean there's another API out there where this can be done easily?

At the risk of being a total troglodyte, I'd offer the Unix IO library as offering more power with less weight than the Java menagerie. The wrapper provided by both VisualWorks and VisualAge does a pretty good job of surfacing all of its expressivity in a perfectly usable (by the rest of Smalltalk) way - all in all, far better than the Java mess. -- TomStambaugh

Correct me if I'm wrong but don't the standard unix IO libraries tend to be relatively unsafe? Isn't that very power/complexity what leads to so many buffer overflows, memory leaks, etc?

Of course they're unsafe! So is coding in assembler. That's what wrappers (AlternateHardAndSoftLayers) are for. ts

Several script languages also handle the mapping of binary formats quite neatly. Having coded in both I'd probably agree with some of the sentiment here: for simple IO operations (90% of the IO you want to do), Java's IO system is just overly complex and confusing. -- Setok

I think that describing the Unix I/O API as having more power than Java is a total misrepresentation. Unix I/O provides five basic functions: open, read, write, ioctl and close. This provides only unbuffered, binary input and output. Translated into Java, the open/read/write/close functions are covered by two classes: FileInputStream and FileOutputStream. Is Java really much harder to use?
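As a rough illustration of that mapping, here is an unbuffered binary copy using only those two classes. The class name UnbufferedCopy is made up, and the comments pairing each line with a Unix call are a loose analogy rather than a statement about what the JVM does internally.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class UnbufferedCopy {
    // Copies a file byte for byte and returns the number of bytes copied.
    static long copy(String from, String to) throws IOException {
        try (InputStream in = new FileInputStream(from);     // roughly open() for reading
             OutputStream out = new FileOutputStream(to)) {  // roughly open() for writing, truncating
            byte[] buf = new byte[8192];
            long total = 0;
            for (int n = in.read(buf); n != -1; n = in.read(buf)) { // roughly read()
                out.write(buf, 0, n);                                // roughly write()
                total += n;
            }
            return total;
        } // both streams are closed here, roughly close()
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy(args[0], args[1]) + " bytes copied");
    }
}
```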

The ioctl function provides a generic interface for controlling streams. Ioctl is really a way to call any number of functions in the kernel, so the Unix I/O API is not five basic functions but an open ended set of functions that you have to learn on a device-by-device basis.

Note: for the sake of simplicity, I've ignored the fcntl function that is different from ioctl for no apparent reason, the select and poll functions used for multiplexing, scatter/gather I/O, the error handling functions, the signal handling functions for asynchronous I/O, the entire socket API, and the SysV IPC message API which is incompatible with the stream I/O and socket APIs.

Any reasonable application needs to implement I/O memory management, I/O buffering, and text input and output. None of these functions are trivial. The Unix I/O API leaves that entirely up to the application. Usually it is implemented by the standard library of the language. The C library provides buffering of file input and output, but does not support text streams with multibyte characters and I/O streams other than files and standard input/output/error. The C I/O library is so limited that many alternatives have been developed: many scripting languages implement their own I/O libraries directly on top of the Unix API and bypass the C I/O functions entirely. The Java library supports buffering, a huge number of text formats, and the NIO package supports asynchronous and scatter/gather I/O. How many alternative I/O APIs have been written for Java?


... java.io may be relatively comprehensible but it is not a monument of good IO design, basically the observation that there are too many classes and too many methods is absolutely right ...

So it would seem that even Costin shares our disagreement with the claim that JavaIoClassesAreImpossibleToUnderstand. I share Costin's observation that there are too many classes, too many methods, and not enough power in the Java IO classes. I'd go even further and make the same observation about Java in general. Perhaps, instead of deepening yet another "JavaSucks?" or "JavaIsDead" argument, we might just stop with the apparent consensus (minus the page originator) that the Java IO classes are possible - even straightforward - to understand.

-- TomStambaugh

I think the originator of this page is trying to say that it is impossible to understand what went through the minds of the java.io designers when they came up with such seemingly weird choices. To understand a framework, you have to be in the same mindset as the framework's creator and understand the why and the what for, so that when you need to do something you instantly click to the justified solution within that framework. From this perspective it is not difficult to argue that JavaIoClassesAreImpossibleToUnderstand.

There's a simple explanation for it, though. JavaIO evolved over a long period of time, and compatibility was paramount. After the initial mistake of making InputStream/OutputStream abstract classes instead of interfaces (that's why RandomAccessFile ended up being type incompatible) and splitting the hair in seven with RandomAccessFile, DataInput/DataOutput, InputStream/OutputStream, Reader/Writer, things would have been refactored in any normal project. But the Java SDK is not like any normal project. -- CostinCozianu

It looks like every project I've ever worked on that published an API. You can't fix a published interface. The best you can do is add a better version and hope people upgrade. -- EricHodges

I think we may be getting too tied up in precise definitions of "impossible" and "understand". My interpretation is that the original author was using exaggeration to make the point that the Java I/O classes have a steep learning curve before one can use them effectively. I am not sure this is a controversial observation, nor would it be limited to either the Java I/O classes or Java in general.

A more interesting discussion might be over the value of "few options with few capabilities" versus "many options with many capabilities". Another possibility would be the value of "few methods using switch parameters" versus "many methods avoiding switch parameters."

-- WayneMack

Too many methods? Reminds me of saying a song has too many notes. Are you saying that someone shouldn't want the services offered by the methods? Or that they are necessary because of poor design? Or you think you'll never need them so they shouldn't be there? Some examples would be interesting.

I like your metaphor of a song with "too many notes". Some folks like music with lots of notes. Some folks like music with fewer notes. When I use Java, I use the Java IO classes. I find them perfectly understandable and enormously bloated. Others may find them to their liking. Fair enough. -- TomStambaugh


Go slow in learning them. They are complicated but there is a reason for most of it. I like the CoreJava books, http://www.horstmann.com/corejava.html. Volume 1 has a good chapter on IO in Java. Still, a nice handy fopen() and printf would be a good thing. -- RobertField


How do I share a buffer between an output and input stream? How do I put the contents of a string Writer into an Input Stream? Just curious.

Complicated questions. What do you want the shared buffer for? If it is for communicating between threads, you can use PipedInputStream/PipedOutputStream, which have an internal buffer but whose behaviour is unspecified. For the second question, the answer can only be messy: a StringWriter is character oriented and probably UTF encoded, so you can't write bytes into it. Therefore the bytes you'd expect from an InputStream may or may not have some surprises. In any case you can try new ByteArrayInputStream(aStringWriter.toString().getBytes()) and hope for the best; note that getBytes() with no argument uses the platform's default encoding.
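A sketch of the second answer with the encoding made explicit, which removes most of the "hope for the best". WriterToStream and toInputStream are invented names; the idea is simply to pass a Charset to getBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WriterToStream {
    // Encodes everything written so far into bytes, using the given charset.
    static InputStream toInputStream(StringWriter writer, Charset charset) {
        return new ByteArrayInputStream(writer.toString().getBytes(charset));
    }

    public static void main(String[] args) throws Exception {
        StringWriter w = new StringWriter();
        w.write("na\u00efve"); // five characters, one of them non-ASCII
        InputStream in = toInputStream(w, StandardCharsets.UTF_8);
        System.out.println(in.available()); // prints 6: the non-ASCII character takes two bytes in UTF-8
    }
}
```

The stream is a snapshot: anything written to the StringWriter afterwards does not appear in it.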
"Impossible" to understand is an exaggeration. Sorry about that. But I only was able to make sense of it after reading Thinking in Java (which explains the decorator pattern) and Core Java (and don't know the main uses by heart yet). I still think it could be simpler. I'm used to read one or two tutorials and after that I use reference guides. But I wasn't able to do that with Java IO classes: The hierarchy doesn't show how to "decorate" the objects. There is a branch for Input/OutputStreams? and another for Readers/Writers. And then, you have the adapter classes InputStreamReader? and OutputStreamWriter?. Reading and writing operations are not always symmetrical: I guess it's usual to read files with BufferedReader? and write to them with PrintWriters? (why not BufferedWriters??). System.out and System.err are PrintSreams?, where they could be PrintWriters?. The list goes on... -- AnonymousDonor

Read the DesignPatternsBook or MarkGrand's PatternsInJava. They explain the ideas behind most modern OO libraries. And one does usually use BufferedWriters to write to files.


I don't like the hierarchy of classes either. Most of my objections have already been cited, but I'd also like to say that the names aren't even that great: especially the way that 'InputStream's are supposed to be considered as something parallel to 'Readers' and similarly for 'OutputStream's and 'Writers'. In each case the former name implies a passive object and the latter implies a very active one. And then they have similar function names... so you read/write the stream, but ask the reader/writer to do the reading/writing. Ugh. A minor objection, and certainly I encounter similar difficulties when I name my own classes, but it rather grates on the nerves.

-- KarlKnechtel

The parallel of input streams and readers may seem redundant to US developers who only process ASCII text, but it is a godsend if you have to read text in multiple encodings. I have written programs that read, and sometimes translate between, EBCDIC, BIG5, GB, UTF-8, UTF-16, etc. Having separate classes for streams and readers made it clear to every programmer whether he was dealing with a properly decoded text stream or a raw byte stream. Older implementations that perform such things in C are easily 10-100x longer and much more complicated than the equivalent Java implementation.

-- OliverChung
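A small sketch of the byte/character distinction described above. It uses UTF-8 and ISO-8859-1 only because both are guaranteed to be available on every JVM; the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {
    // Decodes raw bytes into text by putting a Reader on top of a byte stream.
    static String decode(byte[] raw, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader chars = new InputStreamReader(new ByteArrayInputStream(raw), charset)) {
            for (int c = chars.read(); c != -1; c = chars.read()) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "\u00e9".getBytes(StandardCharsets.UTF_8); // e-acute: two bytes, 0xC3 0xA9
        System.out.println(decode(raw, StandardCharsets.UTF_8).length());      // prints 1
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

The same two bytes are one character or two depending on the decoder, which is exactly the information an InputStream lacks and a Reader carries.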


Wow, Java 1.5 (a.k.a. 5.0) adds some constructors to PrintWriter so that you can open a file for writing (with buffering) by passing only its name, as a string. To me, it's a welcome feature. If I need something fancy, I will use the decorator pattern, but if I need something simple, there are now easier-to-use constructors. I don't know if reading from files has been simplified, though.

-- OP
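For reference, a sketch using one of those constructors. It uses the two-argument form, PrintWriter(String fileName, String csn), so that the encoding is explicit; the one-argument form mentioned above uses the platform default. PrintWriterByName and writeGreeting are illustrative names, and the try-with-resources syntax is from a later Java version than 1.5.

```java
import java.io.IOException;
import java.io.PrintWriter;

class PrintWriterByName {
    // Creates (or truncates) the named file and writes one line to it.
    static void writeGreeting(String fileName) throws IOException {
        try (PrintWriter out = new PrintWriter(fileName, "UTF-8")) {
            out.println("Hello World");
        }
    }

    public static void main(String[] args) throws IOException {
        writeGreeting("hello.txt"); // hypothetical file name
    }
}
```

PrintWriter does not throw on write errors; call checkError() if that matters.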


I still have to admit to not seeing the point. OK, so they allow a lot of possibilities. Now take a look at the Tcl IO system. You have channels. In fact, everything to do with IO is channels. A channel can be read or write or both. You have a selection of commands to read and write data to those channels. The channels can be configured to do automatic translation (crlf, unicode, binary etc). You can attach callbacks to channels to be notified when you can read or write. If you want to get hardcore you can stack channels, so it's possible, for example, to add SSL to any channel, or other operations, offering you a lot of flexibility which you can safely ignore most of the time.

So I don't get it. I have yet to come across a single thing that I could do (or would need to do) with Java's object mess that I couldn't do in Tcl, and easier.

Keep the most common things easy. Like reading all data from a file:
    set file [open "myfile.txt" r]
    set text [read $file]
    close $file
That's the kind of thing people do 90% of the time. It is very easy to understand and easy to remember, even after your first go at it. With Java I still find myself looking up things in the API documents, browsing back and forth, remembering which classes to use. -- KristofferLawson

Firstly, the C API does not allow you to do that anyway: in C, you'll have to iterate through several consecutive fread calls, unless you have a very precise idea of the size of the file (which is *very* complicated to get dynamically). Next, you claim your file is a text file, but what's the encoding? How is the text variable encoded? UTF-8? ASCII? Something else? How can you be sure the guy who wrote the file used the same encoding as yours? People all over the world are forced every day to write their own names incorrectly just because the coder thought that reading a text file was just a matter of pouring some bytes from the disk into memory. Yes, it seems easy the way you propose, but that's because it is just wrong. As usual, if a program doesn't have to actually work, it can meet any other requirement. The Java IO classes might indeed be difficult to understand... for people who don't bother to care about non-English speakers.

-- PhilippeDetournay


Ease of Use Vs. Ease of Implementation Change

Re: "...supposing I come up with a new feature Foo. With the Decorator pattern, I simply write a FooReader. Then I use this to wrap other readers or have other readers wrap it. Easy. How would I do this with featuremanager?"

As the user of an interface, I don't want to have to care about how a new feature is added. I care about the interface being pleasant to me as an interface user (app developer). Adding a new feature to the I/O library should not be my concern under most circumstances. Maybe with yet more indirection both can be achieved, even if it results in ugly implementation code. But I'll leave that as a reader exercise.

As a user, the above "Feature Smorgasbord" approach is probably how I personally would want to interface with the I/O libraries. I don't want a goofy interface just to make the implementers' job easier. That shouldn't be the overriding concern for heavily-used libraries. Maybe there's a fundamental trade-off? Is Java not powerful or meta enough to give me a nice interface and be sufficiently extendable? I'm just asking.

I'm not clear what you're asking in the rest. The Java I/O classes are by no means ideal, and are arguably quite flawed, but are at least a somewhat tolerable balance between extensibility, composability, and usability.

If there is such an inherent trade-off? I'm merely asking "why?".

Good question. I shall think on this. A hasty answer might start with, "because Java is a rather poorly-designed language", but I'd prefer to expand on that.

Feb 8, 2010 - Any results? I was quite curious to see your analysis.

At some point, I'll do a detailed analysis. That point, alas, is not now. I have Other Things To Do.

Thanks for the feedback.

See: MultipartFormDataParsingExample

Last edited February 8, 2010