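Inside the JVM, text is represented as 16-bit Unicode (UTF-16 code units). For I/O, a UTF encoding (UCS Transformation Format, where UCS stands for Universal Character Set) such as UTF-8 is commonly used. UTF-8 is a variable-length encoding: it uses as many bytes as needed to encode each character.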
Often programs need to bring in information from an external source or send out information to an external destination. The information can be anywhere: in a file, on disk, somewhere on the network, in memory, or in another program. Also, it can be of any type: objects, characters, images, or sounds.
To bring in information, a program opens a stream on an information source (a file, memory or a socket) and reads the information serially. Similarly, a program can send information to an external destination by opening a stream to a destination and writing the information out serially.
No matter where the information is coming from or going to, and no matter what type of data is being read or written, the algorithms for reading and writing data are pretty much always the same.
Reading:
open a stream
while more information
read information
close the stream
Writing:
open a stream
while more information
write information
close the stream
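The open / loop / close pattern above can be sketched in Java roughly as follows. This is a minimal illustration, not part of the original notes: the class name CopyLoop and the helper copy are hypothetical, and in-memory byte array streams stand in for a real source and destination.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical example class: copies everything from one stream to another.
class CopyLoop {
    // Reads from 'in' until end of stream and writes every byte to 'out'.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {   // "while more information"
            out.write(buffer, 0, n);            // write only the bytes actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        // "open a stream" happens on construction; try-with-resources
        // performs "close the stream" when the block exits.
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```

The same loop works unchanged for a file or a socket, because it depends only on the InputStream and OutputStream types.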
This kind of general I/O uses the stream/reader/writer model. The classes are in the java.io package, and they view input and output as an ordered sequence of bytes or characters.
We can create I/O chains of arbitrary length by chaining these classes.
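As an illustration of chaining, the sketch below wraps an in-memory byte stream in a buffered stream and then in a data stream, and reverses the chain for reading. The class name ChainDemo and its two helper methods are made up for this example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical example class showing a three-link I/O chain.
class ChainDemo {
    // DataOutputStream -> BufferedOutputStream -> ByteArrayOutputStream
    static byte[] encode(int number, String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out =
                 new DataOutputStream(new BufferedOutputStream(sink))) {
            out.writeInt(number);
            out.writeUTF(text);
        } // closing the outermost stream flushes and closes the whole chain
        return sink.toByteArray();
    }

    // DataInputStream -> BufferedInputStream -> ByteArrayInputStream
    static String decode(byte[] bytes) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new BufferedInputStream(new ByteArrayInputStream(bytes)))) {
            int number = in.readInt();   // values must be read in the order written
            String text = in.readUTF();
            return number + ":" + text;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(42, "hi")));
    }
}
```

Only the outermost stream needs to be closed; each wrapper passes close and flush down to the stream it wraps.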
These classes are divided into two class hierarchies based on the data type (either characters or bytes) on which they operate. Streams operate on bytes while Readers/Writers operate on chars.
However, it's often more convenient to group the classes based on their purpose rather than on the data type they read and write. Thus, we can cross-group the streams by whether they read from and write to data "sinks" (low-level streams) or process the information as it's being read or written (high-level filter streams).
Low Level Streams / Data sink streams
Low Level Streams/Data sink streams read from or write to specialized data sinks such as strings, files, or pipes. Typically, for each reader or input stream intended to read from a specific kind of input source, java.io contains a parallel writer or output stream that can create it. The following table gives java.io's data sink streams.
|Sink Type | Character Streams | Byte Streams
|Memory | CharArrayReader, CharArrayWriter | ByteArrayInputStream, ByteArrayOutputStream
||Use these streams to read from and write to memory. You create these streams on an existing array and then use the read and write methods to read from or write to the array.
|Memory | StringReader, StringWriter | StringBufferInputStream
||Use StringReader to read characters from a String as it lives in memory. Use StringWriter to write to a String. StringWriter collects the characters written to it in a StringBuffer, which can then be converted to a String. StringBufferInputStream is similar to StringReader, except that it reads bytes from a String; it is deprecated because it does not convert characters to bytes properly.
|Pipe | PipedReader, PipedWriter | PipedInputStream, PipedOutputStream
||Implement the input and output components of a pipe. Pipes are used to channel the output from one program (or thread) into the input of another.
|File | FileReader, FileWriter | FileInputStream, FileOutputStream
||Collectively called file streams, these streams are used to read from or write to a file on the native file system.
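A small sketch of the in-memory sinks may help here. It reads characters from a StringReader and collects the result in a StringWriter. The class MemorySinks and the method upperCase are illustrative names only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical example class: uses a String as the source and a StringWriter as the sink.
class MemorySinks {
    static String upperCase(String input) throws IOException {
        try (Reader in = new StringReader(input);
             StringWriter out = new StringWriter()) {
            int c;
            while ((c = in.read()) != -1) {              // -1 signals end of input
                out.write(Character.toUpperCase((char) c));
            }
            return out.toString();                        // characters collected so far
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(upperCase("data sink"));
    }
}
```

Swapping StringReader for FileReader would change only the line that opens the source, not the loop.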
High Level Filter Streams / Processing streams
Processing streams perform some sort of operation, such as buffering or character encoding, as they read and write. Like the data sink streams, java.io often contains pairs of streams: one that performs a particular operation during reading and another that performs the same operation (or reverses it) during writing. This table gives java.io's processing streams.
|Process | Character Streams | Byte Streams
|Buffering | BufferedReader, BufferedWriter | BufferedInputStream, BufferedOutputStream
||Buffer data while reading or writing, thereby reducing the number of accesses required on the original data source. Buffered streams are typically more efficient than similar nonbuffered streams.
|Filtering | FilterReader, FilterWriter | FilterInputStream, FilterOutputStream
||FilterReader and FilterWriter are abstract classes, like their parents; FilterInputStream and FilterOutputStream are concrete classes meant to be subclassed. They define the interface for filter streams, which filter data as it's being read or written.
|Converting between Bytes and Characters | InputStreamReader, OutputStreamWriter | (none)
||A reader and writer pair that forms the bridge between byte streams and character streams. An InputStreamReader reads bytes from an InputStream and converts them to characters using either the default character encoding or a character encoding specified by name. Similarly, an OutputStreamWriter converts characters to bytes using either the default character encoding or a character encoding specified by name and then writes those bytes to an OutputStream.
|Concatenation | (none) | SequenceInputStream
||Concatenates multiple input streams into one input stream.
|Object Serialization | (none) | ObjectInputStream, ObjectOutputStream
||Used to serialize objects.
|Data Conversion | (none) | DataInputStream, DataOutputStream
||Read or write primitive Java data types in a machine-independent format. Implement the DataInput/DataOutput interfaces.
|Counting | LineNumberReader | LineNumberInputStream (deprecated)
||Keeps track of line numbers while reading.
|Peeking Ahead | PushbackReader | PushbackInputStream
||Two input streams, each with a pushback buffer of 1 character (or byte) by default. Sometimes, when reading data from a stream, you will find it useful to peek at the next item in the stream in order to decide what to do next. However, if you do peek ahead, you'll need to put the item back so that it can be read again and processed normally. Certain kinds of parsers need this functionality.
|Printing | PrintWriter | PrintStream
||Contain convenient printing methods. These are the easiest streams to write to, so you will often see other writable streams wrapped in one of these.
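To make the "peek, then put it back" idea concrete, here is a minimal sketch using PushbackReader. The class PeekDemo and the method readDigits are hypothetical names; the parser reads a run of digits and pushes back the first character that does not belong to it.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical example class: a tiny tokenizer step built on PushbackReader.
class PeekDemo {
    // Reads leading digits and leaves the first non-digit in the stream.
    static String readDigits(PushbackReader in) throws IOException {
        StringBuilder digits = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (!Character.isDigit(c)) {
                in.unread(c);        // put the character back for the next reader
                break;
            }
            digits.append((char) c);
        }
        return digits.toString();
    }

    public static void main(String[] args) throws IOException {
        try (PushbackReader in = new PushbackReader(new StringReader("123abc"))) {
            System.out.println(readDigits(in));     // the digit run
            System.out.println((char) in.read());   // the pushed-back character
        }
    }
}
```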
Reader and InputStream define similar APIs but for different data types. For example, Reader contains these methods for reading characters and arrays of characters:
int read() throws IOException
int read(char[] cbuf) throws IOException
abstract int read(char[] cbuf, int offset, int length) throws IOException
InputStream defines the same methods but for reading bytes and arrays of bytes:
abstract int read() throws IOException
int read(byte[] b) throws IOException
int read(byte[] b, int offset, int length) throws IOException
Also, both Reader and InputStream provide methods for marking a location in the stream, skipping input, and resetting the current position.
Both Reader and InputStream are abstract. Subclasses must implement the abstract read method: read(char[], int, int) for Reader (along with close()), and read() for InputStream.
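The marking, skipping, and resetting methods mentioned above can be tried on a StringReader, which supports mark. The following is only a sketch; MarkResetDemo and demo are illustrative names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical example class: exercises read, mark, reset and skip on a Reader.
class MarkResetDemo {
    static String demo() throws IOException {
        try (Reader in = new StringReader("abcdef")) {
            StringBuilder seen = new StringBuilder();
            seen.append((char) in.read());          // reads 'a'
            in.mark(16);                            // remember the position after 'a'
            seen.append((char) in.read());          // reads 'b'
            seen.append((char) in.read());          // reads 'c'
            in.reset();                             // back to the marked position
            in.skip(1);                             // skip over 'b'
            char[] rest = new char[8];
            int n = in.read(rest, 0, rest.length);  // reads the remaining "cdef"
            seen.append(rest, 0, n);
            return seen.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints "abccdef"
    }
}
```

Not every stream supports marking; markSupported() reports whether mark and reset can be used on a given instance.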
Writer and OutputStream are similarly parallel. Writer defines these methods for writing characters and arrays of characters:
void write(int c) throws IOException
void write(char[] cbuf) throws IOException
abstract void write(char[] cbuf, int offset, int length) throws IOException
And OutputStream defines the same methods but for bytes:
abstract void write(int b) throws IOException
void write(byte[] b) throws IOException
void write(byte[] b, int offset, int length) throws IOException
Writer defines extra methods to write strings.
void write(String str) throws IOException
void write(String str, int offset, int length) throws IOException
Both Writer and OutputStream are abstract. Subclasses must implement the abstract write method: write(char[], int, int) for Writer (along with flush() and close()), and write(int) for OutputStream.
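As a sketch of what implementing the abstract method involves, the hypothetical subclass below implements only write(int) and inherits the array-based write methods from OutputStream. The class name CountingOutputStream is chosen for this example and is not a java.io class.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical OutputStream subclass: discards data and counts the bytes written.
class CountingOutputStream extends OutputStream {
    private long count;

    // The only method a concrete OutputStream subclass must supply.
    @Override
    public void write(int b) {
        count++;
    }

    long getCount() {
        return count;
    }

    public static void main(String[] args) throws IOException {
        CountingOutputStream out = new CountingOutputStream();
        out.write(new byte[10]);        // inherited method, calls write(int) once per byte
        out.write(new byte[10], 2, 3);  // inherited method, writes 3 bytes
        out.write(65);
        System.out.println(out.getCount()); // prints 14
        out.close();
    }
}
```

Real subclasses usually also override write(byte[], int, int), since the inherited version writes one byte at a time.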
Constructors for some common streams, readers and writers:
File Input Stream
FileInputStream(String name) throws FileNotFoundException
FileInputStream(File file) throws FileNotFoundException
File Output Stream
FileOutputStream(String name) throws FileNotFoundException
FileOutputStream(String name, boolean append) throws FileNotFoundException
FileOutputStream(File file) throws FileNotFoundException
FileOutputStream(FileDescriptor fdObj)
Data Input Stream
DataInputStream(InputStream in)
Buffered Input/Output Stream
BufferedInputStream(InputStream in, int size)
BufferedOutputStream(OutputStream out, int size)
File Reader
FileReader(File file) throws FileNotFoundException
FileReader(FileDescriptor fdObj)
FileReader(String name) throws FileNotFoundException
File Writer
FileWriter(File file) throws IOException
FileWriter(String name) throws IOException
FileWriter(String name, boolean append) throws IOException
Input Stream Reader
InputStreamReader(InputStream in, String encodingName) throws UnsupportedEncodingException
Output Stream Writer
OutputStreamWriter(OutputStream out, String encodingName) throws UnsupportedEncodingException
Print Writer
PrintWriter(Writer out, boolean autoflush)
PrintWriter(OutputStream out, boolean autoflush)
Buffered Reader/Writer
BufferedReader(Reader in, int size)
BufferedWriter(Writer out, int size)
|Encoding Name
||Character Set Name
|ISO-8859-1
||ISO Latin-1 (subsumes ASCII)
|ISO-8859-5
||ISO Latin / Cyrillic
|UTF-8
||Standard UTF-8 (subsumes ASCII)
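Among the classes listed above, only OutputStreamWriter and InputStreamReader take an encoding name that overrides the default encoding of the host system. (Newer Java releases add charset parameters to some other classes; for example, FileReader and FileWriter gained them in Java 11.) The getEncoding method returns the name of the encoding in use.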
In UTF-8, ASCII characters take 1 byte each. Every Java char (a 16-bit value) is encoded in at most 3 bytes; a supplementary character, which Java represents as a surrogate pair of two chars, takes 4 bytes.
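The byte counts can be checked directly, and the same sketch shows an encoding being passed by name to OutputStreamWriter and InputStreamReader. EncodingDemo, utf8Length and roundTrip are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: UTF-8 lengths and a named-encoding round trip.
class EncodingDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Writes text as bytes in the named encoding, then reads it back as characters.
    static String roundTrip(String text, String encodingName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, encodingName)) {
            out.write(text);
        }
        try (Reader in = new InputStreamReader(
                 new ByteArrayInputStream(bytes.toByteArray()), encodingName)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(utf8Length("A"));             // 1: ASCII letter
        System.out.println(utf8Length("\u00e9"));        // 2: e with acute accent
        System.out.println(utf8Length("\u20ac"));        // 3: euro sign
        System.out.println(utf8Length("\uD83D\uDE00"));  // 4: one supplementary character
        System.out.println(roundTrip("caf\u00e9", "ISO-8859-1"));
    }
}
```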
All of the streams (readers, writers, input streams, and output streams) are automatically opened when created. Close a stream explicitly by calling its close method, or declare it in a try-with-resources statement (Java 7 and later) so that it is closed automatically. Do not rely on the garbage collector to close a stream: cleanup of an unreferenced stream is not guaranteed to happen promptly, or at all.
Closing an output stream or writer flushes it automatically. You can also call the flush method explicitly.
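The effect of flushing and closing is easiest to see with a buffered writer, as in the sketch below. FlushDemo and demo are illustrative names, and a StringWriter stands in for a real destination.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

// Hypothetical example class: shows when buffered characters reach the sink.
class FlushDemo {
    // Returns the sink's contents before flush, after flush, and after close.
    static String[] demo() throws IOException {
        StringWriter sink = new StringWriter();
        String[] snapshots = new String[3];
        try (BufferedWriter out = new BufferedWriter(sink)) {
            out.write("hello");
            snapshots[0] = sink.toString();   // still empty: data sits in the buffer
            out.flush();
            snapshots[1] = sink.toString();   // "hello" has been pushed to the sink
            out.write(" world");
        } // leaving the block closes the writer, which flushes the rest
        snapshots[2] = sink.toString();
        return snapshots;
    }

    public static void main(String[] args) throws IOException {
        for (String s : demo()) {
            System.out.println("[" + s + "]");
        }
    }
}
```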
new FileWriter("filename") or new FileOutputStream("filename") overwrites the file if "filename" already exists and creates it if it does not. To append instead, pass true as the second (append) argument.
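The difference between overwriting and appending can be sketched as follows. AppendDemo and writeTwice are hypothetical names, and the temporary file exists only for the demonstration.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical example class: compares overwrite mode with append mode.
class AppendDemo {
    static String writeTwice(File file, boolean append) throws IOException {
        try (Writer w = new FileWriter(file)) {          // creates or truncates the file
            w.write("first");
        }
        try (Writer w = new FileWriter(file, append)) {  // second argument selects append mode
            w.write("second");
        }
        // The content is plain ASCII, so reading it back as UTF-8 is safe here.
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("append-demo", ".txt");
        try {
            System.out.println(writeTwice(file, true));   // firstsecond
            System.out.println(writeTwice(file, false));  // second
        } finally {
            file.delete();
        }
    }
}
```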
Print writers provide the ability to write textual representations of Java primitive values. They are usually chained to low-level streams or writers. The print and println methods never throw an IOException; call checkError to find out whether an error has occurred.
PrintStream and PrintWriter can be created with the autoflush feature, so that each println call automatically flushes the output to the underlying stream. PrintStream is not deprecated (System.out and System.err are of this type), but PrintWriter is preferred when writing characters rather than bytes.
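The sketch below shows println on primitives with autoflush enabled, and how a failure surfaces through checkError instead of an exception. PrintDemo, format and errorIsSwallowed are illustrative names, and the failing Writer is a stand-in that simulates a broken destination.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical example class: PrintWriter formatting and error reporting.
class PrintDemo {
    // Prints three primitives, one per line, using the platform line separator.
    static String format() {
        StringWriter sink = new StringWriter();
        PrintWriter out = new PrintWriter(sink, true);  // true = flush on every println
        out.println(42);
        out.println(3.5);
        out.println(true);
        return sink.toString();
    }

    // Writes to a destination that always fails and reports what checkError says.
    static boolean errorIsSwallowed() {
        Writer broken = new Writer() {
            @Override
            public void write(char[] cbuf, int off, int len) throws IOException {
                throw new IOException("simulated failure");
            }
            @Override
            public void flush() { }
            @Override
            public void close() { }
        };
        PrintWriter out = new PrintWriter(broken);
        out.print("lost");          // no exception reaches the caller
        return out.checkError();    // true: the IOException was recorded internally
    }

    public static void main(String[] args) {
        System.out.print(format());
        System.out.println(errorIsSwallowed());
    }
}
```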
System.in is of InputStream type.
System.in, System.out, System.err are automatically created for a program, by JVM.
Use buffered streams to improve performance. BufferedReader also provides the readLine method, which returns null at the end of the stream.
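A typical readLine loop looks roughly like this. LineDemo and readLines are hypothetical names; any Reader can be passed in, and a StringReader is used here so the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: reads a character source line by line.
class LineDemo {
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {  // null means end of stream
                lines.add(line);                      // line terminators are stripped
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readLines(new StringReader("a\nb\r\nc")));
    }
}
```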
User-defined classes must implement the Serializable or Externalizable interface to be serialized.
Serializable is a marker interface with no methods. Externalizable has two methods to be implemented: readExternal(ObjectInput) and writeExternal(ObjectOutput).
ObjectOutputStream can write both Java primitives and object hierarchies. When a compound object is serialized, all of its serializable constituent objects are serialized too. A non-transient field that refers to a non-serializable object causes a NotSerializableException.
ObjectOutputStream implements ObjectOutput, which inherits from DataOutput.
All AWT components implement Serializable (Since Component class implements it), so by default we can just use an ObjectOutputStream to serialize any AWT component.
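The following sketch serializes a small object graph to a byte array and reads it back. The classes SerializationDemo, Person and Address are made up for the example; the point is that the nested Address object travels along with the Person that refers to it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class: round trip of a compound object.
class SerializationDemo {
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        final String city;
        Address(String city) { this.city = city; }
    }

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Address address;   // serialized together with the Person
        Person(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    static byte[] serialize(Object graph) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(1);          // a primitive, via the DataOutput methods
            out.writeObject(graph);   // the object and everything it references
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readInt();             // values come back in the order they were written
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person copy = (Person) deserialize(serialize(new Person("Ann", new Address("Oslo"))));
        System.out.println(copy.name + " lives in " + copy.address.city);
    }
}
```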
File class is used to navigate the file system.
Constructing a File instance (or Garbage-Collecting it) never affects the file system.
File class doesn't have a method to change the current working directory.
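That a File instance is only a pathname can be checked with a short sketch. FilePathDemo and demo are illustrative names, and the file name used is arbitrary.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical example class: constructing a File does not touch the file system.
class FilePathDemo {
    // Returns: exists before creation, creation result, exists after, deletion result.
    static boolean[] demo(File dir) throws IOException {
        File f = new File(dir, "not-yet-created.txt");  // nothing is created here
        boolean existedBefore = f.exists();
        boolean created = f.createNewFile();            // this call creates the file
        boolean existsAfter = f.exists();
        boolean deleted = f.delete();
        return new boolean[] {existedBefore, created, existsAfter, deleted};
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("file-demo").toFile();
        try {
            for (boolean b : demo(dir)) {
                System.out.println(b);
            }
        } finally {
            dir.delete();
        }
    }
}
```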
|File(File dir, String name)
||Creates a File instance that represents the file with the specified name in the specified directory.
|File(String path)
||Creates a File instance that represents the file whose pathname is the given path argument.
|File(String path, String name)
||Creates a File instance whose pathname is the pathname of the specified directory, followed by the separator character, followed by the name argument.
|boolean canRead()
||Tests if the application can read from the specified file.
|boolean canWrite()
||Tests if the application can write to this file.
|boolean delete()
||Deletes the file specified by this object.
|boolean exists()
||Tests if this File exists.
|String getAbsolutePath()
||Returns the absolute pathname of the file represented by this object.
|String getCanonicalPath() throws IOException
||Returns the canonical form of this File object's pathname.
|String getName()
||Returns the name of the file represented by this object.
|String getParent()
||Returns the parent part of the pathname of this File object, or null if the name has no parent part.
|String getPath()
||Returns the pathname of the file represented by this object.
|boolean isAbsolute()
||Tests if the file represented by this File object is an absolute pathname.
|boolean isDirectory()
||Tests if the file represented by this File object is a directory.
|boolean isFile()
||Tests if the file represented by this File object is a "normal" file.
|long lastModified()
||Returns the time that the file represented by this File object was last modified.
|long length()
||Returns the length of the file (in bytes) represented by this File object.
|String[] list()
||Returns a list of the names of the files in the directory specified by this File object.
|String[] list(FilenameFilter filter)
||Returns a list of the names of the files in the directory specified by this File that satisfy the specified filter.
FilenameFilter is an interface with a single method, accept(File dir, String name).
The list method calls accept for each entry in the directory and returns only the names for which accept returns true.
|boolean mkdir()
||Creates a directory whose pathname is specified by this File object.
|boolean mkdirs()
||Creates a directory whose pathname is specified by this File object, including any necessary parent directories.
|boolean renameTo(File dest)
||Renames the file specified by this File object to have the pathname given by the File argument.
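Several of the File methods in the table can be combined in one short sketch: creating nested directories, renaming a file, and listing a directory through a filename filter. DirectoryDemo and demo are hypothetical names, and the directory and file names are arbitrary.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical example class: mkdirs, renameTo and list with a filter.
class DirectoryDemo {
    static String[] demo(File root) throws IOException {
        File nested = new File(root, "a" + File.separator + "b");
        if (!nested.mkdirs()) {                       // creates "a" and then "a/b"
            throw new IOException("could not create " + nested);
        }
        File notes = new File(nested, "notes.txt");
        notes.createNewFile();
        new File(nested, "data.bin").createNewFile();

        File renamed = new File(nested, "notes-old.txt");
        if (!notes.renameTo(renamed)) {               // returns false on failure, no exception
            throw new IOException("could not rename " + notes);
        }

        // accept is called once per directory entry; only matching names are returned.
        FilenameFilter onlyTxt = (dir, name) -> name.endsWith(".txt");
        String[] names = nested.list(onlyTxt);        // null if 'nested' is not a directory
        return names == null ? new String[0] : names;
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("dir-demo").toFile();
        for (String name : demo(root)) {
            System.out.println(name);
        }
    }
}
```

Methods such as mkdirs, renameTo and delete report failure through their boolean return value, so the result should be checked rather than ignored.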
Instances of the file descriptor class serve as an opaque handle to the underlying machine-specific structure representing an open file or an open socket.
Applications should not create their own file descriptors.
RandomAccessFile lets you read/write at arbitrary places within files.
RAF provides methods to read/write bytes.
RAF also provides methods to read/write Java primitives and UTF strings. (RAF implements the interfaces DataInput and DataOutput)
RAF instances should be closed when no longer needed. A File instance only represents a pathname, so it has nothing to close.
Most read and write operations can throw an IOException, so our methods must either catch it or declare it in a throws clause.
The RAF constructors throw a SecurityException if a security manager denies the application access to the file.
RAF cannot be chained with streams/readers/writers.
RandomAccessFile(File file, String mode) throws FileNotFoundException, IllegalArgumentException, SecurityException
Creates a random access file stream to read from, and optionally to write to, the file specified by the File argument.
RandomAccessFile(String name, String mode) throws FileNotFoundException, IllegalArgumentException, SecurityException
Creates a random access file stream to read from, and optionally to write to, a file with the specified name.
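The mode argument must be "r" (open for input only) or "rw" (open for both input and output). Java 1.4 and later also accept "rws" and "rwd", which additionally require updates to be written synchronously to the storage device. Any other value causes an IllegalArgumentException.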
Some RAF methods
long getFilePointer() throws IOException
Returns the offset from the beginning of the file, in bytes, at which the next read or write occurs.
void seek(long pos) throws IOException
Sets the file-pointer offset, measured from the beginning of this file, at which the next read or write occurs. The offset may be set beyond the end of the file. Setting the offset beyond the end of the file does not change the file length. The file length will change only by writing after the offset has been set beyond the end of the file.
long length() throws IOException
Returns the length of this file, measured in bytes.
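The behaviour of getFilePointer, seek and length described above can be observed with a sketch like the following. RafDemo and demo are illustrative names, and the temporary file is created only for the demonstration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical example class: moving the file pointer in a RandomAccessFile.
class RafDemo {
    // Returns: pointer after the writes, the long read back,
    // length after seeking past the end, length after writing there.
    static long[] demo(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);                     // start from an empty file
            raf.writeInt(7);                      // bytes 0..3
            raf.writeLong(99L);                   // bytes 4..11
            long pointerAfterWrites = raf.getFilePointer();
            raf.seek(4);                          // jump back to the start of the long
            long value = raf.readLong();
            raf.seek(100);                        // beyond the end: length is unchanged
            long lengthAfterSeek = raf.length();
            raf.writeByte(1);                     // writing at offset 100 extends the file
            long lengthAfterWrite = raf.length();
            return new long[] {pointerAfterWrites, value, lengthAfterSeek, lengthAfterWrite};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("raf-demo", ".bin");
        try {
            for (long v : demo(file)) {
                System.out.println(v);   // prints 12, 99, 12, 101
            }
        } finally {
            file.delete();
        }
    }
}
```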