java.lang.Object | |
↳ | java.nio.charset.CharsetDecoder |
A converter that can convert a byte sequence from a charset into a 16-bit Unicode character sequence.
The input byte sequence is wrapped by a ByteBuffer and the output character sequence is a CharBuffer. A decoder instance should be used in the following sequence, which is referred to as a decoding operation:
1. Invoke the reset method to reset the decoder if it has been used.
2. Invoke the decode method until no additional input is needed; the endOfInput parameter must be set to false, the input buffer must be filled and the output buffer must be flushed between invocations.
3. Invoke the decode method one last time; the endOfInput parameter must be set to true.
4. Invoke the flush method to flush the output.
The decode method converts as many bytes as possible and stops only when the input bytes have run out, the output buffer has been filled, or an error has occurred. A CoderResult instance is returned to indicate the reason for stopping. The invoker can inspect the result and choose a further action: filling the input buffer, flushing the output buffer, or recovering from an error and trying again.
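The decoding operation described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `StreamingDecodeSketch`, the helper names, and the buffer sizes are assumptions made for the example, and UTF-8 is chosen arbitrarily.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StreamingDecodeSketch {
    /** Decodes the chunks as one UTF-8 stream; a chunk boundary may split a character. */
    static String decodeChunks(byte[][] chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        decoder.reset();                             // reset before (re)use
        ByteBuffer in = ByteBuffer.allocate(64);     // assumed to fit one chunk plus leftover bytes
        CharBuffer out = CharBuffer.allocate(4);     // deliberately small so OVERFLOW occurs
        StringBuilder sb = new StringBuilder();

        for (byte[] chunk : chunks) {
            in.put(chunk);                           // fill the input buffer
            in.flip();
            pump(decoder, in, out, false, sb);       // more input may follow
            in.compact();                            // keep bytes of an incomplete character
        }
        in.flip();
        pump(decoder, in, out, true, sb);            // final call with endOfInput = true
        while (decoder.flush(out).isOverflow()) {    // flush any pending output
            drain(out, sb);
        }
        drain(out, sb);
        return sb.toString();
    }

    // Calls decode until it reports UNDERFLOW, draining the output after each call.
    private static void pump(CharsetDecoder d, ByteBuffer in, CharBuffer out,
                             boolean endOfInput, StringBuilder sb) {
        while (true) {
            CoderResult r = d.decode(in, out, endOfInput);
            drain(out, sb);
            if (r.isUnderflow()) {
                return;
            }
            if (r.isError()) {
                throw new IllegalArgumentException("Decoding failed: " + r);
            }
            // Otherwise OVERFLOW: the output was drained, so decode again.
        }
    }

    // Moves the decoded characters into the builder and empties the output buffer.
    private static void drain(CharBuffer out, StringBuilder sb) {
        out.flip();
        sb.append(out);
        out.clear();
    }
}
```

Passing false for endOfInput lets the decoder leave the bytes of an incomplete character in the input buffer, which is why the sketch compacts the buffer rather than clearing it.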
There are two common decoding errors. A malformed-input error is reported when the input byte sequence is illegal for the charset in use. An unmappable-character error is reported when a legal input byte sequence cannot be mapped to a Unicode character equivalent.
Both errors can be handled in three ways. The default is to report the error to the invoker through a CoderResult instance; the alternatives are to ignore the erroneous input or to replace it with the replacement string. The replacement string is "�" by default and can be changed by invoking the replaceWith method. The invoker chooses a behavior by specifying a CodingErrorAction instance for each error type via the onMalformedInput and onUnmappableCharacter methods.
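As a rough illustration of the three error-handling choices, the sketch below decodes the same bytes under a caller-supplied CodingErrorAction. The class and method names (`ErrorActionSketch`, `decodeWith`) are hypothetical, and UTF-8 is used only as an example charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ErrorActionSketch {
    /** Decodes bytes as UTF-8, applying the same action to both error types. */
    static String decodeWith(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        // With REPORT, decode(ByteBuffer) surfaces an error as an exception.
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }
}
```

For the input bytes `'a', 0xFF, 'b'`, where 0xFF is never legal in UTF-8, REPLACE yields `a�b`, IGNORE yields `ab`, and REPORT throws a MalformedInputException.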
This is an abstract class that encapsulates many common operations of the decoding process for all charsets. Decoders for a specific charset should extend this class and need only implement the decodeLoop method for the basic decoding. If a subclass maintains internal state, it should also override the implFlush and implReset methods.
This class is not thread-safe.
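A minimal sketch of a subclass is shown below, assuming a toy decoder that maps each input byte to the character with the same code (ISO-8859-1 semantics). The class name `Latin1DecoderSketch` is hypothetical; a real decoder would normally be created by its Charset's newDecoder method rather than constructed directly. Because the sketch keeps no internal state, it does not override implFlush or implReset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical toy decoder, for illustration only.
class Latin1DecoderSketch extends CharsetDecoder {
    Latin1DecoderSketch() {
        // One character per input byte, both on average and at most.
        super(StandardCharsets.ISO_8859_1, 1.0f, 1.0f);
    }

    @Override
    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;      // caller must provide more output space
            }
            out.put((char) (in.get() & 0xFF));    // treat the byte as an unsigned code point
        }
        return CoderResult.UNDERFLOW;             // all available input was consumed
    }
}
```

The inherited decode and flush methods supply the rest of the decoding operation, so `new Latin1DecoderSketch().decode(ByteBuffer.wrap(bytes))` works without further code.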
Protected Constructors | |
---|---|
CharsetDecoder(Charset, float, float) | Constructs a new CharsetDecoder using the given Charset, the average and maximum number of characters created by this decoder for one input byte, and the default replacement string "�". |
Public Methods | |
---|---|
averageCharsPerByte() | Gets the average number of characters created by this decoder for a single input byte. |
charset() | Gets the Charset which this decoder uses. |
decode(ByteBuffer, CharBuffer, boolean) | Decodes bytes starting at the current position of the given input buffer, and writes the equivalent character sequence into the given output buffer from its current position. |
decode(ByteBuffer) | This is a facade method for the decoding operation. |
detectedCharset() | Gets the charset detected by this decoder; this method is optional. |
flush(CharBuffer) | Flushes this decoder. |
isAutoDetecting() | Indicates whether this decoder implements an auto-detecting charset. |
isCharsetDetected() | Indicates whether this decoder has detected a charset; this method is optional. |
malformedInputAction() | Gets this decoder's CodingErrorAction for malformed input encountered during the decoding process. |
maxCharsPerByte() | Gets the maximum number of characters that this decoder can create for one input byte; the value is always positive. |
onMalformedInput(CodingErrorAction) | Sets this decoder's action on malformed input errors. |
onUnmappableCharacter(CodingErrorAction) | Sets this decoder's action on unmappable character errors. |
replaceWith(String) | Sets the new replacement string. |
replacement() | Gets the replacement string, which is never null or empty. |
reset() | Resets this decoder. |
unmappableCharacterAction() | Gets this decoder's CodingErrorAction for unmappable character errors encountered during the decoding process. |
Protected Methods | |
---|---|
decodeLoop(ByteBuffer, CharBuffer) | Decodes bytes into characters. |
implFlush(CharBuffer) | Flushes this decoder. |
implOnMalformedInput(CodingErrorAction) | Notifies that this decoder's CodingErrorAction specified for malformed input errors has been changed. |
implOnUnmappableCharacter(CodingErrorAction) | Notifies that this decoder's CodingErrorAction specified for unmappable character errors has been changed. |
implReplaceWith(String) | Notifies that this decoder's replacement has been changed. |
implReset() | Resets this decoder's charset-related state. |
Inherited Methods | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
From class java.lang.Object
|
Constructs a new CharsetDecoder using the given Charset, the average and maximum number of characters created by this decoder for one input byte, and the default replacement string "�".
charset | the Charset to be used by this decoder. |
---|---|
averageCharsPerByte | the average number of characters created by this decoder for one input byte, must be positive. |
maxCharsPerByte | the maximum number of characters created by this decoder for one input byte, must be positive. |
IllegalArgumentException | if averageCharsPerByte or maxCharsPerByte is not positive. |
---|
Gets the average number of characters created by this decoder for a single input byte.
Gets the Charset which this decoder uses.
Returns the Charset which this decoder uses.
Decodes bytes starting at the current position of the given input buffer, and writes the equivalent character sequence into the given output buffer from its current position.
The buffers' positions advance as bytes are read and characters are written, but their limits and marks are left unchanged.
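A CoderResult instance is returned according to the following rules:
- CoderResult.OVERFLOW indicates that the output buffer is full even though not all of the input has been processed. This method should be called again with an out argument that has not already been filled.
- CoderResult.UNDERFLOW indicates that as many bytes as possible from the input buffer have been decoded. If there is no further input and no bytes remain in the input buffer, the operation may be regarded as complete; otherwise this method should be called again with additional input.
- A malformed-input result indicates that a malformed input error was encountered. The erroneous bytes start at the input buffer's position, and their number is given by the result's length. This result is returned only if the malformed input action is CodingErrorAction.REPORT.
- An unmappable-character result indicates that an unmappable character error was encountered. The erroneous bytes start at the input buffer's position, and their number is given by the result's length. This result is returned only if the unmappable character action is CodingErrorAction.REPORT.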
The endOfInput parameter indicates whether the invoker can provide further input. It should be true if and only if the bytes in the current input buffer are all of the remaining input for this decoding operation. Passing false and then failing to provide more input is common and does not cause an error. Passing true in several consecutive invocations may cause an error, because any input left undecoded is then treated as malformed.
This method invokes the decodeLoop method to implement the basic decode logic for a specific charset.
in | the input buffer. |
---|---|
out | the output buffer. |
endOfInput | true if all the input bytes have been provided. |
Returns a CoderResult instance which indicates the reason for termination.
IllegalStateException | if decoding has started or no more input is needed in this decoding process. |
---|---|
CoderMalfunctionError | if the decodeLoop method threw a BufferUnderflowException or BufferOverflowException. |
This is a facade method for the decoding operation.
This method decodes the remaining byte sequence of the given byte buffer into a new character buffer. It performs a complete decoding operation: it first resets the decoder, then decodes, and finally flushes.
This method should not be invoked while another decode operation is ongoing.
in | the input buffer. |
---|
Returns a new CharBuffer containing the characters produced by this decoding operation. The buffer's position will be zero, and its limit will mark the end of the decoded characters.
IllegalStateException | if another decoding operation is ongoing. |
---|---|
MalformedInputException | if an illegal input byte sequence for this charset was encountered, and the action for malformed input errors is CodingErrorAction.REPORT. |
UnmappableCharacterException | if a legal but unmappable input byte sequence for this charset was encountered, and the action for unmappable character error is CodingErrorAction.REPORT. Unmappable means the byte sequence at the input buffer's current position cannot be mapped to a Unicode character sequence. |
CharacterCodingException | if another exception happened during the decode operation. |
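One way to use the facade method and its exceptions is a strict well-formedness check, sketched below. The class and method names are hypothetical, and the sketch relies on a newly created decoder reporting errors by default, as described in the class overview.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class FacadeDecodeSketch {
    /** Returns true if the bytes form a complete, well-formed UTF-8 sequence. */
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            // A fresh decoder is used per call, so no other operation can be ongoing.
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Covers MalformedInputException and UnmappableCharacterException.
            return false;
        }
    }
}
```

Because the facade treats the buffer as the entire input, a sequence that ends in the middle of a multi-byte character is rejected as malformed.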
Gets the charset detected by this decoder; this method is optional.
If implementing an auto-detecting charset, then this decoder returns the detected charset from this method when it is available. The returned charset will be the same for the rest of the decode operation.
If insufficient bytes have been read to determine the charset, an IllegalStateException will be thrown.
The default implementation always throws UnsupportedOperationException, so it should be overridden by a subclass if needed.
UnsupportedOperationException | if this decoder does not implement an auto-detecting charset. |
---|---|
IllegalStateException | if insufficient bytes have been read to determine the charset. |
Flushes this decoder. This method will call implFlush. Some decoders may need to write some characters to the output buffer when they have read all input bytes; subclasses can override implFlush to perform the writing operation.
The number of characters written will not exceed out.remaining(). If a decoder needs to write more characters than the output buffer's remaining space allows, CoderResult.OVERFLOW is returned, and this method must be called again with a character buffer that has more remaining space. Otherwise this method returns CoderResult.UNDERFLOW, which means the decoding process has been completed successfully.
During the flush, the output buffer's position is changed accordingly, while its mark and limit remain unchanged.
out | the given output buffer. |
---|
Returns CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
IllegalStateException | if this decoder has not read all input bytes during the decoding process, that is, if this method is called neither after decode(ByteBuffer) nor after decode(ByteBuffer, CharBuffer, boolean) with true as the value of the last parameter. |
---|
Indicates whether this decoder implements an auto-detecting charset.
Returns true if this decoder implements an auto-detecting charset.
Indicates whether this decoder has detected a charset; this method is optional.
If this decoder implements an auto-detecting charset, then this method may start to return true during a decoding operation to indicate that a charset has been detected in the input bytes and that the charset can be retrieved by invoking the detectedCharset method.
Note that a decoder that implements an auto-detecting charset may still succeed in decoding a portion of the given input even when it is unable to detect the charset. For this reason users should be aware that a false return value does not indicate that no decoding took place.
The default implementation always throws an UnsupportedOperationException; it should be overridden by a subclass if needed.
Returns true if this decoder has detected a charset.
UnsupportedOperationException | if this decoder doesn't implement an auto-detecting charset. |
---|
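Because the optional detection methods throw UnsupportedOperationException by default, a caller may want to check isAutoDetecting first. The sketch below shows one such guard; the class name, method name, and returned label strings are hypothetical.

```java
import java.nio.charset.CharsetDecoder;

// Hypothetical helper class, for illustration only.
class AutoDetectSketch {
    /** Describes the charset a decoder is using without risking an exception. */
    static String describe(CharsetDecoder decoder) {
        if (!decoder.isAutoDetecting()) {
            // Not auto-detecting: the optional methods may throw, so avoid them.
            return "fixed:" + decoder.charset().name();
        }
        return decoder.isCharsetDetected()
                ? "detected:" + decoder.detectedCharset().name()
                : "undetected";
    }
}
```

For a decoder obtained from `StandardCharsets.UTF_8.newDecoder()`, which is not auto-detecting, the sketch returns `fixed:UTF-8`.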
Gets this decoder's CodingErrorAction for malformed input encountered during the decoding process.
Returns this decoder's CodingErrorAction for malformed input encountered during the decoding process.
Gets the maximum number of characters that this decoder can create for one input byte; the value is always positive.
Sets this decoder's action on malformed input errors. This method will call the implOnMalformedInput method with the given new action as argument.
newAction | the new action on malformed input error. |
---|
IllegalArgumentException | if newAction is null. |
---|
Sets this decoder's action on unmappable character errors. This method will call the implOnUnmappableCharacter method with the given new action as argument.
newAction | the new action on unmappable character error. |
---|
IllegalArgumentException | if newAction is null. |
---|
Sets the new replacement string. This method first checks the given replacement's validity, then changes the replacement value, and at last calls the implReplaceWith method with the given new replacement as argument.
newReplacement | the replacement string, cannot be null or empty. Its length cannot be larger than maxCharsPerByte(). |
---|
IllegalArgumentException | if the given replacement cannot satisfy the requirement mentioned above. |
---|
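The following sketch shows replaceWith combined with the REPLACE action. The class and method names are hypothetical, and UTF-8 is used only as an example. The sketch assumes that with REPLACE set for both error types the facade decode method does not report coding errors, so the checked exception is treated as unexpected.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ReplacementSketch {
    /** Decodes UTF-8 bytes, substituting the given string for erroneous input. */
    static String decodeLenient(byte[] bytes, String replacement) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(replacement);  // IllegalArgumentException if null, empty, or too long
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Assumed not to happen when both actions are REPLACE.
            throw new IllegalStateException("Unexpected coding error", e);
        }
    }
}
```

With the input bytes `'a', 0xFF, 'b'` and the replacement `"?"`, the sketch produces `a?b`; an empty replacement is rejected with an IllegalArgumentException.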
Gets the replacement string, which is never null or empty.
Resets this decoder. This method resets the internal state and then calls implReset() to reset any state related to the specific charset.
Gets this decoder's CodingErrorAction for unmappable character errors encountered during the decoding process.
Returns this decoder's CodingErrorAction for unmappable character errors encountered during the decoding process.
Decodes bytes into characters. This method is called by the decode method.
This method implements the essential decoding operation. It does not stop decoding until all the input bytes have been read, the output buffer is full, or an error is encountered. It then returns a CoderResult object indicating the result of the current decoding operation; the rules for constructing the CoderResult are the same as for decode. When an error is encountered, most implementations return a relevant result object to the decode method, while some performance-optimized implementations may handle the error and apply the error action themselves.
The buffers are scanned from their current positions, and their positions are modified accordingly, while their marks and limits remain unchanged. At most in.remaining() bytes will be read, and at most out.remaining() characters will be written.
Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.
in | the input buffer. |
---|---|
out | the output buffer. |
Returns a CoderResult instance indicating the result.
Flushes this decoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.
out | the output buffer. |
---|
Returns CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
Notifies that this decoder's CodingErrorAction specified for malformed input errors has been changed. The default implementation does nothing; this method can be overridden if needed.
newAction | the new action. |
---|
Notifies that this decoder's CodingErrorAction specified for unmappable character errors has been changed. The default implementation does nothing; this method can be overridden if needed.
newAction | the new action. |
---|
Notifies that this decoder's replacement has been changed. The default implementation does nothing; this method can be overridden if needed.
newReplacement | the new replacement string. |
---|
Resets this decoder's charset-related state. The default implementation does nothing; this method can be overridden if needed.