|
| StreamTokenizer () |
| Default constructor.
|
|
| StreamTokenizer (TextReader sr) |
| Construct and set this object's TextReader to the one specified.
|
|
| StreamTokenizer (string str) |
| Construct and set a string to tokenize.
|
|
void | Display () |
| Display the state of this object.
|
|
void | Display (string prefix) |
| Display the state of this object, with a per-line prefix.
|
|
bool | NextToken (out Token token) |
| Get the next token. The last token will be an EofToken, unless there is an unterminated quote or unterminated block comment and Settings.DoUntermCheck is true, in which case this throws a StreamTokenizerUntermException or a subclass of it.
|
|
bool | Tokenize (IList< Token > tokens) |
| Parse the rest of the stream and add all the tokens to the supplied list. This resets the line number to 1.
|
|
bool | TokenizeReader (TextReader tr, IList< Token > tokens) |
| Parse all tokens from the specified TextReader and add them to the supplied list.
|
|
bool | TokenizeFile (string fileName, IList< Token > tokens) |
| Parse all tokens from the specified file and add them to the supplied list.
|
|
Token[] | TokenizeFile (string fileName) |
| Tokenize a file completely and return the tokens in a Token[].
|
|
bool | TokenizeString (string str, IList< Token > tokens) |
| Parse all tokens from the specified string and add them to the supplied list.
|
|
bool | TokenizeStream (Stream s, IList< Token > tokens) |
| Parse all tokens from the specified Stream and add them to the supplied list.
|
|
IEnumerator< Token > | GetEnumerator () |
| Returns an enumerator that iterates through the collection.
|
|
A StreamTokenizer similar to Java's. This breaks an input stream (coming from a TextReader) into Tokens based on various settings. The settings are stored in the TokenizerSettings property, which is a StreamTokenizerSettings instance.
This is configurable: you can modify the TokenizerSettings.CharTypes[] array to specify which characters belong to which type, and change other settings such as whether to look for comments.
WARNING: This is not internationalized. This treats all characters beyond the 7-bit ASCII range (decimal 127) as Word characters.
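To make the consequence of this warning concrete, the following self-contained sketch restates the rule as a small predicate and applies it to three non-ASCII characters. The helper ForcedToWord and the class NonAsciiSketch are hypothetical names used only for illustration. They are not part of the StreamTokenizer API, and the sketch does not call the tokenizer itself.

```csharp
using System;

static class NonAsciiSketch
{
    // Hypothetical helper, not part of the StreamTokenizer API. It restates the
    // documented rule: every character above decimal 127 counts as a Word character.
    public static bool ForcedToWord(char c)
    {
        return c > 127;
    }

    static void Main()
    {
        // e-acute (a letter), no-break space (whitespace), inverted question mark (punctuation)
        char[] samples = { '\u00E9', '\u00A0', '\u00BF' };
        foreach (char c in samples)
        {
            Console.WriteLine(
                "U+{0:X4}: letter={1}, whitespace={2}, punctuation={3}, forced to Word={4}",
                (int)c, char.IsLetter(c), char.IsWhiteSpace(c), char.IsPunctuation(c), ForcedToWord(c));
        }
    }
}
```

Under the documented rule, a no-break space or a non-ASCII punctuation mark would be absorbed into the neighbouring word rather than split off, which is worth keeping in mind for non-English input.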
There are two main ways to use this: 1) parse the entire stream at once and get a list of Tokens (see the Tokenize* methods), or 2) call NextToken() repeatedly. The tokenizer reads from a TextReader, which you can set directly, and it also provides convenience methods for parsing files and strings. It returns an Eof token when the end of the input is reached.
Here's an example of the NextToken() style of use:
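StreamTokenizer tokenizer = new StreamTokenizer();
tokenizer.GrabWhitespace = true;
tokenizer.TextReader = File.OpenText(fileName);
Token token;
// log is assumed to be a logger provided by the application
while (tokenizer.NextToken(out token)) log.Info("Token = '{0}'", token);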
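As the NextToken() member description notes, the loop can end with an exception instead of an Eof token when the input stops inside a quote or block comment. Below is a minimal sketch of guarding against that. It assumes that the settings object is reachable as tokenizer.Settings (the description above also calls it TokenizerSettings, so the exact property path is an assumption), that quote parsing is enabled, and that a using directive for the namespace containing StreamTokenizer has been added.

```csharp
using System;

class UntermCheckSketch
{
    static void Main()
    {
        // The input deliberately ends inside a quoted string.
        StreamTokenizer tokenizer = new StreamTokenizer("name = \"unterminated");

        // Assumption: the settings are exposed as Settings, as the NextToken
        // description suggests. Adjust if the property is named TokenizerSettings.
        tokenizer.Settings.DoUntermCheck = true;

        try
        {
            Token token;
            while (tokenizer.NextToken(out token))
            {
                Console.WriteLine("Token = '{0}'", token);
            }
        }
        catch (StreamTokenizerUntermException e)
        {
            // Raised in place of the usual final EofToken.
            Console.WriteLine("Unterminated quote or comment: {0}", e.Message);
        }
    }
}
```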
Here's an example of the Tokenize... style of use:
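// Tokenize takes an IList<Token>, so use a generic list
List<Token> tokens = new List<Token>();
if (!tokenizer.Tokenize(tokens))
{
    // tokenization failed; handle the error here
}
foreach (Token t in tokens) Console.WriteLine("t = {0}", t);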
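The convenience methods in the member list follow the same pattern as Tokenize(). The sketch below uses only members that appear in that list: the default and string constructors, TokenizeString(), and GetEnumerator(). The class name ConvenienceSketch and the sample input strings are illustrative. Treating a false return value as failure is an assumption drawn from the example above, and a using directive for the namespace containing StreamTokenizer is assumed as well.

```csharp
using System;
using System.Collections.Generic;

class ConvenienceSketch
{
    static void Main()
    {
        StreamTokenizer tokenizer = new StreamTokenizer();

        // TokenizeString adds every token from the string to the supplied list.
        List<Token> tokens = new List<Token>();
        if (!tokenizer.TokenizeString("x = 42; // answer", tokens))
        {
            Console.WriteLine("Tokenization reported a failure.");
        }
        Console.WriteLine("Token count: {0}", tokens.Count);

        // A public GetEnumerator() returning IEnumerator<Token> is listed,
        // so the tokenizer itself can be the source of a foreach loop.
        StreamTokenizer source = new StreamTokenizer("a b c");
        foreach (Token t in source)
        {
            Console.WriteLine("t = {0}", t);
        }
    }
}
```

The list form suits cases where all tokens are needed up front, while enumerating the tokenizer directly avoids holding the whole token list in memory.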
Comment delimiters are hardcoded (// and /*) and are not affected by the char type table.
This sets line numbers in the tokens it produces. A token's line number is normally the line on which the token starts. There is one known caveat: when the GrabWhitespace setting is true and a whitespace token contains a newline, that token's line number is set to the following line rather than the line on which the token started.