Class LexingStream

java.lang.Object
nz.org.riskscape.dsl.LexingStream

public class LexingStream extends Object

Provides an API for the Lexer to consume characters. Tracks the position within the source code and provides convenience methods that support token matching.

  • Constructor Details

    • LexingStream

      public LexingStream(String source)
  • Method Details

    • asCharSequence

      public CharSequence asCharSequence()

      Offers a view over the source string without copying it. This matters when lexing with the Pattern class, which needs a character sequence to match against. String.substring(int) and String.subSequence(int, int) both copy the underlying characters, which for our source code can be a lot of data, and the copy would be repeated every time the lexer attempts to match a token. This approach is much cheaper.
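
      The idea of matching a Pattern against a view rather than a copy can be sketched as follows. This is only an illustration, not the LexingStream implementation: the class CharSequenceViewSketch and its methods are hypothetical, and java.nio.CharBuffer.wrap is used here as one standard way to get a read-only window over a String without copying its characters.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of the documented API.
class CharSequenceViewSketch {
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // A read-only view of source from index to the end. The characters
    // are not copied; the buffer reads through to the original String.
    static CharSequence viewFrom(String source, int index) {
        return CharBuffer.wrap(source, index, source.length());
    }

    // Length of the identifier starting at index, or 0 if there is none.
    static int matchIdentifier(String source, int index) {
        Matcher m = IDENTIFIER.matcher(viewFrom(source, index));
        // lookingAt() anchors the match at the start of the view only.
        return m.lookingAt() ? m.end() : 0;
    }
}
```

      Offsets reported by the Matcher are relative to the start of the view, so a caller would add them to the current index to get a position in the full source.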

    • peek

      public char peek()

      Look at the next character without consuming it, returning '\0' if the stream is at EOF.

    • next

      public char next()

      Consume the next character from the string, advancing the position as we go.
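
      A typical peek-then-consume loop might look like the sketch below. PeekNextSketch is a hypothetical stand-in written for this example, not the real class. It tracks only an index, and its behaviour of returning '\0' from next() at EOF is an assumption; the documentation above only states the EOF result for peek().

```java
// Hypothetical stand-in mirroring the documented peek()/next() semantics.
class PeekNextSketch {
    private final String source;
    private int index = 0;

    PeekNextSketch(String source) {
        this.source = source;
    }

    // Look at the next character without consuming it; '\0' signals EOF.
    char peek() {
        return index < source.length() ? source.charAt(index) : '\0';
    }

    // Consume and return the next character, advancing the position.
    // Returning '\0' at EOF is an assumption of this sketch.
    char next() {
        char c = peek();
        if (index < source.length()) {
            index++;
        }
        return c;
    }

    int getIndex() {
        return index;
    }

    // Reads a run of digits, stopping at the first non-digit or at EOF.
    static String readDigits(PeekNextSketch in) {
        StringBuilder digits = new StringBuilder();
        while (Character.isDigit(in.peek())) {
            digits.append(in.next());
        }
        return digits.toString();
    }
}
```

      Because peek() returns '\0' at EOF and '\0' is not a digit, the loop ends without a separate end-of-input check.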

    • nextIf

      public boolean nextIf(char charAt)

      Consume the next character if it matches charAt.

      Returns:
      true if it matched.
    • nextIfOneOf

      public char nextIfOneOf(char[] chars)

      Consume the next character if it matches any of the given chars.

      Returns:
      the matching char, or '\0' if none matched.
    • skipWhile

      public void skipWhile(char[] chars)

      Consume characters from the stream while the next one matches any of the given array of chars.
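
      The three conditional-consume methods (nextIf(char), nextIfOneOf and skipWhile) combine naturally. The sketch below uses a hypothetical stand-in class, ConditionalConsumeSketch, that follows the documented return values; it is not the real implementation, and the parseSignedInt helper exists only for this example.

```java
// Hypothetical stand-in for the conditional-consume methods.
class ConditionalConsumeSketch {
    private final String source;
    private int index = 0;

    ConditionalConsumeSketch(String source) {
        this.source = source;
    }

    char peek() {
        return index < source.length() ? source.charAt(index) : '\0';
    }

    // Consume the next character only if it equals expected.
    boolean nextIf(char expected) {
        if (index < source.length() && source.charAt(index) == expected) {
            index++;
            return true;
        }
        return false;
    }

    // Consume the next character if it is one of chars; '\0' if none matched.
    char nextIfOneOf(char[] chars) {
        for (char c : chars) {
            if (nextIf(c)) {
                return c;
            }
        }
        return '\0';
    }

    // Keep consuming while the next character is one of chars.
    void skipWhile(char[] chars) {
        while (nextIfOneOf(chars) != '\0') {
            // nothing to do, nextIfOneOf already advanced
        }
    }

    // Example use: optional whitespace, optional sign, then digits.
    static int parseSignedInt(String text) {
        ConditionalConsumeSketch in = new ConditionalConsumeSketch(text);
        in.skipWhile(new char[] {' ', '\t'});
        char sign = in.nextIfOneOf(new char[] {'+', '-'});
        int value = 0;
        while (Character.isDigit(in.peek())) {
            value = value * 10 + (in.peek() - '0');
            in.index++;
        }
        return sign == '-' ? -value : value;
    }
}
```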

    • nextIf

      public SourceLocation nextIf(String string)

      Advance the source if the next characters match the given string. This method is useful to call at the start of token matching, so that a SourceLocation is only returned when the token looks likely to match (most tokens start with specific characters).

      Returns:
      the SourceLocation before the match occurred; use this with newToken(TokenType, SourceLocation, String)
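
      The prefix-check pattern can be sketched as below. PrefixMatchSketch is a hypothetical stand-in: it uses a plain int index in place of SourceLocation, and returning -1 when the prefix does not match is an assumption of this sketch (the no-match result of the real method is not stated above).

```java
// Hypothetical stand-in for nextIf(String), using an int as the location.
class PrefixMatchSketch {
    private final String source;
    private int index = 0;

    PrefixMatchSketch(String source) {
        this.source = source;
    }

    int getIndex() {
        return index;
    }

    // If the source continues with prefix, consume it and return the index
    // where it started. Otherwise leave the position alone and return -1.
    int nextIf(String prefix) {
        if (source.startsWith(prefix, index)) {
            int start = index;
            index += prefix.length();
            return start;
        }
        return -1;
    }

    // Example use: lex a line comment, or return null if none starts here.
    String lineComment() {
        int start = nextIf("//");
        if (start < 0) {
            return null; // cheap rejection, nothing was consumed
        }
        while (index < source.length() && source.charAt(index) != '\n') {
            index++;
        }
        return source.substring(start, index);
    }
}
```

      The start position returned by the prefix check is what a caller would later hand to the token constructor, so the token's span covers the prefix as well as the characters consumed after it.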
    • getIndex

      public int getIndex()
      Returns:
      the current zero-based index in to the source string.
    • isEof

      public boolean isEof()
      Returns:
      true if the entire string has been read
    • rewind

      public void rewind(SourceLocation moveTo)

      Move the parse source to the given location.
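
      One plausible use of rewind is backtracking after a speculative match. The sketch below is a hypothetical stand-in (RewindSketch) that represents a location as a plain int index; a real SourceLocation may carry more information. It reads a number and gives back a trailing '.' that is not followed by a digit.

```java
// Hypothetical stand-in showing save-location / rewind backtracking.
class RewindSketch {
    private final String source;
    private int index = 0;

    RewindSketch(String source) {
        this.source = source;
    }

    int getLocation() {
        return index;
    }

    void rewind(int moveTo) {
        index = moveTo;
    }

    char peek() {
        return index < source.length() ? source.charAt(index) : '\0';
    }

    void next() {
        if (index < source.length()) {
            index++;
        }
    }

    private void digits() {
        while (Character.isDigit(peek())) {
            next();
        }
    }

    // Reads an integer or decimal literal starting at the current position.
    String number() {
        int start = getLocation();
        digits();
        int beforeDot = getLocation();
        if (peek() == '.') {
            next(); // speculatively consume the dot
            if (Character.isDigit(peek())) {
                digits();
            } else {
                rewind(beforeDot); // not a decimal, un-consume the dot
            }
        }
        return source.substring(start, getLocation());
    }
}
```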

    • getRemaining

      public String getRemaining()
      Returns:
      the unparsed substring of the source
    • getLocation

      public SourceLocation getLocation()
      Returns:
      the current location within the source
    • newToken

      public Token newToken(TokenType tokenType, int length, String tokenValue)

      Construct a new token from this parse source, assuming the characters have not yet been consumed (you would have used asCharSequence() to read the next available characters).

      Parameters:
      tokenType - the type of the token
      length - how many characters forward of the current location to consume as part of this token
      tokenValue - the value of the token - might not be a substring, could be somewhat processed.
    • newToken

      public Token newToken(TokenType tokenType, SourceLocation startLocation, String tokenValue)

      Construct a new token from this parse source, assuming the characters were already consumed using next().

      Parameters:
      tokenType - the type of the token
      startLocation - the location where this token starts
      tokenValue - the value of the token - might not be a substring, could be somewhat processed.
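
      The difference between the two newToken overloads can be sketched as below. Everything here is a hypothetical stand-in written for this example: TokenSketch, Loc and Tok are not the real LexingStream, SourceLocation and Token classes, and token types are plain strings. The first overload consumes length characters itself; the second assumes the caller has already consumed them and only records the span.

```java
// Hypothetical stand-ins for the stream, SourceLocation (Loc) and Token (Tok).
class TokenSketch {
    static final class Loc {
        final int index;
        Loc(int index) { this.index = index; }
    }

    static final class Tok {
        final String type;
        final int start;
        final int end;
        final String value;
        Tok(String type, int start, int end, String value) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    private final String source;
    private int index = 0;

    TokenSketch(String source) { this.source = source; }

    Loc getLocation() { return new Loc(index); }
    boolean isEof() { return index >= source.length(); }
    char peek() { return isEof() ? '\0' : source.charAt(index); }
    char next() { char c = peek(); if (!isEof()) { index++; } return c; }

    // Characters not yet consumed: consume length of them now.
    Tok newToken(String type, int length, String value) {
        Tok token = new Tok(type, index, index + length, value);
        index += length;
        return token;
    }

    // Characters already consumed: span runs from start to the current position.
    Tok newToken(String type, Loc start, String value) {
        return new Tok(type, start.index, index, value);
    }

    // Look ahead without consuming, then let newToken do the consuming.
    static Tok lexIdentifier(TokenSketch in) {
        int length = 0;
        while (in.index + length < in.source.length()
                && Character.isLetter(in.source.charAt(in.index + length))) {
            length++;
        }
        if (length == 0) {
            return null;
        }
        String text = in.source.substring(in.index, in.index + length);
        return in.newToken("IDENTIFIER", length, text);
    }

    // Consume with next(), then build the token from the saved start.
    // The value is processed: the surrounding quotes are dropped.
    static Tok lexQuoted(TokenSketch in) {
        if (in.peek() != '"') {
            return null;
        }
        Loc start = in.getLocation();
        in.next(); // opening quote
        StringBuilder value = new StringBuilder();
        while (!in.isEof() && in.peek() != '"') {
            value.append(in.next());
        }
        in.next(); // closing quote (no error handling in this sketch)
        return in.newToken("STRING", start, value.toString());
    }
}
```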
    • getSource

      public String getSource()

      The string being parsed.

      Implementation note: we could probably avoid holding the whole string in memory and use a buffer instead, but it's probably not worth it (unless we start dealing with massive projects).

    • getLastChar

      public char getLastChar()