package parser
The algorithm for parsing MMT content (strings to MMT data structures). See api for an overview of the algorithms.
The main interfaces are - Parser: the main interface for parser (combining a structure and an object parser) - StructureParser: parsing structural elements - ObjectParser: parsing objects
The main implementations are - KeywordBasedParser for structural elements in .mmt files - NotationBasedParser for objects
The latter creates a Scanner for each string, which applies notations to parse user-defined mixifx syntax.
Structure parsing is extensible using ParserExtensions. Object parsing is extensible using notations or LexerExtensions.
- Source
- package.scala
- Alphabetic
- By Inheritance
- parser
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
-
class
ActiveNotation extends ActiveParsingRule
An ActiveNotation is a notation whose firstDelimToken has been scanned An ActiveNotation maintains state about the found and still-expected markers
-
abstract
class
ActiveParsingRule extends AnyRef
meant to replace ActiveNotation as a more general interface in Scanner and related classes That would computational notations.
- case class Ambiguous(notations: List[ParsingRule]) extends Error with Product with Serializable
- class AnnotatedComment extends AnnotatedText with Child
- class AnnotatedCommentToken extends AnnotatedText with Child
- class AnnotatedConstant extends SemanticText with Child
- class AnnotatedDPath extends AnnotatedPath
- abstract class AnnotatedDelimiter extends AnnotatedText with Child
- class AnnotatedDerived extends SemanticText with Child
- class AnnotatedDocument extends SemanticText
- class AnnotatedInclude extends AnnotatedText with Child
- class AnnotatedKeyword extends AnnotatedText with Child
- class AnnotatedName extends AnnotatedText with Child
- class AnnotatedNotation extends AnnotatedText with Child
- class AnnotatedOpaque extends SemanticText with Child
- class AnnotatedOpaqueText extends SemanticText with Child
- abstract class AnnotatedPath extends AnnotatedText with Child
- class AnnotatedTerm[A <: Obj] extends AnnotatedText with Child
- class AnnotatedTermToken extends AnnotatedText with Child
- abstract class AnnotatedText extends AnyRef
- class AnnotatedTheory extends SemanticText with Child
-
class
AsymmetricEscapeLexer extends LexFunction
an EscapeHandler that detects tokens delimited by begin and end
an EscapeHandler that detects tokens delimited by begin and end
nested escapes are allowed
typical example: AsymmetricEscapeHandler(/*, */)
-
abstract
class
BasicTheoryParser extends Parser
parses input streams as theory bodies, i.e., without MMT style module system
parses input streams as theory bodies, i.e., without MMT style module system
the stream is parsed into a single DeclaredTheory the URI of the stream is split such that the last part (e.g., the file name) becomes the theory name, the rest (e.g., the path to the folder) becomes the namespace
-
case class
BoundName(name: LocalName, isOML: Boolean) extends Product with Serializable
names are locally meaningful: variables bound in the contexts (to be parsed as OMV), fields of dependent OML sequences (to be parsed as OML)
-
case class
CFExternalToken(text: String, firstPosition: SourcePosition, term: Term) extends ExternalToken with Product with Serializable
A convenience class for an ExternalToken whose parsing is context-free so that it can be parsed immediately
A convenience class for an ExternalToken whose parsing is context-free so that it can be parsed immediately
- term
the result of parsing
- trait Child extends AnnotatedText
- class DeclarationDelimiter extends AnnotatedDelimiter
- class ErrorText extends AnnotatedText with Child
-
class
EscapeManager extends AnyRef
An EscapeManager handles escaping between languages during tokenization
-
abstract
class
ExternalToken extends PrimitiveTokenListElem
An ExternalToken can be anything produced by a LexerExtension
-
case class
ExternalTokenParsingInput(outer: ParsingUnit, boundNames: List[BoundName], parser: ObjectParser, errorCont: ErrorHandler) extends Product with Serializable
bundles arguments passed into ExternalToken
- case class FiniteKeywordsLexer(keys: List[String]) extends LexFunction with Product with Serializable
- class FixedLengthLiteralLexer extends LexerExtension
-
sealed abstract
class
Found extends AnyRef
Objects of type Found represent TokenSlices that were found in the input
Objects of type Found represent TokenSlices that were found in the input
The subclasses correspond to the subclasses of Marker.
-
sealed abstract
class
FoundArg extends FoundContent
represents a notations.Arg that was found
-
sealed abstract
class
FoundContent extends Found
anything but a delimiter
-
case class
FoundDelim(pos: Int, delim: Delimiter) extends Found with Product with Serializable
represents a notations.Delimiter that was found
- case class FoundOML(slice: TokenSlice, number: Int, info: LabelInfo) extends FoundArg with Product with Serializable
-
sealed abstract
class
FoundSeqArg extends FoundContent
represents an notations.SeqArg that was found
- case class FoundSeqOML(number: Int, args: List[FoundOML], info: LabelInfo) extends FoundSeqArg with Product with Serializable
- case class FoundSimpArg(slice: TokenSlice, number: Int) extends FoundArg with Product with Serializable
- case class FoundSimpSeqArg(number: Int, args: List[FoundSimpArg]) extends FoundSeqArg with Product with Serializable
-
class
FoundVar extends FoundContent
represents a notations.Var that was found
represents a notations.Var that was found
the sequence variable parser is a state machine
|---------------- next delim -----------------------------------------| state: 0 1 | 2 -1 V expect: name --- pick name ---> type, sep., next delim --- else ---> sep, next delim --- next delim ---> skip delim, end ^ | sep sep | | else ^ | ------------ skip sep -------------|------------------------------ |------|
-
sealed abstract
class
HasParentInfo extends ParentInfo
abstraction to unify operations inside a document and inside a module
-
case class
IsDoc(docParent: DPath) extends HasParentInfo with Product with Serializable
the content is located inside a document
-
case class
IsMod(modParent: MPath, relDocParent: LocalName) extends HasParentInfo with Product with Serializable
the content is located inside a ModuleOrLink
the content is located inside a ModuleOrLink
- modParent
the parent module
- relDocParent
the path of the parent document relative to the parent module
-
case class
IsRootDoc(path: DPath) extends RootInfo with Product with Serializable
the content is a root document
-
case class
IsRootMod(path: MPath) extends RootInfo with Product with Serializable
the content is a root module
-
class
KeywordBasedParser extends Parser
A StructureParser reads MMT declarations (but not objects) and defers to continuation functions for the found StructuralElement, ParsingUnits, and SourceErrors.
A StructureParser reads MMT declarations (but not objects) and defers to continuation functions for the found StructuralElement, ParsingUnits, and SourceErrors.
This class provides 3 things
1) High-level read methods that read MMT-related entities from a stream, which implementing classes can use. These methods throw do not read more than necessary from the stream and throw SourceError where appropriate.
2) It is stateless and maintains the parse state via an implicit argument of type ParserState in most functions.
3) It leaves processing of MMT entities application-independently via high-level continuation functions.
-
abstract
class
LexFunction extends AnyRef
the lexing part of a LexParseExtension
-
class
LexParseExtension extends LexerExtension
A LexParseExtension is a LexerExtension with a lex and a parse component.
-
abstract
class
LexerExtension extends Rule
A LexerExtension bypasses the default lexing algorithm
- class LiteralParser extends ParseFunction
- case class MMTPart(unparsed: String, reg: SourceRegion, term: Term) extends StringInterpolationPart with Product with Serializable
-
trait
MMTStructureEstimator extends AnyRef
estimates the archives.BuildResult of an mmt Interpreter by using the StructureParser superficially
-
class
MatchedList extends TokenListElem
A MatchedList is a SubTermToken resulting by reducing a sublist using a notation
A MatchedList is a SubTermToken resulting by reducing a sublist using a notation
invariant: every FoundArg (including nested ones) found when traversing an.getFound corresponds to one TokenListElem in tokens
TokenSlice's in an.getFound are invalid.
- class ModuleDelimiter extends AnnotatedDelimiter
- class NSImportDeclaration extends SemanticText with Child
- class NamespaceDeclaration extends SemanticText with Child
-
class
NotationBasedParser extends ObjectParser
A notation based parser
-
class
NumberLiteralLexer extends LexFunction
A LexerExtension that lexes undelimited number literals
A LexerExtension that lexes undelimited number literals
always accepts digit* after non-(letter or connector)
- class ObjectDelimiter extends AnnotatedDelimiter
-
trait
ObjectParser extends FormatBasedExtension
an ObjectParser handles ParsingUnits
an ObjectParser handles ParsingUnits
Instances are maintained by the ExtensionManager and retrieved and called by the structural parser.
see also Parser
-
sealed abstract
class
ParentInfo extends AnyRef
passed to Parsers and checking.Interpreters to indicate the place inside a larger element the input is located
-
abstract
class
ParseFunction extends AnyRef
the parsing part of a LexParseExtension
-
case class
ParseResult(unknown: Context, free: Context, term: Term) extends Product with Serializable
encapsulates the output of an ObjectParser
encapsulates the output of an ObjectParser
- unknown
the unknown variables that must be solved
-
abstract
class
Parser extends StructureParser with ObjectParser with LeveledExtension
the designated super class for all parsers
- case class ParserContext(state: ParserState, docContext: List[InterpretationInstruction]) extends Product with Serializable
-
abstract
class
ParserExtension extends Extension
classes implementing InDocParser may be registered with a StructureParser to extend MMT's concrete syntax with new keywords
-
case class
ParserExtensionArguments(parser: KeywordBasedParser, state: ParserState, se: StructuralElement, keyword: String, context: Context = Context.empty) extends Product with Serializable
arguments passed to a ParserExtension
arguments passed to a ParserExtension
- se
the current structural element (Document, DeclaredModule, or Constant)
- keyword
the keyword that was read
-
class
ParserState extends AnyRef
This class bundles all state that is maintained by a StructureParser
-
case class
ParsingRule(name: ContentPath, alias: List[LocalName], notation: TextNotation) extends Product with Serializable
couples an identifier with its notation
-
case class
ParsingRuleGroup(precedence: Precedence, rules: List[ParsingRule]) extends Product with Serializable
a set of parsing rules with the same precedence, see NotationOrder
-
case class
ParsingRuleTable(groups: List[ParsingRuleGroup]) extends Product with Serializable
a set of parsing rule groups, ordered by increasing precedence, used by NotationBasedParser to apply notations in precedence-order
-
case class
ParsingStream(source: URI, parentInfo: ParentInfo, nsMap: NamespaceMap, format: String, stream: BufferedReader) extends MMTTask with Product with Serializable
ParsingStream encapsulates the input of a StructureParser
ParsingStream encapsulates the input of a StructureParser
- source
the URI of the stream
- parentInfo
information about the parent (if any) of the content in the stream (which - if given - must exist)
- nsMap
defined namespaces
- format
the format of the stream
- stream
the stream to parse
-
case class
ParsingUnit(source: SourceRef, context: Context, term: String, iiContext: InterpretationInstructionContext, top: Option[ParsingRule] = None) extends MMTTask with Product with Serializable
ParsingUnit encapsulates the input of an ObjectParser
ParsingUnit encapsulates the input of an ObjectParser
the top parameter can be used to parse a term that is known/required to have a certain form
- source
the source reference of the string to parse
- context
the context against which to parse
- term
the term to parse
- top
an optional notation that the whole input must match;
-
abstract
class
PrefixedTokenLexer extends LexerExtension
a LexerExtension that detects id (letter sequences) Tokens prefixed by delim
-
abstract
class
PrimitiveTokenListElem extends TokenListElem
subtype of TokenListElem that defines some methods generally
-
class
QuotationLexer extends StringInterpolationLexer
quotation of MMT terms
-
class
Reader extends AnyRef
a Java-style reader that provides MMT-specific read methods
-
sealed abstract
class
RootInfo extends ParentInfo
abstraction to unify operations inside a root document or module
-
class
Scanner extends Logger
scans a TokenList against some notations
- case class ScannerBacktrack(currentIndex: Int, numCurrentTokens: Int) extends ScannerBacktrackInfo with Product with Serializable
-
abstract
class
ScannerBacktrackInfo extends AnyRef
remembers the relevant Scanner state at the time when an ActiveNotation was created
- abstract class SemanticText extends AnnotatedText
- class SemiFormalParser extends ParseFunction
-
case class
SingleFoundVar(pos: Int, name: Token, tp: Option[FoundArg]) extends Product with Serializable
helper class for representing a single found variable
helper class for representing a single found variable
- pos
first Token
- name
the variable name
- tp
the optional type
-
case class
SourcePosition(offset: Int, line: Int, column: Int) extends Product with Serializable
position in a source block; both one and two-dimensional coordinates are maintained
position in a source block; both one and two-dimensional coordinates are maintained
- offset
one-dimensional coordinate, -1 if omitted
- line
vertical two-dimensional coordinate
- column
horizontal two-dimensional coordinate all coordinates start from 0 Unicode code points above FFFF count as 2 characters. Arguably that is wrong. But it corresponds to the implementation in Java Strings (see http://docs.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode). Therefore, the current source references work better with, e.g., jEdit.
-
case class
SourceRef(container: URI, region: SourceRegion) extends Product with Serializable
region in an identified source block
region in an identified source block
- container
URI of the source document
- region
in that document
-
case class
SourceRegion(start: SourcePosition, end: SourcePosition) extends Product with Serializable
region in a source block
region in a source block
- start
inclusive start position
- end
inclusive end position
-
abstract
class
StringInterpolationLexer extends LexerExtension
A LexerExtension for string interpolation.
-
abstract
class
StringInterpolationPart extends AnyRef
auxiliary class for StringInterpolationToken: part of the interpolated string
-
case class
StringInterpolationToken(text: String, firstPosition: SourcePosition, parts: List[StringInterpolationPart], lexer: StringInterpolationLexer) extends ExternalToken with Product with Serializable
the result of a StringInterpolationLexer, carrying the continuation for parsing
- case class StringPart(text: String, reg: SourceRegion, ref: SourceRef) extends StringInterpolationPart with Product with Serializable
-
trait
StructureParser extends FormatBasedExtension
the type of structural parsers
the type of structural parsers
see also Parser
-
class
StructureParserContinuations extends AnyRef
continuations that may be called by StructureParsers
-
class
SymmetricEscapeLexer extends LexFunction
an EscapeHandler that detects tokens delimited by delim
-
case class
Token(word: String, firstPosition: SourcePosition, whitespaceBefore: Boolean, text: Option[String] = None) extends PrimitiveTokenListElem with Product with Serializable
A Token is the basic TokenListElem
A Token is the basic TokenListElem
- word
the characters making up the Token (excluding whitespace)
- firstPosition
the SourcePosition of the first character
- whitespaceBefore
true iff the Token was preceded by whitespace (true for the first Token, too)
- text
the text lexed into this token, if different from word
-
class
TokenList extends AnyRef
A TokenList is a wrapper for a mutable list of TokenListElem with a few special methods for parsing
A TokenList is a wrapper for a mutable list of TokenListElem with a few special methods for parsing
TokenListElem's are always indexed starting from 0.
-
trait
TokenListElem extends AnyRef
the type of objects that may occur in a info.kwarc.mmt.api.parser.TokenList
-
case class
TokenSlice(tokens: TokenList, start: Int, next: Int) extends Product with Serializable
A reference to a sublist of a TokenList that does not copy the element
A reference to a sublist of a TokenList that does not copy the element
- tokens
the underlying TokenList
- start
the first Token (inclusive)
- next
the last Token (exclusive)
-
class
UnmatchedList extends TokenListElem
An UnmatchedList is a SubTermToken resulting by reducing a sublist without using a notation
An UnmatchedList is a SubTermToken resulting by reducing a sublist without using a notation
Other notations have determined that this sublist must be parsed into a subtree, but it is not known (yet) which notation should be used.
- class Whitespace extends AnnotatedText with Child
-
abstract
class
WordReplacer extends LexerExtension
replaces words during lexing
Value Members
-
object
ActiveNotation
a helper object
- object AnnotatedText
- object BoundName extends Serializable
- object CommentIgnorer extends ParserExtension
-
object
DefaultObjectParser extends ObjectParser
A default parser that parses any string into an OMSemiFormal object.
-
object
FoundVar
helper object
- object MMTURILexer extends PrefixedTokenLexer
-
object
MetadataParser extends ParserExtension
A parser component for the keywords 'tag', 'meta', and 'link' to be parsed into the corresponding MetaDatum classes
A parser component for the keywords 'tag', 'meta', and 'link' to be parsed into the corresponding MetaDatum classes
It also treats various keys starting with @_ as abbreviations of the respective Dublin core keys
The parse results are added directly to the containing element
-
object
ObjectParser
helper object
- object ParseResult extends Serializable
- object ParsingStream extends Serializable
- object QuoteLexer extends LexParseExtension
-
object
Reader
helper object
-
object
ScannerNoBacktrack extends ScannerBacktrackInfo with Product with Serializable
cut, i.e, fail if this notation is not applicable
-
object
SourcePosition extends Serializable
helper object
- object SourceRef extends Linker[SourceRef] with Serializable
-
object
SourceRegion extends Serializable
helper object
-
object
TokenList
helper object
- object UnicodeMap
-
object
UnicodeReplacer extends WordReplacer
replaces typical multi-ascii-symbol operators (e.g., arrows and double brackets) with the corresponding Unicode symbol defined by the file
-
object
ViewKey
matches the keyword for a view