# A dead simple parser package for Go

[![Godoc](https://godoc.org/github.com/alecthomas/participle?status.svg)](http://godoc.org/github.com/alecthomas/participle) [![CircleCI](https://img.shields.io/circleci/project/github/alecthomas/participle.svg)](https://circleci.com/gh/alecthomas/participle)
[![Go Report Card](https://goreportcard.com/badge/github.com/alecthomas/participle)](https://goreportcard.com/report/github.com/alecthomas/participle) [![Gitter chat](https://badges.gitter.im/alecthomas.png)](https://gitter.im/alecthomas/Lobby)

<!-- TOC -->

1. [Introduction](#introduction)
2. [Limitations](#limitations)
3. [Tutorial](#tutorial)
4. [Overview](#overview)
5. [Annotation syntax](#annotation-syntax)
6. [Capturing](#capturing)
7. [Streaming](#streaming)
8. [Lexing](#lexing)
9. [Options](#options)
10. [Examples](#examples)
11. [Performance](#performance)

<!-- /TOC -->

<a id="markdown-introduction" name="introduction"></a>
## Introduction

The goal of this package is to provide a simple, idiomatic and elegant way of
defining parsers in Go.

Participle's method of defining grammars should be familiar to any Go
programmer who has used the `encoding/json` package: struct field tags define
what and how input is mapped to those same fields. This is not unusual for Go
encoders, but is unusual for a parser.

<a id="markdown-limitations" name="limitations"></a>
## Limitations

Participle parsers are recursive descent. Among other things, this means that
they do not support left recursion; left recursion must be eliminated by
restructuring your grammar.

There is an experimental option for using precomputed lookahead tables for
disambiguation. You can enable it with the parser option
`participle.UseLookahead()`.

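As a rough illustration of eliminating left recursion, the sketch below uses a hand-written recursive descent function rather than Participle itself; the grammar and the `parseExpr` name are made up for this example. A left-recursive rule such as `Expr = Expr "-" Term | Term` is rewritten as `Expr = Term { "-" Term }`, which replaces the recursion with a loop while keeping subtraction left-associative.

```go
package main

import (
  "fmt"
  "strconv"
  "strings"
)

// parseExpr implements the rewritten rule: Expr = Term { "-" Term } .
// The left-recursive form (Expr = Expr "-" Term | Term) would make a
// recursive descent parser call itself forever without consuming input.
func parseExpr(tokens []string) (int, error) {
  if len(tokens) == 0 {
    return 0, fmt.Errorf("unexpected end of input")
  }
  // Every expression starts with a Term (an integer in this sketch).
  result, err := strconv.Atoi(tokens[0])
  if err != nil {
    return 0, err
  }
  rest := tokens[1:]
  // The loop stands in for the recursion and folds to the left, so
  // "10 - 3 - 2" is evaluated as (10 - 3) - 2.
  for len(rest) > 0 {
    if rest[0] != "-" || len(rest) < 2 {
      return 0, fmt.Errorf("unexpected token %q", rest[0])
    }
    term, err := strconv.Atoi(rest[1])
    if err != nil {
      return 0, err
    }
    result -= term
    rest = rest[2:]
  }
  return result, nil
}

func main() {
  value, err := parseExpr(strings.Fields("10 - 3 - 2"))
  fmt.Println(value, err)
}
```

The same reshaping applies to a Participle grammar: the repeated tail becomes a repetition in the struct tag instead of a field that refers back to its own struct first.
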
<a id="markdown-tutorial" name="tutorial"></a>
## Tutorial

A [tutorial](TUTORIAL.md) is available, walking through the creation of an .ini parser.

<a id="markdown-overview" name="overview"></a>
## Overview

A grammar is an annotated Go structure that both defines the parser grammar
and serves as the AST output by the parser. As an example, the following is the
final INI parser from the tutorial.

```go
type INI struct {
  Properties []*Property `{ @@ }`
  Sections   []*Section  `{ @@ }`
}

type Section struct {
  Identifier string      `"[" @Ident "]"`
  Properties []*Property `{ @@ }`
}

type Property struct {
  Key   string `@Ident "="`
  Value *Value `@@`
}

type Value struct {
  String *string  `  @String`
  Number *float64 `| @Float`
}
```

> **Note:** Participle also supports named struct tags (e.g. <code>Hello string &#96;parser:"@Ident"&#96;</code>).

A parser is constructed from a grammar and a lexer:

```go
parser, err := participle.Build(&INI{})
```

Once constructed, the parser is applied to input to produce an AST:

```go
ast := &INI{}
err := parser.ParseString("size = 10", ast)
// ast == &INI{
//   Properties: []*Property{
//     {Key: "size", Value: &Value{Number: &10}},
//   },
// }
```

<a id="markdown-annotation-syntax" name="annotation-syntax"></a>
## Annotation syntax

- `@<expr>` Capture expression into the field.
- `@@` Recursively capture using the field's own type.
- `<identifier>` Match named lexer token.
- `( ... )` Group.
- `"..."` Match the literal (note that the lexer must emit tokens matching this literal exactly).
- `"...":<identifier>` Match the literal, specifying the exact lexer token type to match.
- `<expr> <expr> ...` Match expressions.
- `<expr> | <expr>` Match one of the alternatives.

The following modifiers can be used after any expression:

- `*` Expression can match zero or more times.
- `+` Expression must match one or more times.
- `?` Expression can match zero or one times.
- `!` Require a non-empty match (this is useful with a sequence of optional matches, e.g. `("a"? "b"? "c"?)!`).

Supported but deprecated:

- `{ ... }` Match 0 or more times (**DEPRECATED** - prefer `( ... )*`).
- `[ ... ]` Optional (**DEPRECATED** - prefer `( ... )?`).

Notes:

- Each struct is a single production, with each field applied in sequence.
- `@<expr>` is the mechanism for capturing matches into the field.
- If a struct field's tag is not keyed with "parser", the entire struct tag
  will be used as the grammar fragment. This allows the grammar syntax to remain
  clear and simple to maintain.

<a id="markdown-capturing" name="capturing"></a>
## Capturing

Prefixing any expression in the grammar with `@` will capture matching values
for that expression into the corresponding field.

For example:

```go
// The grammar definition.
type Grammar struct {
  Hello string `@Ident`
}

// The source text to parse.
source := "world"

// After parsing, the resulting AST.
result == &Grammar{
  Hello: "world",
}
```

For slice and string fields, each instance of `@` will accumulate into the
field (including repeated patterns). Accumulation into other types is not
supported.

A successful capture match into a boolean field will set the field to true.

For integer and floating point types, a successful capture will be parsed
with `strconv.ParseInt()` and `strconv.ParseFloat()` respectively.

Custom control of how values are captured into fields can be achieved by a
field type implementing the `Capture` interface (`Capture(values []string)
error`).

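A custom capture type might look like the sketch below. The `Capture` interface is declared locally, mirroring only the method signature quoted above, so that the example compiles without importing Participle; the `Boolean` type is hypothetical.

```go
package main

import (
  "fmt"
  "strings"
)

// Capture mirrors the method signature quoted above. It is declared here
// only so that this sketch is self-contained.
type Capture interface {
  Capture(values []string) error
}

// Boolean is a hypothetical field type that accepts "true" or "false".
type Boolean bool

// Capture converts the captured token text into a bool.
func (b *Boolean) Capture(values []string) error {
  if len(values) != 1 {
    return fmt.Errorf("expected exactly one value, got %d", len(values))
  }
  switch strings.ToLower(values[0]) {
  case "true":
    *b = true
  case "false":
    *b = false
  default:
    return fmt.Errorf("invalid boolean %q", values[0])
  }
  return nil
}

// Compile-time check that *Boolean satisfies the interface.
var _ Capture = (*Boolean)(nil)

func main() {
  var enabled Boolean
  // A parser would call Capture with the matched values; here it is
  // called by hand to show the conversion.
  err := enabled.Capture([]string{"TRUE"})
  fmt.Println(enabled, err)
}
```

Returning an error from `Capture` gives the field type a way to reject text that lexes correctly but is not a valid value.
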
<a id="markdown-streaming" name="streaming"></a>
## Streaming

Participle supports streaming parsing. Simply pass a channel of your grammar into
`Parse*()`. The grammar will be repeatedly parsed and sent to the channel. Note that
the `Parse*()` call will not return until parsing completes, so it should generally be
started in a goroutine.

```go
type token struct {
  Str string `  @Ident`
  Num int    `| @Int`
}

parser, err := participle.Build(&token{})

tokens := make(chan *token, 128)
// Reuse err here: a second := in the same scope would not compile.
err = parser.ParseString(`hello 10 11 12 world`, tokens)
for tok := range tokens {
  fmt.Printf("%#v\n", tok)
}
```

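The goroutine advice can be sketched with plain channels. `parseStream` below is a hypothetical stand-in for a streaming `Parse*()` call, and it assumes the producer closes the channel once the input is exhausted; it is not Participle's implementation.

```go
package main

import (
  "fmt"
  "strings"
)

// word stands in for a parsed grammar node.
type word struct {
  Text string
}

// parseStream is a hypothetical stand-in for a streaming Parse*() call: it
// sends one node per input word, closes the channel when the input is
// exhausted, and only then returns.
func parseStream(input string, out chan<- *word) error {
  defer close(out)
  for _, text := range strings.Fields(input) {
    out <- &word{Text: text}
  }
  return nil
}

// collect runs the producer in a goroutine and consumes nodes as they arrive.
func collect(input string) ([]string, error) {
  words := make(chan *word) // unbuffered: the producer waits for each read
  errs := make(chan error, 1)
  go func() { errs <- parseStream(input, words) }()

  var texts []string
  for w := range words {
    texts = append(texts, w.Text)
  }
  return texts, <-errs
}

func main() {
  texts, err := collect("hello 10 11 12 world")
  fmt.Println(texts, err)
}
```

Running the producer in its own goroutine means the consumer does not depend on the channel buffer being large enough to hold every parsed value.
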
<a id="markdown-lexing" name="lexing"></a>
## Lexing

Participle operates on tokens and thus relies on a lexer to convert character
streams to tokens.

Three lexers are provided, varying in speed and flexibility. The fastest lexer
is based on the [text/scanner](https://golang.org/pkg/text/scanner/) package
but only allows tokens provided by that package. Next fastest is the regexp
lexer (`lexer.Regexp()`). The slowest is currently the EBNF-based lexer, but it
has a large potential for optimisation through code generation.

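To get a feel for the tokens that `text/scanner` provides, the sketch below runs the standard library scanner directly, without Participle. The `token` type and `tokenize` helper are made up for this example; the printed type names come from `scanner.TokenString` and look like the token names used in the grammars in this document (`Ident`, `Int`, `Float`, `String`).

```go
package main

import (
  "fmt"
  "strings"
  "text/scanner"
)

// token pairs a text/scanner token type name with the matched text.
type token struct {
  Type string
  Text string
}

// tokenize scans input using the scanner's default, Go-like settings.
func tokenize(input string) []token {
  var s scanner.Scanner
  s.Init(strings.NewReader(input))
  var tokens []token
  for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
    tokens = append(tokens, token{
      Type: scanner.TokenString(tok),
      Text: s.TokenText(),
    })
  }
  return tokens
}

func main() {
  for _, t := range tokenize(`size = 10 name = "x" ratio = 1.5`) {
    fmt.Printf("%-8s %s\n", t.Type, t.Text)
  }
}
```

Punctuation such as `=` has no named type in `text/scanner` and is reported as the character itself.
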
To use your own lexer, you will need to implement two interfaces:
[Definition](https://godoc.org/github.com/alecthomas/participle/lexer#Definition)
and [Lexer](https://godoc.org/github.com/alecthomas/participle/lexer#Lexer).

<a id="markdown-options" name="options"></a>
## Options

The parser's behaviour can be configured via [Options](https://godoc.org/github.com/alecthomas/participle#Option).

<a id="markdown-examples" name="examples"></a>
## Examples

There are several [examples](https://github.com/alecthomas/participle/tree/master/_examples) included:

Example | Description
--------|---------------
[BASIC](https://github.com/alecthomas/participle/tree/master/_examples/basic) | A lexer, parser and interpreter for a [rudimentary dialect](https://caml.inria.fr/pub/docs/oreilly-book/html/book-ora058.html) of BASIC.
[EBNF](https://github.com/alecthomas/participle/tree/master/_examples/ebnf) | Parser for the form of EBNF used by Go.
[Expr](https://github.com/alecthomas/participle/tree/master/_examples/expr) | A basic mathematical expression parser and evaluator.
[GraphQL](https://github.com/alecthomas/participle/tree/master/_examples/graphql) | Lexer+parser for GraphQL schemas.
[HCL](https://github.com/alecthomas/participle/tree/master/_examples/hcl) | A parser for the [HashiCorp Configuration Language](https://github.com/hashicorp/hcl).
[INI](https://github.com/alecthomas/participle/tree/master/_examples/ini) | An INI file parser.
[Protobuf](https://github.com/alecthomas/participle/tree/master/_examples/protobuf) | A full [Protobuf](https://developers.google.com/protocol-buffers/) version 2 and 3 parser.
[SQL](https://github.com/alecthomas/participle/tree/master/_examples/sql) | A *very* rudimentary SQL SELECT parser.
[Thrift](https://github.com/alecthomas/participle/tree/master/_examples/thrift) | A full [Thrift](https://thrift.apache.org/docs/idl) parser.
[TOML](https://github.com/alecthomas/participle/blob/master/_examples/toml/main.go) | A [TOML](https://github.com/toml-lang/toml) parser.

Included below is a full GraphQL lexer and parser:

```go
package main

import (
  "os"

  "github.com/alecthomas/kong"
  "github.com/alecthomas/repr"

  "github.com/alecthomas/participle"
  "github.com/alecthomas/participle/lexer"
  "github.com/alecthomas/participle/lexer/ebnf"
)

type File struct {
  Entries []*Entry `{ @@ }`
}

type Entry struct {
  Type   *Type   `  @@`
  Schema *Schema `| @@`
  Enum   *Enum   `| @@`
  Scalar string  `| "scalar" @Ident`
}

type Enum struct {
  Name  string   `"enum" @Ident`
  Cases []string `"{" { @Ident } "}"`
}

type Schema struct {
  Fields []*Field `"schema" "{" { @@ } "}"`
}

type Type struct {
  Name       string   `"type" @Ident`
  Implements string   `[ "implements" @Ident ]`
  Fields     []*Field `"{" { @@ } "}"`
}

type Field struct {
  Name       string      `@Ident`
  Arguments  []*Argument `[ "(" [ @@ { "," @@ } ] ")" ]`
  Type       *TypeRef    `":" @@`
  Annotation string      `[ "@" @Ident ]`
}

type Argument struct {
  Name    string   `@Ident`
  Type    *TypeRef `":" @@`
  Default *Value   `[ "=" @@ ]`
}

type TypeRef struct {
  Array       *TypeRef `(   "[" @@ "]"`
  Type        string   `  | @Ident )`
  NonNullable bool     `[ @"!" ]`
}

type Value struct {
  Symbol string `@Ident`
}

var (
  graphQLLexer = lexer.Must(ebnf.New(`
    Comment = ("#" | "//") { "\u0000"…"\uffff"-"\n" } .
    Ident = (alpha | "_") { "_" | alpha | digit } .
    Number = ("." | digit) {"." | digit} .
    Whitespace = " " | "\t" | "\n" | "\r" .
    Punct = "!"…"/" | ":"…"@" | "["…`+"\"`\""+` | "{"…"~" .

    alpha = "a"…"z" | "A"…"Z" .
    digit = "0"…"9" .
`))

  parser = participle.MustBuild(&File{},
    participle.Lexer(graphQLLexer),
    participle.Elide("Comment", "Whitespace"),
  )

  cli struct {
    Files []string `arg:"" type:"existingfile" required:"" help:"GraphQL schema files to parse."`
  }
)

func main() {
  ctx := kong.Parse(&cli)
  for _, file := range cli.Files {
    ast := &File{}
    r, err := os.Open(file)
    ctx.FatalIfErrorf(err)
    err = parser.Parse(r, ast)
    r.Close()
    repr.Println(ast)
    ctx.FatalIfErrorf(err)
  }
}
```

<a id="markdown-performance" name="performance"></a>
## Performance

One of the included examples is a complete Thrift parser
(shell-style comments are not supported). This gives
a convenient baseline for comparing to the PEG-based
[pigeon](https://github.com/PuerkitoBio/pigeon), which is the parser used by
[go-thrift](https://github.com/samuel/go-thrift). Additionally, the pigeon
parser is utilising a generated parser, while the participle parser is built at
run time.

You can run the benchmarks yourself, but here's the output on my machine:

    BenchmarkParticipleThrift-4        10000      221818 ns/op     48880 B/op     1240 allocs/op
    BenchmarkGoThriftParser-4           2000      804709 ns/op    170301 B/op     3086 allocs/op

On a real life codebase of 47K lines of Thrift, Participle takes 200ms and
go-thrift takes 630ms, which aligns quite closely with the benchmarks.