ANTLR v4
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface (or visitor) that makes it easy to respond to the recognition of phrases of interest.
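To make the listener idea concrete, here is a minimal sketch in plain Python. It does not use ANTLR's generated code or runtime API: the `Node`, `Listener`, `walk`, and `AssignmentCollector` names and the toy grammar are all hypothetical stand-ins for what ANTLR would generate from a real grammar. It only shows the general shape, a tree walk that fires callbacks as each phrase is entered and exited.

```python
# Hypothetical, hand-written stand-ins. ANTLR would generate the real parser,
# tree classes, and listener interface from a grammar.
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A parse-tree node: a rule name plus child nodes or token text."""
    rule: str
    children: List[Union["Node", str]] = field(default_factory=list)


class Listener:
    """Base listener with no-op callbacks; subclasses override what they need."""

    def enter(self, node: Node) -> None:
        pass

    def exit(self, node: Node) -> None:
        pass


def walk(listener: Listener, node: Node) -> None:
    """Depth-first walk that fires enter/exit callbacks around each rule node."""
    listener.enter(node)
    for child in node.children:
        if isinstance(child, Node):
            walk(listener, child)
    listener.exit(node)


class AssignmentCollector(Listener):
    """Responds only to the phrase of interest: 'assign' nodes."""

    def __init__(self) -> None:
        self.names: List[str] = []

    def enter(self, node: Node) -> None:
        if node.rule == "assign":
            # By this sketch's convention, the first child is the target name.
            self.names.append(str(node.children[0]))


# Tree for the input "x = 1 ; y = 2 ;" under an assumed toy grammar.
tree = Node("prog", [
    Node("assign", ["x", "=", Node("expr", ["1"])]),
    Node("assign", ["y", "=", Node("expr", ["2"])]),
])

collector = AssignmentCollector()
walk(collector, tree)
print(collector.names)
```

The application logic lives entirely in the listener subclass, so the tree and the walker can be reused for other tasks by writing a different listener.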
Versioning
ANTLR 4 supports 10 target languages (Cpp, CSharp, Dart, Go, Java, JavaScript, PHP, Python3, Swift, TypeScript), and ensuring consistency across these targets is a unique and highly valuable feature. To ensure proper support of this feature, each release of ANTLR is a complete release of the tool and the 10 runtimes, all with the same version. As such, ANTLR versioning does not strictly follow semver semantics:
- a component may be released with the latest version number even though nothing has changed within that component since the previous release
- major version is bumped only when ANTLR is rewritten for a totally new "generation", such as ANTLR3 -> ANTLR4 (LL(*) -> ALL(*) parsing)
- minor version updates may include minor breaking changes; the policy is to regenerate parsers with every release (4.11 -> 4.12)
- backwards compatibility is only guaranteed for patch version bumps (4.11.1 -> 4.11.2)
If you use a semver verifier in your CI, you probably want to apply special rules for ANTLR, such as treating a minor change as a major change.
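The special rule described above could look roughly like the following sketch. The function names and the "breaking" / "compatible" / "unchanged" labels are hypothetical and not part of ANTLR or any semver tool; the sketch assumes plain numeric versions such as 4.11 or 4.11.2 and does not handle release-candidate tags.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR' or 'MAJOR.MINOR.PATCH'; a missing patch counts as 0.

    Assumption: purely numeric components. Suffixes such as '-rc1' are
    rejected with ValueError rather than guessed at.
    """
    parts = [int(p) for p in text.split(".")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"unsupported version string: {text!r}")
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def classify_upgrade(old: str, new: str) -> str:
    """Classify an ANTLR upgrade under the policy described above.

    - major or minor change: 'breaking' (regenerate parsers)
    - patch change only:     'compatible'
    - identical versions:    'unchanged'
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if old_v[:2] != new_v[:2]:
        # Minor bumps are treated like major bumps for ANTLR.
        return "breaking"
    if old_v[2] != new_v[2]:
        return "compatible"
    return "unchanged"


if __name__ == "__main__":
    for old, new in [("4.11.1", "4.11.2"), ("4.11", "4.12"), ("3.5", "4.0")]:
        print(f"{old} -> {new}: {classify_upgrade(old, new)}")
```

A CI check built on this idea would flag any upgrade classified as "breaking" as a reminder to regenerate parsers with the matching tool version.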
Repo branch structure
The default branch for this repo is master, which is the latest stable release and has tags for the various releases; e.g., see release tag 4.9.3. Branch dev is where development occurs between releases, and all pull requests should be derived from that branch. The dev branch is merged back into master to cut a release, and the release state is tagged (e.g., with 4.10-rc1 or 4.10).
The Go target now has its own dedicated repo:
$ go get github.com/antlr4-go/antlr
Note: The dedicated Go repo is for go get and import only. Go runtime development is still performed in the main antlr/antlr4 repo.
Authors and major contributors
- Terence Parr, ANTLR project lead and supreme dictator for life, University of San Francisco
- Sam Harwell (Tool co-author, Java and original C# target)
- Eric Vergnaud (JavaScript, Python2, Python3 targets and maintenance of C# target)
- Peter Boyer (Go target)
- Mike Lischke (C++ completed target)
- Dan McLaughlin (C++ initial target)
- David Sisson (C++ initial target and test)
- Janyou (Swift target)
- Ewan Mellor, Hanzhou Shi (Swift target merging)
- Ben Hamilton (Full Unicode support in serialized ATN and all languages' runtimes for code points > U+FFFF)
- Marcos Passos (PHP target)
- Lingyu Li (Dart target)
- Ivan Kochurkin has made major contributions to overall quality, error handling, and Target performance.
- Justin King has done a huge amount of work across multiple targets, but especially for C++.
- Ken Domino has a knack for finding bugs/issues and analysis; also a major contributor on the grammars-v4 repo.
- Jim Idle has contributed to previous versions of ANTLR and recently jumped back in to solve a major problem with the Go target.
Useful information
- Release notes
- Getting started with v4
- Official site
- Documentation
- FAQ
- ANTLR code generation targets (Currently: Java, C#, Python3, JavaScript, TypeScript, Go, C++, Swift, Dart, PHP)
- Note: As of version 4.14, we are dropping support for Python 2. We love the Python community, but Python 2 support was officially halted in Jan 2020. More recently, GitHub also dropped support for Python 2, which has made it impossible for us to maintain a consistent level of quality across targets (we use GitHub for our CI). Long live Python 3!
- Java API
- ANTLR v3
- v3 to v4 Migration, differences
You might also find the following pages useful, particularly if you want to mess around with the various target languages.
The Definitive ANTLR 4 Reference
Programmers run into parsing problems all the time. Whether it’s a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language—ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features.
You can buy the book The Definitive ANTLR 4 Reference at Amazon or an electronic version at the publisher's site.
You will find the Book source code useful.
Additional grammars
The grammars-v4 repository is a collection of grammars without actions, where the root directory name is the all-lowercase name of the language parsed by the grammar; for example, java, cpp, csharp, c, etc.