bibstd

A small BibTeX parser and formatter built with Flex and Bison.

It reads BibTeX from standard input, validates the required fields of common entry types, normalizes selected fields, and prints canonicalized BibTeX to standard output.

Requirements

  • GNU Make
  • Bash
  • Flex
  • Bison
  • A C++17 compiler (for example, g++)

Build

From the repository root:

make

This generates:

  • dist/bibtex_lexer.cpp
  • dist/bibtex_parser.cpp
  • dist/bibtex_parser.hpp
  • dist/bibtex_compiler

To clean generated files:

make clean

Run

Use stdin redirection with a .bib file:

./dist/bibtex_compiler < test/sample.bib

Test

Run the repository script:

./build_and_test.sh

The script rebuilds the project and runs the compiler on:

  • test/sample.bib
  • test/A_Theory_of_Justice.bibtex
  • test/big_file.bib

What The Tool Does

  • Parses entries of the form @type{key, field=value, ...}.
  • Reports parse errors with source location (line:column).
  • Validates required fields for selected entry types.
  • Lowercases the entry type in the output.
  • Regenerates entry IDs from content when possible.
  • Orders fields using a preferred field list, then alphabetically for unknown fields.
  • Wraps long braced field values to 80 columns.
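
The field-ordering rule above can be sketched as a two-level sort. This is a minimal illustration, not the code in src/bibtex_parser.y: the contents of kPreferred and the names field_rank and order_fields are assumptions made for the example, since the actual preferred list is not given here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical preferred order, for illustration only.
static const std::vector<std::string> kPreferred = {"author", "title", "journal", "year"};

// Position of a field in kPreferred, or kPreferred.size() for an unknown field.
static std::size_t field_rank(const std::string& name) {
    auto it = std::find(kPreferred.begin(), kPreferred.end(), name);
    return static_cast<std::size_t>(it - kPreferred.begin());
}

// Preferred fields come first in list order; unknown fields follow alphabetically.
std::vector<std::string> order_fields(std::vector<std::string> names) {
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b) {
                  const std::size_t ra = field_rank(a);
                  const std::size_t rb = field_rank(b);
                  if (ra != rb) return ra < rb;  // different rank: preferred order wins
                  return a < b;                  // same rank (both unknown): alphabetical
              });
    return names;
}
```

With this hypothetical list, the fields zzz, year, abstract, author would be emitted as author, year, abstract, zzz.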

Field Normalization

  • author values are normalized to First Last form per author and emitted in braces.
  • journaltitle is renamed to journal.
  • date={YYYY} is normalized to year={YYYY}.
  • date={YYYY-MM} is split into:
    • year={YYYY}
    • month={M} (non-zero-padded)
  • Existing numeric month values are de-zero-padded (01 -> 1).
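
The date and month rules above might look like the following sketch. The names Fields, all_digits, strip_month_zeros, and normalize_date are hypothetical. It assumes that the date field is removed once it has been converted, and that any other date form (for example YYYY-MM-DD) is left untouched; neither point is confirmed here.

```cpp
#include <cstddef>
#include <map>
#include <string>

using Fields = std::map<std::string, std::string>;

// True if s is non-empty and contains only ASCII digits.
static bool all_digits(const std::string& s) {
    return !s.empty() && s.find_first_not_of("0123456789") == std::string::npos;
}

// "01" -> "1". Non-numeric values ("jan") and all-zero values pass through unchanged.
std::string strip_month_zeros(const std::string& month) {
    if (!all_digits(month)) return month;
    const std::size_t first = month.find_first_not_of('0');
    return first == std::string::npos ? month : month.substr(first);
}

// Apply the date/month rules to a copy of the field map and return it.
Fields normalize_date(Fields f) {
    auto it = f.find("date");
    if (it != f.end()) {
        const std::string date = it->second;  // copy before modifying the map
        if (date.size() == 4 && all_digits(date)) {
            // date={YYYY} -> year={YYYY}
            f["year"] = date;
            f.erase("date");
        } else if (date.size() == 7 && date[4] == '-' &&
                   all_digits(date.substr(0, 4)) && all_digits(date.substr(5))) {
            // date={YYYY-MM} -> year={YYYY}, month={M}
            f["year"] = date.substr(0, 4);
            f["month"] = strip_month_zeros(date.substr(5));
            f.erase("date");
        }
    }
    // Existing numeric month values lose their leading zeros.
    auto m = f.find("month");
    if (m != f.end()) m->second = strip_month_zeros(m->second);
    return f;
}
```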

Field Filtering

Links to advertising or aggregator sites are suppressed when they appear in url or note. Matched patterns include:

  • books.google.*
  • jstor.org
  • researchgate.net
  • openresearchlibrary.org
  • semanticscholar.org
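
One simple way to test a url or note value against the list above is a substring search. The sketch below is an assumption about the matching strategy, not the tool's actual implementation: is_suppressed_link and kSuppressed are hypothetical names, books.google.* is approximated by the prefix "books.google.", and matching is case-sensitive.

```cpp
#include <array>
#include <string>

// Patterns copied from the list above; "books.google.*" becomes a plain prefix.
static const std::array<const char*, 5> kSuppressed = {
    "books.google.",
    "jstor.org",
    "researchgate.net",
    "openresearchlibrary.org",
    "semanticscholar.org",
};

// True if the field value contains any of the suppressed host patterns.
bool is_suppressed_link(const std::string& value) {
    for (const char* needle : kSuppressed) {
        if (value.find(needle) != std::string::npos) return true;
    }
    return false;
}
```

A caller would drop the url or note field whenever is_suppressed_link returns true for its value.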

Required Field Checks

The parser checks required fields for these types:

  • article: author, title, year
  • book: title, year, and one of author or editor
  • inproceedings: author, title, booktitle, year
  • incollection: author, title, booktitle, publisher, year
  • phdthesis and mastersthesis: author, title, school, year
  • techreport: author, title, institution, year
  • booklet: title

If any required fields are missing, the program prints an error and exits with status 1.
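
The checks above amount to a lookup table plus one special case for book. The following is a minimal sketch under stated assumptions: missing_fields and kRequired are hypothetical names, the entry type is assumed to be lowercase already, and types not in the table are assumed to pass without checks.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Unconditionally required fields per entry type, taken from the list above.
static const std::map<std::string, std::vector<std::string>> kRequired = {
    {"article",       {"author", "title", "year"}},
    {"book",          {"title", "year"}},  // plus author-or-editor, handled below
    {"inproceedings", {"author", "title", "booktitle", "year"}},
    {"incollection",  {"author", "title", "booktitle", "publisher", "year"}},
    {"phdthesis",     {"author", "title", "school", "year"}},
    {"mastersthesis", {"author", "title", "school", "year"}},
    {"techreport",    {"author", "title", "institution", "year"}},
    {"booklet",       {"title"}},
};

// Return the required fields that are absent from `present`.
// An empty result means the entry passes validation.
std::vector<std::string> missing_fields(const std::string& type,
                                        const std::set<std::string>& present) {
    std::vector<std::string> missing;
    auto it = kRequired.find(type);
    if (it == kRequired.end()) return missing;  // unlisted type: nothing to check
    for (const auto& name : it->second) {
        if (present.count(name) == 0) missing.push_back(name);
    }
    // book needs at least one of author or editor.
    if (type == "book" && present.count("author") == 0 && present.count("editor") == 0) {
        missing.push_back("author or editor");
    }
    return missing;
}
```

A driver would print the returned names to stderr and exit with status 1 whenever the list is non-empty.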

Output And Exit Codes

  • Success: prints normalized BibTeX to stdout, exits 0.
  • Validation failure: prints missing-field error to stderr, exits 1.
  • Parse failure: prints parse error with location to stderr.

Project Layout

  • src/bibtex_lexer.l: Flex lexer
  • src/bibtex_parser.y: Bison grammar, validation, normalization, and output
  • test/: sample inputs
  • build_and_test.sh: convenience build and test script
  • Makefile: build rules

License

This project is licensed under the GNU General Public License, version 3. See LICENSE for the full text.
