pepolang --report based overview

University Coursework

PepoLang

A multipurpose language project developed for CI601, intended as a beginner-friendly midpoint between Java-style structure and Python-like accessibility.

> approach: grammar-first language design

> stack: Java / Gradle / LLVM exploration

> delivery: working interpreter pipeline

Repository Stats

> commits: 39

> open issues: 0

> last updated: May 20, 2025, 11:25 PM UTC

> stars: 1

> languages: Java 100%

Language Requirements

Project Objectives

Core components: lexer, parser, semantic analyzer, interpreter/compiler path, runtime support, and a standard library (STL).

Sample application objective: read current time, append to file, print file contents.

Strong error handling and clear diagnostics for beginner usability.

Research and Design Decisions

Grammar and Formalism

Research covered BNF/EBNF approaches, with grammar-first design used to anchor later implementation.

Compiled vs Interpreted Trade-off

The trade-off between performance and flexibility was evaluated; the project proceeded with an interpreter-first path because of time constraints.

Tooling Choices

Java + Gradle + JUnit were selected for familiarity, modular project structure, and testability.

Technical Findings from the Report

Lexical Analysis

  • Compared regex-driven tokenization with FSM-based approaches (DFA/NFA trade-offs).
  • Identified token-order sensitivity as a practical lexer pitfall (e.g., overlaps between token classes).
  • Implemented lexer iteration with explicit EOF handling and comment-skipping behavior.
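
The token-order pitfall, EOF handling, and comment skipping noted above can be sketched as a small regex-driven lexer. This is a minimal illustration under stated assumptions, not PepoLang's actual lexer: the token kinds, the `let`/`print` keywords, the `//` comment syntax, and the `SketchLexer` name are all hypothetical, since the report does not list them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical token kinds; PepoLang's real token set is not shown in the report.
enum TokenKind { KEYWORD, IDENT, NUMBER, EQ_EQ, ASSIGN, EOF }

record Token(TokenKind kind, String text) {}

final class SketchLexer {
    private record Rule(TokenKind kind, Pattern pattern) {}

    // Order matters: the first rule that matches at the current position wins.
    // KEYWORD must precede IDENT, and "==" must precede "=", or the broader
    // rule would claim input that belongs to the narrower one.
    private static final List<Rule> RULES = List.of(
        new Rule(TokenKind.KEYWORD, Pattern.compile("(?:let|print)\\b")),
        new Rule(TokenKind.IDENT,   Pattern.compile("[A-Za-z_][A-Za-z0-9_]*")),
        new Rule(TokenKind.NUMBER,  Pattern.compile("[0-9]+")),
        new Rule(TokenKind.EQ_EQ,   Pattern.compile("==")),
        new Rule(TokenKind.ASSIGN,  Pattern.compile("="))
    );

    // Whitespace and "//" line comments (assumed syntax) are skipped, never emitted.
    private static final Pattern SKIP = Pattern.compile("(?:\\s+|//[^\\n]*)+");

    static List<Token> tokenize(String src) {
        List<Token> out = new ArrayList<>();
        int pos = 0;
        while (true) {
            Matcher skip = SKIP.matcher(src).region(pos, src.length());
            if (skip.lookingAt()) pos = skip.end();
            if (pos >= src.length()) break;

            Token tok = null;
            for (Rule rule : RULES) {
                Matcher m = rule.pattern().matcher(src).region(pos, src.length());
                if (m.lookingAt()) {
                    tok = new Token(rule.kind(), m.group());
                    pos = m.end();
                    break;
                }
            }
            if (tok == null) {
                throw new IllegalArgumentException(
                    "Unexpected character '" + src.charAt(pos) + "' at offset " + pos);
            }
            out.add(tok);
        }
        // An explicit EOF token lets the parser detect end of input without bounds checks.
        out.add(new Token(TokenKind.EOF, ""));
        return out;
    }
}
```

Swapping the `EQ_EQ` and `ASSIGN` rules would split `==` into two `ASSIGN` tokens, which is the kind of overlap between token classes the report describes.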

Parsing Strategy

  • Adopted recursive descent parsing for implementation speed and direct grammar mapping.
  • Researched LL/LR, top-down vs bottom-up parsing, and parser-generator alternatives.
  • Built AST generation workflow to support downstream semantic and execution phases.
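
The "direct grammar mapping" that makes recursive descent quick to implement can be shown with a toy arithmetic grammar, where each grammar rule becomes one method and each method returns an AST node. The grammar, the `Expr`/`Num`/`Bin` node types, and the `SketchParser` name are assumptions for illustration only; PepoLang's grammar and AST classes are not reproduced in this overview.

```java
// Hypothetical AST node types for a toy arithmetic grammar.
interface Expr {}
record Num(int value) implements Expr {}
record Bin(char op, Expr left, Expr right) implements Expr {}

final class SketchParser {
    private final String src;
    private int pos = 0;

    private SketchParser(String src) { this.src = src; }

    static Expr parse(String src) {
        SketchParser p = new SketchParser(src);
        Expr result = p.expr();
        p.skipSpaces();
        if (p.pos < src.length()) throw p.error("unexpected input");
        return result;
    }

    // expr := term (('+' | '-') term)*
    private Expr expr() {
        Expr left = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            left = new Bin(op, left, term());
        }
        return left;
    }

    // term := factor (('*' | '/') factor)*
    private Expr term() {
        Expr left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            left = new Bin(op, left, factor());
        }
        return left;
    }

    // factor := NUMBER | '(' expr ')'
    private Expr factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            Expr inner = expr();
            if (peek() != ')') throw error("expected ')'");
            pos++;
            return inner;
        }
        if (Character.isDigit(c)) {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            return new Num(Integer.parseInt(src.substring(start, pos)));
        }
        throw error("expected a number or '('");
    }

    // Skips whitespace, then returns the next character, or '\0' at end of input.
    private char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at offset " + pos);
    }
}
```

Operator precedence falls out of the call structure: `expr` calls `term`, which calls `factor`, so `*` binds tighter than `+` without a separate precedence table.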

Semantic Analysis

  • Focused on type consistency, scope resolution, and identifier declaration checks.
  • Explored symbol-table responsibilities (type metadata, scope, downstream compiler data).
  • Found data-structure constraints in symbol handling that later impeded code generation progress.
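
The three checks listed above (declaration, scope resolution, type consistency) are commonly built on a chain of scopes, each holding its own name-to-type map. The sketch below is one such design under assumptions: the `Scope` class, its method names, and the use of plain strings for types are hypothetical and do not describe the symbol table the project actually built.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// A hypothetical scope-chain symbol table; types are plain strings for brevity.
final class Scope {
    private final Scope parent;
    private final Map<String, String> types = new HashMap<>();

    Scope(Scope parent) { this.parent = parent; }

    // Declaration check: a name may be declared only once per scope,
    // but an inner scope may shadow a name from an outer scope.
    void declare(String name, String type) {
        if (types.putIfAbsent(name, type) != null) {
            throw new IllegalStateException("'" + name + "' is already declared in this scope");
        }
    }

    // Scope resolution: search this scope first, then walk outward to the root.
    Optional<String> lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.types.get(name);
            if (type != null) return Optional.of(type);
        }
        return Optional.empty();
    }

    // Type consistency: the assigned value's type must equal the declared type.
    void checkAssign(String name, String valueType) {
        String declared = lookup(name).orElseThrow(
            () -> new IllegalStateException("'" + name + "' is not declared"));
        if (!declared.equals(valueType)) {
            throw new IllegalStateException(
                "cannot assign " + valueType + " to '" + name + "' of type " + declared);
        }
    }
}
```

A table that stores only a type string per name is enough for these checks, but a code generator typically also needs storage locations or IR handles per symbol, which is one way a symbol-table design can constrain later phases.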

Implementation Timeline

Lexer

Tokenization was implemented as the first complete stage, with tests used to validate valid and invalid symbol combinations.

Parser

Recursive descent parser and AST construction were implemented, heavily informed by compiler engineering resources.

Semantic Analysis

Type/scoping analysis and symbol-table concerns were explored, including design limitations discovered during integration.

LLVM Code Generation Attempt

IR generation work began, but the symbol table design and binding complexity made full compilation infeasible within the schedule.

Interpreter Delivery

The project was finalized with an interpreter-focused execution path to ensure a working end-to-end language implementation.
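
One common shape for an interpreter-focused path is a tree-walking evaluator, in which each AST node knows how to evaluate itself against an environment of variable bindings. The report does not say which interpreter design PepoLang uses, so the sketch below is an assumption throughout: the `Node` interface and the `Lit`, `VarRef`, `Add`, `Mul`, and `Assign` node types are hypothetical and limited to integers.

```java
import java.util.Map;

// Hypothetical AST: every node evaluates itself against a variable environment.
interface Node {
    int eval(Map<String, Integer> env);
}

record Lit(int value) implements Node {
    public int eval(Map<String, Integer> env) { return value; }
}

record VarRef(String name) implements Node {
    public int eval(Map<String, Integer> env) {
        Integer value = env.get(name);
        if (value == null) {
            // A clear runtime diagnostic instead of a NullPointerException.
            throw new IllegalStateException("undefined variable '" + name + "'");
        }
        return value;
    }
}

record Add(Node left, Node right) implements Node {
    public int eval(Map<String, Integer> env) { return left.eval(env) + right.eval(env); }
}

record Mul(Node left, Node right) implements Node {
    public int eval(Map<String, Integer> env) { return left.eval(env) * right.eval(env); }
}

record Assign(String name, Node value) implements Node {
    public int eval(Map<String, Integer> env) {
        int result = value.eval(env);
        env.put(name, result); // the binding stays visible to later statements
        return result;
    }
}
```

For example, evaluating `new Assign("x", new Add(new Lit(2), new Mul(new Lit(3), new Lit(4))))` against an empty `HashMap` stores 14 under `x`. No code generation or symbol-to-storage binding is needed, which is why this path is usually faster to deliver than a compiler back end.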

Methodology and Validation

Implementation Stack

Java was selected for familiarity and ecosystem support; Gradle was used for modular sub-project structure and isolated compilation.

Testing Approach

JUnit-backed tests were used during development, especially to validate lexer behavior across valid/invalid symbol combinations and edge cases.

Sample Program Objective

A measurable objective was defined: read current time, append to a file, then print file content to verify practical language capability.
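
For reference, the behavior this objective asks for looks as follows in plain Java. This is a Java equivalent, not PepoLang source: the PepoLang syntax for the sample program is not shown in this overview, and the `TimeLogSample` class and `time-log.txt` file name are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDateTime;

final class TimeLogSample {
    // Appends one timestamp line to the file, then returns the whole file content.
    static String appendTimestamp(Path file, LocalDateTime now) throws IOException {
        Files.writeString(file, now + System.lineSeparator(), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return Files.readString(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of("time-log.txt"); // hypothetical file name
        System.out.print(appendTimestamp(file, LocalDateTime.now()));
    }
}
```

The objective is small, but it exercises three runtime capabilities at once: a clock API, file append, and file read, which is what makes it a useful end-to-end check for a language and its standard library.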

Risk Analysis Highlights

Outcome and Next Steps

Project Poster

PepoLang Showcase Poster

The coursework poster captures the project summary, implementation path, and the final interpreter-focused delivery in a single visual artifact.

PepoLang project poster