
Code Review and Suggestions #1

@beanpuppy

Description

Hey, I saw your issue on Dingo asking for a code review. I've been working on my own "compiles to Go" language - https://github.com/halcyonnouveau/soppo, so I think I have some relevant experience here.

I think this is a good start, and honestly has a much better foundation for this kind of thing than Dingo (but that's a story for another time). I've structured this into three parts: language design (semantic issues with the language itself), architecture (compiler structure), and code (implementation bugs). I've focused on issues with the current approach rather than features that just aren't implemented yet - I get that missing features are expected for an early-stage project.

Anyway, I hope you find this helpful!

Language Design

The ?string syntax suggests a nullable string, but in Go, string is a value type that cannot be nil - only empty (""). Only pointers, slices, maps, channels, interfaces, and functions can be nil.

type User struct {
    email: ?string  // This can't actually be nil in Go
}

For this to work, you'd need to compile ?string to *string (pointer to string). Currently the codegen doesn't transform nullable types, so either the ? is stripped and it's just string (which can't be nil), or it emits ?string literally (which is invalid Go syntax).

The same issue applies to the null coalescing operator ?:. If email is a string (without a pointer), the generated email == nil check won't compile because you can't compare a value type to nil.

For reference, Soppo handles this by only allowing ? on types that are actually nilable in Go (pointers, slices, maps, etc.). Writing ?string would be a compile error - you'd need ?*string (nullable pointer to string) instead.
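To make the first option concrete, here is a sketch of what the generated Go could look like if `?string` were lowered to `*string` and `?:` expanded to a nil check on the pointer. All names here (`User`, `coalesce`) are hypothetical illustrations, not your codegen's actual output:

```go
package main

import "fmt"

// Hypothetical lowering: `email: ?string` becomes a pointer field.
type User struct {
	Email *string // ?string lowered to *string so nil is representable
}

// coalesce is one way the `?:` operator could expand for pointer types:
// return the pointed-to value, or the fallback when the pointer is nil.
func coalesce(p *string, fallback string) string {
	if p == nil {
		return fallback
	}
	return *p
}

func main() {
	addr := "a@b.co"
	fmt.Println(coalesce(&addr, "none")) // a@b.co
	fmt.Println(coalesce(nil, "none"))   // none
}
```

The cost of this approach is that every read of a nullable value goes through a pointer dereference, which is why restricting `?` to already-nilable types (the Soppo approach) is also defensible.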

Architecture

The type checker uses a simple symbol table that stores types as strings:

scopes []map[string]string

This processes declarations in source order, which means forward references won't work - you can't call a function that's defined later in the file. Most compilers use two passes: first collect all declarations (function signatures, type definitions), then check the bodies. This lets you reference things in any order.
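The two-pass structure can be sketched in a few lines. The `FuncDecl` shape and `check` function below are hypothetical stand-ins for your AST and checker, just to show where the pass boundary sits:

```go
package main

import "fmt"

// Hypothetical, simplified declaration: just a name, a return type,
// and the names of functions its body calls.
type FuncDecl struct {
	Name  string
	Ret   string
	Calls []string
}

func check(decls []FuncDecl) error {
	// Pass 1: collect every signature before looking at any body.
	sigs := map[string]string{}
	for _, d := range decls {
		sigs[d.Name] = d.Ret
	}
	// Pass 2: check bodies. Forward references now resolve because
	// every declaration is already in the table.
	for _, d := range decls {
		for _, callee := range d.Calls {
			if _, ok := sigs[callee]; !ok {
				return fmt.Errorf("%s: undefined function %q", d.Name, callee)
			}
		}
	}
	return nil
}

func main() {
	// main calls helper, which is declared later in the file: still fine.
	decls := []FuncDecl{
		{Name: "main", Calls: []string{"helper"}},
		{Name: "helper", Ret: "int"},
	}
	fmt.Println(check(decls)) // <nil>
}
```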

The string-based representation also limits what's possible. There's no type inference - x := getValue() just returns "interface{}" instead of inferring the actual return type. Generics aren't possible since there's no way to represent type variables or solve constraints. Return types also aren't validated - return expressions are type-checked, but never compared against the function's declared return type.
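A structured type representation is the usual fix. As a sketch (these type names are made up, not a proposal for your exact design), an interface with concrete node types can express things strings can't, such as a type variable that inference later solves:

```go
package main

import "fmt"

// Type is a structured representation instead of a string.
type Type interface{ String() string }

type Basic struct{ Name string } // int, string, bool, ...
type Pointer struct{ Elem Type } // *T

// TypeVar is a placeholder that inference can later bind to a real type.
type TypeVar struct {
	ID    int
	Bound Type // nil until solved
}

func (b Basic) String() string   { return b.Name }
func (p Pointer) String() string { return "*" + p.Elem.String() }
func (v TypeVar) String() string {
	if v.Bound != nil {
		return v.Bound.String()
	}
	return fmt.Sprintf("t%d", v.ID)
}

func main() {
	t := Pointer{Elem: Basic{Name: "string"}}
	fmt.Println(t) // *string
}
```

With this shape, comparing types is structural traversal rather than string equality, and "the return type of getValue()" is a real value you can propagate through `:=`.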

I've recently come across this document: https://jimmyhmiller.com/easiest-way-to-build-type-checker which looks like a good introduction to building a more robust type checker using bidirectional type checking.

Code

expect() return value is ignored

Throughout the parser, expect() returns an error but the return value is never checked:

p.expect(lexer.TOKEN_RPAREN)  // Returns error, never checked

This means parse errors are silently swallowed. If a closing paren is missing, parsing continues with garbage.
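The fix is to propagate the error at every call site. A minimal self-contained sketch (the `Parser`/`expect` shapes here are placeholders, not your parser's API):

```go
package main

import "fmt"

// Minimal parser sketch with string tokens standing in for real ones.
type Parser struct {
	toks []string
	pos  int
}

// expect consumes the next token if it matches, else returns an error.
func (p *Parser) expect(want string) error {
	if p.pos >= len(p.toks) || p.toks[p.pos] != want {
		return fmt.Errorf("expected %q at token %d", want, p.pos)
	}
	p.pos++
	return nil
}

func (p *Parser) parseCall() error {
	// Propagate instead of dropping the error on the floor.
	if err := p.expect("("); err != nil {
		return err
	}
	if err := p.expect(")"); err != nil {
		return err
	}
	return nil
}

func main() {
	p := &Parser{toks: []string{"("}} // missing ")"
	fmt.Println(p.parseCall())        // expected ")" at token 1
}
```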

Unterminated strings don't error

for l.pos < len(l.input) && l.current() != '"' {
    // Loops to EOF if no closing quote
}

An unterminated string like "hello will loop to EOF and produce a garbage token instead of a proper error.
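One way to fix it is to check after the loop whether the closing quote was actually found. A self-contained sketch (the function signature is hypothetical; your lexer tracks position differently):

```go
package main

import "fmt"

// scanString scans a string literal body; pos points just past the
// opening quote. It reports an error at EOF instead of producing a
// garbage token.
func scanString(input string, pos int) (lit string, next int, err error) {
	start := pos
	for pos < len(input) && input[pos] != '"' {
		pos++
	}
	if pos >= len(input) {
		return "", pos, fmt.Errorf("unterminated string literal starting at %d", start-1)
	}
	return input[start:pos], pos + 1, nil // skip the closing quote
}

func main() {
	lit, _, err := scanString(`"hello"`, 1)
	fmt.Println(lit, err) // hello <nil>

	_, _, err = scanString(`"hello`, 1)
	fmt.Println(err) // unterminated string literal starting at 0
}
```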

:= declarations skip type checking

ShortAssignStmt isn't handled in checkStatement, so := declarations bypass type checking entirely:

case *parser.VarDecl:
    return tc.checkVar(s)
// case *parser.ShortAssignStmt: <- missing
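Handling it mostly means inferring the right-hand side's type and binding the name in the current scope. A compact sketch with stand-in node types (your real `checkStatement` will look different):

```go
package main

import "fmt"

// Stand-in AST nodes for the two declaration forms.
type VarDecl struct{ Name, Type string }
type ShortAssignStmt struct{ Name, InferredType string }

type Checker struct{ scope map[string]string }

func (tc *Checker) checkStatement(stmt any) error {
	switch s := stmt.(type) {
	case *VarDecl:
		tc.scope[s.Name] = s.Type
		return nil
	case *ShortAssignStmt: // the previously missing case
		// In the real checker, infer the RHS type here instead of
		// taking it from the node.
		tc.scope[s.Name] = s.InferredType
		return nil
	}
	return fmt.Errorf("unhandled statement %T", stmt)
}

func main() {
	tc := &Checker{scope: map[string]string{}}
	_ = tc.checkStatement(&ShortAssignStmt{Name: "x", InferredType: "int"})
	fmt.Println(tc.scope["x"]) // int
}
```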

Byte-based lexer breaks UTF-8

The lexer indexes by byte instead of rune, which breaks UTF-8:

func (l *Lexer) current() byte {  // byte, not rune

UTF-8 characters can be 1-4 bytes, so indexing by byte gives you partial characters:

s := "héllo"
s[1]  // Returns 0xC3 (first byte of 'é'), not a valid character

unicode.IsLetter(rune(0xC3))  // false - so 'héllo' won't parse as an identifier

Identifiers and strings containing non-ASCII characters (é, 日, emoji) won't work correctly.
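The standard fix is to decode runes with `unicode/utf8` and advance by each rune's byte width. A sketch of an identifier scanner in that style (the function shape is illustrative, not your lexer's):

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// scanIdent scans an identifier starting at byte offset pos, decoding
// whole runes so multi-byte characters stay intact.
func scanIdent(input string, pos int) (ident string, next int) {
	start := pos
	for pos < len(input) {
		r, size := utf8.DecodeRuneInString(input[pos:])
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_' {
			break
		}
		pos += size // advance by the rune's byte width, not by 1
	}
	return input[start:pos], pos
}

func main() {
	id, _ := scanIdent("héllo := 1", 0)
	fmt.Println(id) // héllo
}
```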

Suggestions

I'd prioritise fixing expect() error handling first - silent parse failures make everything else harder to debug.

For testing, snapshot tests (also called golden tests) work well for compilers - save the expected Go output and compare against it. This makes it easy to add test cases and catch regressions. This is what Soppo does, along with fuzzing the parser to find edge cases like unterminated strings or unexpected token sequences. Go has built-in fuzz testing with go test -fuzz that would work well for this.

For the nullable types issue, I'd step back and decide on the semantics before building more on top of it. Either ?T compiles to *T for value types, or ? is only allowed on already-nilable types. The current design doesn't map to valid Go.

I'm not sure what your process for writing this language is, but I'd suggest doing the design work first - sketch out all the features you want, how they interact, and what Go code they should generate before implementing them. It's easier to spot issues like the nullable types problem on paper than after you've written the compiler. Having a spec or design doc also makes it clearer what's a bug vs what's just not implemented yet.
