fix(deps): remove programming-language-detection dependency #695

Open
skevetter wants to merge 3 commits into main from chore/remove-lang-detection-dep

Conversation

@skevetter
Owner

@skevetter skevetter commented Apr 5, 2026

Summary

  • Removes github.com/loft-sh/programming-language-detection external dependency
  • Replaces it with an inline implementation using filepath.WalkDir and simple extension-based detection
  • The new implementation is thread-safe (no global mutable state unlike the external package)

Summary by CodeRabbit

  • Chores

    • Removed external programming language detection dependency.
  • Refactor

    • Language detection now uses an improved internal heuristic method based on file extension analysis instead of an external library.

Replace github.com/loft-sh/programming-language-detection with inline
implementation. The external package used global mutable state and was
not thread-safe. The new implementation uses filepath.WalkDir with
simple extension-based detection.
@coderabbitai

coderabbitai bot commented Apr 5, 2026

📝 Walkthrough

The pull request removes an external programming language detection dependency and replaces it with an internal heuristic that analyzes file extensions by walking the directory tree and counting occurrences, excluding configured directories.
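As a rough illustration of the heuristic described above, here is a minimal sketch; it is not the code from this PR. The trimmed map, the string language values (standing in for the `ProgrammingLanguage` type), the `skipDirs` entries, and the names `countByExtension` and `demoCounts` are all assumptions made for the example. Only `filepath.WalkDir`, `shouldSkipDir`, and the idea of counting files per extension come from the conversation.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// extensionToLanguage is a trimmed, illustrative copy of the PR's map.
// Plain strings stand in for the ProgrammingLanguage type.
var extensionToLanguage = map[string]string{
	".go": "Go",
	".py": "Python",
	".js": "JavaScript",
}

// skipDirs is a hypothetical skip list; the PR's actual list is not shown here.
var skipDirs = map[string]bool{".git": true, "node_modules": true, "vendor": true}

// shouldSkipDir reports whether a directory name is on the skip list.
func shouldSkipDir(name string) bool {
	return skipDirs[name]
}

// countByExtension walks root and counts files per language.
// The counts map is local to each call, so concurrent calls share no mutable state.
func countByExtension(root string) map[string]int {
	counts := make(map[string]int)
	_ = filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // best effort: ignore entries that cannot be read
		}
		if d.IsDir() {
			if shouldSkipDir(d.Name()) {
				return filepath.SkipDir
			}
			return nil
		}
		if lang, ok := extensionToLanguage[filepath.Ext(d.Name())]; ok {
			counts[lang]++
		}
		return nil
	})
	return counts
}

// demoCounts builds a small throwaway tree and returns its counts.
func demoCounts() (map[string]int, error) {
	root, err := os.MkdirTemp("", "langdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, f := range []string{"main.go", "util.go", "tool.py", "node_modules/dep.js"} {
		p := filepath.Join(root, f)
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, nil, 0o644); err != nil {
			return nil, err
		}
	}
	return countByExtension(root), nil
}

func main() {
	counts, err := demoCounts()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println(counts)
}
```

In this sketch the file under `node_modules` is never counted, because returning `filepath.SkipDir` for a directory tells `WalkDir` not to descend into it.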

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency Removal (go.mod) | Removed explicit github.com/loft-sh/programming-language-detection dependency from module requirements. |
| Language Detection Implementation (pkg/language/language.go) | Replaced external detector call with internal detectLanguageByExtension() function; added extension-to-language mappings, directory skip logic, and helper function shouldSkipDir(). Changed return behavior for regular files from error to (None, nil); updated language mapping lookup to use computed values directly. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: removing the programming-language-detection external dependency and replacing it with an internal implementation. |


@github-actions github-actions bot added the size/m label Apr 5, 2026
Extract shouldSkipDir helper to bring complexity from 10 to under the
cyclop limit of 8.
@skevetter skevetter marked this pull request as ready for review April 5, 2026 20:12

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/language/language.go (2)

191-194: Consider logging or propagating WalkDir errors for debuggability.

Errors from filepath.WalkDir and the callback are silently discarded. While this "best effort" approach is acceptable for a heuristic, it could mask permission errors or filesystem issues that would be useful to surface during debugging.

💡 Optional: Log errors for debugging

If a logger is available in this context:

-	_ = filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error {
+	walkErr := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
 		if err != nil {
+			// Log but continue - best effort detection
 			return nil
 		}
 		// ... rest of callback
 	})
+	// Optionally log walkErr if non-nil
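For context, one way to keep the walk best-effort while still surfacing problems is to collect the errors instead of dropping them. The following is a sketch under assumptions, not the PR's code: `walkCollectingErrors` is a hypothetical helper, and it relies on `errors.Join`, which requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path/filepath"
)

// walkCollectingErrors visits every entry under root, keeps going when an
// entry cannot be read, and returns the number of regular entries seen
// together with every error encountered (nil if there were none).
func walkCollectingErrors(root string) (int, error) {
	var files int
	var errs []error
	walkErr := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			// Record the problem but continue: detection stays best effort.
			errs = append(errs, fmt.Errorf("walk %s: %w", path, err))
			return nil
		}
		if !d.IsDir() {
			files++
		}
		return nil
	})
	if walkErr != nil {
		errs = append(errs, walkErr)
	}
	return files, errors.Join(errs...)
}

func main() {
	files, err := walkCollectingErrors(".")
	fmt.Println("files:", files)
	if err != nil {
		fmt.Println("walk problems:", err)
	}
}
```

The caller can then decide whether to log the joined error or ignore it, so the heuristic's result is unaffected either way.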
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/language/language.go` around lines 191 - 194, The WalkDir callback
currently swallows filesystem errors (the err parameter) and the outer
filepath.WalkDir return value is ignored; change the callback to return the
received err instead of nil when err != nil, and capture the error returned by
filepath.WalkDir to either propagate it up or log it; if a logger (e.g., a
passed-in logger or package logger) is available, log a descriptive message
including the error when WalkDir returns an error. Target the anonymous func
passed to filepath.WalkDir and the surrounding call site to implement this
change.

214-221: Non-deterministic result when multiple languages have equal file counts.

Go map iteration order is undefined, so when two languages have the same count, the "winner" is unpredictable across runs. This is acceptable for a heuristic but worth noting.

If deterministic behavior is needed (e.g., for testing), consider adding a tie-breaker such as alphabetical ordering.
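A minimal sketch of such a tie-breaker follows. It assumes string language names rather than the PR's `ProgrammingLanguage` type, and `pickBest` is a hypothetical name chosen for the example.

```go
package main

import "fmt"

// pickBest returns the language with the highest count. Ties go to the
// alphabetically smaller name, so the result does not depend on map
// iteration order. It returns "" when no language has a positive count.
func pickBest(counts map[string]int) string {
	best, bestCount := "", 0
	for lang, count := range counts {
		if count > bestCount || (count == bestCount && best != "" && lang < best) {
			best, bestCount = lang, count
		}
	}
	return best
}

func main() {
	// Go and Python are tied; the alphabetical rule picks Go on every run.
	fmt.Println(pickBest(map[string]int{"Python": 2, "Go": 2, "Rust": 1}))
}
```

The `best != ""` guard keeps a zero-count entry from being selected before any real candidate has been seen.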

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/language/language.go` around lines 214 - 221, The current loop over the
counts map (variables best, max, counts, lang, count) can yield
non-deterministic results when two languages have equal counts; update the tie
logic so when count > max OR (count == max AND lang is alphabetically smaller
than the current best) you set best = lang and max = count, ensuring
deterministic alphabetical tie-breaking (handle the initial best value
accordingly).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df949891-069a-45d9-9a8d-4119ad7046e9

📥 Commits

Reviewing files that changed from the base of the PR and between f2b481f and eb84890.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • go.mod
  • pkg/language/language.go
💤 Files with no reviewable changes (1)
  • go.mod

Comment on lines +103 to +116
// extensionToLanguage maps file extensions to programming languages.
var extensionToLanguage = map[string]ProgrammingLanguage{
	".js":   JavaScript,
	".ts":   TypeScript,
	".py":   Python,
	".c":    C,
	".cpp":  Cpp,
	".cs":   DotNet,
	".go":   Go,
	".php":  PHP,
	".java": Java,
	".rs":   Rust,
	".rb":   Ruby,
}

⚠️ Potential issue | 🟡 Minor

Case-sensitive extension matching may miss files.

filepath.Ext() preserves the original case (e.g., .PY for file.PY), but the map keys are all lowercase. Files with uppercase extensions will not be counted.

Consider normalizing the extension to lowercase before lookup:

🔧 Proposed fix

In detectLanguageByExtension, change line 207:

-		if lang, ok := extensionToLanguage[filepath.Ext(d.Name())]; ok {
+		if lang, ok := extensionToLanguage[strings.ToLower(filepath.Ext(d.Name()))]; ok {
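To see the effect in isolation, here is a small standalone sketch. It assumes a two-entry map with string values in place of `ProgrammingLanguage`, and `languageFor` is a hypothetical helper, not a function from the PR.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extensionToLanguage is a trimmed, illustrative map with lowercase keys.
var extensionToLanguage = map[string]string{
	".py": "Python",
	".js": "JavaScript",
}

// languageFor lowercases the extension before the lookup, so names such as
// SCRIPT.PY and app.Js match the lowercase keys.
func languageFor(name string) (string, bool) {
	lang, ok := extensionToLanguage[strings.ToLower(filepath.Ext(name))]
	return lang, ok
}

func main() {
	for _, name := range []string{"SCRIPT.PY", "app.Js", "README.md"} {
		lang, ok := languageFor(name)
		fmt.Println(name, lang, ok)
	}
}
```

Without the `strings.ToLower` call, the first two names would miss the map because `filepath.Ext` returns `.PY` and `.Js` unchanged.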
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/language/language.go` around lines 103 - 116, The extension-to-language
lookup currently uses case-sensitive keys in extensionToLanguage, so
detectLanguageByExtension should normalize the extension to lowercase before
lookup; update the detectLanguageByExtension function to call strings.ToLower on
the result of filepath.Ext (or otherwise lowercase the key) and then use that
normalized key to query extensionToLanguage to ensure .PY, .Js, etc. are
matched.
