Skip to content

Add ability to detect SPDXIDs by reserved keyname #27

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

mrutkows
Copy link
Contributor

@mrutkows mrutkows commented Jun 7, 2024

It is a more-and-more popular convention to use the reserved keyname for declaring SPDX license IDs i.e., "SPDX-License-Identifier:" typically placed at the top of source files. The scanner must be able to look for these key-values explicitly (within the first X lines) and use the keyword to distinguish such SPDX IDs like "MIT" from words that contain "mit" in some normalized part of their text.

@mrutkows mrutkows self-assigned this Jun 7, 2024
@mrutkows mrutkows marked this pull request as ready for review June 13, 2024 17:59
@mrutkows mrutkows requested a review from markstur June 13, 2024 17:59
@mrutkows mrutkows marked this pull request as draft June 13, 2024 17:59
@mrutkows mrutkows added the enhancement New feature or request label Jun 13, 2024
if fi.Size() > 1000000 {
Logger.Errorf("file too large (%v > 1000000)", fi.Size()) // log error, but return nil
return IdentifierResults{}, nil
err = Logger.Errorf("file too large (%v > 1000000)", fi.Size()) // log error, but return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you removed the skipping of large files here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

NormalizedText: normalizedData.NormalizedText,
Hash: normalizedData.Hash,
}
// ret := IdentifierResults{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't leave in the commented out code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more cleanup ;)

Copy link
Contributor

@markstur markstur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments inline and I realize it's a draft.

It would be an easier review if there were not refactor/renames that are not necessary. Most of the diff I reviewed is not specifically relevant to this PR.

@mrutkows
Copy link
Contributor Author

mrutkows commented Jun 13, 2024

some comments inline and I realize it's a draft.

It would be an easier review if there were not refactor/renames that are not necessary. Most of the diff I reviewed is not specifically relevant to this PR.

pardon the rename of results to “identifierResults” (uniformly, hopefully) it was necessary for my sanity to assure we only allocated the return struct once and passed it across all pattern matching/find functions and wanted make sure I saw all the places with the type implied by the var name).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants