Skip to content

OAK-10952: improve autogenerated namespace prefixes - test with prototype #2237

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: trunk
Choose a base branch
from

Conversation

reschke
Copy link
Contributor

@reschke reschke commented Apr 17, 2025

PoC, hopefully illustrating the idea.

(needs more documentation and support for edge cases, final location in project to be decided)

Verified

This commit was signed with the committer’s verified signature.
reschke Julian Reschke
…type
@reschke reschke self-assigned this Apr 17, 2025
@reschke reschke marked this pull request as draft April 17, 2025 12:02
Copy link

Commit-Check ✔️

reschke added 2 commits April 17, 2025 13:27

Verified

This commit was signed with the committer’s verified signature.
reschke Julian Reschke

Verified

This commit was signed with the committer’s verified signature.
reschke Julian Reschke
@stefan-egli
Copy link
Contributor

  • could the makeUpPrefixTryEvenHarder make a somewhat shorter (perhaps sha-hash) suggestion?
  • generally I think this is useful indeed, to avoid later namespace remorse. How impossible would it be to look into support for changing namespaces (I know it's a challenge)

@reschke
Copy link
Contributor Author

reschke commented Apr 17, 2025

could the makeUpPrefixTryEvenHarder make a somewhat shorter (perhaps sha-hash) suggestion?

SHA-1 would be 160 bits, thus 20 bytes, right? That's not always less than the actual namespace name. I could however special case that case.

generally I think this is useful indeed, to avoid later namespace remorse. How impossible would it be to look into support for changing namespaces (I know it's a challenge)

That's a separate issue. I believe we already found out that it can be done using the namespace registry, although risky. The point of this is to reduce the number of cases where it actually comes up.

@stefan-egli
Copy link
Contributor

I was thinking of something like the cca2a4f part of a cca2a4f1dbde1e0f7e337615763ac20a64e39160 git commit

Verified

This commit was signed with the committer’s verified signature.
reschke Julian Reschke
@reschke
Copy link
Contributor Author

reschke commented Apr 17, 2025

I was thinking of something like the cca2a4f part of a cca2a4f1dbde1e0f7e337615763ac20a64e39160 git commit

-> 7024038

@reschke
Copy link
Contributor Author

reschke commented Apr 17, 2025

re valid prefixes: the JCR spec allows many characters here; but checking exactly might be hard and overkill.

As valid URIs, strictly speaking, are only ASCII, we should filter out all non-ASCII. That would leaves with a short "allow" list of characters.

Verified

This commit was signed with the committer’s verified signature.
reschke Julian Reschke
…ix - starting with 7 chars
@stefan-egli
Copy link
Contributor

+1, "s-1578eb4" looks nice!

Copy link

Copy link
Member

@kwin kwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, some minor remarks only.


// public API: suggest an available prefix for the provided namespace
// (while considering pre-existing mappings)
private static @NotNull String suggestPrefix(@NotNull String namespace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest is a bit fuzzy to me, why not getUnusedPrefix(...)

String prefix = namespace.toLowerCase(Locale.ENGLISH);

// https://www.w3.org/guide/editor/namespaces.html
if (prefix.startsWith("http://www.w3c.org/ns/")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us use String constants and their length method to get rid of the magical size.

// in the namespace name
// (while considering pre-existing mappings)
private static @NotNull String makeUpPrefix(@NotNull String namespace,
@NotNull Function<String, String> lookupNamespace) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lookupNamespace is hard to grasp why not getRegisteredNamespaceForPrefix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be rather named getRegisteredPrefix() here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this returns a namespace URI for a given prefix (not the other way around). But this definitely calls for a non ambiguous name.

@kwin
Copy link
Member

kwin commented May 7, 2025

I would place this in org.apache.jackrabbit.oak.namepath.impl.GlobalNameMapper and then call this from according getOrCreateOakName(orNull) which should be used from all JCR methods potentially creating items (with expanded names). This should be imho the only place dealing with name resolving in Oak.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants