Would you accept a PR to search inside archives? #1707

Open
mqudsi opened this issue Apr 9, 2025 · 10 comments

Comments

@mqudsi
Contributor

mqudsi commented Apr 9, 2025

Hello,

Would you be open to accepting a PR that searches within archives (starting with .zip files)?

This could be easily implemented using the zip crate, which would introduce dependencies on crc32fast, crossbeam-utils (already in fd's own dependency tree via crossbeam), indexmap, and memchr. It can obviously be gated behind a feature flag, but you'd have to decide whether it should be enabled by default (the primary difference being how it ends up packaged by distros).

@tmccombs
Collaborator

tmccombs commented Apr 9, 2025

What would be the benefit of that over something like

unzip -l file.zip | grep ...

fd's parallelization wouldn't help that much, since the archive file has to be read in series. And the ignore functionality probably isn't as useful for inspecting archives.

Also, if we added support for searching zip files, we would probably also want support for tarballs (with various kinds of compression). And possibly other archive formats, such as 7zip as well.

@tavianator
Collaborator

I think the benefit would be if you don't know exactly which .zip file contains the file you're looking for. But I'm not sure that potential benefit outweighs the added complexity to fd.

@mqudsi
Contributor Author

mqudsi commented Apr 9, 2025

Yes, as @tavianator says, it would be to search for a file you know exists in a directory of uncompressed and compressed files, where the file could be in either. In my case, the Downloads directory.
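
For readers who want this today without changes to fd, the use case above (a name that may be a loose file or a member of an archive in the same directory) can be approximated in plain shell. This is only a rough sketch, not anything fd provides: `search_tree` is a hypothetical helper name, the `archive//member` output format is borrowed from the zfind output quoted later in this thread, and it assumes `find`, `tar`, and (for .zip files) Info-ZIP's `unzip` are installed. It does not handle file names containing newlines.

```shell
# Hypothetical helper (not part of fd): search DIR for NAME among loose
# files and among the members of .zip and tar archives found under DIR.
# NAME is treated as a shell glob in both places.
search_tree() {
  dir=$1
  name=$2

  # Loose files whose basename matches NAME.
  find "$dir" -type f -name "$name"

  # Archive members: list each archive, print hits as "archive//member".
  find "$dir" -type f \( -name '*.zip' -o -name '*.tar' -o -name '*.tar.*' \) |
  while IFS= read -r archive; do
    case $archive in
      *.zip) unzip -Z1 "$archive" ;;  # Info-ZIP: one member name per line
      *)     tar tf "$archive" ;;     # GNU tar and bsdtar detect compression when listing
    esac 2>/dev/null |
    while IFS= read -r member; do
      # Compare only the last path component of the member.
      case ${member##*/} in
        $name) printf '%s//%s\n' "$archive" "$member" ;;
      esac
    done
  done
}
```

Calling `search_tree ~/Downloads FOO` would print matching loose files as plain paths and archive hits as `archive//member`. Each archive is read serially, as noted earlier in the thread.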

@sergeevabc

sergeevabc commented May 5, 2025

There is a folder full of *.tar.zst archives. Somewhere within these archives there is a FOO file I need. What's the easiest way to find it? If I were looking for a piece of text within the archives, I would use rg -z FOO or ug -z FOO, which is simple.

$ fd -V
fd 10.2.0

$ fd -z FOO
error: unexpected argument '-z' found

@tavianator
Collaborator

@sergeevabc Something like this should work:

$ fd -e tar.zst -x tar tf | grep FOO

@sergeevabc

sergeevabc commented May 5, 2025

$ fd -e tar.zst -x tar tf | grep FOO
usr/bin/FOO

File is found, good to know it exists, but I still don't know which archive it is in. Ideas?

@tavianator
Collaborator

Oh right. Uh, I'd probably do something like this:

$ fd -e tar.zst -x sh -c 'printf "\\n%s\\n" "$1" >&2 && tar tf "$1"' sh | grep FOO
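
A variant that may be easier to read: rather than printing the archive name on stderr, prefix every member line with the archive it came from, so whatever `grep` keeps already identifies the archive and the order in which parallel jobs finish does not matter. This is a sketch only; `list_members` is a hypothetical helper name, and it assumes a `tar` that can list the archives in question (for .tar.zst that typically means zstd support is available).

```shell
# Hypothetical helper: print every member of each archive given as an
# argument, one per line, in the form "archive: member".
list_members() {
  for archive in "$@"; do
    tar tf "$archive" | while IFS= read -r member; do
      printf '%s: %s\n' "$archive" "$member"
    done
  done
}

# The same idea inlined into the fd command from this thread
# (fd appends the matched path as the last argument when no placeholder is given):
#   fd -e tar.zst -x sh -c \
#     'tar tf "$1" | while IFS= read -r m; do printf "%s: %s\n" "$1" "$m"; done' sh \
#     | grep FOO
```

Piping `list_members *.tar.zst` through `grep FOO` then yields lines of the form `some-archive.tar.zst: usr/bin/FOO`.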

@sergeevabc

sergeevabc commented May 15, 2025

Meet zfind by @laktak. It searches for files, including inside tar, zip, 7z and rar archives, using an SQL-like syntax. Alas, it lacks zst, gz and xz support, a progress indicator, and colors to distinguish the components of the path. It also fully utilizes one CPU core, and its queries need a lot of quotation marks. Nevertheless, it does the job. Better than nothing.

$ zfind "name like 'cygintl-8.dll'"
cygwin\x86_64\release\gettext\libintl8\libintl8-0.22.4-1.tar.xz//usr/bin/cygintl-8.dll

@sergeevabc

Meet WinRAR. Its search is faster than zfind and supports more archive formats, including zst. However, this is a GUI app.

@sergeevabc

sergeevabc commented May 16, 2025

Meet ugrep by @genivia-inc. It is bloody fast.

$ ug -z -l -g cygintl-8.dll ""
x86_64\
╰╴release\
│ ╰╴gettext\
│ │ ╰╴libintl8\
│ │ │ ╰╴libintl8-0.22.4-1.tar.xz{usr/bin/cygintl-8.dll}
▔ ▔ ▔

4 participants