Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ein tool organize does not recognize and is unable to organize reftable git repositories #1819

Open
DianaNites opened this issue Jan 30, 2025 · 15 comments
Labels
acknowledged an issue is accepted as shortcoming to be fixed help wanted Extra attention is needed

Comments

@DianaNites
Copy link
Contributor

Current behavior 😯

I have a directory of git repositories cloned with git clone --mirror and rely on ein tool organize to organize them. Recently I have started using git clone --ref-format=reftable, and unexpectedly found ein tool organize to no recognize them, reporting no errors or warnings.

ein 0.41.0
git version 2.48.1

Expected behavior 🤔

ein tool organize should recognize these as git repositories, and move their entire contents regardless of ref-format or whatever else they are under the --destination-directory, organized by their URL, as is done normally.

Based on the requirements git lays out, for this specific case I believe ein tool organize should safely be able to declare all current extensions as recognized and proceed normally, without needing to handle them specially, as none of them affect determining the clone URL or moving all repository contents, whatever they may be, to the destination.

Git behavior

I have not specifically tried git remote -v with an older version of git, however I have tried that command on the current version with a repository with a config modified to have a fakeabc = true extension, and git does fail due to the unknown extension in this case.

Steps to reproduce 🕹

  1. cd $(mktemp -d)
  2. git clone --ref-format=reftable --mirror https://github.com/GitoxideLabs/gitoxide.git
  3. ein tool organize -t $(mktemp -d) -f $(realpath .) reports nothing
@Byron Byron added help wanted Extra attention is needed acknowledged an issue is accepted as shortcoming to be fixed labels Jan 31, 2025
@Byron
Copy link
Member

Byron commented Jan 31, 2025

Thanks a lot for reporting!

I took a look and believe it should be possible to teach it to deal with such repositories as it's only using plumbing.

fn is_repository(path: &Path) -> Option<gix::repository::Kind> {
// Can be git dir or worktree checkout (file)
if path.file_name() != Some(OsStr::new(".git")) && path.extension() != Some(OsStr::new("git")) {
return None;
}
if path.is_dir() {
if path.join("HEAD").is_file() && path.join("config").is_file() {
gix::discover::is_git(path).ok().map(Into::into)
} else {
None
}
} else {
// git files are always worktrees
Some(gix::repository::Kind::WorkTree { is_linked: true })
}
}

It feels like git::discover::is_git() fails on such repositories as it may try to read a valid HEAD which probably isn't available anymore.

It's an interesting question if it should start to consider it valid while gix::open() would fail though, so probably it's best to match on the error and double-check in organize.rs that it's indeed seemingly a ref-table to allow it to proceed nonetheless.

@DianaNites
Copy link
Contributor Author

DianaNites commented Jan 31, 2025

It feels like git::discover::is_git() fails on such repositories as it may try to read a valid HEAD which probably isn't available anymore.

Yeah, reftable backend puts ref: refs/heads/.invalid in HEAD for backwards compatibility, and .git/refs/heads is a regular file instead of a directory.

More generally, a valid git repository no matter the style may have a non-existent HEAD because "It is legal if the named branch name does not (yet) exist.", such as a empty repository without any commits yet. (which ein tool organize does seem to detect fine, despite HEAD being ref: refs/heads/main and refs/heads/ being an empty directory)

@Byron
Copy link
Member

Byron commented Feb 1, 2025

I thought I could quickly try it out and make it work, but realized my Git as of version 2.39.5 (from 2024-05-30) doesn't yet support --ref-format anywhere.

On CI that's probably less of a problem, but for convenience I'd wait until the flag is more mainstream.

Meanwhile, here is the patch with a test that I thought would fail (if the fixture can be created).

diff --git a/gix-discover/tests/discover/is_git/mod.rs b/gix-discover/tests/discover/is_git/mod.rs
index 09f8576a4..cfa2e95f7 100644
--- a/gix-discover/tests/discover/is_git/mod.rs
+++ b/gix-discover/tests/discover/is_git/mod.rs
@@ -110,6 +110,14 @@ fn missing_configuration_file_is_not_a_dealbreaker_in_nonbare_repo() -> crate::R
     Ok(())
 }
 
+#[test]
+fn reftable() -> crate::Result {
+    let repo = repo_path()?.join("reftable-clone");
+    let kind = gix_discover::is_git(&repo)?;
+    assert_eq!(kind, gix_discover::repository::Kind::WorkTree { linked_git_dir: None });
+    Ok(())
+}
+
 #[test]
 fn split_worktree_using_configuration() -> crate::Result {
     for name in [
diff --git a/gix-discover/tests/fixtures/make_basic_repo.sh b/gix-discover/tests/fixtures/make_basic_repo.sh
index 4bc41221a..0e3c35500 100755
--- a/gix-discover/tests/fixtures/make_basic_repo.sh
+++ b/gix-discover/tests/fixtures/make_basic_repo.sh
@@ -103,3 +103,6 @@ mkdir $worktree && touch $worktree/file
   git add file
   git commit -m "make sure na index exists"
 )
+
+git clone --ref-format=reftable . reftable-clone
+

@DianaNites
Copy link
Contributor Author

Reftable, and the --ref-format flag for git-init and git-clone, was introduced in Git 2.45.0 (29-Apr-2024)

@Byron
Copy link
Member

Byron commented Feb 1, 2025

That's interesting!

❯ git --version
git version 2.39.5 (Apple Git-154)

It seems then that either I checked out the wrong tag or the version given here is something different. Or maybe… it's not compiled in?

❯ gco v2.39.5
M       sha1collisiondetection
HEAD is now at cc7d11c167 Git 2.39.5

git ( HEAD) (cc7d11c) [!] via 🐍
❯ git show
commit cc7d11c16782041a6bb73e2fb56417b7d4c6d186 (HEAD, tag: v2.39.5)
Author: Junio C Hamano <[email protected]>
Date:   Thu May 30 16:52:52 2024 -0700

    Git 2.39.5

But it seems that this version doesn't come with reftable by default, or it's not yet available as argument.

git ( HEAD) (cc7d11c) [!] via 🐍 took 5s
❯ ./git init --ref-format=reftable
error: unknown option `ref-format=reftable'
usage: git init [-q | --quiet] [--bare] [--template=<template-directory>]
                [--separate-git-dir <git-dir>] [--object-format=<format>]
                [-b <branch-name> | --initial-branch=<branch-name>]
                [--shared[=<permissions>]] [<directory>]

    --template <template-directory>
                          directory from which templates will be used
    --bare                create a bare repository
    --shared[=<permissions>]
                          specify that the git repository is to be shared amongst several users
    -q, --quiet           be quiet
    --separate-git-dir <gitdir>
                          separate git dir from working tree
    -b, --initial-branch <name>
                          override the name of the initial branch
    --object-format <hash>
                          specify the hash algorithm to use

@DianaNites
Copy link
Contributor Author

DianaNites commented Feb 1, 2025

Well, git 2.39, even with some slightly newer support patches, is not git 2.45, so?

I applied the patch adding the reftable test but I cannot figure out how to run it. just unit-tests reports no failures, or even the text "reftable" anywhere, and cargo nextest run -p gix-discover reports it "There are multiple gix-discover packages in your project, and the specification gix-discover is ambiguous."

@Byron
Copy link
Member

Byron commented Feb 1, 2025

I found the date confusing, it said: (29-Apr-2024), while 2.39.5 is from 2024-05-30. Hence I thought 2.39.5 should have it already.

cargo test -p [email protected] should do the trick.

@DianaNites
Copy link
Contributor Author

DianaNites commented Feb 1, 2025

That worked, and the reftable test indeed fails, with Error: MissingHead in stderr

on the surface level this seems odd to me, I would think that an empty repository without any commits also has a missing head, since in that scenario HEAD is ref: refs/heads/main and refs/heads/ is an empty directory, but those work fine.

@Byron
Copy link
Member

Byron commented Feb 1, 2025

A .tar file should have been written. If the test along with it is submitted as a PR I can take it from there.

@DianaNites
Copy link
Contributor Author

Which generated tar file should I be looking for? I don't see any that seem relevant / contain a reftable-clone directory (specifically checked gix/tests/fixtures/generated-archives/make_basic_repo.tar, gix-submodule/tests/fixtures/generated-archives/basic.tar, gix-discover/tests/fixtures/generated-archives/make_exfat_repo_darwin.tar, and gix-discover/tests/fixtures/generated-archives/make_submodules.tar. gix-discover had no other generated archives.

A PR with the patch you posted applied, with the tar file?

@DianaNites
Copy link
Contributor Author

DianaNites commented Feb 1, 2025

gix-discover/tests/fixtures/generated-do-not-edit/make_basic_repo/1722706925-unix/reftable-clone/ does however exist, but no corresponding tar file?

note: I applied the patch on top of tag v0.41.0. I'll see if main helps?

edit: applying on top of main/aa05ef0d143d7ca14272f6cd36a40d2ed839fe76 did not result seem to change anything. I tried cargo nextest run -p [email protected], cargo test run -p [email protected], and just unit-tests.

@Byron
Copy link
Member

Byron commented Feb 1, 2025

Sorry for the confusion - I missed that make_basic_repo.sh has been excluded from the tar-file generation. Thus the change was only made locally, and what's worse is that I won't get an archive locally either so it will run the script and fail for MacOS people that don't have a newer Git in their path.

So let's hold off until that changes.

One day the test-suite might need to be upgraded to be able to control the Git version it's using, both on CI and locally.

@DianaNites
Copy link
Contributor Author

Well, I really need ein tool organize to work as I extensively rely on it, I have thousands of repositories organized through it taking a collective 6 terabytes, and hundreds more waiting to be organized before i realized it stopped working. So if theres anything I can do

I can manually make a tar archive of gix-discover/tests/fixtures/generated-do-not-edit/make_basic_repo/1722706925-unix/ and add the __gitoxide_meta__/git-version and identity if that helps?

I'm not familiar with the MacOS landscape, but both homebrew and macports have at least git 2.48.0

@Byron
Copy link
Member

Byron commented Feb 1, 2025

Actually there is a way. The script the test was in originally is not producing a tar file, but any other file will. Hence, the test could move into make_reftable_repo.sh for now, and that tar file will then do the trick.

The test needs to execute once and reference that new shell-script for the tar file to be created automatically.

@DianaNites
Copy link
Contributor Author

Done and PR made at #1825

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants