Is there any way to get last commit of a certain file? #588
This might be best to ask libgit2 itself, since this library only wraps libgit2. I'm not personally familiar with an API to do this, but I don't have an encyclopedic knowledge of the API.

@kevinzheng I was looking for something similar, and it turns out revwalk is the way to go. It's how TortoiseGit does it as well: https://github.com/TortoiseGit/TortoiseGit/blob/master/src/TortoiseShell/GITPropertyPage.cpp#L369 Here is a good reference issue in libgit2: libgit2/libgit2#495

@kevinzheng although I am kind of intrigued to benchmark this approach against using the
@extrawurst @alexcrichton would you please take a look at my implementation? It seems to work, but I haven't tested the performance, and I don't know how to handle commits with multiple parents. Thank you!

#[derive(Debug, Deserialize, Serialize, PartialEq, Clone)]
pub struct Commit {
    pub commit_id: String,
    pub message: String,
    pub time: NaiveDateTime,
    pub author: Signature,
    pub committer: Signature,
}

// Imports used by the functions below. AppError, crate::beans and
// git2_time_to_chrono_time are not shown here.
use std::path::Path;
use git2::{DiffOptions, Oid, Repository};

pub fn last_commit_of_file_or_dir(
    repo: &Repository,
    file_path: &str,
    from_commit_id: Option<&str>,
) -> Result<crate::beans::Commit, AppError> {
    let mut revwalk = repo.revwalk()?;
    revwalk.set_sorting(git2::Sort::TIME)?;
    // Start from the given commit if there is one, otherwise from HEAD.
    match from_commit_id {
        Some(from_cid) => match Oid::from_str(from_cid) {
            Ok(oid) => revwalk.push(oid)?,
            Err(e) => return Err(AppError::Git2Error(e)),
        },
        None => revwalk.push_head()?,
    }
    while let Some(oid) = revwalk.next() {
        let oid = oid?;
        // `?` already yields the commit, so no `if let` is needed here.
        let cmt = repo.find_commit(oid)?;
        let tree = cmt.tree()?;
        let old_tree = if cmt.parent_count() > 0 {
            // TODO: multiple parents???
            let parent_commit = cmt.parent(0)?;
            Some(parent_commit.tree()?)
        } else {
            // Root commit: diff against the empty tree.
            None
        };
        let mut opts = DiffOptions::new();
        let diff = repo.diff_tree_to_tree(old_tree.as_ref(), Some(&tree), Some(&mut opts))?;
        let mut deltas = diff.deltas();
        let contains = deltas.any(|dd| {
            let new_file_path = dd.new_file().path().unwrap();
            // File || Dir
            new_file_path.eq(Path::new(&file_path)) || new_file_path.starts_with(&file_path)
        });
        if contains {
            let c = git2_commit_to_our_commit(&cmt)?;
            return Ok(c);
        }
    }
    return Err(AppError::CommandError(format!(
        "Failed to get last commit of file {}!",
        &file_path
    )));
}

fn git2_commit_to_our_commit(commit: &git2::Commit) -> Result<crate::beans::Commit, AppError> {
    let message = commit.message().unwrap_or("").to_string();
    let author = crate::beans::Signature {
        user_id: None,
        name: commit.author().name().unwrap_or("").to_string(),
        email: commit.author().email().unwrap_or("").to_string(),
    };
    let committer = crate::beans::Signature {
        user_id: None,
        name: commit.committer().name().unwrap_or("").to_string(),
        email: commit.committer().email().unwrap_or("").to_string(),
    };
    let time = git2_time_to_chrono_time(commit.time());
    Ok(crate::beans::Commit {
        commit_id: commit.id().to_string(),
        message,
        time,
        committer,
        author,
    })
}
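
On the question of commits with multiple parents: one possible rule, which as far as I understand mirrors how `git log -- <path>` simplifies history by default, is to count a merge as touching a path only when its version differs from every parent. The sketch below models that rule with plain in-memory maps standing in for git2 trees. `Tree`, `Commit`, `tree`, `path_differs` and `commit_touches` are hypothetical names for illustration, not git2 APIs.

```rust
use std::collections::HashMap;

/// Toy stand-in for a git tree: path -> blob id (hypothetical model type).
type Tree = HashMap<String, u32>;

/// Toy stand-in for a commit (hypothetical model type).
struct Commit {
    tree: Tree,
    /// Indices of the parent commits inside the `history` slice.
    parents: Vec<usize>,
}

/// Small helper to build a `Tree` from (path, blob id) pairs.
fn tree(entries: &[(&str, u32)]) -> Tree {
    entries.iter().map(|(p, id)| (p.to_string(), *id)).collect()
}

/// True if `path` (a file, or a directory prefix) differs between two trees.
fn path_differs(a: &Tree, b: &Tree, path: &str) -> bool {
    let dir_prefix = format!("{}/", path.trim_end_matches('/'));
    let under = |p: &str| p == path || p.starts_with(dir_prefix.as_str());
    // An entry differs if it is missing on the other side
    // or points at a different blob there.
    a.iter().any(|(p, id)| under(p.as_str()) && b.get(p) != Some(id))
        || b.iter().any(|(p, id)| under(p.as_str()) && a.get(p) != Some(id))
}

/// True if the commit at `idx` is considered to have touched `path`.
fn commit_touches(history: &[Commit], idx: usize, path: &str) -> bool {
    let commit = &history[idx];
    if commit.parents.is_empty() {
        // Root commit: compare against the empty tree.
        return path_differs(&Tree::new(), &commit.tree, path);
    }
    // Assumed rule: the path must differ from *every* parent. A merge that
    // simply carries one side's version over is therefore not counted.
    commit
        .parents
        .iter()
        .all(|&p| path_differs(&history[p].tree, &commit.tree, path))
}
```

With git2, the same check would presumably diff the commit's tree against the tree of each `cmt.parent(i)` in turn, rather than only `cmt.parent(0)`.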
It appears that this is a widely requested feature: nearly every language wrapper has a feature request for it, e.g. libgit2/pygit2#231. However, it's not implemented in git2; here's the upstream feature request: libgit2/libgit2#495. Someone has contributed a custom implementation for the C# bindings, although I haven't looked at it in detail: libgit2/libgit2sharp#963

I've rolled my own implementation, but it reports different timestamps compared to
For ease of testing, I list the timestamps for all the files that ever existed in the repository, rather than attempting to filter further. Here's my code:

// Copyright 2021 Google, inc.
// SPDX-License-identifier: Apache-2.0
use std::{collections::HashMap, path::PathBuf};
use git2::{Commit, Repository, Tree, Error};

fn main() -> Result<(), Error> {
    let mut mtimes: HashMap<PathBuf, i64> = HashMap::new();
    let repo = Repository::open(".")?;
    let mut revwalk = repo.revwalk()?;
    revwalk.set_sorting(git2::Sort::TIME)?;
    revwalk.push_head()?;
    let mut newer_commit: Option<Commit> = None;
    let mut newer_commit_tree: Option<Tree> = None;
    for commit_id in revwalk {
        let commit_id = commit_id?;
        let commit = repo.find_commit(commit_id)?;
        if commit.parent_count() > 1 {
            // ignore merge commits because they touch lots of files
            // without any of them being actually modified
            continue;
        }
        let tree = commit.tree()?;
        // check if this is not the very first commit, then we have nothing to diff
        if let Some(newer_commit_tree) = newer_commit_tree {
            let diff = repo.diff_tree_to_tree(Some(&tree), Some(&newer_commit_tree), None)?;
            for delta in diff.deltas() {
                let file_path = delta.new_file().path().unwrap();
                let file_mod_time = newer_commit.as_ref().unwrap().time();
                let unix_time = file_mod_time.seconds();
                mtimes.entry(file_path.to_owned()).or_insert(unix_time);
            }
        }
        newer_commit = Some(commit);
        newer_commit_tree = Some(tree);
    }
    for (path, time) in mtimes.iter() {
        println!("{:?}: {}", path, time);
    }
    Ok(())
}

Here's a (slower) reference BASH implementation using

#!/bin/bash
git ls-files | while IFS= read -r FILENAME; do
    TIME=$( git log -1 --format="%ct" -- "$FILENAME" )
    echo "\"${FILENAME#./}\": $TIME"
done

The BASH version aligns with the output of
Fixes I've attempted:
¯\_(ツ)_/¯

Edit: Ah, that's probably because I'm walking the commit log chronologically using
Okay, this works:

// Copyright 2021 Google, inc.
// SPDX-License-identifier: Apache-2.0
use std::{cmp::max, collections::HashMap, path::PathBuf};
use git2::{Repository, Error};

fn main() -> Result<(), Error> {
    let mut mtimes: HashMap<PathBuf, i64> = HashMap::new();
    let repo = Repository::open(".")?;
    let mut revwalk = repo.revwalk()?;
    revwalk.set_sorting(git2::Sort::TIME)?;
    revwalk.push_head()?;
    for commit_id in revwalk {
        let commit_id = commit_id?;
        let commit = repo.find_commit(commit_id)?;
        // Ignore merge commits (2+ parents) because that's what 'git whatchanged' does.
        // Ignore commit with 0 parents (initial commit) because there's nothing to diff against
        if commit.parent_count() == 1 {
            let prev_commit = commit.parent(0)?;
            let tree = commit.tree()?;
            let prev_tree = prev_commit.tree()?;
            let diff = repo.diff_tree_to_tree(Some(&prev_tree), Some(&tree), None)?;
            for delta in diff.deltas() {
                let file_path = delta.new_file().path().unwrap();
                let file_mod_time = commit.time();
                let unix_time = file_mod_time.seconds();
                // Keep the newest timestamp seen for each path.
                mtimes.entry(file_path.to_owned())
                    .and_modify(|t| *t = max(*t, unix_time))
                    .or_insert(unix_time);
            }
        }
    }
    for (path, time) in mtimes.iter() {
        println!("{:?}: {}", path, time);
    }
    Ok(())
}

An MIT/Apache-licensed version can be found here.

Edit: it looks like this code will miss files that were only touched in the initial commit. A solution can be found here.
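
For the initial-commit gap mentioned in the edit, one option is to treat a root commit as a diff against the empty tree, so every file it contains counts as added. The first snippet in this thread does that by passing `None` as the old tree to `diff_tree_to_tree`. The sketch below only models the idea with in-memory maps; `Tree`, `Commit`, `changed_paths` and `last_modified` are hypothetical names for illustration, not git2 APIs.

```rust
use std::collections::HashMap;

/// Toy stand-in for a git tree: path -> blob id (hypothetical model type).
type Tree = HashMap<String, u32>;

/// Toy stand-in for a commit (hypothetical model type).
struct Commit {
    time: i64,
    tree: Tree,
    /// Indices of the parent commits inside the `history` slice.
    parents: Vec<usize>,
}

/// Paths added, modified or deleted between `old` and `new`, sorted.
/// `None` for `old` means "diff against the empty tree".
fn changed_paths(old: Option<&Tree>, new: &Tree) -> Vec<String> {
    let empty = Tree::new();
    let old = old.unwrap_or(&empty);
    // Added or modified entries.
    let mut paths: Vec<String> = new
        .iter()
        .filter(|(p, id)| old.get(*p) != Some(*id))
        .map(|(p, _)| p.clone())
        .collect();
    // Deleted entries.
    paths.extend(old.keys().filter(|p| !new.contains_key(*p)).cloned());
    paths.sort();
    paths
}

/// Newest commit time per path, including files only touched in a root commit.
fn last_modified(history: &[Commit]) -> HashMap<String, i64> {
    let mut mtimes: HashMap<String, i64> = HashMap::new();
    for commit in history {
        let parent_tree = match commit.parents.as_slice() {
            [] => None,                        // root commit: empty old tree
            [p] => Some(&history[*p].tree),    // ordinary commit
            _ => continue,                     // merges skipped, as in the code above
        };
        for path in changed_paths(parent_tree, &commit.tree) {
            mtimes
                .entry(path)
                .and_modify(|t| *t = (*t).max(commit.time))
                .or_insert(commit.time);
        }
    }
    mtimes
}
```

In the git2 version above, this would roughly mean accepting `parent_count() == 0` as well and passing `None` instead of `Some(&prev_tree)` for that case.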
It should be something like the command
git log --follow FILENAME
revwalk might work, but it would require a lot of computation. Is there another way? Thank you!