Skip to content

Issue #1011: Resolve StackOverflowErrors on page serialization #1141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 3, 2021

Conversation

tomaswolf
Copy link
Contributor

They are caused by Gitblit pages holding references to JGit RevCommits.

JGit commit objects are a recursive data structure; they have links to
their parent commits. Serializing a JGit commit will try to recursively
serialize all reachable ancestors as faras they have been loaded. If
that ancestor chain is too long, a StackOverflowError is thrown during
Wicket's page serialization if a page has a reference to sucha JGit
commit.

Fixed by making sure that pages o not contain references to JGit
commits. Use the (existing) wrapper object RepositoryCommit instead.

  • RepositoryCommit has a transient reference to the JGit commit and
    reads the commit from the repository upon de-serialization.
  • RefModel is a similar case (JGit tags/branches may also have links
    to the commits they point to). Solved a bit differently by making it
    a pure data object by transferring the interesting data from the JGit
    object in the constructor.
  • Change DataViews instantiated with RevCommit to use RepositoryCommit
    instead.
  • Change inner anonymous DataViews to ensure they do not have a
    synthesized field referencing the "allRefs" map. Such a synthesized
    field would also get serialized, and then serialize JGit commits
    again.

Finally, remove non-transient logger instances in Wicket classes. Those
might lead to NotSerializableException.

These StackOverflowErrors have been reported in several places since
2014:

JGit commit objects are a recursive data structure; they have links to
their parent commits. Serializing a JGit commit will try to recursively
serialize all reachable ancestors as faras they have been loaded. If
that ancestor chain is too long, a StackOverflowError is thrown during
Wicket's page serialization if a page has a reference to sucha JGit
commit.

Fixed by making sure that pages o not contain references to JGit
commits. Use the (existing) wrapper object RepositoryCommit instead.

* RepositoryCommit has a transient reference to the JGit commit and
  reads the commit from the repository upon de-serialization.
* RefModel is a similar case (JGit tags/branches may also have links
  to the commits they point to). Solved a bit differently by making it
  a pure data object by transferring the interesting data from the JGit
  object in the constructor.
* Change DataViews instantiated with RevCommit to use RepositoryCommit
  instead.
* Change inner anonymous DataViews to ensure they do not have a
  synthesized field referencing the "allRefs" map. Such a synthesized
  field would also get serialized, and then serialize JGit commits
  again.
  
Finally, remove non-transient logger instances in Wicket classes. Those
might lead to NotSerializableException.

These StackOverflowErrors have been reported in several places since
2014:

* https://groups.google.com/forum/#!topic/gitblit/GH1d8WSlR6Q
* https://bugs.chromium.org/p/gerrit/issues/detail?id=3316
* https://groups.google.com/d/msg/repo-discuss/Kcl0JIGNiGk/0DjH4mO8hA8J
* https://groups.google.com/d/msg/repo-discuss/0_P6A3fjTec/2kcpVPIUAQAJ
* gitblit-org#1011
* tomaswolf/gerrit-gitblit-plugin#21
@tomaswolf
Copy link
Contributor Author

More details and analysis can be found at tomaswolf/gerrit-gitblit-plugin#21 (comment) and following comments.

Although it seems strange to have a RefModel with a referenced object
but a null Ref, Gitblit uses such RefModels for instance in
JGitUtils.getNotesOnCommit().

Be careful to do something sensible when that Ref is null.
chalstrick pushed a commit to chalstrick/jgit that referenced this pull request Nov 24, 2016
ObjectId is serializable, and so are its subtypes. Ensure that
serialization does not follow the hash collision chain internal to the
ObjectIdOwnerMap, otherwise completely unrelated objects may get
serialized when a RevObject is serialized.

Note that serializing a RevCommit or RevTag may serialize quite a few
objects due to the parent/object links they contain. A user has no real
control over how many objects will be written when a RevCommit is
serialized. C.f [1]. This change does not resolve that, but in any case
this internal hash collision chain link should not participate in
serialization.

[1] gitblit-org/gitblit#1141

Change-Id: Ice331a9dc80a59ca360fcc04adaff8b5e750d847
Signed-off-by: Thomas Wolf <[email protected]>
@flaix flaix added this to the 1.9.2 milestone Jul 16, 2021
@flaix flaix merged commit 9142f1f into gitblit-org:master Nov 3, 2021
@flaix flaix linked an issue Nov 25, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception when detaching/serializing page
2 participants