
Generated siva file contains corrupted objects #264

@erizocosmico


The ML team reported an ArrayIndexOutOfBoundsException in the engine, which was strange because the exception appeared to come from jgit itself. After a bit of debugging I found the offending siva file, /apps/borges/10k/0a0bfaa46954437548fbaeb0e19237f84e968511.siva, and the offending object, bd7cb56b3bc934acda47089175733f625c6cdb37.

I made a reproduction case with jgit (https://github.com/erizocosmico/jgit-outofbounds), but the same failure also happens with plain git:

$ siva unpack 0a0bfaa46954437548fbaeb0e19237f84e968511.siva
$ git show bd7cb56b3bc934acda47089175733f625c6cdb37
error: delta replay has gone wild
error: failed to apply delta
error: failed to read delta base object 543a4825f17b54479ee422a77f5fdd4f866eb839 at offset 469927 from ./objects/pack/pack-4c91f5bcbe51b8c71101ba25d1f06f441aba6650.pack
fatal: packed object bd7cb56b3bc934acda47089175733f625c6cdb37 (stored in ./objects/pack/pack-4c91f5bcbe51b8c71101ba25d1f06f441aba6650.pack) is corrupt
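For context on the "delta replay has gone wild" message: git prints it when a delta's copy or insert instructions do not fit the base object or the size the delta declares for its result. The following is a minimal, stdlib-only sketch of that consistency check, based on the delta encoding described in git's pack-format documentation. It is not the code of git, jgit, or go-git; `applyDelta`, `readSize`, and `errDeltaCorrupt` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errDeltaCorrupt stands in for the condition git reports as
// "delta replay has gone wild" (hypothetical name).
var errDeltaCorrupt = errors.New("delta is inconsistent with its base or declared sizes")

// readSize decodes one little-endian base-128 size from a delta header
// and returns the value together with the remaining bytes.
func readSize(delta []byte) (uint64, []byte, error) {
	var size uint64
	var shift uint
	for i, b := range delta {
		size |= uint64(b&0x7f) << shift
		if b&0x80 == 0 {
			return size, delta[i+1:], nil
		}
		shift += 7
		if shift > 63 {
			return 0, nil, errDeltaCorrupt
		}
	}
	return 0, nil, errDeltaCorrupt
}

// applyDelta rebuilds the target object from base and delta, rejecting
// any instruction that reads outside base or overshoots the declared size.
func applyDelta(base, delta []byte) ([]byte, error) {
	baseSize, rest, err := readSize(delta)
	if err != nil || baseSize != uint64(len(base)) {
		return nil, errDeltaCorrupt
	}
	targetSize, rest, err := readSize(rest)
	if err != nil {
		return nil, errDeltaCorrupt
	}

	var out []byte
	for len(rest) > 0 {
		cmd := rest[0]
		rest = rest[1:]
		switch {
		case cmd&0x80 != 0: // copy a range from the base object
			var off, n uint64
			for i := uint(0); i < 4; i++ { // optional offset bytes
				if cmd&(1<<i) != 0 {
					if len(rest) == 0 {
						return nil, errDeltaCorrupt
					}
					off |= uint64(rest[0]) << (8 * i)
					rest = rest[1:]
				}
			}
			for i := uint(0); i < 3; i++ { // optional size bytes
				if cmd&(0x10<<i) != 0 {
					if len(rest) == 0 {
						return nil, errDeltaCorrupt
					}
					n |= uint64(rest[0]) << (8 * i)
					rest = rest[1:]
				}
			}
			if n == 0 {
				n = 0x10000 // a zero size encodes 64 KiB
			}
			if off+n > uint64(len(base)) || uint64(len(out))+n > targetSize {
				return nil, errDeltaCorrupt
			}
			out = append(out, base[off:off+n]...)
		case cmd != 0: // insert the next cmd literal bytes
			n := uint64(cmd)
			if n > uint64(len(rest)) || uint64(len(out))+n > targetSize {
				return nil, errDeltaCorrupt
			}
			out = append(out, rest[:n]...)
			rest = rest[n:]
		default: // opcode 0 is reserved
			return nil, errDeltaCorrupt
		}
	}
	if uint64(len(out)) != targetSize {
		return nil, errDeltaCorrupt
	}
	return out, nil
}

func main() {
	base := []byte("hello world")
	// Sizes 11 and 11, copy 6 bytes from offset 0, then insert "there".
	good := []byte{11, 11, 0x90, 6, 5, 't', 'h', 'e', 'r', 'e'}
	out, err := applyDelta(base, good)
	fmt.Printf("%q %v\n", out, err)

	// Same delta, but the copy asks for 20 bytes from an 11-byte base.
	bad := []byte{11, 11, 0x90, 20, 5, 't', 'h', 'e', 'r', 'e'}
	_, err = applyDelta(base, bad)
	fmt.Println(err)
}
```

A reader that skips some of these checks could accept a delta that a stricter reader rejects, which may be one reason different implementations disagree about the same object.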

So it seems we're writing siva files with corrupted objects and/or deltas. What is weird is that go-git is perfectly able to read that siva file:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"

	git "gopkg.in/src-d/go-git.v4"
	"gopkg.in/src-d/go-git.v4/plumbing"
)

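func main() {
	// Run from the directory where the siva file was unpacked.
	wd, err := os.Getwd()
	assert(err)

	r, err := git.PlainOpen(wd)
	assert(err)

	// The object that plain git and jgit report as corrupt.
	blob, err := r.BlobObject(plumbing.NewHash("bd7cb56b3bc934acda47089175733f625c6cdb37"))
	assert(err)

	rd, err := blob.Reader()
	assert(err)
	defer rd.Close()

	// Reading the whole blob forces go-git to resolve the delta chain.
	data, err := ioutil.ReadAll(rd)
	assert(err)

	fmt.Println(len(data), "bytes")
}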

func assert(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

Labels

bug
