@ethantkoenig ethantkoenig commented Feb 3, 2018

Reduces disk usage of the repo (i.e. code) indexer.

I saw a roughly 3x reduction in disk usage (1.5GB -> 500MB) as a result of these changes (of course, mileage will vary depending on what type of text/code you are indexing).

Also introduces migration-like versions for the issue and repo indexers to facilitate future changes (which will typically require rebuilding the index).

Yes, this PR shamelessly pulls in https://github.com/ethantkoenig/rupture as a dependency to facilitate tracking indexer versions and migrations; I am aware of no other alternatives.

@@ -70,9 +73,15 @@ func createIssueIndexer() error {
mapping := bleve.NewIndexMapping()
docMapping := bleve.NewDocumentMapping()

numericFieldMapping := bleve.NewNumericFieldMapping()
numericFieldMapping.Store = false
numericFieldMapping.IncludeInAll = false
docMapping.AddFieldMappingsAt("RepoID", bleve.NewNumericFieldMapping())
Member

Should this use numericFieldMapping?

Member Author

Yes, fixed

@tboerger tboerger added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Feb 3, 2018
@lafriks lafriks added the type/enhancement An improvement of existing functionality label Feb 3, 2018
@lafriks lafriks added this to the 1.5.0 milestone Feb 3, 2018
@lafriks lafriks added the type/changelog Adds the changelog for a new Gitea version label Feb 3, 2018
@ethantkoenig ethantkoenig force-pushed the repo_indexer_disk_usage branch 3 times, most recently from 43870d9 to c90f7af Compare February 4, 2018 03:57
codecov-io commented Feb 4, 2018

Codecov Report

Merging #3452 into master will decrease coverage by <.01%.
The diff coverage is 54.32%.

@@            Coverage Diff             @@
##           master    #3452      +/-   ##
==========================================
- Coverage   35.67%   35.67%   -0.01%     
==========================================
  Files         281      281              
  Lines       40697    40671      -26     
==========================================
- Hits        14519    14508      -11     
+ Misses      24031    24020      -11     
+ Partials     2147     2143       -4

| Impacted Files | Coverage Δ |
|---|---|
| models/issue_indexer.go | 67.81% <0%> (ø) ⬆️ |
| modules/indexer/indexer.go | 63.26% <40.9%> (-14.24%) ⬇️ |
| models/repo_indexer.go | 43.85% <50%> (-3.58%) ⬇️ |
| modules/indexer/repo.go | 63.47% <53.33%> (+2.6%) ⬆️ |
| modules/indexer/issue.go | 67.56% <78.94%> (+8.35%) ⬆️ |
| models/repo_list.go | 65.62% <0%> (-1.57%) ⬇️ |
| models/error.go | 32.73% <0%> (-0.4%) ⬇️ |
| models/repo.go | 42.98% <0%> (+0.18%) ⬆️ |

... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 283e87d...55a3db8.

@tboerger tboerger added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Feb 4, 2018
"strconv"

"code.gitea.io/gitea/modules/setting"
Member

Add empty line

@ethantkoenig
Member Author

@appleboy Done

@tboerger tboerger added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Feb 5, 2018
Member

lafriks commented Feb 5, 2018

@ethantkoenig please resolve conflicts

@ethantkoenig ethantkoenig force-pushed the repo_indexer_disk_usage branch from b2d1420 to 55a3db8 Compare February 5, 2018 17:38
@ethantkoenig
Member Author

@lafriks Resolved

@lafriks lafriks merged commit a89592d into go-gitea:master Feb 5, 2018
@ethantkoenig ethantkoenig deleted the repo_indexer_disk_usage branch February 21, 2018 05:50
aswild added a commit to aswild/gitea that referenced this pull request Jul 6, 2018
* SECURITY
  * Limit uploaded avatar image-size to 4096x3072 by default (go-gitea#4353)
  * Do not allow to reuse TOTP passcode (go-gitea#3878)
* FEATURE
  * Add cli commands to regen hooks & keys (go-gitea#3979)
  * Add support for FIDO U2F (go-gitea#3971)
  * Added user language setting (go-gitea#3875)
  * LDAP Public SSH Keys synchronization (go-gitea#1844)
  * Add topic support (go-gitea#3711)
  * Multiple assignees (go-gitea#3705)
  * Add protected branch whitelists for merging (go-gitea#3689)
  * Global code search support (go-gitea#3664)
  * Add label descriptions (go-gitea#3662)
  * Add issue search via API (go-gitea#3612)
  * Add repository setting to enable/disable health checks (go-gitea#3607)
  * Emoji Autocomplete (go-gitea#3433)
  * Implements generator cli for secrets (go-gitea#3531)
* ENHANCEMENT
  * Add more webhooks support and refactor webhook templates directory (go-gitea#3929)
  * Add new option to allow only OAuth2/OpenID user registration (go-gitea#3910)
  * Add option to use paged LDAP search when synchronizing users (go-gitea#3895)
  * Symlink icons (go-gitea#1416)
  * Improve release page UI (go-gitea#3693)
  * Add admin dashboard option to run health checks (go-gitea#3606)
  * Add branch link in branch list (go-gitea#3576)
  * Reduce sql query times in retrieveFeeds (go-gitea#3547)
  * Option to enable or disable swagger endpoints (go-gitea#3502)
  * Add missing licenses (go-gitea#3497)
  * Reduce repo indexer disk usage (go-gitea#3452)
  * Enable caching on assets and avatars (go-gitea#3376)
  * Add repository search ordered by stars/forks. Forks column in admin repo list (go-gitea#3969)
  * Add Environment Variables to Docker template (go-gitea#4012)
  * LFS: make HTTP auth period configurable (go-gitea#4035)
  * Add config path as an optional flag when changing pass via CLI (go-gitea#4184)
  * Refactor User Settings sections (go-gitea#3900)
  * Allow square brackets in external issue patterns (go-gitea#3408)
  * Add Attachment API (go-gitea#3478)
  * Add EnableTimetracking option to app settings (go-gitea#3719)
  * Add config option to enable or disable log executed SQL (go-gitea#3726)
  * Shows total tracked time in issue and milestone list (go-gitea#3341)
* TRANSLATION
  * Improve English grammar and consistency (go-gitea#3614)
* DEPLOYMENT
  * Allow Gitea to run as different USER in Docker (go-gitea#3961)
  * Provide compressed release binaries (go-gitea#3991)
  * Sign release binaries (go-gitea#4188)
@go-gitea go-gitea locked and limited conversation to collaborators Nov 23, 2020
@delvh delvh removed the type/changelog Adds the changelog for a new Gitea version label Oct 7, 2023