Skip to content

Conversation

secretdataz
Copy link
Member

@secretdataz secretdataz commented Jun 3, 2021

  • Addressed Issue(s): -

  • Server Mode: both

  • Description of Pull Request:

This PR replaces yaml-cpp as a YAML document parser with rapidyaml library.
An optimization was also made to itemdb_searchname1 function which reduced its complexity to constant.
Item database parsing function also has reduced complexity to linear complexity.
This results in reduced loading time in debug mode to a fifth compared to current master branch (0dd318e).
Loading time comparison is available in the table below. Loading time is measured in milliseconds.

Function Master Master with itemdb optimization This PR
do_init_itemdb (debug) 154323 75269 24282
do_init (debug) 268201 N/A 53118
do_init (release) 40562 N/A 20644

Known issues & caveats

  • Reporting invalid node or invalid script will mark the line and column as 1
  • The optimization made itemdb_searchname1 case-sensitive.
  • Only compiles on MSBuild / Visual Studio atm. CMakeLists.txt needs some modification to build.
  • SQL DB loading is untested

Building

  • Install vcpkg
  • Run ./vcpkg install ryml:x86-windows-static or ./vcpkg install ryml:x64-windows-static inside the directory you installed vcpkg.
  • On an elevated terminal, run vcpkg integrate install
  • Build with MSBuild or Visual Studio normally

thanks to @vstumpf for his contribution on the CMake update.

Feedbacks and suggestions are welcome.

Lemongrass3110 added a commit that referenced this pull request Aug 6, 2021
Partial takeover from #5997
Did some further cleanup and took it out of the pull request until secret has time to finish it.

All credits to @secretdataz

Co-authored-by: secretdataz <secretdataz@users.noreply.github.com>
Lemongrass3110 added a commit that referenced this pull request Aug 6, 2021
Taken from #5997

Co-authored-by: secretdataz secretdataz@users.noreply.github.com
Lemongrass3110 added a commit that referenced this pull request Aug 8, 2021
Partial takeover from #5997
Did some further cleanup and took it out of the pull request until secret has time to finish it.

All credits to @secretdataz

Co-authored-by: secretdataz <secretdataz@users.noreply.github.com>
@Lemongrass3110
Copy link
Member

Lemongrass3110 commented Oct 24, 2021

Open TODOs that I see so far:

  • Fix cmake
  • Migrate the tools to also use ryml
  • Remove yaml-cpp
  • Add support for line number output in case of errors

@Lemongrass3110
Copy link
Member

Hello @biojppm!

Sorry for tagging you out of nowhere, but we could really use your help right now.
We are trying to switch from yaml-cpp to ryml with this pull request.

One issue we have is that we currently tell our users the file and the line number where a problem with a node occurs (not structural, but on data content level).
Sadly we were unable to find an equivalent to yaml-cpp's Mark function in your library.
Is this correct or are we just missing something?

Thank you in advance for your help!

@biojppm
Copy link

biojppm commented Oct 27, 2021

You can definitely get the error location when using rapidyaml. You can provide an error handler callback, which takes a ryml::Location argument containing the file, line, column and byte offset. Here's a working example from the quickstart, and here it is in the original source. Be sure to call the parse() overload that provides the filename.

Also as far as I recall, the default error handler will print the filename:line:col indication, or line:col if a filename is not given.

If any of what I wrote above does not hold, please open an issue to fix it.

@Lemongrass3110
Copy link
Member

Lemongrass3110 commented Oct 27, 2021

Hey!

Thank you for the quick response. We integrated that parse time error handler already.
This is working perfectly and just as you described.

The thing is that even if the YAML tree was parsed successfully it can be that our users entered wrong data and we report that to them so that they can correct it. For example if they provided a number that it out of range or whatever kind of validation you could think of.
So code wise we are not inside your parse routine anymore, but we already work with the ryml::Tree and are iterating over the NodeRefs and whatsoever.
And here we did not find a way to access the ryml::Location from where a special node inside the tree was read.
Hope this makes our usecase more clear now.

@biojppm
Copy link

biojppm commented Oct 27, 2021

OK, I see. Sorry, but ryml does not keep that info after parsing. But it could be done; please file an issue to track that.

In the meantime, if it is important enough, maybe you have a workaround. Since ryml keeps only views to the source buffer, if you pass a mutable buffer you can determine the line number given a key or a value by iterating through that buffer and counting newlines to find the line number. But note this can only work if your file has no multiline scalars (ie has no multiline scalars in any of the plain, single-quoted, double-quoted, literal or folded flavors). Since YAML dictates that multiline scalars must be transformed, the newlines inside the scalar are filtered out and this workaround will no longer work.

Edit: there is also an alternative to the scalar filtering problem if you prepare an offset->line lookup structure; @cschreib provided a very good example here.

aleos89 added 8 commits March 4, 2022 11:06
* Removes some remaining yaml-cpp pieces.
* Adds rapidyaml to common-minicore.
* Fixes the nodeExists check to make sure the node has children before attempting to deserialize.
* Adds a check to asType to make sure the node has a value assigned to it and report otherwise.
* Fixes the general error reporting from not displaying the line/column location.
* Fixes a potential crash with the achievement reward script parse.
* Fixes a couple potential crashes with the item combo database parsing.
* Restores yaml-cpp 3rdpartytool.
* Utilize yaml-cpp for the emitter features as it's more robust.
@biojppm
Copy link

biojppm commented Mar 9, 2022

@aleos89 just a quick note regarding the commit above saying yaml-cpp is more robust than rapidyaml. I'm curious to hear about your difficulties issues emitting with rapidyaml. Feel free to open an issue at https://github.com/rapidyaml/issues to discuss.

EDIT: it is really the right moment because I've already started work on controlling the emitter, and it will be improved in the coming weeks/months.

@aleos89
Copy link
Contributor

aleos89 commented Mar 9, 2022

@biojppm So what we use the emitter for is to load up some old CSV files to convert to YAML. After I converted all of our formatting from yaml-cpp to rapidyaml I ran into several issues with the way the CSV content is stored into NodeRef. yaml-cpp is directly assignable of any type: string, char, int, etc. When messing with rapidyaml I could only get my work to compile via c4::to_csubstr() which of course was converting to base64 and when I dumped it to the file I was getting Key: <hash>.

Here's an example of what I was playing around with:

static bool instance_readdb_sub(char* str[], int columns, int current) {
	body |= ryml::MAP;
	body["Id"] = c4::to_csubstr(str[0]);
	body["Name"] = c4::to_csubstr(str[1]);
	if (atoi(str[2]) != 3600)
		body["TimeLimit"] = c4::to_csubstr(str[2]);
	if (atoi(str[3]) != 300)
		body["IdleTimeOut"] = c4::to_csubstr(str[3]);

	ryml::NodeRef enter = body["Enter"];
	enter |= ryml::MAP;

	enter["Map"] = c4::to_csubstr(str[4]);
	enter["X"] = c4::to_csubstr(str[5]);
	enter["Y"] = c4::to_csubstr(str[6]);

	if (columns > 7) {
		ryml::NodeRef map = body["AdditionalMaps"];
		map |= ryml::MAP;

		for (int i = 7; i < columns; i++) {
			if (!strlen(str[i]))
				continue;

			if (strcmpi(str[4], str[i]) == 0)
				continue;

			map[c4::to_csubstr(str[i])] = "true";
		}
	}

	return true;
}

Unless I'm completely missing something from the quickstart.cpp, I couldn't find a way for us to dump our content properly. Also, is there a way to just create a NodeRef without a tree? I also had to create an empty parse_in_place call to generate a tree but I have no use for the tree since all I want to do is fill up a NodeRef variable and then dump that variable to a file. Thanks!

@lgtm-com
Copy link

lgtm-com bot commented Mar 10, 2022

This pull request introduces 21 alerts when merging f994bed into f28d207 - view on LGTM.com

new alerts:

  • 18 for Wrong type of arguments to formatting function
  • 1 for Comparison result is always the same
  • 1 for Short global name
  • 1 for Comparison of narrow type with wide type in loop condition

@lgtm-com
Copy link

lgtm-com bot commented Mar 11, 2022

This pull request introduces 3 alerts when merging 1af54a5 into 0ca5f45 - view on LGTM.com

new alerts:

  • 1 for Wrong type of arguments to formatting function
  • 1 for Comparison result is always the same
  • 1 for Comparison of narrow type with wide type in loop condition

@aleos89 aleos89 marked this pull request as ready for review March 11, 2022 16:35
@lgtm-com
Copy link

lgtm-com bot commented Mar 12, 2022

This pull request introduces 3 alerts when merging 88434e5 into 3be7377 - view on LGTM.com

new alerts:

  • 1 for Wrong type of arguments to formatting function
  • 1 for Comparison result is always the same
  • 1 for Comparison of narrow type with wide type in loop condition

@lgtm-com
Copy link

lgtm-com bot commented Mar 30, 2022

This pull request introduces 3 alerts when merging 8cd939b into 9bc1c53 - view on LGTM.com

new alerts:

  • 1 for Wrong type of arguments to formatting function
  • 1 for Comparison result is always the same
  • 1 for Comparison of narrow type with wide type in loop condition

@aleos89 aleos89 merged commit d1b7061 into master Mar 30, 2022
@aleos89 aleos89 deleted the feature/ryml branch March 30, 2022 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants