Skip to content

Conversation

curlymorphic
Copy link
Contributor

Added a static function fileTypeFromData to the DataFile class. This opens the given file and checks the xml for its file type, as opposed to relying on the file extension. I have implemented the use of this check in the file browser, checking files before they are previewed.

fixes #1404

…eTypeFromData to the DataFile class. This opens the given file and checks the xml for its file type, as oposed to relying on the file extension
@tresf
Copy link
Member

tresf commented Feb 23, 2015

This looks like it will work, but I fear it might be in the wrong place structurally...

Won't this add a redundant file read to every file opened?

@curlymorphic
Copy link
Contributor Author

Won't this add a redundant file read to every file opened?

yes it will. I am unsure where else i can put the check, and keep the function reusable. I will have another look if i can integrate it so we only read the file once. wish me luck, have been trying that for about 5 hours now, so any suggestions would be greatly appreciated :)

@curlymorphic
Copy link
Contributor Author

I have an alternative, but am going to sleep on it,

@curlymorphic
Copy link
Contributor Author

I have looked into this more, with the aid of @badosu.

The alternative that i came up with, to validate the data, while reading the file only once. is a total hack.
https://github.com/curlymorphic/lmms/compare/xmlTypeCheck?expand=1
I feel it is much cleaner to do the validation before we start creating tracks and playhandles. Over time i feel it would be a good idea to do validation on all loaded xml files, preferably at a stage where canceling the operation can be handled cleanly.
The fileTypeFromData() would be an ideal place to add validation code, returning DataFile::Type::UnknownType for incorrect xml, to stop attempts to load these files, resulting in segfaults.

This reading the data twice, may not have a large impact on performance. The data should be stored in a cache by the os, and the xml files are tiny, so the mechanical action of the read from disk is only performed once.

@tresf
Copy link
Member

tresf commented Feb 25, 2015

Right... but the underlying problem is that the software shouldn't be forced to instantiate items that are invalid or unwanted. This is the exact problem we had with VST previews, so I find this type of solution to be far superior.

We don't really know what our XML files are going to look like. Some of the VSTs store quite a large chuck of VST data in them, so although the footprint of the XML read seems minor, I feel it could cause performance issues down the road. Edit: Besides, the problem XML files likely aren't even created with LMMS, so reading the entire file into memory for a sanity check that can be checked later is quite a brute force way to address this problem.

Question... Since this only addresses playback... what happens when this invalid XML file is then dragged into the project or added as a new track?

@curlymorphic
Copy link
Contributor Author

what happens when this invalid XML file is then dragged into the project or added as a new track?

segfault

I feel it is important to do some validation on all xml files before loading. As for performance, the loading of xml files only takes up a tiny percentage of the time lmms is running, never done during rendering, usually on project load, or when trying out new instruments were a tiny delay is not noticeable.

The other option is to rewrite the code where xml is loaded to load the data at a more suitable moment, so the process can be aborted cleanly if the checks fail. a quick grep of out codebase shows DataFile(), the function that reads the data, to be used in 67 locations. I have not looked at all these locations, but to change that much code is likely to bring a whole new collections of bugs.

@curlymorphic
Copy link
Contributor Author

Or maybe i just find a better way to read only what we need from the file.....

@tresf
Copy link
Member

tresf commented Feb 25, 2015

Or maybe i just find a better way to read only what we need from the file.....

That is problematic too because you either 1. Parse the entire DOM for sanity checking (which reads the whole file), or 2. You read it line by line, which would help filter out a high percentage of bad formatting, but could still break it if the document were corrupt in other places. What we need to do is read it all in at once but have some way to break out of the processing logic when it fails.

In other languages, I've usually raised exceptions for these things to throw the problem up the stack without crashes, but that's a very high-level (and hypothetical) solution to a low-level problem. :)

-Tres

@curlymorphic
Copy link
Contributor Author

Im out of ideas for this one for now. I will have to have another think

@curlymorphic
Copy link
Contributor Author

I had another go, this time only reading the xml once, then passing the DataFile onto the functions where the data used to be loaded.

This is not complete, but i would like some feedback if this is going in the right direction. Atm this checks all xml files loaded via the FileBrowser, iirc the only 2 places left would be the menu options, open, and open recent.
It will also be good to add further checks in the validate method , @badosu mentioned about XSD, so i think i might have to have a look into this,

case Type::SongProject:
if( extension == "mmp" || extension == "mmpz" )
{
result = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return true; on these would make the select/case a bit cleaner. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this one :-)

@tresf
Copy link
Member

tresf commented Feb 26, 2015

this time only reading the xml once, then passing the DataFile onto the functions where the data used to be loaded.

Yes! 👍

@badosu mentioned about XSD, so i think i might have to have a look into this,

Hmm... Don't beat yourself up over it just yet. Amadeus is right in the fact that XSDs are good for establishing standards and performing validations, but IIRC they are more appropriate when the XML is provided by various 3rd parties (i.e. supply one to Ardour for an "Export to LMMS" feature). Please correct me if I'm wrong here.

techtarget.com "XSD (XML Schema Definition), a Recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. This description can be used to verify that each item of content in a document adheres to the description of the element in which the content is to be placed."

@curlymorphic
Copy link
Contributor Author

image

Can I have some feedback on the dialogbox please?

@Sti2nd
Copy link
Contributor

Sti2nd commented Feb 26, 2015

What does invalid mean? I know what the English word mean, but does invalid mean that it is not a mmpz or mmp, for example? Or is the xml file actually broken and no xml viewer can open it?

@curlymorphic
Copy link
Contributor Author

These checks are looking at the filename extension, then reading in the following xml

<lmms-project version="1.0" creator="LMMS" creatorversion="1.1.0" type="instrumenttracksettings">

and checking the type flag matches the file extension, this is not full validation, but stops incorrectly named files being opened. This would also be the ideal place in the code to put future validations.

@tresf
Copy link
Member

tresf commented Feb 26, 2015

I would recommend:

  • Title: "Error"
  • Message: "<filename> does not appear to be a valid <project|preset|etc>"

We should also be able to get away with just the file name (no path, no extension) but as long as we're consistent with other dialogs, it is a moot point.

-Tres

@tresf
Copy link
Member

tresf commented Feb 26, 2015

To clarify a bit.... "corrupt" is incorrect. "Invalid project" or "Invalid preset" is more accurate as it might be a valid and non-corrupted file that was just accidentally opened with LMMS or had the extension mixed up/misassociated.

@curlymorphic
Copy link
Contributor Author

Cheers Tres, thats what i was hoping for :)

@@ -461,7 +462,16 @@ void FileBrowserTreeWidget::mousePressEvent(QMouseEvent * me )
( f->handling() == FileItem::LoadAsPreset ||
f->handling() == FileItem::LoadByPlugin ) )
{
m_previewPlayHandle = new PresetPreviewPlayHandle( f->fullName(), f->handling() == FileItem::LoadByPlugin );
DataFile dataFile( f->fullName() );
if( !dataFile.validate( f->extension() ) )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am glad you were able to put this logic where it belongs to. 👍

@curlymorphic
Copy link
Contributor Author

Please dont merge this yet, im looking into that redundant boxing, you commented between me looking, and doing the push

return true;
}
break;
default:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is case we're unaware of, wouldn't it be better actually to return false? Just wondering.

Also, if this is the case, we can delete this case statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reinstated the default: case statement, to remove warnings that were preventing travis from building

@curlymorphic
Copy link
Contributor Author

This should address all the above issues

@badosu
Copy link
Contributor

badosu commented Feb 26, 2015

👍

@@ -58,6 +58,8 @@ class EXPORT DataFile : public QDomDocument

virtual ~DataFile();

bool validate( QString extension );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already have the extension information when the object is instantiated via the full name?

I guess we don't really need this argument, unless you want to split this logic into instance and class methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the DataFile in it's current state does not store the filename, I am trying to be as non invasive as possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fair enough, this can be changed later if it's an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this further, It is possible to create a DataFile, directly from a ByteArray of the xml. in this case the DataFile would never know the filename.

@badosu
Copy link
Contributor

badosu commented Feb 26, 2015

@curlymorphic Just something that I noticed is that even though the commit messages improved a lot (thanks) you are using really long strings.

The recommended would be to make them short and any additional information can be added on a new paragraph separated by a new line. Here's a nice article on this: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

Note that this is not really the biggest issue in the world, but a nice thing to take a look at.

@curlymorphic
Copy link
Contributor Author

Note that this is not really the biggest issue in the world, but a nice thing to take a look at.

I am up for improving :)

@tresf
Copy link
Member

tresf commented Mar 7, 2015

@curlymorphic is this ready to merge? Travis doesn't look too happy per:

/home/travis/build/LMMS/lmms/src/core/DataFile.cpp: In member function ‘bool DataFile::validate(QString)’:
/home/travis/build/LMMS/lmms/src/core/DataFile.cpp:168:8: error: enumeration value ‘DragNDropData’ not handled in switch [-Werror=switch]
/home/travis/build/LMMS/lmms/src/core/DataFile.cpp:168:8: error: enumeration value ‘ClipboardData’ not handled in switch [-Werror=switch]
/home/travis/build/LMMS/lmms/src/core/DataFile.cpp:168:8: error: enumeration value ‘JournalData’ not handled in switch [-Werror=switch]
/home/travis/build/LMMS/lmms/src/core/DataFile.cpp:168:8: error: enumeration value ‘EffectSettings’ not handled in switch [-Werror=switch]
/home/travis/build/LMMS/lmms/src/core/DataFile.cpp:168:8: error: enumeration value ‘TypeCount’ not handled in switch [-Werror=switch]
cc1plus: all warnings being treated as errors
make[2]: *** [src/CMakeFiles/lmmsobjs.dir/core/DataFile.cpp.o] Error 1
make[1]: *** [src/CMakeFiles/lmmsobjs.dir/all] Error 2

@curlymorphic
Copy link
Contributor Author

hmm. I don't think these git --amend --no-edits are helping. There seems to be a sync issue with what i think i'm pushing, and what travis is building.

@curlymorphic
Copy link
Contributor Author

Any ideas why this passes on travis on my fork, but not on LMMS/lmms?

https://travis-ci.org/curlymorphic/lmms/jobs/53462684

@curlymorphic
Copy link
Contributor Author

After some research, i know what i have done wrong. I tried to use git commit --amend -- no--edit to add changes to the pull request, with out having another entry in the log. I have now found out you should not use this option once a pull request has been issued. In my attempt to keep things tidy, I appear to have made a mess. I am not sure of a way out of this, other than issuing this again as another pull request. is that how you want me to proceed?

@tresf
Copy link
Member

tresf commented Mar 8, 2015

Can't you just add another bogus commit?

@curlymorphic
Copy link
Contributor Author

Can't you just add another bogus commit?

hoping so 👍

@curlymorphic
Copy link
Contributor Author

looks like a no :( , that last commit passed travis on all builds on my fork

@tresf
Copy link
Member

tresf commented Mar 8, 2015

No worries... It's just Travis. If you say it passes on your fork, no need beating yourself up over it.

@curlymorphic
Copy link
Contributor Author

image

Not sure how to proceed?

@tresf
Copy link
Member

tresf commented Mar 8, 2015

Does the code appear fine in the PR or is stuff missing?

@curlymorphic
Copy link
Contributor Author

I would be much happier with a pass from travis, I think the --amend --no-edit is messing with the diffs.
Im going close this pull request, and issue a fresh one, I will #link them

@tresf
Copy link
Member

tresf commented Mar 8, 2015

👍

@tresf
Copy link
Member

tresf commented Mar 17, 2015

@curlymorphic looks like this is possibly related to some issues with Zyn's preview feature... #1180

Can you help us isolate the cause?

@curlymorphic
Copy link
Contributor Author

Correct, possibly causing

I will use the word probably :), Taking a look now

@curlymorphic
Copy link
Contributor Author

Going to have to comeback to this, it's taken more time than I have tonight, To put minds at rest, it is this pull request thats causing the issue with previewing zyn files. It looks my efforts to read the data before creating the PresetPreviewPlayHandle is going to need some more work.

@tresf
Copy link
Member

tresf commented Mar 18, 2015

per @softrabbit this might be as easy as checking for XIZ file extension and treating as XML (rather than compressed XML).

-Tres

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants