It seems the parser is reporting errors even when none are expected:
=== INCOMING HTML ===
<math><mi></mi></math>
=== EXPECTED ERRORS ===
(none)
=== ACTUAL ERRORS ===
(1,12): unexpected-null-character
(1,1): expected-doctype-but-got-start-tag
(1,11): invalid-codepoint
This "passes" because the output tree still matches the expected output, but it is clearly not correct.
The test suite also doesn't seem to be checking errors for large swaths of the html5lib test suite even with --check-errors, so it's hard to say how many would pass if those were checked.
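Presumably the check happens at roughly this level; the names below are hypothetical, not JustHTML's actual API, but they illustrate how a tree-only comparison lets this case pass:

```python
# Hypothetical sketch, not JustHTML's actual runner: a check that only
# compares serialized trees reports this case as passing even though three
# unexpected errors were emitted.
def tree_only_check(actual_tree, expected_tree, actual_errors, expected_errors):
    # The error lists are accepted but never looked at.
    return actual_tree == expected_tree

assert tree_only_check(
    actual_tree="<identical tree>",
    expected_tree="<identical tree>",
    actual_errors=[
        "(1,12): unexpected-null-character",
        "(1,1): expected-doctype-but-got-start-tag",
        "(1,11): invalid-codepoint",
    ],
    expected_errors=[],  # the test expects no errors at all
)
```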
Thanks for flagging this. I found multiple issues, which are now fixed:
- The quoted test comes from justhtml-tests, a custom test suite added to make sure all parts of the algorithm are tested. It is not part of html5lib-tests.
- html5lib-tests does not support control characters in tests, which is why some of the tests in justhtml-tests exist in the first place. I have added that ability to our test runner to make sure we handle control characters correctly.
- In the INCOMING HTML block above, the control characters are not shown; they get filtered away by the terminal (see the sketch after this list for one way to make them visible).
- Both the treebuilder and the tokenizer are emitting errors for the control character they find. Neither error is reported at the right location (they are reported at flush time instead of where the character was found), and they duplicate each other.
- This being my own test suite, I hadn't specified the correct expected errors. I should. expected-doctype-but-got-start-tag is reasonable in this case.
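For the terminal point above, here is a rough sketch of how the INCOMING HTML dump could make control characters visible instead of printing them raw. This is illustrative only, not the runner's actual code, and the specific control characters in the example are made up:

```python
# Illustrative helper, not the actual runner code: replace control characters
# with visible \xNN escapes so they survive being printed to a terminal.
def make_controls_visible(text: str) -> str:
    out = []
    for ch in text:
        code = ord(ch)
        is_control = (code < 0x20 and ch not in "\t\n") or code == 0x7F or (0x80 <= code <= 0x9F)
        out.append(f"\\x{code:02x}" if is_control else ch)
    return "".join(out)

# The control characters here are made up for illustration.
print(make_controls_visible("<math><mi>\x01\x00</mi></math>"))
# prints: <math><mi>\x01\x00</mi></math>  (with the escapes shown literally)
```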
All of the above bugs are now fixed, and the test suite is in better shape. Thanks again!
Hi! The expected errors are not standardized enough for it to make sense to enable --check-errors by default. If you look at the readme, you'll see that the only thing being checked is that the _number of errors_ is correct.
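To be concrete, the level of checking --check-errors does is roughly the following; this is illustrative, not the literal code:

```python
# Illustrative only: with --check-errors the comparison is on the count of
# errors, not on their codes or positions, since those are not standardized
# in the expected-error data.
def errors_ok(actual_errors, expected_errors, check_errors=False):
    if not check_errors:
        return True  # by default, emitted errors are not compared at all
    return len(actual_errors) == len(expected_errors)
```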
run_tests.py does not appear to be checking the number of errors, or the errors themselves, for the tokenizer, encoding, or serializer tests from html5lib-tests, which represent the majority of tests.
There's also something off about your benchmark comparison. If one runs pytest on html5lib, which uses html5lib-tests plus its own unit tests and does check whether errors match exactly, the pass rate appears to be much higher than 86%.
These numbers are inflated because html5lib-tests/tree-construction tests are run multiple times in different configurations. Many of the expected failures appear to be script tests similar to the ones JustHTML skips.
I've checked the numbers for html5lib, and they are correct. They are skipping a load of tests for many different reasons, one being that namespacing of svg/math fragments is not implemented. The 88% number listed is correct.