Insufficient Regex
The UrlParser::parseRows method uses a regex to detect URLs. URLs such as https://www.hwk-muenchen.de/artikel/exporteinsteiger-74,4028,7642.html are not extracted correctly because the last group of the regex does not include a comma (,). As a result, the URL is currently truncated at the first comma to https://www.hwk-muenchen.de/artikel/exporteinsteiger-74. Adding the comma to that group makes the full URL extract correctly. Alternatively, a more generic pattern at least matches this case, see https://regex101.com/r/8Nw0Mh/1 for an example, although there may be other cases it needs to match as well.
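
To make the truncation easier to reproduce, here is a minimal sketch in Python. The actual pattern from UrlParser::parseRows is not reproduced in this report, so `WITHOUT_COMMA` and `WITH_COMMA` are hypothetical stand-ins that only mimic the described behaviour (a trailing character class with and without the comma); the helper `first_url` is also just for illustration.

```python
import re

# Hypothetical stand-in for the current pattern: the trailing character
# class that matches the path does NOT contain a comma.
WITHOUT_COMMA = re.compile(r"https?://[\w.-]+(?:/[\w\-./?%&=+#~]*)?")

# Same hypothetical pattern with ',' added to the trailing character class.
WITH_COMMA = re.compile(r"https?://[\w.-]+(?:/[\w\-./?%&=+#~,]*)?")

URL = "https://www.hwk-muenchen.de/artikel/exporteinsteiger-74,4028,7642.html"


def first_url(pattern, text):
    """Return the first match of `pattern` in `text`, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None


# Truncated at the first comma:
print(first_url(WITHOUT_COMMA, URL))
# https://www.hwk-muenchen.de/artikel/exporteinsteiger-74

# Extracted completely once the comma is allowed:
print(first_url(WITH_COMMA, URL))
# https://www.hwk-muenchen.de/artikel/exporteinsteiger-74,4028,7642.html
```

One trade-off to keep in mind with this sketch: once the comma is allowed, a URL that is followed by a comma in running text (for example "https://example.com/a, then") is matched including that trailing comma, so the real pattern may need to strip trailing punctuation separately.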