Google compliant option with -gc. Header Reference, Logos. Fixes #775, #594, #564, #563, #555, #597, #725, #610 #888
merge Dev 19012026
…aptureAndDiscovery#775, KnowledgeCaptureAndDiscovery#594, KnowledgeCaptureAndDiscovery#564, KnowledgeCaptureAndDiscovery#563, KnowledgeCaptureAndDiscovery#555, KnowledgeCaptureAndDiscovery#597, KnowledgeCaptureAndDiscovery#725, KnowledgeCaptureAndDiscovery#610
555: solved in previous commits.
725: Dockerfile parse solved previously.
610: typo in documentation solved previously.
…ils in result ordering.
…lready checked locally
    "programmingLanguage",
    "releaseNotes",
    "releaseDate"
}
Please move this to constants
Added to constants with a small comment:
# Schema.org properties accepted by Google for software metadata.
# Any property not in this set will be prefixed as codemeta.
# Just for -gc or --google_codemeta_out flag
SCHEMA_ORG_PROPERTIES = {
    "@type",
    .......
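The rule stated in that comment (anything outside the set gets a codemeta prefix) could look roughly like the sketch below. The helper name `prefix_non_schema_properties`, the exact `codemeta:` prefix string, and the abbreviated property set are assumptions made for illustration; they are not taken from the PR code.

```python
# Abbreviated subset built from the properties visible in this review;
# the full set is elided in the comment above.
SCHEMA_ORG_PROPERTIES = {"@type", "programmingLanguage", "releaseNotes", "releaseDate"}

# Assumed prefix format; the PR may spell it differently.
CODEMETA_PREFIX = "codemeta:"


def prefix_non_schema_properties(metadata: dict) -> dict:
    """Return a copy of metadata where keys outside the set get the prefix."""
    result = {}
    for key, value in metadata.items():
        if key in SCHEMA_ORG_PROPERTIES or key.startswith(CODEMETA_PREFIX):
            # Already accepted, or already prefixed: keep the key unchanged.
            result[key] = value
        else:
            result[CODEMETA_PREFIX + key] = value
    return result
```

Keeping the set in constants means the export code only needs this one membership test, and extending the accepted properties is a one-line change.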
 if constants.CAT_REQUIREMENTS in repo_data:
-    structured_sources = ["pom.xml", "requirements.txt", "setup.py", "environment.yml"]
+    structured_sources = ["pom.xml", "requirements.txt", "setup.py", "environment.yml", "pyproject.toml"]
This should be declared in the constants, not buried here
if "affiliation" in result_owner and result_owner["affiliation"]:
    author_obj["affiliation"] = result_owner["affiliation"]
if "email" in result_owner and result_owner["email"]:
    author_obj["email"] = result_owner["email"]
The strings "email", "affiliation", etc. should also be in constants, for example constants.PROP_AUTH_NAME.
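One possible shape for this change is sketched below. The constant names `PROP_AFFILIATION` and `PROP_EMAIL` and the helper `copy_non_empty` are hypothetical, modelled on the `constants.PROP_AUTH_NAME` example; the actual names in the constants module may differ.

```python
# Hypothetical constants; in the project these would live in the constants module.
PROP_AFFILIATION = "affiliation"
PROP_EMAIL = "email"


def copy_non_empty(source: dict, target: dict, keys) -> dict:
    """Copy each key from source to target only if it is present and truthy."""
    for key in keys:
        value = source.get(key)
        if value:
            target[key] = value
    return target


# Usage mirroring the diff above: optional author fields are copied when set.
result_owner = {"name": "Jane Doe", "affiliation": "Example Lab", "email": ""}
author_obj = copy_non_empty(result_owner, {}, (PROP_AFFILIATION, PROP_EMAIL))
```

The same helper would cover the maintainer block, which repeats the pattern with different keys.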
if "username" in result_maint and result_maint["username"]:
    maint_obj["identifier"] = result_maint["username"]
if "email" in result_maint and result_maint["email"]:
    maint_obj["email"] = result_maint["email"]

    return "SoftwareSourceCode"
if "system" in t:
    return "SoftwareSystem"
return "SoftwareApplication"
All these strings should be categories in constants
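A minimal sketch of what this could look like follows. The constant names and the function name `map_software_type` are hypothetical, and the condition guarding the first return is not visible in the diff, so the `"source"` check here is an assumption.

```python
# Hypothetical constants for the type names returned in the diff.
TYPE_SOURCE_CODE = "SoftwareSourceCode"
TYPE_SYSTEM = "SoftwareSystem"
TYPE_APPLICATION = "SoftwareApplication"

KEYWORD_SOURCE = "source"  # assumed: the real condition is outside the visible diff
KEYWORD_SYSTEM = "system"


def map_software_type(t: str) -> str:
    """Map a free-text type description to one of the type constants."""
    t = t.lower()
    if KEYWORD_SOURCE in t:
        return TYPE_SOURCE_CODE
    if KEYWORD_SYSTEM in t:
        return TYPE_SYSTEM
    # Fallback when no keyword matches, as in the diff.
    return TYPE_APPLICATION
```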
if value_type == "Release":
    return value

if value_type == "Url":
Use constants for values (URL, releases)
"reference design",
"reference to this repository",
"reference to this library",
"reference in your project",
I don't understand the one above. Remove it
# false positives for bibliographic citations
if category == constants.CAT_CITATION:
    negative_patterns = [
Negative patterns should be in constants.
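As a rough illustration, the patterns could be moved into a module-level constant and checked with a small helper. The names `NEGATIVE_PATTERNS_CITATION` and `is_citation_false_positive`, the value of `CAT_CITATION`, and the case-insensitive substring matching are all assumptions; the diff does not show how the patterns are applied. The pattern the reviewer asked to remove is left out.

```python
# Assumed value; the real constant is constants.CAT_CITATION.
CAT_CITATION = "citation"

# Hypothetical constant holding the phrases that signal a non-bibliographic "reference".
NEGATIVE_PATTERNS_CITATION = (
    "reference design",
    "reference to this repository",
    "reference to this library",
)


def is_citation_false_positive(category: str, text: str) -> bool:
    """Return True if text in the citation category matches a negative pattern."""
    if category != CAT_CITATION:
        return False
    lowered = text.lower()
    return any(pattern in lowered for pattern in NEGATIVE_PATTERNS_CITATION)
```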
if user_info.get("company"):
    maintainer_data["affiliation"] = user_info.get("company")
if user_info.get("email"):
    maintainer_data["email"] = user_info.get("email")
Again, all these string literals should be in constants.
readme_source = "README.md"

print("Extracting regular expressions...")
No prints. Add it to the log
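A minimal sketch of the requested change, using the standard library `logging` module. The function name `log_extraction_start` is hypothetical, and the project may already configure its own logger, in which case that logger should be used instead.

```python
import logging

# Module-level logger; handlers and levels are configured by the application.
logger = logging.getLogger(__name__)


def log_extraction_start() -> None:
    """Record the progress message in the log instead of printing it."""
    logger.info("Extracting regular expressions...")
```

With this approach the message respects the configured log level and destination rather than always going to standard output.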
555: solved in previous commits.
725: Dockerfile parse solved previously.
610: typo in documentation solved previously.