I am running a setup where my site is deployed to http://localhost:8081/ during my CI runs. Unlighthouse is configured to run against this URL.
I also provide a sitemap, which is generated automatically during the build and points to the production URL https://example.com/.
The issue is that Unlighthouse refuses to parse and use the sitemap because its origin differs from the scanned site (as stated in the logs). As a result, all more deeply nested routes are missed, because Unlighthouse does not know they exist.
Suggested solution
I would love to have the option to reconfigure unlighthouse so that any routes in the sitemap get re-written to match a given override. It may look similar to this:
The parameter name could be anything in this regard (sitemap_origin, sitemap_override, ...).
Alternatively, unlighthouse could automatically try to use the sitemap, replacing the origin with the given site. However, I would prefer an explicit solution over some implicit replacement.
Alternative
A current workaround is to add all possible routes to the config in advance. However, this is very tedious and doesn't scale well with sites that generate routes on the fly, for example when content collections are being used.
Additional context
Dynamically reading the links from the pages is not an option in this context, as the links are already pointing to https://example.com/ when the site finishes building.
Feel free to reach out for further details. I can provide the project details if necessary.
I've pushed up a fix to support async functions for the config definition so you can just fetch the sitemap, parse the URLs and transform them however you like. I think this is the best solution to reduce the maintenance of the project.
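For anyone wanting to try this, here is a minimal sketch of the "fetch the sitemap, parse the URLs and transform them" step. The helper names (`extractSitemapLocs`, `rewriteOrigin`, `loadRewrittenUrls`) are hypothetical and not part of Unlighthouse. It assumes a flat sitemap (no sitemap index) and a runtime with a global `fetch`, such as Node 18+.

```typescript
// Hypothetical helpers for illustration only; none of these names come from Unlighthouse.

/** Collect the contents of every <loc> element in a sitemap XML string. */
function extractSitemapLocs(xml: string): string[] {
  const locs: string[] = []
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(xml)) !== null) {
    const loc = match[1]
    // Sitemaps escape "&" as "&amp;", so undo that before using the URL.
    if (loc)
      locs.push(loc.replace(/&amp;/g, '&'))
  }
  return locs
}

/** Point a URL at a different origin while keeping its path, query and hash. */
function rewriteOrigin(url: string, targetOrigin: string): string {
  const source = new URL(url)
  return new URL(source.pathname + source.search + source.hash, targetOrigin).toString()
}

/** Fetch a sitemap and return its URLs rewritten to the origin under test. */
async function loadRewrittenUrls(sitemapUrl: string, targetOrigin: string): Promise<string[]> {
  const response = await fetch(sitemapUrl)
  if (!response.ok)
    throw new Error(`Failed to fetch sitemap: ${response.status}`)
  const xml = await response.text()
  return extractSitemapLocs(xml).map(loc => rewriteOrigin(loc, targetOrigin))
}
```

With the setup from the issue, `loadRewrittenUrls('http://localhost:8081/sitemap.xml', 'http://localhost:8081')` would turn each `https://example.com/...` entry into its `http://localhost:8081/...` equivalent (the `/sitemap.xml` path is an assumption). The resulting list could then be returned from the async config as the explicit set of URLs to scan; check the Unlighthouse docs for the exact option name.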
@harlan-zw While the above workaround works, IMO, it is still beneficial to expose an option to disable the sitemap origin checking.
One of the common use cases of Unlighthouse is to incorporate it into the CI/CD pipeline and automatically trigger it after a successful deployment has been made. Most deployment platforms generate a unique URL for each deployment, and naturally that is the URL targeted by Unlighthouse. However, that URL is not necessarily the main entry point of the site, and the sitemap URL specified in robots.txt generally uses the main entry URL.
E.g. the main entry point can be example.com, while the auto generated one is example-.vercel.app.
So a sitemap origin that differs from the site origin can be fairly common.