Google Search Console is reporting broken image link errors for some of the pages on my legacy website. Many of these pages do not seem to be indexed by Google, and I suspect the broken links may be the cause.
Here's the evidence: In the Console, I select one of the pages on which Googlebot has found an error. Then I click "Fetch as Google", and the following error is displayed: "Googlebot couldn't get all resources for this page". It lists one or more external image links from the page that are "not found". And it's true: some of the external image paths really are broken.
If I click "View as Search Result" for each defective page, the Console typically displays a blank search results page. I assume that means these pages have not been indexed by Google.
Here's the problem: Correcting a broken image path might seem easy, but in this case it's not. My website has over 70,000 pages, with data pulled from a MySQL database containing hundreds of thousands of items. Each web page has multiple images linked from a product supplier's website. Most of the images are stored in the default image folder on the supplier's website, but some are stored in various other locations. Those locations are not predictable, and that is what causes the problem.
This problem was anticipated from the start. Assuming that a percentage of the external image paths would inevitably be broken, each image tag is already coded with the following JavaScript, to hide the ugly broken-image icon:
<img src="http://www.product-supplier.com/default-image-folder/12345678.gif" alt="Image not available." onerror="this.style.display='none';" width="150">
This JavaScript allows all the product images to display correctly when their paths are valid. But if an image path is faulty, nothing is displayed: the image element is simply hidden. Visually, that is acceptable for human visitors, but Googlebot doesn't appear to honor the JavaScript, so it still treats the broken link as a required resource.
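One idea I've considered, though I don't know whether it would actually help (Googlebot may still execute JavaScript or read data attributes): keep the external URL out of the <img> tag entirely and insert the images only after the page loads. A minimal sketch of what I mean, where the product-image class and the data-img-src attribute are hypothetical names I made up:

```html
<!-- Placeholder in the page source: no <img> tag, so no broken link in the HTML. -->
<span class="product-image" data-img-src="http://www.product-supplier.com/default-image-folder/12345678.gif"></span>

<script>
// After the page loads, turn each placeholder into a real image.
// If an image fails to load, onload never fires and nothing is added,
// so no broken-image icon ever appears.
document.addEventListener('DOMContentLoaded', function () {
  var placeholders = document.querySelectorAll('.product-image');
  for (var i = 0; i < placeholders.length; i++) {
    (function (holder) {
      var img = new Image();
      img.width = 150;
      img.alt = 'Image not available.';
      img.onload = function () { holder.appendChild(img); };
      img.src = holder.getAttribute('data-img-src');
    })(placeholders[i]);
  }
});
</script>
```

Even with this approach, the URL is still present in the data-img-src attribute, so Googlebot might find it anyway. That uncertainty is part of what I'm asking about below.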
Here are my questions: Is there any way to prevent Googlebot from attempting to verify all the external image links? Can I indicate to Googlebot that the external image links don't matter? Is there any way to hide the image links from Googlebot?
If it's true that Google tends not to index any page with a broken link to an external image, then will it also decline to index a page with a broken link to an external website? If so, that would create a powerful incentive never to link to external web pages at all, since we have no control over them and they are occasionally deleted.
Constraints:
- The supplier does not explain their criteria for storing some of their product images in non-standard locations on their website.
- The supplier does not provide a link for each image.
- Given the vast amount of product data, it's not feasible to comb through it to find each individual broken link.
- It would not be feasible to host all the images relevant to the supplier's constantly changing product catalog, as that would require too much ongoing maintenance.
- Therefore, a percentage of the image links will always be broken.
- My web pages are generated programmatically from my MySQL database, which is updated regularly with new data from the supplier. (A simplified sketch of how each image tag is generated follows this list.)
- My programming knowledge is limited to some PHP and very little JavaScript, so please answer in simple terms. Thanks.
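For context, here is roughly how each page builds its image tags. This is a simplified sketch only; the table, column, and variable names are placeholders, not my actual schema:

```php
<?php
// Simplified sketch of how each product page builds its image tags.
// Table, column, and variable names are placeholders, not my real schema.
$pdo = new PDO('mysql:host=localhost;dbname=products', 'db_user', 'db_pass');

$productId = (int) ($_GET['id'] ?? 0); // hypothetical page parameter

$stmt = $pdo->prepare('SELECT image_id FROM product_images WHERE product_id = ?');
$stmt->execute([$productId]);

foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $imageId) {
    // Every image is assumed to live in the supplier's default folder;
    // the ones actually stored elsewhere are exactly the links that break.
    $src = 'http://www.product-supplier.com/default-image-folder/'
         . rawurlencode($imageId) . '.gif';
    echo '<img src="' . htmlspecialchars($src) . '" alt="Image not available." '
       . 'onerror="this.style.display=\'none\';" width="150">' . "\n";
}
```

The key point: the code can only ever guess the default folder, because the supplier gives me no way to know when an image lives somewhere else.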