How to get around sites that block Chrome in headless mode

Some sites block Chrome in headless mode. This article looks at how to get around that block.



Diagnostics is the key to everything in computing and programming. This article starts by showing how to diagnose this kind of blocking yourself. If that does not interest you, skip straight to the "Solution" section at the end of the article.



If you run into problems in headless mode, remember to take a screenshot with page.screenshot() to see what is going on. At a minimum, this tells you whether you are looking at the same visible content as in "normal" (headful) mode, and whether you are simply stuck on a broken script with no insight into why.
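A minimal sketch of the screenshot step, assuming page is a Puppeteer Page object. The helper names screenshotPath and saveDebugScreenshot and the debug-*.png naming scheme are made up for illustration and are not part of Puppeteer.

```javascript
// Hypothetical helper: turn a free-form label into a safe PNG file name.
function screenshotPath(label) {
  const safe = label.replace(/[^a-z0-9_-]+/gi, '_');
  return `debug-${safe}.png`;
}

// Hypothetical helper: save a full-page screenshot of the current page state.
// `page` is assumed to be a Puppeteer Page.
async function saveDebugScreenshot(page, label) {
  const path = screenshotPath(label);
  await page.screenshot({ path, fullPage: true });
  return path;
}
```

Calling saveDebugScreenshot(page, 'after goto') right after page.goto() would write debug-after_goto.png, which can then be compared with what the browser shows in "normal" mode.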



image



In this example, the server never even sent the requested web page.

The initial response is the Access Denied page, and that is all you get with Chrome in headless mode. This does not happen in "normal" (headful) mode.






HTTP headers



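Chrome in headless mode is still Chrome, but compared with "normal" mode it sends something different that lets a site tell the two apart. The first thing to check is the HTTP request headers, to see what changes in headless mode. The service at http://scooterlabs.com/echo.json returns a JSON response describing the request it received, which makes the comparison easy.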



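const puppeteer = require('puppeteer');

(async () => {
  // Headless is the default; use { headless: false } to run in "normal" mode.
  const browser = await puppeteer.launch({ headless: true });

  // Reuse the tab that Chrome opens at startup.
  const page = (await browser.pages())[0];

  // Request the echo URL and print the JSON body of the response.
  const response = await page.goto('http://scooterlabs.com/echo.json');

  console.log(await response.json());

  await browser.close();
})();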


Run the script in headless mode (the default), then again in "normal" mode (headless:false), and compare the two outputs.
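The two runs can also be collected by one script. The sketch below assumes the puppeteer module is passed in by the caller; fetchEcho and fetchEchoInBothModes are hypothetical helper names, and the only Puppeteer calls used are the ones already in the script above.

```javascript
const ECHO_URL = 'http://scooterlabs.com/echo.json';

// Hypothetical helper: load the echo URL in the given mode and return the parsed JSON.
// `puppeteer` is the module returned by require('puppeteer').
async function fetchEcho(puppeteer, headless) {
  const browser = await puppeteer.launch({ headless });
  try {
    const page = (await browser.pages())[0];
    const response = await page.goto(ECHO_URL);
    return await response.json();
  } finally {
    // Always close the browser, even if the request fails.
    await browser.close();
  }
}

// Hypothetical helper: run the request once in each mode, one after the other.
async function fetchEchoInBothModes(puppeteer) {
  return {
    headless: await fetchEcho(puppeteer, true),
    normal: await fetchEcho(puppeteer, false),
  };
}
```

With Puppeteer installed, fetchEchoInBothModes(require('puppeteer')) resolves to an object holding both responses side by side.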



image



The time_utc value differs simply because the two requests were made at different times, so it can be ignored.



The Accept-Language header is missing in headless mode. The other difference is the User-Agent.
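Spotting such differences by eye gets tedious, so here is a rough sketch of a field-by-field comparison. diffFields is a hypothetical helper, and the sample objects are invented for illustration; they are not real output from the echo service, whose exact response layout is not assumed here.

```javascript
// Hypothetical helper: report every field whose value differs between two flat objects.
function diffFields(headless, normal, ignore = []) {
  const keys = new Set([...Object.keys(headless), ...Object.keys(normal)]);
  const diff = {};
  for (const key of keys) {
    if (ignore.includes(key)) continue; // e.g. fields that always differ between runs
    if (headless[key] !== normal[key]) {
      diff[key] = { headless: headless[key], normal: normal[key] };
    }
  }
  return diff;
}

// Illustrative values only.
const headlessSample = {
  'user-agent': 'HeadlessChrome/73.0.3683.75',
  accept: 'text/html',
  time_utc: '2019-01-01T00:00:00Z',
};
const normalSample = {
  'user-agent': 'Chrome/73.0.3683.75',
  accept: 'text/html',
  'accept-language': 'en-US',
  time_utc: '2019-01-01T00:01:00Z',
};

// time_utc is skipped because it changes on every request.
console.log(diffFields(headlessSample, normalSample, ['time_utc']));
```

A field that exists on only one side shows up with undefined on the other, which is how a missing header would appear.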



Look at the User-Agent more closely. This is what it looks like in headless mode:



image



In headless mode, Chrome's User-Agent contains the word "Headless". That is enough for a site to tell headless Chrome apart from a regular browser.
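To see why that one word matters, here is a guess at the kind of check a site could run. This is an assumption for illustration only: the article does not say how any particular site detects headless Chrome, and looksLikeHeadlessChrome is a made-up name. The headless string below is the article's "normal" User-Agent with the product token swapped for HeadlessChrome.

```javascript
// Hypothetical server-side check: flag any User-Agent that names HeadlessChrome.
function looksLikeHeadlessChrome(userAgent) {
  return /HeadlessChrome/i.test(userAgent);
}

// Illustrative User-Agent strings for the two modes.
const headlessUA =
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) HeadlessChrome/73.0.3683.75 Safari/537.36';
const normalUA =
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36';

console.log(looksLikeHeadlessChrome(headlessUA)); // true
console.log(looksLikeHeadlessChrome(normalUA));   // false
```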



Solution



To get around the block, change the User-Agent with page.setUserAgent(). Use the value that Chrome sends in "normal" mode, for example: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36".
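A minimal sketch of that fix, assuming browser is a Puppeteer Browser obtained from puppeteer.launch(). openWithNormalUserAgent is a hypothetical helper name; the User-Agent string is the one quoted above, and in practice it should match the Chrome version actually in use.

```javascript
// User-Agent string quoted in the article, as sent by Chrome in "normal" mode.
const NORMAL_USER_AGENT =
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36';

// Hypothetical helper: open a new tab that identifies itself as regular Chrome.
// `browser` is assumed to be a Puppeteer Browser.
async function openWithNormalUserAgent(browser, url) {
  const page = await browser.newPage();
  // Set the User-Agent before navigating so the first request already carries it.
  await page.setUserAgent(NORMAL_USER_AGENT);
  await page.goto(url);
  return page;
}
```

Pointing this helper at http://scooterlabs.com/echo.json in headless mode is a quick way to confirm that the User-Agent the server sees no longer contains "Headless".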



That's all there is to it. This is why the diagnostic approach matters more than this particular solution. Obstacles like this come up all the time when automating sites, and there are often no specific answers to be found on the Internet, so you will have to work through them on your own. Good luck, and feel free to contact me with any questions!



